DNA Sequencing
DNA Sequence Analysis
DNA sequencing is a fundamental technique in molecular biology that determines the precise order of nucleotides (A, C, G, and T) in a DNA molecule. One interesting mathematical problem in DNA sequencing involves reconstructing a DNA sequence from its constituent triplets.
When a DNA strand interacts with a sequencing array, each consecutive triplet (three consecutive nucleotides) in the strand is highlighted in the array. Given these highlights, can we determine the original DNA sequence?
Enter a DNA sequence (using only A, C, G, and T) to see its triplets highlighted in the array and visualized in the De Bruijn graph in real time. Use the Play button to step through the Euler path in the graph.
The problem shown in the interactive above is an example of sequence reconstruction from triplets. In the array, each cell represents a possible triplet of nucleotides. When a DNA sequence is analyzed, all triplets present in the sequence are highlighted.
For example, the sequence AACTCCAGTATGGC contains these triplets:
- AAC (positions 1-3)
- ACT (positions 2-4)
- CTC (positions 3-5)
- TCC (positions 4-6)
- CCA (positions 5-7)
- CAG (positions 6-8)
- AGT (positions 7-9)
- GTA (positions 8-10)
- TAT (positions 9-11)
- ATG (positions 10-12)
- TGG (positions 11-13)
- GGC (positions 12-14)
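The triplet list above can be reproduced with a short sliding-window sketch. The function name `triplets` and its `k` parameter are illustrative choices, not part of the interactive; positions are reported 1-based to match the list.

```python
def triplets(seq, k=3):
    """Return (triplet, start, end) for every window of length k, 1-based positions."""
    seq = seq.upper()
    bad = set(seq) - set("ACGT")
    if bad:
        raise ValueError(f"unexpected characters: {sorted(bad)}")
    # Slide a window of width k across the sequence, one nucleotide at a time.
    return [(seq[i:i + k], i + 1, i + k) for i in range(len(seq) - k + 1)]


for triplet, start, end in triplets("AACTCCAGTATGGC"):
    print(f"{triplet} (positions {start}-{end})")
```

A sequence of length n yields n - 2 triplets, so the 14-nucleotide example produces 12.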
The reverse problem is even more interesting: given only the highlighted cells in the array, can you determine the original DNA sequence? This is a classic example of sequence reconstruction from overlapping fragments, which has important applications in genome sequencing and assembly.
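One way to approach the reverse problem is the De Bruijn graph idea mentioned earlier: treat each triplet as an edge from its first two nucleotides to its last two, then look for an Euler path that uses every edge once. The sketch below does this with Hierholzer's algorithm. It assumes that each triplet occurs exactly once in the original sequence (the array only records which triplets are present, not how often), and the function name `reconstruct` is hypothetical.

```python
from collections import defaultdict


def reconstruct(triplets):
    """Rebuild a sequence from its triplets via an Euler path (assumes no repeated triplets)."""
    triplets = list(triplets)
    if not triplets:
        return ""
    graph = defaultdict(list)   # 2-mer -> list of successor 2-mers
    balance = defaultdict(int)  # out-degree minus in-degree per node
    for t in triplets:
        u, v = t[:-1], t[1:]
        graph[u].append(v)
        balance[u] += 1
        balance[v] -= 1
    # An Euler path starts at the node with one more outgoing than incoming edge;
    # if no such node exists the graph has an Euler circuit and any start works.
    start = next((n for n in sorted(balance) if balance[n] == 1), triplets[0][:-1])
    stack, path = [start], []
    while stack:
        u = stack[-1]
        if graph[u]:
            stack.append(graph[u].pop())  # follow an unused edge
        else:
            path.append(stack.pop())      # dead end: node is final in the path
    path.reverse()
    if len(path) != len(triplets) + 1:
        raise ValueError("triplets do not form a single Euler path")
    # Overlap consecutive 2-mers: first node plus the last letter of each later node.
    return path[0] + "".join(node[-1] for node in path[1:])


highlighted = ["GGC", "TAT", "AAC", "CAG", "ACT", "TGG",
               "CTC", "GTA", "TCC", "ATG", "CCA", "AGT"]
print(reconstruct(highlighted))
```

For this example every two-letter node has at most one outgoing edge, so the Euler path, and with it the reconstructed sequence, is unique. In general a set of triplets can admit several Euler paths, in which case this sketch returns just one of the possible sequences.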
Middle Layer Graph Explorer
The middle layer graph described in the text can be explored interactively below. Use the slider to select a value of n (where r = (n-1)/2), then click “Generate Middle Layer Graph” to visualize the graph of r-element and (r+1)-element subsets.
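For readers without access to the explorer, the graph can also be generated directly. The sketch below assumes the standard definition of the middle layer graph: n is odd, the vertices are the r-element and (r+1)-element subsets of {1, ..., n}, and an r-element subset is joined to every (r+1)-element subset that contains it. The function name `middle_layer_graph` is illustrative.

```python
from itertools import combinations


def middle_layer_graph(n):
    """Return (lower, upper, edges) for odd n, with r = (n - 1) / 2."""
    if n < 1 or n % 2 == 0:
        raise ValueError("n must be a positive odd integer")
    r = (n - 1) // 2
    ground = range(1, n + 1)
    lower = [frozenset(c) for c in combinations(ground, r)]      # r-element subsets
    upper = [frozenset(c) for c in combinations(ground, r + 1)]  # (r+1)-element subsets
    # Join A to B whenever A is a proper subset of B (B is A plus one element).
    edges = [(a, b) for a in lower for b in upper if a < b]
    return lower, upper, edges


lower, upper, edges = middle_layer_graph(5)
print(len(lower), len(upper), len(edges))
```

Each r-element subset can be extended by any of the n - r = r + 1 missing elements, so every vertex has degree r + 1. For n = 5 this gives 10 + 10 vertices and 30 edges.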