Shotgun sequencing is a method used in genetics for sequencing long DNA strands. The DNA is first broken at random into small pieces, for example by mechanical shearing or by enzymes. The pieces are then sequenced individually. Because this is done for several copies of the same long strand, and each copy breaks at different positions, the resulting fragments overlap. Finally, computer programs align the overlapping sequences and reconstruct the original long sequence.
An extremely simplified example with only two sequences is as follows:
Original strand : XXXAGCATGCTGCAGTCATGCTTAGGCTAXXXX
First shotgun sequence : XXXAGCATGCTGCAG TCATGCTTAGGCTAXXXX
Second shotgun sequence : XXXAGCATGCTGCAGTCATGC TTAGGCTAXXXX
Reconstructed strand : XXXAGCATGCTGCAGTCATGCTTAGGCTAXXXX
Here the fragment XXXAGCATGCTGCAGTCATGC from the second sequence and the fragment TCATGCTTAGGCTAXXXX from the first share the bases TCATGC, and this overlap is what allows the two to be joined into the full strand.
In real-world applications there are thousands or millions of sequences to deal with, along with transcription and sequencing errors. The computational power required to reassemble the sequence in real projects is enormous. For the shotgun sequencing of the human genome by Celera Genomics in 2000, several supercomputers ran nonstop for some months to align all the human DNA correctly.
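As a rough illustration of the alignment step, the following Python sketch reassembles the four fragments of the simplified example by repeatedly merging the pair with the longest suffix/prefix overlap. This greedy strategy is only one simple approach and is not claimed to be what any real assembler uses. The function names `overlap` and `greedy_assemble` and the `min_overlap` threshold are assumptions made for this sketch, and it ignores sequencing errors, repeated regions and reverse-complement strands.

```python
def overlap(a: str, b: str) -> int:
    """Length of the longest suffix of `a` that is also a prefix of `b`."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(fragments: list[str], min_overlap: int = 4) -> list[str]:
    """Hypothetical helper: greedily merge fragments into contigs."""
    # Drop duplicates and fragments fully contained in another fragment.
    unique = list(dict.fromkeys(fragments))
    contigs = [f for f in unique if not any(f != g and f in g for g in unique)]

    while len(contigs) > 1:
        # Find the ordered pair (i, j) with the longest overlap.
        best_len, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b)
                    if k > best_len:
                        best_len, best_i, best_j = k, i, j
        if best_len < min_overlap:
            break  # assumption: shorter overlaps are treated as chance matches
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for n, c in enumerate(contigs) if n not in (best_i, best_j)]
        contigs.append(merged)
    return contigs


fragments = [
    "XXXAGCATGCTGCAG", "TCATGCTTAGGCTAXXXX",      # first shotgun sequence
    "XXXAGCATGCTGCAGTCATGC", "TTAGGCTAXXXX",      # second shotgun sequence
]
print(greedy_assemble(fragments))
# prints ['XXXAGCATGCTGCAGTCATGCTTAGGCTAXXXX']
```

The `min_overlap` threshold matters even in this toy case: the trailing XXXX of one fragment matches the leading XXX of another, and without a threshold or a longest-first rule such chance matches could join fragments in the wrong order.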