It consists of three algorithms: BWA-backtrack, BWA-SW and BWA-MEM. BWA is largely influenced by BWT-SW. Nonetheless, BWA-short usually has higher power to distinguish the optimal hit from many suboptimal hits. Each line consists of:

Col  Field  Description
1    QNAME  Query (pair) name
2    FLAG   Bitwise flag
3    RNAME
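As a minimal sketch of how the first three columns named above might be read, the Python snippet below splits one tab-delimited alignment line and decodes a few bits of the bitwise flag. The function names and the example line are hypothetical, and the bit meanings are taken from the SAM format specification rather than from the text above, so treat them as assumptions.

```python
# Bit values follow the SAM format specification (an assumption here;
# the text above only says that FLAG is a bitwise flag).
FLAG_BITS = {
    0x1: "paired",
    0x4: "unmapped",
    0x10: "reverse_strand",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
}


def parse_leading_fields(line):
    """Return (qname, flag, rname) from one tab-delimited alignment line."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 3:
        raise ValueError("expected at least three tab-separated columns")
    return fields[0], int(fields[1]), fields[2]


def decode_flag(flag):
    """List the names of the known bits that are set in flag."""
    return [name for bit, name in sorted(FLAG_BITS.items()) if flag & bit]


if __name__ == "__main__":
    # Made-up alignment line, for illustration only.
    example = "read1\t83\tchr1\t100\t60\t4M\t=\t50\t-54\tACGT\tIIII"
    qname, flag, rname = parse_leading_fields(example)
    print(qname, rname, decode_flag(flag))
```

Only a subset of the flag bits is listed; a real parser would cover every bit defined by the format.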
BWA is a software package for mapping low-divergent sequences against a large reference genome, such as the human genome.
The latest BWA-SW also works for paired-end reads longer than 100bp.
I first implemented the basic SMEM algorithm in the fastmap command for an experiment, then extended the basic algorithm and added the extension part in February 2013 to make BWA-MEM a fully featured mapper.
For mapping Illumina short-insert reads to the human genome,
This is not a bug.
Tag  Meaning
NM   Edit distance
MD   Mismatching positions/bases
AS   Alignment score
BC   Barcode sequence
X0   Number of best hits
X1   Number of suboptimal hits found by BWA
XN   Number of ambiguous bases in the reference
XM   Number of mismatches in the alignment
XO   Number of

This means BWA will be very slow if r, the error rate of the query sequences, is high, because in this case BWA has to visit hits with many differences, and looking for these hits is expensive. This feature makes it possible to integrate the forward and reverse-complemented genome in one FM-index, which speeds up both BWA-short and BWA-SW. Pairing is slower for shorter reads. Since version 0.6, BWA has been able to work with a reference genome longer than 4GB. It further performs Smith-Waterman alignment for unmapped reads to rescue reads with a high error rate, and for high-quality anomalous pairs to fix potential alignment errors. The first algorithm is designed for Illumina sequence reads up to 100bp, while the other two are for longer sequences ranging from 70bp to 1Mbp. While BWA-SW learns from BWT-SW, it introduces heuristics that can hardly be applied to the original algorithm.

(2009) Fast and accurate short read alignment with Burrows-Wheeler transform.
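The tags in the table above (NM, MD, AS, X0 and so on) appear as optional fields at the end of each alignment line. The sketch below shows one way to collect them into a dictionary, assuming the TAG:TYPE:VALUE layout from the SAM format specification, with type codes i for integers and f for floats. The function name and the example values are hypothetical.

```python
def parse_optional_tags(fields):
    """Map tag name -> value for fields shaped like TAG:TYPE:VALUE.

    The TAG:TYPE:VALUE layout is an assumption taken from the SAM
    format specification; only the 'i' and 'f' types are converted,
    everything else is kept as a string.
    """
    tags = {}
    for field in fields:
        # Split at most twice so values that contain ':' stay intact.
        tag, type_code, value = field.split(":", 2)
        if type_code == "i":
            value = int(value)
        elif type_code == "f":
            value = float(value)
        tags[tag] = value
    return tags


if __name__ == "__main__":
    # Made-up optional fields, for illustration only.
    optional = ["NM:i:1", "MD:Z:17A18", "AS:i:31", "X0:i:1", "X1:i:0"]
    tags = parse_optional_tags(optional)
    print(tags["NM"], tags["MD"], tags["X0"])
```

Looking tags up by name avoids depending on their order, which can differ from line to line.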