5.2 Glossary

5.2.1 Pairwise alignment (noun)

A hypothesis about which bases or amino acids in two biological sequences are derived from a common ancestral base or amino acid. By definition, the aligned sequences are of equal length, with gaps (usually denoted with -, or with . for terminal gaps) indicating hypothesized insertion or deletion events. A pairwise alignment may be represented as follows:

ACC---GTAC
CCCATCGTAG
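
As a minimal sketch of what this representation implies, the two gapped strings above can be checked for equal length and walked column by column. The function name `summarize_alignment` and the returned keys are hypothetical, chosen only for illustration; they are not part of any library.

```python
def summarize_alignment(aligned1, aligned2, gap_chars="-."):
    """Summarize a pairwise alignment given as two gapped strings.

    Hypothetical helper for illustration only. Treats '-' and '.' as gaps.
    """
    # By definition, aligned sequences must be the same length.
    if len(aligned1) != len(aligned2):
        raise ValueError("aligned sequences must be of equal length")

    # Each column pairs one position from each aligned sequence.
    columns = list(zip(aligned1, aligned2))

    # A column containing a gap represents a hypothesized insertion or deletion.
    gap_columns = sum(1 for a, b in columns if a in gap_chars or b in gap_chars)

    # A column with identical, non-gap characters is a match.
    matches = sum(1 for a, b in columns if a == b and a not in gap_chars)

    return {"length": len(columns), "gap_columns": gap_columns, "matches": matches}


summary = summarize_alignment("ACC---GTAC", "CCCATCGTAG")
print(summary)
```

Passing two strings of different lengths raises a `ValueError`, since such a pair cannot be a pairwise alignment under the definition above.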

5.2.2 kmer (noun)

A kmer is a word of length k (that is, a run of k adjacent characters) in a sequence. For example, the overlapping 5-mers in the sequence ACCGTGACCAGTTACCAGTTTGACCAA, along with the number of times each occurs, can be listed as follows:

In [1]:
import skbio
# Count every overlapping 5-mer (k=5) in the sequence.
skbio.DNA('ACCGTGACCAGTTACCAGTTTGACCAA').kmer_frequencies(k=5, overlap=True)


It is common for bioinformaticians to substitute the value of k for the letter k in the word kmer. For example, you might hear someone say "we identified all seven-mers in our sequence", meaning that they identified all kmers of length seven.
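
To make the definition concrete without relying on any library, the kmers of a sequence can be enumerated with a sliding window. This is only a sketch: the function name `kmers` and its `overlap` parameter are hypothetical and are not scikit-bio's API, although the `overlap` flag is meant to mirror the idea used in the example above.

```python
from collections import Counter


def kmers(sequence, k, overlap=True):
    """Return the kmers of `sequence` as a list of strings.

    Hypothetical helper for illustration only. With overlap=True the window
    advances one character at a time; otherwise it advances k characters.
    """
    if k < 1:
        raise ValueError("k must be a positive integer")
    step = 1 if overlap else k
    # Stop at the last position where a full-length window still fits.
    return [sequence[i:i + k] for i in range(0, len(sequence) - k + 1, step)]


seq = 'ACCGTGACCAGTTACCAGTTTGACCAA'

five_mers = kmers(seq, 5)                     # overlapping 5-mers
counts = Counter(five_mers)                   # how often each 5-mer occurs
non_overlapping = kmers(seq, 5, overlap=False)

print(len(five_mers))
print(counts.most_common(4))
print(non_overlapping)
```

A sequence of length n contains n - k + 1 overlapping kmers, so this 27-character sequence yields 23 overlapping 5-mers.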