Multiple sequence alignment manipulation

How to generate character bootstrap replicates from a FASTA-formatted multiple sequence alignment?

The following awk one-liner writes $nrep FASTA-formatted files, each containing one character bootstrap replicate generated from the initial multiple sequence alignment file $infile in FASTA format.

Random selection is controlled by the integer seed value $seed. Output files are named $basefile.r.fasta, where $basefile must be specified and r ranges from 1 to $nrep.
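
As a minimal sketch (not necessarily the original one-liner), the resampling described above could be written as follows. The shell function name char_bootstrap and its argument order are hypothetical, chosen only for illustration. The sketch assumes that all sequences have the same length, and it uses only POSIX awk features. The same seed gives the same replicates only with the same awk implementation.

```shell
# Hypothetical wrapper: char_bootstrap infile basefile nrep seed
char_bootstrap() {
  awk -v n="$3" -v seed="$4" -v base="$2" '
    /^>/ { h[++k] = $0; next }        # header line: start a new sequence
         { s[k] = s[k] $0 }           # sequence line: append (handles wrapped FASTA)
    END {
      srand(seed)
      L = length(s[1])                # alignment length (assumed equal for all sequences)
      for (r = 1; r <= n; r++) {
        out = base "." r ".fasta"
        # draw L column indices with replacement, shared by all sequences
        for (j = 1; j <= L; j++) c[j] = 1 + int(rand() * L)
        for (i = 1; i <= k; i++) {
          seq = ""
          for (j = 1; j <= L; j++) seq = seq substr(s[i], c[j], 1)
          print h[i] > out
          print seq > out
        }
        close(out)
      }
    }' "$1"
}
```

With the variables named above, it would be called as: char_bootstrap "$infile" "$basefile" "$nrep" "$seed". Each replicate is written with one unwrapped sequence line per header.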

[200503ac]

How to generate block bootstrap replicates from a FASTA-formatted multiple sequence alignment?

The following awk one-liner writes $nrep FASTA-formatted files, each containing one block bootstrap replicate generated from the initial multiple sequence alignment file $infile in FASTA format.

Blocks are non-overlapping and of size $bsize (e.g. $bsize = 3 for a codon bootstrap). Random selection is controlled by the integer seed value $seed. Output files are named $basefile.r.fasta, where $basefile must be specified and r ranges from 1 to $nrep.
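
As a minimal sketch (not necessarily the original one-liner), the block resampling described above could be written as follows. The shell function name block_bootstrap and its argument order are hypothetical. The sketch assumes that all sequences have the same length and that this length is a multiple of the block size; any trailing incomplete block is dropped, which is a choice made here and not something stated above.

```shell
# Hypothetical wrapper: block_bootstrap infile basefile nrep seed bsize
block_bootstrap() {
  awk -v n="$3" -v seed="$4" -v base="$2" -v b="$5" '
    /^>/ { h[++k] = $0; next }        # header line: start a new sequence
         { s[k] = s[k] $0 }           # sequence line: append (handles wrapped FASTA)
    END {
      srand(seed)
      nb = int(length(s[1]) / b)      # number of complete non-overlapping blocks
      for (r = 1; r <= n; r++) {
        out = base "." r ".fasta"
        # draw nb blocks with replacement; c[j] is the first column of the chosen block
        for (j = 1; j <= nb; j++) c[j] = 1 + b * int(rand() * nb)
        for (i = 1; i <= k; i++) {
          seq = ""
          for (j = 1; j <= nb; j++) seq = seq substr(s[i], c[j], b)
          print h[i] > out
          print seq > out
        }
        close(out)
      }
    }' "$1"
}
```

With the variables named above, it would be called as: block_bootstrap "$infile" "$basefile" "$nrep" "$seed" "$bsize". With a block size of 3 and an in-frame codon alignment, every resampled block is an intact codon column.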

[200503ac]

How to infer a consensus sequence from a FASTA-formatted multiple sequence alignment?

The following awk one-liner returns a consensus sequence from the FASTA-formatted multiple sequence alignment file $infile. The awk variable unk lists every character state that should be treated as unknown, separated by pipe symbols (|).
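
As a minimal sketch (not necessarily the original one-liner), a majority-rule consensus could be computed as follows. Several details are assumptions made for illustration: the shell function name consensus, the default unknown list -|?|N|X, the use of ? for columns that contain only unknown states, and the tie-breaking rule (the state that first reaches the highest count in file order wins). The sketch also assumes that all sequences have the same length.

```shell
# Hypothetical wrapper: consensus infile [unk]
consensus() {
  awk -v unk="${2:--|?|N|X}" '
    BEGIN {                           # build the set of unknown character states
      m = split(unk, u, "|")
      for (i = 1; i <= m; i++) skip[u[i]] = 1
    }
    /^>/ { k++; next }                # header line: start a new sequence
         { s[k] = s[k] $0 }           # sequence line: append (handles wrapped FASTA)
    END {
      L = length(s[1])
      for (j = 1; j <= L; j++) {
        best = "?"; max = 0           # "?" is kept if the column has only unknown states
        for (i = 1; i <= k; i++) {
          ch = substr(s[i], j, 1)
          if (ch in skip) continue
          if (++cnt[j, ch] > max) { max = cnt[j, ch]; best = ch }
        }
        cons = cons best
      }
      print cons                      # bare sequence, without a FASTA header
    }' "$1"
}
```

It would be called as: consensus "$infile" to use the assumed default list, or for example consensus "$infile" "-|?" to treat only gaps and question marks as unknown.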

[200503ac]