The following awk one-liner writes $nrep FASTA-formatted files, each containing one character bootstrap replicate generated from an initial multiple sequence alignment file $infile in FASTA format. Each replicate has the same length as the initial alignment and is built by sampling its aligned characters (columns) with replacement. Random selection can be controlled by the integer seed value $seed. Output file names are of the form $basefile.r.fasta, where $basefile should be specified and r varies from 1 to $nrep.
awk -v s="$seed" -v r="$nrep" -v f="$basefile" '/^>/{seq[n]=sn;lbl[++n]=$0;sn="";next} {sn=sn$0}
END{l=length(seq[n]=sn);srand(s);++r;
while(--r>0){i=c=0;while(++c<=l)b[c]=1+int(l*rand());++l;o=f"."r".fasta";
while(++i<=n){sb="";si=seq[i];c=l;while(--c>0)sb=sb""substr(si,b[c],1);print lbl[i]"\n"sb>>o}close(o);--l}}' "$infile"
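
For readers who want to follow the resampling step, the procedure described above can be sketched in a longer, commented form. This is only an illustrative rewrite under stated assumptions, not the one-liner itself: the toy alignment, the temporary directory and the awk variable names (len, col, rep, out) are hypothetical.

```shell
# Toy alignment (hypothetical data): 3 sequences of 8 aligned characters.
tmp=$(mktemp -d)
printf '>s1\nACGTACGT\n>s2\nACGTTCGA\n>s3\nACCTACGA\n' > "$tmp/toy.fasta"

seed=1; nrep=2; basefile="$tmp/boot"

awk -v s="$seed" -v r="$nrep" -v f="$basefile" '
/^>/ { lbl[++n] = $0; next }        # header line: start a new sequence
     { seq[n] = seq[n] $0 }         # sequence line: append (handles wrapped FASTA)
END {
  len = length(seq[1])              # alignment length
  srand(s)
  for (k = 1; k <= r; k++) {
    # draw len column indices in 1..len, uniformly and with replacement
    for (c = 1; c <= len; c++) col[c] = 1 + int(len * rand())
    out = f "." k ".fasta"
    for (i = 1; i <= n; i++) {
      rep = ""
      # the same sampled columns are used for every sequence
      for (c = 1; c <= len; c++) rep = rep substr(seq[i], col[c], 1)
      print lbl[i] "\n" rep > out
    }
    close(out)                      # one output file per replicate
  }
}' "$tmp/toy.fasta"

cat "$tmp/boot.1.fasta"
```

The sampled columns depend on the random number generator of the awk implementation in use, so the same seed may give different replicates with different awk versions.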
The following awk one-liner writes $nrep FASTA-formatted files, each containing one block bootstrap replicate generated from an initial multiple sequence alignment file $infile in FASTA format. Blocks are non-overlapping and of size $bsize (e.g. $bsize=3 to carry out a codon bootstrap); the alignment length must be a multiple of $bsize, otherwise an error message is printed. Random selection can be controlled by the integer seed value $seed. Output file names are of the form $basefile.r.fasta, where $basefile should be specified and r varies from 1 to $nrep.
awk -v s="$seed" -v r="$nrep" -v x="$bsize" -v f="$basefile" '/^>/{seq[n]=sn;lbl[++n]=$0;sn="";next} {sn=sn$0}
END{if((l=length(seq[n]=sn))%x!=0){print"alignment length ("l") is not a multiple of "x;exit 1}l/=x;srand(s);++r;
while(--r>0){i=c=0;while(++c<=l)b[c]=int(l*rand());++l;o=f"."r".fasta";
while(++i<=n){sb="";si=seq[i];c=l;while(--c>0)sb=sb""substr(si,1+x*b[c],x);print lbl[i]"\n"sb>>o}close(o);--l}}' "$infile"
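
The block version differs only in what is sampled: block indices instead of column indices. A minimal commented sketch follows, assuming a hypothetical two-sequence toy alignment of three codons; the variable names (nb, blk, rep, out) and the use of /dev/stderr for the error message are illustrative choices, not part of the one-liner.

```shell
# Toy coding alignment (hypothetical data): 2 sequences, 3 codons each.
tmp=$(mktemp -d)
printf '>s1\nATGAAACCC\n>s2\nATGAAGCCA\n' > "$tmp/codons.fasta"

seed=7; nrep=1; bsize=3; basefile="$tmp/cboot"

awk -v s="$seed" -v r="$nrep" -v x="$bsize" -v f="$basefile" '
/^>/ { lbl[++n] = $0; next }
     { seq[n] = seq[n] $0 }
END {
  len = length(seq[1])
  if (len % x != 0) {
    print "alignment length (" len ") is not a multiple of " x > "/dev/stderr"
    exit 1
  }
  nb = len / x                      # number of non-overlapping blocks
  srand(s)
  for (k = 1; k <= r; k++) {
    # draw nb block indices in 0..nb-1, with replacement
    for (c = 1; c <= nb; c++) blk[c] = int(nb * rand())
    out = f "." k ".fasta"
    for (i = 1; i <= n; i++) {
      rep = ""
      # block j starts at character 1 + x*j and spans x characters
      for (c = 1; c <= nb; c++) rep = rep substr(seq[i], 1 + x * blk[c], x)
      print lbl[i] "\n" rep > out
    }
    close(out)
  }
}' "$tmp/codons.fasta"

cat "$tmp/cboot.1.fasta"
```

With $bsize=3, every codon of a replicate is an intact codon of the initial alignment, which is what keeps the reading frame meaningful in a codon bootstrap.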
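The following awk one-liner prints a consensus sequence from the FASTA-formatted multiple sequence alignment file $infile. At each aligned position, the consensus character is the most frequent character state among the sequences, ignoring unknown states; the character ? is printed when every state at that position is unknown. Of note, the awk variable unk is the list of every character state that should be considered as unknown (separated by pipe symbols |).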
awk -v unk="?|-|X" '/^>/{seq[n]=sn;++n;sn="";next} {sn=sn$0}
END{l=length(seq[n]=sn);while(++s<=l){delete no;c="?";i=m=0;while(++i<=n)(!((r=substr(seq[i],s,1))~unk))&&(x=++no[r])>m&&(m=x)&&(c=r);printf c}print""}' "$infile"
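
The majority rule can be checked on a small example. The sketch below is a commented rewrite under stated assumptions, not the one-liner itself: the toy alignment is hypothetical, and unknown states are looked up in a set built by splitting unk on the pipe symbol (exact matching) instead of using unk as a regular expression.

```shell
# Toy alignment (hypothetical data): the last column holds only unknown states.
tmp=$(mktemp -d)
printf '>a\nACGT-A-\n>b\nACGTTC?\n>c\nAGG?TCX\n' > "$tmp/toy.fasta"

consensus=$(awk -v unk="?|-|X" '
BEGIN { k = split(unk, u, "|"); for (j = 1; j <= k; j++) skip[u[j]] = 1 }
/^>/  { ++n; next }
      { seq[n] = seq[n] $0 }
END {
  len = length(seq[1])
  for (p = 1; p <= len; p++) {
    split("", cnt)                  # empty the per-position counts
    best = "?"; max = 0
    for (i = 1; i <= n; i++) {
      ch = substr(seq[i], p, 1)
      if (ch in skip) continue      # unknown states are not counted
      if (++cnt[ch] > max) { max = cnt[ch]; best = ch }
    }
    printf "%s", best
  }
  print ""
}' "$tmp/toy.fasta")

echo "$consensus"
# prints: ACGTTC?
```

Because the comparison is strict, a tie is resolved in favour of the character state that reaches the highest count first when reading the sequences in file order.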