The following awk one-liner converts the FASTA-formatted sequence file $infile into the reformatted FASTA file $outfile:
The following one-liners wrap each sequence in the FASTA-formatted file $infile onto lines of at most $cutoff characters and write the result into the FASTA file $outfile.
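As an illustration of this wrapping step, here is a minimal awk sketch. It is not necessarily the original one-liner: the function name `wrap_fasta` is hypothetical, and the sketch assumes that header lines start with `>` and that whitespace inside sequence lines can be discarded.

```shell
# Hypothetical helper (not part of the original document).
# $1: input FASTA, $2: maximum line length, $3: output FASTA
wrap_fasta() {
  awk -v w="$2" '
    # Print the buffered sequence in chunks of at most w characters.
    function flush_seq(   i) {
      for (i = 1; i <= length(seq); i += w) print substr(seq, i, w)
      seq = ""
    }
    /^>/ { flush_seq(); print; next }        # header: flush previous record
    { gsub(/[ \t\r]/, ""); seq = seq $0 }    # sequence line: accumulate
    END { flush_seq() }                      # flush the last record
  ' "$1" > "$3"
}
```

With the variables used in this document, it could be called as `wrap_fasta "$infile" "$cutoff" "$outfile"`.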
Given a FASTA-formatted file $infile containing a nucleotide sequence, the following command lines will write its reverse-complement into the FASTA-formatted file $outfile.
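One possible way to compute the reverse-complement with awk alone is sketched below. This is an assumption-laden example rather than the original command lines: the function name `revcomp_fasta` is hypothetical, headers are copied unchanged, each output sequence is written on a single line, IUPAC ambiguity codes are complemented, and any other character is kept as-is.

```shell
# Hypothetical helper (not part of the original document).
# $1: input FASTA, $2: output FASTA
revcomp_fasta() {
  awk '
    BEGIN {
      # Complement map, including IUPAC ambiguity codes (U is treated like T).
      from = "ACGTURYSWKMBDHVNacgturyswkmbdhvn"
      to   = "TGCAAYRSWMKVHDBNtgcaayrswmkvhdbn"
      for (i = 1; i <= length(from); i++)
        comp[substr(from, i, 1)] = substr(to, i, 1)
    }
    # Write the buffered record, reading the sequence from its last character.
    function emit(   i, c, out) {
      if (hdr == "") return
      out = ""
      for (i = length(seq); i >= 1; i--) {
        c = substr(seq, i, 1)
        out = out ((c in comp) ? comp[c] : c)
      }
      print hdr
      print out
    }
    /^>/ { emit(); hdr = $0; seq = ""; next }
    { gsub(/[ \t\r]/, ""); seq = seq $0 }
    END { emit() }
  ' "$1" > "$2"
}
```

For example, `revcomp_fasta "$infile" "$outfile"` turns the sequence `AACGTN` into `NACGTT`.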
Given a string $name and two nucleotide indexes $start and $end, the following command lines search the FASTA-formatted file $infile for the first sequence containing the pattern $name, and then write into $outfile the subsequence delimited by $start and $end (both inclusive).
The program eFASTA is dedicated to the extraction of subsequences.
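The extraction can be sketched in awk as follows. Several points are assumptions made for illustration: the function name `subseq_fasta` is hypothetical, $name is matched as a literal substring of the header line (not as a regular expression), indexes are 1-based, and the original header is reused unchanged in the output.

```shell
# Hypothetical helper (not part of the original document).
# $1: input FASTA, $2: name, $3: start, $4: end (1-based, inclusive), $5: output FASTA
subseq_fasta() {
  awk -v name="$2" -v s="$3" -v e="$4" '
    /^>/ {
      if (found) exit                         # next record reached: jump to END
      if (index($0, name) > 0) { found = 1; hdr = $0 }
      next
    }
    found { gsub(/[ \t\r]/, ""); seq = seq $0 }   # accumulate the matched record
    END {
      if (found) { print hdr; print substr(seq, s, e - s + 1) }
    }
  ' "$1" > "$5"
}
```

A call such as `subseq_fasta "$infile" "$name" "$start" "$end" "$outfile"` writes an empty file when no header contains $name.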
Given a codon sequence inside the FASTA-formatted file $infile, the following command lines will translate it (based on the standard genetic code) and write the resulting amino acid sequence into the FASTA-formatted file $outfile.
Of note, any STOP codon will be translated into ?.
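A translation of this kind can be sketched in awk by building the standard genetic code table from the usual TCAG base ordering. The function name `translate_fasta` is hypothetical, and the sketch makes a few assumptions beyond the text: translation starts at the first nucleotide, U is read as T, a trailing incomplete codon is ignored, and any codon containing a character other than A, C, G or T is written as X. STOP codons are written as ?, as stated above.

```shell
# Hypothetical helper (not part of the original document).
# $1: input FASTA (codon sequences), $2: output FASTA (amino acid sequences)
translate_fasta() {
  awk '
    BEGIN {
      # Standard genetic code, codons enumerated in TCAG order; STOP is "?".
      b  = "TCAG"
      aa = "FFLLSSSSYY??CC?WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
      n = 0
      for (i = 1; i <= 4; i++)
        for (j = 1; j <= 4; j++)
          for (k = 1; k <= 4; k++)
            code[substr(b, i, 1) substr(b, j, 1) substr(b, k, 1)] = substr(aa, ++n, 1)
    }
    # Translate the buffered record codon by codon and write it.
    function emit(   i, codon, prot) {
      if (hdr == "") return
      seq = toupper(seq)
      gsub(/U/, "T", seq)
      prot = ""
      for (i = 1; i + 2 <= length(seq); i += 3) {
        codon = substr(seq, i, 3)
        prot = prot ((codon in code) ? code[codon] : "X")   # X: assumption
      }
      print hdr
      print prot
    }
    /^>/ { emit(); hdr = $0; seq = ""; next }
    { gsub(/[ \t\r]/, ""); seq = seq $0 }
    END { emit() }
  ' "$1" > "$2"
}
```

For instance, `translate_fasta "$infile" "$outfile"` translates `ATGGCCTAA` into `MA?`.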
The program eFASTA can also be used for translating a codon sequence.
eFASTA -f $infile -c $(grep ">" $infile):1-$(grep -Ev "^>|^$" $infile | tr -d '\n' | wc -m) -o $outfile -cds ;
rm -f $outfile.fna ;
mv $outfile.faa $outfile ;