The following awk one-liner converts the FASTA-formatted sequence file $infile into the reformatted FASTA file $outfile:
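As a minimal sketch, assuming that the reformatting means writing each sequence on a single line below its header, an awk command along the following lines could be used. The sample input and the temporary file names are hypothetical and are created only to make the example self-contained.

```shell
# Hypothetical sample input (assumption: not part of the original recipe).
infile=$(mktemp); outfile=$(mktemp)
printf '>seq1 test\nACGT\nTTGA\n>seq2\nGG\nCC\n' > "$infile"

# Assumed reformatting: one header line followed by one sequence line.
awk '/^>/ { if (n++) printf "\n"; print; next }   # header: close the previous sequence line, then print
     { printf "%s", $0 }                          # sequence line: append without a line break
     END { if (n) printf "\n" }                   # terminate the last sequence line
    ' "$infile" > "$outfile"

cat "$outfile"
```

With the sample input, the two wrapped records are written as two header lines, each followed by a single sequence line.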
The following one-liners break each sequence inside the FASTA-formatted file $infile into lines of at most $cutoff characters, which are written into the FASTA file $outfile.
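One possible way to do this with awk is sketched below. It buffers each sequence and prints it in chunks of $cutoff characters; the helper function wrap, the sample input and the value cutoff=4 are illustrative assumptions.

```shell
# Hypothetical sample input and cutoff (assumptions for illustration).
infile=$(mktemp); outfile=$(mktemp); cutoff=4
printf '>seq1\nACGTACGTAC\n>seq2\nGGG\nCC\n' > "$infile"

awk -v w="$cutoff" '
  function wrap(   i) {              # print the buffered sequence in chunks of w characters
    for (i = 1; i <= length(seq); i += w) print substr(seq, i, w)
    seq = ""
  }
  /^>/ { wrap(); print; next }       # header: flush the previous sequence, then print the header
  { seq = seq $0 }                   # sequence line: append to the buffer
  END { wrap() }                     # flush the last sequence
' "$infile" > "$outfile"

cat "$outfile"
```

Because the sequence is buffered before being split, the result does not depend on how the input lines were originally wrapped.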
Given a FASTA-formatted file $infile containing a nucleotide sequence, the following command lines will write its reverse-complement into the FASTA-formatted file $outfile.
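A minimal awk sketch of the reverse-complement is given below, under two assumptions: $infile holds a single sequence, and only A, C, G and T (upper or lower case) are complemented, any other character such as N being copied unchanged. The sample input is hypothetical.

```shell
# Hypothetical sample input (assumption: a single-sequence FASTA file).
infile=$(mktemp); outfile=$(mktemp)
printf '>seq1\nAACG\nTTN\n' > "$infile"

awk 'BEGIN { n = split("A T C G G C T A a t c g g c t a", p, " ")
             for (i = 1; i < n; i += 2) comp[p[i]] = p[i + 1] }   # complement table
     /^>/ { if (!hdr) hdr = $0; next }                            # keep the first header
     { seq = seq $0 }                                             # join the sequence lines
     END {
       print hdr
       for (i = length(seq); i >= 1; i--) {                       # walk the sequence backwards
         c = substr(seq, i, 1)
         printf "%s", ((c in comp) ? comp[c] : c)                 # complement known bases only
       }
       printf "\n"
     }' "$infile" > "$outfile"

cat "$outfile"
```

The result is written on a single line; it can be wrapped afterwards if a fixed line length is needed.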
Given a string $name and two nucleotide indexes (i.e. $start and $end), the following command lines will search for the first sequence whose header contains the pattern $name inside the FASTA-formatted file $infile, and then write into $outfile its subsequence determined by $start and $end (both inclusive).
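The extraction can be sketched with awk as follows, assuming that $name is matched as a literal substring of the header line and that $start and $end are 1-based positions. The sample input and the values of name, start and end are hypothetical.

```shell
# Hypothetical sample input and parameters (assumptions for illustration).
infile=$(mktemp); outfile=$(mktemp); name=seq2; start=3; end=6
printf '>seq1\nAAAA\n>seq2 sample\nACGTAC\nGTTT\n' > "$infile"

awk -v name="$name" -v s="$start" -v e="$end" '
  /^>/ { if (found) exit                                # header after the match: stop reading
         if (index($0, name)) { found = 1; hdr = $0 }   # first header containing name
         next }
  found { seq = seq $0 }                                # collect the matched sequence
  END { if (found) { print hdr; print substr(seq, s, e - s + 1) } }
' "$infile" > "$outfile"

cat "$outfile"
```

If no header contains $name, nothing is written to $outfile.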
The program eFASTA is dedicated to the extraction of subsequences.
Given a codon sequence inside the FASTA-formatted file $infile, the following command lines will translate it (based on the standard genetic code) and write the resulting amino acid sequence into the FASTA-formatted file $outfile. Of note, any STOP codon will be translated into ?.
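A minimal awk sketch of such a translation is shown below. It assumes a single DNA sequence read in frame 1, ignores an incomplete trailing codon, and writes X for any codon containing a character other than A, C, G or T; these choices, like the sample input, are assumptions made for illustration. The 64-letter string lists the standard genetic code in TCAG order, with the three STOP codons encoded as ?.

```shell
# Hypothetical sample input (assumption: a single coding sequence in frame 1).
infile=$(mktemp); outfile=$(mktemp)
printf '>cds1\nATGGCC\nTGGTAA\n' > "$infile"

awk 'BEGIN { aa = "FFLLSSSSYY??CC?WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG" }
     /^>/ { if (!hdr) hdr = $0; next }                  # keep the first header
     { seq = seq toupper($0) }                          # join the sequence lines
     END {
       print hdr
       for (i = 1; i + 2 <= length(seq); i += 3) {      # one codon per iteration
         a = index("TCAG", substr(seq, i, 1)) - 1
         b = index("TCAG", substr(seq, i + 1, 1)) - 1
         c = index("TCAG", substr(seq, i + 2, 1)) - 1
         printf "%s", ((a < 0 || b < 0 || c < 0) ? "X" : substr(aa, 16 * a + 4 * b + c + 1, 1))
       }
       printf "\n"
     }' "$infile" > "$outfile"

cat "$outfile"
```

For the sample input, the codons ATG, GCC, TGG and TAA are translated into M, A, W and ? respectively.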
The program eFASTA can also be used for translating a codon sequence:
eFASTA -f $infile -c $(grep ">" $infile):1-$(grep -Ev "^>|^$" $infile | tr -d '\n' | wc -m) -o $outfile -cds ;
rm -f $outfile.fna ;
mv $outfile.faa $outfile ;