Conserved Orthologs: Multiple Alignments Pipeline
For example, suppose we have run the Codon Usage Pipeline
on two sets of BLAST reports: soybean versus Arabidopsis and tomato versus Arabidopsis. This produces two sets of alignments:
soybean - Arabidopsis and tomato - Arabidopsis. We may then want to find the sequences (alignments) that overlap between the two sets, that is,
to generate multiple alignments of the overlapping regions, soybean - tomato - Arabidopsis, for further analysis.
To generate these multiple alignments we will use four files produced
by the Codon Usage Pipeline:
(contains soybean fragments of sequences corresponding to alignments to Arabidopsis)
(Arabidopsis fragments of sequences corresponding to alignments to soybean)
(contains tomato fragments of sequences corresponding to alignments to Arabidopsis)
(Arabidopsis fragments of sequences corresponding to alignments to tomato)
Then we run the overlap_finder_017.py script on these four files.
The script asks in which order to input the sequences. Running it generates a text file, overlapping_seqs.txt, containing all triple alignments [soybean - tomato - Arabidopsis]
whose common overlap is greater than 60 nucleotides, and a directory, overlapping_seqs.dir, in which each alignment is
stored as a separate file. We can rename the file and directory to more meaningful names, for example:
The output files can be downloaded by mouse click and examined.
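The 60-nucleotide criterion can be illustrated with a minimal sketch. This is not the code of overlap_finder_017.py, whose internals are not described here. It assumes, hypothetically, that each pairwise alignment is described by start and end coordinates (1-based, inclusive) on the same Arabidopsis sequence. The names common_overlap, passes_threshold and MIN_OVERLAP are illustrative only, and the coordinates in the example are made up.

```python
# Hypothetical illustration; not taken from overlap_finder_017.py.
MIN_OVERLAP = 60  # from the text: common overlap must be greater than 60 nt


def common_overlap(a_start, a_end, b_start, b_end):
    """Return the length of the region shared by two inclusive intervals.

    Assumption: coordinates are 1-based, inclusive positions on the same
    Arabidopsis sequence, with start <= end for each interval.
    """
    start = max(a_start, b_start)
    end = min(a_end, b_end)
    # Disjoint intervals give a negative span, which is clamped to zero.
    return max(0, end - start + 1)


def passes_threshold(a_start, a_end, b_start, b_end, min_overlap=MIN_OVERLAP):
    """True if the shared region is strictly longer than min_overlap."""
    return common_overlap(a_start, a_end, b_start, b_end) > min_overlap


# Made-up example: the soybean alignment covers Arabidopsis positions 101-400,
# the tomato alignment covers positions 350-600.
print(common_overlap(101, 400, 350, 600))    # 51 (positions 350-400)
print(passes_threshold(101, 400, 350, 600))  # False, since 51 is not > 60
```

Under these assumptions, a soybean - tomato - Arabidopsis triple would be reported only when the two Arabidopsis intervals share more than 60 positions; a shared region of exactly 60 nucleotides would be rejected.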
Note that the overlap_finder_017.py script will work only if all four
input files were produced by the Codon Usage Pipeline.
The script makes many assumptions about the data structure, and these hold only if all steps were performed according to the described protocols.
last modified: December 19 2003