June 2007

Prunus_ESTs_June_01_2007.cds._180_.fasta - ESTs (4,852 sequences) - COS candidates

Prunus_ESTs_June_01_2007.cds._180_.assembly - assembly (1,758 unigenes) - COS candidates

Py_ContigViewer_Uni_2007_06_07_Prunus_cos.py - contig viewer

tcl_blast_parser_123_V025S.tcl - BLAST parser

ASSEMBLY INFO:

Rosaceae_ESTs_Prunus_spc_June_01_2007.cds._180_.cap3.out.Info - Assembly Info

Rosaceae_ESTs_Prunus_spc_June_01_2007.cds._180_.cap3.out.Complexity1.txt - Contig complexity 1

Rosaceae_ESTs_Prunus_spc_June_01_2007.cds._180_.cap3.out.Complexity2.txt - Contig complexity 2

=== === ===

December 2007

rosaceae_sequences_412832_Dec_2007.Clean.fasta - all 412,827 Rosaceae ESTs from NCBI GenBank, as of December 2007

rosaceae_sequences_412832_Dec_2007.species_code.txt - species codes (nine-letter prefix)

A_thaliana_ATGC_2006_08_24.protein.COS_STRICT.fasta - Arabidopsis single copy genes

rosaceae_sequences_412832_Dec_2007.Clean.COS.CDS.fasta - COS candidate ESTs (30,801 out of 412,827 total)

rosaceae_sequences_412832_Dec_2007.Clean.COS.CDS.assembly - CAP3 assembly (7,573 unigenes; 3,820 contigs and 3,753 singlets)
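
The counts quoted for the December 2007 files can be cross-checked with a few lines of Python. This is only an arithmetic sketch using the numbers listed above; the variable names are illustrative and are not taken from any of the scripts in this directory.

```python
# Numbers copied from the December 2007 entries above.
total_ests = 412_827   # all Rosaceae ESTs
cos_ests = 30_801      # ESTs kept as COS candidates
contigs = 3_820        # CAP3 contigs
singlets = 3_753       # CAP3 singlets

# Unigenes are the contigs plus the singlets.
unigenes = contigs + singlets
# Share of all ESTs that were kept as COS candidates.
cos_fraction = cos_ests / total_ests

print(f"unigenes: {unigenes}")
print(f"COS candidates: {cos_fraction:.2%} of all ESTs")
```

The unigene total should match the 7,573 given for the CAP3 assembly.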


ASSEMBLY INFO:

rosaceae_sequences_412832_Dec_2007.Clean.COS.CDS.cap3.out.Info - info about assembly (how many ESTs per contig)

rosaceae_sequences_412832_Dec_2007.Clean.COS.CDS.cap3.out.Complexity1.txt - Contig complexity 1

rosaceae_sequences_412832_Dec_2007.Clean.COS.CDS.cap3.out.Complexity2.txt - Contig complexity 2

rosaceae_sequences_412832_Dec_2007.Clean.COS.CDS.cap3.out.clip - CAP3 clip info

BLAST INFO:

rosaceae_sequences_412832_Dec_2007.Clean.COS.CDS.assembly.vs.ATH.info1 - assembly BLAST versus Arabidopsis (first best hit only)

rosaceae_sequences_412832_Dec_2007.Clean.COS.CDS.assembly.vs.ATH.info2 - assembly BLAST versus Arabidopsis (all hits)

rosaceae_sequences_412832_Dec_2007.Clean.COS.CDS.header.uniq - list of the distinct species represented in the assembly

Py_ContigViewer_Uni_2007_06_07_Rosaceae_cos.py - contig viewer configured to view Rosaceae COS assembly


RosCOS FINAL SELECTION AND QC BLAST:

rosaceae_sequences_412832_Dec_2007.Clean.COS.CDS.assembly.vs.ATH_COS - BLAST-X versus Arabidopsis single copy genes

rosaceae_sequences_412832_Dec_2007.Clean.COS.CDS.assembly.vs.ATH_COS.info1 - parsed BLAST-X output

rosaceae_sequences_412832_Dec_2007.Clean.COS.CDS.assembly.selected_1041 - selected RosCOS set (file name says 1041; the set is referred to as 1039 elsewhere in this list)

rosaceae_sequences_412832_Dec_2007.Clean.COS.CDS.assembly.selected_1041.vs.ATH_COS - BLAST-X of the selected 1039 set versus ATH_SCG

rosaceae_sequences_412832_Dec_2007.Clean.COS.CDS.assembly.selected_1041.vs.ATH_COS.info1 - parsed BLAST-X output


A_thaliana_ATGC_2006_08_24.protein.COS_STRICT.vs.RosCOS.TBlastN - ATH_SCG TBlastN versus RosCOS

A_thaliana_ATGC_2006_08_24.protein.COS_STRICT.vs.RosCOS.TBlastN.all_hits - ATH_SCG TBlastN versus RosCOS - parsed all_hits table

A_thaliana_ATGC_2006_08_24.protein.COS_STRICT.vs.RosCOS.TBlastN.blast_stat - ATH_SCG TBlastN versus RosCOS - blast_stat table - number of hits per query

A_thaliana_ATGC_2006_08_24.protein.COS_STRICT.vs.RosCOS.TBlastN.query_overlap - ATH_SCG TBlastN versus RosCOS - query overlap info for multiple hits
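
The blast_stat table reports the number of hits per query. A table of that kind can be derived from an all-hits table roughly as sketched below. The layout assumed here (tab-delimited, query ID in the first column, '#' for comment lines) is an assumption and not the documented format of the all_hits file; the function name hits_per_query and the example IDs are made up for illustration.

```python
from collections import Counter
import csv
import io


def hits_per_query(handle, query_col=0):
    """Count rows (hits) per query ID in a tab-delimited table.

    Assumes one hit per row; blank lines and '#' comment lines are skipped.
    """
    counts = Counter()
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue
        counts[row[query_col]] += 1
    return counts


# Made-up rows standing in for an all-hits table (placeholder IDs).
example = io.StringIO(
    "QUERY_1\tContig12\n"
    "QUERY_1\tContig40\n"
    "QUERY_2\tContig7\n"
)
counts = hits_per_query(example)
for query, n in sorted(counts.items()):
    print(query, n)
```

Queries with a count above one are the ones for which the query_overlap information on multiple hits is relevant.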


Lists of unique Arabidopsis genes in selected subsets:

GO_ATH_Count_Arabidopsis_SCG.IDs - list of all 3790 Arabidopsis single copy genes (SCG)

GO_ATH_Count_RosCOS_Assembly_All.IDs - list of all 2324 Arabidopsis genes in RosCOS assembly

GO_ATH_Count_RosCOS_Set_1039.IDs - list of all 901 Arabidopsis genes in selected 1039 RosCOS set

=== === ===

June 05 2008

prunus_malus_binary_080605.ESTs.fasta - Prunus and Malus ESTs used in the assembly. Each Prunus EST is included twice; the two copies share the same ID and are distinguished by the suffix '.C' or '.D'

prunus_malus_binary_080605.assembly - binary Prunus/Malus assembly. Note that contigs made up of just the two copies ('.C' and '.D') of a single Prunus EST are actually singletons, a side effect of duplicating the Prunus ESTs

prunus_malus_binary_080605.assembly.Info - Info about assembly

prunus_malus_binary_080605.assembly.Info.Complexity1A.txt - Contig complexity 1A (binary Prunus/Malus)

prunus_malus_binary_080605.assembly.Info.Complexity1B.txt - Contig complexity 1B per species/genotype

prunus_malus_binary_080605.assembly.Info.Complexity2.txt - Contig complexity 2

prunus_malus_binary_080605.assembly.vs.A_thaliana.BlastX.info1 - assembly BLAST versus Arabidopsis (first best hit only)

prunus_malus_binary_080605.assembly.vs.A_thaliana.BlastX.info2 - assembly BLAST versus Arabidopsis (all hits)
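
Because the Prunus ESTs were duplicated with '.C' and '.D' suffixes, two-member contigs in the binary assembly that hold only the two copies of one EST should be counted as singletons. The sketch below shows one way to flag them. It assumes contig membership is already available as a mapping from contig name to a list of EST IDs; that mapping, the function names and the example IDs are hypothetical and do not reflect the actual layout of the .assembly or .Info files.

```python
def base_id(est_id):
    """Strip the '.C' / '.D' duplication suffix, if present."""
    if est_id.endswith((".C", ".D")):
        return est_id[:-2]
    return est_id


def is_duplicated_singleton(members):
    """True if a contig holds only the .C and .D copies of one Prunus EST."""
    return (
        len(members) == 2
        and {m[-2:] for m in members} == {".C", ".D"}
        and base_id(members[0]) == base_id(members[1])
    )


# Made-up membership table (placeholder contig names and EST IDs).
contigs = {
    "Contig1": ["EST_0001.C", "EST_0001.D"],              # one EST, two copies
    "Contig2": ["EST_0002.C", "EST_0002.D", "EST_0100"],  # real contig
    "Contig3": ["EST_0003.C", "EST_0004.D"],              # two different ESTs
}
pseudo = [name for name, members in contigs.items()
          if is_duplicated_singleton(members)]
print(pseudo)
```

In this example only Contig1 is flagged; the other two contain more than one distinct EST.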

=== === ===