Brief Description

The Compositae Genome Project Database (CGPDB) is a source of comprehensive information about 135,000 lettuce and sunflower ESTs. All data displayed in this database are the result of the joint efforts of the Compositae Genome Initiative. The database contains detailed information about individual EST reads as well as about EST assemblies. The Contig Viewer provides details about EST assemblies as well as polymorphic sites between different genotypes.

Particular clones can be ordered via the Arizona Genome Center. Note that you need to provide the correct AGI (Arizona Genome Institute) library ID and GenBank accession number.

Details about library construction can be found here.


Database Structure

Currently the database contains the following tables:

  • Lettuce_Contigs_ID, Sunflower_Contigs_ID - lists of contigs produced by CAP3 assembly. The tables show which ESTs were used to assemble a particular contig and the size of the fragment overlaps. By clicking on Contig_ID you can view the actual sequence alignment.

  • Lettuce_Info, Sunflower_Info - tables contain information about TAGs for each EST read. If an EST read belongs to a contig, the contig is given in the column "Cluster_ID". The column "Seqs_N" shows how many other EST reads belong to a given contig. By clicking on "EST_ID" you can view the actual chromatogram. Note that the chromatogram viewer displays untrimmed, unmasked raw reads. The chromatogram viewer is written in Java and works with Netscape 7 and IE 6.

  • lettuce_all_reads_trimmed, sunflower_all_reads_trimmed - The actual set of sequences used for contig assembly, after trimming, visual inspection and additional semi-manual trimming. By clicking on "Sequence_ID" you can view the actual chromatogram.

  • lettuce_assembly, sunflower_assembly - CAP3 assembly of trimmed sequences. The tables contain information about all contigs and singletons (the unigene set). In other words, these are non-redundant datasets.

  • lettuce_assembly_blastX, sunflower_assembly_blastX - Results of a blastX search of the unigene set against the nr protein database at NCBI. The 24 best hits (if found) are shown for each contig or singleton. Normalized expectation is: (-log(exp))/100

  • lettuce_vs_ath_TIGR, sunflower_vs_ath_TIGR - Results of a blastx search of the unigene set against the Arabidopsis TIGR database (predicted ORFs). The 48 best hits (if found) are shown for each contig or singleton. Normalized expectation is: (-log(exp))/100

  • lettuce_pfam, sunflower_pfam - Results of hmmrsearch for Pfam domains in the translated regions of the assemblies.

  • lettuce_clustering, sunflower_clustering - Results of a tblastx search of the EST assembly against itself. Two sequences are considered linked if they share at least 40% identity over a sequence overlap of at least 100 amino acids (based on the blast translation). Clustering was done using the tcl_blast_parser_123_V007.tcl and Graph9 programs. An example of the clustering visualization can be found here.
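
The linking rule used for the clustering tables can be illustrated with a short sketch. This is not the tcl_blast_parser_123_V007.tcl or Graph9 code; it is a minimal Python example that assumes clusters are the connected components of the "linked" graph and that each hit is available as a (query, subject, identity_percent, overlap_aa) tuple. The function name, constants and tuple layout are hypothetical.

```python
from collections import defaultdict

MIN_IDENTITY = 40.0  # percent identity threshold quoted in the table description
MIN_OVERLAP = 100    # minimum overlap in amino acids quoted in the description


def cluster(hits):
    """Group sequences into clusters of transitively linked sequences.

    `hits` is an iterable of (query, subject, identity_percent, overlap_aa)
    tuples. This layout is an assumption made for the example.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for query, subject, identity, overlap in hits:
        # Register both sequences so unlinked ones still form their own cluster.
        root_q, root_s = find(query), find(subject)
        linked = identity >= MIN_IDENTITY and overlap >= MIN_OVERLAP
        if linked and root_q != root_s:
            parent[root_q] = root_s

    groups = defaultdict(set)
    for seq in parent:
        groups[find(seq)].add(seq)
    # Largest clusters first, members in alphabetical order.
    return sorted((sorted(g) for g in groups.values()),
                  key=lambda g: (-len(g), g))


example_hits = [
    ("QG_CA_Contig1", "QG_CA_Contig2", 55.0, 120),  # linked
    ("QG_CA_Contig2", "QG_CA_Contig3", 41.0, 100),  # linked (at both thresholds)
    ("QG_CA_Contig3", "QG_CA_Contig4", 80.0, 60),   # overlap too short
    ("QG_CA_Contig5", "QG_CA_Contig5", 100.0, 300), # self hit only
]
clusters = cluster(example_hits)
```

With these made-up hits, Contig1, Contig2 and Contig3 end up in one cluster, while Contig4 and Contig5 each stay on their own.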

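The normalized expectation quoted for the blastX and TIGR tables, (-log(exp))/100, can be computed as in the sketch below. The base of the logarithm is not stated on this page; base 10 is assumed here. The floor applied to an expectation of exactly 0 is also an assumption, added only to keep the logarithm defined.

```python
import math


def normalized_expectation(e_value, floor=1e-300):
    """Return (-log10(E)) / 100 for a BLAST expectation value E.

    Base 10 and the `floor` used for E == 0 are assumptions of this
    example, not values documented for the database.
    """
    if e_value < 0:
        raise ValueError("expectation value must be non-negative")
    clipped = max(e_value, floor)
    return -math.log10(clipped) / 100.0


strong_hit = normalized_expectation(1e-50)
weak_hit = normalized_expectation(1.0)
```

Under these assumptions an expectation of 1e-50 maps to about 0.5 and an expectation of 1 maps to 0, so stronger hits get larger values.
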


Contig Viewer navigation:



Meaning of EST designation

Contigs:
QG_CA_Contig#### or QH_CA_Contig#### - "QG" stands for Lettuce dataset, "QH" stands for Sunflower dataset.

Lettuce EST reads:
QGA QGB QGC QGD QGI = QG_(A,B,C,D,I) libraries - cultivated lettuce, Salinas (multiple tissues and growth conditions identified by TAG IDs)

QGE QGF QGG QGH QGJ = QG_(E,F,G,H,J) libraries - wild lettuce, Lactuca serriola (multiple tissues and growth conditions identified by TAG IDs)


Sunflower EST reads:
QHA QHB QHC QHD QHI = QH_(A,B,C,D,I) libraries - sunflower RHA801 (multiple tissues and growth conditions identified by TAG IDs)

QHE QHF QHG QHH QHJ = QH_(E,F,G,H,J) libraries - sunflower RHA280 (multiple tissues and growth conditions identified by TAG IDs)

QHK QHL - Helianthus paradoxus (seedling, root, leaf and flower tissues) (no TAG IDs are available)
QHK - normal growth conditions
QHL - salt stress

QHM QHN - Helianthus argophyllus (seedlings, root and leaf tissues) (no TAG IDs are available)
QHM - normal growth conditions
QHN - drought stress
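
The naming scheme above can be turned into a small lookup, sketched here in Python. The sketch assumes that an EST read ID begins with its three-letter library code; the read ID used in the example and the names EST_SOURCES and describe are hypothetical.

```python
import re

# Library code -> (dataset, origin), transcribed from the designations above.
EST_SOURCES = {}
for letter in "ABCDI":
    EST_SOURCES["QG" + letter] = ("lettuce", "cultivated lettuce, Salinas")
    EST_SOURCES["QH" + letter] = ("sunflower", "sunflower RHA801")
for letter in "EFGHJ":
    EST_SOURCES["QG" + letter] = ("lettuce", "wild lettuce, Lactuca serriola")
    EST_SOURCES["QH" + letter] = ("sunflower", "sunflower RHA280")
EST_SOURCES["QHK"] = ("sunflower", "Helianthus paradoxus, normal growth conditions")
EST_SOURCES["QHL"] = ("sunflower", "Helianthus paradoxus, salt stress")
EST_SOURCES["QHM"] = ("sunflower", "Helianthus argophyllus, normal growth conditions")
EST_SOURCES["QHN"] = ("sunflower", "Helianthus argophyllus, drought stress")

CONTIG_RE = re.compile(r"^(QG|QH)_CA_Contig(\d+)$")


def describe(identifier):
    """Classify a contig ID or an EST read ID by its prefix."""
    match = CONTIG_RE.match(identifier)
    if match:
        dataset = "lettuce" if match.group(1) == "QG" else "sunflower"
        return {"kind": "contig", "dataset": dataset,
                "number": int(match.group(2))}
    # Assumption: a read ID starts with the three-letter library code.
    code = identifier[:3].upper()
    if code not in EST_SOURCES:
        raise ValueError("unrecognized identifier: %r" % identifier)
    dataset, origin = EST_SOURCES[code]
    return {"kind": "EST read", "dataset": dataset,
            "library": code, "origin": origin}


contig_info = describe("QH_CA_Contig1234")
read_info = describe("QGE_example_read")  # hypothetical read ID
```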


TAG IDs description:

Lettuce:
TAG0 - callus
TAG1 - roots
TAG2 - none
TAG3 - flowers pre-fertilized
TAG4 - flowers post-fertilized
TAG5 - chemical induction
TAG6 - none
TAG7 - roots environmental stress
TAG8 - shoots environmental stress
TAG9 - germinating seeds
TAG10 - flowers environmental stress
TAG11 - leaves dark grown


Sunflower:
TAG0 - callus
TAG1 - roots
TAG2 - disk and ray flowers
TAG3 - flowers pre-fertilized
TAG4 - developing kernel
TAG5 - chemical induction
TAG6 - none
TAG7 - roots environmental stress
TAG8 - shoots environmental stress
TAG9 - germinating seeds
TAG10 - flowers environmental stress
TAG11 - hulls
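
Because TAG2, TAG4 and TAG11 mean different things in the two datasets, a lookup has to be keyed by species as well as by TAG number. The Python sketch below transcribes the two lists above; the names SHARED_TAGS, TAGS and tag_description are hypothetical. The QHK, QHL, QHM and QHN libraries have no TAG IDs and are not covered.

```python
# TAG meanings that are identical in the lettuce and sunflower lists.
SHARED_TAGS = {
    0: "callus",
    1: "roots",
    3: "flowers pre-fertilized",
    5: "chemical induction",
    6: "none",
    7: "roots environmental stress",
    8: "shoots environmental stress",
    9: "germinating seeds",
    10: "flowers environmental stress",
}

# Species-specific entries are layered on top of the shared ones.
TAGS = {
    "lettuce": dict(SHARED_TAGS, **{}),
    "sunflower": dict(SHARED_TAGS, **{}),
}
TAGS["lettuce"].update({2: "none", 4: "flowers post-fertilized",
                        11: "leaves dark grown"})
TAGS["sunflower"].update({2: "disk and ray flowers", 4: "developing kernel",
                          11: "hulls"})


def tag_description(species, tag):
    """Return the tissue or condition for a TAG given as 'TAG4', 'tag4' or 4."""
    text = str(tag).upper()
    if text.startswith("TAG"):
        text = text[3:]
    return TAGS[species.lower()][int(text)]


lettuce_tag4 = tag_description("lettuce", "TAG4")
sunflower_tag4 = tag_description("sunflower", "TAG4")
```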




email: Alexander Kozik
email: Richard Michelmore

Last modified, December 11 2003