Detailed View Window
Results for each record are presented in a detailed view window composed of multiple tabs displaying different sections of the information for each entry.
The Basic Info tab (see Figure 2D) contains the genomic coordinates of the feature based on the current sequence release. Coordinates for older releases can be obtained using the “previous coordinates” button. For RCs, the RC attributes (has_expression, is_CRM, is_minimized) are listed. Other information contained in the Basic Info tab includes the species (currently only D. melanogaster); the name of the associated gene(s), with links to FlyBase and FlyMine and, for TFBSs, FlyTF records; and links to the FlyBase Gbrowse and UCSC genome browsers. Note that because TFBSs and CRMs are not strand-specific sequence features, no strand information is reflected in the graphical views. When accessing the FlyBase Gbrowse genome browser we have occasionally experienced a timeout error and are working to diagnose the cause. The REDfly ID of the record and the date of the last update are also provided.
The Location tab (see Figure 2E) provides a snapshot of the genomic region taken from FlyBase Gbrowse and displays genes, transcripts, and CRMs. TFBSs, inferred CRMs, or new CRM annotations not yet in FlyBase are not currently displayed; however, a yellow vertical bar marks the position of the current feature. The position of the feature relative to transcripts of the associated gene is provided above the graphic.
The Images tab (RCs/CRMs only; see Figure 2F) shows the expression pattern of the reporter gene. A subset of these images is provided courtesy of FlyExpress; clicking on one of these takes the user to the FlyExpress website, from which a search can be initiated for other genes with a similar expression pattern. Images are currently available for only a subset of REDfly records. In many cases where no image is available, the number of the figure showing the RC in the published report is provided instead.
The Citation/Evidence tab displays the reference and PubMed ID and links to the PubMed record for the current annotation. The name of the REDfly curator responsible for annotating this feature is also provided. This tab also provides the sequence source terms and the evidence for the feature.
Sequence Source Terms: Many older references do not provide exact sequence referents (e.g., genome coordinates, PCR primer sequences, GenBank IDs). Most often, sequence ranges are given as restriction maps. Because sequence polymorphisms between the clones used by researchers and the published genome sequence can lead to gain or loss of restriction sites and thus affect our determination of the reported sequence, we differentiate between those sequences unambiguously provided in the reference or through communication with the authors and those inferred from restriction maps. In those places where we were unable to locate a referenced restriction site or where sizes of the restriction fragments were not well matched with the reported sizes, we list the sequence end as "estimated/uncertain." In time, we hope to reconcile all ambiguities through communication with the authors.
Sequences reported as "inferred from restriction map" use as endpoints the first nucleotide of the restriction site for both the 5' and 3' ends of the sequence. Therefore, depending on the actual cut site of the enzyme and on any modifications and/or sites used for subcloning, the exact CRM sequence tested by the authors may differ from the reported sequence by several basepairs.
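The endpoint convention described above can be sketched as follows. This is an illustration only, not REDfly's curation code: the function name, the toy sequence, and the choice of recognition sites are assumptions made for the example, and coordinates are reported 1-based.

```python
def inferred_endpoints(sequence: str, site_5p: str, site_3p: str) -> tuple[int, int]:
    """Return 1-based (start, end) endpoints for a restriction fragment.

    Both endpoints are the FIRST nucleotide of the respective recognition
    site, regardless of where the enzyme actually cuts within the site.
    """
    seq = sequence.upper()
    i5 = seq.find(site_5p.upper())
    if i5 == -1:
        raise ValueError("5' restriction site not found")
    # Look for the 3' site downstream of the 5' site.
    i3 = seq.find(site_3p.upper(), i5 + 1)
    if i3 == -1:
        raise ValueError("3' restriction site not found downstream of the 5' site")
    # str.find is 0-based; convert to 1-based coordinates.
    return i5 + 1, i3 + 1


# Toy sequence (hypothetical) with one GAATTC site and one GGATCC site.
toy = "TTTGAATTCAAACCCGGATCCTTT"
print(inferred_endpoints(toy, "GAATTC", "GGATCC"))  # (4, 16)
```

Because the enzyme cuts inside (not at the first base of) its recognition site, the fragment actually cloned would start and end a few bases away from these coordinates, which is the source of the small discrepancy noted above.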
Orientation of CRMs is given as matching the orientation of the transcription unit, i.e. "5' end estimated" refers to the 5' end of the CRM when oriented in the same 5' to 3' direction as the gene.
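As a minimal sketch of this convention, assuming genomic coordinates with start less than or equal to end and a gene strand given as "+" or "-" (the function name and the example coordinates are hypothetical), the coordinate that "5' end" refers to can be picked as follows:

```python
def five_prime_coordinate(start: int, end: int, gene_strand: str) -> int:
    """Return the genomic coordinate of the CRM's 5' end, where 5' is
    defined relative to the orientation of the associated gene."""
    if start > end:
        raise ValueError("start must not exceed end")
    if gene_strand == "+":
        return start  # gene reads left to right: 5' end is the lower coordinate
    if gene_strand == "-":
        return end    # gene reads right to left: 5' end is the higher coordinate
    raise ValueError("gene_strand must be '+' or '-'")


print(five_prime_coordinate(1000, 1500, "+"))  # 1000
print(five_prime_coordinate(1000, 1500, "-"))  # 1500
```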
TFBS sequences initially from the FlyReg database do not contain sequence source terms.
All RC and CRM records are linked to the REDfly annotations of any TFBSs that fall within them. These are listed in the TFBS tab (for RC/CRM records; see Figure 2G); clicking within a row will open a window with detailed results for that record. Similarly, if a TFBS falls within a known RC/CRM, the name of the RC/CRM and a link to its REDfly record are provided in the RC tab. Searches of REDfly can be restricted to just those TFBSs that map to known CRMs, and vice versa, using the options in the Advanced Search pane.
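The linkage between RC/CRM records and the TFBSs they contain amounts to an interval-containment test. The sketch below is a hypothetical illustration, not REDfly's implementation: the `Feature` class, the record names, and the coordinates are invented, and "falls within" is assumed to mean that the TFBS lies entirely inside the RC/CRM on the same chromosome arm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    name: str
    chrom: str
    start: int  # inclusive
    end: int    # inclusive


def tfbs_within(crm: Feature, tfbs_list: list[Feature]) -> list[Feature]:
    """Return the TFBSs that lie entirely inside the given RC/CRM."""
    return [
        t for t in tfbs_list
        if t.chrom == crm.chrom and crm.start <= t.start and t.end <= crm.end
    ]


# Hypothetical records for illustration only.
crm = Feature("example_CRM", "2L", 1000, 2000)
sites = [
    Feature("site_inside", "2L", 1200, 1210),
    Feature("site_overlapping_edge", "2L", 1995, 2005),
    Feature("site_other_arm", "3R", 1200, 1210),
]
print([t.name for t in tfbs_within(crm, sites)])  # ['site_inside']
```

The reverse lookup (the RC tab of a TFBS record) is the same test applied with the roles swapped: for one TFBS, keep every RC/CRM whose interval encloses it.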
The Sequence tab (see Figure 2H) displays the size (in basepairs) and sequence of the current feature. For TFBSs, the "sequence with flank" is also provided. This includes the TFBS sequence in capital letters, with approximately 20 bp of additional sequence extending on each end. This extended sequence allows for the usually short TFBSs to be mapped unambiguously to the genome.
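A minimal sketch of how a "sequence with flank" string could be assembled, and why it helps with mapping, is given below. It assumes 0-based, half-open coordinates and renders the flanks in lowercase for contrast; the lowercase flanks, the function name, and the toy sequence are assumptions of this example, not a description of REDfly's code.

```python
def sequence_with_flank(genome: str, start: int, end: int, flank: int = 20) -> str:
    """Return the site genome[start:end] in capitals with up to `flank`
    bases of lowercase context on each side (clipped at sequence ends)."""
    left = max(0, start - flank)
    right = min(len(genome), end + flank)
    return (
        genome[left:start].lower()
        + genome[start:end].upper()
        + genome[end:right].lower()
    )


# Toy sequence (hypothetical) in which the 7 bp site GATTACA occurs twice.
toy = "AAAAACCCCCGATTACAGGGGGTTTTTACGTAGATTACACTCTCTCAGAG"
flanked = sequence_with_flank(toy, 10, 17, flank=5)
print(flanked)                     # cccccGATTACAggggg
print(toy.count("GATTACA"))        # 2: the short site alone is ambiguous
print(toy.count(flanked.upper()))  # 1: the flanked sequence is unique
```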
The Expression tab (see Figure 2I) lists the expression terms associated with each record, using the anatomy ontology as described above. Although TFBSs do not of themselves have expression patterns, where a TFBS maps within an RC/CRM, it inherits the expression pattern information from that RC/CRM. Clicking on a column header will sort by that column. Clicking on a term will initiate a REDfly search in a new browser window for all records containing the specified term.
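The inheritance of expression terms by a TFBS can be sketched as below. The tuple layout, the function name, and the placeholder terms are hypothetical, and taking the union of terms when a TFBS lies within more than one RC/CRM is an assumption of this example rather than documented REDfly behavior.

```python
def inherited_terms(tfbs, crms):
    """Collect expression terms for a TFBS from every RC/CRM that contains it.

    tfbs: (chrom, start, end)
    crms: iterable of (chrom, start, end, terms), where terms is a set of strings
    """
    chrom, start, end = tfbs
    terms = set()
    for c_chrom, c_start, c_end, c_terms in crms:
        if c_chrom == chrom and c_start <= start and end <= c_end:
            terms |= set(c_terms)  # assumption: union across containing RC/CRMs
    return sorted(terms)


# Placeholder records and terms for illustration only.
crms = [
    ("2L", 100, 500, {"term A", "term B"}),
    ("2L", 300, 900, {"term B", "term C"}),
    ("3R", 100, 500, {"term D"}),
]
print(inherited_terms(("2L", 350, 360), crms))  # ['term A', 'term B', 'term C']
print(inherited_terms(("X", 1, 5), crms))       # []
```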
Both FlyBase and the BDGP in situ database use the anatomy ontology for reporting gene expression patterns. We have therefore provided links from each expression pattern in REDfly to each of these databases. Following these links will generate a list of genes annotated as having the selected expression pattern. As mappings between the anatomy ontologies of different organisms are developed, we hope to create links to similarly expressed genes in these organisms as well.
The Notes tab contains free-text notes that elaborate on the basic annotation of the feature. In particular, the notes can indicate details of expression patterns that cannot be adequately captured by the anatomy ontology.