Select / Download tool

Use this tool to select promoters by promoter name or ID, by genomic context (such as the presence of core promoter elements) or by expression level. After selection, you can download the promoters in various formats (e.g. FASTA or BED), lift them over to a different assembly (liftOver), or use them for further analysis such as motif enrichment, motif search or chromatin status.

How to use it

This tool selects all promoters, or a subset of promoters, from an EPDnew database. The selection can be restricted by Promoter or Gene IDs, genomic context or other characteristics. Multiple criteria can be combined, for example by providing a set of Gene IDs and restricting the selection to promoters that contain a TATA box. The criteria used by each selection method are described below.

Selection by ID:
Write one ID per line and do not use any symbols (',', ';', '|', etc.) to separate IDs. In the output page, promoters are always annotated using EPDnew IDs. To facilitate the conversion between user-provided IDs and EPDnew IDs, the output page provides a log file with the conversion table. Note that multiple promoters can be associated with one ID. To keep only one promoter per gene, activate the check box that selects the most representative promoter (see 'Additional Options'). IDs can be of the following types (some of them are species-specific):

  • EPDnew ID: promoter ID used here (MAPK1_1, TP53_1, TBP_1, etc.). It is available for all databases.
  • ENSEMBL GENE ID: gene ID from the Ensembl database (ENSG00000002016, ENSG00000003509, ENSG00000003989).
  • RefSeq ID: transcript ID from the RefSeq database (NM_032974, NM_002355, NM_001013836).
  • FlyBase ID: ID from FlyBase annotation (FBgn0025740, FBgn0039897, FBgn0039904).
  • WormBase ID: ID from WormBase annotation (WBGene00022279, WBGene00022037, WBGene00022368).
  • AGI ID: Arabidopsis Gene ID (AT1G01010).
  • Gramene GENE ID: Gramene Gene ID (GRMZM2G330436, GRMZM2G440537, GRMZM2G008710).
  • sgdGene ID: Saccharomyces Genome Database Gene ID (YAL061W, YAL024C, YAL001C).
  • PomBase ID: S. pombe Genome Database Gene ID (SPCP20C8.02c, SPCC330.04c, SPCC1235.07).
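
Since the ID field expects one ID per line with no separator symbols, a list copied from a spreadsheet or a comma-separated file may need cleaning before it is pasted in. The following is a minimal sketch of such a clean-up step. The function name one_id_per_line is hypothetical and not part of EPD, and dropping duplicate IDs is an added convenience, not something the tool requires.

```python
import re

# Separator symbols that should not appear between IDs (',', ';', '|'),
# plus any whitespace, treated as one delimiter run.
_SEPARATORS = re.compile(r"[,;|\s]+")


def one_id_per_line(raw: str) -> str:
    """Return the IDs found in `raw`, one per line, in their original order.

    Duplicate IDs are dropped (an assumption made for convenience).
    """
    seen = set()
    ids = []
    for token in _SEPARATORS.split(raw):
        if token and token not in seen:
            seen.add(token)
            ids.append(token)
    return "\n".join(ids)


if __name__ == "__main__":
    # Example IDs taken from the EPDnew ID entry in the list above.
    print(one_id_per_line("MAPK1_1, TP53_1;TBP_1 | TP53_1"))
```

The resulting text can be pasted directly into the ID field.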

Selection by precomputed characteristics:

  • TATA box: a promoter is considered to have a TATA box if the motif is found at position −28 (± 3 bp) relative to the TSS (evaluated using FindM).
  • Initiator: a promoter is considered to have an Initiator motif if the motif is found at position 0 relative to the TSS (evaluated using FindM).
  • CCAAT box: a promoter is considered to have a CCAAT box if the motif is found in the region from −200 to −50 relative to the TSS (evaluated using FindM).
  • GC box: a promoter is considered to have a GC box if the motif is found in the region from −200 to −50 relative to the TSS (evaluated using FindM).
  • Average Expression: for each sample used in generating an EPDnew collection, promoter expression is calculated as the number of tags matching the region from −250 to +250 bp relative to the TSS. Each sample is normalized to a total tag count of 10 M.
  • Expression call: a promoter is considered expressed in sample X if the number of tags that map at the TSS is greater than 3.
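
The criteria above can be restated as a short sketch. This is not the code used by EPD or FindM; it only encodes the windows and thresholds listed above. All function and variable names are hypothetical. It assumes that window boundaries are inclusive, that motif hits are already available as positions relative to the TSS, that the average is an arithmetic mean over samples, and that the expression call threshold is applied to the tag count as given (the text does not say whether that count is raw or normalized).

```python
# Position windows relative to the TSS (position 0), as listed above.
# Inclusive boundaries are an assumption.
MOTIF_WINDOWS = {
    "TATA box": (-31, -25),    # -28 +/- 3 bp
    "Initiator": (0, 0),       # exactly at the TSS
    "CCAAT box": (-200, -50),
    "GC box": (-200, -50),
}

NORMALIZED_TOTAL = 10_000_000  # each sample is scaled to 10 M tags


def has_motif(name, hit_positions):
    """True if any motif hit falls inside the window defined for `name`.

    `hit_positions` is a hypothetical list of motif positions relative to
    the TSS, e.g. taken from a motif scan.
    """
    low, high = MOTIF_WINDOWS[name]
    return any(low <= pos <= high for pos in hit_positions)


def normalized_expression(tags_in_window, total_tags):
    """Scale the tag count in the -250..+250 window to a 10 M tag sample."""
    return tags_in_window * NORMALIZED_TOTAL / total_tags


def average_expression(per_sample):
    """Mean normalized expression over samples.

    `per_sample` is a list of (tags_in_window, total_tags) pairs.
    """
    values = [normalized_expression(t, n) for t, n in per_sample]
    return sum(values) / len(values)


def is_expressed(tss_tags, threshold=3):
    """Expression call: strictly more than `threshold` tags at the TSS."""
    return tss_tags > threshold
```

With these definitions, a promoter with a motif hit at −30 satisfies the TATA box criterion, while a hit at −32 does not.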

Additional Options:
Users can restrict the selection to the most representative promoter for each gene. In this case, only one promoter is associated with each gene: the one that has been validated by the largest number of samples or, if that is inconclusive, the one located most upstream. Note that for some organisms the samples might not be representative of normal growth conditions and might be restricted to specific tissues, growth conditions or developmental stages. The promoter selected as most representative may therefore be specific to the conditions used during the experiment rather than generally representative.
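
The selection rule can be illustrated with a small sketch. It is not the EPD implementation. The Promoter record and its fields are hypothetical, "inconclusive" is interpreted here as a tie in the number of validating samples, and "most upstream" is taken relative to the direction of transcription (smallest coordinate on the + strand, largest on the − strand).

```python
from dataclasses import dataclass


@dataclass
class Promoter:
    epd_id: str     # hypothetical promoter identifier
    gene: str       # gene the promoter belongs to
    strand: str     # "+" or "-"
    tss: int        # genomic coordinate of the TSS
    n_samples: int  # number of samples validating this promoter


def upstream_rank(p: Promoter) -> int:
    """Smaller value = further upstream along the direction of transcription."""
    return p.tss if p.strand == "+" else -p.tss


def most_representative(promoters):
    """Pick one promoter per gene: most samples, ties broken by most upstream."""
    best = {}
    for p in promoters:
        key = (-p.n_samples, upstream_rank(p))
        if p.gene not in best or key < best[p.gene][0]:
            best[p.gene] = (key, p)
    return {gene: p for gene, (_, p) in best.items()}


if __name__ == "__main__":
    # Made-up records for illustration only.
    example = [
        Promoter("GENEA_1", "GENEA", "+", 1000, 5),
        Promoter("GENEA_2", "GENEA", "+", 900, 5),
        Promoter("GENEB_1", "GENEB", "-", 2000, 2),
        Promoter("GENEB_2", "GENEB", "-", 2100, 2),
    ]
    for gene, p in most_representative(example).items():
        print(gene, p.epd_id)
```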

Note: For some organisms (e.g. P. falciparum), some motifs may not be available for filtering.

Last update October 2019