Use this tool to select all promoters, or to restrict the selection by promoter name or ID, by some or all of their genomic contexts (such as the presence of core promoter elements), or by expression level. After selection, you can download the promoters in various formats (for example FASTA or BED), lift them over to a different assembly with liftOver, or use them for further analysis such as motif enrichment, motif search and chromatin status.


This tool allows the selection of all or a subset of promoters from an EPDnew database. Selection can be restricted by Promoter or Gene IDs, genomic context or other characteristics. Multiple criteria can be combined, for example by providing a set of Gene IDs and restricting the selection to promoters that have a TATA-box. The criteria used by each selection method are described below.

Selection by ID:
Write one ID per line, without any separator symbols (',', ';', '|') between IDs. In the output page, promoters are always annotated with EPDnew IDs. To facilitate the conversion between user-provided IDs and EPDnew IDs, the output page provides a log file with the conversion table. Note that multiple promoters can be associated with one ID. Users can restrict the selection to one promoter per gene by activating the check box for selecting the most representative promoter (see 'Additional Options'). IDs can be of the following types (some of them are species-specific):

  • EPDnew ID: promoter ID used here (MAPK1_1, TP53_1, TBP_1, etc). It is available on all databases.
  • ENSEMBL GENE ID: gene ID from the Ensembl database (ENSG00000002016, ENSG00000003509, ENSG00000003989).
  • RefSeq ID: transcript ID from the RefSeq database (NM_032974, NM_002355, NM_001013836).
  • FlyBase ID: ID from FlyBase annotation (FBgn0025740, FBgn0039897, FBgn0039904).
  • WormBase ID: ID from WormBase annotation (WBGene00022279, WBGene00022037, WBGene00022368).
  • AGI ID: Arabidopsis Gene ID (AT1G01010).
  • Gramene GENE ID: Gramene Gene ID (GRMZM2G330436, GRMZM2G440537, GRMZM2G008710).
  • sgdGene ID: Saccharomyces Genome Database Gene ID (YAL061W, YAL024C, YAL001C).
  • PomBase ID: S. pombe Genome Database Gene ID (SPCP20C8.02c, SPCC330.04c, SPCC1235.07).
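
The one-ID-per-line input format can be prepared from a separator-delimited list with a few lines of Python. This is only a sketch: the function name normalize_ids is hypothetical and not part of EPDnew, and dropping duplicate IDs is an assumption made here for convenience, not a requirement stated by the tool.

```python
import re


def normalize_ids(raw: str) -> str:
    """Return the IDs found in `raw`, one per line, without separator symbols."""
    # Split on commas, semicolons, pipes and any whitespace.
    # Dots are left alone because some IDs contain them (e.g. SPCP20C8.02c).
    tokens = re.split(r"[,;|\s]+", raw)
    seen = set()
    ids = []
    for tok in tokens:
        # Skip empty tokens and repeated IDs (assumption: duplicates are not useful).
        if tok and tok not in seen:
            seen.add(tok)
            ids.append(tok)
    return "\n".join(ids)


# Example with EPDnew IDs taken from the list above.
print(normalize_ids("TP53_1, MAPK1_1; TBP_1|TP53_1"))
```

The resulting text can be pasted directly into the ID field.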

Selection by precomputed characteristics:

  • TATA-box: a promoter is considered to have a TATA-box if the motif is found at position −28 (± 3 bp) relative to the transcription start site (evaluated using FindM).
  • Initiator: a promoter is considered to have an Initiator motif if the motif is found at position 0 relative to the transcription start site (evaluated using FindM).
  • CCAAT-box: a promoter is considered to have a CCAAT-box if the motif is found in the region −200 to −50 relative to the transcription start site (evaluated using FindM).
  • GC-box: a promoter is considered to have a GC-box if the motif is found in the region −200 to −50 relative to the transcription start site (evaluated using FindM).
  • CpG: this selection criterion is currently not active.
  • Average Expression: for each sample used to generate the EPDnew collection, promoter expression is calculated as the number of CAGE tags mapping to the region from −250 to +250 bp around the TSS. Each sample is normalized to a total tag count of 10 million.
  • Expression call: a promoter is considered expressed in sample X if the number of tags that map at the TSS is higher than 3.
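
The position windows and expression rules above can be illustrated with a short Python sketch. It does not reproduce FindM or the EPDnew pipeline: motif match positions are assumed to be already available as offsets relative to the TSS, the window bounds are assumed to be inclusive, and all function names are hypothetical. Whether the expression threshold is applied to raw or normalized tag counts is not stated in this document, so the sketch simply takes a number.

```python
# Position windows relative to the TSS, taken from the criteria listed above.
# Assumption: both bounds are inclusive.
WINDOWS = {
    "TATA-box": (-31, -25),    # -28 +/- 3 bp
    "Initiator": (0, 0),       # exactly at the TSS
    "CCAAT-box": (-200, -50),
    "GC-box": (-200, -50),
}


def has_element(element, match_positions):
    """True if any motif match position falls inside the element's window."""
    lo, hi = WINDOWS[element]
    return any(lo <= pos <= hi for pos in match_positions)


def normalize_to_10m(tag_count, sample_total_tags):
    """Scale a tag count so that the sample total corresponds to 10 million tags."""
    return tag_count * 10_000_000 / sample_total_tags


def is_expressed(tags_at_tss, threshold=3):
    """Expression call: the tag count must be strictly higher than the threshold."""
    return tags_at_tss > threshold


print(has_element("TATA-box", [-30]))     # inside the -31..-25 window
print(normalize_to_10m(5, 20_000_000))    # sample with 20 million tags in total
print(is_expressed(3), is_expressed(4))   # exactly 3 tags is not enough
```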

Additional Options:
Users can restrict the selection to the most representative promoter of each gene, so that only one promoter is associated with a gene. This is the promoter that has been validated by the largest number of samples or, if that is inconclusive, the most upstream one. Note that for some organisms the samples may not be representative of normal growth conditions and may be restricted to certain tissues, growth conditions or developmental stages. This can affect the selection of the most representative promoter, which is then specific to the conditions used in the experiments rather than general.
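
The rule for choosing the most representative promoter can be sketched in Python as follows. This is an illustration under stated assumptions, not the EPDnew implementation: "inconclusive" is interpreted here as a tie in the number of validating samples, "most upstream" is taken relative to the gene's strand, and the Promoter record, its fields and the example IDs are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Promoter:
    epd_id: str     # hypothetical promoter ID
    gene: str       # gene the promoter belongs to
    strand: str     # "+" or "-"
    tss: int        # genomic coordinate of the TSS
    n_samples: int  # number of samples validating the promoter


def upstream_key(p):
    # On the + strand a smaller coordinate is more upstream;
    # on the - strand a larger coordinate is more upstream.
    return p.tss if p.strand == "+" else -p.tss


def most_representative(promoters):
    """Return a dict mapping each gene to its single most representative promoter."""
    best = {}
    for p in promoters:
        cur = best.get(p.gene)
        # Prefer more validating samples; break ties by the most upstream TSS.
        if cur is None or (-p.n_samples, upstream_key(p)) < (-cur.n_samples, upstream_key(cur)):
            best[p.gene] = p
    return best


# Made-up example data, not taken from EPDnew.
example = [
    Promoter("GENE1_1", "GENE1", "+", 1000, 5),
    Promoter("GENE1_2", "GENE1", "+", 800, 5),
    Promoter("GENE2_1", "GENE2", "-", 5000, 2),
    Promoter("GENE2_2", "GENE2", "-", 5200, 2),
    Promoter("GENE2_3", "GENE2", "-", 4000, 1),
]
selected = most_representative(example)
for gene, promoter in selected.items():
    print(gene, promoter.epd_id)
```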

Last update May 2017