Release notes on public datasets

  • Bacteria
    Available since February 2015 at: http://genome.jouy.inra.fr/Insyght_2692_complete_bacteria/
    This dataset features 2692 complete bacterial genomes available in Ensembl Bacteria. It was initially released in June 2014. We report ∼2.4 billion orthologs; on average, their alignments have a coverage of ∼90%, an identity of ∼43%, a score of ∼241, and an e-value of ∼2e−05. We report ∼1.85 billion singleton orthologs and ∼140 million syntenies that comprise ∼550 million orthologs and ∼190 million non-BDBH (bidirectional best hit) homologs. On average, alignments of non-BDBH homologs have a coverage of ∼76%, an identity of ∼35%, a score of ∼145, and an e-value of ∼8e−05.
    This database was unavailable from Nov 10th 2016 to Sept 14th 2017 due to a database failure.
  • Archaea
    Available since February 2017 at: http://genome.jouy.inra.fr/Insyght_210_archaea/
    This dataset features 210 complete, reference, non-redundant archaeal genomes available in UniProt. It was initially released in February 2017. We report ∼12.4 million orthologs; on average, their alignments have a coverage of ∼90.4%, an identity of ∼41%, a score of ∼208, and an e-value of ∼2e−04. We report ∼8.7 million singleton orthologs and ∼1.2 million syntenies that comprise ∼3.7 million orthologs and ∼570,000 non-BDBH homologs. On average, alignments of non-BDBH homologs have a coverage of ∼78.4%, an identity of ∼36.4%, a score of ∼153, and an e-value of ∼4.5e−04.
  • Taxonomy
    The above public databases make use of taxonomic information. Our taxonomy data was last updated on Feb 23rd 2017.
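
The ortholog counts reported for the two datasets can be cross-checked with a little arithmetic. The sketch below assumes that every reported ortholog is either a singleton or part of a synteny, so that the two sub-counts should add up to the total. That reading is an interpretation of the figures, not something the notes state explicitly. All values are the rounded numbers quoted above, and the names `datasets`, `is_consistent`, and `synteny_share` are illustrative only.

```python
import math

# Rounded counts copied from the release notes above.
# Assumption: total orthologs = singleton orthologs + orthologs inside syntenies.
datasets = {
    "Bacteria": {"total": 2.4e9, "singleton": 1.85e9, "in_syntenies": 550e6},
    "Archaea": {"total": 12.4e6, "singleton": 8.7e6, "in_syntenies": 3.7e6},
}


def is_consistent(counts, rel_tol=0.01):
    """Return True if singleton + in-synteny orthologs match the total within rel_tol."""
    return math.isclose(
        counts["singleton"] + counts["in_syntenies"],
        counts["total"],
        rel_tol=rel_tol,
    )


def synteny_share(counts):
    """Fraction of all reported orthologs that belong to a synteny."""
    return counts["in_syntenies"] / counts["total"]


if __name__ == "__main__":
    for name, counts in datasets.items():
        print(name, is_consistent(counts), f"{synteny_share(counts):.1%}")
```

Under this assumption, the rounded figures are consistent for both datasets, and the share of orthologs found in syntenies is somewhat higher for the archaeal dataset than for the bacterial one.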