GLORIA

GEOMAR Library Ocean Research Information Access

Filter: Cogghill, Penny (2)
  • 1
    Online Resource
    Springer Science and Business Media LLC ; 2009
    In: Nature Precedings, Springer Science and Business Media LLC (2009-4-28)
    Type of Medium: Online Resource
    ISSN: 1756-0357
    Language: Unknown
    Publisher: Springer Science and Business Media LLC
    Publication Date: 2009
    ZDB-ID: 2637018-9
  • 2
    Online Resource
    Springer Science and Business Media LLC ; 2009
    In: Nature Precedings, Springer Science and Business Media LLC
    Abstract: Pfam is a database of conserved protein families or domains commonly used for genome annotation and sequence classification. It comprises two parts: (1) Pfam-A families, which are fully annotated and consist of a representative seed alignment, HMMs, and a full alignment comprising all sequences that score above the curated threshold; (2) Pfam-B families, which are automatically generated clusters of domains not matched by Pfam-A but that often indicate conserved sequence regions. Pfam release 23.0 predicts at least one Pfam-A domain on 74% of the sequences in UniProtKB, and predicts either a Pfam-A or Pfam-B domain on 93% of the sequences in UniProtKB. With the ever-increasing rate of deposition of new proteins of all qualities into the underlying repositories, it is essential that Pfam continues to grow in order to maintain its coverage. We have used a number of strategies to improve the annotation provided by Pfam, and these include both building new families and expanding existing ones. Pfam has also greatly benefited from contributions from its user community. New family and functional annotation submissions from an S. pombe curator have ensured that Pfam has high coverage (83%) of the S. pombe proteome. Many of the early Pfam-A models have not been altered since they were first deposited. As the diversity of the sequence databases grows, the diversity within a Pfam seed alignment can become too narrow to represent the breadth of sequences that should belong to that family. The result is that some of the early Pfam-A HMMs fail to detect remote homologues. To address this problem we have rebuilt a large proportion of Pfam-A families, which has increased the Pfam-A coverage by 1-2%. Another strategy we have used is targeted building, where a particular system or complex is examined in detail to ensure that families exist for all components and that annotation is consistent.
    In terms of building new Pfam-A families, the two major starting points are Pfam-B clusters and novel structures. From these we have built ~1000 families between releases 22.0 and 23.0, and a further 800 families since release 23.0. Between Pfam releases 22.0 and 23.0 we have changed the way in which Pfam-B families are generated. Previously, Pfam-B families were created from PRODOM clusters that were based on a much smaller sequence database than the one upon which Pfam was built. We now use the ADDA algorithm, which generates clusters from the same underlying sequence database that Pfam is based on, resulting in a more comprehensive Pfam-B contribution. This has increased the sequence coverage contributed by Pfam-B substantially, from 3.9% to 11.8%. In a further drive to improve coverage, Pfam is currently evaluating a new release of the HMMER software (HMMER3) used to construct and search the Pfam HMMs. Early results show that HMMER3 is ~100-fold faster and has increased specificity and sensitivity compared with HMMER2.
    Type of Medium: Online Resource
    ISSN: 1756-0357
    Language: English
    Publisher: Springer Science and Business Media LLC
    Publication Date: 2009
    ZDB-ID: 2637018-9