GLORIA

GEOMAR Library Ocean Research Information Access

  • Cold Spring Harbor Laboratory  (7)
  • 1
    In: Genome Research, Cold Spring Harbor Laboratory, Vol. 27, No. 5 (2017-05), p. 849-864
    Abstract: The human reference genome assembly plays a central role in nearly all aspects of today's basic and clinical research. GRCh38 is the first coordinate-changing assembly update since 2009; it reflects the resolution of roughly 1000 issues and encompasses modifications ranging from thousands of single base changes to megabase-scale path reorganizations, gap closures, and localization of previously orphaned sequences. We developed a new approach to sequence generation for targeted base updates and used data from new genome mapping technologies and single haplotype resources to identify and resolve larger assembly issues. For the first time, the reference assembly contains sequence-based representations for the centromeres. We also expanded the number of alternate loci to create a reference that provides a more robust representation of human population variation. We demonstrate that the updates render the reference an improved annotation substrate, alter read alignments in unchanged regions, and impact variant interpretation at clinically relevant loci. We additionally evaluated a collection of new de novo long-read haploid assemblies and conclude that although the new assemblies compare favorably to the reference with respect to continuity, error rate, and gene completeness, the reference still provides the best representation for complex genomic regions and coding sequences. We assert that the collected updates in GRCh38 make the newer assembly a more robust substrate for comprehensive analyses that will promote our understanding of human biology and advance our efforts to improve health.
    Type of Medium: Online Resource
    ISSN: 1088-9051 , 1549-5469
    Language: English
    Publisher: Cold Spring Harbor Laboratory
    Publication Date: 2017
    ZDB-ID: 1483456-X
    SSG: 12
  • 2
    In: Genome Research, Cold Spring Harbor Laboratory, Vol. 15, No. 1 (2005-01), p. 1-18
    Abstract: We have sequenced the genome of a second Drosophila species, Drosophila pseudoobscura, and compared this to the genome sequence of Drosophila melanogaster, a primary model organism. Throughout evolution the vast majority of Drosophila genes have remained on the same chromosome arm, but within each arm gene order has been extensively reshuffled, leading to a minimum of 921 syntenic blocks shared between the species. A repetitive sequence is found in the D. pseudoobscura genome at many junctions between adjacent syntenic blocks. Analysis of this novel repetitive element family suggests that recombination between offset elements may have given rise to many paracentric inversions, thereby contributing to the shuffling of gene order in the D. pseudoobscura lineage. Based on sequence similarity and synteny, 10,516 putative orthologs have been identified as a core gene set conserved over 25–55 million years (Myr) since the pseudoobscura/melanogaster divergence. Genes expressed in the testes had higher amino acid sequence divergence than the genome-wide average, consistent with the rapid evolution of sex-specific proteins. Cis-regulatory sequences are more conserved than random and nearby sequences between the species, but the difference is slight, suggesting that the evolution of cis-regulatory elements is flexible. Overall, a pattern of repeat-mediated chromosomal rearrangement, and high coadaptation of both male genes and cis-regulatory sequences emerges as important themes of genome divergence between these species of Drosophila.
    Type of Medium: Online Resource
    ISSN: 1088-9051
    Language: English
    Publisher: Cold Spring Harbor Laboratory
    Publication Date: 2005
    ZDB-ID: 1483456-X
    SSG: 12
  • 3
    In: Genome Research, Cold Spring Harbor Laboratory, Vol. 21, No. 12 (2011-12), p. 2224-2241
    Abstract: Low-cost short read sequencing technology has revolutionized genomics, though it is only just becoming practical for the high-quality de novo assembly of a novel large genome. We describe the Assemblathon 1 competition, which aimed to comprehensively assess the state of the art in de novo assembly methods when applied to current sequencing technologies. In a collaborative effort, teams were asked to assemble a simulated Illumina HiSeq data set of an unknown, simulated diploid genome. A total of 41 assemblies from 17 different groups were received. Novel haplotype aware assessments of coverage, contiguity, structure, base calling, and copy number were made. We establish that within this benchmark: (1) It is possible to assemble the genome to a high level of coverage and accuracy, and that (2) large differences exist between the assemblies, suggesting room for further improvements in current methods. The simulated benchmark, including the correct answer, the assemblies, and the code that was used to evaluate the assemblies is now public and freely available from http://www.assemblathon.org/.
    Type of Medium: Online Resource
    ISSN: 1088-9051
    Language: English
    Publisher: Cold Spring Harbor Laboratory
    Publication Date: 2011
    ZDB-ID: 1483456-X
    SSG: 12
  • 4
    In: Genome Research, Cold Spring Harbor Laboratory, Vol. 27, No. 2 (2017-02), p. 300-309
    Abstract: We are rapidly approaching the point where we have sequenced millions of human genomes. There is a pressing need for new data structures to store raw sequencing data and efficient algorithms for population scale analysis. Current reference-based data formats do not fully exploit the redundancy in population sequencing nor take advantage of shared genetic variation. In recent years, the Burrows–Wheeler transform (BWT) and FM-index have been widely employed as a full-text searchable index for read alignment and de novo assembly. We introduce the concept of a population BWT and use it to store and index the sequencing reads of 2705 samples from the 1000 Genomes Project. A key feature is that, as more genomes are added, identical read sequences are increasingly observed, and compression becomes more efficient. We assess the support in the 1000 Genomes read data for every base position of two human reference assembly versions, identifying that 3.2 Mbp with population support was lost in the transition from GRCh37 with 13.7 Mbp added to GRCh38. We show that the vast majority of variant alleles can be uniquely described by overlapping 31-mers and show how rapid and accurate SNP and indel genotyping can be carried out across the genomes in the population BWT. We use the population BWT to carry out nonreference queries to search for the presence of all known viral genomes and discover human T-lymphotropic virus 1 integrations in six samples in a recognized epidemiological distribution.
    Type of Medium: Online Resource
    ISSN: 1088-9051 , 1549-5469
    Language: English
    Publisher: Cold Spring Harbor Laboratory
    Publication Date: 2017
    ZDB-ID: 1483456-X
    SSG: 12
  • 5
    In: Genome Research, Cold Spring Harbor Laboratory, Vol. 19, No. 7 (2009-07), p. 1316-1323
    Abstract: Effective use of the human and mouse genomes requires reliable identification of genes and their products. Although multiple public resources provide annotation, different methods are used that can result in similar but not identical representation of genes, transcripts, and proteins. The collaborative consensus coding sequence (CCDS) project tracks identical protein annotations on the reference mouse and human genomes with a stable identifier (CCDS ID), and ensures that they are consistently represented on the NCBI, Ensembl, and UCSC Genome Browsers. Importantly, the project coordinates on manually reviewing inconsistent protein annotations between sites, as well as annotations for which new evidence suggests a revision is needed, to progressively converge on a complete protein-coding set for the human and mouse reference genomes, while maintaining a high standard of reliability and biological accuracy. To date, the project has identified 20,159 human and 17,707 mouse consensus coding regions from 17,052 human and 16,893 mouse genes. Three evaluation methods indicate that the entries in the CCDS set are highly likely to represent real proteins, more so than annotations from contributing groups not included in CCDS. The CCDS database thus centralizes the function of identifying well-supported, identically-annotated, protein-coding regions.
    Type of Medium: Online Resource
    ISSN: 1088-9051
    Language: English
    Publisher: Cold Spring Harbor Laboratory
    Publication Date: 2009
    ZDB-ID: 1483456-X
    SSG: 12
  • 6
    In: Genome Research, Cold Spring Harbor Laboratory, Vol. 14, No. 4 (2004-04), p. 721-732
    Abstract: Atlas is a suite of programs developed for assembly of genomes by a “combined approach” that uses DNA sequence reads from both BACs and whole-genome shotgun (WGS) libraries. The BAC clones afford advantages of localized assembly with reduced computational load, and provide a robust method for dealing with repeated sequences. Inclusion of WGS sequences facilitates use of different clone insert sizes and reduces data production costs. A core function of Atlas software is recruitment of WGS sequences into appropriate BACs based on sequence overlaps. Because construction of consensus sequences is from local assembly of these reads, only small (<0.1%) units of the genome are assembled at a time. Once assembled, each BAC is used to derive a genomic layout. This “sequence-based” growth of the genome map has greater precision than with non-sequence-based methods. Use of BACs allows correction of artifacts due to repeats at each stage of the process. This is aided by ancillary data such as BAC fingerprint, other genomic maps, and syntenic relations with other genomes. Atlas was used to assemble a draft DNA sequence of the rat genome; its major components including overlapper and split-scaffold are also being used in pure WGS projects.
    Type of Medium: Online Resource
    ISSN: 1088-9051
    Language: English
    Publisher: Cold Spring Harbor Laboratory
    Publication Date: 2004
    ZDB-ID: 1483456-X
    SSG: 12
  • 7
    In: Genome Research, Cold Spring Harbor Laboratory, Vol. 14, No. 5 (2004-05), p. 925-928
    Abstract: Ensembl (http://www.ensembl.org/) is a bioinformatics project to organize biological information around the sequences of large genomes. It is a comprehensive source of stable automatic annotation of individual genomes, and of the synteny and orthology relationships between them. It is also a framework for integration of any biological data that can be mapped onto features derived from the genomic sequence. Ensembl is available as an interactive Web site, a set of flat files, and as a complete, portable open source software system for handling genomes. All data are provided without restriction, and code is freely available. Ensembl's aims are to continue to “widen” this biological integration to include other model organisms relevant to understanding human biology as they become available; to “deepen” this integration to provide an ever more seamless linkage between equivalent components in different species; and to provide further classification of functional elements in the genome that have been previously elusive.
    Type of Medium: Online Resource
    ISSN: 1088-9051
    Language: English
    Publisher: Cold Spring Harbor Laboratory
    Publication Date: 2004
    ZDB-ID: 1483456-X
    SSG: 12