GLORIA

GEOMAR Library Ocean Research Information Access

  • 1
    In: Bioinformatics, Oxford University Press (OUP), Vol. 33, No. 24 (2017-12-15), p. 4033-4040
    Abstract: RNA sequencing (RNA-seq) experiments now span hundreds to thousands of samples. Current spliced alignment software is designed to analyze each sample separately. Consequently, no information is gained from analyzing multiple samples together, and it requires extra work to obtain analysis products that incorporate data from across samples. Results: We describe Rail-RNA, a cloud-enabled spliced aligner that analyzes many samples at once. Rail-RNA eliminates redundant work across samples, making it more efficient as samples are added. For many samples, Rail-RNA is more accurate than annotation-assisted aligners. We use Rail-RNA to align 667 RNA-seq samples from the GEUVADIS project on Amazon Web Services in under 16 h for US$0.91 per sample. Rail-RNA outputs alignments in SAM/BAM format; but it also outputs (i) base-level coverage bigWigs for each sample; (ii) coverage bigWigs encoding normalized mean and median coverages at each base across samples analyzed; and (iii) exon–exon splice junctions and indels (features) in columnar formats that juxtapose coverages in samples in which a given feature is found. Supplementary outputs are ready for use with downstream packages for reproducible statistical analysis. We use Rail-RNA to identify expressed regions in the GEUVADIS samples and show that both annotated and unannotated (novel) expressed regions exhibit consistent patterns of variation across populations and with respect to known confounding variables. Availability and Implementation: Rail-RNA is open-source software available at http://rail.bio. Supplementary information: Supplementary data are available at Bioinformatics online.
    Type of Medium: Online Resource
    ISSN: 1367-4803, 1367-4811
    Language: English
    Publisher: Oxford University Press (OUP)
    Publication Date: 2017
    ZDB-ID: 1468345-3
    ZDB-ID: 1422668-6
    SSG: 12
  • 2
    In: Bioinformatics, Oxford University Press (OUP), Vol. 36, No. 3 (2020-02-01), p. 713-720
    Abstract: The vast majority of tools for neoepitope prediction from DNA sequencing of complementary tumor and normal patient samples do not consider germline context or the potential for the co-occurrence of two or more somatic variants on the same mRNA transcript. Without consideration of these phenomena, existing approaches are likely to produce both false-positive and false-negative results, resulting in an inaccurate and incomplete picture of the cancer neoepitope landscape. We developed neoepiscope chiefly to address this issue for single nucleotide variants (SNVs) and insertions/deletions (indels). Results: Herein, we illustrate how germline and somatic variant phasing affects neoepitope prediction across multiple datasets. We estimate that up to ∼5% of neoepitopes arising from SNVs and indels may require variant phasing for their accurate assessment. neoepiscope is performant, flexible and supports several major histocompatibility complex binding affinity prediction tools. Availability and implementation: neoepiscope is available on GitHub at https://github.com/pdxgx/neoepiscope under the MIT license. Scripts for reproducing results described in the text are available at https://github.com/pdxgx/neoepiscope-paper under the MIT license. Additional data from this study, including summaries of variant phasing incidence and benchmarking wallclock times, are available in Supplementary Files 1, 2 and 3. Supplementary File 1 contains Supplementary Table 1, Supplementary Figures 1 and 2, and descriptions of Supplementary Tables 2–8. Supplementary File 2 contains Supplementary Tables 2–6 and 8. Supplementary File 3 contains Supplementary Table 7. Raw sequencing data used for the analyses in this manuscript are available from the Sequence Read Archive under accessions PRJNA278450, PRJNA312948, PRJNA307199, PRJNA343789, PRJNA357321, PRJNA293912, PRJNA369259, PRJNA305077, PRJNA306070, PRJNA82745 and PRJNA324705; from the European Genome-phenome Archive under accessions EGAD00001004352 and EGAD00001002731; and by direct request to the authors. Supplementary information: Supplementary data are available at Bioinformatics online.
    Type of Medium: Online Resource
    ISSN: 1367-4803, 1367-4811
    Language: English
    Publisher: Oxford University Press (OUP)
    Publication Date: 2020
    ZDB-ID: 1468345-3
    ZDB-ID: 1422668-6
    SSG: 12
  • 3
    In: Bioinformatics, Oxford University Press (OUP), Vol. 34, No. 1 (2018-01-01), p. 114-116
    Abstract: As more and larger genomics studies appear, there is a growing need for comprehensive and queryable cross-study summaries. These enable researchers to leverage vast datasets that would otherwise be difficult to obtain. Results: Snaptron is a search engine for summarized RNA sequencing data with a query planner that leverages R-tree, B-tree and inverted indexing strategies to rapidly execute queries over 146 million exon-exon splice junctions from over 70 000 human RNA-seq samples. Queries can be tailored by constraining which junctions and samples to consider. Snaptron can score junctions according to tissue specificity or other criteria, and can score samples according to the relative frequency of different splicing patterns. We describe the software and outline biological questions that can be explored with Snaptron queries. Availability and implementation: Documentation is at http://snaptron.cs.jhu.edu. Source code is at https://github.com/ChristopherWilks/snaptron and https://github.com/ChristopherWilks/snaptron-experiments with a CC BY-NC 4.0 license. Supplementary information: Supplementary data are available at Bioinformatics online.
    Type of Medium: Online Resource
    ISSN: 1367-4803, 1367-4811
    Language: English
    Publisher: Oxford University Press (OUP)
    Publication Date: 2018
    ZDB-ID: 1468345-3
    ZDB-ID: 1422668-6
    SSG: 12
  • 4
    In: Bioinformatics, Oxford University Press (OUP), Vol. 32, No. 16 (2016-08-15), p. 2551-2553
    Abstract: Motivation: Public archives contain thousands of trillions of bases of valuable sequencing data. More than 40% of the Sequence Read Archive is human data protected by provisions such as dbGaP. To analyse dbGaP-protected data, researchers must typically work with IT administrators and signing officials to ensure all levels of security are implemented at their institution. This is a major obstacle, impeding reproducibility and reducing the utility of archived data. Results: We present a protocol and software tool for analyzing protected data in a commercial cloud. The protocol, Rail-dbGaP, is applicable to any tool running on Amazon Web Services Elastic MapReduce. The tool, Rail-RNA v0.2, is a spliced aligner for RNA-seq data, which we demonstrate by running on 9662 samples from the dbGaP-protected GTEx consortium dataset. The Rail-dbGaP protocol makes explicit for the first time the steps an investigator must take to develop Elastic MapReduce pipelines that analyse dbGaP-protected data in a manner compliant with NIH guidelines. Rail-RNA automates implementation of the protocol, making it easy for typical biomedical investigators to study protected RNA-seq data, regardless of their local IT resources or expertise. Availability and Implementation: Rail-RNA is available from http://rail.bio. Technical details on the Rail-dbGaP protocol as well as an implementation walkthrough are available at https://github.com/nellore/rail-dbgap. Detailed instructions on running Rail-RNA on dbGaP-protected data using Amazon Web Services are available at http://docs.rail.bio/dbgap/. Contacts: anellore@gmail.com or langmea@cs.jhu.edu Supplementary information:  Supplementary data are available at Bioinformatics online.
    Type of Medium: Online Resource
    ISSN: 1367-4811, 1367-4803
    Language: English
    Publisher: Oxford University Press (OUP)
    Publication Date: 2016
    ZDB-ID: 1468345-3
    ZDB-ID: 1422668-6
    SSG: 12
  • 5
    In: NAR Cancer, Oxford University Press (OUP), Vol. 2, No. 1 (2020-03-01)
    Abstract: This study probes the distribution of putatively cancer-specific junctions across a broad set of publicly available non-cancer human RNA sequencing (RNA-seq) datasets. We compared cancer and non-cancer RNA-seq data from The Cancer Genome Atlas (TCGA), the Genotype-Tissue Expression (GTEx) Project and the Sequence Read Archive. We found that (i) averaging across cancer types, 80.6% of exon–exon junctions thought to be cancer-specific based on comparison with tissue-matched samples (σ = 13.0%) are in fact present in other adult non-cancer tissues throughout the body; (ii) 30.8% of junctions not present in any GTEx or TCGA normal tissues are shared by multiple samples within at least one cancer type cohort, and 87.4% of these distinguish between different cancer types; and (iii) many of these junctions not found in GTEx or TCGA normal tissues (15.4% on average, σ = 2.4%) are also found in embryological and other developmentally associated cells. These findings refine the meaning of RNA splicing event novelty, particularly with respect to the human neoepitope repertoire. Ultimately, cancer-specific exon–exon junctions may have a substantial causal relationship with the biology of disease.
    Type of Medium: Online Resource
    ISSN: 2632-8674
    Language: English
    Publisher: Oxford University Press (OUP)
    Publication Date: 2020
    ZDB-ID: 3025038-9
  • 6
    In: Bioinformatics, Oxford University Press (OUP), Vol. 37, No. 21 (2021-11-05), p. 3723-3733
    Abstract: Proteasomal cleavage is a key component in protein turnover, as well as antigen processing and presentation. Although tools for proteasomal cleavage prediction are available, they vary widely in their performance, options and availability. Results: Herein, we present pepsickle, an open-source tool for proteasomal cleavage prediction with better in vivo prediction performance (area under the curve) and computational speed than current models available in the field and with the ability to predict sites based on both constitutive and immunoproteasome profiles. Post hoc filtering of predicted patient neoepitopes using pepsickle significantly enriches for immune-responsive epitopes and may improve current epitope prediction and vaccine development pipelines. Availability and implementation: pepsickle is open source and available at https://github.com/pdxgx/pepsickle. Supplementary information: Supplementary data are available at Bioinformatics online.
    Type of Medium: Online Resource
    ISSN: 1367-4803, 1367-4811
    Language: English
    Publisher: Oxford University Press (OUP)
    Publication Date: 2021
    ZDB-ID: 1468345-3
    ZDB-ID: 1422668-6
    SSG: 12
  • 7
    In: Nucleic Acids Research, Oxford University Press (OUP), Vol. 45, No. 2 (2017-01-25), p. e9-e9
    Type of Medium: Online Resource
    ISSN: 0305-1048, 1362-4962
    Language: English
    Publisher: Oxford University Press (OUP)
    Publication Date: 2017
    ZDB-ID: 186809-3
    ZDB-ID: 1472175-2
    SSG: 12
  • 8
    In: NAR Genomics and Bioinformatics, Oxford University Press (OUP), Vol. 3, No. 2 (2021-04-09)
    Abstract: While DNA methylation (DNAm) is the most-studied epigenetic mark, few recent studies probe the breadth of publicly available DNAm array samples. We collectively analyzed 35 360 Illumina Infinium HumanMethylation450K DNAm array samples published on the Gene Expression Omnibus. We learned a controlled vocabulary of sample labels by applying regular expressions to metadata and used existing models to predict various sample properties including epigenetic age. We found approximately two-thirds of samples were from blood, one-quarter were from brain and one-third were from cancer patients. About 19% of samples failed at least one of Illumina’s 17 prescribed quality assessments; signal distributions across samples suggest modifying manufacturer-recommended thresholds for failure would make these assessments more informative. We further analyzed DNAm variances in seven tissues (adipose, nasal, blood, brain, buccal, sperm and liver) and characterized specific probes distinguishing them. Finally, we compiled DNAm array data and metadata, including our learned and predicted sample labels, into database files accessible via the recountmethylation R/Bioconductor companion package. Its vignettes walk the user through some analyses contained in this paper.
    Type of Medium: Online Resource
    ISSN: 2631-9268
    Language: English
    Publisher: Oxford University Press (OUP)
    Publication Date: 2021
    ZDB-ID: 3009998-5
  • 9
    In: Bioinformatics Advances, Oxford University Press (OUP), Vol. 3, No. 1 (2023-01-05)
    Abstract: Thousands of DNA methylation (DNAm) array samples from human blood are publicly available on the Gene Expression Omnibus (GEO), but they remain underutilized for experiment planning, replication and cross-study and cross-platform analyses. To facilitate these tasks, we augmented our recountmethylation R/Bioconductor package with 12 537 uniformly processed EPIC and HM450K blood samples on GEO as well as several new features. We subsequently used our updated package in several illustrative analyses, finding (i) study ID bias adjustment increased variation explained by biological and demographic variables, (ii) most variation in autosomal DNAm was explained by genetic ancestry and CD4+ T-cell fractions and (iii) the dependence of power to detect differential methylation on sample size was similar for each of peripheral blood mononuclear cells (PBMC), whole blood and umbilical cord blood. Finally, we used PBMC and whole blood to perform independent validations, and we recovered 38–46% of differentially methylated probes between sexes from two previously published epigenome-wide association studies. Availability and implementation: Source code to reproduce the main results are available on GitHub (repo: recountmethylation_flexible-blood-analysis_manuscript; url: https://github.com/metamaden/recountmethylation_flexible-blood-analysis_manuscript). All data was publicly available and downloaded from the Gene Expression Omnibus (https://www.ncbi.nlm.nih.gov/geo/). Compilations of the analyzed public data can be accessed from the website recount.bio/data (preprocessed HM450K array data: https://recount.bio/data/remethdb_h5se-gm_epic_0-0-2_1589820348/; preprocessed EPIC array data: https://recount.bio/data/remethdb_h5se-gm_epic_0-0-2_1589820348/). Supplementary information: Supplementary data are available at Bioinformatics Advances online.
    Type of Medium: Online Resource
    ISSN: 2635-0041
    Language: English
    Publisher: Oxford University Press (OUP)
    Publication Date: 2023
    ZDB-ID: 3076075-6