GLORIA

GEOMAR Library Ocean Research Information Access

  • 1
    Publication Date: 2012-04-08
    Description: Motivation: High-throughput sequencing has made the analysis of new model organisms more affordable. Although assembling a new genome can still be costly and difficult, it is possible to use RNA-seq to sequence mRNA. In the absence of a known genome, it is necessary to assemble these sequences de novo, taking into account possible alternative isoforms and the dynamic range of expression values. Results: We present a software package named Oases designed to heuristically assemble RNA-seq reads in the absence of a reference genome, across a broad spectrum of expression values and in the presence of alternative isoforms. It achieves this by using an array of hash lengths, a dynamic filtering of noise, a robust resolution of alternative splicing events and the efficient merging of multiple assemblies. It was tested on human and mouse RNA-seq data and is shown to improve significantly on the transABySS and Trinity de novo transcriptome assemblers. Availability and implementation: Oases is freely available under the GPL license at www.ebi.ac.uk/~zerbino/oases/ Contact: dzerbino@ucsc.edu Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
  • 2
    Publication Date: 2012-03-01
    Description: Motivation: The reliable detection of genomic variation in resequencing data is still a major challenge, especially for variants larger than a few base pairs. Sequencing reads crossing boundaries of structural variation carry the potential for their identification, but are difficult to map. Results: Here we present a method for ‘split’ read mapping, where prefix and suffix match of a read may be interrupted by a longer gap in the read-to-reference alignment. We use this method to accurately detect medium-sized insertions and long deletions with precise breakpoints in genomic resequencing data. Compared with alternative split mapping methods, SplazerS significantly improves sensitivity for detecting large indel events, especially in variant-rich regions. Our method is robust in the presence of sequencing errors as well as alignment errors due to genomic mutations/divergence, and can be used on reads of variable lengths. Our analysis shows that SplazerS is a versatile tool applicable to unanchored or single-end as well as anchored paired-end reads. In addition, application of SplazerS to targeted resequencing data led to the interesting discovery of a complete, possibly functional gene retrocopy variant. Availability: SplazerS is available from http://www.seqan.de/projects/splazers. Contact: emde@inf.fu-berlin.de Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
  • 3
    Publication Date: 2012-03-01
    Description: Motivation: The identity of cells and tissues is to a large degree governed by transcriptional regulation. A major part is accomplished by the combinatorial binding of transcription factors at regulatory sequences, such as enhancers. Even though binding of transcription factors is sequence-specific, estimating the sequence similarity of two functionally similar enhancers is very difficult. However, a similarity measure for regulatory sequences is crucial to detect and understand functional similarities between two enhancers and will facilitate large-scale analyses like clustering, prediction and classification of genome-wide datasets. Results: We present the standardized alignment-free sequence similarity measure N2, a flexible framework that is defined for word neighbourhoods. We explore the usefulness of adding reverse complement words as well as words including mismatches into the neighbourhood. On simulated enhancer sequences as well as functional enhancers in mouse development, N2 is shown to outperform previous alignment-free measures. N2 is flexible, faster than competing methods and less susceptible to single sequence noise and the occurrence of repetitive sequences. Experiments on the mouse enhancers reveal that enhancers active in different tissues can be separated by pairwise comparison using N2. Conclusion: N2 represents an improvement over previous alignment-free similarity measures without compromising speed, which makes it a good candidate for large-scale sequence comparison of regulatory sequences. Availability: The software is part of the open-source C++ library SeqAn (www.seqan.de) and a compiled version can be downloaded at http://www.seqan.de/projects/alf.html Contact: goeke@molgen.mpg.de; vingron@molgen.mpg.de Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
  • 4
    Publication Date: 2012-09-30
    Description: Motivation: Ontologies provide a structured representation of the concepts of a domain of knowledge as well as the relations between them. Attribute ontologies are used to describe the characteristics of the items of a domain, such as the functions of proteins or the signs and symptoms of disease, which opens the possibility of searching a database of items for the best match to a list of observed or desired attributes. However, naive search methods do not perform well on realistic data because of noise in the data, imprecision in typical queries and because individual items may not display all attributes of the category they belong to. Results: We present a method for combining ontological analysis with Bayesian networks to deal with noise, imprecision and attribute frequencies and demonstrate an application of our method as a differential diagnostic support system for human genetics. Availability: We provide an implementation for the algorithm and the benchmark at http://compbio.charite.de/boqa/. Contact: Sebastian.Bauer@charite.de or Peter.Robinson@charite.de Supplementary Information: Supplementary Material for this article is available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
  • 5
    Publication Date: 2017-01-10
    Description: The binding and contribution of transcription factors (TF) to cell specific gene expression is often deduced from open-chromatin measurements to avoid costly TF ChIP-seq assays. Thus, it is important to develop computational methods for accurate TF binding prediction in open-chromatin regions (OCRs). Here, we report a novel segmentation-based method, TEPIC, to predict TF binding by combining sets of OCRs with position weight matrices. TEPIC can be applied to various open-chromatin data, e.g. DNaseI-seq and NOMe-seq. Additionally, Histone-Marks (HMs) can be used to identify candidate TF binding sites. TEPIC computes TF affinities and uses open-chromatin/HM signal intensity as quantitative measures of TF binding strength. Using machine learning, we find low affinity binding sites to improve our ability to explain gene expression variability compared to the standard presence/absence classification of binding sites. Further, we show that both footprints and peaks capture essential TF binding events and lead to a good prediction performance. In our application, gene-based scores computed by TEPIC with one open-chromatin assay nearly reach the quality of several TF ChIP-seq data sets. Finally, these scores correctly predict known transcriptional regulators as illustrated by the application to novel DNaseI-seq and NOMe-seq data for primary human hepatocytes and CD4+ T-cells, respectively.
    Keywords: Protein-nucleic acid interaction, Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 6
    Publication Date: 2016-12-24
    Description: In the course of a 2-year combined chronic toxicity-carcinogenicity study performed according to Organisation for Economic Co-operation and Development (OECD) Test Guideline 453, systemic (blood cell) genotoxicity of two OECD representative nanomaterials, CeO₂ NM-212 and BaSO₄ upon 3- or 6-month inhalation exposure to rats was assessed. DNA effects were analysed in leukocytes using the alkaline Comet assay, gene mutations and chromosome aberrations were measured in erythrocytes using the flow cytometric Pig-a gene mutation assay and the micronucleus test (applying both microscopic and flow cytometric evaluation), respectively. Since nano-sized CeO₂ elicited lung effects at concentrations of 5 mg/m³ (burdens of 0.5 mg/lung) in the preceding range-finding study, whereas nano-sized BaSO₄ did not induce any effect, female rats were exposed to aerosol concentrations of 0.1 up to 3 mg/m³ CeO₂ or 50 mg/m³ BaSO₄ nanomaterials (6 h/day; 5 days/week; whole-body exposure). The blood of animals treated with clean air served as negative control, whereas blood samples from rats treated orally with three doses of 20 mg/kg body weight ethylnitrosourea at 24 h intervals were used as positive controls. As expected, ethylnitrosourea elicited significant genotoxicity in the alkaline Comet and Pig-a gene mutation assays and in the micronucleus test. By contrast, 3- and 6-month CeO₂ or BaSO₄ nanomaterial inhalation exposure did not elicit significant findings in any of the genotoxicity tests. The results demonstrate that subchronic inhalation exposure to different low doses of CeO₂ or to a high dose of BaSO₄ nanomaterials does not induce genotoxicity on the rat hematopoietic system at the DNA, gene or chromosome levels.
    Print ISSN: 0267-8357
    Electronic ISSN: 1464-3804
    Topics: Biology , Medicine
  • 7
    Publication Date: 2016-06-01
    Description: Motivation: De novo transcriptome assembly is an integral part for many RNA-seq workflows. Common applications include sequencing of non-model organisms, cancer or meta transcriptomes. Most de novo transcriptome assemblers use the de Bruijn graph (DBG) as the underlying data structure. The quality of the assemblies produced by such assemblers is highly influenced by the exact word length k. As such no single k-mer value leads to optimal results. Instead, DBGs over different k-mer values are built and the assemblies are merged to improve sensitivity. However, no studies have investigated thoroughly the problem of automatically learning at which k-mer value to stop the assembly. Instead a suboptimal selection of k-mer values is often used in practice. Results: Here we investigate the contribution of a single k-mer value in a multi-k-mer based assembly approach. We find that a comparative clustering of related assemblies can be used to estimate the importance of an additional k-mer assembly. Using a model fit based algorithm we predict the k-mer value at which no further assemblies are necessary. Our approach is tested with different de novo assemblers for datasets with different coverage values and read lengths. Further, we suggest a simple post processing step that significantly improves the quality of multi-k-mer assemblies. Conclusion: We provide an automatic method for limiting the number of k-mer values without a significant loss in assembly quality but with savings in assembly time. This is a step forward to making multi-k-mer methods more reliable and easier to use. Availability and Implementation: A general implementation of our approach can be found under: https://github.com/SchulzLab/KREATION. Supplementary information: Supplementary data are available at Bioinformatics online. Contact: mschulz@mmci.uni-saarland.de
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
  • 8
    Publication Date: 2013-05-29
    Description: Sequencing of RNAs (RNA-Seq) has revolutionized the field of transcriptomics, but the reads obtained often contain errors. Read error correction can have a large impact on our ability to accurately assemble transcripts. This is especially true for de novo transcriptome analysis, where a reference genome is not available. Current read error correction methods, developed for DNA sequence data, cannot handle the overlapping effects of non-uniform abundance, polymorphisms and alternative splicing. Here we present SEquencing Error CorrEction in Rna-seq data (SEECER), a hidden Markov Model (HMM)–based method, which is the first to successfully address these problems. SEECER efficiently learns hundreds of thousands of HMMs and uses these to correct sequencing errors. Using human RNA-Seq data, we show that SEECER greatly improves on previous methods in terms of quality of read alignment to the genome and assembly accuracy. To illustrate the usefulness of SEECER for de novo transcriptome studies, we generated new RNA-Seq data to study the development of the sea cucumber Parastichopus parvimensis. Our corrected assembled transcripts shed new light on two important stages in sea cucumber development. Comparison of the assembled transcripts to known transcripts in other species has also revealed novel transcripts that are unique to sea cucumber, some of which we have experimentally validated. Supporting website: http://sb.cs.cmu.edu/seecer/.
    Keywords: Computational Methods, Massively Parallel (Deep) Sequencing
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 9
    Publication Date: 2013-05-23
    Description: Motivation: The rapid accumulation of knowledge in the field of Systems Biology during the past years requires advanced, but simple-to-use, methods for the visualization of information in a structured and easily comprehensible manner. Results: We have developed biographer, a web-based renderer and editor for reaction networks, which can be integrated as a library into tools dealing with network-related information. Our software enables visualizations based on the emerging standard Systems Biology Graphical Notation. It is able to import networks encoded in various formats such as SBML, SBGN-ML and jSBGN, a custom lightweight exchange format. The core package is implemented in HTML5, CSS and JavaScript and can be used within any kind of web-based project. It features interactive graph-editing tools and automatic graph layout algorithms. In addition, we provide a standalone graph editor and a web server, which contains enhanced features like web services for the import and export of models and visualizations in different formats. Availability: The biographer tool can be used at and downloaded from the web page http://biographer.biologie.hu-berlin.de/. The different software packages, including a server-independent version as well as a web server for Windows and Linux based systems, are available at http://code.google.com/p/biographer/ under the open-source license LGPL. Contact: edda.klipp@biologie.hu-berlin.de or handorf@physik.hu-berlin.de
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
  • 10
    Publication Date: 2013-03-26
    Description: Background: Low-grade gliomas (LGGs) are rare brain neoplasms, with survival spanning up to a few decades. Thus, accurate evaluations on how biomarkers impact survival among patients with LGG require long-term studies on samples prospectively collected over a long period. Methods: The 210 adult LGGs collected in our databank were screened for IDH1 and IDH2 mutations (IDHmut), MGMT gene promoter methylation (MGMTmet), 1p/19q loss of heterozygosity (1p19qloh), and nuclear TP53 immunopositivity (TP53pos). Multivariate survival analyses with multiple imputation of missing data were performed using either histopathology or molecular markers. Both models were compared using Akaike's information criterion (AIC). The molecular model was reduced by stepwise model selection to filter out the most critical predictors. A third model was generated to assess for various marker combinations. Results: Molecular parameters were better survival predictors than histology (AIC = 12.5, P < .001). Forty-five percent of studied patients died. MGMTmet was positively associated with IDHmut (P < .001). In the molecular model with marker combinations, IDHmut/MGMTmet combined status had a favorable impact on overall survival, compared with IDHwt (hazard ratio [HR] = 0.33, P < .01), and even more so the triple combination, IDHmut/MGMTmet/1p19qloh (HR = 0.18, P < .001). Furthermore, IDHmut/MGMTmet/TP53pos triple combination was a significant risk factor for malignant transformation (HR = 2.75, P < .05). Conclusion: By integrating networks of activated molecular glioma pathways, the model based on genotype better predicts prognosis than histology and, therefore, provides a more reliable tool for standardizing future treatment strategies.
    Print ISSN: 1522-8517
    Electronic ISSN: 1523-5866
    Topics: Medicine
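
Note on records 1 and 7: both abstracts refer to building de Bruijn graphs over several k-mer (hash length) values before merging assemblies. The following is a minimal illustrative sketch of that idea only, not the implementation used by Oases or KREATION. The function names `de_bruijn_graph` and `graphs_for_k_values` and the toy reads are hypothetical, and the sketch ignores reverse complements, coverage and error filtering.

```python
from collections import defaultdict


def de_bruijn_graph(reads, k):
    """Build a simple de Bruijn graph for word length k.

    Nodes are (k-1)-mers; each k-mer found in a read adds an edge
    from its (k-1)-length prefix to its (k-1)-length suffix.
    """
    graph = defaultdict(set)
    for read in reads:
        # Slide a window of length k across the read.
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return dict(graph)


def graphs_for_k_values(reads, k_values):
    """Build one graph per k, as a multi-k-mer approach would before merging."""
    return {k: de_bruijn_graph(reads, k) for k in k_values}


# Toy input (hypothetical reads, for illustration only).
reads = ["ACGTAC", "CGTACG"]
graphs = graphs_for_k_values(reads, [3, 4])

for k, graph in graphs.items():
    edge_count = sum(len(successors) for successors in graph.values())
    print(f"k={k}: {len(graph)} nodes with outgoing edges, {edge_count} edges")
```

Smaller k values tend to connect more reads but create more ambiguous branches, while larger k values resolve repeats at the cost of sensitivity for lowly expressed transcripts, which is why several graphs are usually built and compared.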