GLORIA

GEOMAR Library Ocean Research Information Access

Filter
  • Oxford University Press (OUP)  (7)
  • Gao, Xin  (7)
  • 1
    In: Nucleic Acids Research, Oxford University Press (OUP), Vol. 50, No. D1 ( 2022-01-07), p. D27-D38
    Abstract: The National Genomics Data Center (NGDC), part of the China National Center for Bioinformation (CNCB), provides a family of database resources to support global research in both academia and industry. With the explosively accumulated multi-omics data at ever-faster rates, CNCB-NGDC is constantly scaling up and updating its core database resources through big data archive, curation, integration and analysis. In the past year, efforts have been made to synthesize the growing data and knowledge, particularly in single-cell omics and precision medicine research, and a series of resources have been newly developed, updated and enhanced. Moreover, CNCB-NGDC has continued to daily update SARS-CoV-2 genome sequences, variants, haplotypes and literature. Particularly, OpenLB, an open library of bioscience, has been established by providing easy and open access to a substantial number of abstract texts from PubMed, bioRxiv and medRxiv. In addition, Database Commons is significantly updated by cataloguing a full list of global databases, and BLAST tools are newly deployed to provide online sequence search services. All these resources along with their services are publicly accessible at https://ngdc.cncb.ac.cn.
    Type of Medium: Online Resource
    ISSN: 0305-1048 , 1362-4962
    Language: English
    Publisher: Oxford University Press (OUP)
    Publication Date: 2022
    ZDB-ID: 1472175-2
    SSG: 12
  • 2
    In: Nucleic Acids Research, Oxford University Press (OUP), Vol. 51, No. D1 ( 2023-01-06), p. D18-D28
    Abstract: The National Genomics Data Center (NGDC), part of the China National Center for Bioinformation (CNCB), provides a family of database resources to support global academic and industrial communities. With the explosive accumulation of multi-omics data generated at an unprecedented rate, CNCB-NGDC constantly expands and updates core database resources by big data archive, integrative analysis and value-added curation. In the past year, efforts have been devoted to integrating multiple omics data, synthesizing the growing knowledge, developing new resources and upgrading a set of major resources. Particularly, several database resources are newly developed for infectious diseases and microbiology (MPoxVR, KGCoV, ProPan), cancer-trait association (ASCancer Atlas, TWAS Atlas, Brain Catalog, CCAS) as well as tropical plants (TCOD). Importantly, given the global health threat caused by monkeypox virus and SARS-CoV-2, CNCB-NGDC has newly constructed the monkeypox virus resource, along with frequent updates of SARS-CoV-2 genome sequences, variants as well as haplotypes. All the resources and services are publicly accessible at https://ngdc.cncb.ac.cn.
    Type of Medium: Online Resource
    ISSN: 0305-1048 , 1362-4962
    Language: English
    Publisher: Oxford University Press (OUP)
    Publication Date: 2023
    ZDB-ID: 1472175-2
    SSG: 12
  • 3
    In: Briefings in Bioinformatics, Oxford University Press (OUP), Vol. 23, No. 5 ( 2022-09-20)
    Abstract: As a frontier field of individualized therapy, microRNA (miRNA) pharmacogenomics facilitates the understanding of different individual responses to certain drugs and provides a reasonable reference for clinical treatment. However, the known drug resistance-associated miRNAs are not yet sufficient to support precision medicine. Although existing methods are effective, they all focus on modelling miRNA-drug resistance interaction graphs, making their performance bounded by the interaction density. In this study, we propose a framework for miRNA-drug resistance prediction through efficient neural architecture search and graph isomorphism networks (NASMDR). NASMDR uses attribute information instead of the commonly used interactive graph information. In the cross-validation experiment, the proposed framework can achieve an AUC of 0.9468 on the ncDR dataset, which is 2.29% higher than the state-of-the-art method. In addition, we propose a novel sequence characterization approach, k-mer Sparse Nonnegative Matrix Factorization (KSNMF). The results show that NASMDR provides novel insights for integrating efficient neural architecture search and graph isomorphic networks into a unified framework to predict drug resistance-related miRNAs. The codes for NASMDR are available at https://github.com/kaizheng-academic/NASMDR.
    Type of Medium: Online Resource
    ISSN: 1467-5463 , 1477-4054
    Language: English
    Publisher: Oxford University Press (OUP)
    Publication Date: 2022
    ZDB-ID: 2036055-1
    SSG: 12
  • 4
    In: Nucleic Acids Research, Oxford University Press (OUP), Vol. 50, No. D1 ( 2022-01-07), p. D236-D245
    Abstract: Repeats are prevalent in the genomes of all bacteria, plants and animals, and they cover nearly half of the Human genome, which play indispensable roles in the evolution, inheritance, variation and genomic instability, and serve as substrates for chromosomal rearrangements that include disease-causing deletions, inversions, and translocations. Comprehensive identification, classification and annotation of repeats in genomes can provide accurate and targeted solutions towards understanding and diagnosis of complex diseases, optimization of plant properties and development of new drugs. RepBase and Dfam are two most frequently used repeat databases, but they are not sufficiently complete. Due to the lack of a comprehensive repeat database of multiple species, the current research in this field is far from being satisfactory. LongRepMarker is a new framework developed recently by our group for comprehensive identification of genomic repeats. We here propose msRepDB based on LongRepMarker, which is currently the most comprehensive multi-species repeat database, covering >80 000 species. Comprehensive evaluations show that msRepDB contains more species, and more complete repeats and families than RepBase and Dfam databases. (https://msrepdb.cbrc.kaust.edu.sa/pages/msRepDB/index.html).
    Type of Medium: Online Resource
    ISSN: 0305-1048 , 1362-4962
    Language: English
    Publisher: Oxford University Press (OUP)
    Publication Date: 2022
    ZDB-ID: 1472175-2
    SSG: 12
  • 5
    In: Nucleic Acids Research, Oxford University Press (OUP), Vol. 49, No. 17 ( 2021-09-27), p. e100-e100
    Abstract: Numerous studies have shown that repetitive regions in genomes play indispensable roles in the evolution, inheritance and variation of living organisms. However, most existing methods cannot achieve satisfactory performance on identifying repeats in terms of both accuracy and size, since NGS reads are too short to identify long repeats whereas SMS (Single Molecule Sequencing) long reads are with high error rates. In this study, we present a novel identification framework, LongRepMarker, based on the global de novo assembly and k-mer based multiple sequence alignment for precisely marking long repeats in genomes. The major characteristics of LongRepMarker are as follows: (i) by introducing barcode linked reads and SMS long reads to assist the assembly of all short paired-end reads, it can identify the repeats to a greater extent; (ii) by finding the overlap sequences between assemblies or chromosomes, it locates the repeats faster and more accurately; (iii) by using the multi-alignment unique k-mers rather than the high frequency k-mers to identify repeats in overlap sequences, it can obtain the repeats more comprehensively and stably; (iv) by applying the parallel alignment model based on the multi-alignment unique k-mers, the efficiency of data processing can be greatly optimized and (v) by taking the corresponding identification strategies, structural variations that occur between repeats can be identified. Comprehensive experimental results show that LongRepMarker can achieve more satisfactory results than the existing de novo detection methods (https://github.com/BioinformaticsCSU/LongRepMarker).
    Type of Medium: Online Resource
    ISSN: 0305-1048 , 1362-4962
    Language: English
    Publisher: Oxford University Press (OUP)
    Publication Date: 2021
    ZDB-ID: 1472175-2
    SSG: 12
  • 6
    In: Briefings in Bioinformatics, Oxford University Press (OUP), Vol. 23, No. 1 ( 2022-01-17)
    Abstract: Long-read sequencing technology enables significant progress in de novo genome assembly. However, the high error rate and the wide error distribution of raw reads result in a large number of errors in the assembly. Polishing is a procedure to fix errors in the draft assembly and improve the reliability of genomic analysis. However, existing methods treat all the regions of the assembly equally while there are fundamental differences between the error distributions of these regions. How to achieve very high accuracy in genome assembly is still a challenging problem. Motivated by the uneven errors in different regions of the assembly, we propose a novel polishing workflow named BlockPolish. In this method, we divide contigs into blocks with low complexity and high complexity according to statistics of aligned nucleotide bases. Multiple sequence alignment is applied to realign raw reads in complex blocks and optimize the alignment result. Due to the different distributions of error rates in trivial and complex blocks, two multitask bidirectional long short-term memory (LSTM) networks are proposed to predict the consensus sequences. In the whole-genome assemblies of NA12878 assembled by Wtdbg2 and Flye using Nanopore data, BlockPolish has a higher polishing accuracy than other state-of-the-art methods including Racon, Medaka and MarginPolish & HELEN. In all assemblies, errors are predominantly indels and BlockPolish has a good performance in correcting them. In addition to the Nanopore assemblies, we further demonstrate that BlockPolish can also reduce the errors in the PacBio assemblies. The source code of BlockPolish is freely available on GitHub (https://github.com/huangnengCSU/BlockPolish).
    Type of Medium: Online Resource
    ISSN: 1467-5463 , 1477-4054
    Language: English
    Publisher: Oxford University Press (OUP)
    Publication Date: 2022
    ZDB-ID: 2036055-1
    SSG: 12
  • 7
    In: Bioinformatics, Oxford University Press (OUP), Vol. 37, No. 19 ( 2021-10-11), p. 3120-3127
    Abstract: Oxford Nanopore sequencing producing long reads at low cost has made many breakthroughs in genomics studies. However, the large number of errors in Nanopore genome assembly affect the accuracy of genome analysis. Polishing is a procedure to correct the errors in genome assembly and can improve the reliability of the downstream analysis. However, the performances of the existing polishing methods are still not satisfactory. Results: We developed a novel polishing method, NeuralPolish, to correct the errors in assemblies based on alignment matrix construction and orthogonal Bi-GRU networks. In this method, we designed an alignment feature matrix for representing read-to-assembly alignment. Each row of the matrix represents a read, and each column represents the aligned bases at each position of the contig. In the network architecture, a bi-directional GRU network is used to extract the sequence information inside each read by processing the alignment matrix row by row. After that, the feature matrix is processed by another bi-directional GRU network column by column to calculate the probability distribution. Finally, a CTC decoder generates a polished sequence with a greedy algorithm. We used five real datasets and three assembly tools including Wtdbg2, Flye and Canu for testing, and compared the results of different polishing methods including NeuralPolish, Racon, MarginPolish, HELEN and Medaka. Comprehensive experiments demonstrate that NeuralPolish achieves more accurate assembly with fewer errors than other polishing methods and can improve the accuracy of assembly obtained by different assemblers. Availability and implementation: https://github.com/huangnengCSU/NeuralPolish.git. Supplementary information: Supplementary data are available at Bioinformatics online.
    Type of Medium: Online Resource
    ISSN: 1367-4803 , 1367-4811
    Language: English
    Publisher: Oxford University Press (OUP)
    Publication Date: 2021
    ZDB-ID: 1468345-3
    SSG: 12