GLORIA — GEOMAR Library Ocean Research Information Access

1

Online Resource

Computational drug repositioning using low-rank matrix approximation and randomized algorithms

Luo, Huimin ; Li, Min ; Wang, Shaokai ; [et al.]

Oxford University Press (OUP) ; 2018

In: Bioinformatics Vol. 34, No. 11 ( 2018-06-01), p. 1904-1912

In: Bioinformatics, Oxford University Press (OUP), Vol. 34, No. 11 ( 2018-06-01), p. 1904-1912

Abstract: Computational drug repositioning is an important and efficient approach towards identifying novel treatments for diseases in drug discovery. The emergence of large-scale, heterogeneous biological and biomedical datasets has provided an unprecedented opportunity for developing computational drug repositioning methods. The drug repositioning problem can be modeled as a recommendation system that recommends novel treatments based on known drug–disease associations. The formulation under this recommendation system is matrix completion, assuming that the hidden factors contributing to drug–disease associations are highly correlated and thus the corresponding data matrix is low-rank. Under this assumption, the matrix completion algorithm fills out the unknown entries in the drug–disease matrix by constructing a low-rank matrix approximation, from which new drug–disease associations that have not yet been validated can be screened. Results: In this work, we propose a drug repositioning recommendation system (DRRS) to predict novel drug indications by integrating related data sources and validated information of drugs and diseases. Firstly, we construct a heterogeneous drug–disease interaction network by integrating drug–drug, disease–disease and drug–disease networks. The heterogeneous network is represented by a large drug–disease adjacency matrix, whose entries include drug pairs, disease pairs, known drug–disease interaction pairs and unknown drug–disease pairs. Then, we adopt a fast Singular Value Thresholding (SVT) algorithm to complete the drug–disease adjacency matrix with predicted scores for unknown drug–disease pairs. The comprehensive experimental results show that DRRS improves the prediction accuracy compared with the other state-of-the-art approaches. In addition, case studies for several selected drugs further demonstrate the practical usefulness of the proposed method.
Availability and implementation: http://bioinformatics.csu.edu.cn/resources/softs/DrugRepositioning/DRRS/index.html Supplementary information: Supplementary data are available at Bioinformatics online.
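
As a rough illustration of the adjacency matrix described in the abstract, the sketch below stacks drug–drug, disease–disease and drug–disease blocks into one symmetric matrix. The block layout, the function name `build_heterogeneous_matrix` and the toy values are assumptions made for illustration and are not taken from the DRRS implementation. Unknown drug–disease pairs are stored as 0; these are the entries a completion algorithm such as SVT would be asked to score.

```python
def build_heterogeneous_matrix(drug_sim, disease_sim, assoc):
    """Stack the three networks into one symmetric block matrix.

    Assumed layout (illustrative only):
        [ drug_sim   assoc       ]
        [ assoc^T    disease_sim ]
    """
    n_drugs, n_diseases = len(drug_sim), len(disease_sim)
    if len(assoc) != n_drugs or any(len(row) != n_diseases for row in assoc):
        raise ValueError("assoc must have shape (n_drugs, n_diseases)")
    # Top rows: each drug against all drugs, then against all diseases.
    top = [list(drug_sim[i]) + list(assoc[i]) for i in range(n_drugs)]
    # Bottom rows: each disease against all drugs (transposed block), then diseases.
    bottom = [
        [assoc[i][j] for i in range(n_drugs)] + list(disease_sim[j])
        for j in range(n_diseases)
    ]
    return top + bottom


# Toy data: 2 drugs, 1 disease; drug 0 is a known treatment, drug 1 is unknown (0).
matrix = build_heterogeneous_matrix(
    drug_sim=[[1.0, 0.5], [0.5, 1.0]],
    disease_sim=[[1.0]],
    assoc=[[1.0], [0.0]],
)
for row in matrix:
    print(row)
```

Keeping the similarity blocks next to the association block is what lets a low-rank completion step use drug and disease similarity when scoring the unknown pairs.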

Type of Medium: Online Resource

ISSN: 1367-4803 , 1367-4811

DOI: 10.1093/bioinformatics/bty013

Language: English

Publisher: Oxford University Press (OUP)

Publication Date: 2018

ZDB-ID: 1468345-3

SSG: 12

2

Online Resource

A sensitive repeat identification framework based on short and long reads

Liao, Xingyu ; Li, Min ; Hu, Kang ; [et al.]

Oxford University Press (OUP) ; 2021

In: Nucleic Acids Research Vol. 49, No. 17 ( 2021-09-27), p. e100-e100

In: Nucleic Acids Research, Oxford University Press (OUP), Vol. 49, No. 17 ( 2021-09-27), p. e100-e100

Abstract: Numerous studies have shown that repetitive regions in genomes play indispensable roles in the evolution, inheritance and variation of living organisms. However, most existing methods cannot achieve satisfactory performance in identifying repeats in terms of both accuracy and size, since NGS reads are too short to identify long repeats whereas SMS (Single Molecule Sequencing) long reads have high error rates. In this study, we present a novel identification framework, LongRepMarker, based on the global de novo assembly and k-mer based multiple sequence alignment for precisely marking long repeats in genomes. The major characteristics of LongRepMarker are as follows: (i) by introducing barcode linked reads and SMS long reads to assist the assembly of all short paired-end reads, it can identify the repeats to a greater extent; (ii) by finding the overlap sequences between assemblies or chromosomes, it locates the repeats faster and more accurately; (iii) by using the multi-alignment unique k-mers rather than the high frequency k-mers to identify repeats in overlap sequences, it can obtain the repeats more comprehensively and stably; (iv) by applying the parallel alignment model based on the multi-alignment unique k-mers, the efficiency of data processing can be greatly optimized and (v) by taking the corresponding identification strategies, structural variations that occur between repeats can be identified. Comprehensive experimental results show that LongRepMarker can achieve more satisfactory results than the existing de novo detection methods (https://github.com/BioinformaticsCSU/LongRepMarker).

Type of Medium: Online Resource

ISSN: 0305-1048 , 1362-4962

DOI: 10.1093/nar/gkab563

RVK: WA 15000

Language: English

Publisher: Oxford University Press (OUP)

Publication Date: 2021

ZDB-ID: 1472175-2

SSG: 12

3

Online Resource

EPGA: de novo assembly using the distributions of reads and insert size

Luo, Junwei ; Wang, Jianxin ; Zhang, Zhen ; [et al.]

Oxford University Press (OUP) ; 2015

In: Bioinformatics Vol. 31, No. 6 ( 2015-03-15), p. 825-833

In: Bioinformatics, Oxford University Press (OUP), Vol. 31, No. 6 ( 2015-03-15), p. 825-833

Abstract: Motivation: In genome assembly, the primary issue is how to determine upstream and downstream sequence regions of sequence seeds for constructing long contigs or scaffolds. When extending one sequence seed, repetitive regions in the genome always cause multiple feasible extension candidates which increase the difficulty of genome assembly. The universally accepted solution is choosing one based on read overlaps and paired-end (mate-pair) reads. However, this solution faces difficulties with regard to some complex repetitive regions. In addition, sequencing errors may produce false repetitive regions and uneven sequencing depth leads some sequence regions to have too few or too many reads. All the aforementioned problems prohibit existing assemblers from getting satisfactory assembly results. Results: In this article, we develop an algorithm, called extract paths for genome assembly (EPGA), which extracts paths from De Bruijn graph for genome assembly. EPGA uses a new score function to evaluate extension candidates based on the distributions of reads and insert size. The distribution of reads can solve problems caused by sequencing errors and short repetitive regions. Through assessing the variation of the distribution of insert size, EPGA can solve problems introduced by some complex repetitive regions. For solving uneven sequencing depth, EPGA uses relative mapping to evaluate extension candidates. On real datasets, we compare the performance of EPGA and other popular assemblers. The experimental results demonstrate that EPGA can effectively obtain longer and more accurate contigs and scaffolds. Availability and implementation: EPGA is publicly available for download at https://github.com/bioinfomaticsCSU/EPGA. Contact: jxwang@csu.edu.cn Supplementary information: Supplementary data are available at Bioinformatics online.
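
To make the extension problem in the abstract concrete, the sketch below builds a small De Bruijn graph from reads and lists the extension candidates of a node. It is a minimal illustration under assumed names (`de_bruijn_graph`, `extension_candidates`) and toy reads; it does not reproduce EPGA's score function, which the abstract says is based on the distributions of reads and insert size.

```python
from collections import defaultdict


def de_bruijn_graph(reads, k):
    """Map each (k-1)-mer to the set of (k-1)-mers that follow it in some read."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # Edge from the k-mer's prefix to its suffix.
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def extension_candidates(graph, node):
    """Return the possible next nodes; more than one means the path branches."""
    return sorted(graph.get(node, ()))


# Two toy reads that share "CGT" but continue differently after it.
reads = ["ACGTA", "CGTC"]
graph = de_bruijn_graph(reads, k=3)
print(extension_candidates(graph, "AC"))  # single candidate
print(extension_candidates(graph, "GT"))  # branching node
```

A node with several candidates, such as "GT" here, is the situation in which an assembler has to choose among extensions.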

Type of Medium: Online Resource

ISSN: 1367-4811 , 1367-4803

DOI: 10.1093/bioinformatics/btu762

Language: English

Publisher: Oxford University Press (OUP)

Publication Date: 2015

ZDB-ID: 1468345-3

SSG: 12

4

Online Resource

SCOP: a novel scaffolding algorithm based on contig classification and optimization

Li, Min ; Tang, Li ; Wu, Fang-Xiang ; [et al.]

Oxford University Press (OUP) ; 2019

In: Bioinformatics Vol. 35, No. 7 ( 2019-04-01), p. 1142-1150

In: Bioinformatics, Oxford University Press (OUP), Vol. 35, No. 7 ( 2019-04-01), p. 1142-1150

Abstract: Scaffolding is an essential step during the de novo sequence assembly process to infer the direction and order relationships between the contigs and make the sequence assembly results more continuous and complete. However, scaffolding still faces the challenges of repetitive regions in the genome, sequencing errors and uneven sequencing depth. Moreover, the accuracy of scaffolding greatly depends on the quality of contigs. Generally, the existing scaffolding methods construct a scaffold graph, and then optimize the graph by deleting spurious edges. Nevertheless, due to the wrong joints between contigs, some correct edges connecting contigs may be deleted. Results: In this study, we present a novel scaffolding method SCOP, which is the first method to classify the contigs and utilize the vertices and edges to optimize the scaffold graph. Specifically, SCOP employs alignment features and GC-content of paired reads to evaluate the quality of contigs (vertices), divides the contigs into three types (True, Uncertain and Misassembled), and then optimizes the scaffold graph based on the classification of contigs together with the alignment of edges. The experimental results on the datasets of GAGE-A and GAGE-B demonstrate that SCOP performs better than 12 other competing scaffolders. Availability and implementation: SCOP is publicly available for download at https://github.com/bioinfomaticsCSU/SCOP. Supplementary information: Supplementary data are available at Bioinformatics online.

Type of Medium: Online Resource

ISSN: 1367-4803 , 1367-4811

DOI: 10.1093/bioinformatics/bty773

Language: English

Publisher: Oxford University Press (OUP)

Publication Date: 2019

ZDB-ID: 1468345-3

SSG: 12

5

Online Resource

SSRE: Cell Type Detection Based on Sparse Subspace Representation and Similarity Enhancement

Liang, Zhenlan ; Li, Min ; Zheng, Ruiqing ; [et al.]

Oxford University Press (OUP) ; 2021

In: Genomics, Proteomics & Bioinformatics Vol. 19, No. 2 ( 2021-04-01), p. 282-291

In: Genomics, Proteomics & Bioinformatics, Oxford University Press (OUP), Vol. 19, No. 2 ( 2021-04-01), p. 282-291

Abstract: Accurate identification of cell types from single-cell RNA sequencing (scRNA-seq) data plays a critical role in a variety of scRNA-seq analysis studies. This task corresponds to solving an unsupervised clustering problem, in which the similarity measurement between cells affects the result significantly. Although many approaches for cell type identification have been proposed, the accuracy still needs to be improved. In this study, we proposed a novel single-cell clustering framework based on similarity learning, called SSRE. SSRE models the relationships between cells based on a subspace assumption, and generates a sparse representation of the cell-to-cell similarity. The sparse representation retains the most similar neighbors for each cell. Besides, three classical pairwise similarities are incorporated with a gene selection and enhancement strategy to further improve the effectiveness of SSRE. Tested on ten real scRNA-seq datasets and five simulated datasets, SSRE achieved superior performance in most cases compared to several state-of-the-art single-cell clustering methods. In addition, SSRE can be extended to visualization of scRNA-seq data and identification of differentially expressed genes. The MATLAB and Python implementations of SSRE are available at https://github.com/CSUBioGroup/SSRE.

Type of Medium: Online Resource

ISSN: 1672-0229 , 2210-3244

DOI: 10.1016/j.gpb.2020.09.004

Language: English

Publisher: Oxford University Press (OUP)

Publication Date: 2021

ZDB-ID: 2233708-8

6

Online Resource

Biomedical data and computational models for drug repositioning: a comprehensive review

Luo, Huimin ; Li, Min ; Yang, Mengyun ; [et al.]

Oxford University Press (OUP) ; 2021

In: Briefings in Bioinformatics Vol. 22, No. 2 ( 2021-03-22), p. 1604-1619

In: Briefings in Bioinformatics, Oxford University Press (OUP), Vol. 22, No. 2 ( 2021-03-22), p. 1604-1619

Abstract: Drug repositioning can drastically decrease the cost and duration taken by traditional drug research and development while avoiding the occurrence of unforeseen adverse events. With the rapid advancement of high-throughput technologies and the explosion of various biological data and medical data, computational drug repositioning methods have been appealing and powerful techniques to systematically identify potential drug-target interactions and drug-disease interactions. In this review, we first summarize the available biomedical data and public databases related to drugs, diseases and targets. Then, we discuss existing drug repositioning approaches and group them based on their underlying computational models consisting of classical machine learning, network propagation, matrix factorization and completion, and deep learning based models. We also comprehensively analyze common standard data sets and evaluation metrics used in drug repositioning, and give a brief comparison of various prediction methods on the gold standard data sets. Finally, we conclude our review with a brief discussion on challenges in computational drug repositioning, which include the problem of reducing the noise and incompleteness of biomedical data, the ensemble of various computational drug repositioning methods, the importance of designing reliable negative sample selection methods, new techniques dealing with the data sparseness problem, the construction of large-scale and comprehensive benchmark data sets and the analysis and explanation of the underlying mechanisms of predicted interactions.

Type of Medium: Online Resource

ISSN: 1467-5463 , 1477-4054

DOI: 10.1093/bib/bbz176

Language: English

Publisher: Oxford University Press (OUP)

Publication Date: 2021

ZDB-ID: 2036055-1

SSG: 12

7

Online Resource

Key residues influencing binding affinities of 2019-nCoV with ACE2 in different species

Fang, Senbiao ; Zheng, Ruoqian ; Lei, Chuqi ; [et al.]

Oxford University Press (OUP) ; 2021

In: Briefings in Bioinformatics Vol. 22, No. 2 ( 2021-03-22), p. 963-975

In: Briefings in Bioinformatics, Oxford University Press (OUP), Vol. 22, No. 2 ( 2021-03-22), p. 963-975

Abstract: The Novel Coronavirus Disease 2019 (COVID-19) has become an international public health emergency, which poses the most serious threat to human health around the world. Accumulating evidence has shown that the new coronavirus can infect not only human beings but also other species, which might result in cross-species infections. In this research, 1056 ACE2 protein sequences are collected from the NCBI database, and 173 species with >60% sequence identity compared with that of human beings are selected for further analysis. We find that 14 polar residues forming the binding interface of the ACE2/2019-nCoV-Spike complex play an important role in maintaining protein–protein stability. Among them, 8 polar residues at the same positions as those of human ACE2 are highly conserved, which ensures its basic binding affinity with the novel coronavirus. Five of the other 6 unconserved polar residues (positions at human ACE2: Q24, D30, K31, H34 and E35) are proved to have an effect on the binding patterns among species. We select 21 species keeping close contacts with human beings, construct their ACE2 three-dimensional structures by the Homology Modeling method and calculate the binding free energies of their ACE2/2019-nCoV-Spike complexes. We find that the ACE2 from all the 21 species possesses the capability to bind with the novel coronavirus. Compared with human beings, 8 species (cow, deer, cynomys, chimpanzee, monkey, sheep, dolphin and whale) present almost the same binding abilities, and 3 species (bat, pig and dog) show significant improvements in binding affinities. We hope this research could provide significant help for future epidemic detection, drug and vaccine development and even global ecosystem protection.

Type of Medium: Online Resource

ISSN: 1467-5463 , 1477-4054

DOI: 10.1093/bib/bbaa329

Language: English

Publisher: Oxford University Press (OUP)

Publication Date: 2021

ZDB-ID: 2036055-1

SSG: 12

8

Online Resource

LDAP: a web server for lncRNA-disease association prediction

Lan, Wei ; Li, Min ; Zhao, Kaijie ; [et al.]

Oxford University Press (OUP) ; 2017

In: Bioinformatics Vol. 33, No. 3 ( 2017-02-01), p. 458-460

In: Bioinformatics, Oxford University Press (OUP), Vol. 33, No. 3 ( 2017-02-01), p. 458-460

Abstract: Increasing evidence has demonstrated that long noncoding RNAs (lncRNAs) play important roles in many human diseases. Therefore, predicting novel lncRNA-disease associations would contribute to dissecting the complex mechanisms of disease pathogenesis. Some computational methods have been developed to infer lncRNA-disease associations. However, most of these methods infer lncRNA-disease associations based on only a single data resource. Results: In this paper, we propose a new computational method to predict lncRNA-disease associations by integrating multiple biological data resources. Then, we implement this method as a web server for lncRNA-disease association prediction (LDAP). The input of the LDAP server is the lncRNA sequence. LDAP predicts potential lncRNA-disease associations by using a bagging SVM classifier based on lncRNA similarity and disease similarity. Availability and implementation: The web server is available at http://bioinformatics.csu.edu.cn/ldap Supplementary information: Supplementary data are available at Bioinformatics online.

Type of Medium: Online Resource

ISSN: 1367-4803 , 1367-4811

DOI: 10.1093/bioinformatics/btw639

Language: English

Publisher: Oxford University Press (OUP)

Publication Date: 2017

ZDB-ID: 1468345-3

SSG: 12

9

Online Resource

EPGA2: memory-efficient de novo assembler

Luo, Junwei ; Wang, Jianxin ; Li, Weilong ; [et al.]

Oxford University Press (OUP) ; 2015

In: Bioinformatics Vol. 31, No. 24 ( 2015-12-15), p. 3988-3990

In: Bioinformatics, Oxford University Press (OUP), Vol. 31, No. 24 ( 2015-12-15), p. 3988-3990

Abstract: Motivation: In genome assembly, as sequencing coverage and genome size grow, most current software requires a large amount of memory to handle the sequence data. However, most researchers cannot meet these computing resource requirements, which prevents most current software from being used in practice. Results: In this article, we present an updated algorithm called EPGA2, which applies some new modules and can bring about improved assembly results in small memory. For reducing peak memory in genome assembly, EPGA2 adopts memory-efficient DSK to count K-mers and revised BCALM to construct the De Bruijn Graph. Moreover, EPGA2 parallelizes the step of Contigs Merging and adds Errors Correction in its pipeline. Our experiments demonstrate that all these changes in EPGA2 are more useful for genome assembly. Availability and implementation: EPGA2 is publicly available for download at https://github.com/bioinfomaticsCSU/EPGA2. Contact: jxwang@csu.edu.cn Supplementary information: Supplementary data are available at Bioinformatics online.
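
The K-mer counting step mentioned in the abstract can be pictured with the short sketch below. It is an in-memory illustration only, with an assumed function name (`count_kmers`) and toy input; DSK itself is a disk-based counter designed to avoid holding all K-mers in memory, which this sketch does not attempt to reproduce.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every length-k substring across all reads (in memory)."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


# Toy read: the 2-mer "AC" occurs twice, "CG" and "GA" once each.
counts = count_kmers(["ACGAC"], k=2)
for kmer, n in sorted(counts.items()):
    print(kmer, n)
```

Holding such a table in memory grows with the number of distinct K-mers, which is the cost that a disk-based counter is meant to reduce.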

Type of Medium: Online Resource

ISSN: 1367-4811 , 1367-4803

DOI: 10.1093/bioinformatics/btv487

Language: English

Publisher: Oxford University Press (OUP)

Publication Date: 2015

ZDB-ID: 1468345-3

SSG: 12

10

Online Resource

BOSS: a novel scaffolding algorithm based on an optimized scaffold graph

Luo, Junwei ; Wang, Jianxin ; Zhang, Zhen ; [et al.]

Oxford University Press (OUP) ; 2017

In: Bioinformatics Vol. 33, No. 2 ( 2017-01-15), p. 169-176

In: Bioinformatics, Oxford University Press (OUP), Vol. 33, No. 2 ( 2017-01-15), p. 169-176

Abstract: While aiming to determine orientations and orders of fragmented contigs, scaffolding is an essential step of assembly pipelines and can make assembly results more complete. Most existing scaffolding tools adopt scaffold graph approaches. However, due to repetitive regions in the genome, sequencing errors and uneven sequencing depth, constructing an accurate scaffold graph is still a challenging task. Results: In this paper, we present a novel algorithm (called BOSS), which employs paired reads for scaffolding. To construct a scaffold graph, BOSS utilizes the distribution of insert size to decide whether an edge between two vertices (contigs) should be added and how an edge should be weighted. Moreover, BOSS adopts an iterative strategy to detect spurious edges whose removal can guarantee no contradictions in the scaffold graph. Based on the scaffold graph constructed, BOSS employs a heuristic algorithm to sort vertices (contigs) and then generates scaffolds. The experimental results demonstrate that BOSS produces more satisfactory scaffolds, compared with other popular scaffolding tools on real sequencing data of four genomes. Availability and implementation: BOSS is publicly available for download at https://github.com/bioinfomaticsCSU/BOSS. Supplementary information: Supplementary data are available at Bioinformatics online.

Type of Medium: Online Resource

ISSN: 1367-4803 , 1367-4811

DOI: 10.1093/bioinformatics/btw597

Language: English

Publisher: Oxford University Press (OUP)

Publication Date: 2017

ZDB-ID: 1468345-3

SSG: 12
