GLORIA — GEOMAR Library Ocean Research Information Access

A new efficient probabilistic model for mining labeled ordered trees applied to glycobiology

Hashimoto, Kosuke ; Aoki-Kinoshita, Kiyoko Flora ; Ueda, Nobuhisa ; [et al.]

Association for Computing Machinery (ACM) ; 2008

In: ACM Transactions on Knowledge Discovery from Data, Vol. 2, No. 1 (2008-03), p. 1-30

Abstract: Mining frequent patterns from large datasets is an important issue in data mining. Recently, complex and unstructured (or semi-structured) datasets have appeared as targets for major data mining applications, including text mining, web mining and bioinformatics. Our work focuses on labeled ordered trees, which are typical semi-structured data. In bioinformatics, carbohydrate sugar chains, or glycans, can be modeled as labeled ordered trees. Glycans are the third major class of biomolecules, having important roles in signaling and recognition. For mining labeled ordered trees, we propose a new probabilistic model and an efficient learning scheme for it, which together significantly improve on the time and space complexity of an existing probabilistic model for labeled ordered trees. We evaluated the performance of the proposed model, comparing it with those of other probabilistic models, using synthetic as well as real datasets from glycobiology. Experimental results showed that the proposed model drastically reduced computation time relative to the competing model, while retaining predictive power and avoiding overfitting to the training data. Finally, we assessed our results on real data from a variety of biological viewpoints, verifying known facts in glycobiology.
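As an illustration of the data structure the abstract refers to, the following is a minimal sketch of a labeled ordered tree, assuming each node carries a monosaccharide label and an ordered list of children. The names `Node`, `parent_child_pairs` and `sibling_pairs` are hypothetical and do not come from the article; the counting shown here is only the kind of parent-child and sibling statistic a probabilistic tree model could be estimated from, not the authors' model or learning scheme.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """One node of a labeled ordered tree (hypothetical helper type)."""
    label: str
    children: List["Node"] = field(default_factory=list)  # order is significant


def parent_child_pairs(root: Node) -> Counter:
    """Count (parent label, child label) pairs over the whole tree."""
    counts = Counter()
    stack = [root]
    while stack:
        node = stack.pop()
        for child in node.children:
            counts[(node.label, child.label)] += 1
            stack.append(child)
    return counts


def sibling_pairs(root: Node) -> Counter:
    """Count (elder sibling label, next younger sibling label) pairs."""
    counts = Counter()
    stack = [root]
    while stack:
        node = stack.pop()
        for elder, younger in zip(node.children, node.children[1:]):
            counts[(elder.label, younger.label)] += 1
        stack.extend(node.children)
    return counts


# The N-glycan core (two GlcNAc residues followed by a branching mannose),
# used here purely as an example input.
core = Node("GlcNAc", [Node("GlcNAc", [Node("Man", [Node("Man"), Node("Man")])])])

print(parent_child_pairs(core))
print(sibling_pairs(core))
```

Keeping the children in a list rather than a set is what makes the tree ordered: the sibling counts depend on which child comes first.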

Type of Medium: Online Resource

ISSN: 1556-4681 , 1556-472X

DOI: 10.1145/1342320.1342326

Language: English

Publisher: Association for Computing Machinery (ACM)

Publication Date: 2008

ZDB-ID: 2257358-6

Managing and analyzing carbohydrate data

Aoki, Kiyoko F. ; Ueda, Nobuhisa ; Yamaguchi, Atsuko ; [et al.]

Association for Computing Machinery (ACM) ; 2004

In: ACM SIGMOD Record, Vol. 33, No. 2 (2004-06), p. 33-38

Abstract: One of the most vital molecules in multicellular organisms is the carbohydrate, as it is structurally important in the construction of such organisms. In fact, all cells in nature carry carbohydrate sugar chains, or glycans, that help modulate various cell-cell events for the development of the organism. Unfortunately, informatics research on glycans has been slow in comparison to DNA and proteins, largely due to difficulties in the biological analysis of glycan structures. Our work consists of data engineering approaches for gaining some understanding of the glycan data that is currently publicly available. In particular, by modeling glycans as labeled unordered trees, we have implemented a tree-matching algorithm for measuring tree similarity. Our algorithm utilizes proven efficient methodologies in computer science that have been extended and developed for glycan data. Moreover, since glycans are recognized by various agents in multicellular organisms, capturing the patterns that might be recognized required capturing dependencies that seem to range beyond the directly connected nodes in a tree. Therefore, by defining glycans as labeled ordered trees, we were able to develop a new probabilistic tree model such that sibling patterns across a tree could be mined. We provide promising results from our methodologies that could prove useful for the future of glycome informatics.
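To make the idea of matching labeled unordered trees concrete, the following is a rough sketch, assuming trees are written as `(label, [children])` tuples and similarity is taken to be the size of the largest common top-down subtree when child order is ignored. The function name `common_subtree_size` is hypothetical, and this brute-force recursion is a stand-in for illustration only; it is not the tree-matching algorithm described in the article.

```python
from itertools import permutations


def common_subtree_size(a, b):
    """Size of the largest common rooted subtree of two unordered trees.

    Each tree is a (label, [children]) tuple. Children are matched in
    every possible way, which is only practical for the small fan-out
    assumed here.
    """
    if a[0] != b[0]:
        return 0  # roots with different labels cannot be matched
    small, large = sorted((a[1], b[1]), key=len)
    best = 0
    # Try every assignment of the shorter child list onto the longer one.
    for perm in permutations(large, len(small)):
        total = sum(common_subtree_size(x, y) for x, y in zip(small, perm))
        best = max(best, total)
    return 1 + best


t1 = ("Man", [("Man", []), ("GlcNAc", [])])
t2 = ("Man", [("GlcNAc", []), ("Man", []), ("Gal", [])])

print(common_subtree_size(t1, t2))
```

Because every child assignment is tried, the score does not change when siblings are reordered, which is the property that distinguishes unordered from ordered matching.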

Type of Medium: Online Resource

ISSN: 0163-5808

DOI: 10.1145/1024694.1024700

Language: English

Publisher: Association for Computing Machinery (ACM)

Publication Date: 2004

ZDB-ID: 243829-X

ZDB-ID: 2051432-3
