Keywords:
Data structures (Computer science).
Electronic books.
Description / Table of Contents:
This book covers the latest advances in structure inference in heterogeneous collections of documents and data, offering a comprehensive view of the state of the art and identifying challenges and opportunities for further research and development.
Type of Medium:
Online Resource
Pages:
1 online resource (448 pages)
Edition:
1st ed.
ISBN:
9783642229138
Series Statement:
Studies in Computational Intelligence Series ; v.375
URL:
https://ebookcentral.proquest.com/lib/geomar/detail.action?docID=3067350
DDC:
006.3
Language:
English
Note:
Title Page -- Foreword -- Preface by Editors -- Contents -- List of Contributors -- Learning Structure and Schemas from Heterogeneous Domains in Networked Systems Surveyed -- Introduction -- Learning Patterns in Sensor Networks -- Learning Structures in Biological Domains -- Learning in Distributed Automation and Control Systems -- Learning Structures in Social Networks -- Learning Structures in Peer-to-Peer Networks -- Learning and Privacy-Preserving in Distributed Environments -- Conclusion -- References -- Handling Hierarchically Structured Resources Addressing Interoperability Issues in Digital Libraries -- Introduction and Motivation -- Objectives and Contributions -- The Background Context and Technologies -- Archives and Archival Descriptions -- EAD: Encoded Archival Description -- OAI-PMH and Dublin Core -- The NESTOR Model -- The NESTOR Algebra -- The NESTOR Prototype: Addressing Interoperability for Digital Archives -- How to Represent an Archive through the NESTOR Model -- Analysis of the Requirements -- Retaining Archival Hierarchy and Context throughout an XML Tree -- Encoding, Accessing and Sharing an Archive through Sets -- Conclusions and Future Work -- References -- Administrative Document Analysis and Structure -- Introduction -- Case-Based Reasoning -- CBR Terminology -- Problem Elaboration -- Similar Case Search -- Adaptation -- Learning -- CBR for Document Image Analysis: CBRDIA -- The Proposed Approach -- Document Structures -- Problem Representation -- Problem Solving -- Experiments -- Experiments on CBRDIA -- Experiments on Administrative Documents -- Conclusion -- References -- Automatic Document Layout Analysis through Relational Machine Learning -- Introduction -- Related Work -- Preliminaries -- Learning Layout Correction Theories -- From Manual to Automatic Improvement of the Layout Correction -- Tool Architecture.
The Learning System -- Description Language -- Experiments -- Conclusions -- References -- Dataspaces: Where Structure and Schema Meet -- Introduction -- Data Structuring -- Data Integration: The Story so Far -- Schema Mapping -- Keyword-Driven Queries -- The Web of Data -- Dataspaces -- Dataspace Dimensions -- Dataspace Profiling -- Querying and Searching -- Application Domain -- A Roundup of Existing Projects on Managing Structured Data -- Google BigTable -- Apache Cassandra -- Apache Hadoop -- Apache CouchDB -- DHT-Based Data Management Systems -- Google Fusion Tables -- WebTables -- Yahoo! SearchMonkey -- iMeMex -- Conclusions and Future Work -- References -- Transductive Learning of Logical Structures from Document Images -- Introduction -- Motivation and Problem Definition -- Related Work -- Extracting Emerging Patterns with SPADA -- Document Description -- The Mining Step -- Transductive Classification -- Experiments -- Conclusions -- References -- Progressive Filtering on the Web: The Press Reviews Case Study -- Introduction and Motivation -- Mission -- Related Work -- Hierarchical Text Categorization -- The Input Imbalance Problem -- Agents and Information Retrieval -- Progressive Filtering in Text Categorization -- The Approach -- The Threshold Selection Algorithm -- A Case Study: NEWS.MAS -- The Implemented System -- Experimental Results -- Conclusions -- References -- A Hybrid Binarization Technique for Document Images -- Introduction -- Related Work -- Algorithm Description -- Application of Iterative Global Thresholding -- Noisy Area Detection -- Re-application of IGT (Local Thresholding) -- Experimental Results -- Discussion - Future work -- Conclusion -- References -- Digital Libraries and Document Image Retrieval Techniques: A Survey -- Introduction -- Retrieval Paradigms -- Features -- Pixel Level -- Column Level -- Sliding Window.
Stroke and Primitive Level -- Connected-Component Level -- Word Level -- Line and Page Level -- Shape Descriptor -- Representation -- Similarity Measure -- Clustering -- Matching -- Conclusions -- References -- Mining Biomedical Text towards Building a Quantitative Food-Disease-Gene Network -- Introduction and Motivation -- Related Work -- Named Entity Recognition (NER) in Biomedical Text -- Relationship Extraction -- Polarity and Strength Analysis -- Relationship Integration and Visualization -- Named Entity Recognition -- Improving the Performance of Food Recognition -- Abbreviations and Co-reference Recognition -- Verb-Centric Relationship Extraction -- Relationship Polarity and Strength Analysis -- Feature Space Design -- Feature Selection -- Relationship Integration and Visualization -- Evaluation -- Evaluation of the Named Entity Recognition Module -- Evaluation of the Relationship Extraction Algorithm -- Evaluation of Relationship Polarity and Strength Analysis -- Discussions and Conclusion -- References -- Mining Tinnitus Data Based on Clustering and New Temporal Features -- Introduction -- TRT Background -- TRT Data Collection -- Information Retrieval -- Mining Text Data -- Temporal Feature Design for Continuous Data -- Temporal Feature Design for Categorical Data -- System Overview -- Experiments and Results -- Experiment Type I -- Two-Bin Clustering -- Experiment Type II -- Experiment Type III -- Conclusion -- References -- DTW-GO Based Microarray Time Series Data Analysis for Gene-Gene Regulation Prediction -- Introduction -- What Is Microarray? -- Importance of Microarray Technology -- Microarray Data Processing -- Microarray and Gene Ontology -- Research Issues in Microarray Time-Series Data -- Missing Value Imputation -- Gene Regulation Prediction -- Gene Clustering and Statistical Operations -- DTW-GO Based Microarray Data Analysis.
Dynamic Time Warping -- Gene Ontology -- DTW-GO Based Microarray Data Analysis -- Datasets and Performance Assessment -- Real Microarray Dataset -- Assessment of Imputation Accuracy -- Accuracy of Gene Regulation Prediction -- Experimental Results and Discussion -- Design of Experiments -- Results and Discussion -- Conclusions -- References -- Integrating Content and Structure into a Comprehensive Framework for XML Document Similarity Represented in 3D Space -- Introduction -- Problem Statement -- XML Background -- Problem Definition -- Similarity Metric Overview -- Similarity Metric -- Structural Encoding -- Content Encoding -- Nested Content Encoding -- Difference Operator -- Visual Representation and Experiments -- MS Word Incremental Saves -- Random News Documents -- Conclusions and Future Work -- References -- Modelling User Behaviour on Page Content and Layout in Recommender Systems -- Introduction -- Personalization and Browsing Behaviour -- Motivation and Objectives -- Literature Review and Survey -- Browsing Behaviour: Action -- Browsing Behaviour: Visual -- A Recommender System Based on Browsing Behaviour -- Recommender System Architecture -- Recommender Implementation -- Structure, Layout, and Schema Learning -- Profiling Layout and Design -- Formalizing Layout and Design -- Layout and Design Learning -- Layout and Design Matching -- Conclusions -- References -- MANENT: An Infrastructure for Integrating, Structuring and Searching Digital Libraries -- Motivation -- Background -- Standards for Digital Library Access and Description -- Ontologies and Related Languages and Tools -- WordNet Domains and the WordNet Domains Ontology -- Text Semantic Similarity -- MANENT -- The MANENT Architecture -- Metadata Classification According to the WordNet Domains Ontology -- Experiments and Results -- Related Work.
Related Work on Digital Libraries Infrastructures -- Related Work on WordNet Domains -- Conclusions and Future Work -- References -- Low-Level Document Image Analysis and Description: From Appearance to Structure -- Introduction -- Motivation, Objectives and Contributions -- Background -- Modeling the Appearance of an Ancient Document -- Extracting Pattern Information from Low-Level Processing -- Linear Instantaneous Case -- Linear Convolutional Case -- A Metadata Schema to Describe Data, Procedures, and Results -- Experimental Evaluation -- Conclusions and Future Work -- References -- Model Learning from Published Aggregated Data -- Introduction -- Mission, Objectives, and Contributions -- Related Work -- Aggregated Data -- Attributional Rules as Knowledge Representation -- Rule Induction -- AQ Algorithm -- Rule Induction from Aggregated Data -- Calculating Coverage -- Evaluation -- Discussion -- Conclusion -- References -- Data De-duplication: A Review -- Introduction -- Problem Description -- Supervised Approaches to De-duplication -- Relational Data De-duplication Approaches -- Multidimensional Data De-duplication Approaches -- Data-Mining Data/Results De-duplication Approaches -- Linked and XML Data De-duplication Approaches -- Streaming Data De-duplication Approaches -- Unsupervised Approaches to De-duplication -- De-duplication Based on Clustering -- De-duplication Based on (dis)Similarity-Search in Metric Spaces -- De-duplication Based on Locality-Sensitive Hashing -- Conclusions and Further Research -- Efficiency and Scalability -- Systematic Assessment of De-duplication Performance -- Ethical, Legal and Anonymity Aspects -- Uncertainty -- References -- A Survey on Integrating Data in Bioinformatics -- Introduction -- Mission -- Background -- Linked-Based Integration -- Data Warehousing Integration -- Mediator-Based Integration.
Federated Databases.