An algebra for structured office documents

Author:

Güting Ralf Hartmut1,Zicari Roberto2,Choy David M.3

Affiliation:

1. Univ. of Dortmund, Dortmund, W. Germany

2. Politecnico di Milano, Milan, Italy

3. IBM Almaden Research Center, San Jose, CA

Abstract

We describe a data model for structured office information objects, which we generically call “documents,” and a practically useful algebraic language for the retrieval and manipulation of such objects. Documents are viewed as hierarchical structures; their layout (presentation) aspect is to be treated separately. The syntax and semantics of the language are defined precisely in terms of the formal model, an extended relational algebra. The proposed approach has several new features, some of which are particularly useful for the management of office information. The data model is based on nested sequences of tuples rather than nested relations. Therefore, sorting and sequence operations and the explicit handling of duplicates can be described by the model. Furthermore, this is the first model based on a many-sorted instead of a one-sorted algebra, which means that atomic data values as well as nested structures are objects of the algebra. As a consequence, arithmetic operations, aggregate functions, and so forth can be treated inside the model and need not be introduced as query language extensions to the model. Many-sorted algebra also allows arbitrary algebra expressions (with Boolean result) to be admitted as selection or join conditions and the results of arbitrary expressions to be embedded into tuples. In contrast to other formal models, this algebra can be used directly as a rich query language for office documents with precisely defined semantics.

Publisher

Association for Computing Machinery (ACM)

Subject

Computer Science Applications,General Business, Management and Accounting,Information Systems

Cited by 38 articles. 订阅此论文施引文献 订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献

1. Algebricks;Proceedings of the Sixth ACM Symposium on Cloud Computing;2015-08-27

2. Introduction to the special issue on database and information retrieval integration;The VLDB Journal;2007-11-06

3. Concept and prototype of a collaborative business process environment for document processing;Data & Knowledge Engineering;2005-01

4. An extension of the relational data model to incorporate ordered domains;ACM Transactions on Database Systems;2001-09

5. Queries and computation on the web;Theoretical Computer Science;2000-05

同舟云学术

1.学者识别学者识别

2.学术分析学术分析

3.人才评估人才评估

"同舟云学术"是以全球学者为主线,采集、加工和组织学术论文而形成的新型学术文献查询和分析系统,可以对全球学者进行文献检索和人才价值评估。用户可以通过关注某些学科领域的顶尖人物而持续追踪该领域的学科进展和研究前沿。经过近期的数据扩容,当前同舟云学术共收录了国内外主流学术期刊6万余种,收集的期刊论文及会议论文总量共计约1.5亿篇,并以每天添加12000余篇中外论文的速度递增。我们也可以为用户提供个性化、定制化的学者数据。欢迎来电咨询!咨询电话:010-8811{复制后删除}0370

www.globalauthorid.com

TOP

Copyright © 2019-2024 北京同舟云网络信息技术有限公司
京公网安备11010802033243号  京ICP备18003416号-3