http://dx.doi.org/10.1090/psapm/034

AMS SHORT COURSE LECTURE NOTES
Introductory Survey Lectures
A Subseries of Proceedings of Symposia in Applied Mathematics

Volume 21 MATHEMATICAL ASPECTS OF PRODUCTION AND DISTRIBUTION OF ENERGY, Edited by P. D. Lax (San Antonio, Texas, January 1976)
Volume 22 NUMERICAL ANALYSIS, Edited by G. H. Golub and J. Oliger (Atlanta, Georgia, January 1978)
Volume 23 MODERN STATISTICS: METHODS AND APPLICATIONS, Edited by R. V. Hogg (San Antonio, Texas, January 1980)
Volume 24 GAME THEORY AND ITS APPLICATIONS, Edited by W. F. Lucas (Biloxi, Mississippi, January 1979)
Volume 25 OPERATIONS RESEARCH: MATHEMATICS AND MODELS, Edited by S. I. Gass (Duluth, Minnesota, August 1979)
Volume 26 THE MATHEMATICS OF NETWORKS, Edited by S. A. Burr (Pittsburgh, Pennsylvania, August 1981)
Volume 27 COMPUTED TOMOGRAPHY, Edited by L. A. Shepp (Cincinnati, Ohio, January 1982)
Volume 28 STATISTICAL DATA ANALYSIS, Edited by R. Gnanadesikan (Toronto, Ontario, August 1982)
Volume 29 APPLIED CRYPTOLOGY, CRYPTOGRAPHIC PROTOCOLS, AND COMPUTER SECURITY MODELS, By R. A. DeMillo, G. I. Davida, D. P. Dobkin, M. A. Harrison, and R. J. Lipton (San Francisco, California, January 1981)
Volume 30 POPULATION BIOLOGY, Edited by Simon A. Levin (Albany, New York, August 1983)
Volume 31 COMPUTER COMMUNICATIONS, Edited by B. Gopinath (Denver, Colorado, January 1983)
Volume 32 ENVIRONMENTAL AND NATURAL RESOURCE MATHEMATICS, Edited by R. W. McKelvey (Eugene, Oregon, August 1984)
Volume 33 FAIR ALLOCATION, Edited by H. Peyton Young (Anaheim, California, January 1985)
PROCEEDINGS OF SYMPOSIA IN APPLIED MATHEMATICS
Volume 34

Mathematics of Information Processing

Michael Anshel and William Gewirtz, Editors

American Mathematical Society
Providence, Rhode Island
LECTURE NOTES PREPARED FOR THE AMERICAN MATHEMATICAL SOCIETY SHORT COURSE
MATHEMATICS OF INFORMATION PROCESSING
HELD IN LOUISVILLE, KENTUCKY
JANUARY 23-24, 1984

The AMS Short Course Series is sponsored by the Society's Committee on Employment and Education Policy (CEEP). The series is under the direction of the Short Course Advisory Subcommittee of CEEP.

Library of Congress Cataloging in Publication Data
Main entry under title: Mathematics of information processing.
(Proceedings of symposia in applied mathematics, ISSN 0160-7634; v. 34)
Bibliography: p.
Contents: Diameters of communication networks / F. R. K. Chung -- The theory of data dependencies / Ronald Fagin and Moshe Y. Vardi -- Transaction management / Hector Garcia-Molina -- [etc.]
1. Electronic data processing--Mathematics--Congresses. I. Anshel, Michael, 1941-. II. Gewirtz, William, 1948-. III. Series.
QA76.9.M35M39 1986 005.7 85-26693
ISBN 0-8218-0086-8 (alk. paper)

COPYING AND REPRINTING. Individual readers of this publication, and nonprofit libraries acting for them, are permitted to make fair use of the material, such as to copy an article for use in teaching or research. Permission is granted to quote brief passages from this publication in reviews, provided the customary acknowledgment of the source is given.

Republication, systematic copying, or multiple reproduction of any material in this publication (including abstracts) is permitted only under license from the American Mathematical Society. Requests for such permission should be addressed to the Executive Director, American Mathematical Society, P.O. Box 6248, Providence, Rhode Island 02940.

The appearance of the code on the first page of an article in this book indicates the copyright owner's consent for copying beyond that permitted by Sections 107 or 108 of the U.S. Copyright Law, provided that the fee of $1.00 plus $.25 per page for each copy be paid directly to the Copyright Clearance Center, Inc., 21 Congress Street, Salem, Massachusetts 01970. This consent does not extend to other kinds of copying, such as copying for general distribution, for advertising or promotional purposes, for creating new collective works, or for resale.

1980 Mathematics Subject Classification (1985 Revision). Primary 68-06.

Copyright 1986 by the American Mathematical Society. All rights reserved.
Printed in the United States of America.
This volume was printed directly from copy prepared by the authors.
The paper used in this book is acid-free and falls within the guidelines established to ensure permanence and durability.
CONTENTS

List of contributors
Preface
Diameters of communication networks, F. R. CHUNG
The theory of data dependencies: a survey, RONALD FAGIN and MOSHE Y. VARDI
Transaction management, HECTOR GARCIA-MOLINA
Fundamental database issues, BARRY E. JACOBS
Data compression algorithms, VICTOR S. MILLER
Application of category theory of structural sets to modelling of information bases of systems, AUGUSTIN A. TUZHILIN
CONTRIBUTORS

F. R. CHUNG, Bell Laboratories, Murray Hill, New Jersey.
RONALD FAGIN, IBM Research Laboratory, San Jose, California.
HECTOR GARCIA-MOLINA, Department of Electrical Engineering and Computer Science, Princeton University, Princeton, New Jersey.
BARRY E. JACOBS, Department of Computer Science, University of Maryland, College Park, Maryland and National Aeronautics and Space Administration, Goddard Space Flight Center, Greenbelt, Maryland.
VICTOR S. MILLER, IBM Watson Research Center, Yorktown Heights, New York.
AUGUSTIN A. TUZHILIN, Department of Computer Science, College of Staten Island, Staten Island, New York.
MOSHE Y. VARDI, IBM Research Laboratory, San Jose, California.
PREFACE

This volume contains the lecture notes prepared by six speakers for the American Mathematical Society Short Course on the Mathematics of Information Processing given in Louisville, Kentucky, January 23-24, 1984. The Short Course Advisory Subcommittee of the AMS approved this concept and recommended publication of these lecture notes.

The Mathematics of Information Processing is not a single topic but rather a collection of methodologies whose end-goal is the creation of automated information systems. The viewpoint represented here is largely that of American researchers, with heavy emphasis on the mathematical problems of database systems and communication networks. This reflects the rapid introduction of the products of information technology in the workplace and in the home. By way of contrast, a systems-theoretic approach developed in the Soviet Union is included, which also provides a self-contained background should the reader care to probe more deeply into the subject.

The idea for the Short Course arose while one of us (Anshel) was a NASA-ASEE Summer Faculty Fellow at Goddard Space Flight Center, NASA. There we learned of a proposed futuristic information system, IESIS (Intelligent Earth Sensing Information System). The real-world implementation of the Short Course rested on the shoulders of the practitioner among us (Gewirtz), who also chaired a lively panel discussion with a surprise guest (Bob Targan).

We would like to thank the speakers for their efforts in the preparation of these lectures and their patience concerning the idiosyncrasies of the co-director. Finally, we also wish to thank Stefan Burr, who acted as wise counsel and enthusiastic supporter.

Michael Anshel
City College of the City University of New York

William Gewirtz
AT&T Communications
Basking Ridge, New Jersey