Preface

This book is the official documentation of PostgreSQL. It has been written by the PostgreSQL developers and other volunteers in parallel to the development of the PostgreSQL software. It describes all the functionality that the current version of PostgreSQL officially supports.

To make the large amount of information about PostgreSQL manageable, this book has been organized in several parts. Each part is targeted at a different class of users, or at users in different stages of their PostgreSQL experience:

· Part I is an informal introduction for new users.
· Part II documents the SQL query language environment, including data types and functions, as well as user-level performance tuning. Every PostgreSQL user should read this.
· Part III describes the installation and administration of the server. Everyone who runs a PostgreSQL server, be it for private use or for others, should read this part.
· Part IV describes the programming interfaces for PostgreSQL client programs.
· Part V contains information for advanced users about the extensibility capabilities of the server. Topics include user-defined data types and functions.
· Part VI contains reference information about SQL commands, client and server programs. This part supports the other parts with structured information sorted by command or program.
· Part VII contains assorted information that might be of use to PostgreSQL developers.

1. What is PostgreSQL?

PostgreSQL is an object-relational database management system (ORDBMS) based on POSTGRES, Version 4.2 [1], developed at the University of California at Berkeley Computer Science Department. POSTGRES pioneered many concepts that only became available in some commercial database systems much later.

PostgreSQL is an open-source descendant of this original Berkeley code.
It supports a large part of the SQL standard and offers many modern features:

· complex queries
· foreign keys
· triggers
· views
· transactional integrity
· multiversion concurrency control

Also, PostgreSQL can be extended by the user in many ways, for example by adding new

· data types
· functions
· operators
· aggregate functions
· index methods
· procedural languages

And because of the liberal license, PostgreSQL can be used, modified, and distributed by anyone free of charge for any purpose, be it private, commercial, or academic.

[1] http://s2k-ftp.CS.Berkeley.EDU:8000/postgres/postgres.html

2. A Brief History of PostgreSQL

The object-relational database management system now known as PostgreSQL is derived from the POSTGRES package written at the University of California at Berkeley. With over two decades of development behind it, PostgreSQL is now the most advanced open-source database available anywhere.

2.1. The Berkeley POSTGRES Project

The POSTGRES project, led by Professor Michael Stonebraker, was sponsored by the Defense Advanced Research Projects Agency (DARPA), the Army Research Office (ARO), the National Science Foundation (NSF), and ESL, Inc. The implementation of POSTGRES began in 1986. The initial concepts for the system were presented in The design of POSTGRES, and the definition of the initial data model appeared in The POSTGRES data model. The design of the rule system at that time was described in The design of the POSTGRES rules system. The rationale and architecture of the storage manager were detailed in The design of the POSTGRES storage system.

POSTGRES has undergone several major releases since then. The first “demoware” system became operational in 1987 and was shown at the 1988 ACM-SIGMOD Conference. Version 1, described in The implementation of POSTGRES, was released to a few external users in June 1989.
In response to a critique of the first rule system (A commentary on the POSTGRES rules system), the rule system was redesigned (On Rules, Procedures, Caching and Views in Database Systems), and Version 2 was released in June 1990 with the new rule system. Version 3 appeared in 1991 and added support for multiple storage managers, an improved query executor, and a rewritten rule system. For the most part, subsequent releases until Postgres95 (see below) focused on portability and reliability.

POSTGRES has been used to implement many different research and production applications. These include: a financial data analysis system, a jet engine performance monitoring package, an asteroid tracking database, a medical information database, and several geographic information systems. POSTGRES has also been used as an educational tool at several universities. Finally, Illustra Information Technologies (later merged into Informix [2], which is now owned by IBM [3]) picked up the code and commercialized it. In late 1992, POSTGRES became the primary data manager for the Sequoia 2000 scientific computing project [4].

The size of the external user community nearly doubled during 1993. It became increasingly obvious that maintenance of the prototype code and support was taking up large amounts of time that should have been devoted to database research. In an effort to reduce this support burden, the Berkeley POSTGRES project officially ended with Version 4.2.

[2] http://www.informix.com/
[3] http://www.ibm.com/
[4] http://meteora.ucsd.edu/s2k/s2k_home.html

2.2. Postgres95

In 1994, Andrew Yu and Jolly Chen added an SQL language interpreter to POSTGRES. Under a new name, Postgres95 was subsequently released to the web to find its own way in the world as an open-source descendant of the original POSTGRES Berkeley code.

Postgres95 code was completely ANSI C and trimmed in size by 25%. Many internal changes improved performance and maintainability.
Postgres95 release 1.0.x ran about 30-50% faster on the Wisconsin Benchmark compared to POSTGRES, Version 4.2. Apart from bug fixes, the following were the major enhancements:

· The query language PostQUEL was replaced with SQL (implemented in the server). Subqueries were not supported until PostgreSQL (see below), but they could be imitated in Postgres95 with user-defined SQL functions. Aggregate functions were re-implemented. Support for the GROUP BY query clause was also added.
· A new program (psql) was provided for interactive SQL queries, which used GNU Readline. This largely superseded the old monitor program.
· A new front-end library, libpgtcl, supported Tcl-based clients. A sample shell, pgtclsh, provided new Tcl commands to interface Tcl programs with the Postgres95 server.
· The large-object interface was overhauled. The inversion large objects were the only mechanism for storing large objects. (The inversion file system was removed.)
· The instance-level rule system was removed. Rules were still available as rewrite rules.
· A short tutorial introducing regular SQL features as well as those of Postgres95 was distributed with the source code.
· GNU make (instead of BSD make) was used for the build. Also, Postgres95 could be compiled with an unpatched GCC (data alignment of doubles was fixed).

2.3. PostgreSQL

By 1996, it became clear that the name “Postgres95” would not stand the test of time. We chose a new name, PostgreSQL, to reflect the relationship between the original POSTGRES and the more recent versions with SQL capability. At the same time, we set the version numbering to start at 6.0, putting the numbers back into the sequence originally begun by the Berkeley POSTGRES project.

Many people continue to refer to PostgreSQL as “Postgres” (now rarely in all capital letters) because of tradition or because it is easier to pronounce. This usage is widely accepted as a nickname or alias.

The emphasis during development of Postgres95 was on identifying and understanding existing problems in the server code. With PostgreSQL, the emphasis has shifted to augmenting features and capabilities, although work continues in all areas.

Details about what has happened in PostgreSQL since then can be found in Appendix E.

3. Conventions

This book uses the following typographical conventions to mark certain portions of text: new terms, foreign phrases, and other important passages are emphasized in italics. Everything that represents input or output of the computer, in particular commands, program code, and screen output, is shown in a monospaced font (example). Within such passages, italics (example) indicate placeholders; you must insert an actual value instead of the placeholder. On occasion, parts of program code are emphasized in bold face (example), if they have been added or changed since the preceding example.

The following conventions are used in the synopsis of a command: brackets ([ and ]) indicate optional parts. (In the synopsis of a Tcl command, question marks (?) are used instead, as is usual in Tcl.) Braces ({ and }) and vertical lines (|) indicate that you must choose one alternative. Dots (...) mean that the preceding element can be repeated.

Where it enhances the clarity, SQL commands are preceded by the prompt =>, and shell commands are preceded by the prompt $. Normally, prompts are not shown, though.

An administrator is generally a person who is in charge of installing and running the server. A user could be anyone who is using, or wants to use, any part of the PostgreSQL system. These terms should not be interpreted too narrowly; this book does not have fixed presumptions about system administration procedures.
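
As an informal illustration of the synopsis and prompt conventions described in this section, consider the sketch below. The command backup_tool, its option -v, and the placeholder target_dir are hypothetical names invented for this illustration only; they do not refer to any program shipped with PostgreSQL. The first line is a synopsis, the second shows one invocation that would match it, and the third shows an SQL command preceded by its prompt.

```text
backup_tool [ -v ] { full | incremental } target_dir ...

$ backup_tool -v full /srv/a /srv/b

=> SELECT version();
```

Read the synopsis as follows: -v is optional, exactly one of full or incremental must be chosen, and target_dir is a placeholder that must be replaced with an actual value and can be repeated.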