From eb0e04896a5d290bd9a054500c67cf2734c1ddc2 Mon Sep 17 00:00:00 2001
From: Daniel Gustafsson
Date: Mon, 8 Apr 2024 14:54:42 +0200
Subject: [PATCH v1 2/2] Convert internal documentation to markdown - Conversion

This patchset intends to achieve proper Markdown rendering with the
fewest possible changes to the source documents. This patch mainly
changes whitespace, in order to keep code blocks from being rendered
as flowing text and to separate lists from the surrounding paragraphs.
There are a few notable exceptions:

* A few bullet-list characters are changed
* In some files the section headlines lacked underlining
* A very few instances required backticks in order to keep text from
  being rendered in italics.
---
 contrib/start-scripts/macos/README.md        |   9 +-
 src/backend/access/gin/README.md             | 104 ++++----
 src/backend/access/gist/README.md            | 148 +++++------
 src/backend/access/hash/README.md            |  76 +++---
 src/backend/access/heap/README.tuplock.md    |  10 +-
 src/backend/access/spgist/README.md          |  80 +++---
 src/backend/access/transam/README.md         |  56 ++---
 src/backend/lib/README.md                    |  22 +-
 src/backend/libpq/README.SSL.md              |  86 +++----
 src/backend/optimizer/README.md              | 244 ++++++++++---------
 src/backend/optimizer/plan/README.md         | 106 ++++----
 src/backend/parser/README.md                 |  42 ++--
 src/backend/regex/README.md                  |  46 ++--
 src/backend/snowball/README.md               |  14 +-
 src/backend/storage/freespace/README.md      |  56 ++---
 src/backend/storage/lmgr/README-SSI.md       | 134 +++++-----
 src/backend/utils/fmgr/README.md             |  77 +++---
 src/backend/utils/mb/README.md               |  22 +-
 src/backend/utils/misc/README.md             |  62 +++--
 src/backend/utils/mmgr/README.md             |  10 +-
 src/backend/utils/resowner/README.md         |  52 ++--
 src/interfaces/ecpg/preproc/README.parser.md |  29 ++-
 src/port/README.md                           |   2 +-
 src/test/isolation/README.md                 |  30 ++-
 src/test/kerberos/README.md                  |   4 +
 src/test/locale/README.md                    |   3 +
 src/test/modules/dummy_seclabel/README.md    |  12 +-
 src/test/modules/test_parser/README.md       |  74 +++---
 src/test/modules/test_regex/README.md        |   2 +-
 src/test/modules/test_rls_hooks/README.md    |   8 +-
 src/test/modules/test_shm_mq/README.md       |  14 +-
 src/test/recovery/README.md                  |   8 +-
 src/test/ssl/README.md                       |  30 ++-
 src/timezone/README.md                       |   6 +-
 src/timezone/tznames/README.md               |   8 +-
 src/tools/ci/README.md                       |   4 +-
 src/tools/pg_bsd_indent/README.md            |   4 +-
 src/tools/pgindent/README.md                 |   9 +-
 38 files changed, 897 insertions(+), 806 deletions(-)

diff --git a/contrib/start-scripts/macos/README.md b/contrib/start-scripts/macos/README.md
index c4f2d9a270..8fe6efb657 100644
--- a/contrib/start-scripts/macos/README.md
+++ b/contrib/start-scripts/macos/README.md
@@ -15,10 +15,13 @@
 if you plan to run the Postgres server under some user name other than
 "postgres", adjust the UserName parameter value for that.
 
 4. Copy the modified org.postgresql.postgres.plist file into
-/Library/LaunchDaemons/. You must do this as root:
-    sudo cp org.postgresql.postgres.plist /Library/LaunchDaemons
-because the file will be ignored if it is not root-owned.
+   /Library/LaunchDaemons/. You must do this as root:
+
+       sudo cp org.postgresql.postgres.plist /Library/LaunchDaemons
+
+   because the file will be ignored if it is not root-owned.
 
 At this point a reboot should launch the server.
But if you want to test it without rebooting, you can do + sudo launchctl load /Library/LaunchDaemons/org.postgresql.postgres.plist diff --git a/src/backend/access/gin/README.md b/src/backend/access/gin/README.md index b080731621..c25d45ad31 100644 --- a/src/backend/access/gin/README.md +++ b/src/backend/access/gin/README.md @@ -40,7 +40,7 @@ Core PostgreSQL includes built-in Gin support for one-dimensional arrays Synopsis -------- -=# create index txt_idx on aa using gin(a); + =# create index txt_idx on aa using gin(a); Features -------- @@ -120,19 +120,21 @@ be, in which case a null bitmap is present as usual. (As usual for index tuples, the size of the null bitmap is fixed at INDEX_MAX_KEYS.) * If the key datum is null (ie, IndexTupleHasNulls() is true), then -just after the nominal index data (ie, at offset IndexInfoFindDataOffset -or IndexInfoFindDataOffset + sizeof(int2)) there is a byte indicating -the "category" of the null entry. These are the possible categories: + just after the nominal index data (ie, at offset IndexInfoFindDataOffset + or IndexInfoFindDataOffset + sizeof(int2)) there is a byte indicating + the "category" of the null entry. These are the possible categories: + 1 = ordinary null key value extracted from an indexable item 2 = placeholder for zero-key indexable item 3 = placeholder for null indexable item -Placeholder null entries are inserted into the index because otherwise -there would be no index entry at all for an empty or null indexable item, -which would mean that full index scans couldn't be done and various corner -cases would give wrong answers. The different categories of null entries -are treated as distinct keys by the btree, but heap itempointers for the -same category of null entry are merged into one index entry just as happens -with ordinary key entries. + + Placeholder null entries are inserted into the index because otherwise + there would be no index entry at all for an empty or null indexable item, + which would mean that full index scans couldn't be done and various corner + cases would give wrong answers. The different categories of null entries + are treated as distinct keys by the btree, but heap itempointers for the + same category of null entry are merged into one index entry just as happens + with ordinary key entries. * In a key entry at the btree leaf level, at the next SHORTALIGN boundary, there is a list of item pointers, in compressed format (see Posting List @@ -325,11 +327,11 @@ getting them on the next page. The picture below shows tree state after finding the leaf page. Lower case letters depicts tree pages. 'S' depicts shared lock on the page. - a - / | \ - b c d - / | \ | \ | \ - eS f g h i j k + a + / | \ + b c d + / | \ | \ | \ + eS f g h i j k ### Steping right @@ -346,11 +348,11 @@ concurrently and doesn't delete right sibling accordingly. The picture below shows two pages locked at once during stepping right. - a - / | \ - b c d - / | \ | \ | \ - eS fS g h i j k + a + / | \ + b c d + / | \ | \ | \ + eS fS g h i j k ### Insert @@ -365,11 +367,11 @@ The picture below shows leaf page locked in exclusive mode and ready for insertion. 'P' and 'E' depict pin and exclusive lock correspondingly. - aP - / | \ - b cP d - / | \ | \ | \ - e f g hE i j k + aP + / | \ + b cP d + / | \ | \ | \ + e f g hE i j k If insert causes a page split, the parent is locked in exclusive mode before @@ -379,11 +381,11 @@ parent and child pages at once starting from child. The picture below shows tree state after leaf page split. 
'q' is new page produced by split. Parent 'c' is about to have downlink inserted. - aP - / | \ - b cE d - / | \ / | \ | \ - e f g hE q i j k + aP + / | \ + b cE d + / | \ / | \ | \ + e f g hE q i j k ### Page deletion @@ -404,11 +406,11 @@ we locked it. The picture below shows tree state after page deletion algorithm traversed to leftmost leaf of the tree. - aE - / | \ - bE c d - / | \ | \ | \ - eE f g h i j k + aE + / | \ + bE c d + / | \ | \ | \ + eE f g h i j k Deletion algorithm keeps exclusive locks on left siblings of pages comprising currently investigated path. Thus, if current page is to be removed, all @@ -436,21 +438,21 @@ The picture below shows tree state after page deletion algorithm further traversed the tree. Currently investigated path is 'a-c-h'. Left siblings 'b' and 'g' of 'c' and 'h' correspondingly are also exclusively locked. - aE - / | \ - bE cE d - / | \ | \ | \ - e f gE hE i j k + aE + / | \ + bE cE d + / | \ | \ | \ + e f gE hE i j k The next picture shows tree state after page 'h' was deleted. It's marked with 'deleted' flag and newest xid, which might visit it. Downlink from 'c' to 'h' is also deleted. - aE - / | \ - bE cE d - / | \ \ | \ - e f gE hD iE j k + aE + / | \ + bE cE d + / | \ \ | \ + e f gE hD iE j k However, it's still possible that concurrent reader has seen downlink from 'c' to 'h' before we deleted it. In that case this reader will step right from 'h' @@ -463,11 +465,11 @@ The next picture shows tree state after 'i' and 'c' was deleted. Internal page investigation is 'a-d-j'. Pages 'b' and 'g' are locked as self siblings of 'd' and 'j'. - aE - / \ - bE cD dE - / | \ | \ - e f gE hD iD jE k + aE + / \ + bE cD dE + / | \ | \ + e f gE hD iD jE k During the replay of page deletion at standby, the page's left sibling, the target page, and its parent, are locked in that order. This order guarantees diff --git a/src/backend/access/gist/README.md b/src/backend/access/gist/README.md index 8015ff19f0..af082fc2bb 100644 --- a/src/backend/access/gist/README.md +++ b/src/backend/access/gist/README.md @@ -183,70 +183,70 @@ operation. findPath is a subroutine of findParent, used when the correct parent page can't be found by following the rightlinks at the parent level: -findPath( stack item ) - push stack, [root, 0, 0] // page, LSN, parent - while( stack ) - ptr = top of stack - latch( ptr->page, S-mode ) - if ( ptr->parent->page->lsn < ptr->page->nsn ) - push stack, [ ptr->page->rightlink, 0, ptr->parent ] - end - for( each tuple on page ) - if ( tuple->pagepointer == item->page ) - return stack - else - add to stack at the end [tuple->pagepointer,0, ptr] + findPath( stack item ) + push stack, [root, 0, 0] // page, LSN, parent + while( stack ) + ptr = top of stack + latch( ptr->page, S-mode ) + if ( ptr->parent->page->lsn < ptr->page->nsn ) + push stack, [ ptr->page->rightlink, 0, ptr->parent ] + end + for( each tuple on page ) + if ( tuple->pagepointer == item->page ) + return stack + else + add to stack at the end [tuple->pagepointer,0, ptr] + end end + unlatch( ptr->page ) + pop stack end - unlatch( ptr->page ) - pop stack - end gistFindCorrectParent is used to re-find the parent of a page during insertion. It might have migrated to the right since we traversed down the tree because of page splits. 
-findParent( stack item ) - parent = item->parent - if ( parent->page->lsn != parent->lsn ) - while(true) - search parent tuple on parent->page, if found the return - rightlink = parent->page->rightlink - unlatch( parent->page ) - if ( rightlink is incorrect ) - break loop + findParent( stack item ) + parent = item->parent + if ( parent->page->lsn != parent->lsn ) + while(true) + search parent tuple on parent->page, if found the return + rightlink = parent->page->rightlink + unlatch( parent->page ) + if ( rightlink is incorrect ) + break loop + end + parent->page = rightlink + latch( parent->page, X-mode ) end - parent->page = rightlink + newstack = findPath( item->parent ) + replace part of stack to new one latch( parent->page, X-mode ) + return findParent( item ) end - newstack = findPath( item->parent ) - replace part of stack to new one - latch( parent->page, X-mode ) - return findParent( item ) - end pageSplit function decides how to distribute keys to the new pages after page split: -pageSplit(page, allkeys) - (lkeys, rkeys) = pickSplit( allkeys ) - if ( page is root ) - lpage = new page - else - lpage = page - rpage = new page - if ( no space left on rpage ) - newkeys = pageSplit( rpage, rkeys ) - else - push newkeys, union(rkeys) - end - if ( no space left on lpage ) - push newkeys, pageSplit( lpage, lkeys ) - else - push newkeys, union(lkeys) - end - return newkeys + pageSplit(page, allkeys) + (lkeys, rkeys) = pickSplit( allkeys ) + if ( page is root ) + lpage = new page + else + lpage = page + rpage = new page + if ( no space left on rpage ) + newkeys = pageSplit( rpage, rkeys ) + else + push newkeys, union(rkeys) + end + if ( no space left on lpage ) + push newkeys, pageSplit( lpage, lkeys ) + else + push newkeys, union(lkeys) + end + return newkeys @@ -302,18 +302,18 @@ In the algorithm, levels are numbered so that leaf pages have level zero, and internal node levels count up from 1. This numbering ensures that a page's level number never changes, even when the root page is split. -Level Tree + Level Tree -3 * - / \ -2 * * - / | \ / | \ -1 * * * * * * - / \ / \ / \ / \ / \ / \ -0 o o o o o o o o o o o o + 3 * + / \ + 2 * * + / | \ / | \ + 1 * * * * * * + / \ / \ / \ / \ / \ / \ + 0 o o o o o o o o o o o o -* - internal page -o - leaf page + * - internal page + o - leaf page Internal pages that belong to certain levels have buffers associated with them. Leaf pages never have buffers. Which levels have buffers is controlled @@ -322,17 +322,17 @@ have buffers, while others do not. For example, if level_step = 2, then pages on levels 2, 4, 6, ... have buffers. If level_step = 1 then every internal page has a buffer. -Level Tree (level_step = 1) Tree (level_step = 2) + Level Tree (level_step = 1) Tree (level_step = 2) -3 * * - / \ / \ -2 *(b) *(b) *(b) *(b) - / | \ / | \ / | \ / | \ -1 *(b) *(b) *(b) *(b) *(b) *(b) * * * * * * - / \ / \ / \ / \ / \ / \ / \ / \ / \ / \ / \ / \ -0 o o o o o o o o o o o o o o o o o o o o o o o o + 3 * * + / \ / \ + 2 *(b) *(b) *(b) *(b) + / | \ / | \ / | \ / | \ + 1 *(b) *(b) *(b) *(b) *(b) *(b) * * * * * * + / \ / \ / \ / \ / \ / \ / \ / \ / \ / \ / \ / \ + 0 o o o o o o o o o o o o o o o o o o o o o o o o -(b) - buffer + (b) - buffer Logically, a buffer is just bunch of tuples. Physically, it is divided in pages, backed by a temporary file. Each buffer can be in one of two states: @@ -365,7 +365,7 @@ pages where the tuples finally land on get cached too. If there are, the last buffer page of each buffer below is kept in memory. 
This is illustrated in the figures below: - Buffer being emptied to + Buffer being emptied to lower-level buffers Buffer being emptied to leaf pages +(fb) +(fb) @@ -374,11 +374,11 @@ the figures below: / \ / \ / \ / \ *(ab) *(ab) *(ab) *(ab) x x x x -+ - cached internal page -x - cached leaf page -* - non-cached internal page -(fb) - buffer being emptied -(ab) - buffers being appended to, with last page in memory + + - cached internal page + x - cached leaf page + * - non-cached internal page + (fb) - buffer being emptied + (ab) - buffers being appended to, with last page in memory In the beginning of the index build, the level-step is chosen so that all those pages involved in emptying one buffer fit in cache, so after each of those diff --git a/src/backend/access/hash/README.md b/src/backend/access/hash/README.md index 13dc59c124..a5df13a68a 100644 --- a/src/backend/access/hash/README.md +++ b/src/backend/access/hash/README.md @@ -248,24 +248,24 @@ track of available overflow pages. The reader algorithm is: - lock the primary bucket page of the target bucket - if the target bucket is still being populated by a split: - release the buffer content lock on current bucket page - pin and acquire the buffer content lock on old bucket in shared mode - release the buffer content lock on old bucket, but not pin - retake the buffer content lock on new bucket - arrange to scan the old bucket normally and the new bucket for - tuples which are not moved-by-split --- then, per read request: - reacquire content lock on current page - step to next page if necessary (no chaining of content locks, but keep - the pin on the primary bucket throughout the scan) - save all the matching tuples from current index page into an items array - release pin and content lock (but if it is primary bucket page retain - its pin till the end of the scan) - get tuple from an item array --- at scan shutdown: - release all pins still held + lock the primary bucket page of the target bucket + if the target bucket is still being populated by a split: + release the buffer content lock on current bucket page + pin and acquire the buffer content lock on old bucket in shared mode + release the buffer content lock on old bucket, but not pin + retake the buffer content lock on new bucket + arrange to scan the old bucket normally and the new bucket for + tuples which are not moved-by-split + -- then, per read request: + reacquire content lock on current page + step to next page if necessary (no chaining of content locks, but keep + the pin on the primary bucket throughout the scan) + save all the matching tuples from current index page into an items array + release pin and content lock (but if it is primary bucket page retain + its pin till the end of the scan) + get tuple from an item array + -- at scan shutdown: + release all pins still held Holding the buffer pin on the primary bucket page for the whole scan prevents the reader's current-tuple pointer from being invalidated by splits or @@ -288,8 +288,8 @@ which this bucket is formed by split. 
The insertion algorithm is rather similar: lock the primary bucket page of the target bucket --- (so far same as reader, except for acquisition of buffer content lock in - exclusive mode on primary bucket page) + -- (so far same as reader, except for acquisition of buffer content lock in + exclusive mode on primary bucket page) if the bucket-being-split flag is set for a bucket and pin count on it is one, then finish the split release the buffer content lock on current bucket @@ -465,24 +465,24 @@ overflow page to the free pool. Obtaining an overflow page: - take metapage content lock in exclusive mode - determine next bitmap page number; if none, exit loop - release meta page content lock - pin bitmap page and take content lock in exclusive mode - search for a free page (zero bit in bitmap) - if found: - set bit in bitmap - mark bitmap page dirty - take metapage buffer content lock in exclusive mode - if first-free-bit value did not change, - update it and mark meta page dirty - else (not found): - release bitmap page buffer content lock - loop back to try next bitmap page, if any --- here when we have checked all bitmap pages; we hold meta excl. lock - extend index to add another overflow page; update meta information - mark meta page dirty - return page number + take metapage content lock in exclusive mode + determine next bitmap page number; if none, exit loop + release meta page content lock + pin bitmap page and take content lock in exclusive mode + search for a free page (zero bit in bitmap) + if found: + set bit in bitmap + mark bitmap page dirty + take metapage buffer content lock in exclusive mode + if first-free-bit value did not change, + update it and mark meta page dirty + else (not found): + release bitmap page buffer content lock + loop back to try next bitmap page, if any + -- here when we have checked all bitmap pages; we hold meta excl. lock + extend index to add another overflow page; update meta information + mark meta page dirty + return page number It is slightly annoying to release and reacquire the metapage lock multiple times, but it seems best to do it that way to minimize loss of diff --git a/src/backend/access/heap/README.tuplock.md b/src/backend/access/heap/README.tuplock.md index 6441e8baf0..5a91b4fddd 100644 --- a/src/backend/access/heap/README.tuplock.md +++ b/src/backend/access/heap/README.tuplock.md @@ -62,11 +62,11 @@ the tuple without changing its key. The conflict table is: - UPDATE NO KEY UPDATE SHARE KEY SHARE -UPDATE conflict conflict conflict conflict -NO KEY UPDATE conflict conflict conflict -SHARE conflict conflict -KEY SHARE conflict + UPDATE NO KEY UPDATE SHARE KEY SHARE + UPDATE conflict conflict conflict conflict + NO KEY UPDATE conflict conflict conflict + SHARE conflict conflict + KEY SHARE conflict When there is a single locker in a tuple, we can just store the locking info in the tuple itself. We do this by storing the locker's Xid in XMAX, and diff --git a/src/backend/access/spgist/README.md b/src/backend/access/spgist/README.md index 7117e02c77..2b890e4f8e 100644 --- a/src/backend/access/spgist/README.md +++ b/src/backend/access/spgist/README.md @@ -13,7 +13,7 @@ few disk pages, even if it traverses many nodes. COMMON STRUCTURE DESCRIPTION - +---------------------------- Logically, an SP-GiST tree is a set of tuples, each of which can be either an inner or leaf tuple. Each inner tuple contains "nodes", which are (label,pointer) pairs, where the pointer (ItemPointerData) is a pointer to @@ -58,27 +58,27 @@ pages. 
An inner tuple consists of: - optional prefix value - all successors must be consistent with it. - Example: - radix tree - prefix value is a common prefix string - quad tree - centroid - k-d tree - one coordinate + optional prefix value - all successors must be consistent with it. + Example: + radix tree - prefix value is a common prefix string + quad tree - centroid + k-d tree - one coordinate - list of nodes, where node is a (label, pointer) pair. - Example of a label: a single character for radix tree + list of nodes, where node is a (label, pointer) pair. + Example of a label: a single character for radix tree A leaf tuple consists of: - a leaf value - Example: - radix tree - the rest of string (postfix) - quad and k-d tree - the point itself + a leaf value + Example: + radix tree - the rest of string (postfix) + quad and k-d tree - the point itself - ItemPointer to the corresponding heap tuple - nextOffset number of next leaf tuple in a chain on a leaf page + ItemPointer to the corresponding heap tuple + nextOffset number of next leaf tuple in a chain on a leaf page - optional nulls bitmask - optional INCLUDE-column values + optional nulls bitmask + optional INCLUDE-column values For compatibility with pre-v14 indexes, a leaf tuple has a nulls bitmask only if there are null values (among the leaf value and the INCLUDE values) @@ -90,7 +90,7 @@ code can be used. NULLS HANDLING - +-------------- We assume that SPGiST-indexable operators are strict (can never succeed for null inputs). It is still desirable to index nulls, so that whole-table indexscans are possible and so that "x IS NULL" can be implemented by an @@ -104,27 +104,27 @@ AllTheSame cases in the normal tree. INSERTION ALGORITHM - +------------------- Insertion algorithm is designed to keep the tree in a consistent state at any moment. Here is a simplified insertion algorithm specification (numbers refer to notes below): - Start with the first tuple on the root page (1) - - loop: - if (page is leaf) then - if (enough space) - insert on page and exit (5) - else (7) - call PickSplitFn() (2) - end if - else - switch (chooseFn()) - case MatchNode - descend through selected node - case AddNode - add node and then retry chooseFn (3, 6) - case SplitTuple - split inner tuple to prefix and postfix, then - retry chooseFn with the prefix tuple (4, 6) - end if + Start with the first tuple on the root page (1) + + loop: + if (page is leaf) then + if (enough space) + insert on page and exit (5) + else (7) + call PickSplitFn() (2) + end if + else + switch (chooseFn()) + case MatchNode - descend through selected node + case AddNode - add node and then retry chooseFn (3, 6) + case SplitTuple - split inner tuple to prefix and postfix, then + retry chooseFn with the prefix tuple (4, 6) + end if Notes: @@ -160,7 +160,7 @@ the following notation, where tuple's id is just for discussion (no such id is actually stored): inner tuple: {tuple id}(prefix string)[ comma separated list of node labels ] -leaf tuple: {tuple id} +leaf tuple: {tuple id}< value > Suppose we need to insert string 'www.gogo.com' into inner tuple @@ -215,7 +215,7 @@ space utilization, but doesn't change the basis of the algorithm. CONCURRENCY - +----------- While descending the tree, the insertion algorithm holds exclusive lock on two tree levels at a time, ie both parent and child pages (but parent and child pages can be the same, see notes above). There is a possibility of @@ -267,7 +267,7 @@ been flushed out of the system. 
DEAD TUPLES - +----------- Tuples on leaf pages can be in one of four states: SPGIST_LIVE: normal, live pointer to a heap tuple. @@ -319,7 +319,7 @@ remove unused inner tuples. VACUUM - +------ VACUUM (or more precisely, spgbulkdelete) performs a single sequential scan over the entire index. On both leaf and inner pages, we can convert old REDIRECT tuples into PLACEHOLDER status, and then remove any PLACEHOLDERs @@ -374,7 +374,7 @@ space map, and gather statistics. LAST USED PAGE MANAGEMENT - +------------------------- The list of last used pages contains four pages - a leaf page and three inner pages, one from each "triple parity" group. (Actually, there's one such list for the main tree and a separate one for the nulls tree.) This @@ -384,6 +384,6 @@ critical, because we could allocate a new page at any moment. AUTHORS - +------- Teodor Sigaev Oleg Bartunov diff --git a/src/backend/access/transam/README.md b/src/backend/access/transam/README.md index 28d196cf62..9a2181aa30 100644 --- a/src/backend/access/transam/README.md +++ b/src/backend/access/transam/README.md @@ -59,27 +59,27 @@ For example, consider the following sequence of user commands: In the main processing loop, this results in the following function call sequence: - / StartTransactionCommand; - / StartTransaction; -1) < ProcessUtility; << BEGIN - \ BeginTransactionBlock; - \ CommitTransactionCommand; - - / StartTransactionCommand; -2) / PortalRunSelect; << SELECT ... - \ CommitTransactionCommand; - \ CommandCounterIncrement; - - / StartTransactionCommand; -3) / ProcessQuery; << INSERT ... - \ CommitTransactionCommand; - \ CommandCounterIncrement; - - / StartTransactionCommand; - / ProcessUtility; << COMMIT -4) < EndTransactionBlock; - \ CommitTransactionCommand; - \ CommitTransaction; + / StartTransactionCommand; + / StartTransaction; + 1) < ProcessUtility; << BEGIN + \ BeginTransactionBlock; + \ CommitTransactionCommand; + + / StartTransactionCommand; + 2) / PortalRunSelect; << SELECT ... + \ CommitTransactionCommand; + \ CommandCounterIncrement; + + / StartTransactionCommand; + 3) / ProcessQuery; << INSERT ... + \ CommitTransactionCommand; + \ CommandCounterIncrement; + + / StartTransactionCommand; + / ProcessUtility; << COMMIT + 4) < EndTransactionBlock; + \ CommitTransactionCommand; + \ CommitTransaction; The point of this example is to demonstrate the need for StartTransactionCommand and CommitTransactionCommand to be state smart -- they @@ -100,12 +100,12 @@ Transaction aborts can occur in two ways: The reason we have to distinguish them is illustrated by the following two situations: - case 1 case 2 - ------ ------ -1) user types BEGIN 1) user types BEGIN -2) user does something 2) user does something -3) user does not like what 3) system aborts for some reason - she sees and types ABORT (syntax error, etc) + case 1 case 2 + ------ ------ + 1) user types BEGIN 1) user types BEGIN + 2) user does something 2) user does something + 3) user does not like what 3) system aborts for some reason + she sees and types ABORT (syntax error, etc) In case 1, we want to abort the transaction and return to the default state. In case 2, there may be more commands coming our way which are part of the @@ -171,7 +171,7 @@ CommitTransactionCommand, the real work is done. The main point of doing things this way is that if we get an error while popping state stack entries, the remaining stack entries still show what we need to do to finish up. 
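As a toy illustration of that error-recovery property (my own sketch, not
the actual transam code; the entry names and helpers are invented), a
cleanup stack can pop an entry only after its cleanup action has completed,
so a failure partway through leaves the remaining work still recorded:

    #include <assert.h>
    #include <stdbool.h>
    #include <stdio.h>

    #define MAX_DEPTH 8

    static const char *todo[MAX_DEPTH];
    static int depth = 0;

    static void push_cleanup(const char *what)
    {
        assert(depth < MAX_DEPTH);
        todo[depth++] = what;
    }

    static bool run_cleanup(const char *what)
    {
        printf("cleaning up: %s\n", what);
        return true;            /* pretend this could fail and return false */
    }

    static void pop_all(void)
    {
        while (depth > 0)
        {
            /* peek and act first; pop only on success, so an error
             * leaves the remaining entries describing what is left */
            if (!run_cleanup(todo[depth - 1]))
                return;
            depth--;
        }
    }

    int main(void)
    {
        push_cleanup("cleanup subxact 1");
        push_cleanup("cleanup subxact 2");
        push_cleanup("cleanup subxact 3");
        pop_all();              /* runs 3, then 2, then 1 */
        return 0;
    }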
-In the case of ROLLBACK TO , we abort all the subtransactions up +In the case of ROLLBACK TO < savepoint >, we abort all the subtransactions up through the one identified by the savepoint name, and then re-create that subtransaction level with the same name. So it's a completely new subtransaction as far as the internals are concerned. diff --git a/src/backend/lib/README.md b/src/backend/lib/README.md index f2fb591237..fc8e1aa1f7 100644 --- a/src/backend/lib/README.md +++ b/src/backend/lib/README.md @@ -1,27 +1,27 @@ This directory contains a general purpose data structures, for use anywhere in the backend: -binaryheap.c - a binary heap + binaryheap.c - a binary heap -bipartite_match.c - Hopcroft-Karp maximum cardinality algorithm for bipartite graphs + bipartite_match.c - Hopcroft-Karp maximum cardinality algorithm for bipartite graphs -bloomfilter.c - probabilistic, space-efficient set membership testing + bloomfilter.c - probabilistic, space-efficient set membership testing -dshash.c - concurrent hash tables backed by dynamic shared memory areas + dshash.c - concurrent hash tables backed by dynamic shared memory areas -hyperloglog.c - a streaming cardinality estimator + hyperloglog.c - a streaming cardinality estimator -ilist.c - single and double-linked lists + ilist.c - single and double-linked lists -integerset.c - a data structure for holding large set of integers + integerset.c - a data structure for holding large set of integers -knapsack.c - knapsack problem solver + knapsack.c - knapsack problem solver -pairingheap.c - a pairing heap + pairingheap.c - a pairing heap -rbtree.c - a red-black tree + rbtree.c - a red-black tree -stringinfo.c - an extensible string type + stringinfo.c - an extensible string type Aside from the inherent characteristics of the data structures, there are a diff --git a/src/backend/libpq/README.SSL.md b/src/backend/libpq/README.SSL.md index d84a434a6e..98e20498bd 100644 --- a/src/backend/libpq/README.SSL.md +++ b/src/backend/libpq/README.SSL.md @@ -3,59 +3,59 @@ src/backend/libpq/README.SSL SSL === ->From the servers perspective: +From the servers perspective: - Receives StartupPacket - | - | - (Is SSL_NEGOTIATE_CODE?) ----------- Normal startup - | No - | - | Yes + Receives StartupPacket + | + | + (Is SSL_NEGOTIATE_CODE?) ----------- Normal startup + | No + | + | Yes + | + | + (Server compiled with USE_SSL?) ------- Send 'N' + | No | + | | + | Yes Normal startup + | + | + Send 'S' + | + | + Establish SSL + | + | + Normal startup + + + + + +From the clients perspective (v6.6 client _with_ SSL): + + + Connect | | - (Server compiled with USE_SSL?) 
------- Send 'N' - | No | - | | - | Yes Normal startup + Send packet with SSL_NEGOTIATE_CODE | | - Send 'S' + Receive single char ------- 'S' -------- Establish SSL + | | + | '' | + | Normal startup | | - Establish SSL + Is it 'E' for error ------------------- Retry connection + | Yes without SSL + | No | + Is it 'N' for normal ------------------- Normal startup + | Yes | - Normal startup - - - - - ->From the clients perspective (v6.6 client _with_ SSL): - - - Connect - | - | - Send packet with SSL_NEGOTIATE_CODE - | - | - Receive single char ------- 'S' -------- Establish SSL - | | - | '' | - | Normal startup - | - | - Is it 'E' for error ------------------- Retry connection - | Yes without SSL - | No - | - Is it 'N' for normal ------------------- Normal startup - | Yes - | - Fail with unknown + Fail with unknown --------------------------------------------------------------------------- diff --git a/src/backend/optimizer/README.md b/src/backend/optimizer/README.md index 2ab4f3dbf3..5805b61a23 100644 --- a/src/backend/optimizer/README.md +++ b/src/backend/optimizer/README.md @@ -254,7 +254,9 @@ the boundary, unless the proposed join is a LEFT join that can associate into the SpecialJoinInfo's RHS using identity 3. The use of minimum Relid sets has some pitfalls; consider a query like + A leftjoin (B leftjoin (C innerjoin D) on (Pbcd)) on Pa + where Pa doesn't mention B/C/D at all. In this case a naive computation would give the upper leftjoin's min LHS as {A} and min RHS as {C,D} (since we know that the innerjoin can't associate out of the leftjoin's RHS, and @@ -262,7 +264,9 @@ enforce that by including its relids in the leftjoin's min RHS). And the lower leftjoin has min LHS of {B} and min RHS of {C,D}. Given such information, join_is_legal would think it's okay to associate the upper join into the lower join's RHS, transforming the query to + B leftjoin (A leftjoin (C innerjoin D) on Pa) on (Pbcd) + which yields totally wrong answers. We prevent that by forcing the min RHS for the upper join to include B. This is perhaps overly restrictive, but such cases don't arise often so it's not clear that it's worth developing a @@ -359,10 +363,12 @@ side of the full join a Var came from; but that information can be found elsewhere at need.) Notionally, a Var having nonempty varnullingrels can be thought of as + CASE WHEN any-of-these-outer-joins-produced-a-null-extended-row THEN NULL ELSE the-scan-level-value-of-the-column END + It's only notional, because no such calculation is ever done explicitly. In a finished plan, Vars occurring in scan-level plan nodes represent the actual table column values, but upper-level Vars are always @@ -375,14 +381,20 @@ otherwise be essential information for FULL JOIN cases. Outer join identity 3 (discussed above) complicates this picture a bit. In the form + A leftjoin (B leftjoin C on (Pbc)) on (Pab) + all of the Vars in clauses Pbc and Pab will have empty varnullingrels, but if we start with + (A leftjoin B on (Pab)) leftjoin C on (Pbc) + then the parser will have marked Pbc's B Vars with the A/B join's RT index, making this form artificially different from the first. 
For discussion's sake, let's denote this marking with a star: + (A leftjoin B on (Pab)) leftjoin C on (Pb*c) + To cope with this, once we have detected that commuting these joins is legal, we generate both the Pbc and Pb*c forms of that ON clause, by either removing or adding the first join's RT index in the B Vars @@ -553,114 +565,114 @@ Optimizer Functions The primary entry point is planner(). -planner() -set up for recursive handling of subqueries --subquery_planner() - pull up sublinks and subqueries from rangetable, if possible - canonicalize qual - Attempt to simplify WHERE clause to the most useful form; this includes - flattening nested AND/ORs and detecting clauses that are duplicated in - different branches of an OR. - simplify constant expressions - process sublinks - convert Vars of outer query levels into Params ---grouping_planner() - preprocess target list for non-SELECT queries - handle UNION/INTERSECT/EXCEPT, GROUP BY, HAVING, aggregates, - ORDER BY, DISTINCT, LIMIT ----query_planner() - make list of base relations used in query - split up the qual into restrictions (a=1) and joins (b=c) - find qual clauses that enable merge and hash joins -----make_one_rel() - set_base_rel_pathlists() - find seqscan and all index paths for each base relation - find selectivity of columns used in joins - make_rel_from_joinlist() - hand off join subproblems to a plugin, GEQO, or standard_join_search() -------standard_join_search() - call join_search_one_level() for each level of join tree needed - join_search_one_level(): - For each joinrel of the prior level, do make_rels_by_clause_joins() - if it has join clauses, or make_rels_by_clauseless_joins() if not. - Also generate "bushy plan" joins between joinrels of lower levels. - Back at standard_join_search(), generate gather paths if needed for - each newly constructed joinrel, then apply set_cheapest() to extract - the cheapest path for it. - Loop back if this wasn't the top join level. - Back at grouping_planner: - do grouping (GROUP BY) and aggregation - do window functions - make unique (DISTINCT) - do sorting (ORDER BY) - do limit (LIMIT/OFFSET) -Back at planner(): -convert finished Path tree into a Plan tree -do final cleanup after planning + planner() + set up for recursive handling of subqueries + -subquery_planner() + pull up sublinks and subqueries from rangetable, if possible + canonicalize qual + Attempt to simplify WHERE clause to the most useful form; this includes + flattening nested AND/ORs and detecting clauses that are duplicated in + different branches of an OR. + simplify constant expressions + process sublinks + convert Vars of outer query levels into Params + --grouping_planner() + preprocess target list for non-SELECT queries + handle UNION/INTERSECT/EXCEPT, GROUP BY, HAVING, aggregates, + ORDER BY, DISTINCT, LIMIT + ---query_planner() + make list of base relations used in query + split up the qual into restrictions (a=1) and joins (b=c) + find qual clauses that enable merge and hash joins + ----make_one_rel() + set_base_rel_pathlists() + find seqscan and all index paths for each base relation + find selectivity of columns used in joins + make_rel_from_joinlist() + hand off join subproblems to a plugin, GEQO, or standard_join_search() + ------standard_join_search() + call join_search_one_level() for each level of join tree needed + join_search_one_level(): + For each joinrel of the prior level, do make_rels_by_clause_joins() + if it has join clauses, or make_rels_by_clauseless_joins() if not. 
+ Also generate "bushy plan" joins between joinrels of lower levels. + Back at standard_join_search(), generate gather paths if needed for + each newly constructed joinrel, then apply set_cheapest() to extract + the cheapest path for it. + Loop back if this wasn't the top join level. + Back at grouping_planner: + do grouping (GROUP BY) and aggregation + do window functions + make unique (DISTINCT) + do sorting (ORDER BY) + do limit (LIMIT/OFFSET) + Back at planner(): + convert finished Path tree into a Plan tree + do final cleanup after planning Optimizer Data Structures ------------------------- -PlannerGlobal - global information for a single planner invocation - -PlannerInfo - information for planning a particular Query (we make - a separate PlannerInfo node for each sub-Query) - -RelOptInfo - a relation or joined relations - - RestrictInfo - WHERE clauses, like "x = 3" or "y = z" - (note the same structure is used for restriction and - join clauses) - - Path - every way to generate a RelOptInfo(sequential,index,joins) - A plain Path node can represent several simple plans, per its pathtype: - T_SeqScan - sequential scan - T_SampleScan - tablesample scan - T_FunctionScan - function-in-FROM scan - T_TableFuncScan - table function scan - T_ValuesScan - VALUES scan - T_CteScan - CTE (WITH) scan - T_NamedTuplestoreScan - ENR scan - T_WorkTableScan - scan worktable of a recursive CTE - T_Result - childless Result plan node (used for FROM-less SELECT) - IndexPath - index scan - BitmapHeapPath - top of a bitmapped index scan - TidPath - scan by CTID - TidRangePath - scan a contiguous range of CTIDs - SubqueryScanPath - scan a subquery-in-FROM - ForeignPath - scan a foreign table, foreign join or foreign upper-relation - CustomPath - for custom scan providers - AppendPath - append multiple subpaths together - MergeAppendPath - merge multiple subpaths, preserving their common sort order - GroupResultPath - childless Result plan node (used for degenerate grouping) - MaterialPath - a Material plan node - MemoizePath - a Memoize plan node for caching tuples from sub-paths - UniquePath - remove duplicate rows (either by hashing or sorting) - GatherPath - collect the results of parallel workers - GatherMergePath - collect parallel results, preserving their common sort order - ProjectionPath - a Result plan node with child (used for projection) - ProjectSetPath - a ProjectSet plan node applied to some sub-path - SortPath - a Sort plan node applied to some sub-path - IncrementalSortPath - an IncrementalSort plan node applied to some sub-path - GroupPath - a Group plan node applied to some sub-path - UpperUniquePath - a Unique plan node applied to some sub-path - AggPath - an Agg plan node applied to some sub-path - GroupingSetsPath - an Agg plan node used to implement GROUPING SETS - MinMaxAggPath - a Result plan node with subplans performing MIN/MAX - WindowAggPath - a WindowAgg plan node applied to some sub-path - SetOpPath - a SetOp plan node applied to some sub-path - RecursiveUnionPath - a RecursiveUnion plan node applied to two sub-paths - LockRowsPath - a LockRows plan node applied to some sub-path - ModifyTablePath - a ModifyTable plan node applied to some sub-path(s) - LimitPath - a Limit plan node applied to some sub-path - NestPath - nested-loop joins - MergePath - merge joins - HashPath - hash joins - - EquivalenceClass - a data structure representing a set of values known equal - - PathKey - a data structure representing the sort ordering of a path + PlannerGlobal - global information for 
a single planner invocation + + PlannerInfo - information for planning a particular Query (we make + a separate PlannerInfo node for each sub-Query) + + RelOptInfo - a relation or joined relations + + RestrictInfo - WHERE clauses, like "x = 3" or "y = z" + (note the same structure is used for restriction and + join clauses) + + Path - every way to generate a RelOptInfo(sequential,index,joins) + A plain Path node can represent several simple plans, per its pathtype: + T_SeqScan - sequential scan + T_SampleScan - tablesample scan + T_FunctionScan - function-in-FROM scan + T_TableFuncScan - table function scan + T_ValuesScan - VALUES scan + T_CteScan - CTE (WITH) scan + T_NamedTuplestoreScan - ENR scan + T_WorkTableScan - scan worktable of a recursive CTE + T_Result - childless Result plan node (used for FROM-less SELECT) + IndexPath - index scan + BitmapHeapPath - top of a bitmapped index scan + TidPath - scan by CTID + TidRangePath - scan a contiguous range of CTIDs + SubqueryScanPath - scan a subquery-in-FROM + ForeignPath - scan a foreign table, foreign join or foreign upper-relation + CustomPath - for custom scan providers + AppendPath - append multiple subpaths together + MergeAppendPath - merge multiple subpaths, preserving their common sort order + GroupResultPath - childless Result plan node (used for degenerate grouping) + MaterialPath - a Material plan node + MemoizePath - a Memoize plan node for caching tuples from sub-paths + UniquePath - remove duplicate rows (either by hashing or sorting) + GatherPath - collect the results of parallel workers + GatherMergePath - collect parallel results, preserving their common sort order + ProjectionPath - a Result plan node with child (used for projection) + ProjectSetPath - a ProjectSet plan node applied to some sub-path + SortPath - a Sort plan node applied to some sub-path + IncrementalSortPath - an IncrementalSort plan node applied to some sub-path + GroupPath - a Group plan node applied to some sub-path + UpperUniquePath - a Unique plan node applied to some sub-path + AggPath - an Agg plan node applied to some sub-path + GroupingSetsPath - an Agg plan node used to implement GROUPING SETS + MinMaxAggPath - a Result plan node with subplans performing MIN/MAX + WindowAggPath - a WindowAgg plan node applied to some sub-path + SetOpPath - a SetOp plan node applied to some sub-path + RecursiveUnionPath - a RecursiveUnion plan node applied to two sub-paths + LockRowsPath - a LockRows plan node applied to some sub-path + ModifyTablePath - a ModifyTable plan node applied to some sub-path(s) + LimitPath - a Limit plan node applied to some sub-path + NestPath - nested-loop joins + MergePath - merge joins + HashPath - hash joins + + EquivalenceClass - a data structure representing a set of values known equal + + PathKey - a data structure representing the sort ordering of a path The optimizer spends a good deal of its time worrying about the ordering of the tuples returned by a path. The reason this is useful is that by @@ -909,10 +921,10 @@ of the tuples generated by a particular Path. A path's pathkeys field is a list of PathKey nodes, where the n'th item represents the n'th sort key of the result. 
Each PathKey contains these fields: - * a reference to an EquivalenceClass - * a btree opfamily OID (must match one of those in the EC) - * a sort direction (ascending or descending) - * a nulls-first-or-last flag +* a reference to an EquivalenceClass +* a btree opfamily OID (must match one of those in the EC) +* a sort direction (ascending or descending) +* a nulls-first-or-last flag The EquivalenceClass represents the value being sorted on. Since the various members of an EquivalenceClass are known equal according to the @@ -997,9 +1009,11 @@ lists (sort orderings) do not mention the same EquivalenceClass more than once. For example, in all these cases the second sort column is redundant, because it cannot distinguish values that are the same according to the first sort column: + SELECT ... ORDER BY x, x SELECT ... ORDER BY x, x DESC SELECT ... WHERE x = y ORDER BY x, y + Although a user probably wouldn't write "ORDER BY x,x" directly, such redundancies are more probable once equivalence classes have been considered. Also, the system may generate redundant pathkey lists when @@ -1350,14 +1364,14 @@ RelOptInfos are mostly dummy, but their pathlist lists hold all the Paths considered useful for each step. Currently, we may create these types of additional RelOptInfos during upper-level planning: -UPPERREL_SETOP result of UNION/INTERSECT/EXCEPT, if any -UPPERREL_PARTIAL_GROUP_AGG result of partial grouping/aggregation, if any -UPPERREL_GROUP_AGG result of grouping/aggregation, if any -UPPERREL_WINDOW result of window functions, if any -UPPERREL_PARTIAL_DISTINCT result of partial "SELECT DISTINCT", if any -UPPERREL_DISTINCT result of "SELECT DISTINCT", if any -UPPERREL_ORDERED result of ORDER BY, if any -UPPERREL_FINAL result of any remaining top-level actions + UPPERREL_SETOP result of UNION/INTERSECT/EXCEPT, if any + UPPERREL_PARTIAL_GROUP_AGG result of partial grouping/aggregation, if any + UPPERREL_GROUP_AGG result of grouping/aggregation, if any + UPPERREL_WINDOW result of window functions, if any + UPPERREL_PARTIAL_DISTINCT result of partial "SELECT DISTINCT", if any + UPPERREL_DISTINCT result of "SELECT DISTINCT", if any + UPPERREL_ORDERED result of ORDER BY, if any + UPPERREL_FINAL result of any remaining top-level actions UPPERREL_FINAL is used to represent any final processing steps, currently LockRows (SELECT FOR UPDATE), LIMIT/OFFSET, and ModifyTable. There is no diff --git a/src/backend/optimizer/plan/README.md b/src/backend/optimizer/plan/README.md index 013c0f9ea2..93dd422dc1 100644 --- a/src/backend/optimizer/plan/README.md +++ b/src/backend/optimizer/plan/README.md @@ -6,32 +6,32 @@ Subselects Vadim B. 
Mikheev -From owner-pgsql-hackers@hub.org Fri Feb 13 09:01:19 1998 -Received: from renoir.op.net (root@renoir.op.net [209.152.193.4]) - by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id JAA11576 - for ; Fri, 13 Feb 1998 09:01:17 -0500 (EST) -Received: from hub.org (hub.org [209.47.148.200]) by renoir.op.net (o1/$Revision: 1.14 $) with ESMTP id IAA09761 for ; Fri, 13 Feb 1998 08:41:22 -0500 (EST) -Received: from localhost (majordom@localhost) by hub.org (8.8.8/8.7.5) with SMTP id IAA08135; Fri, 13 Feb 1998 08:40:17 -0500 (EST) -Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Fri, 13 Feb 1998 08:38:42 -0500 (EST) -Received: (from majordom@localhost) by hub.org (8.8.8/8.7.5) id IAA06646 for pgsql-hackers-outgoing; Fri, 13 Feb 1998 08:38:35 -0500 (EST) -Received: from dune.krasnet.ru (dune.krasnet.ru [193.125.44.86]) by hub.org (8.8.8/8.7.5) with ESMTP id IAA04568 for ; Fri, 13 Feb 1998 08:37:16 -0500 (EST) -Received: from sable.krasnoyarsk.su (dune.krasnet.ru [193.125.44.86]) - by dune.krasnet.ru (8.8.7/8.8.7) with ESMTP id UAA13717 - for ; Fri, 13 Feb 1998 20:51:03 +0700 (KRS) - (envelope-from vadim@sable.krasnoyarsk.su) -Message-ID: <34E44FBA.D64E7997@sable.krasnoyarsk.su> -Date: Fri, 13 Feb 1998 20:50:50 +0700 -From: "Vadim B. Mikheev" -Organization: ITTS (Krasnoyarsk) -X-Mailer: Mozilla 4.04 [en] (X11; I; FreeBSD 2.2.5-RELEASE i386) -MIME-Version: 1.0 -To: PostgreSQL Developers List -Subject: [HACKERS] Subselects are in CVS... -Content-Type: text/plain; charset=us-ascii -Content-Transfer-Encoding: 7bit -Sender: owner-pgsql-hackers@hub.org -Precedence: bulk -Status: OR + From owner-pgsql-hackers@hub.org Fri Feb 13 09:01:19 1998 + Received: from renoir.op.net (root@renoir.op.net [209.152.193.4]) + by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id JAA11576 + for ; Fri, 13 Feb 1998 09:01:17 -0500 (EST) + Received: from hub.org (hub.org [209.47.148.200]) by renoir.op.net (o1/$Revision: 1.14 $) with ESMTP id IAA09761 for ; Fri, 13 Feb 1998 08:41:22 -0500 (EST) + Received: from localhost (majordom@localhost) by hub.org (8.8.8/8.7.5) with SMTP id IAA08135; Fri, 13 Feb 1998 08:40:17 -0500 (EST) + Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Fri, 13 Feb 1998 08:38:42 -0500 (EST) + Received: (from majordom@localhost) by hub.org (8.8.8/8.7.5) id IAA06646 for pgsql-hackers-outgoing; Fri, 13 Feb 1998 08:38:35 -0500 (EST) + Received: from dune.krasnet.ru (dune.krasnet.ru [193.125.44.86]) by hub.org (8.8.8/8.7.5) with ESMTP id IAA04568 for ; Fri, 13 Feb 1998 08:37:16 -0500 (EST) + Received: from sable.krasnoyarsk.su (dune.krasnet.ru [193.125.44.86]) + by dune.krasnet.ru (8.8.7/8.8.7) with ESMTP id UAA13717 + for ; Fri, 13 Feb 1998 20:51:03 +0700 (KRS) + (envelope-from vadim@sable.krasnoyarsk.su) + Message-ID: <34E44FBA.D64E7997@sable.krasnoyarsk.su> + Date: Fri, 13 Feb 1998 20:50:50 +0700 + From: "Vadim B. Mikheev" + Organization: ITTS (Krasnoyarsk) + X-Mailer: Mozilla 4.04 [en] (X11; I; FreeBSD 2.2.5-RELEASE i386) + MIME-Version: 1.0 + To: PostgreSQL Developers List + Subject: [HACKERS] Subselects are in CVS... + Content-Type: text/plain; charset=us-ascii + Content-Transfer-Encoding: 7bit + Sender: owner-pgsql-hackers@hub.org + Precedence: bulk + Status: OR This is some implementation notes and opened issues... 
@@ -95,17 +95,17 @@ vac=> explain select * from tmp where x >= (select max(x2) from test2 where y2 = y and exists (select * from tempx where tx = x)); NOTICE: QUERY PLAN: -Seq Scan on tmp (cost=40.03 size=101 width=8) - SubPlan - ^^^^^^^ subquery is in Seq Scan' qual, its plan is below - -> Aggregate (cost=2.05 size=0 width=0) - InitPlan - ^^^^^^^^ EXISTS subsubquery is InitPlan of subquery - -> Seq Scan on tempx (cost=4.33 size=1 width=4) - -> Result (cost=2.05 size=0 width=0) - ^^^^^^ EXISTS subsubquery was transformed into Param - and so we have Result node here - -> Index Scan on test2 (cost=2.05 size=1 width=4) + Seq Scan on tmp (cost=40.03 size=101 width=8) + SubPlan + ^^^^^^^ subquery is in Seq Scan' qual, its plan is below + -> Aggregate (cost=2.05 size=0 width=0) + InitPlan + ^^^^^^^^ EXISTS subsubquery is InitPlan of subquery + -> Seq Scan on tempx (cost=4.33 size=1 width=4) + -> Result (cost=2.05 size=0 width=0) + ^^^^^^ EXISTS subsubquery was transformed into Param + and so we have Result node here + -> Index Scan on test2 (cost=2.05 size=1 width=4) Opened issues. @@ -131,28 +131,28 @@ Results of some test. TMP is table with x,y (int4-s), x in 0-9, y = 100 - x, 1000 tuples (10 duplicates of each tuple). TEST2 is table with x2, y2 (int4-s), x2 in 1-99, y2 = 100 -x2, 10000 tuples (100 dups). - Trying +Trying -select * from tmp where x >= (select max(x2) from test2 where y2 = y); + select * from tmp where x >= (select max(x2) from test2 where y2 = y); - and +and -begin; -select y as ty, max(x2) as mx into table tsub from test2, tmp -where y2 = y group by ty; -vacuum tsub; -select x, y from tmp, tsub where x >= mx and y = ty; -drop table tsub; -end; + begin; + select y as ty, max(x2) as mx into table tsub from test2, tmp + where y2 = y group by ty; + vacuum tsub; + select x, y from tmp, tsub where x >= mx and y = ty; + drop table tsub; + end; - Without index on test2(y2): +Without index on test2(y2): -SubSelect -> 320 sec -Using temp table -> 32 sec + SubSelect -> 320 sec + Using temp table -> 32 sec - Having index +Having index -SubSelect -> 17 sec (2M of memory) -Using temp table -> 32 sec (12M of memory: -S 8192) + SubSelect -> 17 sec (2M of memory) + Using temp table -> 32 sec (12M of memory: -S 8192) Vadim diff --git a/src/backend/parser/README.md b/src/backend/parser/README.md index e0c986a41e..e6016fa430 100644 --- a/src/backend/parser/README.md +++ b/src/backend/parser/README.md @@ -7,27 +7,27 @@ This directory does more than tokenize and parse SQL queries. It also creates Query structures for the various complex queries that are passed to the optimizer and then executor. -parser.c things start here -scan.l break query into tokens -scansup.c handle escapes in input strings -gram.y parse the tokens and produce a "raw" parse tree -analyze.c top level of parse analysis for optimizable queries -parse_agg.c handle aggregates, like SUM(col1), AVG(col2), ... -parse_clause.c handle clauses like WHERE, ORDER BY, GROUP BY, ... -parse_coerce.c handle coercing expressions to different data types -parse_collate.c assign collation information in completed expressions -parse_cte.c handle Common Table Expressions (WITH clauses) -parse_expr.c handle expressions like col, col + 3, x = 3 or x = 4 -parse_enr.c handle ephemeral named rels (trigger transition tables, ...) 
-parse_func.c handle functions, table.column and column identifiers -parse_merge.c handle MERGE -parse_node.c create nodes for various structures -parse_oper.c handle operators in expressions -parse_param.c handle Params (for the cases used in the core backend) -parse_relation.c support routines for tables and column handling -parse_target.c handle the result list of the query -parse_type.c support routines for data type handling -parse_utilcmd.c parse analysis for utility commands (done at execution time) + parser.c things start here + scan.l break query into tokens + scansup.c handle escapes in input strings + gram.y parse the tokens and produce a "raw" parse tree + analyze.c top level of parse analysis for optimizable queries + parse_agg.c handle aggregates, like SUM(col1), AVG(col2), ... + parse_clause.c handle clauses like WHERE, ORDER BY, GROUP BY, ... + parse_coerce.c handle coercing expressions to different data types + parse_collate.c assign collation information in completed expressions + parse_cte.c handle Common Table Expressions (WITH clauses) + parse_expr.c handle expressions like col, col + 3, x = 3 or x = 4 + parse_enr.c handle ephemeral named rels (trigger transition tables, ...) + parse_func.c handle functions, table.column and column identifiers + parse_merge.c handle MERGE + parse_node.c create nodes for various structures + parse_oper.c handle operators in expressions + parse_param.c handle Params (for the cases used in the core backend) + parse_relation.c support routines for tables and column handling + parse_target.c handle the result list of the query + parse_type.c support routines for data type handling + parse_utilcmd.c parse analysis for utility commands (done at execution time) See also src/common/keywords.c, which contains the table of standard keywords and the keyword lookup function. We separated that out because diff --git a/src/backend/regex/README.md b/src/backend/regex/README.md index 930d8ced0d..43b965e9cc 100644 --- a/src/backend/regex/README.md +++ b/src/backend/regex/README.md @@ -9,11 +9,13 @@ General source-file layout There are six separately-compilable source files, five of which expose exactly one exported function apiece: + regcomp.c: pg_regcomp regexec.c: pg_regexec regerror.c: pg_regerror regfree.c: pg_regfree regprefix.c: pg_regprefix + (The pg_ prefixes were added by the Postgres project to distinguish this library version from any similar one that might be present on a particular system. They'd need to be removed or replaced in any standalone version @@ -22,8 +24,8 @@ of the library.) The sixth file, regexport.c, exposes multiple functions that allow extraction of info about a compiled regex (see regexport.h). -There are additional source files regc_*.c that are #include'd in regcomp, -and similarly additional source files rege_*.c that are #include'd in +There are additional source files `regc_*.c` that are #include'd in regcomp, +and similarly additional source files `rege_*.c` that are #include'd in regexec. This was done to avoid exposing internal symbols globally; all functions not meant to be part of the library API are static. @@ -38,19 +40,19 @@ structs.) 
What's where in src/backend/regex/: -regcomp.c Top-level regex compilation code -regc_color.c Color map management -regc_cvec.c Character vector (cvec) management -regc_lex.c Lexer -regc_nfa.c NFA handling -regc_locale.c Application-specific locale code from Tcl project -regc_pg_locale.c Postgres-added application-specific locale code -regexec.c Top-level regex execution code -rege_dfa.c DFA creation and execution -regerror.c pg_regerror: generate text for a regex error code -regfree.c pg_regfree: API to free a no-longer-needed regex_t -regexport.c Functions for extracting info from a regex_t -regprefix.c Code for extracting a common prefix from a regex_t + regcomp.c Top-level regex compilation code + regc_color.c Color map management + regc_cvec.c Character vector (cvec) management + regc_lex.c Lexer + regc_nfa.c NFA handling + regc_locale.c Application-specific locale code from Tcl project + regc_pg_locale.c Postgres-added application-specific locale code + regexec.c Top-level regex execution code + rege_dfa.c DFA creation and execution + regerror.c pg_regerror: generate text for a regex error code + regfree.c pg_regfree: API to free a no-longer-needed regex_t + regexport.c Functions for extracting info from a regex_t + regprefix.c Code for extracting a common prefix from a regex_t The locale-specific code is concerned primarily with case-folding and with expanding locale-specific character classes, such as [[:alnum:]]. It @@ -58,11 +60,11 @@ really needs refactoring if this is ever to become a standalone library. The header files for the library are in src/include/regex/: -regcustom.h Customizes library for particular application -regerrs.h Error message list -regex.h Exported API -regexport.h Exported API for regexport.c -regguts.h Internals declarations + regcustom.h Customizes library for particular application + regerrs.h Error message list + regex.h Exported API + regexport.h Exported API for regexport.c + regguts.h Internals declarations DFAs, NFAs, and all that @@ -261,15 +263,19 @@ an additional arc labeled 2 wherever there is an arc labeled 3; this action ensures that characters of color 2 (i.e., "x") will still be considered as allowing any transitions they did before. We are now done parsing the regex, and we have these final color assignments: + color 1: "a" color 2: "x" color 3: other letters color 4: digits + and the NFA has these arcs: + states 1 -> 2 on color 1 (hence, "a" only) states 2 -> 3 on color 4 (digits) states 3 -> 4 on colors 1, 3, 4, and 2 (covering all \w characters) states 4 -> 5 on color 2 ("x" only) + which can be seen to be a correct representation of the regex. There is one more complexity, which is how to handle ".", that is a diff --git a/src/backend/snowball/README.md b/src/backend/snowball/README.md index 675baff5c9..ab4e4044d3 100644 --- a/src/backend/snowball/README.md +++ b/src/backend/snowball/README.md @@ -10,12 +10,12 @@ which is released by them under a BSD-style license. The Snowball project does not often make formal releases; it's best to pull from their git repository -git clone https://github.com/snowballstem/snowball.git + git clone https://github.com/snowballstem/snowball.git and then building the derived files is as simple as -cd snowball -make + cd snowball + make At least on Linux, no platform-specific adjustment is needed. @@ -39,10 +39,10 @@ To update the PostgreSQL sources from a new Snowball version: 1. 
Copy the *.c files in snowball/src_c/ to src/backend/snowball/libstemmer with replacement of "../runtime/header.h" by "header.h", for example -for f in .../snowball/src_c/*.c -do - sed 's|\.\./runtime/header\.h|header.h|' $f >libstemmer/`basename $f` -done + for f in .../snowball/src_c/*.c + do + sed 's|\.\./runtime/header\.h|header.h|' $f >libstemmer/`basename $f` + done Do not copy stemmers that are listed in libstemmer/modules.txt as nonstandard, such as "german2" or "lovins". diff --git a/src/backend/storage/freespace/README.md b/src/backend/storage/freespace/README.md index e7ff23b76f..11ef718509 100644 --- a/src/backend/storage/freespace/README.md +++ b/src/backend/storage/freespace/README.md @@ -33,9 +33,9 @@ node stores the max amount of free space on any of its children. For example: - 4 - 4 2 -3 4 0 2 <- This level represents heap pages + 4 + 4 2 + 3 4 0 2 <- This level represents heap pages We need two basic operations: search and update. @@ -67,10 +67,10 @@ header takes some space on a page, the binary tree isn't perfect. That is, a few right-most leaf nodes are missing, and there are some useless non-leaf nodes at the right. So the tree looks something like this: - 0 - 1 2 - 3 4 5 6 -7 8 9 A B + 0 + 1 2 + 3 4 5 6 + 7 8 9 A B where the numbers denote each node's position in the array. Note that the tree is guaranteed complete above the leaf level; only some leaf nodes are @@ -100,27 +100,27 @@ For example, assuming each FSM page can hold information about 4 pages (in reality, it holds (BLCKSZ - headers) / 2, or ~4000 with default BLCKSZ), we get a disk layout like this: - 0 <-- page 0 at level 2 (root page) - 0 <-- page 0 at level 1 - 0 <-- page 0 at level 0 - 1 <-- page 1 at level 0 - 2 <-- ... - 3 - 1 <-- page 1 at level 1 - 4 - 5 - 6 - 7 - 2 - 8 - 9 - 10 - 11 - 3 - 12 - 13 - 14 - 15 + 0 <-- page 0 at level 2 (root page) + 0 <-- page 0 at level 1 + 0 <-- page 0 at level 0 + 1 <-- page 1 at level 0 + 2 <-- ... + 3 + 1 <-- page 1 at level 1 + 4 + 5 + 6 + 7 + 2 + 8 + 9 + 10 + 11 + 3 + 12 + 13 + 14 + 15 where the numbers are page numbers *at that level*, starting from 0. diff --git a/src/backend/storage/lmgr/README-SSI.md b/src/backend/storage/lmgr/README-SSI.md index 50d2ecca9d..76a4f40ce6 100644 --- a/src/backend/storage/lmgr/README-SSI.md +++ b/src/backend/storage/lmgr/README-SSI.md @@ -199,24 +199,24 @@ The PostgreSQL implementation uses two additional optimizations: PostgreSQL Implementation ------------------------- - * Since this technique is based on Snapshot Isolation (SI), those +* Since this technique is based on Snapshot Isolation (SI), those areas in PostgreSQL which don't use SI can't be brought under SSI. This includes system tables, temporary tables, sequences, hint bit rewrites, etc. SSI can not eliminate existing anomalies in these areas. - * Any transaction which is run at a transaction isolation level +* Any transaction which is run at a transaction isolation level other than SERIALIZABLE will not be affected by SSI. If you want to enforce business rules through SSI, all transactions should be run at the SERIALIZABLE transaction isolation level, and that should probably be set as the default. - * If all transactions are run at the SERIALIZABLE transaction +* If all transactions are run at the SERIALIZABLE transaction isolation level, business rules can be enforced in triggers or application code without ever having a need to acquire an explicit lock or to use SELECT FOR SHARE or SELECT FOR UPDATE. 
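In practice, an application that leans on SSI this way must be prepared
for any transaction to be rolled back with SQLSTATE 40001 (discussed
further below) and simply retried. As a purely illustrative client-side
sketch using libpq -- the function name and structure are invented for
this example and nothing like it appears in the backend:

    #include <stdbool.h>
    #include <string.h>
    #include <libpq-fe.h>

    /*
     * Run one SQL statement as a SERIALIZABLE transaction, retrying it
     * for as long as it fails with SQLSTATE 40001.
     */
    static bool
    exec_serializable(PGconn *conn, const char *sql)
    {
        for (;;)
        {
            PGresult   *res;
            const char *sqlstate;
            bool        retry;

            PQclear(PQexec(conn, "BEGIN ISOLATION LEVEL SERIALIZABLE"));
            res = PQexec(conn, sql);
            if (PQresultStatus(res) == PGRES_COMMAND_OK ||
                PQresultStatus(res) == PGRES_TUPLES_OK)
            {
                PQclear(res);
                /*
                 * Note: under SSI the COMMIT itself can also fail with
                 * 40001; a full version would check its result the same way.
                 */
                PQclear(PQexec(conn, "COMMIT"));
                return true;
            }
            /* Read the SQLSTATE before freeing the result it lives in. */
            sqlstate = PQresultErrorField(res, PG_DIAG_SQLSTATE);
            retry = (sqlstate != NULL && strcmp(sqlstate, "40001") == 0);
            PQclear(res);
            PQclear(PQexec(conn, "ROLLBACK"));
            if (!retry)
                return false;   /* unrelated error; give up */
            /* serialization failure: start over */
        }
    }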
- * Those who want to continue to use snapshot isolation without +* Those who want to continue to use snapshot isolation without the additional protections of SSI (and the associated costs of enforcing those protections), can use the REPEATABLE READ transaction isolation level. This level retains its legacy behavior, which @@ -224,21 +224,21 @@ is identical to the old SERIALIZABLE implementation and fully consistent with the standard's requirements for the REPEATABLE READ transaction isolation level. - * Performance under this SSI implementation will be significantly +* Performance under this SSI implementation will be significantly improved if transactions which don't modify permanent tables are declared to be READ ONLY before they begin reading data. - * Performance under SSI will tend to degrade more rapidly with a +* Performance under SSI will tend to degrade more rapidly with a large number of active database transactions than under less strict isolation levels. Limiting the number of active transactions through use of a connection pool or similar techniques may be necessary to maintain good performance. - * Any transaction which must be rolled back to prevent +* Any transaction which must be rolled back to prevent serialization anomalies will fail with SQLSTATE 40001, which has a standard meaning of "serialization failure". - * This SSI implementation makes an effort to choose the +* This SSI implementation makes an effort to choose the transaction to be canceled such that an immediate retry of the transaction will not fail due to conflicts with exactly the same transactions. Pursuant to this goal, no transaction is canceled @@ -303,20 +303,20 @@ Heap locking Predicate locks will be acquired for the heap based on the following: - * For a table scan, the entire relation will be locked. +* For a table scan, the entire relation will be locked. - * Each tuple read which is visible to the reading transaction +* Each tuple read which is visible to the reading transaction will be locked, whether or not it meets selection criteria; except that there is no need to acquire an SIREAD lock on a tuple when the transaction already holds a write lock on any tuple representing the row, since a rw-conflict would also create a ww-dependency which has more aggressive enforcement and thus will prevent any anomaly. - * Modifying a heap tuple creates a rw-conflict with any transaction +* Modifying a heap tuple creates a rw-conflict with any transaction that holds a SIREAD lock on that tuple, or on the page or relation that contains it. - * Inserting a new tuple creates a rw-conflict with any transaction +* Inserting a new tuple creates a rw-conflict with any transaction holding a SIREAD lock on the entire relation. It doesn't conflict with page-level locks, because page-level locks are only used to aggregate tuple locks. Unlike index page locks, they don't lock "gaps" on the page. @@ -346,34 +346,34 @@ false positives, they should be minimized for performance reasons. Several optimizations are possible, though not all are implemented yet: - * An index scan which is just finding the right position for an +* An index scan which is just finding the right position for an index insertion or deletion need not acquire a predicate lock. 
- * An index scan which is comparing for equality on the entire key +* An index scan which is comparing for equality on the entire key for a unique index need not acquire a predicate lock as long as a key is found corresponding to a visible tuple which has not been modified by another transaction -- there are no "between or around" gaps to cover. - * As long as built-in foreign key enforcement continues to use +* As long as built-in foreign key enforcement continues to use its current "special tricks" to deal with MVCC issues, predicate locks should not be needed for scans done by enforcement code. - * If a search determines that no rows can be found regardless of +* If a search determines that no rows can be found regardless of index contents because the search conditions are contradictory (e.g., x = 1 AND x = 2), then no predicate lock is needed. Other index AM implementation considerations: - * For an index AM that doesn't have support for predicate locking, +* For an index AM that doesn't have support for predicate locking, we just acquire a predicate lock on the whole index for any search. - * B-tree index searches acquire predicate locks only on the +* B-tree index searches acquire predicate locks only on the index *leaf* pages needed to lock the appropriate index range. If, however, a search discovers that no root page has yet been created, a predicate lock on the index relation is required. - * Like a B-tree, GIN searches acquire predicate locks only on the +* Like a B-tree, GIN searches acquire predicate locks only on the leaf pages of entry tree. When performing an equality scan, and an entry has a posting tree, the posting tree root is locked instead, to lock only that key value. However, fastupdate=on postpones the @@ -382,7 +382,7 @@ into pending list. That makes us unable to detect r-w conflicts using page-level locks. To cope with that, insertions to the pending list conflict with all scans. - * GiST searches can determine that there are no matches at any +* GiST searches can determine that there are no matches at any level of the index, so we acquire predicate lock at each index level during a GiST search. An index insert at the leaf level can then be trusted to ripple up to all levels and locations where @@ -390,13 +390,13 @@ conflicting predicate locks may exist. In case there is a page split, we need to copy predicate lock from the original page to all the new pages. - * Hash index searches acquire predicate locks on the primary +* Hash index searches acquire predicate locks on the primary page of a bucket. It acquires a lock on both the old and new buckets for scans that happen concurrently with page splits. During a bucket split, a predicate lock is copied from the primary page of an old bucket to the primary page of a new bucket. - * The effects of page splits, overflows, consolidations, and +* The effects of page splits, overflows, consolidations, and removals must be carefully reviewed to ensure that predicate locks aren't "lost" during those operations, or kept with pages which could get re-used for different parts of the index. @@ -409,56 +409,56 @@ The PostgreSQL implementation of Serializable Snapshot Isolation differs from what is described in the cited papers for several reasons: - 1. PostgreSQL didn't have any existing predicate locking. It had +1. PostgreSQL didn't have any existing predicate locking. It had to be added from scratch. - 2. The existing in-memory lock structures were not suitable for +2. 
The existing in-memory lock structures were not suitable for tracking SIREAD locks. - * In PostgreSQL, tuple level locks are not held in RAM for +- In PostgreSQL, tuple level locks are not held in RAM for any length of time; lock information is written to the tuples involved in the transactions. - * In PostgreSQL, existing lock structures have pointers to +- In PostgreSQL, existing lock structures have pointers to memory which is related to a session. SIREAD locks need to persist past the end of the originating transaction and even the session which ran it. - * PostgreSQL needs to be able to tolerate a large number of +- PostgreSQL needs to be able to tolerate a large number of transactions executing while one long-running transaction stays open -- the in-RAM techniques discussed in the papers wouldn't support that. - 3. Unlike the database products used for the prototypes described +3. Unlike the database products used for the prototypes described in the papers, PostgreSQL didn't already have a true serializable isolation level distinct from snapshot isolation. - 4. PostgreSQL supports subtransactions -- an issue not mentioned +4. PostgreSQL supports subtransactions -- an issue not mentioned in the papers. - 5. PostgreSQL doesn't assign a transaction number to a database +5. PostgreSQL doesn't assign a transaction number to a database transaction until and unless necessary (normally, when the transaction attempts to modify data). - 6. PostgreSQL has pluggable data types with user-definable +6. PostgreSQL has pluggable data types with user-definable operators, as well as pluggable index types, not all of which are based around data types which support ordering. - 7. Some possible optimizations became apparent during development +7. Some possible optimizations became apparent during development and testing. Differences from the implementation described in the papers are listed below. - * New structures needed to be created in shared memory to track +* New structures needed to be created in shared memory to track the proper information for serializable transactions and their SIREAD locks. - * Because PostgreSQL does not have the same concept of an "oldest +* Because PostgreSQL does not have the same concept of an "oldest transaction ID" for all serializable transactions as assumed in the Cahill thesis, we track the oldest snapshot xmin among serializable transactions, and a count of how many active transactions use that xmin. When the count hits zero we find the new oldest xmin and run a clean-up based on that. - * Because reads in a subtransaction may cause that subtransaction +* Because reads in a subtransaction may cause that subtransaction to roll back, thereby affecting what is written by the top level transaction, predicate locks must survive a subtransaction rollback. As a consequence, all xid usage in SSI, including predicate locking, @@ -466,7 +466,7 @@ is based on the top level xid. When looking at an xid that comes from a tuple's xmin or xmax, for example, we always call SubTransGetTopmostTransaction() before doing much else with it. - * PostgreSQL does not use "update in place" with a rollback log +* PostgreSQL does not use "update in place" with a rollback log for its MVCC implementation. Where possible it uses "HOT" updates on the same page (if there is room and no indexed value is changed). 
For non-HOT updates the old tuple is expired in place and a new tuple @@ -477,21 +477,21 @@ versions of the row, based on the following proof that any additional serialization failures we would get from that would be false positives: - o If transaction T1 reads a row version (thus acquiring a + - If transaction T1 reads a row version (thus acquiring a predicate lock on it) and a second transaction T2 updates that row version (thus creating a rw-conflict graph edge from T1 to T2), must a third transaction T3 which re-updates the new version of the row also have a rw-conflict in from T1 to prevent anomalies? In other words, does it matter whether we recognize the edge T1 -> T3? - o If T1 has a conflict in, it certainly doesn't. Adding the + - If T1 has a conflict in, it certainly doesn't. Adding the edge T1 -> T3 would create a dangerous structure, but we already had one from the edge T1 -> T2, so we would have aborted something anyway. (T2 has already committed, else T3 could not have updated its output; but we would have aborted either T1 or T1's predecessor(s). Hence no cycle involving T1 and T3 can survive.) - o Now let's consider the case where T1 doesn't have a + - Now let's consider the case where T1 doesn't have a rw-conflict in. If that's the case, for this edge T1 -> T3 to make a difference, T3 must have a rw-conflict out that induces a cycle in the dependency graph, i.e. a conflict out to some transaction preceding T1 @@ -499,41 +499,41 @@ in the graph. (A conflict out to T1 itself would be problematic too, but that would mean T1 has a conflict in, the case we already eliminated.) - o So now we're trying to figure out if there can be an + - So now we're trying to figure out if there can be an rw-conflict edge T3 -> T0, where T0 is some transaction that precedes T1. For T0 to precede T1, there has to be some edge, or sequence of edges, from T0 to T1. At least the last edge has to be a wr-dependency or ww-dependency rather than a rw-conflict, because T1 doesn't have a rw-conflict in. And that gives us enough information about the order of transactions to see that T3 can't have a rw-conflict to T0: - - T0 committed before T1 started (the wr/ww-dependency implies this) - - T1 started before T2 committed (the T1->T2 rw-conflict implies this) - - T2 committed before T3 started (otherwise, T3 would get aborted + - T0 committed before T1 started (the wr/ww-dependency implies this) + - T1 started before T2 committed (the T1->T2 rw-conflict implies this) + - T2 committed before T3 started (otherwise, T3 would get aborted because of an update conflict) - o That means T0 committed before T3 started, and therefore + - That means T0 committed before T3 started, and therefore there can't be a rw-conflict from T3 to T0. - o So in all cases, we don't need the T1 -> T3 edge to + - So in all cases, we don't need the T1 -> T3 edge to recognize cycles. Therefore it's not necessary for T1's SIREAD lock on the original tuple version to cover later versions as well. - * Predicate locking in PostgreSQL starts at the tuple level +* Predicate locking in PostgreSQL starts at the tuple level when possible. Multiple fine-grained locks are promoted to a single coarser-granularity lock as needed to avoid resource exhaustion. The amount of memory used for these structures is configurable, to balance RAM usage against SIREAD lock granularity. - * Each backend keeps a process-local table of the locks it holds. +* Each backend keeps a process-local table of the locks it holds. 
To support granularity promotion decisions with low CPU and locking overhead, this table also includes the coarser covering locks and the number of finer-granularity locks they cover. - * Conflicts are identified by looking for predicate locks +* Conflicts are identified by looking for predicate locks when tuples are written, and by looking at the MVCC information when tuples are read. There is no matching between two RAM-based locks. - * Because write locks are stored in the heap tuples rather than a +* Because write locks are stored in the heap tuples rather than a RAM-based lock table, the optimization described in the Cahill thesis which eliminates an SIREAD lock where there is a write lock is implemented by the following: @@ -543,18 +543,18 @@ predicate locks, a tuple lock on the tuple being written is removed. return quickly without doing anything if it is a tuple written by the reading transaction. - * Rather than using conflictIn and conflictOut pointers which use +* Rather than using conflictIn and conflictOut pointers which use NULL to indicate no conflict and a self-reference to indicate multiple conflicts or conflicts with committed transactions, we use a list of rw-conflicts. With the more complete information, false positives are reduced and we have sufficient data for more aggressive clean-up and other optimizations: - o We can avoid ever rolling back a transaction until and + - We can avoid ever rolling back a transaction until and unless there is a pivot where a transaction on the conflict *out* side of the pivot committed before either of the other transactions. - o We can avoid ever rolling back a transaction when the + - We can avoid ever rolling back a transaction when the transaction on the conflict *in* side of the pivot is explicitly or implicitly READ ONLY unless the transaction on the conflict *out* side of the pivot committed before the READ ONLY transaction acquired @@ -562,25 +562,25 @@ its snapshot. (An implicit READ ONLY transaction is one which committed without writing, even though it was not explicitly declared to be READ ONLY.) - o We can more aggressively clean up conflicts, predicate + - We can more aggressively clean up conflicts, predicate locks, and SSI transaction information. - * We allow a READ ONLY transaction to "opt out" of SSI if there are +* We allow a READ ONLY transaction to "opt out" of SSI if there are no READ WRITE transactions which could cause the READ ONLY transaction to ever become part of a "dangerous structure" of overlapping transaction dependencies. - * We allow the user to request that a READ ONLY transaction wait +* We allow the user to request that a READ ONLY transaction wait until the conditions are right for it to start in the "opt out" state described above. We add a DEFERRABLE state to transactions, which is specified and maintained in a way similar to READ ONLY. It is ignored for transactions that are not SERIALIZABLE and READ ONLY. - * When a transaction must be rolled back, we pick among the +* When a transaction must be rolled back, we pick among the active transactions such that an immediate retry will not fail again on conflicts with the same transactions. - * We use the PostgreSQL SLRU system to hold summarized +* We use the PostgreSQL SLRU system to hold summarized information about older committed transactions to put an upper bound on RAM used. Beyond that limit, information spills to disk. 
Performance can degrade in a pessimal situation, but it should be @@ -594,7 +594,7 @@ R&D Issues This is intended to be the place to record specific issues which need more detailed review or analysis. - * WAL file replay. While serializable implementations using S2PL +* WAL file replay. While serializable implementations using S2PL can guarantee that the write-ahead log contains commits in a sequence consistent with some serial execution of serializable transactions, SSI cannot make that guarantee. While the WAL replay is no less @@ -606,18 +606,18 @@ essence, if we do nothing, WAL replay will be at snapshot isolation even for serializable transactions. Is this OK? If not, how do we address it? - * External replication. Look at how this impacts external +* External replication. Look at how this impacts external replication solutions, like Postgres-R, Slony, pgpool, HS/SR, etc. This is related to the "WAL file replay" issue. - * UNIQUE btree search for equality on all columns. Since a search +* UNIQUE btree search for equality on all columns. Since a search of a UNIQUE index using equality tests on all columns will lock the heap tuple if an entry is found, it appears that there is no need to get a predicate lock on the index in that case. A predicate lock is still needed for such a search if a matching index entry which points to a visible tuple is not found. - * Minimize touching of shared memory. Should lists in shared +* Minimize touching of shared memory. Should lists in shared memory push entries which have just been returned to the front of the available list, so they will be popped back off soon and some memory might never be touched, or should we keep adding returned items to @@ -638,9 +638,9 @@ http://dx.doi.org/10.1145/1071610.1071615 Architecture of a Database System. Foundations and Trends(R) in Databases Vol. 1, No. 2 (2007) 141-259. http://db.cs.berkeley.edu/papers/fntdb07-architecture.pdf - Of particular interest: - * 6.1 A Note on ACID - * 6.2 A Brief Review of Serializability - * 6.3 Locking and Latching - * 6.3.1 Transaction Isolation Levels - * 6.5.3 Next-Key Locking: Physical Surrogates for Logical Properties +Of particular interest: +* 6.1 A Note on ACID +* 6.2 A Brief Review of Serializability +* 6.3 Locking and Latching +* 6.3.1 Transaction Isolation Levels +* 6.5.3 Next-Key Locking: Physical Surrogates for Logical Properties diff --git a/src/backend/utils/fmgr/README.md b/src/backend/utils/fmgr/README.md index 9958d38992..ee14362b8e 100644 --- a/src/backend/utils/fmgr/README.md +++ b/src/backend/utils/fmgr/README.md @@ -20,18 +20,18 @@ tuple.) 
When a function is looked up in pg_proc, the result is represented as -typedef struct -{ - PGFunction fn_addr; /* pointer to function or handler to be called */ - Oid fn_oid; /* OID of function (NOT of handler, if any) */ - short fn_nargs; /* number of input args (0..FUNC_MAX_ARGS) */ - bool fn_strict; /* function is "strict" (NULL in => NULL out) */ - bool fn_retset; /* function returns a set (over multiple calls) */ - unsigned char fn_stats; /* collect stats if track_functions > this */ - void *fn_extra; /* extra space for use by handler */ - MemoryContext fn_mcxt; /* memory context to store fn_extra in */ - Node *fn_expr; /* expression parse tree for call, or NULL */ -} FmgrInfo; + typedef struct + { + PGFunction fn_addr; /* pointer to function or handler to be called */ + Oid fn_oid; /* OID of function (NOT of handler, if any) */ + short fn_nargs; /* number of input args (0..FUNC_MAX_ARGS) */ + bool fn_strict; /* function is "strict" (NULL in => NULL out) */ + bool fn_retset; /* function returns a set (over multiple calls) */ + unsigned char fn_stats; /* collect stats if track_functions > this */ + void *fn_extra; /* extra space for use by handler */ + MemoryContext fn_mcxt; /* memory context to store fn_extra in */ + Node *fn_expr; /* expression parse tree for call, or NULL */ + } FmgrInfo; For an ordinary built-in function, fn_addr is just the address of the C routine that implements the function. Otherwise it is the address of a @@ -59,17 +59,17 @@ FmgrInfo than in FunctionCallInfoBaseData where it might more logically go. During a call of a function, the following data structure is created and passed to the function: -typedef struct -{ - FmgrInfo *flinfo; /* ptr to lookup info used for this call */ - Node *context; /* pass info about context of call */ - Node *resultinfo; /* pass or return extra info about result */ - Oid fncollation; /* collation for function to use */ - bool isnull; /* function must set true if result is NULL */ - short nargs; /* # arguments actually passed */ - NullableDatum args[]; /* Arguments passed to function */ -} FunctionCallInfoBaseData; -typedef FunctionCallInfoBaseData* FunctionCallInfo; + typedef struct + { + FmgrInfo *flinfo; /* ptr to lookup info used for this call */ + Node *context; /* pass info about context of call */ + Node *resultinfo; /* pass or return extra info about result */ + Oid fncollation; /* collation for function to use */ + bool isnull; /* function must set true if result is NULL */ + short nargs; /* # arguments actually passed */ + NullableDatum args[]; /* Arguments passed to function */ + } FunctionCallInfoBaseData; + typedef FunctionCallInfoBaseData* FunctionCallInfo; flinfo points to the lookup info used to make the call. Ordinary functions will probably ignore this field, but function class handlers will need it @@ -119,11 +119,11 @@ least), and other uglinesses. Callees, whether they be individual functions or function handlers, shall always have this signature: -Datum function (FunctionCallInfo fcinfo); + Datum function (FunctionCallInfo fcinfo); which is represented by the typedef -typedef Datum (*PGFunction) (FunctionCallInfo fcinfo); + typedef Datum (*PGFunction) (FunctionCallInfo fcinfo); The function is responsible for setting fcinfo->isnull appropriately as well as returning a result represented as a Datum. Note that since @@ -139,11 +139,11 @@ Here are the proposed macros and coding conventions: The definition of an fmgr-callable function will always look like -Datum -function_name(PG_FUNCTION_ARGS) -{ - ... 
-} + Datum + function_name(PG_FUNCTION_ARGS) + { + ... + } "PG_FUNCTION_ARGS" just expands to "FunctionCallInfo fcinfo". The main reason for using this macro is to make it easy for scripts to spot function @@ -156,26 +156,37 @@ just "fcinfo->args[n].isnull"). It should avoid trying to fetch the value of any argument that is null. Both strict and nonstrict functions can return NULL, if needed, with + PG_RETURN_NULL(); + which expands to + { fcinfo->isnull = true; return (Datum) 0; } Argument values are ordinarily fetched using code like + int32 name = PG_GETARG_INT32(number); For float4, float8, and int8, the PG_GETARG macros will hide whether the types are pass-by-value or pass-by-reference. For example, if float8 is pass-by-reference then PG_GETARG_FLOAT8 expands to + + (* (float8 *) DatumGetPointer(fcinfo->args[number].value)) + and would typically be called like this: + float8 arg = PG_GETARG_FLOAT8(0); + For what are now historical reasons, the float-related typedefs and macros express the type width in bytes (4 or 8), whereas we prefer to label the widths of integer types in bits. Non-null values are returned with a PG_RETURN_XXX macro of the appropriate type. For example, PG_RETURN_INT32 expands to + return Int32GetDatum(x) + PG_RETURN_FLOAT4, PG_RETURN_FLOAT8, and PG_RETURN_INT64 hide whether their data types are pass-by-value or pass-by-reference, by doing a palloc if needed. @@ -290,9 +301,13 @@ transaction cleanup. SQL-callable functions can support this need using the ErrorSaveContext context mechanism. To report a "soft" error, a SQL-callable function should call + errsave(fcinfo->context, ...) + where it would previously have done + ereport(ERROR, ...) + If the passed "context" is NULL or is not an ErrorSaveContext node, then errsave behaves precisely as ereport(ERROR): the exception is thrown via longjmp, so that control does not return. If "context" @@ -306,7 +321,9 @@ been reported in the ErrorSaveContext node.) If there is nothing to do except return after calling errsave(), you can save a line or two by writing + ereturn(fcinfo->context, dummy_value, ...) + to perform errsave() and then "return dummy_value". An error reported "softly" must be safe, in the sense that there is diff --git a/src/backend/utils/mb/README.md b/src/backend/utils/mb/README.md index ef36626891..f1299fa2fd 100644 --- a/src/backend/utils/mb/README.md +++ b/src/backend/utils/mb/README.md @@ -3,20 +3,20 @@ src/backend/utils/mb/README Encodings ========= -conv.c: static functions and a public table for code conversion -mbutils.c: public functions for the backend only. -stringinfo_mb.c: public backend-only multibyte-aware stringinfo functions -wstrcmp.c: strcmp for mb -wstrncmp.c: strncmp for mb -win866.c: a tool to generate KOI8 <--> CP866 conversion table -iso.c: a tool to generate KOI8 <--> ISO8859-5 conversion table -win1251.c: a tool to generate KOI8 <--> CP1251 conversion table + conv.c: static functions and a public table for code conversion + mbutils.c: public functions for the backend only. 
+ stringinfo_mb.c: public backend-only multibyte-aware stringinfo functions + wstrcmp.c: strcmp for mb + wstrncmp.c: strncmp for mb + win866.c: a tool to generate KOI8 <--> CP866 conversion table + iso.c: a tool to generate KOI8 <--> ISO8859-5 conversion table + win1251.c: a tool to generate KOI8 <--> CP1251 conversion table See also in src/common/: -encnames.c: public functions for encoding names -wchar.c: mostly static functions and a public table for mb string and - multibyte conversion + encnames.c: public functions for encoding names + wchar.c: mostly static functions and a public table for mb string and + multibyte conversion Introduction ------------ diff --git a/src/backend/utils/misc/README.md b/src/backend/utils/misc/README.md index 85d97d29b6..abfa473757 100644 --- a/src/backend/utils/misc/README.md +++ b/src/backend/utils/misc/README.md @@ -23,29 +23,37 @@ modify the default SHOW display for a variable. If a check_hook is provided, it points to a function of the signature + bool check_hook(datatype *newvalue, void **extra, GucSource source) + The "newvalue" argument is of type bool *, int *, double *, or char ** for bool, int/enum, real, or string variables respectively. The check function should validate the proposed new value, and return true if it is OK or false if not. The function can optionally do a few other things: * When rejecting a bad proposed value, it may be useful to append some -additional information to the generic "invalid value for parameter FOO" -complaint that guc.c will emit. To do that, call + additional information to the generic "invalid value for parameter FOO" + complaint that guc.c will emit. To do that, call + void GUC_check_errdetail(const char *format, ...) -where the format string and additional arguments follow the rules for -errdetail() arguments. The resulting string will be emitted as the -DETAIL line of guc.c's error report, so it should follow the message style -guidelines for DETAIL messages. There is also + + where the format string and additional arguments follow the rules for + errdetail() arguments. The resulting string will be emitted as the + DETAIL line of guc.c's error report, so it should follow the message style + guidelines for DETAIL messages. There is also + void GUC_check_errhint(const char *format, ...) -which can be used in the same way to append a HINT message. -Occasionally it may even be appropriate to override guc.c's generic primary -message or error code, which can be done with + + which can be used in the same way to append a HINT message. + Occasionally it may even be appropriate to override guc.c's generic primary + message or error code, which can be done with + void GUC_check_errcode(int sqlerrcode) void GUC_check_errmsg(const char *format, ...) -In general, check_hooks should avoid throwing errors directly if possible, -though this may be impractical to avoid for some corner cases such as -out-of-memory. + + In general, check_hooks should avoid throwing errors directly if possible, + though this may be impractical to avoid for some corner cases such as + out-of-memory. * Since the newvalue is pass-by-reference, the function can modify it. This might be used for example to canonicalize the spelling of a string @@ -76,7 +84,9 @@ assignment will occur. 
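To make these conventions concrete, here is a minimal sketch of a
check_hook for a hypothetical integer variable; the variable name, the
even-number rule, and the messages are all invented for this example:

    #include "postgres.h"
    #include "utils/guc.h"

    /*
     * check_hook for a hypothetical int GUC "my_worker_slots" that must
     * be set to an even number.
     */
    bool
    check_my_worker_slots(int *newval, void **extra, GucSource source)
    {
        if (*newval % 2 != 0)
        {
            /* Becomes the DETAIL line of guc.c's "invalid value" report. */
            GUC_check_errdetail("\"my_worker_slots\" must be an even number.");
            GUC_check_errhint("Round the value up to the next even number.");
            return false;
        }

        /* Value is acceptable; no derived "extra" data to hand on. */
        return true;
    }

In real code the hook is attached when the variable is defined;
extensions pass it to DefineCustomIntVariable().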
If an assign_hook is provided, it points to a function of the signature + void assign_hook(datatype newvalue, void *extra) + where the type of "newvalue" matches the kind of variable, and "extra" is the derived-information pointer returned by the check_hook (always NULL if there is no check_hook). This function is called immediately @@ -110,7 +120,9 @@ needing to check GUC values outside a transaction. If a show_hook is provided, it points to a function of the signature + const char *show_hook(void) + This hook allows variable-specific computation of the value displayed by SHOW (and other SQL features for showing GUC variable values). The return value can point to a static buffer, since show functions are @@ -214,23 +226,23 @@ The merged entry will have level N-1 and prior = older prior, so easiest to keep older entry and free newer. There are 12 possibilities since we already handled level N state = SAVE: -N-1 N + N-1 N -SAVE SET discard top prior, set state SET -SAVE LOCAL discard top prior, no change to stack entry -SAVE SET+LOCAL discard top prior, copy masked, state S+L + SAVE SET discard top prior, set state SET + SAVE LOCAL discard top prior, no change to stack entry + SAVE SET+LOCAL discard top prior, copy masked, state S+L -SET SET discard top prior, no change to stack entry -SET LOCAL copy top prior to masked, state S+L -SET SET+LOCAL discard top prior, copy masked, state S+L + SET SET discard top prior, no change to stack entry + SET LOCAL copy top prior to masked, state S+L + SET SET+LOCAL discard top prior, copy masked, state S+L -LOCAL SET discard top prior, set state SET -LOCAL LOCAL discard top prior, no change to stack entry -LOCAL SET+LOCAL discard top prior, copy masked, state S+L + LOCAL SET discard top prior, set state SET + LOCAL LOCAL discard top prior, no change to stack entry + LOCAL SET+LOCAL discard top prior, copy masked, state S+L -SET+LOCAL SET discard top prior and second masked, state SET -SET+LOCAL LOCAL discard top prior, no change to stack entry -SET+LOCAL SET+LOCAL discard top prior, copy masked, state S+L + SET+LOCAL SET discard top prior and second masked, state SET + SET+LOCAL LOCAL discard top prior, no change to stack entry + SET+LOCAL SET+LOCAL discard top prior, copy masked, state S+L RESET is executed like a SET, but using the reset_val as the desired new diff --git a/src/backend/utils/mmgr/README.md b/src/backend/utils/mmgr/README.md index f484f7d6f5..b710aa0002 100644 --- a/src/backend/utils/mmgr/README.md +++ b/src/backend/utils/mmgr/README.md @@ -412,11 +412,11 @@ GetMemoryChunkMethodID() and finding the corresponding MemoryContextMethods in the mcxt_methods[] array. For convenience, the MCXT_METHOD() macro is provided, making the code as simple as: -void -pfree(void *pointer) -{ - MCXT_METHOD(pointer, free_p)(pointer); -} + void + pfree(void *pointer) + { + MCXT_METHOD(pointer, free_p)(pointer); + } All of the current memory contexts make use of the MemoryChunk header type which is defined in memutils_memorychunk.h. This suits all of the existing diff --git a/src/backend/utils/resowner/README.md b/src/backend/utils/resowner/README.md index d67df3faed..9aed4ff88a 100644 --- a/src/backend/utils/resowner/README.md +++ b/src/backend/utils/resowner/README.md @@ -83,13 +83,13 @@ references, to name a few examples. To add a new kind of resource, define a ResourceOwnerDesc to describe it. 
For example: -static const ResourceOwnerDesc myresource_desc = { - .name = "My fancy resource", - .release_phase = RESOURCE_RELEASE_AFTER_LOCKS, - .release_priority = RELEASE_PRIO_FIRST, - .ReleaseResource = ReleaseMyResource, - .DebugPrint = PrintMyResource -}; + static const ResourceOwnerDesc myresource_desc = { + .name = "My fancy resource", + .release_phase = RESOURCE_RELEASE_AFTER_LOCKS, + .release_priority = RELEASE_PRIO_FIRST, + .ReleaseResource = ReleaseMyResource, + .DebugPrint = PrintMyResource + }; ResourceOwnerRemember() and ResourceOwnerForget() functions take a pointer to that struct, along with a Datum to represent the resource. The meaning @@ -139,28 +139,28 @@ within each phase. For example, imagine that you have two ResourceOwners, parent and child, as follows: -Parent - parent resource BEFORE_LOCKS priority 1 - parent resource BEFORE_LOCKS priority 2 - parent resource AFTER_LOCKS priority 10001 - parent resource AFTER_LOCKS priority 10002 - Child - child resource BEFORE_LOCKS priority 1 - child resource BEFORE_LOCKS priority 2 - child resource AFTER_LOCKS priority 10001 - child resource AFTER_LOCKS priority 10002 + Parent + parent resource BEFORE_LOCKS priority 1 + parent resource BEFORE_LOCKS priority 2 + parent resource AFTER_LOCKS priority 10001 + parent resource AFTER_LOCKS priority 10002 + Child + child resource BEFORE_LOCKS priority 1 + child resource BEFORE_LOCKS priority 2 + child resource AFTER_LOCKS priority 10001 + child resource AFTER_LOCKS priority 10002 These resources would be released in the following order: -child resource BEFORE_LOCKS priority 1 -child resource BEFORE_LOCKS priority 2 -parent resource BEFORE_LOCKS priority 1 -parent resource BEFORE_LOCKS priority 2 -(locks) -child resource AFTER_LOCKS priority 10001 -child resource AFTER_LOCKS priority 10002 -parent resource AFTER_LOCKS priority 10001 -parent resource AFTER_LOCKS priority 10002 + child resource BEFORE_LOCKS priority 1 + child resource BEFORE_LOCKS priority 2 + parent resource BEFORE_LOCKS priority 1 + parent resource BEFORE_LOCKS priority 2 + (locks) + child resource AFTER_LOCKS priority 10001 + child resource AFTER_LOCKS priority 10002 + parent resource AFTER_LOCKS priority 10001 + parent resource AFTER_LOCKS priority 10002 To release all the resources, you need to call ResourceOwnerRelease() three times, once for each phase. You may perform additional tasks between the diff --git a/src/interfaces/ecpg/preproc/README.parser.md b/src/interfaces/ecpg/preproc/README.parser.md index ddc3061d48..27194dd956 100644 --- a/src/interfaces/ecpg/preproc/README.parser.md +++ b/src/interfaces/ecpg/preproc/README.parser.md @@ -14,29 +14,38 @@ ECPG modifies and extends the core grammar in a way that actions for grammar rules. In "ecpg.addons", every modified rule follows this pattern: + ECPG: dumpedtokens postfix + where "dumpedtokens" is simply tokens from core gram.y's rules concatenated together. e.g. if gram.y has this: - ruleA: tokenA tokenB tokenC {...} + + ruleA: tokenA tokenB tokenC {...} + then "dumpedtokens" is "ruleAtokenAtokenBtokenC". "postfix" above can be: -a) "block" - the automatic rule created by parse.pl is completely - overridden, the code block has to be written completely as - it were in a plain bison grammar -b) "rule" - the automatic rule is extended on, so new syntaxes - are accepted for "ruleA". 
E.g.:
+
   ECPG: ruleAtokenAtokenBtokenC rule
       | tokenD tokenE { action_code; }
       ...
+
 It will be substituted with:
+
   ruleA:
       | tokenD tokenE { action_code; }
       ...
+
-c) "addon" - the automatic action for the rule (SQL syntax constructed
-   from the tokens concatenated together) is prepended with a new
-   action code part. This code part is written as is's already inside
-   the { ... }
+
+* "addon" - the automatic action for the rule (SQL syntax constructed
+  from the tokens concatenated together) is prepended with a new
+  action code part. This code part is written as it's already inside
+  the { ... }

Multiple "addon" or "block" lines may appear together with the new code
block if the code block is common for those rules.
diff --git a/src/port/README.md b/src/port/README.md
index ed5c54a72f..e5aeed07b6 100644
--- a/src/port/README.md
+++ b/src/port/README.md
@@ -14,7 +14,7 @@ libraries. This is done by removing -lpgport from the link line:

    # Need to recompile any libpgport object files
    LIBS := $(filter-out -lpgport, $(LIBS))

-and adding infrastructure to recompile the object files:
+   and adding infrastructure to recompile the object files:

    OBJS= execute.o typename.o descriptor.o data.o error.o prepare.o memory.o \
       connect.o misc.o path.o exec.o \
diff --git a/src/test/isolation/README.md b/src/test/isolation/README.md
index 5818ca5003..471b0a5029 100644
--- a/src/test/isolation/README.md
+++ b/src/test/isolation/README.md
@@ -12,23 +12,33 @@ serializable isolation level; but tests for other sorts of concurrent
behaviors have been added as well.

You can run the tests against the current build tree by typing
+
    make check
+
Alternatively, you can run against an existing installation by typing
+
    make installcheck
+
(This will contact a server at the default port expected by libpq. You
can set PGPORT and so forth in your environment to control this.)

To run just specific test(s) against an installed server,
you can do something like
+
    ./pg_isolation_regress fk-contention fk-deadlock
+
(look into the specs/ subdirectory to see the available tests).

Certain tests require the server's max_prepared_transactions parameter to be
set to at least 3; therefore they are not run by default. To include them in
the test run, use
+
    make check-prepared-txns
+
or
+
    make installcheck-prepared-txns
+
after making sure the server configuration is correct (see TEMP_CONFIG
to adjust this in the "check" case).
@@ -64,7 +74,7 @@ that are to be run.

A test specification consists of four parts, in this order:

-setup { <SQL> }
+setup { < SQL > }

The given SQL block is executed once (per permutation) before running
the test. Create any test tables or other required objects here. This
@@ -74,13 +84,13 @@ setup { <SQL> }
and some statements such as VACUUM cannot be combined with others in
such a block.)

-teardown { <SQL> }
+teardown { < SQL > }

The teardown SQL block is executed once after the test is finished.
Use this to clean up in preparation for the next permutation, e.g dropping
any test tables created by setup. This part is optional.

-session <name>
+session < name >

There are normally several "session" parts in a spec file. Each
session is executed in its own connection. A session part consists
@@ -91,13 +101,13 @@

Each step has the syntax

- step <name> { <SQL> }
+ step < name > { < SQL > }

- where <name> is a name identifying this step, and <SQL> is a SQL statement
+ where < name > is a name identifying this step, and < SQL > is a SQL statement
(or statements, separated by semicolons) that is executed in the step.
Step names must be unique across the whole spec file.

-permutation <step name> ...
+permutation < step name > ...

A permutation line specifies a list of steps that are run in that order.
Any number of permutation lines can appear. If no permutation lines are
@@ -116,10 +126,10 @@ whether you quote them or not. You must use quotes if you want to use an
isolation test keyword (such as "permutation") as a name. A # character
begins a comment, which extends to the end of the line.
-(This does not work inside <SQL> blocks, however. Use the usual SQL
+(This does not work inside < SQL > blocks, however. Use the usual SQL
comment conventions there.)

-There is no way to include a "}" character in an <SQL> block.
+There is no way to include a "}" character in an < SQL > block.

For each permutation of the session steps (whether these are manually
specified in the spec file, or automatically generated), the isolation
@@ -187,9 +197,9 @@ step has completed. (If the other step is used more than once in the current
permutation, this step cannot complete while any of those instances is
active.)

-A marker of the form "<other step name> notices <n>" (where <n> is a
+A marker of the form "< other step name > notices < n >" (where < n > is a
positive integer) indicates that this step may not be reported as
-completing until the other step's session has returned at least <n>
+completing until the other step's session has returned at least < n >
NOTICE messages, counting from when this step is launched. This is
useful for stabilizing cases where a step can return NOTICE messages
before it actually completes, and those messages must be synchronized with the
diff --git a/src/test/kerberos/README.md b/src/test/kerberos/README.md
index a048d442af..dc53747fd9 100644
--- a/src/test/kerberos/README.md
+++ b/src/test/kerberos/README.md
@@ -23,9 +23,13 @@ Also, to use "make installcheck", you must have built and installed
contrib/dblink and contrib/postgres_fdw in addition to the core code.

Run
+
    make check PG_TEST_EXTRA=kerberos
+
or
+
    make installcheck PG_TEST_EXTRA=kerberos
+
You can use "make installcheck" if you previously did "make install". In
that case, the code in the installation tree is tested. With "make check",
a temporary installation tree is built from the current
diff --git a/src/test/locale/README.md b/src/test/locale/README.md
index e290e31480..eb6a7c6124 100644
--- a/src/test/locale/README.md
+++ b/src/test/locale/README.md
@@ -9,8 +9,11 @@ locale data. Then there are test-sort.pl and test-sort.py that test
collating.

To run a test for some locale run
+
    make check-$locale
+
for example
+
    make check-koi8-r

Currently, there are only tests for a few locales available.
The script diff --git a/src/test/modules/dummy_seclabel/README.md b/src/test/modules/dummy_seclabel/README.md index a3fcbd7599..1a12fdaad2 100644 --- a/src/test/modules/dummy_seclabel/README.md +++ b/src/test/modules/dummy_seclabel/README.md @@ -18,13 +18,13 @@ Usage Here's a simple example of usage: -# postgresql.conf -shared_preload_libraries = 'dummy_seclabel' + # postgresql.conf + shared_preload_libraries = 'dummy_seclabel' -postgres=# CREATE TABLE t (a int, b text); -CREATE TABLE -postgres=# SECURITY LABEL ON TABLE t IS 'classified'; -SECURITY LABEL + postgres=# CREATE TABLE t (a int, b text); + CREATE TABLE + postgres=# SECURITY LABEL ON TABLE t IS 'classified'; + SECURITY LABEL The dummy_seclabel module provides only four hardcoded labels: unclassified, classified, diff --git a/src/test/modules/test_parser/README.md b/src/test/modules/test_parser/README.md index 0a11ec85fb..735bfe9a00 100644 --- a/src/test/modules/test_parser/README.md +++ b/src/test/modules/test_parser/README.md @@ -5,12 +5,12 @@ a starting point for developing your own parser. test_parser recognizes words separated by white space, and returns just two token types: -mydb=# SELECT * FROM ts_token_type('testparser'); - tokid | alias | description --------+-------+--------------- - 3 | word | Word - 12 | blank | Space symbols -(2 rows) + mydb=# SELECT * FROM ts_token_type('testparser'); + tokid | alias | description + -------+-------+--------------- + 3 | word | Word + 12 | blank | Space symbols + (2 rows) These token numbers have been chosen to be compatible with the default parser's numbering. This allows us to use its headline() @@ -24,38 +24,38 @@ parser testparser. It has no user-configurable parameters. You can test the parser with, for example, -mydb=# SELECT * FROM ts_parse('testparser', 'That''s my first own parser'); - tokid | token --------+-------- - 3 | That's - 12 | - 3 | my - 12 | - 3 | first - 12 | - 3 | own - 12 | - 3 | parser + mydb=# SELECT * FROM ts_parse('testparser', 'That''s my first own parser'); + tokid | token + -------+-------- + 3 | That's + 12 | + 3 | my + 12 | + 3 | first + 12 | + 3 | own + 12 | + 3 | parser Real-world use requires setting up a text search configuration that uses the parser. 
For example, -mydb=# CREATE TEXT SEARCH CONFIGURATION testcfg ( PARSER = testparser ); -CREATE TEXT SEARCH CONFIGURATION - -mydb=# ALTER TEXT SEARCH CONFIGURATION testcfg -mydb-# ADD MAPPING FOR word WITH english_stem; -ALTER TEXT SEARCH CONFIGURATION - -mydb=# SELECT to_tsvector('testcfg', 'That''s my first own parser'); - to_tsvector -------------------------------- - 'that':1 'first':3 'parser':5 -(1 row) - -mydb=# SELECT ts_headline('testcfg', 'Supernovae stars are the brightest phenomena in galaxies', -mydb(# to_tsquery('testcfg', 'star')); - ts_headline ------------------------------------------------------------------ - Supernovae stars are the brightest phenomena in galaxies -(1 row) + mydb=# CREATE TEXT SEARCH CONFIGURATION testcfg ( PARSER = testparser ); + CREATE TEXT SEARCH CONFIGURATION + + mydb=# ALTER TEXT SEARCH CONFIGURATION testcfg + mydb-# ADD MAPPING FOR word WITH english_stem; + ALTER TEXT SEARCH CONFIGURATION + + mydb=# SELECT to_tsvector('testcfg', 'That''s my first own parser'); + to_tsvector + ------------------------------- + 'that':1 'first':3 'parser':5 + (1 row) + + mydb=# SELECT ts_headline('testcfg', 'Supernovae stars are the brightest phenomena in galaxies', + mydb(# to_tsquery('testcfg', 'star')); + ts_headline + ----------------------------------------------------------------- + Supernovae stars are the brightest phenomena in galaxies + (1 row) diff --git a/src/test/modules/test_regex/README.md b/src/test/modules/test_regex/README.md index 3ef152d4e1..7da5e67a9c 100644 --- a/src/test/modules/test_regex/README.md +++ b/src/test/modules/test_regex/README.md @@ -5,7 +5,7 @@ aren't currently exposed at the SQL level by PostgreSQL. Currently, one function is provided: -test_regex(pattern text, string text, flags text) returns setof text[] + test_regex(pattern text, string text, flags text) returns setof text[] Reports an error if the pattern is an invalid regex. Otherwise, the first row of output contains the number of subexpressions, diff --git a/src/test/modules/test_rls_hooks/README.md b/src/test/modules/test_rls_hooks/README.md index c22e0d3fb4..b561bb0192 100644 --- a/src/test/modules/test_rls_hooks/README.md +++ b/src/test/modules/test_rls_hooks/README.md @@ -3,14 +3,14 @@ define additional policies to be used. Functions ========= -test_rls_hooks_permissive(CmdType cmdtype, Relation relation) - RETURNS List* + test_rls_hooks_permissive(CmdType cmdtype, Relation relation) + RETURNS List* Returns a list of policies which should be added to any existing policies on the relation, combined with OR. -test_rls_hooks_restrictive(CmdType cmdtype, Relation relation) - RETURNS List* + test_rls_hooks_restrictive(CmdType cmdtype, Relation relation) + RETURNS List* Returns a list of policies which should be added to any existing policies on the relation, combined with AND. diff --git a/src/test/modules/test_shm_mq/README.md b/src/test/modules/test_shm_mq/README.md index 641407bee0..1df3d38bde 100644 --- a/src/test/modules/test_shm_mq/README.md +++ b/src/test/modules/test_shm_mq/README.md @@ -14,9 +14,9 @@ Functions ========= -test_shm_mq(queue_size int8, message text, - repeat_count int4 default 1, num_workers int4 default 1) - RETURNS void + test_shm_mq(queue_size int8, message text, + repeat_count int4 default 1, num_workers int4 default 1) + RETURNS void This function sends and receives messages synchronously. 
The user backend sends the provided message to the first background worker using @@ -31,10 +31,10 @@ the user backend verifies that the message finally received matches the one originally sent and throws an error if not. -test_shm_mq_pipelined(queue_size int8, message text, - repeat_count int4 default 1, num_workers int4 default 1, - verify bool default true) - RETURNS void + test_shm_mq_pipelined(queue_size int8, message text, + repeat_count int4 default 1, num_workers int4 default 1, + verify bool default true) + RETURNS void This function sends the same message multiple times, as specified by the repeat count, to the first background worker using a queue of the given diff --git a/src/test/recovery/README.md b/src/test/recovery/README.md index 896df0ad05..c18fe7e21e 100644 --- a/src/test/recovery/README.md +++ b/src/test/recovery/README.md @@ -14,9 +14,13 @@ contrib/pg_prewarm, contrib/pg_stat_statements and contrib/test_decoding in addition to the core code. Run - make check + + make check + or - make installcheck + + make installcheck + You can use "make installcheck" if you previously did "make install". In that case, the code in the installation tree is tested. With "make check", a temporary installation tree is built from the current diff --git a/src/test/ssl/README.md b/src/test/ssl/README.md index 2101a466d2..957694c689 100644 --- a/src/test/ssl/README.md +++ b/src/test/ssl/README.md @@ -20,9 +20,13 @@ Also, to use "make installcheck", you must have built and installed contrib/sslinfo in addition to the core code. Run + make check PG_TEST_EXTRA=ssl + or + make installcheck PG_TEST_EXTRA=ssl + You can use "make installcheck" if you previously did "make install". In that case, the code in the installation tree is tested. With "make check", a temporary installation tree is built from the current @@ -39,49 +43,49 @@ Certificates The test suite needs a set of public/private key pairs and certificates to run: -root_ca +* root_ca root CA, use to sign the server and client CA certificates. -server_ca +* server_ca CA used to sign server certificates. -client_ca +* client_ca CA used to sign client certificates. -server-cn-only +* server-cn-only server-cn-and-alt-names server-single-alt-name server-multiple-alt-names -server-no-names +server-no-names: server certificates, with small variations in the hostnames present in the certificate. Signed by server_ca. -server-password +* server-password: same as server-cn-only, but password-protected. -client +* client: a client certificate, for user "ssltestuser". Signed by client_ca. -client-revoked +* client-revoked: like "client", but marked as revoked in the client CA's CRL. In addition, there are a few files that combine various certificates together in the same file: -both-cas-1 +* both-cas-1: Contains root_ca.crt, client_ca.crt and server_ca.crt, in that order. -both-cas-2 +* both-cas-2: Contains root_ca.crt, server_ca.crt and client_ca.crt, in that order. -root+server_ca +* root+server_ca: Contains root_crt and server_ca.crt. For use as client's "sslrootcert" option. -root+client_ca +* root+client_ca: Contains root_crt and client_ca.crt. For use as server's "ssl_ca_file". -client+client_ca +* client+client_ca: Contains client.crt and client_ca.crt in that order. For use as client's certificate chain. diff --git a/src/timezone/README.md b/src/timezone/README.md index dd5d5f9892..811cddd2f7 100644 --- a/src/timezone/README.md +++ b/src/timezone/README.md @@ -79,15 +79,15 @@ fixed that.) 
includes relying on configure's results rather than hand-hacked #defines
(see private.h in particular).

-* Similarly, avoid relying on <stdint.h> features that may not exist on old
+* Similarly, avoid relying on stdint.h features that may not exist on old
systems. In particular this means using Postgres' definitions of the
int32 and int64 typedefs, not int_fast32_t/int_fast64_t. Likewise we
use PG_INT32_MIN/MAX not INT32_MIN/MAX. (Once we desupport all PG versions
-that don't require C99, it'd be practical to rely on <stdint.h> and remove
+that don't require C99, it'd be practical to rely on stdint.h and remove
this set of diffs; but that day is not yet.)

* Since Postgres is typically built on a system that has its own copy
-of the <time.h> functions, we must avoid conflicting with those. This
+of the time.h functions, we must avoid conflicting with those. This
mandates renaming typedef time_t to pg_time_t, and similarly for most
other exposed names.
diff --git a/src/timezone/tznames/README.md b/src/timezone/tznames/README.md
index 6d355e4616..55b98a3143 100644
--- a/src/timezone/tznames/README.md
+++ b/src/timezone/tznames/README.md
@@ -5,14 +5,14 @@ tznames

This directory contains files with timezone sets for PostgreSQL. The
problem is that time zone abbreviations are not unique throughout the world and you
-might find out that a time zone abbreviation in the `Default' set collides
+might find out that a time zone abbreviation in the 'Default' set collides
with the one you wanted to use. This can be fixed by selecting a timezone
set that defines the abbreviation the way you want it. There might already
be a file here that serves your needs. If not, you can create your own.

In order to use one of these files, you need to set

- timezone_abbreviations = 'xyz'
+    timezone_abbreviations = 'xyz'

in any of the usual ways for setting a parameter, where xyz is the
filename that contains the desired time zone abbreviations.
@@ -22,9 +22,9 @@ location supplied here, please report this to //
+https://cirrus-ci.com/github/< username >/< reponame >/

Hint: all build log files are uploaded to cirrus-ci and can be downloaded
from the "Artifacts" section from the cirrus-ci UI after clicking into a
@@ -74,7 +74,7 @@ When running a lot of tests in a repository, cirrus-ci's free credits do not
suffice. In those cases a repository can be configured to use other
infrastructure for running tests. To do so, the REPO_CI_CONFIG_GIT_URL
variable can be configured for the repository in the cirrus-ci web interface,
-at https://cirrus-ci.com/github/<user or organization>. The file referenced
+at https://cirrus-ci.com/github/< user or organization >. The file referenced
(see https://cirrus-ci.org/guide/programming-tasks/#fs) by the variable can
overwrite the default execution method for different operating systems,
defined in .cirrus.yml, by redefining the relevant yaml anchors.
diff --git a/src/tools/pg_bsd_indent/README.md b/src/tools/pg_bsd_indent/README.md
index 992d4fce61..f1cb900dce 100644
--- a/src/tools/pg_bsd_indent/README.md
+++ b/src/tools/pg_bsd_indent/README.md
@@ -72,7 +72,7 @@ University of California, Berkeley

What follows is the README file as maintained by FreeBSD indent.
----------
-
+```
$FreeBSD: head/usr.bin/indent/README 105244 2002-10-16 13:58:39Z charnier $

This is the C indenter, it originally came from the University of Illinois
@@ -171,4 +171,4 @@
regards..
oz cc: ccvaxa!willcox sun.com!jar uunet!rsalz - +``` diff --git a/src/tools/pgindent/README.md b/src/tools/pgindent/README.md index b649a21f59..c432e22a49 100644 --- a/src/tools/pgindent/README.md +++ b/src/tools/pgindent/README.md @@ -9,7 +9,7 @@ http://adpgtech.blogspot.com/2015/05/running-pgindent-on-non-core-code-or.html PREREQUISITES: - +-------------- 1) Install pg_bsd_indent in your PATH. Its source code is in the sibling directory src/tools/pg_bsd_indent; see the directions in that directory's README file. @@ -21,12 +21,15 @@ PREREQUISITES: To install, follow the usual install process for a Perl module ("man perlmodinstall" explains it). Or, if you have cpan installed, this should work: + cpan SHANCOCK/Perl-Tidy-20230309.tar.gz + Or if you have cpanm installed, you can just use: + cpanm https://cpan.metacpan.org/authors/id/S/SH/SHANCOCK/Perl-Tidy-20230309.tar.gz DOING THE INDENT RUN: - +--------------------- 1) Change directory to the top of the source tree. 2) Download the latest typedef file from the buildfarm: @@ -58,7 +61,7 @@ DOING THE INDENT RUN: cd ../../.. VALIDATION: - +----------- 1) Check for any newly-created files using "git status"; there shouldn't be any. (pgindent leaves *.BAK files behind if it has trouble, while perltidy leaves *.LOG files behind.) -- 2.32.1 (Apple Git-133)