== PostgreSQL Weekly News - April 01 2012 == - Mailing list pgsql-announce
From | David Fetter |
---|---|
Subject | == PostgreSQL Weekly News - April 01 2012 == |
Date | |
Msg-id | 20120402035408.GA23975@fetter.org Whole thread Raw |
List | pgsql-announce |
== PostgreSQL Weekly News - April 01 2012 == In a move that stunned the industry, Oracle has bought up all the PostgreSQL support companies including Bull, Command Prompt, Dalibo, EMC, EnterpriseDB, OmniTI, PostgreSQL Experts, Second Quadrant and VMware. CEO Larry Ellison remarked that the move was necessary in order to ensure that his next yacht would be larger rather than smaller than his previous one. == PostgreSQL Product News == psycopg2 2.4.5, a Python connector for PostgreSQL, released. http://initd.org/psycopg/articles/2012/03/29/psycopg-245-released/ == PostgreSQL Jobs for April == http://jobs.oracle.com/ == PostgreSQL Local == PGDay NYC will be held April 2, 2012 at Lighthouse International in New York City. http://pgday.nycpug.org London PostgreSQL Evening Meetup, 17 April 2012 http://www.meetup.com/London-PostgreSQL-Meetup-Group/ PGCon 2012 will be held 17-18 May 2012, in Ottawa at the University of Ottawa. It will be preceded by two days of tutorials on 15-16 May 2012. http://www.pgcon.org/2012/ PGDay France will be in Lyon on June 7, 2012. http://www.pgday.fr == PostgreSQL in the News == Planet PostgreSQL: http://planet.postgresql.org/ PostgreSQL Weekly News is brought to you this week by David Fetter Submit news and announcements by Sunday at 3:00pm Pacific time. Please send English language ones to david@fetter.org, German language to pwn@pgug.de, Italian language to pwn@itpug.org. Spanish language to pwn@arpug.com.ar. == Reviews == == Applied Patches == Robert Haas pushed: - Code cleanup for heap_freeze_tuple. It used to be case that lazy vacuum could call this function with only a shared lock on the buffer, but neither lazy vacuum nor any other code path does that any more. Simplify the code accordingly and clean up some related, obsolete comments. http://git.postgresql.org/pg/commitdiff/7386089d23c748af142ec7e3668fa0dd164eaf99 - New GUC, track_iotiming, to track I/O timings. Currently, the only way to see the numbers this gathers is via EXPLAIN (ANALYZE, BUFFERS), but the plan is to add visibility through the stats collector and pg_stat_statements in subsequent patches. Ants Aasma, reviewed by Greg Smith, with some further changes by me. http://git.postgresql.org/pg/commitdiff/40b9b957694cf7749c420c6c51a7e1d3c9b1fec1 - Expose track_iotiming information via pg_stat_statements. Ants Aasma, reviewed by Greg Smith, with very minor tweaks by me. http://git.postgresql.org/pg/commitdiff/5b4f346611431361339253203d486789e4babb02 - pg_test_timing utility, to measure clock monotonicity and timing cost. Ants Aasma, Greg Smith http://git.postgresql.org/pg/commitdiff/cee523867db29c0bfc5de7ec638ce0a4ad9b3817 - Doc fix for pg_test_timing. Fujii Masao http://git.postgresql.org/pg/commitdiff/9c272da8c49ac5531ceea57cd13d56dbdf761a82 - pg_basebackup: Error message improvements. Fujii Masao http://git.postgresql.org/pg/commitdiff/81f6bbe8ade8c90f23f9286ca9ca726d3e0e310f - pg_basebackup: Error handling fixes. Thomas Ogrisegg and Fujii Masao http://git.postgresql.org/pg/commitdiff/ada763cfcd0978cf95ebb0587f40a45b5cb57594 - Attempt to unbreak pg_test_timing on Windows. Per buildfarm, and Álvaro Herrera. http://git.postgresql.org/pg/commitdiff/7f63527c82177c73197789468384a064d9d750d9 - pg_test_timing: Lame hack to work around compiler warning. Fujii Masao, plus a comment by me. While I'm at it, correctly tabify this chunk of code. http://git.postgresql.org/pg/commitdiff/3f427c1353d92c41069197b31ea6c8b89f24cd11 Peter Eisentraut pushed: - Remove dead assignment found by Coverity http://git.postgresql.org/pg/commitdiff/dcb33b1c6417cbeaf2c7d24eece0edf7cc0adfa7 - Improve PL/Python database access function documentation. Organize the function descriptions as a list instead of running text, for easier access. http://git.postgresql.org/pg/commitdiff/206bec11bd9214873e3703898958789324480b1f - pg_dump: Small message adjustment for consistency http://git.postgresql.org/pg/commitdiff/dd024c22f1fff083065b42af555a3552721fabfd - Run maintainer-check on all PO files, not only configured ones. The intent is to allow configure --enable-nls=xx for installation speed and size, but have maintainer-check check all source files regardless. http://git.postgresql.org/pg/commitdiff/4e1c72079abcc160e84cdcd879f2dca2a6956dea - Tweak markup to avoid extra whitespace in man pages http://git.postgresql.org/pg/commitdiff/03f0c08f4730f3c80b65221749267b6436eac576 - initdb: Mark more messages for translation. Some Windows-only messages had apparently been forgotten so far. Also make the wording of the messages more consistent with similar messages other parts, such as pg_ctl and pg_regress. http://git.postgresql.org/pg/commitdiff/2005b77b8f336f4c2bd00185b4a63fe6582bf7a8 - Add new files to NLS file lists. Some of these are newly added, some are older and were forgotten, some don't contain any translatable strings right now but look like they could in the future. http://git.postgresql.org/pg/commitdiff/194b5ea3d0722f94e8a6ba9cec03b858cc8c9370 - Replace printf format %i by %d. see also ce8d7bb6440710058503d213b2aafcdf56a5b481 http://git.postgresql.org/pg/commitdiff/1d1361b6b77242f519c3cbb10fede4478ba6444f - pgxs: Supply default values for BISON and FLEX variables. Otherwise, the availability of these variables depends on what happened to be available at the time the PostgreSQL build was configured. http://git.postgresql.org/pg/commitdiff/6ca365bff28e52dc14500df84e7b3230d4c292eb - Fix recently introduced typo in NLS file lists http://git.postgresql.org/pg/commitdiff/5633df2582bedc93c2fb37555b7376eb51518bd5 - NLS: Seed Language field in PO header. Use msgmerge --lang option to seed the Language field, recently introduced by gettext, in the header of the new PO file. http://git.postgresql.org/pg/commitdiff/c2cc5c347440e48bc4d0e4ed083db2f3966a0e90 Tom Lane pushed: - Silence compiler warning about uninitialized variable. http://git.postgresql.org/pg/commitdiff/98316e211b60cb160247171e3557b40a247c4610 - Bend parse location rules for the convenience of pg_stat_statements. Generally, the parse location assigned to a multiple-token construct is the location of its leftmost token. This commit breaks that rule for the syntaxes TYPENAME 'LITERAL' and CAST(CONSTANT Alexander Shulgin TYPENAME) --- the resulting Const will have the location of the literal string, not the typename or CAST keyword. The cases where this matters are pretty thin on the ground (no error messages in the regression tests change, for example), and it's unlikely that any user would be confused anyway by an error cursor pointing at the literal. But still it's less than consistent. The reason for changing it is that contrib/pg_stat_statements wants to know the parse location of the original literal, and it was agreed that this is the least unpleasant way to preserve that information through parse analysis. Peter Geoghegan http://git.postgresql.org/pg/commitdiff/5d3fcc4c2e137417ef470d604fee5e452b22f6a7 - Add some infrastructure for contrib/pg_stat_statements. Add a queryId field to Query and PlannedStmt. This is not used by the core backend, except for being copied around at appropriate times. It's meant to allow plug-ins to track a particular query forward from parse analysis to execution. The queryId is intentionally not dumped into stored rules (and hence this commit doesn't bump catversion). You could argue that choice either way, but it seems better that stored rule strings not have any dependency on plug-ins that might or might not be present. Also, add a post_parse_analyze_hook that gets invoked at the end of parse analysis (but only for top-level analysis of complete queries, not cases such as analyzing a domain's default-value expression). This is mainly meant to be used to compute and assign a queryId, but it could have other applications. Peter Geoghegan http://git.postgresql.org/pg/commitdiff/a40fa613b516b97c37d87ac1b21fb7aa8a2f2c1b - Improve contrib/pg_stat_statements to lump "similar" queries together. pg_stat_statements now hashes selected fields of the analyzed parse tree to assign a "fingerprint" to each query, and groups all queries with the same fingerprint into a single entry in the pg_stat_statements view. In practice it is expected that queries with the same fingerprint will be equivalent except for values of literal constants. To make the display more useful, such constants are replaced by "?" in the displayed query strings. This mechanism currently supports only optimizable queries (SELECT, INSERT, UPDATE, DELETE). Utility commands are still matched on the basis of their literal query strings. There remain some open questions about how to deal with utility statements that contain optimizable queries (such as EXPLAIN and SELECT INTO) and how to deal with expiring speculative hashtable entries that are made to save the normalized form of a query string. However, fixing these issues should require only localized changes, and since there are other open patches involving contrib/pg_stat_statements, it seems best to go ahead and commit what we've got. Peter Geoghegan, reviewed by Daniel Farina http://git.postgresql.org/pg/commitdiff/7313cc016344a5705eb3e6916d8c4ea849c57975 - Improve handling of utility statements containing plannable statements. When tracking nested statements, contrib/pg_stat_statements formerly double-counted the execution costs of utility statements that directly contain an executable statement, such as EXPLAIN and DECLARE CURSOR. This was not obvious since the ProcessUtility and Executor hooks would each add their measured costs to the same stats table entry. However, with the new implementation that hashes utility and plannable statements differently, this showed up as seemingly-duplicate stats entries. Fix that by disabling the Executor hooks when the query has a queryId of zero, which was the case already for such statements but is now more clearly specified in the code. (The zero queryId was causing problems anyway because all such statements would add to a single bogus entry.) The PREPARE/EXECUTE case still results in counting the same execution in two different stats table entries, but it should be much less surprising to users that there are two entries in such cases. In passing, include a CommonTableExpr's ctename in the query hash. I had left it out originally on the grounds that we wanted to omit all inessential aliases, but since RTE_CTE RTEs are hashing their referenced names, we'd better hash the CTE names too to make sure we don't hash semantically different queries the same. http://git.postgresql.org/pg/commitdiff/e0e4ebe38469a777e3c585e1d97383f974c19b8c - Improve contrib/pg_stat_statements' handling of PREPARE/EXECUTE statements. It's actually more useful for the module to ignore these. Ignoring EXECUTE (and not incrementing the nesting level) allows the executor hooks to charge the time to the underlying prepared query, which shows up as a stats entry with the original PREPARE as query string (possibly modified by suppression of constants, which might not be terribly useful here but it's not worth avoiding). This is much more useful than cluttering the stats table with a distinct entry for each textually distinct EXECUTE. Experimentation with this idea shows that it's also preferable to ignore PREPARE. If we don't, we get two stats table entries, one with the query string hash and one with the jumble-derived hash, but with the same visible query string (modulo those constants). This is confusing and not very helpful, since the first entry will only receive costs associated with initial planning of the query, which is not something counted at all normally by pg_stat_statements. (And if we do start tracking planning costs, we'd want them blamed on the other hash table entry anyway.) http://git.postgresql.org/pg/commitdiff/566a1d43cf6bfcc7f9385b581d98e07eab282cdd - Fix dblink's failure to report correct connection name in error messages. The DBLINK_GET_CONN and DBLINK_GET_NAMED_CONN macros did not set the surrounding function's conname variable, causing errors to be incorrectly reported as having occurred on the "unnamed" connection in some cases. This bug was actually visible in two cases in the regression tests, but apparently whoever added those cases wasn't paying attention. Noted by Kyotaro Horiguchi, though this is different from his proposed patch. Back-patch to 8.4; 8.3 does not have the same type of error reporting so the patch is not relevant. http://git.postgresql.org/pg/commitdiff/b75fbe91910df323a8d3e1d92a8bb4dd0d5e88a9 - Add PGDLLIMPORT to ScanKeywords and NumScanKeywords. Per buildfarm, this is now needed by contrib/pg_stat_statements. http://git.postgresql.org/pg/commitdiff/5e83854d71bb05403768a97a415a129b0081564b - Fix glitch recently introduced in psql tab completion. Over-optimization (by me, looks like :-() broke the case of recognizing a word boundary just before a quoted identifier. Reported and diagnosed by Dean Rasheed. http://git.postgresql.org/pg/commitdiff/a52e6fe7bcf86f7e52d7b1d6f59260cb57b565fa - Rename frontend keyword arrays to avoid conflict with backend. ecpg and pg_dump each contain keyword arrays with structure similar to the backend's keyword array. Up to now, we actually named those arrays the same as the backend's and relied on parser/keywords.h to declare them. This seems a tad too cute, though, and it breaks now that we need to PGDLLIMPORT-decorate the backend symbols. Rename to avoid the problem. Per buildfarm. (It strikes me that maybe we should get rid of the separate keywords.c files altogether, and just define these arrays in the modules that use them, but that's a rather more invasive change.) http://git.postgresql.org/pg/commitdiff/c252a17d828756e2f7d635f69eace53aaf983420 - Fix O(N^2) behavior in pg_dump for large numbers of owned sequences. The loop that matched owned sequences to their owning tables required time proportional to number of owned sequences times number of tables; although this work was only expended in selective-dump situations, which is probably why the issue wasn't recognized long since. Refactor slightly so that we can perform this work after the index array for findTableByOid has been set up, reducing the time to O(M log N). Per gripe from Mike Roest. Since this is a longstanding performance bug, backpatch to all supported versions. http://git.postgresql.org/pg/commitdiff/0d8117abefdae69dbec7465adf2c68f5cd0412ac - Fix O(N^2) behavior in pg_dump when many objects are in dependency loops. Combining the loop workspace with the record of already-processed objects might have been a cute trick, but it behaves horridly if there are many dependency loops to repair: the time spent in the first step of findLoop() grows as O(N^2). Instead use a separate flag array indexed by dump ID, which we can check in constant time. The length of the workspace array is now never more than the actual length of a dependency chain, which should be reasonably short in all cases of practical interest. The code is noticeably easier to understand this way, too. Per gripe from Mike Roest. Since this is a longstanding performance bug, backpatch to all supported versions. http://git.postgresql.org/pg/commitdiff/d5881c036a913d31a5b0f56519cce76ca3b3e587 Andrew Dunstan pushed: - Remove now redundant pgpipe code. http://git.postgresql.org/pg/commitdiff/d2c1740dc275543a46721ed254ba3623f63d2204 - Unbreak Windows builds broken by pgpipe removal. http://git.postgresql.org/pg/commitdiff/aeca65022609282f004e8332910938abd4eefbd6 Heikki Linnakangas pushed: - Inherit max_safe_fds to child processes in EXEC_BACKEND mode. Postmaster sets max_safe_fds by testing how many open file descriptors it can open, and that is normally inherited by all child processes at fork(). Not so on EXEC_BACKEND, ie. Windows, however. Because of that, we effectively ignored max_files_per_process on Windows, and always assumed a conservative default of 32 simultaneous open files. That could have an impact on performance, if you need to access a lot of different files in a query. After this patch, the value is passed to child processes by save/restore_backend_variables() among many other global variables. It has been like this forever, but given the lack of complaints about it, I'm not backpatching this. http://git.postgresql.org/pg/commitdiff/5762a4d9098ac0cba789ddd26286ac85c2d316f2 Simon Riggs pushed: - Correct epoch of txid_current() when executed on a Hot Standby server. Initialise ckptXidEpoch from starting checkpoint and maintain the correct value as we roll forwards. This allows GetNextXidAndEpoch() to return the correct epoch when executed during recovery. Backpatch to 9.0 when the problem is first observable by a user. Bug report from Daniel Farina http://git.postgresql.org/pg/commitdiff/68219aaf6b8c6214e5f4a6124bd5607254aad6b3 == Rejected Patches (for now) == No one was disappointed this week :-) == Pending Patches == Kyotaro HORIGUCHI sent in two more revisions of the patch to create a new tuple storage format for libpq and use same in dblink. Shigeru HANADA sent in two more revisions of the patch to add a PostgreSQL FDW along with infrastructure for same. Peter Eisentraut and Alexander Shulgin traded patches to add a URI format for connection strings in libpq. Fujii Masao sent in two revisions of a patch to make pg_basebackup exit on error. Ants Aasma sent in a patch to use lazy hash aggregation to speed up cases where no actual aggregates are used. Dimitri Fontaine sent in two more revisions of the patch to add finer dependencies for EXTENSIONs. Marco Nenciarini sent in another revision of the patch to allow each element of an array to be an enforced foreign key reference. Peter Eisentraut sent in a patch to fix some infelicities between pgxs, bison and flexx. Andrew Dunstan and Joachim Wieland traded patches to implement parallel pg_dump. Zoltan Boszormenyi sent in two more revisions of the ECPG FETCH readahead patch. Daniel Farina sent in another revision of the patch to allow same-role pg_terminate_backend. Pavel Stehule sent in another revision of the CHECK TRIGGER functionality for PL/pgsql. Peter Eisentraut sent in a patch which reverts the default capitalization behavior in psql's tab completion to that prior to a previous patch while expanding the tunability of that capitalization with tab completion. Robert Haas sent in two patches to measure lwlock-related latency spikes. Heikki Linnakangas sent in a patch to set stack_base_ptr in autovacuum.
pgsql-announce by date: