== PostgreSQL Weekly News - July 13 2008 == - Mailing list pgsql-announce
From | David Fetter |
---|---|
Subject | == PostgreSQL Weekly News - July 13 2008 == |
Date | |
Msg-id | 20080714022124.GM14063@fetter.org |
List | pgsql-announce |
== PostgreSQL Weekly News - July 13 2008 ==

== PostgreSQL Product News ==

Open Technology Group has created a high-availability training course.
http://www.otg-nc.com/training-courses/coursedetail.php?courseid=65&cat_id=8

== PostgreSQL Jobs for July ==

http://archives.postgresql.org/pgsql-jobs/2008-07/threads.php

== PostgreSQL Local ==

The Call for Papers for European PGDay has begun.
http://www.pgday.org/en/call4papers

pgDay Portland is July 20, just before OSCON.
http://pugs.postgresql.org/node/400

PGCon Brazil 2008 will be on September 26-27 at Unicamp in Campinas.
http://pgcon.postgresql.org.br/index.en.html

PGDay.(IT|EU) 2008 will be October 17 and 18 in Prato.
http://www.pgday.org/it/

== PostgreSQL in the News ==

Planet PostgreSQL: http://www.planetpostgresql.org/

General Bits, Archives and occasional new articles:
http://www.varlena.com/GeneralBits/

PostgreSQL Weekly News is brought to you this week by David Fetter.

Submit news and announcements by Sunday at 3:00pm Pacific time. Please send English language ones to david@fetter.org, German language to pwn@pgug.de, Italian language to pwn@itpug.org.

== Applied Patches ==

Peter Eisentraut committed:

- In pgsql/doc/src/sgml/func.sgml, added documentation for function xmlagg.

- Allow binary-coercible types for cast function arguments and return types. Document return type of cast functions. Also change documentation to prefer the term "binary coercible" in its present sense instead of the previous term "binary compatible".

Tom Lane committed:

- Fix AT TIME ZONE (in all three variants) so that we first try to interpret the timezone argument as a timezone abbreviation, and only try it as a full timezone name if that fails. The zic database has four zones (CET, EET, MET, WET) that are full daylight-savings zones and yet have names that are the same as their abbreviations for standard time, resulting in ambiguity. In the timestamp input functions we resolve the ambiguity by preferring the abbreviation, and AT TIME ZONE should work the same way. (No functionality is lost because the zic database also has other names for these zones, eg Europe/Zurich.) Per gripe from Jaromir Talir. Backpatch to 8.1. Older releases did not have the issue because AT TIME ZONE only accepted abbreviations not zone names. (Thus, this patch also arguably fixes a compatibility botch introduced at 8.1: in ambiguous cases we now behave the same as 8.0 did.)

- In pgsql/src/backend/utils/adt/selfuncs.c, fix estimate_num_groups() to assume that GROUP BY expressions yielding boolean results always contribute two groups, regardless of the expression contents. This is very substantially more accurate than the regular heuristic for certain boolean tests like "col IS NULL". Per gripe from Sam Mason. Back-patch to all supported releases, since the behavior of estimate_num_groups() hasn't changed all that much since 7.4.

- In pgsql/src/backend/utils/error/elog.c, fix performance bug in write_syslog(): the code to preferentially break the log message at newlines cost O(N^2) for very long messages with few or no newlines. For messages in the megabyte range this became the dominant cost. Per gripe from Achilleas Mantzios. Patch all the way back, since this is a safe change with no portability risks. I am also thinking of increasing PG_SYSLOG_LIMIT, but that should be done separately.

- In pgsql/src/backend/utils/error/elog.c, increase PG_SYSLOG_LIMIT (the max line length sent to syslog()) from 128 to 1024 to improve performance when sending large elog messages. Also add a comment about why we use that number. Since this represents an externally visible behavior change, and might possibly result in portability issues, it seems best not to back-patch it.

- Fix mis-calculation of extParam/allParam sets for plan nodes, as seen in bug #4290. The fundamental bug is that masking extParam by outer_params, as finalize_plan had been doing, caused us to lose the information that an initPlan depended on the output of a sibling initPlan. On reflection the best thing to do seemed to be not to try to adjust outer_params for this case but get rid of it entirely. The only thing it was really doing for us was to filter out param IDs associated with SubPlan nodes, and that can be done (with greater accuracy) while processing individual SubPlan nodes in finalize_primnode. This approach was vindicated by the discovery that the masking method was hiding a second bug: SS_finalize_plan failed to remove extParam bits for initPlan output params that were referenced in the main plan tree (it only got rid of those referenced by other initPlans). It's not clear that this caused any real problems, given the limited use of extParam by the executor, but it's certainly not what was intended. I originally thought that there was also a problem with needing to include indirect dependencies on external params in initPlans' param sets, but it turns out that the executor handles this correctly so long as the depended-on initPlan is earlier in the initPlans list than the one using its output. That seems a bit of a fragile assumption, but it is true at the moment, so I just documented it in some code comments rather than making what would be rather invasive changes to remove the assumption. Back-patch to 8.1. Previous versions don't have the case of initPlans referring to other initPlans' outputs, so while the existing logic is still questionable for them, there are not any known bugs to be fixed. So I'll refrain from changing them for now.

- Tighten up SS_finalize_plan's computation of valid_params to exclude Params of the current query level that aren't in fact output parameters of the current initPlans. (This means, for example, output parameters of regular subplans.) To make this work correctly for output parameters coming from sibling initplans requires rejiggering the API of SS_finalize_plan just a bit: we need the siblings to be visible to it, rather than hidden as SS_make_initplan_from_plan had been doing. This is really part of my response to bug #4290, but I concluded this part probably shouldn't be back-patched, since all that it's doing is to make a debugging cross-check tighter.

- Add unchangeable GUC "variables" segment_size, wal_block_size, and wal_segment_size to make those configuration parameters available to clients, in the same way that block_size was previously exposed. Bernd Helmle, with comments from Abhijit Menon-Sen and some further tweaking by me.

- In pgsql/src/backend/utils/time/snapmgr.c, fix a few typos in comments and sort header inclusions alphabetically.

- Fix an oversight in the original implementation of performMultipleDeletions(): the alreadyDeleted list has to be passed down through deleteDependentObjects(), else objects that are deleted via auto/internal dependencies don't get reported back up to performMultipleDeletions(). Depending on the visitation order, this could cause the code to try to delete an already-deleted object, leading to strange errors in DROP OWNED (typically "cache lookup failed for relation NNNNN" or similar). Per bug #4289. Patch for back branches only. This code has recently been rewritten in HEAD, and doesn't have this particular bug anymore.

- Multi-column GIN indexes. Teodor Sigaev

- Const-ify the arguments of str_tolower() and friends to suppress compile warnings. Clean up various unneeded cruft that was left behind after creating those routines. Introduce some convenience functions str_tolower_z etc to eliminate tedious and error-prone double arguments in formatting.c. (Currently there seems no need to export the latter, but maybe reconsider this later.)

- In pgsql/src/include/pg_config_manual.h, don't make --enable-cassert turn on RANDOMIZE_ALLOCATED_MEMORY automatically; it's just too dang expensive. Per recent discussion, but I just got my nose rubbed in it again while doing some performance checking.

- More replacements of binary compatible to binary coercible.

- In pgsql/doc/src/sgml/ref/create_cast.sgml, fix a couple of stray misuses of "binary compatible".

- Clean up the use of some page-header-access macros: principally, use SizeOfPageHeaderData instead of sizeof(PageHeaderData) in places where that makes the code clearer, and avoid casting between Page and PageHeader where possible. Zdenek Kotala, with some additional cleanup by Heikki Linnakangas. I did not apply the parts of the proposed patch that would have resulted in slightly changing the on-disk format of hash indexes; it seems to me that's not a win as long as there's any chance of having in-place upgrade for 8.4.

- Change the PageGetContents() macro to guarantee its result is maxalign'd, thereby forestalling any problems with alignment of the data structure placed there. Since SizeOfPageHeaderData is maxalign'd anyway in 8.3 and HEAD, this does not actually change anything right now, but it is foreseeable that the header size will change again someday. I had to fix a couple of places that were assuming that the content offset is just SizeOfPageHeaderData rather than MAXALIGN(SizeOfPageHeaderData). Per discussion of Zdenek's page-macros patch.

- Create a type-specific typanalyze routine for tsvector, which collects stats on the most common individual lexemes in place of the mostly-useless default behavior of counting duplicate tsvectors. Future work: create selectivity estimation functions that actually do something with these stats. (Some other things we ought to look at doing: using the Lossy Counting algorithm in compute_minimal_stats, and using the element-counting idea for stats on regular arrays.) Jan Urbanski

Bruce Momjian committed:

- In pgsql/src/backend/utils/misc/guc.c, add comment for deadlock_timeout: "This is PGC_SIGHUP so all backends have the same value."

Neil Conway committed:

- In pgsql/src/backend/access/gin/README, minor improvements to the Gin internal documentation.

Heikki Linnakangas committed:

- In pgsql/contrib/pg_standby/pg_standby.c, fix WAL file cutoff point calculation in pg_standby. Patch by Simon Riggs, per bug report from Ferenc Felhoffer.

Alvaro Herrera committed:

- Make sure we only try to free snapshots that have been passed through CopySnapshot, per Neil Conway. Also add a comment about the assumption in GetSnapshotData that the argument is statically allocated. Also, fix some more typos in comments in snapmgr.c.

Teodor Sigaev committed:

- Add caching of query to GIN/GiST consistent function. Per performance gripe from nomao.com.

== Rejected Patches (for now) ==

No one was disappointed this week :-)

== Pending Patches ==

Heikki Linnakangas sent in a revision of the page macros cleanup.

Simon Riggs sent in a patch to change PGC_USERSET to PGC_SUSET for logging files.

Bernd Helmle sent in a patch which adds some missing descriptions for aggregates, functions and conversions.

Pavel Stehule, with feedback from Marko Kreen, sent in two more revisions of his table function support patch.

Ken Camann sent in a patch to get Postgres to compile under 64-bit Windows.

Jaime Casanova sent in another revision of his patch which makes granting INSERT on a table extend to any sequences attached.

Tom Lane sent in a revised version of David Wheeler's case-insensitive text patch.
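
As a rough illustration of the write_syslog() item under Applied Patches, the sketch below splits a long message into chunks of bounded length, breaking at a newline when one falls inside the current chunk. It is written in Python for brevity and is not the code in elog.c. The function name split_for_syslog is hypothetical, and the default limit of 1024 is simply the new PG_SYSLOG_LIMIT value mentioned above. The newline search is confined to the current window, so a megabyte message with no newlines is scanned roughly once overall rather than once per chunk.

```python
def split_for_syslog(message, limit=1024):
    """Split message into chunks of at most `limit` characters.

    Hypothetical helper, not PostgreSQL code. A chunk ends at the first
    newline found inside the current window; if the window holds no
    newline, the chunk is cut at the limit. Newlines are dropped.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")

    chunks = []
    pos = 0
    n = len(message)
    while pos < n:
        # Skip a newline at the start of a chunk instead of emitting
        # an empty chunk for it.
        if message[pos] == "\n":
            pos += 1
            continue
        window_end = min(pos + limit, n)
        # Search only inside the window, never to the end of the whole
        # message; this bounded search keeps the total cost linear.
        newline_at = message.find("\n", pos, window_end)
        end = newline_at if newline_at != -1 else window_end
        chunks.append(message[pos:end])
        pos = end
    return chunks
```

Searching the whole remainder of the message for the next newline on every iteration would give the same chunks, but for a message without newlines each search would walk to the end of the string, which is the quadratic pattern the commit message describes.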