== PostgreSQL Weekly News - July 13 2008 == - Mailing list pgsql-announce

From David Fetter
Subject == PostgreSQL Weekly News - July 13 2008 ==
Msg-id 20080714022124.GM14063@fetter.org
List pgsql-announce
== PostgreSQL Weekly News - July 13 2008 ==

== PostgreSQL Product News ==

Open Technology Group has created a high-availability training course.
http://www.otg-nc.com/training-courses/coursedetail.php?courseid=65&cat_id=8

== PostgreSQL Jobs for July ==

http://archives.postgresql.org/pgsql-jobs/2008-07/threads.php

== PostgreSQL Local ==

The Call for Papers for European PGDay has begun.
http://www.pgday.org/en/call4papers

pgDay Portland is July 20, just before OSCON.
http://pugs.postgresql.org/node/400

PGCon Brazil 2008 will be on September 26-27 at Unicamp in Campinas.
http://pgcon.postgresql.org.br/index.en.html

PGDay.(IT|EU) 2008 will be October 17 and 18 in Prato.
http://www.pgday.org/it/

== PostgreSQL in the News ==

Planet PostgreSQL: http://www.planetpostgresql.org/

General Bits, Archives and occasional new articles:
http://www.varlena.com/GeneralBits/

PostgreSQL Weekly News is brought to you this week by David Fetter.

Submit news and announcements by Sunday at 3:00pm Pacific time.
Please send English-language items to david@fetter.org, German-language
items to pwn@pgug.de, and Italian-language items to pwn@itpug.org.

== Applied Patches ==

Peter Eisentraut committed:

- In pgsql/doc/src/sgml/func.sgml, added documentation for function
  xmlagg.

- Allow binary-coercible types for cast function arguments and return
  types.  Document return type of cast functions.  Also change
  documentation to prefer the term "binary coercible" in its present
  sense instead of the previous term "binary compatible".

Tom Lane committed:

- Fix AT TIME ZONE (in all three variants) so that we first try to
  interpret the timezone argument as a timezone abbreviation, and only
  try it as a full timezone name if that fails.  The zic database has
  four zones (CET, EET, MET, WET) that are full daylight-savings zones
  and yet have names that are the same as their abbreviations for
  standard time, resulting in ambiguity.  In the timestamp input
  functions we resolve the ambiguity by preferring the abbreviation,
  and AT TIME ZONE should work the same way.  (No functionality is
  lost because the zic database also has other names for these zones,
  e.g. Europe/Zurich.)  Per gripe from Jaromir Talir.  Back-patch to
  8.1.  Older releases did not have the issue because AT TIME ZONE only
  accepted abbreviations, not zone names.  (Thus, this patch also
  arguably fixes a compatibility botch introduced at 8.1: in ambiguous
  cases we now behave the same as 8.0 did.)

- In pgsql/src/backend/utils/adt/selfuncs.c, fix estimate_num_groups()
  to assume that GROUP BY expressions yielding boolean results always
  contribute two groups, regardless of the expression contents.  This
  is very substantially more accurate than the regular heuristic for
  certain boolean tests like "col IS NULL".  Per gripe from Sam Mason.
  Back-patch to all supported releases, since the behavior of
  estimate_num_groups() hasn't changed all that much since 7.4.

- In pgsql/src/backend/utils/error/elog.c, fix performance bug in
  write_syslog(): the code to preferentially break the log message at
  newlines cost O(N^2) for very long messages with few or no newlines.
  For messages in the megabyte range this became the dominant cost.
  Per gripe from Achilleas Mantzios.  Patch all the way back, since
  this is a safe change with no portability risks.  I am also thinking
  of increasing PG_SYSLOG_LIMIT, but that should be done separately.

- In pgsql/src/backend/utils/error/elog.c, increase PG_SYSLOG_LIMIT
  (the max line length sent to syslog()) from 128 to 1024 to improve
  performance when sending large elog messages.  Also add a comment
  about why we use that number.  Since this represents an externally
  visible behavior change, and might possibly result in portability
  issues, it seems best not to back-patch it.

- Fix mis-calculation of extParam/allParam sets for plan nodes, as
  seen in bug #4290.  The fundamental bug is that masking extParam by
  outer_params, as finalize_plan had been doing, caused us to lose the
  information that an initPlan depended on the output of a sibling
  initPlan.  On reflection the best thing to do seemed to be not to
  try to adjust outer_params for this case but get rid of it entirely.
  The only thing it was really doing for us was to filter out param
  IDs associated with SubPlan nodes, and that can be done (with
  greater accuracy) while processing individual SubPlan nodes in
  finalize_primnode.  This approach was vindicated by the discovery
  that the masking method was hiding a second bug: SS_finalize_plan
  failed to remove extParam bits for initPlan output params that were
  referenced in the main plan tree (it only got rid of those
  referenced by other initPlans).  It's not clear that this caused any
  real problems, given the limited use of extParam by the executor,
  but it's certainly not what was intended.  I originally thought that
  there was also a problem with needing to include indirect
  dependencies on external params in initPlans' param sets, but it
  turns out that the executor handles this correctly so long as the
  depended-on initPlan is earlier in the initPlans list than the one
  using its output.  That seems a bit of a fragile assumption, but it
  is true at the moment, so I just documented it in some code comments
  rather than making what would be rather invasive changes to remove
  the assumption.  Back-patch to 8.1.  Previous versions don't have
  the case of initPlans referring to other initPlans' outputs, so
  while the existing logic is still questionable for them, there are
  not any known bugs to be fixed.  So I'll refrain from changing them
  for now.

- Tighten up SS_finalize_plan's computation of valid_params to exclude
  Params of the current query level that aren't in fact output
  parameters of the current initPlans.  (This means, for example,
  output parameters of regular subplans.)  Making this work correctly
  for output parameters coming from sibling initPlans requires
  rejiggering the API of SS_finalize_plan just a bit: we need the
  siblings to be visible to it, rather than hidden as
  SS_make_initplan_from_plan had been doing.  This is really part of
  my response to bug #4290, but I concluded this part probably
  shouldn't be back-patched, since all that it's doing is to make a
  debugging cross-check tighter.

- Add unchangeable GUC "variables" segment_size, wal_block_size, and
  wal_segment_size to make those configuration parameters available to
  clients, in the same way that block_size was previously exposed.
  Bernd Helmle, with comments from Abhijit Menon-Sen and some further
  tweaking by me.

- In pgsql/src/backend/utils/time/snapmgr.c, fix a few typos in
  comments and sort header inclusions alphabetically.

- Fix an oversight in the original implementation of
  performMultipleDeletions(): the alreadyDeleted list has to be passed
  down through deleteDependentObjects(), else objects that are deleted
  via auto/internal dependencies don't get reported back up to
  performMultipleDeletions().  Depending on the visitation order, this
  could cause the code to try to delete an already-deleted object,
  leading to strange errors in DROP OWNED (typically "cache lookup
  failed for relation NNNNN" or similar).  Per bug #4289.  Patch for
  back branches only.  This code has recently been rewritten in HEAD,
  and doesn't have this particular bug anymore.

- Add support for multi-column GIN indexes.  Teodor Sigaev

- Const-ify the arguments of str_tolower() and friends to suppress
  compile warnings.  Clean up various unneeded cruft that was left
  behind after creating those routines.  Introduce some convenience
  functions (str_tolower_z, etc.) to eliminate tedious and error-prone
  double arguments in formatting.c.  (Currently there seems no need to
  export the latter, but maybe reconsider this later.)

- In pgsql/src/include/pg_config_manual.h, don't make --enable-cassert
  turn on RANDOMIZE_ALLOCATED_MEMORY automatically; it's just too dang
  expensive.  Per recent discussion, but I just got my nose rubbed in
  it again while doing some performance checking.

- Replace more occurrences of "binary compatible" with "binary
  coercible".

- In pgsql/doc/src/sgml/ref/create_cast.sgml, fix a couple of stray
  misuses of "binary compatible".

- Clean up the use of some page-header-access macros: principally, use
  SizeOfPageHeaderData instead of sizeof(PageHeaderData) in places
  where that makes the code clearer, and avoid casting between Page
  and PageHeader where possible.  Zdenek Kotala, with some additional
  cleanup by Heikki Linnakangas.  I did not apply the parts of the
  proposed patch that would have resulted in slightly changing the
  on-disk format of hash indexes; it seems to me that's not a win as
  long as there's any chance of having in-place upgrade for 8.4.

- Change the PageGetContents() macro to guarantee its result is
  maxalign'd, thereby forestalling any problems with alignment of the
  data structure placed there.  Since SizeOfPageHeaderData is
  maxalign'd anyway in 8.3 and HEAD, this does not actually change
  anything right now, but it is foreseeable that the header size will
  change again someday.  I had to fix a couple of places that were
  assuming that the content offset is just SizeOfPageHeaderData rather
  than MAXALIGN(SizeOfPageHeaderData).  Per discussion of Zdenek's
  page-macros patch.

- Create a type-specific typanalyze routine for tsvector, which
  collects stats on the most common individual lexemes in place of the
  mostly-useless default behavior of counting duplicate tsvectors.
  Future work: create selectivity estimation functions that actually
  do something with these stats.  (Some other things we ought to look
  at doing: using the Lossy Counting algorithm in
  compute_minimal_stats, and using the element-counting idea for stats
  on regular arrays.) Jan Urbanski
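
As a rough illustration of the tsvector typanalyze item above, here is a
minimal Python sketch (not the actual C implementation) contrasting the
two kinds of statistics.  The function names and the sample documents
are hypothetical; each document is modelled as a plain set of lexemes,
and exact counting is used rather than any approximate algorithm.

```python
from collections import Counter


def most_common_vectors(docs, n):
    """Count whole documents that repeat exactly (default-style stats)."""
    return Counter(frozenset(d) for d in docs).most_common(n)


def most_common_lexemes(docs, n):
    """Count, for each lexeme, how many documents contain it."""
    counts = Counter()
    for d in docs:
        # set() guards against a lexeme being counted twice per document
        counts.update(set(d))
    return counts.most_common(n)


# Hypothetical sample data: four small documents.
docs = [
    {"postgres", "index", "gin"},
    {"postgres", "vacuum"},
    {"postgres", "index", "btree"},
    {"planner", "index"},
]

if __name__ == "__main__":
    print(most_common_vectors(docs, 2))
    print(most_common_lexemes(docs, 2))
```

With this sample data every whole vector occurs exactly once, so the
per-vector counts say almost nothing, while the per-lexeme counts show
that "postgres" and "index" each appear in three of the four documents.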

Bruce Momjian committed:

- In pgsql/src/backend/utils/misc/guc.c, add comment for deadlock_timeout:
  "This is PGC_SIGHUP so all backends have the same value."

Neil Conway committed:

- In pgsql/src/backend/access/gin/README, minor improvements to the
  Gin internal documentation.

Heikki Linnakangas committed:

- In pgsql/contrib/pg_standby/pg_standby.c, fix WAL file cutoff point
  calculation in pg_standby.  Patch by Simon Riggs, per bug report
  from Ferenc Felhoffer.

Alvaro Herrera committed:

- Make sure we only try to free snapshots that have been passed
  through CopySnapshot, per Neil Conway.  Also add a comment about the
  assumption in GetSnapshotData that the argument is statically
  allocated.  Also, fix some more typos in comments in snapmgr.c.

Teodor Sigaev committed:

- Add caching of the query to the GIN/GiST consistent function.  Per
  performance gripe from nomao.com.

== Rejected Patches (for now) ==

No one was disappointed this week :-)

== Pending Patches ==

Heikki Linnakangas sent in a revision of the page macros cleanup.

Simon Riggs sent in a patch to change PGC_USERSET to PGC_SUSET for
logging files.

Bernd Helmle sent in a patch which adds some missing descriptions for
aggregates, functions and conversions.

Pavel Stehule, with feedback from Marko Kreen, sent in two more
revisions of his table function support patch.

Ken Camann sent in a patch to get Postgres to compile under 64-bit
Windows.

Jaime Casanova sent in another revision of his patch which makes
granting INSERT on a table extend to any sequences attached to it.

Tom Lane sent in a revised version of David Wheeler's case-insensitive
text patch.

