== Postgres Weekly News - March 09 2008 == - Mailing list pgsql-announce

From David Fetter
Subject == Postgres Weekly News - March 09 2008 ==
Msg-id 20080310071652.GM23803@fetter.org
List pgsql-announce
== Postgres Weekly News - March 09 2008 ==

== Postgres Product News ==

Benetl 1.6 for Windows released.
http://www.benetl.net

Continuent uni/cluster for PostgreSQL 2007.1 Update 2 released.
http://www.continuent.com

Image2db 2.2 released.
http://www.vive.net/products/image2db.htm

PL/Ruby 0.5.3 released.
http://raa.ruby-lang.org/project/pl-ruby/

ptop 3.6.1 released.
http://ptop.projects.postgresql.org/

== Postgres Jobs for March ==

http://archives.postgresql.org/pgsql-jobs/2008-03/threads.php

== Postgres Local ==

Utah Open Source Conference 2008's CfP is open through June 1.
This 2nd annual conference is August 28-30, 2008 in Salt Lake City, UT.
http://2008.utosc.com/

Atlanta PUG's first meeting will be March 11, 6:30pm.
http://pugs.postgresql.org/atlpug

Sun Coast PUG's first meeting will be March 11, 2008.
http://pugs.postgresql.org/spug

LAPUG will be meeting March 28, 7:00pm in the City of Garden Grove
Training Room.
http://pugs.postgresql.org/lapug

PG UK day will be April 2 in Birmingham.
http://www.postgresql.org.uk/

PGCon 2008 will be May 20-23 in Ottawa.
http://www.pgcon.org/2008/

PostgreSQL Conference East '08 talks are March 29 and 30 at the
University of Maryland, College Park.
http://www.postgresqlconference.org/

FISL 9.0 will be April 17-19 at PUCRS in Porto Alegre, RS, Brazil.
https://fisl.softwarelivre.org/9.0/

== Postgres in the News ==

Planet PostgreSQL: http://www.planetpostgresql.org/

General Bits, Archives and occasional new articles:
http://www.varlena.com/GeneralBits/

Postgres Weekly News is brought to you this week by David Fetter
and Devrim GUNDUZ.

Submit news and announcements by Sunday at 3:00pm Pacific time.
Please send English language ones to david@fetter.org, German language
to pwn@pgug.de, Italian language to pwn@itpug.org.

== Applied Patches ==

Bruce Momjian committed:

- Add URLs for sequence discussions to TODO.

- Add information on (un)subscribing to mailing lists to the FAQ.

- Fix markup in FAQ.

- In pgsql/doc/src/sgml/func.sgml, document that the null byte is not
  supported and explain why.

- In pgsql/doc/src/sgml/func.sgml, remove the word "reliably" from
  the above explanation.

- Add to TODO: "Speed WAL recovery by allowing more than one page to
  be prefetched."

- Add to TODO: "Allow the UUID type to accept non-standard formats."

- In pgsql/doc/src/sgml/ref/revoke.sgml, document that REVOKE doesn't
  remove all permissions if PUBLIC has permissions.

- Add to TODO: "Add another URL for: 'Consider using a ring buffer for
  COPY FROM'"

- Add to TODO: "Allow INSERT ... DELETE ... RETURNING, namely allow
  the DELETE ... RETURNING to supply values to the INSERT"

- Update Japanese FAQ.  Jun Kuwamura.

- Add to TODO: "ideas for concurrent pg_dump and pg_restore."

- Add URL for TODO: "Allow pg_restore to utilize multiple CPUs and I/O
  channels by restoring multiple objects simultaneously."

- Remove TODO: "To better utilize resources, restore data, primary
  keys, and indexes for a single table before restoring the next
  table."

- In pgsql/doc/src/sgml/textsearch.sgml, show example of ts_headline()
  using a configuration name.

- Update TODO to read: "The difficulty with this is getting multiple
  dump processes to produce a single dump output file.  It also would
  require several sessions to share the same snapshot."

- When a text search string is too long, report the actual and maximum
  allowed number of bytes in the error message.

- Add to TODO: "Allow COPY FROM to create index entries in bulk."

- Add URL for TODO: "Add SQL:2003 WITH RECURSIVE (hierarchical)
  queries to SELECT."

- Add URL for TODO: "Add support for SQL-standard GENERATED/IDENTITY
  columns."

- In pgsql/doc/src/sgml/config.sgml, document that increasing the
  number of checkpoint segments or the checkpoint timeout can increase
  the time needed for crash recovery, per suggestion from Simon Riggs.

- In pgsql/README, update libpqxx URL in top-level README, per Gurjeet
  Singh.

- Break out referential integrity and server-side languages into
  separate TODO categories.

- Add to TODO: "Have CONSTRAINT cname NOT NULL preserve the constraint
  name."

- Move client encoding libpq function docs into libpq doc section, and
  just reference them from the localization doc section.  Backpatch to
  8.3.X.

- In pgsql/src/backend/utils/misc/guc.c, improve "bgwriter_lru_multiplier"
  GUC description.

- Add to TODO: "Prevent malicious functions from being executed with
  the permissions of unsuspecting users."

- Add to TODO: "Prevent escape string warnings when object names have
  backslashes."

- Add to TODO: "Reduce memory usage of aggregates in set returning
  functions."

- Document use of pg_locks.objid for advisory locks, suggestion from
  Marc Mamin.

- Add to TODO: "Allow client certificate names to be checked against
  the client hostname."

- In pgsql/doc/src/sgml/installation.sgml, document that enabling
  asserts can _significantly_ slow down the server.  Back patch to
  8.3.X.

- Add URL for TODO: "Add SQL:2003 WITH RECURSIVE (hierarchical)
  queries to SELECT."

- Add URL for TODO: "Consider compressing indexes by storing key
  values duplicated in several rows as a single index entry."

- Add to TODO: "Have \d show foreign keys that reference a table's
  primary key." and "Have \d show child tables that inherit from the
  specified parent."

- Add to TODO: "Require all CHECK constraints to be inherited."

- In pgsql/doc/src/sgml/backup.sgml, clarify PITR doc wording.

- Add to TODO: "Add comments on system tables/columns using the
  information in catalogs.sgml."

- Add to TODO: "Have \l+ show database size, if permissions allow."

- Add to TODO: "Add SQLSTATE severity to PGconn return status."

- Add URL for TODO: "Allow multiple identical NOTIFY events to always
  be communicated to the client, rather than sent as a single
  notification to the listener."

- Add to TODO: "Store per-table autovacuum settings in
  pg_class.reloptions."

- Add to TODO: "Improve referential integrity checks."

- Add to TODO: "Consider allowing higher priority queries to have
  referenced buffer cache pages stay in memory longer."

- Add to TODO: "Allow text search dictionary to filter out only stop
  words."

- Add to TODO: "Prevent autovacuum from running if an old transaction
  is still running from the last vacuum."

- Add to TODO: "Add a function like pg_get_indexdef() that reports
  more detailed index information."

- Add to TODO: "Consider a function-based API for '@@' full text
  searches."

Magnus Hagander committed:

- In pgsql/src/test/regress/pg_regress.c, use Windows DACL fix for
  pg_regress as well.  Dave Page.

Tom Lane committed:

- Fix PREPARE TRANSACTION to reject the case where the transaction has
  dropped a temporary table; we can't support that because there's no
  way to clean up the source backend's internal state if the eventual
  COMMIT PREPARED is done by another backend.  This was checked
  correctly in 8.1 but I broke it in 8.2 :-(.  Patch by Heikki
  Linnakangas, original trouble report by John Smith.

- In pgsql/src/interfaces/libpq/Makefile, include -lgss in libpq link,
  if available.  Bjorn Munch.

- In pgsql/src/backend/utils/cache/catcache.c, in
  PrepareToInvalidateCacheTuple, don't force initialization of catalog
  caches that we don't actually need to touch.  This saves some
  trivial number of cycles and avoids certain cases of deadlock when
  doing concurrent VACUUM FULL on system catalogs.  Per report from
  Gavin Roy.  Backpatch to 8.2.  In earlier versions,
  CatalogCacheInitializeCache didn't lock the relation so there's no
  deadlock risk (though that certainly had plenty of risks of its
  own).

- In pgsql/src/backend/access/hash/hashscan.c, change hashscan.c to
  keep its list of active hash index scans in TopMemoryContext, rather
  than scattered through executor per-query contexts.  This poses no
  danger of memory leak since the ResourceOwner mechanism guarantees
  release of no-longer-needed items.  It is needed because the
  per-query context might already be released by the time we try to
  clean up the hash scan list.  Report by ykhuang, diagnosis by
  Heikki.  Back-patch to 8.0, where the ResourceOwner-based cleanup
  was introduced.  The given test case does not fail before 8.2,
  probably because we rearranged transaction abort processing somehow;
  but this coding is undoubtedly risky so I'll patch 8.0 and 8.1
  anyway.

- This patch addresses some issues in TOAST compression strategy that
  were discussed last year, but we felt it was too late in the 8.3
  cycle to change the code immediately.  Specifically, the patch:

  - Reduces the minimum datum size to be considered for compression
    from 256 to 32 bytes, as suggested by Greg Stark.

  - Increases the required compression rate for compressed storage
    from 20% to 25%, again per Greg's suggestion.

  - Replaces force_input_size (size above which compression is
    forced) with a maximum size to be considered for compression.  It
    was agreed that allowing large inputs to escape the
    minimum-compression-rate requirement was not bright, and that
    indeed we'd rather have a knob that acted in the other direction.
    I set this value to 1MB for the moment, but it could use some
    performance studies to tune it.

  - Adds an early-failure path to the compressor as suggested by Jan:
    if it's been unable to find even one compressible substring in
    the first 1KB (parameterizable), assume we're looking at
    incompressible input and give up.  (Possibly this logic can be
    improved, but I'll commit it as-is for now.)

  - Improves the toasting heuristics so that when we have very large
    fields with attstorage 'x' or 'e', we will push those out to
    toast storage before considering inline compression of shorter
    fields.  This also responds to a suggestion of Greg's, though my
    original proposal for a solution was a bit off base because it
    didn't fix the problem for large 'e' fields.

  There was some discussion in the earlier threads of exposing some of
  the compression knobs to users, perhaps even on a per-column basis.
  I have not done anything about that here.  It seems to me that if we
  are changing around the parameters, we'd better get some experience
  and be sure we are happy with the design before we set things in
  stone by providing user-visible knobs.

- Improve pglz_decompress() so that it cannot clobber memory beyond
  the available output buffer when presented with corrupt input.  Some
  testing suggests that this slows the decompression loop about 1%,
  which seems an acceptable price to pay for more robustness.
  (Curiously, the penalty seems to be *less* on not-very-compressible
  data, which I didn't expect since the overhead per output byte ought
  to be more in the literal-bytes path.) Patch from Zdenek Kotala.  I
  fixed a corner case and did some renaming of variables to make the
  routine more readable.

- Refactor heap_page_prune so that instead of changing item states
  on-the-fly, it accumulates the set of changes to be made and then
  applies them.  It had to accumulate the set of changes anyway to
  prepare a WAL record for the pruning action, so this isn't an
  enormous change; the only new complexity is to not doubly mark
  tuples that are visited twice in the scan.  The main advantage is
  that we can substantially reduce the scope of the critical section
  in which the changes are applied, thus avoiding PANIC in foreseeable
  cases like running out of memory in inval.c.  A nice secondary
  advantage is that it is now far clearer that WAL replay will
  actually do the same thing that the original pruning did.  This
  commit doesn't do anything about the open problem that
  CacheInvalidateHeapTuple doesn't have the right semantics for a CTID
  change caused by collapsing out a redirect pointer.  But whatever we
  do about that, it'll be a good idea to not do it inside a critical
  section.

- Modify prefix_selectivity() so that it will never estimate the
  selectivity of the generated range condition var >= 'foo' AND var <
  'fop' as being less than what eqsel() would estimate for var =
  'foo'.  This is intuitively reasonable and it gets rid of the need
  for some entirely ad-hoc coding we formerly used to reject bogus
  estimates.  The basic problem here is that if the prefix is more
  than a few characters long, the two boundary values are too close
  together to be distinguishable by comparison to the column
  histogram, resulting in a selectivity estimate of zero, which is
  often not very sane.  Change motivated by an example from Peter
  Eisentraut.  Arguably this is a bug fix, but I'll refrain from
  back-patching it for the moment.

- Change patternsel() so that instead of switching from a pure
  pattern-examination heuristic method to purely histogram-driven
  selectivity at histogram size 100, we compute both estimates and use
  a weighted average.  The weight put on the heuristic estimate
  decreases linearly with histogram size, dropping to zero for 100 or
  more histogram entries.  Likewise in ltreeparentsel().  After a
  patch by Greg Stark, though I reorganized the logic a bit to give
  the caller of histogram_selectivity() more control.

- Remove postmaster.c's check that NBuffers is at least twice
  MaxBackends.  With the addition of multiple autovacuum workers, our
  choices were to delete the check, document the interaction with
  autovacuum_max_workers, or complicate the check to try to hide that
  interaction.  Since this restriction has never been adequate to
  ensure backends can't run out of pinnable buffers, it doesn't really
  have enough excuse to live to justify the second or third choices.
  Per discussion of a complaint from Andreas Kling (see also bug
  #3888).  This commit also removes several documentation references
  to this restriction, but I'm not sure I got them all.
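
The two selectivity changes in the list above (prefix_selectivity()
and patternsel()) boil down to a clamp and a weighted average.  The
following Python is only a rough sketch of that arithmetic, not the
planner's C code.  The function names and parameters are hypothetical,
and the exact ramp (100 - n) / 100 is an assumption; the commit
message only says the heuristic weight decreases linearly and reaches
zero at 100 histogram entries.

```python
def clamp_prefix_selectivity(range_sel, equality_sel):
    """Clamp the estimate for var >= 'foo' AND var < 'fop'.

    The range estimate is never allowed to fall below what an equality
    test on the prefix itself (var = 'foo') would be estimated at.
    """
    return max(range_sel, equality_sel)


def blend_pattern_selectivity(heuristic_sel, histogram_sel, hist_size,
                              full_trust_size=100):
    """Weighted average of the heuristic and histogram estimates.

    Assumption: the heuristic weight is (full_trust_size - hist_size)
    / full_trust_size, so it falls linearly and is zero once the
    histogram has full_trust_size or more entries.
    """
    if hist_size >= full_trust_size:
        return histogram_sel
    heuristic_weight = (full_trust_size - hist_size) / full_trust_size
    return (heuristic_weight * heuristic_sel
            + (1.0 - heuristic_weight) * histogram_sel)
```

With a long prefix the histogram-based range estimate can come out as
zero; the clamp then returns the equality estimate instead, which is
the behavior the commit message describes as intuitively reasonable.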

Alvaro Herrera committed:

- In pgsql/src/backend/port/dynloader/netbsd.c, clean up double
  negative, per Tom Lane.

Teodor Sigaev committed:

- In pgsql/src/backend/tsearch/to_tsany.c, fix memory arrangement of
  tsquery after removing stop words.  The old arrangement left unused
  memory holes in tsquery.  It had been working because tsquery->size
  was not used for any kind of operation except comparing tsqueries,
  so in HEAD it's enough to fix the to_tsquery function, but for
  previous versions it's necessary to remove the optimization in
  CompareTSQ to avoid having to rebuild all stored tsqueries.  Per
  report by Richard Huxton.

- In pgsql/src/backend/utils/adt/tsquery_op.c, revert changes of
  CompareTSQ: it affects existing btree indexes.

Andrew Dunstan committed:

- In pgsql/src/backend/commands/copy.c, improve efficiency of
  attribute scanning in CopyReadAttributesCSV.  The loop is split into
  two parts, inside quotes, and outside quotes, saving some
  instructions in both parts.  Heikki Linnakangas.
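
The loop split described above can be illustrated with a small
stand-alone sketch.  This is Python rather than the C in copy.c, the
function name is hypothetical, and the comma delimiter and
doubled-quote escape are assumptions chosen for the example.  It only
shows the shape of the idea: one inner loop for text outside quotes
and one for text inside quotes, so neither has to test an "am I in
quotes?" flag on every character.

```python
def split_csv_line(line, delim=",", quote='"'):
    """Split one CSV line into fields using two specialized loops.

    Illustration only: an unterminated quote is silently accepted
    here rather than reported as an error.
    """
    fields = []
    pos, end = 0, len(line)
    while True:
        chars = []
        found_delim = False
        while True:
            # Loop 1: outside quotes.  Only the delimiter and an
            # opening quote are special here.
            found_quote = False
            while pos < end:
                c = line[pos]
                pos += 1
                if c == delim:
                    found_delim = True
                    break
                if c == quote:
                    found_quote = True
                    break
                chars.append(c)
            if not found_quote:
                break  # field ended at a delimiter or at end of line
            # Loop 2: inside quotes.  Only the quote character is
            # special here; delimiters are ordinary data.
            while pos < end:
                c = line[pos]
                pos += 1
                if c == quote:
                    if pos < end and line[pos] == quote:
                        chars.append(quote)  # doubled quote is a literal
                        pos += 1
                        continue
                    break  # closing quote: return to loop 1
                chars.append(c)
        fields.append("".join(chars))
        if not found_delim:
            return fields
```

For example, split_csv_line('a,"b,c",d') yields the three fields
a, b,c and d.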

== Rejected Patches (for now) ==

Marko Kreen's patch of November 23, 2007, which moved the decision
about how much more room to allocate from callers of
appendStringInfoVA to inside the function, where more information is
available, was rejected on grounds of unportability and dubious
performance improvement.

== Pending Patches ==

Kenneth D'Souza sent in another revision of his patch to psql which
shows incoming foreign key constraints along with the existing
out-going foreign key constraints when people invoke \d table_name.

Alex Hunsaker sent in a patch intended to fix a bug in ALTER TABLE
which allows dropping a NOT NULL constraint in places where it breaks
inheritance.

Magnus Hagander sent in a WIP patch to add enum-type GUC variables.

Zoltan Boszormenyi sent in two revisions of a patch to allow for
64-bit CommandIds.

Pavel Stehule sent in an updated SQL/PSM patch.

Julius Stroffek sent in a patch intended to allow people to use Sun's
compiler to compile Postgres on Linux.

Bruce Momjian sent in a patch to clarify an error message for the
tsvector cast when the string is too long.

Merlin Moncure sent in another revision of his libpq type system
patch.

Bryce Nesbitt sent in a patch which optionally sets a maximum width
for psql output.

