Summary of Changes since last release (7.4.1) - Mailing list pgsql-hackers

From Simon Riggs
Subject Summary of Changes since last release (7.4.1)
Msg-id 001401c3f67c$0dab65e0$77c487d9@LaptopDellXP
Responses Re: Summary of Changes since last release (7.4.1)
List pgsql-hackers
POSTGRESQL: Summary of Changes since last release (7.4.1)
----------------------------------------------------------
18 Feb 2004

This is a summary of most changes since code versions marked 7_4_1. The
intention is to help everybody understand what's coming and what might
be affected, though most importantly, where YOU might add value to the
community as a designer, developer, tester, technical author or
advocate. It seeks to complement other information sources such as
Robert Treat's weekly news bulletin, the TODO summary of desired future
items, Elein Mustain's General Bits or the README files - which are the
definitive list of what's in any particular release. 
*** Your feedback is appreciated. ***

So far in this release dev cycle, major functionality changes will affect
- PERFORMANCE
- OPTIMIZER/EXECUTOR
- ROBUSTNESS
- SECURITY
- WIN32 PORTABILITY
Other code changes are summarised and their major impacts noted.

These notes cover major changes and are not guaranteed complete, or even
fully tested. Many additional patches to the latest full release have
been submitted and these are appreciated just as much, even though they
have *mostly* more isolated effects. Documentation changes continue,
though aren't described here, neither are client side
utilities/interfaces.

Nothing mentioned here is DEFINITELY in 7.5 or any future release;
testing of everything mentioned here is encouraged and appreciated, for
regression, performance and robustness. There is not yet a CVS branch
specifically for any later release than 7_4_1; these changes are not yet
even guaranteed to build into a consistent release when taken together.

Description of changes is designed to highlight benefit and impact, as
well as identifying specific areas of code change and potential knock-on
effects.

MAJOR FUNCTIONALITY
PERFORMANCE

- Major new memory buffer cache algorithm has now been implemented using
the Adaptive Replacement Cache algorithm. The implementation should have
positive benefit for everybody's workload, since ARC will adapt to a
variety of situations and has been designed to allow Vacuum to avoid
interfering with user applications. (Jan) src/backend/buffer

- New performance profiling of Intel CPUs has allowed new spinlock code
to achieve performance/throughput gains of up to 10% using DBT-2 (OLTP)
workloads. Further gains to follow? (Manfred Spraul/Tom)
src/backend/storage/lmgr/s_lock.c

- TIP 9 now needs changing! Cross-data-type comparisons are now
indexable by btrees. All the existing cross-type comparison operators
(int2/int4/int8) and (float4/float8) have appropriate support. Also
(date/timestamp) comparisons allow use of indexes for expressions like
datecol >= date 'today' - interval '1 month' (Tom) Implications for user
defined types and indices also? [HACKERS] 8-Nov-03

- Index performance improved when scanning highly non-unique indices;
will greatly improve performance of cursor/fetch logic. B-tree's
initial-positioning-strategy code has been improved for the case when
index scans are caused by a WHERE indexcol > something. We now start
scan at first entry, rather than reading in all entries that share that
index value before we begin to scan. (Tom, after Dimitry Tkach) 

- Heap access code is now faster when using compressed columns in-line;
previous assumption was that all compressed columns were also toasted
(Tom)

- Optimized calling performance for dynamically loaded C functions. Hash
table added to cache lookups of 'C'-language functions. Some limited
testing suggests that this puts the lookup speed for external functions
just about on par with built-in functions. (Tom)

- New delay feature added to VACUUM, allowing it to be executed at a
lower priority, ensuring other concurrent transaction performance can be
maintained at a predictable level. Detailed analysis and graphs of
run-time behaviour available at
http://developer.postgresql.org/~wieck/vacuum_cost/ (Jan)
Extended to include VACUUM FULL, ANALYZE and non-btree index vacuums.
Centralize implementation of delay code by creating a pg_usleep()
subroutine in src/port/pgsleep.c. (Tom)

- More flexible memory control will allow large memory allocations to
large maintenance operations such as CREATE INDEX, without affecting
normal memory usage for queries. Rename server parameters SortMem and
VacuumMem to work_mem and maintenance_work_mem; old names still
available via new backward compatibility feature. Make btree index
creation and initial validation of foreign-key constraints use
maintenance_work_mem rather than work_mem as their memory limit. (Tom)

- Restructure smgr API as per detailed proposal of 6 Feb, to improve
performance in bgwriter and background checkpoint processes. Possibly
also a precursor to later implementation of Tablespaces... (Tom)

- ANALYZE will now collect statistics on expressional indexes, and make
use of them during optimization in majority of cases. (Tom)

- Repaired longstanding oversight in separate ANALYZE command: it
updated the pg_class.relpages and reltuples counts for the table proper,
but not for indexes. Greater planning accuracy should now result. (Tom)
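
As a rough illustration of the B-tree initial-positioning item above (WHERE indexcol > something), the sketch below contrasts the two strategies. It is not PostgreSQL's nbtree code: a sorted Python list stands in for the index entries, and both function names are hypothetical.

```python
from bisect import bisect_left, bisect_right

def scan_greater_old(entries, key):
    """Position at the first entry >= key, then step past every duplicate of key."""
    pos = bisect_left(entries, key)
    skipped = 0
    while pos < len(entries) and entries[pos] == key:
        pos += 1
        skipped += 1
    return entries[pos:], skipped

def scan_greater_new(entries, key):
    """Position directly at the first entry > key; no duplicates are visited."""
    pos = bisect_right(entries, key)
    return entries[pos:], 0

# A highly non-unique index: 1000 entries share the value 5.
entries = [1, 2] + [5] * 1000 + [7, 9]
old_rows, old_skipped = scan_greater_old(entries, 5)
new_rows, new_skipped = scan_greater_new(entries, 5)
```

Both functions return the same rows; the difference is only in how many entries equal to the key are read before the scan proper begins.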

OPTIMIZER/EXECUTOR IMPROVEMENTS

- Genetic Optimizer usage has been re-analyzed; geqo defaults have now
been set to more effective values which are expected to significantly
improve plan selection for complex multi-way joins (> 10-way).
geqo_effort setting now offers an easy 1..10 setting (like IBM DB2),
that allows this to be controlled realistically by user/DBA. New
heuristic added to significantly reduce number of join plans attempted
before geqo begins. (Tom)

- Avoid redundant unique-ification step on subqueries where the result
is already known to be unique (i.e. SELECT DISTINCT ... subqueries, and
IN subqueries that use UNION/INTERSECT/EXCEPT without ALL).
Also set join_in_selectivity correctly. (Tom)

- Avoid redundant projection step when scanning a table that we need all
the columns from.  In case of SELECT INTO, we have to check that the
hasoids flag matches the desired output type, too. (Tom)

- Repair mis-estimation of indexscan CPU costs.  When an indexqual
contains a run-time key (that is, a nonconstant expression compared to
the index variable), the key is evaluated just once per scan, but we
were charging costs as though it were evaluated once per visited index
entry. (Tom)

- Avoid planner failure for cases involving Cartesian products inside IN
(sub-SELECT) constructs. (Tom)
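
The indexscan CPU cost repair noted above can be shown with a little arithmetic. This is a toy model only: the function names, parameters and cost units are hypothetical and are not the planner's actual cost formulas.

```python
def indexqual_cpu_cost_before(entries_visited, per_entry_cost, key_eval_cost):
    """Mis-estimate: the run-time key is charged once per visited index entry."""
    return entries_visited * (per_entry_cost + key_eval_cost)

def indexqual_cpu_cost_after(entries_visited, per_entry_cost, key_eval_cost):
    """Repaired estimate: the run-time key is charged once per scan."""
    return entries_visited * per_entry_cost + key_eval_cost

# Illustrative numbers only (arbitrary cost units).
before = indexqual_cpu_cost_before(10_000, 1, 25)
after = indexqual_cpu_cost_after(10_000, 1, 25)
```

With these made-up numbers the old formula charges 260000 units and the repaired one 10025, which shows how strongly an expensive run-time key could bias the planner against an index scan.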

ROBUSTNESS

- Massive overhaul of pg_dump: make use of dependency information from
pg_depend to determine a safe dump order.  Defaults and check
constraints can be emitted either as part of a table or domain
definition, or separately if that's needed to break a dependency loop.
Lots of old half-baked code for controlling dump order removed.
Performance work has also been done to address some performance
regressions this caused. (Tom)

- Changes to ALTER .. SET PATH allow ordered dumps to restore without
error; the pg_restore options to select restore order have now been
removed as they are no longer needed (Tom)

- In backend/access/transam/ add warning to AtEOXact_SPI() to catch
cases where the current txn has been committed without SPI_finish()
being called first. Allows detection of resource leaks... (Joe)

- psql memory allocation is being cleaned up, using safer calls
(Bruce/Neil)

- GetNewTransactionId() logic sequence now enhanced to stay intact even
at final stage of resource failure conditions, such as running out of
disk space etc (Tom)

- Add checks for close() and fclose() failure, applicable to some
filesystems. Various locations affected in backend, initdb, copy (Tom)

- A header record was added to each WAL file, to allow them to be
reliably identified. We now avoid splitting WAL records across segment
files, and we now make WAL entries for file creation, deletion, and
truncation. This work should give the basics for building the true PITR
implementation (J.R., Patrick, Tom)

- Also, add support for making XLOG_SEG_SIZE configurable at compile
time, similarly to BLCKSZ, possibly useful for smaller installations.
(Tom)
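
The dependency-driven dump order described in the pg_dump item above is, in essence, a topological sort. The sketch below is not pg_dump's code; it assumes Python 3.9+ for the standard graphlib module, and the object names and the dependency map are invented for illustration.

```python
from graphlib import TopologicalSorter

# Hypothetical objects: each key depends on the objects in its value set,
# loosely mimicking what pg_depend records.
depends_on = {
    "view orders_v": {"table orders"},
    "table orders": {"domain money_d", "function next_id()"},
    "domain money_d": set(),
    "function next_id()": {"sequence id_seq"},
    "sequence id_seq": set(),
}

def safe_dump_order(deps):
    """Return objects ordered so that every object follows its dependencies."""
    return list(TopologicalSorter(deps).static_order())

order = safe_dump_order(depends_on)
```

A dependency loop makes static_order() raise graphlib.CycleError in this sketch. As described above, pg_dump instead breaks such loops by emitting defaults and check constraints separately.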

SECURITY

- New permission-checking code. Rather than relying on the query context
of a rangetable entry to identify what permissions it wants checked,
store a full AclMode mask in each RTE, and check exactly those bits.
This allows an RTE specifying, say, INSERT privilege on a view to be
copied into a derived UPDATE query without changing meaning. (Tom)

- Parsing of quoted keywords in pg_hba.conf enhances client-server
specific combination security (Andrew)
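
A minimal sketch of the permission-mask idea from the first item above, assuming invented privilege bit values and a hypothetical RangeTableEntry class (the real AclMode bits and RTE structure differ):

```python
from dataclasses import dataclass
from enum import IntFlag

class Priv(IntFlag):
    # Bit values are illustrative, not PostgreSQL's actual AclMode bits.
    SELECT = 1
    INSERT = 2
    UPDATE = 4
    DELETE = 8

@dataclass(frozen=True)
class RangeTableEntry:
    relation: str
    required: Priv  # full mask of privileges to check, stored in the entry

def check_permissions(rte, granted):
    """Succeed only if every required bit is present in the granted mask."""
    return (rte.required & granted) == rte.required

view_rte = RangeTableEntry("some_view", Priv.INSERT)
# Copying the entry into a derived UPDATE query does not change what is checked.
derived_update_rtes = [view_rte, RangeTableEntry("base_table", Priv.UPDATE)]
```

Because the mask travels with the entry, the check no longer has to infer the wanted privilege from the kind of query the entry ends up in.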

OTHER NEW FUNCTIONALITY

- Add "WITH / WITHOUT OIDS" clause to CREATE TABLE AS. This allows the
user to explicitly specify whether OIDs should be included in the
newly-created relation; useful because it provides a way for application
authors to ensure their applications are compatible with future versions
of PostgreSQL (in which the relation created by CREATE TABLE AS won't
include OIDs by default). (Neil)

- Add more kinds of exprs that can be accepted after a CREATE SCHEMA
(Neil)

- Info Schema enhanced further to support named function parameters
(Dennis)

- Change factorial to return a numeric (Gavin)

- Comments can now be set on individual Casts, Conversions, Op Classes,
Large Objects and Languages (Chris)

- Have psql \dn show only visible temp schemas using current_schemas()

- Have psql '\i ~/<tab><tab>' actually load files it displays from home
dir

- Allow psql \du to show groups, and add \dg for groups

- Allow pg_dump to dump CREATE CONVERSION (Chris)

- Make USING and WITH optional to bring the syntax of \copy into exact
agreement with what the backend grammar actually accepts and what the
documentation already says (Tom, Bill Moran)

- Remove platform dependencies from miscadmin.h and put them in port.h
(Tom)

- New generate_series() function; first of new class of Set Returning
functions - can return more than one row. (Joe)

- Monitoring of session disconnection now possible using the
log_disconnections parameter (Andrew)

- Customizable ANALYZE function for user definable functions; this is
generic functionality, though intended as a base for PostGIS
enhancements (Mark)

REFACTORING AND OTHER CODE CHANGES

- Remove the explicit casting of NULL literals to a pointer in a wide
variety of code locations (Neil)

- Add operator strategy and comparison-value datatype fields to ScanKey.
Remove the 'strategy map' code - Passing the strategy number to the
index AM directly should now be simpler and faster. Changes to
ScanKeyEntryInitialize() API touch quite a lot of files. (Tom)

- nbtree function _bt_first is now substantially changed/simplified
(Tom)

- Change PG_DELAY from msec to usec and use it consistently rather than
select(). Add Win32 Sleep() for delay. (Bruce)

- Supporting relaxing of ALTER...SET PATH requires changing the API for
GUC assign_hook functions, which touches a lot of places (Tom)

- initdb has now been completely re-written from shell script to C
(Andrew)

- Add some code to guc.c to allow backwards compatibility for server
config parameters. Variable renaming is now more easily possible, since
parameters can potentially be referenced by both their old and new names
in SHOW and SET commands. (Tom)

- Remove the long-dead 'persistent main memory' storage manager (mm.c),
since it seems quite unlikely to ever get resurrected. (Tom)

- Add configure support for determining UINT64_FORMAT, the appropriate
snprintf format for uint64 items (Tom)

- Put another layer of indirection between the compute_stats functions
and the actual data storage.  This change would eventually allow us to
compute the values on-the-fly, for example using dynamic data sampling
techniques when ANALYZE output is not available... (Tom) 

- In src/backend/access/ we have major changes in heap, nbtree, transam
and common code

- Almost all files src/backend/commands have changes; mostly robustness

- JDBC interface has been moved out into its own project to improve the
focus on this popular and important area of client code.
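
The backward-compatibility feature for renamed server parameters, mentioned in the guc.c item above, can be pictured as an alias table consulted before lookup. This is a sketch only, not guc.c: the Settings class and the stored values are hypothetical, and the two old/new name pairs are the ones given in this summary.

```python
# Old-name -> new-name map; the two pairs come from the summary above.
OLD_NAME_ALIASES = {
    "sortmem": "work_mem",
    "vacuummem": "maintenance_work_mem",
}

class Settings:
    def __init__(self):
        # Values are placeholders chosen for the example.
        self._values = {"work_mem": 1024, "maintenance_work_mem": 16384}

    def _canonical(self, name):
        """Map an old parameter name onto its new name, case-insensitively."""
        key = name.lower()
        return OLD_NAME_ALIASES.get(key, key)

    def set(self, name, value):
        key = self._canonical(name)
        if key not in self._values:
            raise KeyError(f"unrecognized configuration parameter: {name}")
        self._values[key] = value

    def show(self, name):
        return self._values[self._canonical(name)]

settings = Settings()
settings.set("SortMem", 2048)          # old name still accepted
current = settings.show("work_mem")    # new name sees the same value
```

Only one value is stored per parameter, so the old and new names can never disagree.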


MAJOR IMPACTS NOTED (Upgrades, performance, change of defaults etc)

- initdb forced due to pg_proc change to support Named function
parameters

- initdb forced due to change of stored rule representation.

- initdb recommended to allow picking up Info Schema changes

- initdb is forced due to changes in pg_control contents for WAL logging
enhancements.

- geqo defaults have now been set to significantly more effective
values; we should expect increased optimization elapsed times for very
large queries, though hopefully outweighed by a significant improvement
in plan selection

- security tightening may cause errors with some existing applications -
be aware, but also be thankful!


MAJOR WORK IN PROGRESS (Much less accurate than the above...)

- Win32 - Changes to many areas, especially the postmaster, ipc, libpq
etc, to streamline and allow a single source Win32 port - the Win32 port
is coming closer! (Claudio) Changes are being implemented to allow a
single code base to work across both *NIX and Win32 systems. Mostly,
code changes are such that the original behaviour is preserved, though
in a way that allows it to also work on Win32 systems, or with minor
changes.

- Replication - involving remote copying of WAL logs, then cut-over and
automatic catch-up on secondary node (Jan)

- Background writer work progresses, which is likely to improve overall
scalability/performance by smoothing dirty block writes; forms the
basis for increased server availability also (Jan)

- Initial stages of Named Function parameter support have been
committed. pg_proc has a column to store names, and CREATE FUNCTION can
insert data into it, but that's all as yet. (Dennis)

- Buffer manager locking changes: lock contention data have shown that
the BufMgrLock is the major source of contention under heavy load,
affecting multi-CPU SMP scalability. Patch to rework the bufmgr's
locking scheme to be more granular and further perf testing may yield
further perf gains. (Neil)

- Backend internal data structure changes: list rewrite: the linked list
implementation used throughout the backend is being redesigned for
constant-time length and append operations. This was done because
lappend() is called quite frequently, and allows some ugly code
(FastList) to be removed. (Neil)

- Hash index changes: complete wrap up of unique hash indexes, as well
as some improvements to hash index concurrent performance. (Neil)

- Re-evaluation of Genetic Query Optimizer parameters and usage will
likely continue for some time, and any real usage scenarios/observations
are welcome

- PITR APIs, basic utilities and testing (Simon)
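
For the list rewrite item above, one common way to get constant-time length and append is to keep a tail pointer and a cached length in the list header. The sketch below shows that general technique, not the backend's actual implementation; the class names are hypothetical.

```python
class Cell:
    """One list cell holding a value and a link to the next cell."""
    __slots__ = ("value", "next")

    def __init__(self, value):
        self.value = value
        self.next = None

class LinkedList:
    """Singly linked list whose header caches the tail and the length."""

    def __init__(self):
        self.head = None
        self.tail = None
        self.length = 0

    def append(self, value):
        """O(1): link the new cell after the cached tail, no traversal."""
        cell = Cell(value)
        if self.tail is None:
            self.head = cell
        else:
            self.tail.next = cell
        self.tail = cell
        self.length += 1
        return self

    def __len__(self):
        """O(1): return the cached length instead of walking the cells."""
        return self.length

    def __iter__(self):
        cell = self.head
        while cell is not None:
            yield cell.value
            cell = cell.next

targets = LinkedList()
for name in ("a", "b", "c"):
    targets.append(name)
```

Without the cached tail, each append would have to walk the whole list, which is the cost the summary attributes to frequent lappend() calls.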

DEVELOPER FEATURES (logging, debugging etc)
- bootstrap can now be cancelled using CTRL-C (Tom)
- add a console control handler for Ctrl-C (Magnus Hagander)
- use debug_shared_buffers = <seconds> to show ARC mem buffer contents
(Jan)

All corrections and changes welcome...this summary is updated every
Tuesday based upon committed changes to the public community code base.
Please draw my attention to other changes in other parts of PostgreSQL
utilities etc that may affect the backend code, portability or run-time
behaviour.

More exactly, the changes listed here are ones that have occurred since
the 7_4 branchpoint in CVS. They might have been applied to more than
one branch so could potentially be in 7_4_STABLE and 7_4_1. The bottom
line is that they haven't been cast in stone yet...

Thanks to everybody around the world for appreciating that however hard
I try, I do speak the English variant of English, with appropriate
spellings.
-- Simon Riggs, simon@2ndquadrant.com



