Re: 7.5 change documentation - Mailing list pgsql-hackers
From | Simon Riggs |
---|---|
Subject | Re: 7.5 change documentation |
Msg-id | 002a01c3e537$13cab6c0$189087d9@LaptopDellXP |
Responses | Re: 7.5 change documentation |
List | pgsql-hackers |
POSTGRESQL: Summary of Changes since last release (7.4.1)
----------------------------------------------------------
26 Jan 2004

This is a summary of most changes since code versions marked 7_4_1, rather than a weekly news bulletin, a summary of desired future items, or the definitive list of what's in any particular release. The intention is to help everybody understand what's coming and what might be affected, though most importantly, where you might add value to the community as a designer, developer, tester, technical author or advocate.

So far in this release dev cycle, major functionality changes will affect
- PERFORMANCE
- OPTIMIZER/EXECUTOR
- ROBUSTNESS
- SECURITY

Other code changes are summarised and their major impacts noted.

These notes cover major changes and are not guaranteed complete, or even fully tested. Many additional patches to the latest full release have been submitted and these are appreciated just as much, even though they have *mostly* more isolated effects. Documentation changes continue, though they aren't described here; neither are client side utilities/interfaces.

Nothing mentioned here is DEFINITELY in 7.5 or any future release; testing of everything mentioned here is encouraged and appreciated, for regression, performance and robustness. There is not yet a CVS branch specifically for any later release than 7_4_1; these changes are not yet even guaranteed to build into a consistent release when taken together.

Description of changes is designed to highlight benefit and impact, as well as identifying specific areas of code change and potential knock-on effects.

MAJOR FUNCTIONALITY

PERFORMANCE
- A major new memory buffer cache replacement strategy has been implemented, using the Adaptive Replacement Cache (ARC) algorithm. The implementation should have positive benefit for everybody's workload, since ARC will adapt to a variety of situations and has been designed to allow Vacuum to avoid interfering with user applications.
  (Jan) src/backend/buffer
- New performance profiling of Intel CPU has allowed new spinlock code to achieve performance/throughput gains of up to 10% using DBT-2 (OLTP) workloads. Further gains to follow? (Manfred Spraul/Tom) src/backend/storage/lmgr/s_lock.c
- TIP 9 now needs changing! Cross-data-type comparisons are now indexable by btrees. All the existing cross-type comparison operators (int2/int4/int8 and float4/float8) have appropriate support. (Tom) Implications for user defined types and indices also? [HACKERS] 8-Nov-03
- All operations on TEMP relations are no longer logged in WAL, nor are they involved in checkpoints, thus improving performance. (Tom)
- Index performance improved when scanning highly non-unique indices; will greatly improve performance of cursor/fetch logic. B-tree's initial-positioning-strategy code has been improved so that we start scan at first entry, rather than reading in all entries that share that index value before we begin to scan. (Tom, after Dimitry Tkach)
- Heap access code is now faster when using compressed columns in-line; previous assumption was that all compressed columns were also toasted (Tom)
- Optimized calling performance for dynamically loaded C functions. Hash table added to cache lookups of 'C'-language functions. Some limited testing suggests that this puts the lookup speed for external functions just about on par with built-in functions. (Tom)

OPTIMIZER/EXECUTOR IMPROVEMENTS
- Genetic Optimizer usage has been re-analyzed; geqo defaults have now been set to more effective values which are expected to significantly improve plan selection for complex multi-way joins (> 10-way). geqo_effort setting now offers an easy 1..10 setting (like IBM DB2), that allows this to be controlled realistically by user/DBA. New heuristic added to significantly reduce number of join plans attempted before geqo begins. (Tom)
- Avoid redundant unique-ification step on subqueries where the result is already known to be unique (i.e.
  it is a SELECT DISTINCT ... subquery, or an IN subquery that uses UNION/INTERSECT/EXCEPT (without ALL)). Also set join_in_selectivity correctly. (Tom)
- Avoid redundant projection step when scanning a table that we need all the columns from. In case of SELECT INTO, we have to check that the hasoids flag matches the desired output type, too. (Tom)
- Repair mis-estimation of indexscan CPU costs. When an indexqual contains a run-time key (that is, a nonconstant expression compared to the index variable), the key is evaluated just once per scan, but we were charging costs as though it were evaluated once per visited index entry. (Tom)
- Avoid planner failure for cases involving Cartesian products inside IN (sub-SELECT) constructs. (Tom)

ROBUSTNESS
- Local buffer manager is no longer used for newly-created non-TEMP relations; a new non-TEMP relation goes through the shared bufmgr and thus will participate normally in checkpoints. TEMP relations use the local buffer manager throughout their lifespan. (Tom)
- Massive overhaul of pg_dump: make use of dependency information from pg_depend to determine a safe dump order. Defaults and check constraints can be emitted either as part of a table or domain definition, or separately if that's needed to break a dependency loop. Lots of old half-baked code for controlling dump order removed. Performance work has also been done to address some performance regressions this caused. (Tom)
- Changes to ALTER .. SET PATH allow ordered dumps to restore without error
- pg_restore options to select restore order now removed; they are no longer needed (Tom)
- In backend/access/transam/ add warning to AtEOXact_SPI() to catch cases where the current txn has been committed without SPI_finish() being called first. Allows detection of resource leaks...
  (Joe)
- psql memory allocation is being cleaned up, using safer calls (Bruce/Neil)
- Transaction logic now enhanced to stay intact even at the final stage of resource failure conditions, such as running out of disk space etc (Tom)

SECURITY
- New permission-checking code. Rather than relying on the query context of a rangetable entry to identify what permissions it wants checked, store a full AclMode mask in each RTE, and check exactly those bits. This allows an RTE specifying, say, INSERT privilege on a view to be copied into a derived UPDATE query without changing meaning. (Tom)

OTHER NEW FUNCTIONALITY
- Add "WITH / WITHOUT OIDS" clause to CREATE TABLE AS. This allows the user to explicitly specify whether OIDs should be included in the newly-created relation; useful because it provides a way for application authors to ensure their applications are compatible with future versions of PostgreSQL (in which the relation created by CREATE TABLE AS won't include OIDs by default). (Neil)
- Add more kinds of exprs that can be accepted after a CREATE SCHEMA (Neil)
- Info Schema enhanced further to support named function parameters (Dennis)
- Change factorial to return a numeric (Gavin)
- Comments can now be set on individual casts, conversions, operator classes, large objects and languages (Chris)
- Have psql \dn show only visible temp schemas using current_schemas()
- Have psql '\i ~/<tab><tab>' actually load files it displays from home dir
- Allow psql \du to show groups, and add \dg for groups
- Allow pg_dump to dump CREATE CONVERSION (Chris)

SUMMARY OF OTHER CODE CHANGES
- Remove the explicit casting of NULL literals to a pointer in a wide variety of code locations (Neil)
- Streamline local buffer manager code: Since it's no longer necessary to fsync relations as they move out of the local buffers into shared buffers, quite a lot of smgr.c/md.c/fd.c code is no longer needed and has been removed: there's no concept of a dirty relation anymore in md.c/fd.c, and we never fsync anything but WAL.
  (Tom)
- Add operator strategy and comparison-value datatype fields to ScanKey. Remove the 'strategy map' code; passing the strategy number to the index AM directly should now be simpler and faster. Changes to the ScanKeyEntryInitialize() API touch quite a lot of files. (Tom)
- nbtree function _bt_first is now substantially changed/simplified (Tom)
- Change PG_DELAY from msec to usec and use it consistently rather than select(). Add Win32 Sleep() for delay. (Bruce)
- Supporting relaxing of ALTER...SET PATH requires changing the API for GUC assign_hook functions, which touches a lot of places (Tom)
- In src/backend/access/ we have major changes in heap, nbtree, transam and common code
- Almost all files in src/backend/commands have changes; mostly robustness
- JDBC interface has been moved out into its own project to improve the focus on this popular and important area of client code.

MAJOR IMPACTS NOTED (Upgrades, performance, change of defaults etc)
- initdb forced due to pg_proc change to support Named function parameters
- initdb forced due to change of stored rule representation.
- initdb recommended to allow picking up Info Schema changes
- geqo defaults have now been set to significantly more effective values; we should expect increased optimization elapsed times for very large queries, though hopefully outweighed by a significant improvement in plan selection

MAJOR WORK IN PROGRESS (Much less accurate than the above...)
- Win32: changes to many areas, especially the postmaster, ipc, libpq etc, to streamline and allow a single source Win32 port. The Win32 port is coming closer! (Claudio)
- Replication: involving remote copying of WAL logs, then cut-over and automatic catch-up on secondary node (Jan)
- Background writer work progresses, which is likely to improve overall scalability/performance by smoothing dirty block writes; forms the basis for increased server availability also (Jan)
- Initial stages of Named Function parameter support have been committed.
  pg_proc has a column to store names, and CREATE FUNCTION can insert data into it, but that's all as yet. (Dennis)
- Buffer manager locking changes: lock contention data have shown that the BufMgrLock is the major source of contention under heavy load, affecting multi-CPU SMP scalability. A patch to rework the bufmgr's locking scheme to be more granular, together with further perf testing, may yield further perf gains. (Neil)
- Backend internal data structure changes: list rewrite: the linked list implementation used throughout the backend is being redesigned for constant-time length and append operations. This is being done because lappend() is called quite frequently, and it allows some ugly code (FastList) to be removed. (Neil)
- Hash index changes: wrap up work on unique hash indexes, as well as some improvements to hash index concurrent performance. (Neil)
- Customizable ANALYZE function for user definable functions is generic functionality being added, though intended as a base for PostGIS enhancements (Mark)
- Re-evaluation of Genetic Query Optimizer parameters and usage will likely continue for some time, and any real usage scenarios/observations are welcome

WHAT **ISN'T** IN THIS RELEASE (YET!)
- Many TODO items still to be claimed...
- SQL Commands, standardisation and compatibility features
- Referential Integrity & Inheritance
- Administration (client interfaces may have changed)
- National/Multi-language support extensions

DEVELOPER FEATURES (logging, debugging etc)
- bootstrap can now be cancelled using CTRL-C (Tom)
- use debug_shared_buffers = <seconds> to show ARC mem buffer contents (Jan)

All corrections and changes welcome...if this is well received, then I will monitor pgsql-committers to keep track of things. Please let me know about even minor technical mistakes, though please let's not revisit the designs of everything again!

-- Simon
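
A footnote on the list rewrite item under MAJOR WORK IN PROGRESS: the idea of constant-time length and append can be sketched roughly as below. This is only an illustration of the general technique (keep a tail pointer and a cached length alongside the head), not the backend's actual List code; IntCell, IntList and the int_list_* functions are hypothetical names made up for this note.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical names throughout; this is NOT the backend's List API. */
typedef struct IntCell
{
    int             value;
    struct IntCell *next;
} IntCell;

typedef struct IntList
{
    IntCell *head;    /* first cell, or NULL when the list is empty */
    IntCell *tail;    /* last cell, so append never walks the list */
    size_t   length;  /* cached count, so length is O(1) */
} IntList;

/* Reset a list header to the empty state. */
void int_list_init(IntList *list)
{
    assert(list != NULL);
    list->head = NULL;
    list->tail = NULL;
    list->length = 0;
}

/* Append in constant time; returns 0 on success, -1 if malloc fails. */
int int_list_append(IntList *list, int value)
{
    IntCell *cell;

    assert(list != NULL);
    cell = malloc(sizeof *cell);
    if (cell == NULL)
        return -1;
    cell->value = value;
    cell->next = NULL;

    if (list->tail == NULL)
        list->head = cell;          /* first element */
    else
        list->tail->next = cell;    /* link after the current last cell */
    list->tail = cell;
    list->length++;
    return 0;
}

/* Length in constant time: just read the cached counter. */
size_t int_list_length(const IntList *list)
{
    assert(list != NULL);
    return list->length;
}

/* Free every cell and leave the header empty. */
void int_list_free(IntList *list)
{
    IntCell *cell;

    assert(list != NULL);
    cell = list->head;
    while (cell != NULL)
    {
        IntCell *next = cell->next;

        free(cell);
        cell = next;
    }
    int_list_init(list);
}
```

With a plain head-pointer list, both append and length have to walk every cell; here each costs the same no matter how long the list is, which matters when the append call is as frequent as lappend() is said to be above.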