Thread: TODO list comments

TODO list comments

From

Tom Lane

Date:

24 August 2005, 23:17:21

I made a pass over the TODO list to see what was out of date.

> * Allow administrators to safely terminate individual sessions either
>   via an SQL function or SIGTERM 
> 
>   Currently SIGTERM of a backend can lead to lock table corruption.

This comment may be out of date.  Suggest
Lock table corruption following SIGTERM of an individual backendhas been reported in 8.0.  A possible cause is fixed in
8.1,butit is unknown whether other trouble spots exist.  This item ismainly a matter of doing adequate testing rather
thanof writingany new code.
 

>     o Allow postgresql.conf values to be set so they can not be changed
>       by the user

Is that really a good idea?  The ones that are unsafe are restricted already.

> * %Remove Money type, add money formatting for decimal type

There's a fair-size contingent that doesn't want Money removed
completely, but just reimplemented as an I/O wrapper around type
numeric.  Maybe that's even what you mean by the TODO item, but
it's not clear.  Please at least mention the alternative.

>     o %Allow MIN()/MAX() on arrays

This is done.

>     o Modify array literal representation to handle array index lower bound
>       of other than one

This too.

>     o Add security checking for large objects
> 
>       Currently large objects entries do not have owners. Permissions can
>       only be set at the pg_largeobject table level.

This comment is wrong: trying to set the permissions on pg_largeobject
would have no effect whatsoever on the lo_xxx functions, so there is not
even a partial solution available now.

>     o Auto-delete large objects when referencing row is deleted

This should note that contrib/lo already offers a solution.

> * %Have views on temporary tables exist in the temporary namespace
> * Allow temporary views on non-temporary tables

Both of these are done in 8.1.

> * %Allow RULE recompilation

Eh?  Perhaps you meant "automatically regenerate cached plans when
needed", in which case it's redundant with the Dependency Checking
entries.  Whatever it means, this doesn't seem a particularly simple
item.

> * %Allow TRUNCATE ... CASCADE/RESTRICT

Huh?  What would that do?

> * Make row-wise comparisons work per SQL spec

This could probably be marked as a % item.

>     o Currently the system uses the operating system COPY command to
>       create a new database. Add ON COMMIT capability to CREATE TABLE AS
>       SELECT

This seems a bit garbled, and anyway the first part is done.

>     o %Add ALTER DOMAIN TYPE

To do what, exactly?  This is unclear.

>     o -Allow objects to be moved to different schemas

This is only partly done --- the 8.1 patch didn't cover all object types.

>     o %Disallow dropping of an inherited constraint
> ...
>     o %Prevent child tables from altering constraints like CHECK that were
>       inherited from the parent table

These seem to be duplicates, or at least in need of merging.

>     o Handle references to temporary tables that are created, destroyed,
>       then recreated during a session, and EXECUTE is not used
> 
>       This requires the cached PL/PgSQL byte code to be invalidated when
>       an object referenced in the function is changed.

This is redundant with the Dependency Checking item about regenerating
cached plans.

>     o Add table function support to pltcl, plperl, plpython?

Isn't this done for plperl?

>     o Allow PL/pgSQL to name columns by ordinal position, e.g. rec.(3)

This doesn't seem like an amazingly good idea; would prefer to see a way
to get the column name list and use names dynamically.  Numbers have all
the same problems as "SELECT *" ...

>     o Add MOVE to PL/pgSQL

This should be generalized: upgrade plpgsql cursor support to have all
the FETCH and MOVE options of the main language.

>     o Add support for polymorphic arguments and return types to plperl

I think all the PLs except plpgsql need this.

Also, all the PLs except plpgsql are well behind the curve on supporting
parameter names and OUT parameters.  Please add TODO item(s) for these.

> * Allow libpq to access SQLSTATE so pg_ctl can test for connection failure
> 
>   This would be used for checking if the server is up.

Huh?  What has SQLSTATE got to do with connection failure checking?

> * Have initdb set DateStyle based on locale?

Is this really a good idea?  Being standardized on ISO format seems like
a good thing to me, and encouraging people to adopt ambiguous formats as
default a very bad thing.  They can do it if they like, certainly, but
having initdb do it for them just seems like not the direction we want.

> * Add a schema option to createlang

This is superseded by events: createlang now puts the functions in
pg_catalog, and there doesn't seem any particularly good reason to
want to put them elsewhere.

>     o Improve psql's handling of multi-line queries

Uh, what's wrong with it?  This item seems far too vague.

>     o Add pg_dumpall custom format dumps.
> 
>       This is probably best done by combining pg_dump and pg_dumpall
>       into a single binary.

This is probably obsoleted by events, too.  Now that we can dump blobs
in text mode, I see no reason that we ever need to do this.
pg_restore's only real reason to live is to support selective restore
(ie, pulling out just a few objects from an existing dump) and I do not
see that you need that for pg_dumpall dumps.

>     o Remove unnecessary abstractions in pg_dump source code

Like which?

> * %Remove CREATE CONSTRAINT TRIGGER
> 
>   This was used in older releases to dump referential integrity
>   constraints.

Do we really want to remove it, and thereby guarantee we can't load
dumps from those old releases?

> * Fetch heap pages matching index entries in sequential order
> 
>   Rather than randomly accessing heap pages based on index entries, mark
>   heap pages needing access in a bitmap and do the lookups in sequential
>   order. Another method would be to sort heap ctids matching the index
>   before accessing the heap rows.

This is done (see bitmap index scans).

> * Hash

Why doesn't the hash index section mention the need for WAL support?
Multicolumn hash indexes might be interesting too.

>     o Pack hash index buckets onto disk pages more efficiently
> 
>       Currently no only one hash bucket can be stored on a page. Ideally
>       several hash buckets could be stored on a single page and greater
>       granularity used for the hash algorithm.

I think that should read "Currently only one ..."

> * Determine optimal fdatasync/fsync, O_SYNC/O_DSYNC options

This item should probably point out that the optimal settings are
certainly platform-dependent.

> * Improve the background writer
> 
>   Allow the background writer to more efficiently write dirty buffers
>   from the end of the LRU cache and use a clock sweep algorithm to
>   write other dirty buffers to reduced checkpoint I/O

This is done.

> * Improve speed with indexes
> 
>   For large table adjustements during vacuum, it is faster to reindex
>   rather than update the index.

This applies only to VACUUM FULL, so it probably needs to be reworded.

> * Reduce lock time by moving tuples with read lock, then write
>   lock and truncate table

Ditto.

> * Auto-vacuum
> 
>     o %Suggest VACUUM FULL if a table is nearly empty

It seems like a fairly bad idea for auto-vacuum to do a VACUUM FULL
ever, given the locking effects.  And how is a background daemon going
to "suggest" anything?  It could write to the postmaster log but it's
entirely likely the user would never notice.

> * -Improve SMP performance on i386 machines
> 
>   i386-based SMP machines can generate excessive context switching
>   caused by lock failure in high concurrency situations. This may be
>   caused by CPU cache line invalidation inefficiencies.

This isn't really done, I don't think, we just ameliorated what was the
largest cause of it.  It'll be back...

> * Fix priority ordering of read and write light-weight locks (Neil)

I think we concluded that was a bad idea.

> * Add WAL index reliability improvement to non-btree indexes

Oh, here it is.  I'd be inclined to put this item in the index section
instead, so that you can have separate entries for each index type.
Besides, GIST is done.

> * -Use CHECK constraints to influence optimizer decisions
> 
>   CHECK constraints contain information about the distribution of values
>   within the table. This is also useful for implementing subtables where
>   a tables content is distributed across several subtables.

This isn't completely done by any means.

> * ANALYZE should record a pg_statistic entry for an all-NULL column

This is done.

> * Remove memory/file descriptor freeing before ereport(ERROR)
> * Promote debug_query_string into a server-side function current_query()
> * Allow the identifier length to be increased via a configure option

All of those could probably be marked %.

> * Allow cross-compiling by generating the zic database on the target system
> * Fix cross-compiling of time zone database via 'zic'

These look like duplicates to me ...

>     o Improve dlerror() reporting string

I think this got done.

>         o Add support for Unicode

This is done too, though not tested enough.
        regards, tom lane

Re: TODO list comments

From

Kris Jurka

Date:

24 August 2005, 23:54:26


On Wed, 24 Aug 2005, Tom Lane wrote:

>> * Fetch heap pages matching index entries in sequential order
>>
>>   Rather than randomly accessing heap pages based on index entries, mark
>>   heap pages needing access in a bitmap and do the lookups in sequential
>>   order. Another method would be to sort heap ctids matching the index
>>   before accessing the heap rows.
>
> This is done (see bitmap index scans).
>

Will the optimizer ever choose this plan when dealing with only one index?

Kris Jurka

Re: TODO list comments

From

Tom Lane

Date:

25 August 2005, 00:13:42

Kris Jurka <books@ejurka.com> writes:
> On Wed, 24 Aug 2005, Tom Lane wrote:
>> This is done (see bitmap index scans).

> Will the optimizer ever choose this plan when dealing with only one index?

Certainly.  It's actually likely to prefer a bitmap scan whenever the
query is estimated to fetch more than one percent or so of the table
(although if you are demanding ORDER BY the index order, the crossover
point is higher, since a bitmap scan doesn't deliver sorted output).

Something that probably ought to be on the Open Items list for 8.1
is whether the cost estimation for bitmap vs plain indexscan is OK.
It's entirely likely that we need to do some tweaking to get the
planner to make the right choice.
        regards, tom lane

Re: TODO list comments

From

Michael Glaesemann

Date:

25 August 2005, 03:50:50

On Aug 25, 2005, at 10:58 AM, Tom Lane wrote:

>> * %Remove CREATE CONSTRAINT TRIGGER
>>
>>   This was used in older releases to dump referential integrity
>>   constraints.
>>
>
> Do we really want to remove it, and thereby guarantee we can't load
> dumps from those old releases?

Also, I believe CONSTRAINT TRIGGERS are the only way to provide  
transaction level (rather than statement level) referential  
integrity. I've used this in the past. The SQL command reference page  
mentions that it's not for general use, but it'd be a shame to remove  
it before there's an alternative way to provide transaction level  
referential integrity.

Michael Glaesemann
grzm myrealbox com

Re: TODO list comments

From

Hannu Krosing

Date:

25 August 2005, 04:12:58

On K, 2005-08-24 at 21:58 -0400, Tom Lane wrote:
> > * %Allow TRUNCATE ... CASCADE/RESTRICT
> 
> Huh?  What would that do?

Maybe this was meant truncating of tables with dependent foreign keys ?

AFAIR this was solved by allowing truncating several tables in one
command even if they have FK relationships between themselves.

> This is only partly done --- the 8.1 patch didn't cover all object types.
> 
> >     o %Disallow dropping of an inherited constraint
> > ...
> >     o %Prevent child tables from altering constraints like CHECK that were
> >       inherited from the parent table
> 
> These seem to be duplicates, or at least in need of merging.

It should probably mention about weird inheritance behaviour of "CREATE
CONSTRAINT ON ONLY tablename" - it is not propagated to existing child
tables, but is inherited when creating new ones.

Also, I don't think this should be done at all, at least not before we
have proper partitioned table support ready. I could live with it
creating a warning about not being future-compatible.

> >     o Handle references to temporary tables that are created, destroyed,
> >       then recreated during a session, and EXECUTE is not used
> > 
> >       This requires the cached PL/PgSQL byte code to be invalidated when
> >       an object referenced in the function is changed.
> 
> This is redundant with the Dependency Checking item about regenerating
> cached plans.

Or maybe not completely, depending on how you do it. 

If temp table itself is created inside the same pl/pgsql function, then
there could still be a way to do the planning/optimising only once and
then substitute temp table oids when running the function. 

The table structure in this case is quaranteed to be the same during
each run of the function, it's just that the temp table and index oids
should be treated as local variables.

Done this way, it gives real benefits in terms of cached query plans,
instead of just preventing newcomers from shooting themselves in foot by
not using EXECUTE.

> > * Improve speed with indexes
> > 
> >   For large table adjustements during vacuum, it is faster to reindex
> >   rather than update the index.
> 
> This applies only to VACUUM FULL, so it probably needs to be reworded.

In case we implement concurrent/non-blocking CREATE INDEX at some point,
this might be a good idea for lazy VACUUM as well.

And it may make more sense to do CLUSTER instead of VACUUM FULL in at
least some of these cases.

(btw. CLUSTER seems to be another function which my concurrent vacuuming
patch should be extended to cover, at least on "client" side, like
CREATE INDEX)

> > * Auto-vacuum
> > 
> >     o %Suggest VACUUM FULL if a table is nearly empty
> 
> It seems like a fairly bad idea for auto-vacuum to do a VACUUM FULL
> ever, given the locking effects.  And how is a background daemon going
> to "suggest" anything?  It could write to the postmaster log but it's
> entirely likely the user would never notice.

With current implementations of commands, doing (some equivalent of)
CLUSTER here seems a better idea than VACUUM FULL, as it also un-bloats
indexes. Not sure of of transactional behaviour though.

-- 
Hannu Krosing <hannu@skype.net>

Re: TODO list comments

From

Robert Treat

Date:

25 August 2005, 09:58:51

On Wed, 2005-08-24 at 21:58, Tom Lane wrote:
> >     o Add pg_dumpall custom format dumps.
> > 
> >       This is probably best done by combining pg_dump and pg_dumpall
> >       into a single binary.
> 
> This is probably obsoleted by events, too.  Now that we can dump blobs
> in text mode, I see no reason that we ever need to do this.
> pg_restore's only real reason to live is to support selective restore
> (ie, pulling out just a few objects from an existing dump) and I do not
> see that you need that for pg_dumpall dumps.
> 

Being able to restore just the database users without restoring all
databases?  (There are other ways that could be accomplished, like
adding user information to pg_dump, but it's one scenario anyway) 

Actually the argument that you would have to do both a pg_dumpall of the
cluster and a pg_dump of each database in order to obtain this
functionality seems so user unfriendly it seems like something to persue
on those grounds alone (imho).

Robert Treat
-- 
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL

Re: TODO list comments

From

"Greg Sabino Mullane"

Date:

25 August 2005, 10:53:34

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


Tom Lane asked:

>>       o Improve psql's handling of multi-line queries

> Uh, what's wrong with it?  This item seems far too vague.

I think perhaps this means adding multi-line support to
the tab-completion? Only thing I can think of, cause other
than that, multi-line queries work just fine.

- --
Greg Sabino Mullane greg@turnstep.com
PGP Key: 0x14964AC8 200508250952
https://www.biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-----BEGIN PGP SIGNATURE-----

iEYEARECAAYFAkMNzTAACgkQvJuQZxSWSshB8gCgvOU3rZi1uwFnwXO2zVz6KjUG
TUwAn3VoHGbqGkP1bRItMgVFE3vPQkkf
=rA0w
-----END PGP SIGNATURE-----

Re: TODO list comments

From

Matt Miller

Date:

25 August 2005, 11:29:11

On Thu, 2005-08-25 at 15:50 +0900, Michael Glaesemann wrote:
> >> * %Remove CREATE CONSTRAINT TRIGGER
> >>
> > Do we really want to remove it,
> 
> Also, I believe CONSTRAINT TRIGGERS are the only way to provide  
> transaction level (rather than statement level) referential  
> integrity.

Don't deferrable foreign keys give you transaction-level referential
integrity?  From the SET CONSTRAINTS doc:

Synopsis
SET CONSTRAINTS { ALL | name [, ...] } { DEFERRED | IMMEDIATE }
Description
SET CONSTRAINTS sets the behavior of constraint checking within the
current transaction. IMMEDIATE constraints are checked at the end of
each statement. DEFERRED constraints are not checked until transaction
commit.

Re: TODO list comments

From

Alvaro Herrera

Date:

25 August 2005, 11:49:41

On Thu, Aug 25, 2005 at 01:53:32PM -0000, Greg Sabino Mullane wrote:

> Tom Lane asked:
> 
> >>       o Improve psql's handling of multi-line queries
> 
> > Uh, what's wrong with it?  This item seems far too vague.
> 
> I think perhaps this means adding multi-line support to
> the tab-completion? Only thing I can think of, cause other
> than that, multi-line queries work just fine.

The saved history is also not cool about multiline queries.  If you
enter them interactively (or by pasting), they are entered as several
entries.  If you edit them with \e, they are entered as a single unit.

It would be also nice to have M-# to work well -- currently it inserts a
#, which works in bash but is obviously wrong in psql.

-- 
Alvaro Herrera (<alvherre[a]alvh.no-ip.org>)
"Los románticos son seres que mueren de deseos de vida"

Re: TODO list comments

From

Oliver Elphick

Date:

25 August 2005, 14:21:34

On Thu, 2005-08-25 at 13:53 +0000, Greg Sabino Mullane wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> 
> Tom Lane asked:
> 
> >>       o Improve psql's handling of multi-line queries
> 
> > Uh, what's wrong with it?  This item seems far too vague.

If you enter a multi-line query one line at a time, a subsequent
up-arrow will recover one line at a time; on the other hand, if you use
\e to edit a multi-line query, a subsequent up-arrow will recover the
whole query in one go.  The latter behaviour would be nice in all cases.


An item not in the TODO list yet -- would anyone support including this
feature in psql?:
It would be nice if multi-line items lined up with their proper column
on output.  This is what happens at the moment:

junk=# insert into xyz (name,address) values ('Joe Bloggs','1 Hindhead Villas,
junk'# Newport,
junk'# Gwent');
INSERT 230412518 1
junk=# select * from xyz;id |    name    |              address
----+------------+----------------------------------- 1 | Joe Bloggs | 1 Hindhead Villas,
Newport,
Gwent
(1 row)

If there is more than one potential source column, things are even
worse:

junk=# select * from xyz;id |    name    |              address              |             del_addr 
----+------------+-----------------------------------+---------------------------------- 1 | Joe Bloggs | 1 Hindhead
Villas,
Newport,
Gwent | 2 The Laurels,
Swinkley,
XX3 5CX
(1 row)

It would be better to show the columns aligned (perhaps without showing
separators for other columns so as not to give the impression that the
other columns contain null or empty strings):

junk=# select * from xyz;id |    name    |              address              |             del_addr 
----+------------+-----------------------------------+---------------------------------- 1 | Joe Bloggs | 1 Hindhead
Villas,               | 2 The Laurels,                | Newport,                          | Swinkley,                |
Gwent                            | XX3 5CX
 
(1 row)

\a would turn this behaviour off.


Oliver Elphick

Re: TODO list comments

From

Tom Lane

Date:

25 August 2005, 15:03:45

Oliver Elphick <olly@lfix.co.uk> writes:
> It would be better to show the columns aligned (perhaps without showing
> separators for other columns so as not to give the impression that the
> other columns contain null or empty strings):

> junk=# select * from xyz;
>  id |    name    |              address              |             del_addr 
> ----+------------+-----------------------------------+----------------------------------
>   1 | Joe Bloggs | 1 Hindhead Villas,                | 2 The Laurels,
>                  | Newport,                          | Swinkley,
>                  | Gwent                             | XX3 5CX
> (1 row)

I think the above is unacceptable because it looks indistinguishable
from a valid but quite different dataset.  (No, the "1 row" doesn't make
it better; as soon as there's more than one row you can't tell what you
have.  And leaving out the first | doesn't help if all the columns are
multiline.)

It might be OK without any separators on the added lines, though:
id |    name    |              address              |             del_addr 
----+------------+-----------------------------------+---------------------------------- 1 | Joe Bloggs | 1 Hindhead
Villas,               | 2 The Laurels,                  Newport,                            Swinkley,
Gwent                              XX3 5CX
 
(1 row)

Or perhaps use a different separator:

junk=# select * from xyz;id |    name    |              address              |             del_addr 
----+------------+-----------------------------------+---------------------------------- 1 | Joe Bloggs | 1 Hindhead
Villas,               | 2 The Laurels,   +            + Newport,                          + Swinkley,   +            +
Gwent                            + XX3 5CX
 
(1 row)

Not sure how hard this would be to program, or what sort of overhead it
might impose to check for the case.  My recollection is that psql's
table-layout code is pretty slow and ugly already ...
        regards, tom lane

Re: TODO list comments

From

Andrew Dunstan

Date:

25 August 2005, 15:52:44


Tom Lane wrote:

>Or perhaps use a different separator:
>
>junk=# select * from xyz;
> id |    name    |              address              |             del_addr 
>----+------------+-----------------------------------+----------------------------------
>  1 | Joe Bloggs | 1 Hindhead Villas,                | 2 The Laurels,
>    +            + Newport,                          + Swinkley,
>    +            + Gwent                             + XX3 5CX
>(1 row)
>  
>

That's a terrific idea, and, incidentally, just the sort of project that 
might well suit a beginning hacker, since the code is pretty isolated.

>Not sure how hard this would be to program, or what sort of overhead it
>might impose to check for the case.  My recollection is that psql's
>table-layout code is pretty slow and ugly already ...
>
>
>  
>

If people want speed they shouldn't use psql as a client anyway. I don't 
see this as much of an obstacle.

cheers

andrew

Re: TODO list comments

From

"Jim C. Nasby"

Date:

25 August 2005, 16:02:47

On Wed, Aug 24, 2005 at 09:58:04PM -0400, Tom Lane wrote:
> > * %Allow RULE recompilation
> 
> Eh?  Perhaps you meant "automatically regenerate cached plans when
> needed", in which case it's redundant with the Dependency Checking
> entries.  Whatever it means, this doesn't seem a particularly simple
> item.

Hrm... I read that as allowing CREATE OR REPLACE on rules, but of course
that already exists.

http://lnk.nu/search.postgresql.org/3mt.search

> > * %Allow TRUNCATE ... CASCADE/RESTRICT
> 
> Huh?  What would that do?

http://archives.postgresql.org/pgsql-hackers/2003-08/msg01045.php

> >     o %Add ALTER DOMAIN TYPE
> 
> To do what, exactly?  This is unclear.

http://archives.postgresql.org/pgsql-hackers/2004-05/msg00985.php

> >     o Remove unnecessary abstractions in pg_dump source code
> 
> Like which?

I *think* this is reffering to how pg_dump makes some assumptions about
what things are system objects.

http://archives.postgresql.org/pgsql-committers/2005-08/msg00203.php
doesn't help a heck of a lot...

Can we add an interface to the TODO list that contains search links back
to the mailing lists?
-- 
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software        http://pervasive.com        512-569-9461

Re: TODO list comments

From

Michael Glaesemann

Date:

25 August 2005, 19:43:47

On Aug 25, 2005, at 11:29 PM, Matt Miller wrote:

> On Thu, 2005-08-25 at 15:50 +0900, Michael Glaesemann wrote:
>
>
>>>> * %Remove CREATE CONSTRAINT TRIGGER
>>>>
>>>>
>>>>
>>> Do we really want to remove it,
>>>
>>>
>>
>> Also, I believe CONSTRAINT TRIGGERS are the only way to provide
>> transaction level (rather than statement level) referential
>> integrity.
>>
>>
>
> Don't deferrable foreign keys give you transaction-level referential
> integrity?  From the SET CONSTRAINTS doc:
>

Sorry, I misspoke. What I'm thinking of is not referential integrity  
in the sense of foreign keys, but assertions, which PostgreSQL does  
not yet support.

Say for example you have a table that contains time periods marked by  
a start_date and an end_date and you want there to be no gaps between  
the different time periods in the table for a given key. When doing  
updates, deletes, or inserts on this table, you'll need to check to  
make sure there are no gaps when the transaction is finished.  
However, there may indeed be gaps during the transaction as  
start_dates and end_dates are updated. Triggers can be written to  
enforce this kind of integrity, but they'll only work if they're  
deferrable.

Michael Glaesemann
grzm myrealbox com

Re: TODO list comments

From

Bruce Momjian

Date:

26 August 2005, 15:52:27

Great updates!  Let me comment on each one.

> I made a pass over the TODO list to see what was out of date.
> 
> > * Allow administrators to safely terminate individual sessions either
> >   via an SQL function or SIGTERM 
> > 
> >   Currently SIGTERM of a backend can lead to lock table corruption.
> 
> This comment may be out of date.  Suggest
> 
>     Lock table corruption following SIGTERM of an individual backend
>     has been reported in 8.0.  A possible cause is fixed in 8.1, but
>     it is unknown whether other trouble spots exist.  This item is
>     mainly a matter of doing adequate testing rather than of writing
>     any new code.

Done.

> 
> >     o Allow postgresql.conf values to be set so they can not be changed
> >       by the user
> 
> Is that really a good idea?  The ones that are unsafe are restricted already.

Well, a typical case would be log_statement, but I see that is
super-user now.  I guess we are OK, removed.  If we get more problems,
we can re-add something later.

> > * %Remove Money type, add money formatting for decimal type
> 
> There's a fair-size contingent that doesn't want Money removed
> completely, but just reimplemented as an I/O wrapper around type
> numeric.  Maybe that's even what you mean by the TODO item, but
> it's not clear.  Please at least mention the alternative.

Updated:
* Improve the MONEY data type  Change the MONEY data type to use DECIMAL internally, with special  locale-aware output
formatting.

> >     o %Allow MIN()/MAX() on arrays
> 
> This is done.

OK.

> >     o Modify array literal representation to handle array index lower bound
> >       of other than one
> 
> This too.

OK.

> 
> >     o Add security checking for large objects
> > 
> >       Currently large objects entries do not have owners. Permissions can
> >       only be set at the pg_largeobject table level.
> 
> This comment is wrong: trying to set the permissions on pg_largeobject
> would have no effect whatsoever on the lo_xxx functions, so there is not
> even a partial solution available now.

Oh, comment removed.

> >     o Auto-delete large objects when referencing row is deleted
> 
> This should note that contrib/lo already offers a solution.

Done.

> > * %Have views on temporary tables exist in the temporary namespace
> > * Allow temporary views on non-temporary tables
> 
> Both of these are done in 8.1.

OK.

> > * %Allow RULE recompilation
> 
> Eh?  Perhaps you meant "automatically regenerate cached plans when
> needed", in which case it's redundant with the Dependency Checking
> entries.  Whatever it means, this doesn't seem a particularly simple
> item.

Agreed, updated to:
* Allow VIEW/RULE recompilation when the underlying tables change

> > * %Allow TRUNCATE ... CASCADE/RESTRICT
> 
> Huh?  What would that do?

I assume it is just like DELETE CASCADE, but it TRUNCATES rather than
DELETE.  Description added.

> > * Make row-wise comparisons work per SQL spec
> 
> This could probably be marked as a % item.

Done.

> >     o Currently the system uses the operating system COPY command to
> >       create a new database. Add ON COMMIT capability to CREATE TABLE AS
> >       SELECT
> 
> This seems a bit garbled, and anyway the first part is done.

Yep, garbled.  I have removed the first part.

> >     o %Add ALTER DOMAIN TYPE
> 
> To do what, exactly?  This is unclear.

I assume it would allow the underlying data type to be changed.  Updated
text:
       o Add ALTER DOMAIN to modify the underlying data type

> >     o -Allow objects to be moved to different schemas
> 
> This is only partly done --- the 8.1 patch didn't cover all object types.

Updated to:
       o Add missing object types for ALTER ... SET SCHEMA


> >     o %Disallow dropping of an inherited constraint
> > ...
> >     o %Prevent child tables from altering constraints like CHECK that were
> >       inherited from the parent table
> 
> These seem to be duplicates, or at least in need of merging.

Merged and updated:
       o %Prevent child tables from altering or dropping constraints         like CHECK that were inherited from the
parenttable
 

> >     o Handle references to temporary tables that are created, destroyed,
> >       then recreated during a session, and EXECUTE is not used
> > 
> >       This requires the cached PL/PgSQL byte code to be invalidated when
> >       an object referenced in the function is changed.
> 
> This is redundant with the Dependency Checking item about regenerating
> cached plans.

Removed and description added to dependency item:* Track dependencies in function bodies and recompile/invalidate  This
isparticularly important for references to temporary tables  in PL/PgSQL because PL/PgSQL caches query plans.  The only
workaround in PL/PgSQL is to use EXECUTE.
 

> >     o Add table function support to pltcl, plperl, plpython?
> 
> Isn't this done for plperl?

Right, plperl removed.

> >     o Allow PL/pgSQL to name columns by ordinal position, e.g. rec.(3)
> 
> This doesn't seem like an amazingly good idea; would prefer to see a way
> to get the column name list and use names dynamically.  Numbers have all
> the same problems as "SELECT *" ...

Yep, updated:
    o Allow function argument names to be queries from PL/PgSQL


> >     o Add MOVE to PL/pgSQL
> 
> This should be generalized: upgrade plpgsql cursor support to have all
> the FETCH and MOVE options of the main language.
> 
> >     o Add support for polymorphic arguments and return types to plperl
> 
> I think all the PLs except plpgsql need this.

Updated to:
       o Add support for polymorphic arguments and return types to         languages other than PL/PgSQL

> Also, all the PLs except plpgsql are well behind the curve on supporting
> parameter names and OUT parameters.  Please add TODO item(s) for these.

OK:
       o Add support for OUT and INOUT parameters to languages other         than PL/PgSQL

> > * Allow libpq to access SQLSTATE so pg_ctl can test for connection failure
> > 
> >   This would be used for checking if the server is up.
> 
> Huh?  What has SQLSTATE got to do with connection failure checking?

I am confused too.  What we have now is this check in pg_ctl.c:
           (PQstatus(conn) == CONNECTION_OK ||            (strcmp(PQerrorMessage(conn),
PQnoPasswordSupplied)== 0)))
 

I assume it is to allow better checking of the password request, but I
am unsure.  I will remove the item unless someone else understands it.
Or is it this from libpq, PG_DIAG_SQLSTATE?  I think the underlying
problem is that pg_ctl doesn't have a good way to determine if the
server is running based on a failure to connect.  Anyway, item removed
unless we get more reports of problems.

> > * Have initdb set DateStyle based on locale?
> 
> Is this really a good idea?  Being standardized on ISO format seems like
> a good thing to me, and encouraging people to adopt ambiguous formats as
> default a very bad thing.  They can do it if they like, certainly, but
> having initdb do it for them just seems like not the direction we want.

True, but we still have to have a default of what do with with input
like 01/02/03.  TODO updated:
* Have initdb set the input DateStyle (MDY or DMY) based on locale?

> > * Add a schema option to createlang
> 
> This is superseded by events: createlang now puts the functions in
> pg_catalog, and there doesn't seem any particularly good reason to
> want to put them elsewhere.

OK, removed.

> >     o Improve psql's handling of multi-line queries
> 
> Uh, what's wrong with it?  This item seems far too vague.

Later emails in this thread clarify this and I will address it there.

> >     o Add pg_dumpall custom format dumps.
> > 
> >       This is probably best done by combining pg_dump and pg_dumpall
> >       into a single binary.
> 
> This is probably obsoleted by events, too.  Now that we can dump blobs
> in text mode, I see no reason that we ever need to do this.
> pg_restore's only real reason to live is to support selective restore
> (ie, pulling out just a few objects from an existing dump) and I do not
> see that you need that for pg_dumpall dumps.

Hmm, good points.  I added a question mark and removed the description.
If we add CSV output format to pg_dump, I can imagine pg_dumpall perhaps
optionally using that too.

> >     o Remove unnecessary abstractions in pg_dump source code
> 
> Like which?

Uh, we have talked about it.  It is the complex function pointer usage
in the code that needs cleaning.  Updated:
       o Remove unnecessary function pointer abstractions in pg_dump source         code

> > * %Remove CREATE CONSTRAINT TRIGGER
> > 
> >   This was used in older releases to dump referential integrity
> >   constraints.
> 
> Do we really want to remove it, and thereby guarantee we can't load
> dumps from those old releases?

OK, TODO item removed, and I saw later discussion that is still useful.

> > * Fetch heap pages matching index entries in sequential order
> > 
> >   Rather than randomly accessing heap pages based on index entries, mark
> >   heap pages needing access in a bitmap and do the lookups in sequential
> >   order. Another method would be to sort heap ctids matching the index
> >   before accessing the heap rows.
> 
> This is done (see bitmap index scans).

OK.

> > * Hash
> 
> Why doesn't the hash index section mention the need for WAL support?
> Multicolumn hash indexes might be interesting too.

Added:
       o Add WAL logging for crash recovery       o Allow multi-column hash indexes

> >     o Pack hash index buckets onto disk pages more efficiently
> > 
> >       Currently no only one hash bucket can be stored on a page. Ideally
> >       several hash buckets could be stored on a single page and greater
> >       granularity used for the hash algorithm.
> 
> I think that should read "Currently only one ..."

Thanks, fixed.

> > * Determine optimal fdatasync/fsync, O_SYNC/O_DSYNC options
> 
> This item should probably point out that the optimal settings are
> certainly platform-dependent.

Added, though they are not optimizer per-platform.  Rather we use the
best one we find, and assume the _best_ is the same on all platforms.

Updated:* Determine optimal fdatasync/fsync, O_SYNC/O_DSYNC options  Ideally this requires a separate test program that
canbe run  at initdb time or optionally later.
 

> > * Improve the background writer
> > 
> >   Allow the background writer to more efficiently write dirty buffers
> >   from the end of the LRU cache and use a clock sweep algorithm to
> >   write other dirty buffers to reduced checkpoint I/O
> 
> This is done.

OK.

> > * Improve speed with indexes
> > 
> >   For large table adjustements during vacuum, it is faster to reindex
> >   rather than update the index.
> 
> This applies only to VACUUM FULL, so it probably needs to be reworded.

Updated.

> > * Reduce lock time by moving tuples with read lock, then write
> >   lock and truncate table
> 
> Ditto.

Updated.

> > * Auto-vacuum
> > 
> >     o %Suggest VACUUM FULL if a table is nearly empty
> 
> It seems like a fairly bad idea for auto-vacuum to do a VACUUM FULL
> ever, given the locking effects.  And how is a background daemon going
> to "suggest" anything?  It could write to the postmaster log but it's
> entirely likely the user would never notice.

Well, something in the logs is better than nothing.  We already warn
about other server issues like too frequent checkpoints in the server
logs.

New TODO:
       o %Issue log message to suggest VACUUM FULL if a table is nearly         empty?

Added a question mark because this clearly needs more thought.

> > * -Improve SMP performance on i386 machines
> > 
> >   i386-based SMP machines can generate excessive context switching
> >   caused by lock failure in high concurrency situations. This may be
> >   caused by CPU cache line invalidation inefficiencies.
> 
> This isn't really done, I don't think, we just ameliorated what was the
> largest cause of it.  It'll be back...

Yep, we can re-add it later once we get a report.
> 
> > * Fix priority ordering of read and write light-weight locks (Neil)
> 
> I think we concluded that was a bad idea.
> 
> > * Add WAL index reliability improvement to non-btree indexes
> 
> Oh, here it is.  I'd be inclined to put this item in the index section
> instead, so that you can have separate entries for each index type.
> Besides, GIST is done.

Right, I think Hash is the only one left, and you suggested adding that
above.  Item removed.

> > * -Use CHECK constraints to influence optimizer decisions
> > 
> >   CHECK constraints contain information about the distribution of values
> >   within the table. This is also useful for implementing subtables where
> >   a tables content is distributed across several subtables.
> 
> This isn't completely done by any means.

Uh, isn't this constraint_elimination?  The only problem is that it
isn't automatic.  The remaining part seems useless now that the
optimizer statistics are pretty good.  OK?

Added:

* Allow constraint_elimination to be automatically performed
 This requires additional code to reduce the performance loss caused by constraint elimination.


> > * ANALYZE should record a pg_statistic entry for an all-NULL column
> 
> This is done.

OK

> > * Remove memory/file descriptor freeing before ereport(ERROR)
> > * Promote debug_query_string into a server-side function current_query()
> > * Allow the identifier length to be increased via a configure option
> 
> All of those could probably be marked %.

Done.

> > * Allow cross-compiling by generating the zic database on the target system
> > * Fix cross-compiling of time zone database via 'zic'
> 
> These look like duplicates to me ...


Yep, second one removed.

> >     o Improve dlerror() reporting string
> 
> I think this got done.

OK.

> >         o Add support for Unicode
> 
> This is done too, though not tested enough.

OK.

Great changes.  All applied.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073

Re: TODO list comments

From

Bruce Momjian

Date:

26 August 2005, 16:35:19

Alvaro Herrera wrote:
> On Thu, Aug 25, 2005 at 01:53:32PM -0000, Greg Sabino Mullane wrote:
> 
> > Tom Lane asked:
> > 
> > >>       o Improve psql's handling of multi-line queries
> > 
> > > Uh, what's wrong with it?  This item seems far too vague.
> > 
> > I think perhaps this means adding multi-line support to
> > the tab-completion? Only thing I can think of, cause other
> > than that, multi-line queries work just fine.
> 
> The saved history is also not cool about multiline queries.  If you
> enter them interactively (or by pasting), they are entered as several
> entries.  If you edit them with \e, they are entered as a single unit.

TODO updated:
       o Improve psql's handling of multi-line queries
         Currently, while \e saves a single query as one entry, interactive         queries are saved one line at a
time. Ideally all queries         whould be saved like \e does.
 

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073

Re: TODO list comments

From

Bruce Momjian

Date:

26 August 2005, 16:37:13

Hannu Krosing wrote:
> On K, 2005-08-24 at 21:58 -0400, Tom Lane wrote:
> > > * %Allow TRUNCATE ... CASCADE/RESTRICT
> > 
> > Huh?  What would that do?
> 
> Maybe this was meant truncating of tables with dependent foreign keys ?
> 
> AFAIR this was solved by allowing truncating several tables in one
> command even if they have FK relationships between themselves.

Yes, but I can imagine allowing a CASCADE behavior as well.

> > This is only partly done --- the 8.1 patch didn't cover all object types.
> > 
> > >     o %Disallow dropping of an inherited constraint
> > > ...
> > >     o %Prevent child tables from altering constraints like CHECK that were
> > >       inherited from the parent table
> > 
> > These seem to be duplicates, or at least in need of merging.
> 
> It should probably mention about weird inheritance behaviour of "CREATE
> CONSTRAINT ON ONLY tablename" - it is not propagated to existing child
> tables, but is inherited when creating new ones.

I am not sure on that one because the table does have the constraint at
the time the child is created.  Comments?

> Also, I don't think this should be done at all, at least not before we
> have proper partitioned table support ready. I could live with it
> creating a warning about not being future-compatible.

Right, TODO item removed.

> > >     o Handle references to temporary tables that are created, destroyed,
> > >       then recreated during a session, and EXECUTE is not used
> > > 
> > >       This requires the cached PL/PgSQL byte code to be invalidated when
> > >       an object referenced in the function is changed.
> > 
> > This is redundant with the Dependency Checking item about regenerating
> > cached plans.
> 
> Or maybe not completely, depending on how you do it. 

Well, I beefed up the item:* Track dependencies in function bodies and recompile/invalidate  This is particularly
importantfor references to temporary tables  in PL/PgSQL because PL/PgSQL caches query plans.  The only workaround  in
PL/PgSQLis to use EXECUTE.
 

> If temp table itself is created inside the same pl/pgsql function, then
> there could still be a way to do the planning/optimising only once and
> then substitute temp table oids when running the function. 
> 
> The table structure in this case is quaranteed to be the same during
> each run of the function, it's just that the temp table and index oids
> should be treated as local variables.

Interesting approach but is it worth the added complexity?  One issue
this does bring up is that functions themselves might invalidate their
own cached query plan by dropping a table and receating it.  In those
cases, your solution would be the only valid one, or throw an error.

I added some more text:* Track dependencies in function bodies and recompile/invalidate  This is particularly important
forreferences to temporary tables  in PL/PgSQL because PL/PgSQL caches query plans.  The only workaround  in PL/PgSQL
isto use EXECUTE.  One complexity is that a function  might itself drop and recreate dependent tables, causing it to
invalidateits own query plan.
 

> Done this way, it gives real benefits in terms of cached query plans,
> instead of just preventing newcomers from shooting themselves in foot by
> not using EXECUTE.
> 
> > > * Improve speed with indexes
> > > 
> > >   For large table adjustements during vacuum, it is faster to reindex
> > >   rather than update the index.
> > 
> > This applies only to VACUUM FULL, so it probably needs to be reworded.
> 
> In case we implement concurrent/non-blocking CREATE INDEX at some point,
> this might be a good idea for lazy VACUUM as well.

Perhaps.

> And it may make more sense to do CLUSTER instead of VACUUM FULL in at
> least some of these cases.

Cluster modifies the heap while reindex does not.  This makes cluster a
much heavier operation.

> (btw. CLUSTER seems to be another function which my concurrent vacuuming
> patch should be extended to cover, at least on "client" side, like
> CREATE INDEX)

Not sure.

> > > * Auto-vacuum
> > > 
> > >     o %Suggest VACUUM FULL if a table is nearly empty
> > 
> > It seems like a fairly bad idea for auto-vacuum to do a VACUUM FULL
> > ever, given the locking effects.  And how is a background daemon going
> > to "suggest" anything?  It could write to the postmaster log but it's
> > entirely likely the user would never notice.
> 
> With current implementations of commands, doing (some equivalent of)
> CLUSTER here seems a better idea than VACUUM FULL, as it also un-bloats
> indexes. Not sure of of transactional behaviour though.

Not sure, CLUSTER is still heavier.  That doesn't mean it shouldn't be
used, but the administrator should automatically consider CLUSTER in
place of VACUUM FULL for large updates.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073

Re: TODO list comments

From

Bruce Momjian

Date:

26 August 2005, 16:52:15

Jim C. Nasby wrote:
> I *think* this is reffering to how pg_dump makes some assumptions about
> what things are system objects.
> 
> http://archives.postgresql.org/pgsql-committers/2005-08/msg00203.php
> doesn't help a heck of a lot...
> 
> Can we add an interface to the TODO list that contains search links back
> to the mailing lists?

Yes, that would be nice, though some times the threads are pretty long
and I try to digest the agreed-upon solution.  Where would we put the
URLs?  In the TODO file?

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073

Re: TODO list comments

From

Bruce Momjian

Date:

26 August 2005, 16:57:47

Andrew Dunstan wrote:
> 
> 
> Tom Lane wrote:
> 
> >Or perhaps use a different separator:
> >
> >junk=# select * from xyz;
> > id |    name    |              address              |             del_addr 
> >----+------------+-----------------------------------+----------------------------------
> >  1 | Joe Bloggs | 1 Hindhead Villas,                | 2 The Laurels,
> >    +            + Newport,                          + Swinkley,
> >    +            + Gwent                             + XX3 5CX
> >(1 row)
> >  
> >
> 
> That's a terrific idea, and, incidentally, just the sort of project that 
> might well suit a beginning hacker, since the code is pretty isolated.
> 
> >Not sure how hard this would be to program, or what sort of overhead it
> >might impose to check for the case.  My recollection is that psql's
> >table-layout code is pretty slow and ugly already ...
> >
> >
> >  
> >
> 
> If people want speed they shouldn't use psql as a client anyway. I don't 
> see this as much of an obstacle.

Added to TODO:
       o Allow multi-line column values to align in the proper columns
         If the second output column value is 'a\nb', the 'b' should appear         in the second display column,
ratherthan the first column as it         does now.
 

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073

Re: TODO list comments

From

"Jim C. Nasby"

Date:

26 August 2005, 19:15:28

On Fri, Aug 26, 2005 at 03:44:18PM -0400, Bruce Momjian wrote:
> Jim C. Nasby wrote:
> > I *think* this is reffering to how pg_dump makes some assumptions about
> > what things are system objects.
> > 
> > http://archives.postgresql.org/pgsql-committers/2005-08/msg00203.php
> > doesn't help a heck of a lot...
> > 
> > Can we add an interface to the TODO list that contains search links back
> > to the mailing lists?
> 
> Yes, that would be nice, though some times the threads are pretty long
> and I try to digest the agreed-upon solution.  Where would we put the
> URLs?  In the TODO file?

Yeah, the digestification is good, and I hope it continues. But it's
also good to be able to refer back to the original thread in it's
entirety. My thought was to make the TODO item itself a link to the
search (or ideally the thread itself). The advantage of just linking to
the search is that would allow a clever CGI to just parse through the
TODO and linkify the TODO items.
-- 
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software        http://pervasive.com        512-569-9461