Thread: Current TODO list

Current TODO list

From
Bruce Momjian
Date:
Here is the current TODO list.  This list is independent of the Open
Items list.  Items not fixed by 6.5 final are moved to the TODO list.

I would like to know which items have been fixed already from this list.
Items I know are fixed are marked with a dash.  Are there more?  Tom,
can you identify any of the array items as fixed?  Should we assume they
are all fixed unless someone reports them broken?

Jan, are there any rewrite fixes already done that I can mark?

--------------------------------------------------------------------------

TODO list for PostgreSQL
========================
Last updated:        Sun May  9 21:06:49 EDT 1999

Current maintainer:    Bruce Momjian (maillist@candle.pha.pa.us)

The most recent version of this document can be viewed at
the PostgreSQL WWW site, http://www.postgreSQL.org.

A dash (-) marks changes to be in the next release.

Developers who have claimed items are:
--------------------------------------
* Billy is Billy G. Allie <Bill.Allie@mug.org>
* Brook is Brook Milligan <brook@trillium.NMSU.Edu>
* Bruce is Bruce Momjian <maillist@candle.pha.pa.us>
* Bryan is Bryan Henderson <bryanh@giraffe.netgate.net>
* D'Arcy is D'Arcy J.M. Cain <darcy@druid.net>
* Dan is Dan McGuirk <mcguirk@indirect.com>
* Darren is Darren King <darrenk@insightdist.com>
* David is David Hartwig <daveh@insightdist.com>
* Edmund is Edmund Mergl <E.Mergl@bawue.de>
* Goran is Goran Thyni <goran@kyla.kiruna.se>
* Henry is Henry B. Hotz <hotz@jpl.nasa.gov>
* Jan is Jan Wieck <wieck@sapserv.debis.de>
* Jun is Jun Kuwamura <juk@rccm.co.jp>
* Maarten is Maarten Boekhold <maartenb@dutepp0.et.tudelft.nl>
* Marc is Marc Fournier <scrappy@hub.org>
* Martin is Martin S. Utesch <utesch@aut.tu-freiberg.de>
* Massimo Dal Zotto <dz@cs.unitn.it>
* Michael is Michael Meskes <meskes@debian.org>
* Oleg is Oleg Bartunov <oleg@sai.msu.su>
* Paul is Paul M. Aoki <aoki@CS.Berkeley.EDU>
* Peter is Peter T Mount <peter@retep.org.uk>
* Phil is Phil Thompson <phil@river-bank.demon.co.uk>
* Ryan is Ryan Bradetich <rbrad@hpb50023.boi.hp.com>
* Soo-Ho Ok <shok@detc.dongeui-tc.ac.kr>
* Stefan Simkovics <ssimkovi@rainbow.studorg.tuwien.ac.at>
* Sven is Sven Verdoolaege <skimo@breughel.ufsia.ac.be>
* Tatsuo is Tatsuo Ishii <t-ishii@sra.co.jp>
* Tom is Tom Lane <tgl@sss.pgh.pa.us>
* Thomas is Thomas Lockhart <tgl@mythos.jpl.nasa.gov>
* TomH is Tom I Helbekkmo <tih@Hamartun.Priv.NO>
* Vadim is "Vadim B. Mikheev" <vadim@krs.ru>
 

RELIABILITY
-----------
* Overhaul mdmgr/smgr to fix double unlinking and double opens, cleanup
* Overhaul bufmgr/lockmgr/transaction manager
* Remove EXTEND?
* Can lo_export()/lo_import() read/write anywhere, causing a security problem?
* Tables that start with xinv are confused with large objects
* Two and three dimensional arrays display improperly, missing {}
* -GROUP BY in INSERT INTO table SELECT * FROM table2 fails(Jan)
* SELECT a[1] FROM test fails, it needs test.a[1]
* UPDATE table SET table.value = 3 fails
* User who can create databases can modify pg_database table
* elog() does not free all its memory(Jan)
* views on subselects fail
* disallow inherited columns with the same name as new columns
* recover or force failure when disk space is exhausted
* allow UPDATE using aggregate to affect all rows, not just one
* -computations in views fail (Jan): create view test as select usesysid * usesysid from pg_shadow;
* views containing aggregates sometimes fail(Jan)
* ALTER TABLE ADD COLUMN does not honor DEFAULT, add CONSTRAINT
* -fix memory leak in aborted transactions(Tom)
* array index references without table name cause problems
* aggregates on array indexes crash backend
* -subqueries containing HAVING return incorrect results(Stephan)
* -DEFAULT handles single quotes in value by requiring too many quotes
* -make CURSOR valid even after you hit end of cursor
* views with spaces in view name fail when referenced
* plpgsql does not handle quoted mixed-case identifiers
* do not allow bpchar column creation without length
* select t[1] from foo fails, select count(foo.t[1]) from foo crashes

ENHANCEMENTS
------------
* -Replace table-level locking with row or page-level locking(Vadim)
* Transaction log, so re-do log can be on a separate disk
* Allow transaction commits with rollback with no-fsync performance
* More access control over who can create tables and access the database
* Add full ANSI SQL capabilities
    * add OUTER joins, left and right (Thomas)
    * -add INTERSECTS, SUBTRACTS(Stephan)
    * -add temporary tables(Bruce)
    * add sql3 recursive unions
    * add the concept of dataspaces
    * add BIT, BIT VARYING
    * NCHAR (as distinguished from ordinary varchar)
    * DOMAIN capability
* Allow compression of large fields or a compressed field type
* -Fix the rules system(Jan)
* Large objects
    * Fix large object mapping scheme, own typeid or reltype(Peter)
    * Allow large text type to use large objects(Peter)
    * not to stuff everything as files in a single directory
    * -delete orphaned large objects(Peter)
* Better interface for adding to pg_group
* -Make MONEY/DECIMAL have a defined precision(Jan)
* -Fix tables >2G, or report error when 2G size reached(Peter)(fix lseek()/off_t, mdextend()/RELSEG_SIZE)
* allow row re-use without vacuum, maybe?(Vadim)
* Populate backend status area and write program to dump status data
* Add ALTER TABLE DROP/ALTER COLUMN feature
* Add syslog functionality(Marc)
* Add STDDEV()/VARIANCE() functions for standard deviation/variance computation
* add UNIQUE capability to non-btree indexes
* certain indexes will not shrink, i.e. oid indexes with many inserts
* make NULL's come out at the beginning or end depending on the ORDER BY direction
* change the library/backend interface to use network byte order
* Restore unused oid's on backend exit if no one else has gotten oids
* have UPDATE/DELETE clean out indexes
* allow WHERE restriction on ctid
* allow pg_descriptions when creating types, tables, columns, and functions
* Fix compile and security of Kerberos/GSSAPI code
* Allow psql to print nulls as distinct from ""(?)
* Allow INSERT INTO ... SELECT ... FROM view to work
* Make VACUUM on database not lock pg_class
* Make VACUUM ANALYZE only use a readlock
* Allow cursors to be DECLAREd/OPENed/CLOSEed outside transactions
* -Allow installation data block size and max tuple size configuration(Darren)
* -Allow views on a UNION
* -Allow DISTINCT on view
* Allow views of aggregate columns
* -Allow variable block sizes(Darren)
* -System tables are now more update-able from SQL(Jan)
* Allow flag to control COPY input/output of NULLs
* Allow CLUSTER on all tables at once, and improve CLUSTER
* -Add ELOG_TIMESTAMPS to elog()
* -Allow max tuple length to be changed
* Have psql with no database name not connect to username as default(?)
* Allow subqueries in target list
* Allow queries across multiple databases
* Add replication of distributed databases
* Allow table destruction/alter to be rolled back
* Generate error on CREATE OPERATOR of ~~, ~, and ~*
* Allow constraint NULL just as we honor NOT NULL
* -Add version number in startup banners for psql and postmaster
* Restructure storing of GRANT permission information to allow +-=
* allow psql \copy to allow delimiters
* allow international error message support and add error codes
* -allow usernames with dashes(GRANT fails)
* add a function to return the last inserted oid, for use in psql scripts
* allow creation of functional indexes to use default types
* put sort files, large objects in their own directory
* do autocommit so always in a transaction block
* add SIMILAR TO to allow character classes, 'pg_[a-c]%'
* -multi-version concurrency control(Vadim)
* improve reporting of syntax errors by showing location of error in query
* allow chaining of pages to allow >8k tuples
* -remove un-needed conversion functions where appropriate
* redesign the function call interface to handle NULLs better(Jan)
* permissions on indexes - prevent them?
* -allow multiple generic operators in expressions without the use of parentheses
* document/trigger/rule so changes to pg_shadow create pg_pwd
* generate postmaster pid file and remove flock/fcntl lock code
* -improve PRIMARY KEY handling(D'Arcy)
* add ability to specify location of lock/socket files
* -psql \d on index with char()/varchar() fields shows improper length
* -disallow LOCK outside a transaction, change message to LOCK instead of DELETE
* Fix roundoff problems in "cash" datatype
* -fix any sprintf() overruns(Tatsuo)
* -add portable vsnprintf()(Tatsuo)
* auto-destroy sequence on SERIAL removal
* CREATE TABLE inside aborted transaction causes stray table file
* allow user to define char1 column
* -have psql \d on a view show the query
* allow LOCK TABLE tab1, tab2, tab3 so all tables locked in unison
* allow INSERT/UPDATE of system-generated oid value for a row
* missing optimizer selectivities for date, etc.

PERFORMANCE
-----------
* Use indexes in ORDER BY for restrictive data sets, min(), max()
* Allow LIMIT ability on single-table queries that have no ORDER BY to use a matching index
* Pull requested data directly from indexes, bypassing heap data
* -Prevent psort() usage when query already using index matching ORDER BY(Jan)
* -Fix bushy-plans(Bruce)
* Prevent fsync in SELECT-only queries
* Cache most recent query plan(s?)
* Shared catalog cache, reduce lseek()'s by caching table size in shared area
* Allow compression of log and meta data
* Add FILLFACTOR to index creation
* update pg_statistic table to remove operator column
* make index creation use psort code, because it is now faster(Vadim)
* Allow char() not to use variable-sized header to reduce disk size
* Do async I/O to do better read-ahead of data
* -Fix optimizer problem with self-table joins
* Fix memory exhaustion when using many OR's
* -Use spin locks only on multi-CPU systems, yield CPU instead
* Get faster regex() code from Henry Spencer <henry@zoo.utoronto.ca> when it is available
* use mmap() rather than SYSV shared memory(?)
* use index to restrict rows returned by multi-key index when used with non-consecutive keys or OR clauses, so fewer heap accesses
* use index with constants on functions

DOCUMENTATION
-------------
* Add use of 'const' for variables in source tree

--
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026


Re: [HACKERS] Current TODO list

From
Peter T Mount
Date:
On Mon, 17 May 1999, Bruce Momjian wrote:

> ENHANCEMENTS
> ------------
> * Large objects
>     * Fix large object mapping scheme, own typeid or reltype(Peter)
>     * Allow large text type to use large objects(Peter)
>     * not to stuff everything as files in a single directory

Hopefully when my workload eases (mid june ish) I'll be able to tackle
these again in earnest.

>     * -delete orphaned large objects(Peter)

As I missed the 6.5beta deadline, I put the solution into contrib/lo

> * -Fix tables >2G, or report error when 2G size reached(Peter)
>     (fix lseek()/off_t, mdextend()/RELSEG_SIZE)

This was done (twice if I remember). The tables now split at 1G. This
opened a new problem that vacuum can't handle segmented tables. I have the
general idea of how to fix this, but again it's time that's the problem.

peter

--
  Peter T Mount peter@retep.org.uk
  Main Homepage: http://www.retep.org.uk
  PostgreSQL JDBC FAQ: http://www.retep.org.uk/postgres
  Java PDF Generator: http://www.retep.org.uk/pdf



Re: [HACKERS] Current TODO list

From
Thomas Lockhart
Date:
Peter, I've seen some changes to preproc.y. Was this to sync back up
with the recent changes in gram.y for the lock table and set
transaction stuff?
                      - Thomas

-- 
Thomas Lockhart                lockhart@alumni.caltech.edu
South Pasadena, California


Re: [HACKERS] Current TODO list

From
Tom Lane
Date:
Bruce Momjian <maillist@candle.pha.pa.us> writes:
> Tom, can you identify any of the array items as fixed?  Should we
> assume they are all fixed unless someone reports them broken?

No, that would be unduly optimistic :-(.  I have fixed one or two
array-related bugs, but I haven't made a serious push on it; several
of the test cases that are in my to-do list still fail.

> * aggregates on array indexes crash backend

I believe I have fixed that one, at least.

> * select t[1] from foo fails, select count(foo.t[1]) from foo crashes

This item is a duplicate: the first part refers to the same thing as
> * array index references without table name cause problems
(which is as yet unfixed) and the second refers to the aggregate problem.

> * change the library/backend interface to use network byte order

Is there something I'm missing?  This has been true for a long while...

> * -Allow DISTINCT on view

I think this is not done...
        regards, tom lane


Re: [HACKERS] Current TODO list

From
Tom Lane
Date:
Peter T Mount <peter@retep.org.uk> writes:
> This was done (twice if I remember). The tables now split at 1G. This
> opened a new problem that vacuum can't handle segmented tables. I have the
> general idea of how to fix this, but again it's time that's the problem.

Ole Gjerde <gjerde@icebox.org> just contributed a patch for the vacuum
problem.  Perhaps you at least have time to check his patch?
        regards, tom lane


Re: [HACKERS] Current TODO list

From
Bruce Momjian
Date:
> Bruce Momjian <maillist@candle.pha.pa.us> writes:
> > Tom, can you identify any of the array items as fixed?  Should we
> > assume they are all fixed unless someone reports them broken?
> 
> No, that would be unduly optimistic :-(.  I have fixed one or two
> array-related bugs, but I haven't made a serious push on it; several
> of the test cases that are in my to-do list still fail.
> 
> > * aggregates on array indexes crash backend
> 
> I believe I have fixed that one, at least.

OK.

> > * select t[1] from foo fails, select count(foo.t[1]) from foo crashes
> 
> This item is a duplicate: the first part refers to the same thing as
> > * array index references without table name cause problems
> (which is as yet unfixed) and the second refers to the aggregate problem.

OK.

> > * change the library/backend interface to use network byte order
> 
> Is there something I'm missing?  This has been true for a long while...

People have mentioned we should make the change, but it will require a
new protocol, so it hasn't moved from the list.

> > * -Allow DISTINCT on view
> 
> I think this is not done...

Yes, I think he just added an error message to warn people.

Updated.



Re: [HACKERS] Current TODO list

From
Tom Lane
Date:
Bruce Momjian <maillist@candle.pha.pa.us> writes:
>>>> * change the library/backend interface to use network byte order
>> 
>> Is there something I'm missing?  This has been true for a long while...
>
> People have mentioned we should make the change, but it will require a
> new protocol, so it hasn't moved from the list.

But my point is the current protocol *already* uses network byte order.

IIRC the old "version 0" protocol did not, but that's ancient history.
Either this complaint is long obsolete, or I don't understand what's
being asked for.
        regards, tom lane


Re: [HACKERS] Current TODO list

From
Bruce Momjian
Date:
> Bruce Momjian <maillist@candle.pha.pa.us> writes:
> >>>> * change the library/backend interface to use network byte order
> >> 
> >> Is there something I'm missing?  This has been true for a long while...
> >
> > People have mentioned we should make the change, but it will require a
> > new protocol, so it hasn't moved from the list.
> 
> But my point is the current protocol *already* uses network byte order.

Oh.  Item removed.

> IIRC the old "version 0" protocol did not, but that's ancient history.
> Either this complaint is long obsolete, or I don't understand what's
> being asked for.

I think you are right.  Someone added code to serve both orders based on
the version of the client, I think, and newer clients use the proper
order.




Re: [HACKERS] Current TODO list

From
Michael Meskes
Date:
On Tue, May 18, 1999 at 01:48:07PM +0000, Thomas Lockhart wrote:
> Peter, I've seen some changes to preproc.y. Was this to sync back up

It was me who changed preproc.y

> with the recent changes in gram.y for the lock table and set
> transaction stuff?

Yes. It was just to get the two back in sync.

Now we only need to get rid of that shift/reduce problem.

Michael
-- 
Michael Meskes                         | Go SF 49ers!
Th.-Heuss-Str. 61, D-41812 Erkelenz    | Go Rhein Fire!
Tel.: (+49) 2431/72651                 | Use Debian GNU/Linux!
Email: Michael.Meskes@gmx.net          | Use PostgreSQL!


Re: [HACKERS] Current TODO list

From
Peter T Mount
Date:
On Tue, 18 May 1999, Tom Lane wrote:

> Peter T Mount <peter@retep.org.uk> writes:
> > This was done (twice if I remember). The tables now split at 1G. This
> > opened a new problem that vacuum can't handle segmented tables. I have the
> > general idea of how to fix this, but again it's time that's the problem.
> 
> Ole Gjerde <gjerde@icebox.org> just contributed a patch for the vacuum
> problem.  Perhaps you at least have time to check his patch?

Will do.

Peter




Re: [HACKERS] Current TODO list

From
Thomas Lockhart
Date:
> It was me who changed preproc.y

Oops. Right. Sorry...

> Yes. It was just to get the two back in sync.

Thanks.

> Now we only need to get rid of that shift/reduce problem.

Yes. I'm worried about it, since there are at least two places which
were modified which are leading to shift/reduce conflicts *or* which
were disabled to remove shift/reduce conflicts.
                   - Thomas

-- 
Thomas Lockhart                lockhart@alumni.caltech.edu
South Pasadena, California


RE: [HACKERS] Current TODO list

From
"Hiroshi Inoue"
Date:

> -----Original Message-----
> From: owner-pgsql-hackers@postgreSQL.org
> [mailto:owner-pgsql-hackers@postgreSQL.org]On Behalf Of Tom Lane
> Sent: Tuesday, May 18, 1999 11:41 PM
> To: Peter T Mount
> Cc: Bruce Momjian; PostgreSQL-development
> Subject: Re: [HACKERS] Current TODO list
>
>
> Peter T Mount <peter@retep.org.uk> writes:
> > This was done (twice if I remember). The tables now split at 1G. This
> > opened a new problem that vacuum can't handle segmented tables.
> I have the
> > general idea of how to fix this, but again it's time that's the problem.
>
> Ole Gjerde <gjerde@icebox.org> just contributed a patch for the vacuum
> problem.  Perhaps you at least have time to check his patch?
>

I wonder why no one but me objects to the patch.
It may cause serious problems.
I think it needs more checks and tests.

Thanks.

Hiroshi Inoue
Inoue@tpf.co.jp



Re: [HACKERS] Current TODO list

From
Michael Meskes
Date:
On Wed, May 19, 1999 at 04:12:18AM +0000, Thomas Lockhart wrote:
> Yes. I'm worried about it, since there are at least two places which
> were modified which are leading to shift/reduce conflicts *or* which
> were disabled to remove shift/reduce conflicts.

Yes, I wanted to dig into it but didn't find the time yet.

Michael
-- 
Michael Meskes                         | Go SF 49ers!
Th.-Heuss-Str. 61, D-41812 Erkelenz    | Go Rhein Fire!
Tel.: (+49) 2431/72651                 | Use Debian GNU/Linux!
Email: Michael.Meskes@gmx.net          | Use PostgreSQL!


Re: [HACKERS] Current TODO list

From
Thomas Lockhart
Date:
> * Thomas is Thomas Lockhart <tgl@mythos.jpl.nasa.gov>

Can you change this to my home address (lockhart@alumni.caltech.edu)?

TIA
                    - Thomas

-- 
Thomas Lockhart                lockhart@alumni.caltech.edu
South Pasadena, California


RE: [HACKERS] Current TODO list

From
The Hermit Hacker
Date:
On Wed, 19 May 1999, Hiroshi Inoue wrote:

> 
> 
> > -----Original Message-----
> > From: owner-pgsql-hackers@postgreSQL.org
> > [mailto:owner-pgsql-hackers@postgreSQL.org]On Behalf Of Tom Lane
> > Sent: Tuesday, May 18, 1999 11:41 PM
> > To: Peter T Mount
> > Cc: Bruce Momjian; PostgreSQL-development
> > Subject: Re: [HACKERS] Current TODO list
> >
> >
> > Peter T Mount <peter@retep.org.uk> writes:
> > > This was done (twice if I remember). The tables now split at 1G. This
> > > opened a new problem that vacuum can't handle segmented tables.
> > I have the
> > > general idea of how to fix this, but again it's time that's the problem.
> >
> > Ole Gjerde <gjerde@icebox.org> just contributed a patch for the vacuum
> > problem.  Perhaps you at least have time to check his patch?
> >
> 
> I wonder that no one but me object to the patch.
> It may cause serious results.

How?  Why?  In what way?  Details?

Marc G. Fournier                   ICQ#7615664               IRC Nick: Scrappy
Systems Administrator @ hub.org 
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org 



RE: [HACKERS] Current TODO list

From
"Hiroshi Inoue"
Date:

> -----Original Message-----
> From: The Hermit Hacker [mailto:scrappy@hub.org]
> Sent: Thursday, May 20, 1999 7:59 PM
> To: Hiroshi Inoue
> Cc: Tom Lane; Peter T Mount; Bruce Momjian; PostgreSQL-development
> Subject: RE: [HACKERS] Current TODO list
>
>
> On Wed, 19 May 1999, Hiroshi Inoue wrote:
>
> >
> >
> > > -----Original Message-----
> > > From: owner-pgsql-hackers@postgreSQL.org
> > > [mailto:owner-pgsql-hackers@postgreSQL.org]On Behalf Of Tom Lane
> > > Sent: Tuesday, May 18, 1999 11:41 PM
> > > To: Peter T Mount
> > > Cc: Bruce Momjian; PostgreSQL-development
> > > Subject: Re: [HACKERS] Current TODO list
> > >
> > >
> > > Peter T Mount <peter@retep.org.uk> writes:
> > > > This was done (twice if I remember). The tables now split
> at 1G. This
> > > > opened a new problem that vacuum can't handle segmented tables.
> > > I have the
> > > > general idea of how to fix this, but again it's time that's
> the problem.
> > >
> > > Ole Gjerde <gjerde@icebox.org> just contributed a patch for the vacuum
> > > problem.  Perhaps you at least have time to check his patch?
> > >
> >
> > I wonder that no one but me object to the patch.
> > It may cause serious results.
>
> How?  Why?  In what way?  Details?
>

I don't have tables > 1G.
So I won't be damaged by the patch.

But I don't understand what Beta is for.
Why isn't such a dangerous function checked and tested
carefully?

For example, the following code is not changed by the patch.

        if (FileTruncate(v->mdfd_vfd, nblocks * BLCKSZ) < 0)
                return -1;

It never truncates segmented files, and there may be cases where the
original file increases in size (ftruncate() increases the size of the
target file if the requested size is larger than the actual size).
This is not checked and tested, and once it occurs I don't know
what will happen.

But my main worry is the use of unlink() (FileNameUnlink()).

unlink() is very dangerous.
unlink() never removes the target file immediately, and even the
truncating process doesn't close the files under this patch, so the
unlinked files are still alive for all processes which have already
opened them.
Who has checked and tested this influence carefully?

I think it's not so easy to implement and check mdtruncate().

Thanks.

Hiroshi Inoue
Inoue@tpf.co.jp



Re: [HACKERS] Current TODO list

From
Bruce Momjian
Date:
> > > I wonder that no one but me object to the patch.
> > > It may cause serious results.
> >
> > How?  Why?  In what way?  Details?
> >
> 
> I don't have tables > 1G.
> So I won't be damaged by the patch.
> 
> But I don't understand what Beta is.
> Why isn't such a dangerous fucntion checked and tested
> carefully ?
> 
> For example,the following code is not changed by the patch.
> 
>         if (FileTruncate(v->mdfd_vfd, nblocks * BLCKSZ) < 0)
>                 return -1;
> 
> It never truncate segmented files and there may be cases the
> original file increases its size(ftruncate() increases the size of
> target file if the requested size is longer than the actual size).
> It's not checked and tested and once it occurs I don't know
> what will happen.
> 
> But my anxiety is the use of unlink()(FileNameUnlink()).
> 
> Unlink() is very dangerous.
> Unlink() never remove the target file immediately.and even the
> truncating process doesn't close the files by the patch and so
> unlinked files are still alive for all processes which have already
> opened the files.
> Who checked and tested the influence carefully ?
> 
> I think it's not so easy to implement and check mdtruncate().

OK, I see what you are saying, but the multi-segment problem is on our
list to fix.  Does this put non-multi-segment cases at risk?  If not, then
let's keep it and continue improving the multi-segment handling,
because it was pretty bad before, and we need it fixed.



Re: [HACKERS] Current TODO list

From
Ole Gjerde
Date:
On Thu, 20 May 1999, Bruce Momjian wrote:
> > For example,the following code is not changed by the patch.
> >         if (FileTruncate(v->mdfd_vfd, nblocks * BLCKSZ) < 0)
> >                 return -1;
> > It never truncate segmented files and there may be cases the
> > original file increases its size(ftruncate() increases the size of
> > target file if the requested size is longer than the actual size).

I agree.  I have rewritten my patch, but I need to test it some more.

> > But my anxiety is the use of unlink()(FileNameUnlink()).
> > Unlink() is very dangerous.
> > Unlink() never remove the target file immediately.and even the
> > truncating process doesn't close the files by the patch and so
> > unlinked files are still alive for all processes which have already
> > opened the files.

I don't think unlink() is a problem.  That other backends have the files
open shouldn't matter.  Whenever they close them (which should be pretty
quick), the files will be removed.

I'll try to get the patch out later today.

On another note, I've had some other problems with vacuuming my databases.
(All before patch :)
Sometimes the backend would crash while doing a vacuum analyze.  It would
do this repeatedly if I ran it again.  Then if I ran a regular vacuum, and
then again a vacuum analyze it would work fine.  Very weird...

Now I have a bit of a bigger problem.  I just did a pg_upgrade to a newer
CVS version.  Most of my tables seem fine and vacuum worked fine on most
of them.
But on the only 2 tables that I have changed lately I'm getting vacuum
"errors".  Both tables are very small(shotgun table file is 1.4MB).
If I keep running vacuum(over and over) the number of deleted tuples will
eventually go to 0 and it will look normal.  It does take a few vacuum
runs however, so something really weird is going on here.

shotgun=> vacuum verbose analyze shotgun;
NOTICE:  --Relation shotgun--
NOTICE:  Pages 334: Changed 0, Reapped 5, Empty 0, New 0; Tup 22414: Vac
3, Keep/VTL 11708/10895, Crash 0, UnUsed 49, MinLen 64, MaxLen 159;
Re-using: Free/Avail. Space 6556/492; EndEmpty/Avail. Pages 0/3. Elapsed
0/0 sec.
NOTICE:  Index shotgun_index_keyword: Pages 180; Tuples 22274: Deleted 3.
Elapsed 0/0 sec.
NOTICE:  Index shotgun_index_keyword: NUMBER OF INDEX' TUPLES (22274) IS
NOT THE SAME AS HEAP' (22414)
NOTICE:  Index shotgun_index_email: Pages 222; Tuples 22274: Deleted 3.
Elapsed 0/1 sec.
NOTICE:  Index shotgun_index_email: NUMBER OF INDEX' TUPLES (22274) IS NOT
THE SAME AS HEAP' (22414)
NOTICE:  Index shotgun_id_key: Pages 91; Tuples 22414: Deleted 3. Elapsed
0/0 sec.
NOTICE:  Rel shotgun: Pages: 334 --> 334; Tuple(s) moved: 2. Elapsed 0/0
sec.
NOTICE:  Index shotgun_index_keyword: Pages 180; Tuples 22275: Deleted 1.
Elapsed 0/0 sec.
NOTICE:  Index shotgun_index_keyword: NUMBER OF INDEX' TUPLES (22275) IS
NOT THE SAME AS HEAP' (22414)
NOTICE:  Index shotgun_index_email: Pages 222; Tuples 22275: Deleted 1.
Elapsed 0/0 sec.
NOTICE:  Index shotgun_index_email: NUMBER OF INDEX' TUPLES (22275) IS NOT
THE SAME AS HEAP' (22414)
NOTICE:  Index shotgun_id_key: Pages 91; Tuples 22415: Deleted 1. Elapsed
0/0 sec.
NOTICE:  Index shotgun_id_key: NUMBER OF INDEX' TUPLES (22415) IS NOT THE
SAME AS HEAP' (22414)
VACUUM

Thanks,
Ole Gjerde



Re: [HACKERS] Current TODO list

From
Vadim Mikheev
Date:
Ole Gjerde wrote:
> 
> Now I have a bit of a bigger problem.  I just did a pg_upgrade to a newer
> CVS version.  Most of my tables seems fine and vacuum worked fine on most
> of them.
> But on the only 2 tables that I have changed lately I'm getting vacuum
> "errors".  Both tables are very small(shotgun table file is 1.4MB).
> If I keep running vacuum(over and over) the number of deleted tuples will
> eventually go to 0 and it will look normal.  It does take a few vacuum
> runs however, so something really weird is going on here.
> 
> shotgun=> vacuum verbose analyze shotgun;
> NOTICE:  --Relation shotgun--
> NOTICE:  Pages 334: Changed 0, Reapped 5, Empty 0, New 0; Tup 22414: Vac
> 3, Keep/VTL 11708/10895, Crash 0, UnUsed 49, MinLen 64, MaxLen 159;
> Re-using: Free/Avail. Space 6556/492; EndEmpty/Avail. Pages 0/3. Elapsed
> 0/0 sec.
> NOTICE:  Index shotgun_index_keyword: Pages 180; Tuples 22274: Deleted 3.
> Elapsed 0/0 sec.
> NOTICE:  Index shotgun_index_keyword: NUMBER OF INDEX' TUPLES (22274) IS
> NOT THE SAME AS HEAP' (22414)

Hiroshi found the bug in vacuum and posted me patch, but I'm
unhappy with it and will commit my changes in a few hours.

Vadim


RE: [HACKERS] Current TODO list

From
"Hiroshi Inoue"
Date:

> -----Original Message-----
> From: Ole Gjerde [mailto:gjerde@icebox.org]
> Sent: Saturday, May 22, 1999 1:37 AM
> To: Bruce Momjian
> Cc: Hiroshi Inoue; PostgreSQL-development
> Subject: Re: [HACKERS] Current TODO list
> 
> 
> On Thu, 20 May 1999, Bruce Momjian wrote:

[snip]

> 
> > > But my anxiety is the use of unlink()(FileNameUnlink()).
> > > Unlink() is very dangerous.
> > > Unlink() never remove the target file immediately.and even the
> > > truncating process doesn't close the files by the patch and so
> > > unlinked files are still alive for all processes which have already
> > > opened the files.
> 
> I don't think unlink() is a problem.  That other backends have the files
> open shouldn't matter.  Whenever they close it(should be pretty quick),

When are those files closed?
AFAIC, they are kept open until the backends which reference those files
finish.

Certainly, those files are re-opened (without being closed first) by
backends after vacuum, though I don't know whether that's intentional or a
side-effect.  But unfortunately, the re-open is not sufficiently quick.

And I think that the assumption behind mdtruncate() is not clear.
Can we suppose that unlinked files are closed quickly by all backends,
by the caller of mdunlink()?

Thanks.

Hiroshi Inoue
Inoue@tpf.co.jp 



Re: [HACKERS] Current TODO list

From
Bruce Momjian
Date:
> > I don't think unlink() is a problem.  That other backends have the files
> > open shouldn't matter.  Whenever they close it(should be pretty quick),
> 
> When are those files closed ?
> AFAIC,they are kept open until the backends which reference those files 
> finish.
> 
> Certainly,those files are re-opened(without closing) by backends after 
> vacuum,though I don't know it's intentional or caused by side-effect.
> But unfortunately,re-open is not sufficiently quick. 
> 
> And I think that the assumption of mdtruncate() is not clear.
> Could we suppose that unlinked files are closed quickly for all backends 
> by the caller of mdunlink() ?

If they try to open a file that is already unlinked, they don't get to
see the file.  Unlink removes it from the directory, so the only way to
continue accessing it after an unlink is to already hold a file descriptor
on the file.

--
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026


RE: [HACKERS] Current TODO list

From
"Hiroshi Inoue"
Date:

> -----Original Message-----
> From: Bruce Momjian [mailto:maillist@candle.pha.pa.us]
> Sent: Monday, May 24, 1999 12:32 PM
> To: Hiroshi Inoue
> Cc: Ole Gjerde; PostgreSQL-development
> Subject: Re: [HACKERS] Current TODO list
>
>
> > > I don't think unlink() is a problem.  That other backends
> have the files
> > > open shouldn't matter.  Whenever they close it(should be
> pretty quick),
> >
> > When are those files closed ?
> > AFAIC,they are kept open until the backends which reference those files
> > finish.
> >
> > Certainly,those files are re-opened(without closing) by backends after
> > vacuum,though I don't know it's intentional or caused by side-effect.
> > But unfortunately,re-open is not sufficiently quick.
> >
> > And I think that the assumption of mdtruncate() is not clear.
> > Could we suppose that unlinked files are closed quickly for all
> backends
> > by the caller of mdunlink() ?
>
> If they try and open a file that is already unlinked, they don't get to
> see the file.  Unlink removes it from the directory, so the only way to
> continue access after an unlink is if you already hold a file descrpitor
> on the file.
>

You are right.
Backends would continue to access the file descriptors they already hold
if vacuum did nothing about the invalidation of the Relation Cache.

Thanks.

Hiroshi Inoue
Inoue@tpf.co.jp



Vacuum/mdtruncate() (was: RE: [HACKERS] Current TODO list)

From
Ole Gjerde
Date:
On Mon, 24 May 1999, Hiroshi Inoue wrote:
> Backends would continue to access the file descriptors they already hold
> if vacuum did nothing about the invalidation of the Relation Cache.

Yes, and I don't believe that is a problem.  I may be wrong, however...

First, please revert my patch to mdtruncate() in md.c as soon as
possible.  It does not work properly in some cases.

Second, I do have a better patch in the works.  It is included below, but
DO NOT APPLY THIS!!!  I would like someone to look it over quickly.  I have
checked the logic by hand for a few cases and done a bunch of tests, but I
would like to test more first.

While doing a bunch of vacuums, I have seen some strange things (so my
patch probably isn't 100%).
I started with 58 segments, did a bunch of delete/vacuums, and got it
down to about 5-6.  Then I got the error below while running a vacuum
analyze.  It appeared after the index clean, but before any tuples were
moved:
ERROR:  HEAP_MOVED_IN was not expected

Also, I was seeing some more errors about INDEX' TUPLES being higher than
HEAP TUPLES.  Didn't this just get fixed, or did I break something with my
patch?  I was seeing these after doing delete/vacuums with my patch.

Thanks,
Ole Gjerde

Index: src/backend/storage/smgr/md.c
===================================================================
RCS file: /usr/local/cvsroot/pgsql/src/backend/storage/smgr/md.c,v
retrieving revision 1.43
diff -u -r1.43 md.c
--- src/backend/storage/smgr/md.c    1999/05/17 06:38:41    1.43
+++ src/backend/storage/smgr/md.c    1999/05/24 06:30:25
@@ -712,32 +712,62 @@
 #ifndef LET_OS_MANAGE_FILESIZE
     int         curnblk,
-                i,
                 oldsegno,
-                newsegno;
-    char        fname[NAMEDATALEN];
-    char        tname[NAMEDATALEN + 10];
+                newsegno,
+                lastsegblocks,
+                segcount = 0;
+    MdfdVec    *ov,
+               *lastv;
+    MemoryContext oldcxt;
 
+    fd = RelationGetFile(reln);
     curnblk = mdnblocks(reln);
-    oldsegno = curnblk / RELSEG_SIZE;
-    newsegno = nblocks / RELSEG_SIZE;
-    StrNCpy(fname, RelationGetRelationName(reln)->data, NAMEDATALEN);
+    oldsegno = (curnblk / RELSEG_SIZE) + 1;
+    newsegno = (nblocks / RELSEG_SIZE) + 1;
+
+    oldcxt = MemoryContextSwitchTo(MdCxt);
 
-    if (newsegno < oldsegno) {
-        for (i = (newsegno + 1);; i++) {
-            sprintf(tname, "%s.%d", fname, i);
-            if (FileNameUnlink(tname) < 0)
-                break;
+    if (newsegno < oldsegno && newsegno > 1)
+    {
+        lastv = v = &Md_fdvec[fd];
+        for (segcount = 1; v != (MdfdVec *) NULL; segcount++, v = v->mdfd_chain)
+        {
+            if (segcount == newsegno)   /* Save pointer to last file
+                                         * in the chain */
+                lastv = v;
+            if (segcount > newsegno)
+            {
+                FileUnlink(v->mdfd_vfd);
+                ov = v;
+                if (ov != &Md_fdvec[fd])
+                    pfree(ov);
+            }
         }
+        lastv->mdfd_chain = (MdfdVec *) NULL;
     }
-#endif
+
+    /* Find the last file in the md chain */
+    for (v = &Md_fdvec[fd]; v->mdfd_chain != (MdfdVec *) NULL;)
+        v = v->mdfd_chain;
+
+    /* Calculate the # of blocks in the last segment */
+    lastsegblocks = nblocks - ((newsegno - 1) * RELSEG_SIZE);
+
+    MemoryContextSwitchTo(oldcxt);
+
+    if (FileTruncate(v->mdfd_vfd, lastsegblocks * BLCKSZ) < 0)
+        return -1;
+
+#else
+    fd = RelationGetFile(reln);
+    v = &Md_fdvec[fd];
+
     if (FileTruncate(v->mdfd_vfd, nblocks * BLCKSZ) < 0)
         return -1;
+
+#endif
     return nblocks;



Re: Vacuum/mdtruncate() (was: RE: [HACKERS] Current TODO list)

From
Vadim Mikheev
Date:
Ole Gjerde wrote:
> 
> While doing a bunch of vacuums, I have seen some strange things(so my
> patch probably isn't 100%).
> I started with 58 segments, and did a bunch of delete/vacuums and got it
> down to about 5-6.  Then I got the error below while running a vacuum
> analyze.  This appeared after the index clean, but before any tuples were
> moved.
> ERROR:  HEAP_MOVED_IN was not expected

I added this in my last patch ... I have to think more about
the cause.

> Also, I was seeing some more errors about INDEX' TUPLES being higher than
> HEAP TUPLES.  Didn't this just get fixed, or did I break something with my
> patch.  I was seeing these after doing delete/vacuums with my patch.

Hiroshi, could you try to reproduce the NOT THE SAME problem
with the new vacuum code?

Vadim


RE: Vacuum/mdtruncate() (was: RE: [HACKERS] Current TODO list)

From
"Hiroshi Inoue"
Date:

> -----Original Message-----
> From: root@sunpine.krs.ru [mailto:root@sunpine.krs.ru]On Behalf Of Vadim
> Mikheev
> Sent: Monday, May 24, 1999 4:53 PM
> To: Ole Gjerde
> Cc: Hiroshi Inoue; Bruce Momjian; PostgreSQL-development
> Subject: Re: Vacuum/mdtruncate() (was: RE: [HACKERS] Current TODO list)
>
>
> Ole Gjerde wrote:
> >
> > While doing a bunch of vacuums, I have seen some strange things(so my
> > patch probably isn't 100%).
> > I started with 58 segments, and did a bunch of delete/vacuums and got it

Are the delete/vacuums executed sequentially by a single session?

> > down to about 5-6.  Then I got the error below while running a vacuum
> > analyze.  This appeared after the index clean, but before any
> tuples were
> > moved.
> > ERROR:  HEAP_MOVED_IN was not expected
>
> I added this in my last patch ... I have to think more about
> the cause.
>
> > Also, I was seeing some more errors about INDEX' TUPLES being
> higher than
> > HEAP TUPLES.  Didn't this just get fixed, or did I break
> something with my
> > patch.  I was seeing these after doing delete/vacuums with my patch.
>
> Hiroshi, could you try to reproduce NOT THE SAME problem
> with new vacuum code?
>

I couldn't reproduce the NOT THE SAME message in current sources.

Thanks.

Hiroshi Inoue
Inoue@tpf.co.jp



RE: Vacuum/mdtruncate() (was: RE: [HACKERS] Current TODO list)

From
Ole Gjerde
Date:
On Mon, 24 May 1999, Hiroshi Inoue wrote:
> > Ole Gjerde wrote:
> > > While doing a bunch of vacuums, I have seen some strange things(so my
> > > patch probably isn't 100%).
> > > I started with 58 segments, and did a bunch of delete/vacuums and got it
> Are delete/vacuums executed sequentially by single session ?

Yes.

> > > Also, I was seeing some more errors about INDEX' TUPLES being
> > higher than
> > > HEAP TUPLES.  Didn't this just get fixed, or did I break
> > something with my
> > > patch.  I was seeing these after doing delete/vacuums with my patch.
> > Hiroshi, could you try to reproduce NOT THE SAME problem
> > with new vacuum code?
> I couldn't reproduce NOT THE SAME message in current.

Could you try with my patch?

Thanks,
Ole Gjerde



Re: Vacuum/mdtruncate() (was: RE: [HACKERS] Current TODO list)

From
Vadim Mikheev
Date:
Hiroshi Inoue wrote:
> 
> >
> > > Also, I was seeing some more errors about INDEX' TUPLES being
> > higher than
> > > HEAP TUPLES.  Didn't this just get fixed, or did I break
> > something with my
> > > patch.  I was seeing these after doing delete/vacuums with my patch.
> >
> > Hiroshi, could you try to reproduce NOT THE SAME problem
> > with new vacuum code?
> >
> 
> I couldn't reproduce NOT THE SAME message in current.

Nice to know it.
Thanks for finding/resolving this bug, Hiroshi!

Vadim