Re: [HACKERS] Priorities for 6.6 - Mailing list pgsql-hackers

From Hannu Krosing
Subject Re: [HACKERS] Priorities for 6.6
Msg-id 3757981B.B47E4E93@trust.ee
In response to Priorities for 6.6  (Tom Lane <tgl@sss.pgh.pa.us>)
Tom Lane wrote:
> 
> I don't know what people have had in mind for 6.6, but I propose that
> there ought to be three primary objectives for our next release:
> 
> 1. Eliminate arbitrary restrictions on tuple size.
> 
> 2. Eliminate arbitrary restrictions on query size (textual
>    length/complexity that is).
> 
> 3. Cure within-statement memory leaks, so that processing large numbers
>    of tuples in one statement is reliable.

I would add a few that I think would be important:

A. Add outer joins

B. Add the possibility to prepare statements and then execute them
   with a set of arguments. This already exists in SPI, but for many
   C/S apps it would be desirable to have this in the fe/be protocol
   as well.
 

C. Look over the protocol and unify the _binary_ representations of
   datatypes on the wire. In fact each type already has two sets of
   in/out conversion functions in its definition tuple, one for disk and
   another for net; it's only that until now they are the same for
   all types and thus probably used wrongly in some parts of the code.
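
To illustrate what a unified binary wire form for one datatype could look like, here is a minimal sketch for a 4-byte integer. The choice of network byte order and the function names int4_to_wire / int4_from_wire are assumptions made for this example; they are not taken from the Postgres sources or from the proposal above.

```python
import struct

# Hypothetical helpers: a fixed, platform-independent wire form for int4.
# Network (big-endian) byte order is an assumption of this sketch.

def int4_to_wire(value: int) -> bytes:
    """Pack a signed 32-bit integer into 4 bytes, big-endian."""
    return struct.pack("!i", value)

def int4_from_wire(data: bytes) -> int:
    """Unpack 4 big-endian bytes back into a signed 32-bit integer."""
    (value,) = struct.unpack("!i", data)
    return value
```

With a convention like this, a client on any architecture reads the same bytes, which is the point of separating the net conversion functions from the disk ones.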
 

D. After B. and C., add a possibility to insert binary data
   in a "(small)binary" field without relying on LOs or expensive
   (4x the size) quoting. Allow any characters in said binary field.
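
The "4x the size" figure can be reproduced with a small sketch, assuming the worst case where every byte is quoted as a backslash followed by three octal digits. The function name octal_escape is hypothetical, and real quoting rules may leave printable characters unescaped, so this shows the upper bound only.

```python
def octal_escape(data: bytes) -> str:
    """Quote every byte as backslash + 3 octal digits (worst-case quoting)."""
    return "".join("\\%03o" % b for b in data)

payload = bytes(range(256))      # every possible byte value once
quoted = octal_escape(payload)
ratio = len(quoted) / len(payload)   # 4 characters per input byte
```

A binary field sent as raw bytes in the protocol would carry the same 256 bytes without this expansion.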
 

E. To make 2. and B., C., D. possible, some more fundamental changes in
   the fe/be protocol may be needed. There seems to be some effort toward
   a new fe/be communications mechanism using CORBA. But my proposal
   would be to adopt the X11 protocol, which is quite light but still
   very clean, well understood, and which can transfer arbitrary data
   in an efficient way. There are even "low bandwidth" variants of it
   for use over really slow links. Also some kinds of "out of band"
   provisions exist, which are used by window managers. It should also
   be trivial to adapt crypto wrappers/proxies (such as the one in ssh).
   The protocol is described in a document available from
   http://www.x.org

F. As a lousy alternative to 1., fix the LO storage. Currently _all_ of
   the LO files are kept in the same directory as the tables and
   indexes. This can bog down the whole database quite fast if one has
   lots of LOs and a file system that does linear scans on open (like
   ext2). A scheme where LOs are kept in subdirectories based on the hex
   representation of their oids would avoid that (so the LO with OID
   0x12345678 would be stored in $PG_DATA/DBNAME/LO/12/34/56/78.lo, or
   maybe reversed, $PG_DATA/DBNAME/LO/78/56/34/12.lo, to distribute them
   more evenly in "buckets").
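
The directory scheme in F. can be sketched as a small path-mapping function. This is only an illustration of the layout described above, assuming 32-bit OIDs; the name lo_path and its parameters are hypothetical, not existing backend code.

```python
import posixpath

def lo_path(oid: int, db_dir: str, reverse: bool = False) -> str:
    """Map a 32-bit OID to a nested path built from its four hex bytes.

    The first three bytes become subdirectories and the last one the
    file name. With reverse=True the low-order byte comes first.
    """
    if not 0 <= oid <= 0xFFFFFFFF:
        raise ValueError("OID must fit in 32 bits")
    hex_digits = "%08x" % oid
    parts = [hex_digits[i:i + 2] for i in range(0, 8, 2)]
    if reverse:
        parts.reverse()
    return posixpath.join(db_dir, "LO", *parts[:-1], parts[-1] + ".lo")

forward = lo_path(0x12345678, "$PG_DATA/DBNAME")
backward = lo_path(0x12345678, "$PG_DATA/DBNAME", reverse=True)
```

Since OIDs are handed out sequentially, the reversed form puts the fastest-changing byte at the top level, so consecutive LOs land in different top-level directories instead of filling one directory at a time.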

> All of these are fairly major projects, and it might be that we get
> little or nothing else done if we take these on.

But then, the other things to do _are_ little compared to these ;)

> But these are the problems we've been hearing about over and over and
> over.

The LO thing (and lack of decent full-text indexing) is what has kept me 
using hybrid solutions where I keep the LO data and home-grown full-text
indexes in file system outside of the database.

> I think fixing these would do more to improve Postgres than 
> almost any other work we might do.

Amen!

----------------
Hannu

