Thread: Re: [HACKERS] Priorities for 6.6

Philip Warner
Dear All,

It seems to me that there are a bunch of related issues that probably need to be tied together (and forgotten about?):

1. A 'nice' user interface for blobs 
2. Text fields stored as blobs
3. Naming issues for 'system' tables etc.
4. pg_dump support for blobs and other 'internal' structures.
5. Blob storage in files Vs. a 'nicer' storage medium.
6. The tuple-size problem(?)

Points (1) & (2) are really the same thing; if you provide a nice interface to blobs: "select len(blob_field) from
...." and "select blob_field from ...", then any discussion of the messiness associated with blobs will go away.
Personally, I would hate to lose the ability to store a blob's data using a series of 'lo_write' calls: one system I
work on (not in PG) has blob data as large as 24MB, which makes blob_write functionality essential.
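
To illustrate why incremental writes matter for large blobs, here is a minimal sketch in Python. It is not PG code: sink.write() merely stands in for a series of 'lo_write' calls, and write_blob_in_chunks and CHUNK_SIZE are names invented for this example. The point is only that the writer never holds more than one chunk in memory, whatever the total blob size.

```python
import io

CHUNK_SIZE = 8192  # assumed chunk size, chosen only for illustration


def write_blob_in_chunks(source, sink, chunk_size=CHUNK_SIZE):
    """Copy source to sink one chunk at a time; return total bytes written.

    Each sink.write() call plays the role of one 'lo_write' call.
    """
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:  # empty read means end of input
            break
        sink.write(chunk)
        total += len(chunk)
    return total


# Usage: in-memory streams stand in for a client file and a blob handle.
source = io.BytesIO(b"x" * 20000)
sink = io.BytesIO()
written = write_blob_in_chunks(source, sink)
```

With a single-call interface the whole 24MB value would have to be assembled by the client before it could be stored.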

Points (3) & (4) recognize that there are a number of issues floating around that relate to the basic inappropriateness of
using SQL to reload the data structures of an existing database. I have only used a few commercial DBs, but the ones I
have used uniformly have a 'dump' that produces data files in its own format. There is no question that having pg_dump
produce a schema and/or INSERT statements is nice, but a new option needs to be added to allow raw exports, and a new
pg_load utility needs to be written. Cross-version compatibility between export formats must also be maintained.
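
One way to picture the cross-version requirement is a raw export file that begins with a small header naming its format version, so that a loader can refuse (or specially handle) files it does not understand. The following Python sketch is purely hypothetical: the magic bytes, the version number, the length-prefixed record layout, and the function names are all assumptions made for illustration, not an existing or proposed pg_dump format.

```python
import io
import struct

MAGIC = b"PGRW"              # hypothetical magic bytes for a raw export
FORMAT_VERSION = 1           # hypothetical current format version
SUPPORTED_VERSIONS = {1}     # versions this loader knows how to read

HEADER = struct.Struct(">4sH")   # magic, then big-endian 16-bit version
LENGTH = struct.Struct(">I")     # big-endian 32-bit record length


def write_export(out, records):
    """Write a version header followed by length-prefixed binary records."""
    out.write(HEADER.pack(MAGIC, FORMAT_VERSION))
    for record in records:
        out.write(LENGTH.pack(len(record)))
        out.write(record)


def read_export(inp):
    """Check the header, then return (version, list of records)."""
    header = inp.read(HEADER.size)
    if len(header) != HEADER.size:
        raise ValueError("truncated header")
    magic, version = HEADER.unpack(header)
    if magic != MAGIC:
        raise ValueError("not a raw export file")
    if version not in SUPPORTED_VERSIONS:
        raise ValueError("unsupported export format version %d" % version)
    records = []
    while True:
        prefix = inp.read(LENGTH.size)
        if not prefix:  # clean end of file
            break
        if len(prefix) != LENGTH.size:
            raise ValueError("truncated record length")
        (length,) = LENGTH.unpack(prefix)
        data = inp.read(length)
        if len(data) != length:
            raise ValueError("truncated record")
        records.append(data)
    return version, records
```

Because records are raw bytes with an explicit length, binary data such as blobs needs no quoting or escaping, which is part of the appeal of a raw format over generated INSERT statements.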

Point (5) recognizes that storing 'large' data in the same area that a row is stored in will remove any benefits of
clustering, so a method of handling blob data needs to be found, irrespective of whether PG still supports blobs as
such. I don't know how PG handles large text fields - some commercial systems allow the user to 'map' specific fields to
separate data files. The current system (storing blobs in files) is fine except in so far as it *looks* messy, produces
*huge* directories, and is slow for many small blobs (file open/read/close per row).

I don't know anything about the 'tuple-size' problem (point 6), but it may also relate to a solution for storing
blob-data (or specific columns) in alternate locations.

I hope this is not all static...

Philip Warner.

Philip Warner                    |     __---_____
Albatross Consulting Pty. Ltd.   |----/       -  \
(A.C.N. 008 659 498)             |          /(@)   ______---_
Tel: +61-03-5367 7422            |                 _________  \
Fax: +61-03-5367 7430            |                 ___________ |
Http://                          |                /           \|
                                 |    --________--
PGP key available upon request,  |  /
and from   |/