Re: BLOB support - Mailing list pgsql-hackers

From Radosław Smogura
Subject Re: BLOB support
Date
Msg-id 201106021853.52943.rsmogura@softperience.eu
In response to Re: BLOB support  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Tom Lane <tgl@sss.pgh.pa.us> Thursday 02 of June 2011 16:42:42
> Robert Haas <robertmhaas@gmail.com> writes:
> > But these problems can be fixed without inventing a completely new
> > system, I think.  Or at least we should try.  I can see the point of a
> > data type that is really a pointer to a LOB, and the LOB gets deleted
> > when the pointer is removed, but I don't think that should require
> > far-reaching changes all over the system (like relhaslobs) to make it
> > work efficiently.  I think you need to start with a problem statement,
> > get agreement that it is a problem and on what the solution should be,
> > and then go write the code to implement that solution.
> 
> Yes.  I think the appropriate problem statement is "provide streaming
> access to large field values, as an alternative to just fetching/storing
> the entire value at once".  I see no good reason to import the entire
> messy notion of LOBS/CLOBS.  (The fact that other databases have done it
> is not a good reason.)
> 
> For primitive types like text or bytea it seems pretty obvious what
> "streaming access" should entail, but it might be interesting to
> consider what it should mean for structured types.  For instance, if I
> have an array field with umpteen zillion elements, it might be nice to
> fetch them one at a time using the streaming access mechanism.  I don't
> say that that has to be in the first version, but it'd be a good idea to
> keep that in the back of your head so you don't design a dead-end
> solution that can't be extended in that direction.
> 
>             regards, tom lane

In the context of LOBs, streaming is already resolved: I use the current LO 
functionality (so a driver can read LOBs the way psql's \lo_export does, or via 
the COPY subprotocol), and the client should get just the LO's id. BLOBs in this 
implementation, as Robert wanted, are just a wrapper around the core LO, with 
some extensions for special situations. Adding relhaslob in this implementation 
is quite important, so that the tupledesc does not have to be examined on each 
table operation; alternatively, this value could be deduced during relation open 
(with a performance penalty). I saw something similar done a few lines above, 
where triggers are fired, and a few lines below, where indices are updated. 
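
The trade-off around relhaslob can be pictured with a small sketch. This is not the patch's code, only a toy model: the `Relation` class, the `has_lob` attribute, and the `"blob"` type name are all hypothetical stand-ins. It shows the "deduce during relation open" variant, where the column types are scanned once and later operations only test a cached flag.

```python
from dataclasses import dataclass, field
from typing import List

BLOB_TYPE = "blob"  # hypothetical type name, for illustration only


@dataclass
class Relation:
    """Toy stand-in for an opened relation and its tuple descriptor."""
    name: str
    column_types: List[str]
    has_lob: bool = field(init=False)

    def __post_init__(self) -> None:
        # One scan of the column types at "relation open" time.
        self.has_lob = any(t == BLOB_TYPE for t in self.column_types)


def needs_lob_handling(rel: Relation) -> bool:
    # Per-operation check: reads the cached flag, no descriptor scan.
    return rel.has_lob


users = Relation("users", ["int4", "text"])
documents = Relation("documents", ["int4", "text", BLOB_TYPE])
```

A flag stored in the catalog would avoid even the one scan at open time, which is presumably the reason for wanting relhaslob as a persistent column.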

Currently BLOBs can be emulated using the core LO (the JDBC driver does this), 
but, among other things, the following problems remain from an application 
developer's point of view:

1. No tracking of unused LOs (you store just the id of such an object). You may 
leak an LO after a row is removed or updated. Users can write triggers for this, 
but that is not a real answer: the BLOB type is popular, and its simplicity of 
use is quite important. When I create an app, this is the worst thing.

2. No support for casting in UPDATE/INSERT, so there is no simple way to 
migrate data (e.g. from varchars that are too long) or to copy BLOBs.

3. Limitation of field size to 1GB.
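
Point 1 can be made concrete with a toy model. The sketch below is not PostgreSQL code; `ToyDatabase` and its methods are hypothetical names. It stores only an LO id in each row, as in the emulation described above, and shows how an update or delete leaves objects behind unless something scans for unreferenced ids.

```python
import itertools


class ToyDatabase:
    """Toy model: rows hold only an LO id, the data lives elsewhere."""

    def __init__(self) -> None:
        self.large_objects = {}          # LO id -> bytes
        self.rows = {}                   # row key -> LO id
        self._ids = itertools.count(1)   # LO ids 1, 2, 3, ...

    def lo_create(self, data: bytes) -> int:
        lo_id = next(self._ids)
        self.large_objects[lo_id] = data
        return lo_id

    def insert_row(self, key: str, data: bytes) -> None:
        self.rows[key] = self.lo_create(data)

    def update_row(self, key: str, data: bytes) -> None:
        # The previous LO is not unlinked, so it is leaked.
        self.rows[key] = self.lo_create(data)

    def delete_row(self, key: str) -> None:
        # Only the reference goes away; the LO itself stays.
        del self.rows[key]

    def orphaned_los(self) -> list:
        # LOs that no row references any more.
        return sorted(set(self.large_objects) - set(self.rows.values()))


db = ToyDatabase()
db.insert_row("a", b"first")    # creates LO 1
db.insert_row("b", b"second")   # creates LO 2
db.update_row("a", b"third")    # creates LO 3, LO 1 is now orphaned
db.delete_row("b")              # LO 2 is now orphaned
orphans = db.orphaned_los()
```

Here `orphans` ends up holding the two leaked ids. A BLOB type that owns its LO would make this scan unnecessary, because removing the pointer would remove the object.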

Another solution I was thinking about is to introduce system triggers (triggers 
that can't be disabled or removed). There would then be a new flag in the 
triggers table.
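
As a rough illustration of the system-trigger idea, the sketch below models a trigger record with an extra flag. Everything in it is an assumption made for illustration: `is_system`, `Table`, `TriggerError`, and the cleanup function are hypothetical names, not an existing PostgreSQL API.

```python
from dataclasses import dataclass
from typing import Callable, Dict


class TriggerError(Exception):
    """Raised when a system trigger is disabled or dropped."""


@dataclass
class Trigger:
    name: str
    action: Callable[[dict], None]
    is_system: bool = False   # the proposed new flag (hypothetical name)
    enabled: bool = True


class Table:
    def __init__(self) -> None:
        self.triggers: Dict[str, Trigger] = {}

    def add_trigger(self, trigger: Trigger) -> None:
        self.triggers[trigger.name] = trigger

    def disable_trigger(self, name: str) -> None:
        if self.triggers[name].is_system:
            raise TriggerError(f"cannot disable system trigger {name}")
        self.triggers[name].enabled = False

    def drop_trigger(self, name: str) -> None:
        if self.triggers[name].is_system:
            raise TriggerError(f"cannot drop system trigger {name}")
        del self.triggers[name]

    def fire_delete(self, row: dict) -> None:
        for trigger in self.triggers.values():
            if trigger.enabled:
                trigger.action(row)


lo_store = {7: b"payload"}   # LO id -> data


def unlink_blob(row: dict) -> None:
    # Remove the LO that the deleted row pointed to.
    lo_store.pop(row["blob_id"], None)


table = Table()
table.add_trigger(Trigger("blob_cleanup", unlink_blob, is_system=True))

try:
    table.disable_trigger("blob_cleanup")
    disable_rejected = False
except TriggerError:
    disable_rejected = True

table.fire_delete({"id": 1, "blob_id": 7})
```

Because the cleanup trigger cannot be switched off, deleting the row always unlinks its LO, which is the guarantee a user-written trigger cannot give.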

Now I think we should try to mix both approaches, as system triggers may give 
an interesting API to other developers.

Other databases (may) store LOBs, arrays, and composites in external tables, 
so the user gets just the id of such an object.

I have been thinking about streaming for about two weeks. I have some concepts 
for it, but from the point of view of memory consumption and performance. I 
will send the concept later; I want to think about it a little more and 
investigate what can actually be done.

Regards,
Radek

