Re: BLOB support - Mailing list pgsql-hackers

From Radosław Smogura
Subject Re: BLOB support
Date
Msg-id 201106061723.08711.rsmogura@softperience.eu
Whole thread Raw
In response to Re: BLOB support  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: BLOB support
List pgsql-hackers
Tom Lane <tgl@sss.pgh.pa.us> Monday 06 of June 2011 16:13:26
> Radosław Smogura <rsmogura@softperience.eu> writes:
> >  I think more about this with contrast to sent references, but I still
> >  have in my mind construct
> >  Blob myWeddingDvd = conn.createBlob(myWeddingStream, size); //A bit
> >  outdated we have BlueRay
> >  conn.prepareStatemnt("INSERT INTO someonetubevideos values (?)")
> >  where 1st parameter is myWeddingDvd,
> 
> Yes, if you insist upon designing the API like that, then you come to
> the conclusion that you need global LOB identifiers.
> 
> However, there are many ways to design this that don't work that way.
> One idea to think about is
> 
>     insert into someonetubevideos values('')
>     returning open_for_write(videocolumn)
> 
> which gives you back some kind of writable stream ID (this is a
> transient, within-session ID, not global) for the target field in the
> row you just inserted.
I know, but this is a little bit old-fashioned bahaviour.

> BTW, as was noted upthread by Dimitri, this whole subject has been
> discussed before on pgsql-hackers.  You really ought to go re-read the
> previous threads.
> 
>             regards, tom lane

I read this, but it may be rethinked again. Actaully changes to TOAST (I mean 
streaming will be just for LOBs, I had written all found disadvantages for 
TOAST for LOB and it's looks like only performance of above is some kind of 
disadvantage, as well this prevent some less usefull concepts of Copy on Write 
for LOBs.

Introducing streaming for TOAST is little useless, sorry just for cite from 
my, mentoined document:

(This is generally about on demand stream of TOASTed value, in 
context of LOBs is acceptable, as long not transactional aware LOBs are 
acceptable). If we will add streaming of TOASTed values, so caller will 
get some reference to this value, we need to ensure that pointed data 
will not be changed, nor deleted - I think this will require caller to 
add FOR UPDATE (or silently to add this by server) for each statement 
returning pointers to TOASTed, as client may do transactional query, and 
other client just after (1st) may remove record, commit, and call 
VACUUM. In this situation when 1st will try to read data form given row, 
it will get error. This may be accpetable for LOBs (commonly LOBs may be not 
transaction aware, but I will be angry if this will happen with VARCHAR)

If this is acceptable I will do following changes.

Add 
- server_max_in_memory_lob_size - GUC server start-only config to describe 
maximum value of client session parameter max_in_memory_lob.

- max_in_memory_lob - session GUC describing how huge LOBs may be keept in 
memory before backing up to file

- rescursivly toasting, detoasting during insert/update/remove for searching 
for LOBs (we need this for arrays and complex types) - this is for last stage 
(error disallowing LOBs in composites/arrays may be quite enaugh, for 
begining) - I want LOBs to be starting point for LOBing other types (e.g. some 
big arrays may be LOBbed).

- during toasting, lob will be toasted and in place of LOB, the reference to 
it will be putted, and encoded in LOB datum.

- api for LOB manipulation (few changes to current implementation) in way that 
BLOB is field type and uninon of in_memory, in_file, in_toast. This few bytes 
for LOB will not affect size of datum looking at size of LOB.

- api for maintaing temoraly lob files, we need this as per session list of id 
-> file desc, to prevent prevent stealing of lobs by different connections 
(security)

- streaming api for TOASTED values (based on COPY protocol, or changed COPY 
protocol) or at least function calls - I havent looked at this in context of 
TOASTed LOBs.

Is it good?

Regards,
Radek



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Domains versus polymorphic functions, redux
Next
From: Tom Lane
Date:
Subject: Re: BLOB support