Re: [HACKERS] Arbitrary tuple size - Mailing list pgsql-hackers

From Hannu Krosing
Subject Re: [HACKERS] Arbitrary tuple size
Date
Msg-id 3785A16F.11C0761D@trust.ee
Whole thread Raw
In response to Re: [HACKERS] Arbitrary tuple size  (Bruce Momjian <maillist@candle.pha.pa.us>)
Responses Re: [HACKERS] Arbitrary tuple size
List pgsql-hackers
Vadim Mikheev wrote:
> 
> Bruce Momjian wrote:
> >
> > > Bruce Momjian wrote:
> > > >
> > > > If we get wide tuples, we could just throw all large objects into one
> > > > table, and have an on it.  We can then vacuum it to compact space, etc.
> > >
> > > Storing 2Gb LO in table is not good thing.
> > >
> > > Vadim
> > >
> >
> > Ah, but we have segemented tables now.  It will auto-split at 1 gig.
> 
> Well, now consider update of 2Gb row!
> I worry not due to non-overwriting but about writing
> 2Gb log record to WAL - we'll not be able to do it, sure.

Can't we write just some kind of diff (only changed pages) in WAL,
either starting at some thresold or just based the seek/write logic of
LOs?

It will add complexity, but having some arbitrary limits seems very
wrong.

It will also make indexing LOs more complex, but as we don't currently
index 
them anyway, its not a big problem yet.

Setting the limit higher (like 16M, where all my current LOs would fit
:) )
is just postponing the problems. Does "who will need more than 640k of
RAM"
sound familiar ?

> Isn't it why Informix restrict tuple len to 32k only?
> And the same is what Oracle does.

Does anyone know what the limit for Oracle8i is ? As they advertise it
as a 
replacement file system among other things, I guess it can't be too low
- 
I suspect 2G at minimum

> Both of them have ability to use > 1 page for single row,
> but they have this restriction anyway.
> 
> I don't like _arbitrary_ tuple size.

Why not ?

IMHO we should allow _arbitrary_ (in reality probably <= MAXINT), but 
optimize for some known size and tell the users that if they exceed it
the performance would suffer. 

So when I have 99% of my LOs in 10k-80k range but have a few 512k-2M
ones 
I can just live with the bigger ones having bad performance instead 
implementing an additional LO manager in the frontend too.

> I vote for some limit.

Why limit ?

> 32K or 64K, at max.

Why so low ? Please make it at least configurable, preferrably at
runtime.

And if you go that way, please assume this limit (in code) for tuple
size only,
and not in FE/BE protocol - it will make it easier for someone to fix
the backend 
to work with larger ones later

The LOs should remain load-on-demant anyway, just it should be made more
transparent 
for end-users.

> Vadim


pgsql-hackers by date:

Previous
From: Vadim Mikheev
Date:
Subject: Re: [HACKERS] Arbitrary tuple size
Next
From: Tom Ivar Helbekkmo
Date:
Subject: Re: [HACKERS] Fwd: Joins and links