From Bruce Momjian
Subject Re: [HACKERS] LONG
Date
Msg-id 199912122144.QAA08726@candle.pha.pa.us
In response to Re: [HACKERS] LONG  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: [HACKERS] LONG
libpq questions...when threads collide
List pgsql-hackers
> 2. While reading a tuple, fastgetattr() automatically fetches the
> out-of-line value if it sees the requested attribute is out-of-line.
> (I'd be inclined to mark out-of-line attributes in the same way that
> NULL attributes are marked: one bit in the tuple header shows if any
> out-of-line attrs are present, and if so there is a bitmap to show
> which ones are out-of-line.  We could also use Bruce's idea of
> commandeering the high-order bit of the varlena length word, but
> I think that's a much uglier and more fragile solution.)

Not sure if fastgetattr() is the place for this.  I thought the varlena
access routines themselves would work.  It is nice and clean to do it in
fastgetattr, but how do you know to pfree it?  I suppose if you kept the
high bit set, you could try cleaning up, but where?

My idea was to expand the out-of-line varlena and unset the 'long' bit.
The fields of the unexpanded form would be:
long-bit|length|reloid|tupleoid|attno|longlen

Unexpanded would be:
1|20|10032|23123|5|20000

Expanded is:
0|20000|data
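
To make the two forms above concrete, here is a minimal sketch of packing the long bit into the high-order bit of the length word. Every name here (LONG_BIT, LEN_MASK, LongRef, make_header, and so on) is hypothetical, and the field widths are assumptions; this is not the posted patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; not actual PostgreSQL definitions. */
#define LONG_BIT UINT32_C(0x80000000)   /* high-order bit of the length word */
#define LEN_MASK UINT32_C(0x7FFFFFFF)   /* remaining 31 bits hold the length */

/* Unexpanded form: long-bit|length|reloid|tupleoid|attno|longlen.
 * Field widths are assumed for illustration only. */
typedef struct LongRef {
    uint32_t header;    /* long bit plus the length of this reference */
    uint32_t reloid;    /* relation holding the long data */
    uint32_t tupleoid;  /* tuple the long data belongs to */
    uint16_t attno;     /* attribute number */
    uint32_t longlen;   /* length of the value once expanded */
} LongRef;

/* Build a length word, setting the long bit for an unexpanded value. */
static uint32_t make_header(int is_long, uint32_t length)
{
    assert(length <= LEN_MASK);
    return (is_long ? LONG_BIT : 0) | length;
}

/* True when the value is still an out-of-line reference. */
static int header_is_long(uint32_t header)
{
    return (header & LONG_BIT) != 0;
}

/* Length with the long bit masked off. */
static uint32_t header_length(uint32_t header)
{
    return header & LEN_MASK;
}
```

With these helpers, make_header(1, 20) corresponds to the unexpanded example and make_header(0, 20000) to the expanded one.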


> 
> I think that these two changes would handle 99% of the problem.
> VACUUM would still need work, but most normal access to tuples would
> just work automatically, because all access to varlena fields must go
> through fastgetattr().
> 
> An as-yet-unsolved issue is how to avoid memory leaks of out-of-line
> values after they have been read in by fastgetattr().  However, I think
> that's going to be a nasty problem with Jan's approach as well.  The
> best answer might be to solve this in combination with addressing the
> problem of leakage of temporary results during expression evaluation,
> say by adding some kind of reference-count convention to all varlena
> values.

That's why I was going to do the expansion only in the varlena access
routines.  Patch already posted.
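
As a rough illustration of expanding in the access routine rather than in fastgetattr(), the sketch below has the accessor tell its caller whether the returned buffer must be freed. This convention, the Value struct, value_data(), and the fetch callback are all assumptions made up for the example; they are not taken from the posted patch, and malloc/free stand in for palloc/pfree.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define LONG_BIT UINT32_C(0x80000000)   /* hypothetical long-bit flag */
#define LEN_MASK UINT32_C(0x7FFFFFFF)

/* Simplified stand-in for a varlena value (hypothetical layout). */
typedef struct Value {
    uint32_t    header;       /* long bit plus length */
    const char *inline_data;  /* valid only when the long bit is clear */
    uint32_t    longlen;      /* valid only when the long bit is set */
} Value;

/* Hypothetical fetcher: fills buf with ref->longlen bytes of long data. */
typedef void (*fetch_fn)(const Value *ref, char *buf);

/* Stand-in fetcher for demonstration; a real one would read the long table. */
static void fill_with_x(const Value *ref, char *buf)
{
    memset(buf, 'x', ref->longlen);
}

/* Return the value's bytes, expanding an out-of-line reference on demand.
 * *must_free is set to 1 only when a new buffer was allocated. */
static const char *value_data(const Value *v, fetch_fn fetch,
                              uint32_t *len, int *must_free)
{
    assert(fetch != NULL);
    if ((v->header & LONG_BIT) == 0) {
        *len = v->header & LEN_MASK;   /* already inline: nothing to free */
        *must_free = 0;
        return v->inline_data;
    }
    char *buf = malloc(v->longlen);
    if (buf == NULL) {                 /* allocation failed */
        *len = 0;
        *must_free = 0;
        return NULL;
    }
    fetch(v, buf);                     /* pull in the out-of-line bytes */
    *len = v->longlen;
    *must_free = 1;                    /* caller owns the expanded copy */
    return buf;
}
```

The point of the flag is only that the routine doing the expansion is also the one that knows whether a copy was made, which is the piece of information missing when the expansion happens inside fastgetattr().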

> 
> BTW, I don't see any really good reason to keep the out-of-line values
> in a separate physical file (relation) as Jan originally proposed.
> Why not keep them in the same file, but mark them as being something
> different than a normal tuple?  Sequential scans would have to know to
> skip over them (big deal), and VACUUM would have to handle them
> properly, but I think VACUUM is going to have to have special code to
> support this feature no matter what.  If we do make them a new primitive
> kind-of-a-tuple on disk, we could sidestep the problem of marking all
> the out-of-line values associated with a tuple when the tuple is
> outdated by a transaction.  The out-of-line values wouldn't have
> transaction IDs in them at all; they'd just be labeled with the CTID
> and/or OID of the primary tuple they belong to.  VACUUM would consult
> that tuple to determine whether to keep or discard an out-of-line value.

I disagree.  By moving to another table, we don't have non-standard
tuples in the main table.  We can create normal tuples in the long*
table, of identical format, and access them just like normal tuples. 
Having special long tuples in the main table that don't follow the
format of the other tuples is a certain mess.  The long* tables also
move the long data out of the main table so it is not accessed in
sequential scans.  Why keep them in the main table?

--
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
 

