Re: [HACKERS] LONG - Mailing list pgsql-hackers
From | Bruce Momjian |
---|---|
Subject | Re: [HACKERS] LONG |
Date | |
Msg-id | 199912122144.QAA08726@candle.pha.pa.us |
In response to | Re: [HACKERS] LONG (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses | Re: [HACKERS] LONG; libpq questions...when threads collide |
List | pgsql-hackers |
> 2. While reading a tuple, fastgetattr() automatically fetches the
> out-of-line value if it sees the requested attribute is out-of-line.
> (I'd be inclined to mark out-of-line attributes in the same way that
> NULL attributes are marked: one bit in the tuple header shows if any
> out-of-line attrs are present, and if so there is a bitmap to show
> which ones are out-of-line. We could also use Bruce's idea of
> commandeering the high-order bit of the varlena length word, but
> I think that's a much uglier and more fragile solution.)

Not sure if fastgetattr() is the place for this. I thought the varlena
access routines themselves would work. It is nice and clean to do it in
fastgetattr, but how do you know to pfree it? I suppose if you kept the
high bit set, you could try cleaning up, but where?

My idea was to expand the out-of-line varlena, and unset the 'long' bit.

    long-bit|length|reloid|tupleoid|attno|longlen

Unexpanded would be:

    1|20|10032|23123|5|20000

Expanded is:

    0|20000|data

>
> I think that these two changes would handle 99% of the problem.
> VACUUM would still need work, but most normal access to tuples would
> just work automatically, because all access to varlena fields must go
> through fastgetattr().
>
> An as-yet-unsolved issue is how to avoid memory leaks of out-of-line
> values after they have been read in by fastgetattr(). However, I think
> that's going to be a nasty problem with Jan's approach as well. The
> best answer might be to solve this in combination with addressing the
> problem of leakage of temporary results during expression evaluation,
> say by adding some kind of reference-count convention to all varlena
> values.

That's why I was going to do the expansion only in the varlena access
routines. Patch already posted.

>
> BTW, I don't see any really good reason to keep the out-of-line values
> in a separate physical file (relation) as Jan originally proposed.
> Why not keep them in the same file, but mark them as being something
> different than a normal tuple? Sequential scans would have to know to
> skip over them (big deal), and VACUUM would have to handle them
> properly, but I think VACUUM is going to have to have special code to
> support this feature no matter what. If we do make them a new primitive
> kind-of-a-tuple on disk, we could sidestep the problem of marking all
> the out-of-line values associated with a tuple when the tuple is
> outdated by a transaction. The out-of-line values wouldn't have
> transaction IDs in them at all; they'd just be labeled with the CTID
> and/or OID of the primary tuple they belong to. VACUUM would consult
> that tuple to determine whether to keep or discard an out-of-line value.

I disagree. By moving to another table, we don't have non-standard
tuples in the main table. We can create normal tuples in the long*
table, of identical format, and access them just like normal tuples.
Having special long tuples in the main table that don't follow the
format of the other tuples is a certain mess. The long* tables also move
the long data out of the main table so it is not accessed in sequential
scans. Why keep them in the main table?

--
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
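
To make the reference layout in the message concrete (long-bit|length|reloid|tupleoid|attno|longlen, expanded to 0|length|data), here is a minimal Python sketch. It is an illustration only, not PostgreSQL code and not the posted patch. Assumptions not stated in the message: the long bit is the high-order bit of a 32-bit little-endian length word, every other field is 4 bytes wide (which happens to give the 20-byte reference of the example), and `fetch` is a hypothetical callback standing in for the lookup in the long* table.

```python
import struct

# Assumed encoding: 32-bit length word whose high-order bit is the 'long' flag.
LONG_BIT = 0x80000000
LEN_MASK = 0x7FFFFFFF
HDR = struct.Struct("<I")      # length word (flag + length)
REF = struct.Struct("<IIII")   # reloid, tupleoid, attno, longlen (assumed 4 bytes each)


def make_reference(reloid, tupleoid, attno, longlen):
    """Build the unexpanded form: long bit set, length = size of the reference."""
    total = HDR.size + REF.size  # 4 + 16 = 20 bytes under the assumed widths
    return HDR.pack(LONG_BIT | total) + REF.pack(reloid, tupleoid, attno, longlen)


def is_long(value):
    """True if the value is still an out-of-line reference."""
    (word,) = HDR.unpack_from(value)
    return bool(word & LONG_BIT)


def expand(value, fetch):
    """Return the expanded form: long bit clear, length word, then the data.

    Following the example in the message, the expanded length word holds
    longlen (the data length). `fetch` is a hypothetical lookup function.
    """
    if not is_long(value):
        return value  # already expanded, nothing to do
    reloid, tupleoid, attno, longlen = REF.unpack_from(value, HDR.size)
    data = fetch(reloid, tupleoid, attno)
    if len(data) != longlen:
        raise ValueError("long value length mismatch")
    return HDR.pack(longlen) + data


# Usage with the numbers from the message: 1|20|10032|23123|5|20000
store = {(10032, 23123, 5): b"x" * 20000}   # stand-in for the long* table
ref = make_reference(10032, 23123, 5, 20000)
full = expand(ref, lambda r, t, a: store[(r, t, a)])
```

Because the flag travels with the value itself, any routine that reads a varlena can test `is_long` and expand on demand. That is the property argued for above when placing the expansion in the varlena access routines rather than in fastgetattr().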