
Re: Frequent Update Project: Design Overview of HOT Updates - Mailing list pgsql-hackers

From	Pavan Deolasee
Subject	Re: Frequent Update Project: Design Overview of HOT Updates
Date	November 10, 2006 02:57:30
Msg-id	2e78013d0611092223n15180ccdkd0f00c20c71374e7@mail.gmail.com
In response to	Re: Frequent Update Project: Design Overview of HOT Updates (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: Frequent Update Project: Design Overview of HOT Updates
List	pgsql-hackers


On 11/10/06, Tom Lane <tgl@sss.pgh.pa.us> wrote:

"Pavan Deolasee" <pavan.deolasee@gmail.com> writes:
> On 11/10/06, Josh Berkus < josh@agliodbs.com> wrote:
>> I believe that's the "unsolved technical issue" in the prototype, unless
>> Pavan has solved it in the last two weeks. Pavan?
>>
> When an overflow tuple is copied back to the main heap, the overflow tuple
> is marked with a special flag (HEAP_OVERFLOW_MOVEDBACK). Subsequently,
> when a backend tries to lock the overflow version of the tuple, it checks
> the flag and jumps to the main heap if the flag is set.

(1) How does it "jump to the main heap"? The links go the other
direction.

The overflow tuple has a special header that stores a back pointer to the main heap.
This increases the tuple header size by 6 bytes, but the overhead applies only to
overflow tuples.
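
To make the back-pointer idea concrete, here is a minimal sketch in C. It is not code from the prototype: the struct names, the `t_backptr` field, the `sketch_redirect_to_main` helper and the bit value chosen for HEAP_OVERFLOW_MOVEDBACK are all assumptions made for illustration. Only the flag's name and the 6-byte cost come from this thread; 6 bytes matches the size of a tuple identifier made of three 16-bit fields.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for a tuple identifier: three 16-bit fields, 6 bytes in total. */
typedef struct {
    uint16_t bi_hi;    /* high 16 bits of the block number */
    uint16_t bi_lo;    /* low 16 bits of the block number */
    uint16_t ip_posid; /* line pointer offset within the block */
} SketchTid;

static_assert(sizeof(SketchTid) == 6, "back pointer should cost 6 bytes");

/* Hypothetical bit value; only the flag's name comes from the thread. */
#define HEAP_OVERFLOW_MOVEDBACK 0x8000

/* Hypothetical extra header carried only by overflow tuples. */
typedef struct {
    uint16_t  t_infomask; /* flag bits */
    SketchTid t_backptr;  /* points back to the tuple in the main heap */
} SketchOverflowHeader;

/*
 * Returns true and fills *target when the overflow tuple has been copied
 * back, so a backend trying to lock it must go to the main heap instead.
 * Returns false (leaving *target untouched) otherwise.
 */
static bool
sketch_redirect_to_main(const SketchOverflowHeader *hdr, SketchTid *target)
{
    if ((hdr->t_infomask & HEAP_OVERFLOW_MOVEDBACK) == 0)
        return false;
    *target = hdr->t_backptr;
    return true;
}
```

A caller would check the return value before locking: on true it retries against the returned main-heap tid, on false it proceeds with the overflow tuple.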

(2) Isn't this full of race conditions?

I agree, there could be race conditions. But IMO we can handle those. When we
follow the tuple chain, we hold a SHARE lock on the main heap buffer. Also, when
the root tuple is vacuumable and needs to be overwritten, we acquire and hold an
EXCLUSIVE lock on the main heap buffer.

This reduces the race conditions to a great extent.

(3) I thought you already used up the one remaining t_infomask bit.

Yes. The last bit in the t_infomask is used up to mark the presence of the overflow tuple header. But I believe there are a few more bits that can be reused. There are also three bits available in the t_ctid field (since ip_posid needs at most 13 bits). One bit is used to identify whether a given tid points to the main heap or the overflow heap. This helps when tids are passed around in the code.

Since the back pointer from the overflow tuple always points to the main heap, the same bit can be used to mark copied-back tuples (we are doing it in a slightly different way in the current prototype though).
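
The bit arithmetic can be sketched as follows. This is an illustration only, not the prototype's code: the macro and function names are hypothetical, and the choice of the topmost bit as the heap selector is an assumption. The only facts taken from the mail are that the offset needs at most 13 of the 16 bits, which leaves three spare, and that one spare bit distinguishes the main heap from the overflow heap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The offset occupies the low 13 bits of a 16-bit ip_posid-like field. */
#define SKETCH_OFFSET_BITS  13
#define SKETCH_OFFSET_MASK  ((uint16_t) ((1u << SKETCH_OFFSET_BITS) - 1))

/* Hypothetical choice: use the topmost of the three spare bits. */
#define SKETCH_OVERFLOW_BIT ((uint16_t) (1u << 15))

/* Pack an offset and a heap selector into one 16-bit value. */
static uint16_t
sketch_pack_posid(uint16_t offset, bool in_overflow_heap)
{
    assert(offset <= SKETCH_OFFSET_MASK);
    return (uint16_t) (offset | (in_overflow_heap ? SKETCH_OVERFLOW_BIT : 0));
}

/* Recover the plain offset, ignoring the spare bits. */
static uint16_t
sketch_offset(uint16_t posid)
{
    return (uint16_t) (posid & SKETCH_OFFSET_MASK);
}

/* True when the tid refers to the overflow heap rather than the main heap. */
static bool
sketch_in_overflow_heap(uint16_t posid)
{
    return (posid & SKETCH_OVERFLOW_BIT) != 0;
}
```

Because a back pointer always refers to the main heap, the selector bit carries no information there, which is what makes it free to mean "copied back" in that one position.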

Regards,
Pavan
