Re: Index only scan paving the way for "auto" clustered tables? - Mailing list pgsql-hackers

From Kääriäinen Anssi
Subject Re: Index only scan paving the way for "auto" clustered tables?
Date
Msg-id BC19EF15D84DC143A22D6A8F2590F0A7886413307D@EXMAIL.stakes.fi
In response to Re: Index only scan paving the way for "auto" clustered tables?  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Index only scan paving the way for "auto" clustered tables?
List pgsql-hackers
Robert Haas wrote:
"""
And it seems to me that there could easily be format changes that
would make sense for particular cases, but not across the board,
like:

- index-organized tables (heap is a btree, and secondary indexes
reference the PK rather than the TID; this is how MySQL does it, and
Oracle offers it as an option)
- WORM tables (no updates or deletes, and no inserts after creating
transaction commits, allowing a much smaller tuple header)
- non-transactional tables (tuples visible as soon as they're written,
again allowing for smaller tuple header; useful for internal stuff and
perhaps for insert-only log tables)
"""

This is probably a silly idea, but I have been wondering about the
following: instead of storing visibility info in each row header,
keep a few row visibility slots in the page header. Rows in the page
could share these slots, so a bulk insert/update/delete would use
only one slot. If the slots overflow, you would use an external
slots buffer.

When the row is all visible, no slot would be used at all.
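
To make the sharing concrete, here is a minimal sketch in Python. All
names (VisSlot, Page, max_slots, slot_for) are hypothetical and not
PostgreSQL code; the slot limit of 4 is an arbitrary assumption. It only
shows that rows written by one transaction and command end up sharing a
single page-level slot, and that an all-visible row references no slot.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class VisSlot:
    """Visibility info shared by rows on one page (hypothetical layout)."""
    xmin: int
    xmax: int  # 0 means "not deleted" in this sketch
    cid: int


@dataclass
class Row:
    data: str
    slot: Optional[int] = None  # None: row is all visible, no slot used


@dataclass
class Page:
    max_slots: int = 4  # assumed size of the slot array in the page header
    slots: List[VisSlot] = field(default_factory=list)
    rows: List[Row] = field(default_factory=list)

    def slot_for(self, vis: VisSlot) -> int:
        """Reuse an identical slot if one exists, else allocate a new one."""
        if vis in self.slots:
            return self.slots.index(vis)
        if len(self.slots) >= self.max_slots:
            # The proposal would fall back to the external slots buffer here.
            raise OverflowError("page slots full")
        self.slots.append(vis)
        return len(self.slots) - 1

    def insert(self, data: str, xid: int, cid: int) -> None:
        self.rows.append(Row(data, self.slot_for(VisSlot(xid, 0, cid))))


# A bulk insert by one transaction uses a single slot for all rows.
page = Page()
for i in range(100):
    page.insert(f"row {i}", xid=1234, cid=0)
print(len(page.rows), "rows share", len(page.slots), "slot")
```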

The xmin, xmax and cid would be in the slots. ctid would keep its
current meaning, except when an external slot is used: then the ctid
would point to the external slot, which would hold the real row
header. I don't know if there are any other row header parts which
could be shared.

The external slots buffer would then contain xmin, xmax, cid and
the real ctid.
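
The three cases (all visible, page slot, external slot) could be
resolved roughly as in the sketch below. This is only an illustration:
the names are hypothetical, and how a reader would tell that a ctid
points into the external buffer is not specified above, so the
`external` flag is purely an assumption.

```python
from typing import Dict, List, NamedTuple, Optional, Tuple

Ctid = Tuple[int, int]  # (block number, offset), as in a normal ctid


class Vis(NamedTuple):
    xmin: int
    xmax: int
    cid: int


class ExternalSlot(NamedTuple):
    vis: Vis
    real_ctid: Ctid  # the ctid the row would normally have carried


class Row(NamedTuple):
    ctid: Ctid
    page_slot: Optional[int] = None  # index into the page header slots
    external: bool = False           # assumed marker: ctid is an external slot


def resolve(row: Row,
            page_slots: List[Vis],
            external_buffer: Dict[Ctid, ExternalSlot]
            ) -> Tuple[Optional[Vis], Ctid]:
    """Return (visibility info or None if all visible, real ctid)."""
    if row.external:
        ext = external_buffer[row.ctid]
        return ext.vis, ext.real_ctid
    if row.page_slot is not None:
        return page_slots[row.page_slot], row.ctid
    return None, row.ctid


page_slots = [Vis(xmin=100, xmax=0, cid=0)]
external_buffer = {(9, 1): ExternalSlot(Vis(101, 102, 3), real_ctid=(5, 7))}

all_visible = Row(ctid=(5, 1))
in_page = Row(ctid=(5, 2), page_slot=0)
overflowed = Row(ctid=(9, 1), external=True)

for r in (all_visible, in_page, overflowed):
    print(resolve(r, page_slots, external_buffer))
```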

Updates would write the new rows to another page in the heap,
and old rows would stay in place, just as now. So there would be
nothing like a redo log configuration. Also, the external slots
buffer would be small (18 bytes per row), so it would tend to stay
in cache.
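
One reading of the 18-byte figure, assuming PostgreSQL's usual field
widths (4 bytes each for xmin, xmax and cid, plus a 6-byte ctid made of
a 4-byte block number and a 2-byte offset), can be checked with a small
Python sketch. The packed layout and the name EXTERNAL_SLOT are
assumptions for illustration only.

```python
import struct

# Assumed packed layout of one external slot entry:
#   xmin (uint32), xmax (uint32), cid (uint32),
#   ctid block number (uint32), ctid offset (uint16)
EXTERNAL_SLOT = struct.Struct("<IIIIH")

entry = EXTERNAL_SLOT.pack(1234, 0, 0, 42, 7)
print(EXTERNAL_SLOT.size, "bytes per row")

# Rough number of entries that fit in 1 MiB of cache.
entries_per_mib = (1024 * 1024) // EXTERNAL_SLOT.size
print(entries_per_mib, "entries per MiB")
```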

The performance would suck if you had lots of small updates or
long-running transactions. On the other hand, in data warehousing,
where bulk loads are normal and there are a lot of small rows,
this could actually work.

As said, this is probably a silly idea. But as pluggable heap types
came up, I thought I would ask if this could actually work. If
speculative posts like this are inappropriate for this list, please
tell me so that I can avoid them in the future.
- Anssi Kääriäinen

