Re: Including Snapshot Info with Indexes - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Including Snapshot Info with Indexes
Date
Msg-id 470F3846.3040102@enterprisedb.com
Whole thread Raw
In response to Re: Including Snapshot Info with Indexes  ("Gokulakannan Somasundaram" <gokul007@gmail.com>)
Responses Re: Including Snapshot Info with Indexes
Re: Including Snapshot Info with Indexes
List pgsql-hackers
Gokulakannan Somasundaram wrote:
>          Last two mails were sent by mistake without completion. I couldn't
> curse my system any further

:-)

> a) Dead Space, if it is successfull in its implementation of what it claims,
> will have the means to point out that all the tuples of certain chunks are
> frozen for registered relations and registered chunks. There would be lot of
> blocks which won't fall under this category.
>               i) For example, if the records are newly inserted, that block
> can't be marked as containing all frozen tuples.

If records have just been inserted to a block, it is in cache. Therefore
hitting that block to check visibility isn't going to cost much. There
might be some middle-ground where a tuple has been inserted a while ago,
so that the block has already been evicted from cache, but the
transaction hasn't yet been committed, but that's a pretty narrow use case.

Note that we can flag a page in the DSM not only by VACUUM, but by any
backend as soon as all tuples are visible to everyone. You do have to
scan the tuple headers on the page to determine that, but that's not so
much overhead, and it could be offloaded to the bgwriter.

>              ii) Imagine the case where there is a batch job / Huge select
> query running in a enterprise for more than 6hrs. All the blocks which have
> got inserted into the tables, during this period might not be able to get
> the advantage of DSM

Yep, true. A long-running transaction like that is problematic anyway,
because we can't vacuum away any dead rows generated during that period.

>             iii) Imagine the case for which i am actually proposing the
> Index with Snapshot infos. Partitioned tables. Every time a new table gets
> created, it has to get registered into the Deadspace. This requires more
> maintenance on the DBA Side

Why do you think that the DBA needs to register tables to the DSM
manually? Surely that would happen automatically.

>             iv) I understand the DeadSpaceLock to be a Global lock(If one
> transaction is updating the dead space for any unregistered chunk, no one
> else can query the DeadSpace). If my statement is right, then partitioned
> tables might not be able to benefit from DSM. We have to remember for tables
> with daily partitions, this would prove to be a nightmare

The patch submitted for 8.3 did use a global lock, and a fixed size
shared memory area, but those were exactly the reasons the patch was
rejected. It will have to be reworked for 8.4.

> Other than that there are lot of advantages, i foresee with including the
> indexes with snapshots
> i) Vacumming of these indexes need not be done with SuperExclusive Locks. It
> is possible to design a strategy to vacuum these indexes with Exclusive
> locks on pages

I'm not convinced that's true. We only need super-exclusive locks on
index pages for interlocking index and heap accesses with non-MVCC
snapshots, IOW system tables. And since the lock is only held for a
short time and infrequently, it hasn't been a problem at all.

> ii) The above would mean that index can be in operation while the vacuum is
> happening

Huh? VACUUM hasn't locked out other access since version 7.2!

> iii) As we have already stated, it provides a efficient clustering of
> related data.

Sorry, I missed that part. What's that?

> iv) The Reverse Mapping Index, if present provides an efficient solution to
> the Retail Vacuum problem. So HOT can be improved further with no need to
> place the restriction of the updated tuple should be in the same page
> iv) Updates on tables with primary keys(where primary key is not updated),
> will be able to resolve the unique constraint faster. This is a minor
> advantage.
> 
> The complexity of Reverse Mapping index will only be there for user-defined
> functional indexes.

The *run-time* complexity of that will only be there for UDF indexes,
but the *code* complexity will always be there. Sorry, I think the
probability of a reverse mapping index being accepted to Postgres is
very close to zero.

--  Heikki Linnakangas EnterpriseDB   http://www.enterprisedb.com


pgsql-hackers by date:

Previous
From: Magnus Hagander
Date:
Subject: Re: Release notes introductory text
Next
From: Michael Paesold
Date:
Subject: Re: First steps with 8.3 and autovacuum launcher