Re: Visibility map thoughts - Mailing list pgsql-hackers
From | Gokulakannan Somasundaram |
---|---|
Subject | Re: Visibility map thoughts |
Date | |
Msg-id | 9362e74e0711050813o7eb2a353q7a7ea6cc9b86b608@mail.gmail.com Whole thread Raw |
In response to | Re: Visibility map thoughts ("Heikki Linnakangas" <heikki@enterprisedb.com>) |
List | pgsql-hackers |
On 11/5/07, Heikki Linnakangas <heikki@enterprisedb.com> wrote:
As you have pointed out 1 page in the visibility map points to 65535 pages in the heap. So even if we are locking the visibility map for a small time, it will affect all those scans, which will need to access these pages.
But we can set this bit only after a Vacuum process. The tuples might not be there, till the Vacuum process pitches in and marks this.
But these are determined by business conditions and we need to provide efficient solutions to deal with it.
I think DSM is not WAL-Logged. So when it gets reconstructed every time for a big table, isn't it a overhead?
Provided it supports Vacuuming & Freezing..
Because updates, inserts and deletes reduce the utility of Visibility map, it seems to be designed for more static tables, which don't experience much of these operations.
Gokulakannan Somasundaram wrote:
> a) The inserts won't increase the size of the table. If it increases, it has
> to lock one full page of Visibility map and this is not suitable for tables,
> which are short-lived like partitioned tables
The overhead of locking a page is very small.
Actually, extending a heap only needs to touch the visibility map when
we need a new visibility map page, if we initialize all bits to zero.
Like we do already anyway.
As you have pointed out 1 page in the visibility map points to 65535 pages in the heap. So even if we are locking the visibility map for a small time, it will affect all those scans, which will need to access these pages.
> b) Even if the inserts don't increase the size of the table, it might make
> DSM useless, if lot of inserts keep converting the all-visible ones to
> uncertain ones. For that matter, even the Deletes and Updates are also going
> to make lot of pages into uncertain ones.
Sure. If you have a lot of (random) inserts/updates/deletes, it becomes
much less useful.
A small mitigating factor is that an insert/update/delete will fetch the
heap page to memory anyway. Therefore having to access it just after the
update is cheap. This helps inserts in particular, because after the
inserting transaction is < OldestXmin, we can set the bit again.
But we can set this bit only after a Vacuum process. The tuples might not be there, till the Vacuum process pitches in and marks this.
> c) Visibility map gets useless, when there is a long running batch query /
> periodic background queries which run for longer times
Yeah, long running transactions are a bitch in many ways.
But these are determined by business conditions and we need to provide efficient solutions to deal with it.
> d) More updates- more blocks of uncertainity - space usage by DSM and the
> reference made to DSM is just an overhead
> e) Lot of times, people may not need index-only scans. Again this gets to be
> a overhead
The beauty of this approach is that the overhead is very small.
> f) If there are scheduled reboots, the DSM crashes and periodic slow-downs
> in the queries during the time, the DSM gets re-constructed.
That's rubbish.
I think DSM is not WAL-Logged. So when it gets reconstructed every time for a big table, isn't it a overhead?
> I am not opposing this, as it is a redundant feature for Thick indexes.
> After all every one of us, want Postgres to be the fastest one in the world.
And also the easiest to maintain, most space-efficient, most reliable
and so forth...
Provided it supports Vacuuming & Freezing..
> But because DSM has a inherent assumption that lot of tables will become
> static and all the tuples would be visible to everyone. If there are such
> tables, then definitely Thick index becomes a overhead in terms of space.
> But DSM should not become overhead at any cost, as it is a memory resident
> one at all times and also always gets into the lifecycle of a query. Only
> way to achieve it is to make it a dual purpose one. It should help Vacuum,
> freezing and visibility checks.
I don't understand this paragraph.
Because updates, inserts and deletes reduce the utility of Visibility map, it seems to be designed for more static tables, which don't experience much of these operations.
--
Thanks,
Gokul.
CertoSQL Project,
Allied Solution Group.
(www.alliedgroups.com)
pgsql-hackers by date: