Re: HOT: Incomplete issues - Mailing list pgsql-hackers
From | Pavan Deolasee |
---|---|
Subject | Re: HOT: Incomplete issues |
Date | |
Msg-id | 2e78013d0706270106g611ae372pd8be2334132a0f8f@mail.gmail.com |
In response to | HOT: Incomplete issues (ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp>) |
List | pgsql-hackers |
On 6/26/07, ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp> wrote:
> Hi,
> I'm testing HOT patches, applying to CVS HEAD.
Thanks a lot for your tests. I am posting a revised patch on -patches.
Please use that for further testing.
In the last few days, many people have reviewed the patch including
Simon, Heikki, Greg and Korry. I shall post a separate mail summarizing
the changes since the last revision.
> - MVCC-safe CLUSTER
> When I clustered a table with HOT-updated tuples, I saw the following error
> message. The most recently posted HOT patch does not support MVCC-safe CLUSTER.
> | ERROR: unexpected HeapTupleSatisfiesVacuum result
Yes, this is a known issue. Heikki had posted a patch to resolve this
conflict.
> - Number of unremovable tuples reported by VACUUM VERBOSE
> HOT-updated tuples (HEAPTUPLE_DEAD_CHAIN) are counted as "kept" and
> VACUUM VERBOSE prints them as "cannot be removed yet". However, we can
> actually remove them. We can reuse the data space of HOT-updated tuples,
> but need to keep their item pointers. We'd better show them as two
> different messages -- for example, unremovable tuples and unreusable
> item pointers.
We cannot remove a HEAPTUPLE_DEAD_CHAIN tuple because even if
it is dead, it might be the only way to reach the live tuple at the end of the chain.
The chain pruning logic ensures that we remove most such tuples before
running vacuum on the page, but a few might still be left. We cannot
reuse the data space just yet because then we lose the xmax/xmin check.
Also, with several redirecting line pointers, the HOT chain becomes very complex
and unmanageable.
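The reachability argument can be illustrated with a rough Python sketch of a chain walk. This is a toy model only, not the patch's code: `HeapTuple`, `find_live` and the dict-based page are hypothetical names, and the xmin/xmax comparison merely stands in for the check mentioned above.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class HeapTuple:
    """Toy tuple version (hypothetical model, not the real tuple header)."""
    xmin: int                 # id of the transaction that created this version
    xmax: Optional[int]       # id of the transaction that updated it, if any
    next_off: Optional[int]   # offset of the next chain member, None at the end
    dead: bool                # True if no transaction can see this version


def find_live(page: Dict[int, HeapTuple], root: int) -> Optional[int]:
    """Walk the chain from `root`; return the offset of the first non-dead
    tuple, or None if the chain is broken before one is reached."""
    off: Optional[int] = root
    prev_xmax: Optional[int] = None
    while off is not None:
        tup = page.get(off)
        if tup is None:
            return None  # a chain member was removed: the live tuple is unreachable
        if prev_xmax is not None and tup.xmin != prev_xmax:
            return None  # slot was reused by an unrelated tuple
        if not tup.dead:
            return off
        prev_xmax = tup.xmax
        off = tup.next_off
    return None


# Chain 1 -> 2 -> 3, where only the last version is live.
page = {
    1: HeapTuple(xmin=100, xmax=101, next_off=2, dead=True),
    2: HeapTuple(xmin=101, xmax=102, next_off=3, dead=True),
    3: HeapTuple(xmin=102, xmax=None, next_off=None, dead=False),
}
print(find_live(page, 1))   # reaches offset 3 through the dead members

del page[2]                 # "remove" a dead intermediate tuple
print(find_live(page, 1))   # the live tuple can no longer be reached
```

In this model, reusing slot 2 for an unrelated tuple is caught only because the walk compares each tuple's xmin with the previous tuple's xmax, which is why the data space cannot simply be handed out again.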
There are in fact a few scenarios here:
1. A dead tuple which is part of a HOT chain cannot be removed
2. A dead tuple which is marked LP_DELETE is removed and reported as "removable"
3. A redirect-dead line pointer is removed and reported as "removable"
In case 3, no real tuple is being removed. The tuple might have been
already reused or vacuumed. So it could be slightly misleading.
Another problem with the current reporting is that if the original dead tuple
is tracked with a separate lp-deleted line pointer and the original root
offset is redirect-dead, then it might be reported twice as "removable":
once for the lp-deleted tuple and again for the redirect-dead line pointer.
Maybe we should report the redirect-dead offsets as
"removable redirected offsets" and not count them in "removable" tuples?
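The double counting and the proposed split can be sketched as follows. This is an illustrative Python model under stated assumptions, not the VACUUM VERBOSE code: `LpState`, `report_current` and `report_proposed` are hypothetical names, and the states simply mirror the three scenarios listed above.

```python
from collections import Counter
from enum import Enum
from typing import Dict, Iterable


class LpState(Enum):
    """Hypothetical line pointer states for the scenarios discussed above."""
    LIVE = "live"
    DEAD_CHAIN = "dead_chain"          # dead, but still part of a HOT chain
    LP_DELETE = "lp_delete"            # dead tuple marked LP_DELETE
    REDIRECT_DEAD = "redirect_dead"    # redirect-dead line pointer, no tuple


def report_current(states: Iterable[LpState]) -> Dict[str, int]:
    """Reporting as described: redirect-dead pointers count as removable tuples."""
    counts = Counter(states)
    return {
        "removable": counts[LpState.LP_DELETE] + counts[LpState.REDIRECT_DEAD],
        "not_yet_removable": counts[LpState.DEAD_CHAIN],
    }


def report_proposed(states: Iterable[LpState]) -> Dict[str, int]:
    """Suggested split: redirect-dead offsets get their own counter."""
    counts = Counter(states)
    return {
        "removable": counts[LpState.LP_DELETE],
        "removable_redirected_offsets": counts[LpState.REDIRECT_DEAD],
        "not_yet_removable": counts[LpState.DEAD_CHAIN],
    }


# One original dead tuple, tracked by an lp-deleted pointer while its
# root offset is redirect-dead, plus one live tuple.
states = [LpState.REDIRECT_DEAD, LpState.LP_DELETE, LpState.LIVE]
print(report_current(states))    # the single dead tuple is counted twice
print(report_proposed(states))   # one removable tuple, one redirected offset
```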
> - ANALYZE and statistics of dead rows
> Since redirected or redirect-dead item pointers are counted as "dead rows",
> we overestimate the number of dead rows. This skews the statistics and
> adversely affects autovacuum: if autovacuum runs ANALYZE, the number of
> dead tuples appears to increase suddenly, which triggers an unnecessary VACUUM
> on the next autovacuum run.
A redirect-dead line pointer consumes 4 bytes of dead space in a page. If a table is full of
redirect-dead line pointers, we should trigger vacuum on the table. Maybe we can maintain
separate stats about redirect-dead line pointers and give them lower significance
when deciding whether to vacuum.
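One way to picture "lower significance" is a weighted dead count compared against the usual autovacuum-style threshold (a base threshold plus a scale factor times the number of tuples). The Python sketch below is only an assumption about how such weighting could look: `needs_vacuum`, `redirect_weight` and all numeric defaults are placeholders, not existing settings.

```python
def needs_vacuum(
    reltuples: float,
    dead_tuples: int,
    redirect_dead: int,
    base_thresh: int = 50,          # placeholder value
    scale_factor: float = 0.2,      # placeholder value
    redirect_weight: float = 0.1,   # hypothetical weight for redirect-dead pointers
) -> bool:
    """Return True if the weighted amount of dead space exceeds the threshold.

    Redirect-dead line pointers are tracked separately and contribute only a
    fraction of a dead tuple each, since each one wastes just a line pointer.
    """
    threshold = base_thresh + scale_factor * reltuples
    effective_dead = dead_tuples + redirect_weight * redirect_dead
    return effective_dead > threshold


# 1000-row table: the threshold is 50 + 0.2 * 1000 = 250.
print(needs_vacuum(1000, dead_tuples=100, redirect_dead=0))      # below threshold
print(needs_vacuum(1000, dead_tuples=100, redirect_dead=1000))   # 100 + 100, still below
print(needs_vacuum(1000, dead_tuples=100, redirect_dead=2000))   # 100 + 200, above
```

With a weight of this kind, a burst of redirect-dead pointers seen by ANALYZE would not immediately trigger a vacuum, while a table that is truly full of them eventually would.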
Thanks,
Pavan
--
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com