Open issues for HOT patch - Mailing list pgsql-hackers

From:      Tom Lane
Subject:   Open issues for HOT patch
Date:
Msg-id:    24348.1190083765@sss.pgh.pa.us
Responses: Re: Open issues for HOT patch
List:      pgsql-hackers
I have finished a first review pass over all of the HOT patch (updated code is posted on -patches). I haven't found any showstoppers, but there still seem to be several areas that need discussion:

* The patch makes undocumented changes that cause autovacuum's decisions to be driven by total estimated dead space rather than total number of dead tuples. Do we like this? What should happen to the default threshold parameters (they are not even in the same units as before...)? Is there any value in even continuing to track dead tuple counts, per se, in the pgstats machinery? It seems redundant/expensive to track both tuple counts and byte counts, and the size of the stats file is already known to be a performance issue ...

* I'm still pretty unhappy about the patch's use of a relcache copy of GetAvgFSMRequestSize()'s result. The fact that there's no provision for ever updating the value while the relcache entry lives is part of it, but the bigger part is that I'd rather not have anything at all depending on that number. FSM in its current form desperately needs to die; and once it's replaced by some form of distributed on-disk storage, it's unlikely that we will have any simple means of getting an equivalent number. The average request size was never meant for external use anyway, but only as a filter to help reject useless entries from getting into the limited shared-memory FSM space. Perhaps we could replace that heuristic with something that is page-local; it seems like dividing the total used space by the number of item pointers would give at least a rough approximation of the page's average tuple size (a sketch of that idea is appended below).

* We also need to think harder about when to invoke the page pruning code. As the patch stands, if you set a breakpoint at heap_page_prune_opt it'll seem to be hit constantly (eg, once for every system catalog probe), which seems uselessly often. And yet it also seems not often enough, because one thing I found out quickly is that the "prune if free space < 1.2 * average tuple size" heuristic fails badly for queries that execute multiple updates within the same heap page. We only prune when we first pin a particular target page, so the additional updates don't get another chance to see whether it's time to prune. I'd like to see if we can arrange to do pruning only when reading a page that is known to be an update target (ie, never during plain SELECTs); I suspect this would be relatively easy with some executor and perhaps planner changes. But that only fixes the first half of the gripe above; I'm not at all sure what to do about the multiple-updates-per-page issue.

Comments?

			regards, tom lane
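As a concrete illustration of the page-local idea, here is a minimal sketch, not code from the patch: the function names page_avg_tuple_size and page_prune_needed are invented for this example. It estimates average tuple size directly from the page and plugs that into the "free space < 1.2 * average tuple size" test mentioned above:

```c
#include "postgres.h"
#include "storage/bufpage.h"

/*
 * Rough page-local estimate of average tuple size: space used by tuple
 * data (between pd_upper and pd_special) divided by the number of line
 * pointers on the page.  (Function name invented for this sketch.)
 */
static Size
page_avg_tuple_size(Page page)
{
	PageHeader	phdr = (PageHeader) page;
	OffsetNumber nline = PageGetMaxOffsetNumber(page);

	if (nline == 0)
		return 0;				/* empty page: no estimate available */

	return (phdr->pd_special - phdr->pd_upper) / nline;
}

/*
 * Would opportunistic pruning look worthwhile on this page?  Mirrors the
 * "prune if free space < 1.2 * average tuple size" heuristic discussed
 * above.  (Function name invented for this sketch.)
 */
static bool
page_prune_needed(Page page)
{
	Size		avgsize = page_avg_tuple_size(page);

	if (avgsize == 0)
		return false;

	/* avgsize + avgsize/5 ~= 1.2 * avgsize, avoiding float math */
	return PageGetFreeSpace(page) < avgsize + avgsize / 5;
}
```

PageGetFreeSpace() already discounts one line pointer's worth of space, which seems close enough for a threshold test; the real question of where such a check gets called from is the third point above and is not addressed here.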