Thread: Re: [PATCHES] VACUUM Improvements - WIP Patch

Re: [PATCHES] VACUUM Improvements - WIP Patch

From: Pavan Deolasee
(taking the discussions to -hackers)

On Sat, Jul 12, 2008 at 11:02 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
>
> (2) It achieves speedup of VACUUM by pushing work onto subsequent
> regular accesses of the page, which is exactly the wrong thing.
> Worse, once you count the disk writes those accesses will induce it's
> not even clear that there's any genuine savings.
>

Well, in the worst case that is true. But in most other cases the
second-pass work will be combined with other normal activity and the
overhead will be shared, or at least there is a chance of that. There
is also a chance to delay the work until there is a real need for it,
e.g. an INSERT or UPDATE on the page that requires a free line
pointer.
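
To make the deferral idea concrete, here is a minimal standalone sketch
(made-up structures, nothing like the actual patch code): the insert
path prefers an already-unused slot and reclaims a dead line pointer
only when nothing else is available. A real implementation would of
course also have to know that the index entries for that dead slot are
already gone.

    #include <stdio.h>

    #define SLOTS_PER_PAGE 8

    typedef enum { SLOT_UNUSED, SLOT_NORMAL, SLOT_DEAD } SlotState;

    typedef struct
    {
        SlotState slots[SLOTS_PER_PAGE];
    } FakePage;

    /*
     * Find a slot for a new tuple.  Dead slots (left behind by VACUUM's
     * first pass) are reclaimed lazily, only when we actually need one.
     */
    static int
    get_free_slot(FakePage *page)
    {
        int i;

        for (i = 0; i < SLOTS_PER_PAGE; i++)
            if (page->slots[i] == SLOT_UNUSED)
                return i;

        for (i = 0; i < SLOTS_PER_PAGE; i++)
            if (page->slots[i] == SLOT_DEAD)
            {
                page->slots[i] = SLOT_UNUSED;   /* deferred reclaim happens here */
                return i;
            }

        return -1;                  /* page is genuinely full */
    }

    int
    main(void)
    {
        FakePage page = {{ SLOT_NORMAL, SLOT_DEAD, SLOT_NORMAL, SLOT_NORMAL,
                           SLOT_NORMAL, SLOT_NORMAL, SLOT_NORMAL, SLOT_NORMAL }};

        printf("new tuple goes into slot %d\n", get_free_slot(&page));
        return 0;
    }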


> (3) The fact that it doesn't work until concurrent transactions have
> gone away makes it of extremely dubious value in real-world scenarios,
> as already noted by Simon.
>

If there are indeed long-running concurrent transactions, we won't get
any benefit from this optimization. But the far more common case is
that of very short concurrent transactions. In that case, and for very
large tables, reducing the vacuum time is a significant win: the FSM
can be updated early and a significant part of VACUUM's work can be
finished quickly.

> It strikes me that what you are trying to do here is compensate for
> a bad decision in the HOT patch, which was to have VACUUM's first
> pass prune/defrag a page even when we know we are going to have to
> come back to that page later.  What about trying to fix things so
> that if the page contains line pointers that need to be removed,
> the first pass doesn't dirty it at all, but leaves all the work
> to be done at the second visit?  I think that since heap_page_prune
> has been refactored into a "scan" followed by an "apply", it'd be
> possible to decide before the "apply" step whether this is the case
> or not.
>

I am not against this idea. It's just that it still requires a double
scan of the main table, and that's exactly what we are trying to avoid
with this patch.
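
For reference, the decision Tom describes could look roughly like this
(a standalone sketch with made-up types, not the real heap_page_prune
code): after the "scan" step we already know whether a second visit is
unavoidable, so the "apply" step can be skipped entirely in that case.

    #include <stdio.h>

    typedef struct
    {
        int ndead;          /* line pointers the second pass must remove anyway */
        int nredirected;
        int nunused;
    } PruneScanResult;

    /* "scan" step: work out what pruning would do, without touching the page */
    static PruneScanResult
    prune_scan(void)
    {
        PruneScanResult r = { 2, 1, 0 };    /* pretend the scan found this */
        return r;
    }

    /* "apply" step: actually modify (and therefore dirty) the page */
    static void
    prune_apply(const PruneScanResult *r)
    {
        printf("page dirtied: %d redirected, %d marked unused\n",
               r->nredirected, r->nunused);
    }

    int
    main(void)
    {
        PruneScanResult r = prune_scan();

        /*
         * If the page holds dead line pointers that VACUUM's second pass must
         * come back for anyway, skip the apply step: the page stays clean and
         * we save one page write plus its WAL record.
         */
        if (r.ndead > 0)
            printf("second visit unavoidable; leaving the page clean for now\n");
        else
            prune_apply(&r);

        return 0;
    }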

Thanks,
Pavan

-- 
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com


Re: [PATCHES] VACUUM Improvements - WIP Patch

From: Tom Lane
"Pavan Deolasee" <pavan.deolasee@gmail.com> writes:
> (taking the discussions to -hackers)
> On Sat, Jul 12, 2008 at 11:02 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> (2) It achieves speedup of VACUUM by pushing work onto subsequent
>> regular accesses of the page, which is exactly the wrong thing.
>> Worse, once you count the disk writes those accesses will induce it's
>> not even clear that there's any genuine savings.

> Well, in the worst case that is true. But in most other cases the
> second-pass work will be combined with other normal activity and the
> overhead will be shared, or at least there is a chance of that. There
> is also a chance to delay the work until there is a real need for it,
> e.g. an INSERT or UPDATE on the page that requires a free line
> pointer.

That's just arm-waving: right now, pruning will be done by the next
*reader* of the page, whether or not he has any intention of *writing*
it.   With no proposal on the table for improving that situation,
I don't see any credibility in arguing for over-complicating VACUUM
on the grounds that it might happen someday.  In any case, the work
that is supposed to be done by VACUUM is being pushed to a foreground
query, which I find to be completely against our design principles.

>> It strikes me that what you are trying to do here is compensate for
>> a bad decision in the HOT patch, which was to have VACUUM's first
>> pass prune/defrag a page even when we know we are going to have to
>> come back to that page later.  What about trying to fix things so
>> that if the page contains line pointers that need to be removed,
>> the first pass doesn't dirty it at all, but leaves all the work
>> to be done at the second visit?

> I am not against this idea. It's just that it still requires a double
> scan of the main table, and that's exactly what we are trying to avoid
> with this patch.

The part of the argument that I found convincing was trying to reduce
the write traffic (especially WAL log output), not avoiding a second
read.  And the fundamental point still remains: the work should be done
in background, not foreground.
        regards, tom lane


Re: [PATCHES] VACUUM Improvements - WIP Patch

From: Gregory Stark
"Tom Lane" <tgl@sss.pgh.pa.us> writes:

>>> It strikes me that what you are trying to do here is compensate for
>>> a bad decision in the HOT patch, which was to have VACUUM's first
>>> pass prune/defrag a page even when we know we are going to have to
>>> come back to that page later.  What about trying to fix things so
>>> that if the page contains line pointers that need to be removed,
>>> the first pass doesn't dirty it at all, but leaves all the work
>>> to be done at the second visit?
>
>> I am not against this idea. It's just that it still requires a double
>> scan of the main table, and that's exactly what we are trying to avoid
>> with this patch.
>
> The part of the argument that I found convincing was trying to reduce
> the write traffic (especially WAL log output), not avoiding a second
> read.  And the fundamental point still remains: the work should be done
> in background, not foreground.

I like the idea of only having to do a single pass through the table though.
Couldn't Pavan's original plan still work and just not have other clients try
to remove dead line pointers? At least not unless they're also pruning the
page due to an insert or update anyways?

Fundamentally it does seem like we want to rotate vacuum's work-load. What
we're doing now is kind of backwards.

-- 
Gregory Stark
EnterpriseDB          http://www.enterprisedb.com
Ask me about EnterpriseDB's 24x7 Postgres support!


Re: [PATCHES] VACUUM Improvements - WIP Patch

From: Tom Lane
Gregory Stark <stark@enterprisedb.com> writes:
> I like the idea of only having to do a single pass through the table though.

Well, that argument was already overstated: we're not re-reading all of
the table now.  Just the pages containing dead line pointers.

> Couldn't Pavan's original plan still work and just not have other clients try
> to remove dead line pointers?

You could simply delay recycling of the really-truly-dead line pointers
until the next VACUUM, I suppose.  It's not clear how bad a
line-pointer-bloat problem that might leave you with.  (It would still
require tracking whether the last vacuum had completed successfully.
I note that any simple approach to that would foreclose ever doing
partial-table vacuums, which is something I thought was on the table
as soon as we had dead space mapping ability.)

> At least not unless they're also pruning the
> page due to an insert or update anyways?

Please stop pretending that this overhead will only be paid by
insert/update.  The current design for pruning does not work that way,
and we do not have a better design available.
        regards, tom lane


Re: [PATCHES] VACUUM Improvements - WIP Patch

From: Gregory Stark
"Tom Lane" <tgl@sss.pgh.pa.us> writes:

> Gregory Stark <stark@enterprisedb.com> writes:
>> I like the idea of only having to do a single pass through the table though.
>
> Well, that argument was already overstated: we're not re-reading all of
> the table now.  Just the pages containing dead line pointers.
>
>> Couldn't Pavan's original plan still work and just not have other clients try
>> to remove dead line pointers?
>
> You could simply delay recycling of the really-truly-dead line pointers
> until the next VACUUM, I suppose.  It's not clear how bad a
> line-pointer-bloat problem that might leave you with.  (It would still
> require tracking whether the last vacuum had completed successfully.
> I note that any simple approach to that would foreclose ever doing
> partial-table vacuums, which is something I thought was on the table
> as soon as we had dead space mapping ability.)

Well, there were three suggestions on how to track whether the last vacuum
committed or not: keeping the last vacuum id in pg_class, keeping it per page,
and keeping it per line pointer. ISTM either of the latter two would work with
partial-table vacuums. Per-line-pointer xids seemed excessively complicated to
me, but per-page vacuum ids don't seem problematic.
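
To spell out the per-page idea — a standalone sketch with invented
field names, not real PostgreSQL structures: each page pruned by a
vacuum gets stamped with that vacuum's id, and dead line pointers on it
become recyclable only once that vacuum is known to have committed.
Because the check is per page, a partial vacuum can make the same
decision without any relation-wide state beyond the last-committed id.

    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    typedef struct
    {
        uint32_t pd_vacuum_id;              /* hypothetical per-page field */
    } FakePageHeader;

    typedef struct
    {
        uint32_t last_completed_vacuum_id;  /* hypothetical pg_class-style field */
    } FakeRelation;

    /*
     * Dead line pointers on this page may be recycled only if the vacuum
     * that created them (and removed the matching index entries) committed.
     */
    static bool
    dead_pointers_recyclable(const FakeRelation *rel, const FakePageHeader *page)
    {
        return page->pd_vacuum_id <= rel->last_completed_vacuum_id;
    }

    int
    main(void)
    {
        FakeRelation   rel = { 41 };
        FakePageHeader pruned_by_committed_vacuum = { 40 };
        FakePageHeader pruned_by_interrupted_vacuum = { 42 };

        printf("committed-vacuum page recyclable:   %d\n",
               dead_pointers_recyclable(&rel, &pruned_by_committed_vacuum));
        printf("interrupted-vacuum page recyclable: %d\n",
               dead_pointers_recyclable(&rel, &pruned_by_interrupted_vacuum));
        return 0;
    }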

I would definitely agree that partial-table vacuums are an essential part of
the future.

>> At least not unless they're also pruning the
>> page due to an insert or update anyways?
>
> Please stop pretending that this overhead will only be paid by
> insert/update.  The current design for pruning does not work that way,
> and we do not have a better design available.

Well I'm not pretending. I guess there's something I'm missing here. I thought
we only pruned when we wanted to insert a new tuple and found not enough
space.

-- 
Gregory Stark
EnterpriseDB          http://www.enterprisedb.com
Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training!


Re: [PATCHES] VACUUM Improvements - WIP Patch

From: Heikki Linnakangas
Gregory Stark wrote:
> I thought
> we only pruned when we wanted to insert a new tuple and found not enough
> space.

Nope, we prune on any access to the page, if the page is "full enough", 
and the pd_prune_xid field suggests that there is something to prune.

The problem with only pruning on inserts is that by the time we get to 
heap_insert/heap_update, we're already holding a pin on the page, which 
prevents us from acquiring the vacuum lock.
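
Very roughly, and with made-up types and thresholds rather than the
actual heap_page_prune_opt code, the access-time decision looks like
this:

    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    typedef uint32_t FakeXid;

    typedef struct
    {
        FakeXid pd_prune_xid;       /* oldest xid that might make pruning useful */
        int     free_space;         /* bytes currently free on the page */
        int     prune_threshold;    /* "full enough" cutoff, e.g. from fillfactor */
        bool    pinned_by_others;   /* someone else also has the page pinned */
    } FakePage;

    static bool
    try_prune_on_access(FakePage *page, FakeXid oldest_xmin)
    {
        /* pd_prune_xid unset, or not yet old enough: nothing worth pruning */
        if (page->pd_prune_xid == 0 || page->pd_prune_xid >= oldest_xmin)
            return false;

        /* only bother if the page is "full enough" for pruning to pay off */
        if (page->free_space >= page->prune_threshold)
            return false;

        /*
         * Pruning needs the vacuum (cleanup) lock, which we can only get if
         * no other backend has the page pinned; otherwise just skip it.
         */
        if (page->pinned_by_others)
            return false;

        printf("pruning the page\n");
        return true;
    }

    int
    main(void)
    {
        FakePage page = { 100, 64, 200, false };

        try_prune_on_access(&page, 500);
        return 0;
    }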

-- 
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com


Re: [PATCHES] VACUUM Improvements - WIP Patch

From: Bruce Momjian
I assume there is no TODO here.

---------------------------------------------------------------------------

Pavan Deolasee wrote:
> (taking the discussions to -hackers)
> 
> On Sat, Jul 12, 2008 at 11:02 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >
> >
> > (2) It achieves speedup of VACUUM by pushing work onto subsequent
> > regular accesses of the page, which is exactly the wrong thing.
> > Worse, once you count the disk writes those accesses will induce it's
> > not even clear that there's any genuine savings.
> >
> 
> Well, in the worst case that is true. But in most other cases the
> second-pass work will be combined with other normal activity and the
> overhead will be shared, or at least there is a chance of that. There
> is also a chance to delay the work until there is a real need for it,
> e.g. an INSERT or UPDATE on the page that requires a free line
> pointer.
> 
> 
> > (3) The fact that it doesn't work until concurrent transactions have
> > gone away makes it of extremely dubious value in real-world scenarios,
> > as already noted by Simon.
> >
> 
> If there are indeed long-running concurrent transactions, we won't get
> any benefit from this optimization. But the far more common case is
> that of very short concurrent transactions. In that case, and for very
> large tables, reducing the vacuum time is a significant win: the FSM
> can be updated early and a significant part of VACUUM's work can be
> finished quickly.
> 
> > It strikes me that what you are trying to do here is compensate for
> > a bad decision in the HOT patch, which was to have VACUUM's first
> > pass prune/defrag a page even when we know we are going to have to
> > come back to that page later.  What about trying to fix things so
> > that if the page contains line pointers that need to be removed,
> > the first pass doesn't dirty it at all, but leaves all the work
> > to be done at the second visit?  I think that since heap_page_prune
> > has been refactored into a "scan" followed by an "apply", it'd be
> > possible to decide before the "apply" step whether this is the case
> > or not.
> >
> 
> I am not against this idea. It's just that it still requires a double
> scan of the main table, and that's exactly what we are trying to avoid
> with this patch.
> 
> Thanks,
> Pavan
> 
> -- 
> Pavan Deolasee
> EnterpriseDB http://www.enterprisedb.com
> 
> -- 
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers

-- 
Bruce Momjian  <bruce@momjian.us>        http://momjian.us
EnterpriseDB                             http://enterprisedb.com
+ If your life is a hard drive, Christ can be your backup. +


Re: [PATCHES] VACUUM Improvements - WIP Patch

From: Merlin Moncure
On Fri, Aug 22, 2008 at 11:36 PM, Bruce Momjian <bruce@momjian.us> wrote:
>
> I assume there is no TODO here.

Well, there doesn't seem to be a TODO for partial/restartable vacuums,
which were mentioned upthread.  This is a really desirable feature for
big databases and removes one of the reasons to partition large
tables.

merlin


Re: [PATCHES] VACUUM Improvements - WIP Patch

From: Joshua D. Drake
Merlin Moncure wrote:
> On Fri, Aug 22, 2008 at 11:36 PM, Bruce Momjian <bruce@momjian.us> wrote:
>> I assume there is no TODO here.
> 
> Well, there doesn't seem to be a TODO for partial/restartable vacuums,
> which were mentioned upthread.  This is a really desirable feature for
> big databases and removes one of the reasons to partition large
> tables.

I would agree that partial vacuums would be very useful.

Joshua D. Drake

> 
> merlin
> 



Re: [PATCHES] VACUUM Improvements - WIP Patch

From: Matthew T. O'Connor
Joshua D. Drake wrote:
> Merlin Moncure wrote:
>> Well, there doesn't seem to be a TODO for partial/restartable vacuums,
>> which were mentioned upthread.  This is a really desirable feature for
>> big databases and removes one of the reasons to partition large
>> tables.
> I would agree that partial vacuums would be very useful. 


I think everyone agrees that partial vacuums would be useful / *A Good
Thing*; it's the implementation that is the issue.  I was thinking
about Alvaro's recent work to make vacuum deal with TOAST tables
separately, which is almost like a partial vacuum since it effectively
splits the vacuum work up into multiple independent blocks of work, the
limitation obviously being that it can only split the work around
TOAST.  Is there any way that vacuum could work per relation segment
(relfile), since we already split tables into files that are never
larger than 1G?  I would think that if VACUUM never had more than 1G of
work to do at any given moment, it would be much more manageable.
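
Just to make that concrete, here is a sketch of what driving the work
one segment's worth of blocks at a time could mean; vacuum_block_range()
is a stand-in for the real vacuum machinery, not an existing function.

    #include <stdio.h>

    #define BLCKSZ      8192
    #define RELSEG_SIZE (1024 * 1024 * 1024 / BLCKSZ)   /* blocks in a 1G segment */

    /* stand-in for whatever would vacuum just this block range */
    static void
    vacuum_block_range(unsigned first, unsigned last)
    {
        printf("vacuuming blocks %u..%u as one independent unit of work\n",
               first, last);
    }

    int
    main(void)
    {
        unsigned nblocks = 300000;          /* example table of roughly 2.3 GB */
        unsigned seg_start;

        for (seg_start = 0; seg_start < nblocks; seg_start += RELSEG_SIZE)
        {
            unsigned seg_end = seg_start + RELSEG_SIZE - 1;

            if (seg_end >= nblocks)
                seg_end = nblocks - 1;

            vacuum_block_range(seg_start, seg_end);
        }
        return 0;
    }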



Re: [PATCHES] VACUUM Improvements - WIP Patch

From: Tom Lane
"Matthew T. O'Connor" <matthew@zeut.net> writes:
> I think everyone agrees that partial vacuums would be useful / *A Good 
> Thing* but it's the implementation that is the issue.

I'm not sure how important it will really be once we have support for
dead-space-map-driven vacuum.
        regards, tom lane


Re: [PATCHES] VACUUM Improvements - WIP Patch

From: Matthew T. O'Connor
Tom Lane wrote:
> "Matthew T. O'Connor" <matthew@zeut.net> writes:
>   
>> I think everyone agrees that partial vacuums would be useful / *A Good 
>> Thing* but it's the implementation that is the issue.
>>     
>
> I'm not sure how important it will really be once we have support for
> dead-space-map-driven vacuum.

Is that something we can expect any time soon? I haven't heard much 
about it really happening for 8.4.


Re: [PATCHES] VACUUM Improvements - WIP Patch

From: Heikki Linnakangas
Tom Lane wrote:
> [ off-list ]
> 
> "Matthew T. O'Connor" <matthew@zeut.net> writes:
>> Tom Lane wrote:
>>> I'm not sure how important it will really be once we have support for
>>> dead-space-map-driven vacuum.
> 
>> Is that something we can expect any time soon? I haven't heard much 
>> about it really happening for 8.4.
> 
> AFAIK Heikki has every intention of making it happen for 8.4.  I'll
> let him answer on-list, though.

I do. Unfortunately I've been swamped, and still am, with other stuff, 
and have had zero time to work on it :-(.

My current plan is to finish the FSM rewrite for/during the September 
commit fest, and submit a patch for dead-space-map driven VACUUM for the 
November commit fest.

My original plan was to enable index-only-scans using the DSM as well 
for 8.4, but it's pretty clear at this point that I don't have the time 
to finish that :-(.

-- 
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com


Re: [PATCHES] VACUUM Improvements - WIP Patch

From: Heikki Linnakangas
Abhijit Menon-Sen wrote:
> At 2008-08-25 10:24:01 +0300, heikki@enterprisedb.com wrote:
>> My original plan was to enable index-only-scans using the DSM as well
>> for 8.4, but it's pretty clear at this point that I don't have the
>> time to finish that :-(.
> 
> I wonder how hard that would be.

It's doable, for sure.

The pieces I see as required for that are:
1. change the indexam API so that indexes can return tuples
2. make sure the DSM is suitable for index-only scans, i.e. it must be
completely up-to-date and WAL-logged, so that if the DSM says that all 
tuples on a page are visible, they really must be.
3. planner/stats changes, so that the planner can estimate how much of 
an index scan can be satisfied without looking at the heap (it's not an 
all-or-nothing plan-time decision with this design)
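
To illustrate the per-tuple decision this implies — a standalone sketch
with invented structures, not a design for the actual patch — a tuple
can be returned straight from the index only when the DSM says its heap
page is entirely visible; otherwise the heap must still be visited.

    #include <stdbool.h>
    #include <stdio.h>

    typedef struct
    {
        bool all_visible;   /* DSM claims every tuple on this heap page is visible */
    } FakeDsmEntry;

    typedef struct
    {
        unsigned heap_page; /* heap page the index entry points at */
        int      datum;     /* value stored in the index entry (requirement 1) */
    } FakeIndexTuple;

    /* Returns true if the value could be answered from the index alone. */
    static bool
    fetch_index_only(const FakeDsmEntry *dsm, const FakeIndexTuple *itup, int *value)
    {
        if (dsm[itup->heap_page].all_visible)
        {
            *value = itup->datum;   /* no heap access needed */
            return true;
        }
        return false;               /* must visit the heap to check visibility */
    }

    int
    main(void)
    {
        FakeDsmEntry   dsm[2] = { { true }, { false } };
        FakeIndexTuple t1 = { 0, 42 };
        FakeIndexTuple t2 = { 1, 7 };
        int            v = 0;

        printf("t1 answered from index alone: %d\n", fetch_index_only(dsm, &t1, &v));
        printf("t2 answered from index alone: %d\n", fetch_index_only(dsm, &t2, &v));
        return 0;
    }

How often that first branch can be taken is exactly what the planner
work in point 3 would have to estimate.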

-- 
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com