Re: Limiting setting of hint bits by read-only queries; vacuum_delay - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: Limiting setting of hint bits by read-only queries; vacuum_delay
Date
Msg-id CA+U5nMKvewhqqnqbecSExrRv2G4S2eACK_DL6BzxzMc50EkJTw@mail.gmail.com
Whole thread Raw
In response to Re: Limiting setting of hint bits by read-only queries; vacuum_delay  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Limiting setting of hint bits by read-only queries; vacuum_delay  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: Limiting setting of hint bits by read-only queries; vacuum_delay  (Greg Stark <stark@mit.edu>)
List pgsql-hackers
On 25 March 2013 23:18, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Greg Stark <stark@mit.edu> writes:
>> On Mon, Mar 25, 2013 at 9:53 PM, Kevin Grittner <kgrittn@ymail.com> wrote:
>>> That would make it harder to construct a degenerate case
>
>> I don't think it's hard at all. It's the same as the case Simon wants
>> to solve except that the cost is incurred in a different way. Imagine
>> a system where there's a huge data load to a table which is then
>> read-only for an OLTP system. Until vacuum comes along -- and it may
>> never since the table never sees deletes or updates -- every
>> transaction needs to do a clog lookup for every tuple it sees. That
>> means a significant cpu slowdown for every row lookup forever more. To
>> save a one-time i/o cost.
>
> If I read Simon's original mail correctly, he was proposing that
> individual SELECTs would update some hint bits but eventually "get
> tired" and stop doing that.

Yes, thank you for understanding that.

> So repeated queries would eventually get
> all the hint bits set in a scenario such as you describe.  It's less
> clear what would happen in a scenario where there's a steady flow of
> new updates.

With a steady flow of updates we rely for cleanup upon i) the updates
themselves and/or small selects, ii) VACUUM

The idea is to avoid penalising users wishing to run larger queries,
who are effectively forced to clean up everybody's else's mess. It is
important to understand that the steady state is exactly the same
between the patched/unpatched cases when there are no larger queries.
The difference is that the steady state is maintained with patch,
rather than forcing the large query to perform long and slow *early*
cleanup of the database - earlier than would have occurred naturally
in the steady state. So we cannot argue that there is "more I/O"
because actually there will be exactly the same I/O required for
steady state updates, its just that less of that I/O happens earlier
during the large user query. Think about the I/O required with/without
the large query... it seems likely that early cleanup could itself
result in higher overall I/O.

> This is clearly worth thinking about and trying to find better solutions
> for.  I'm only against trying to solve it in the 9.3 timeframe.  It will
> take a lot longer than that to get something that works tolerably well.

I'll bet you all a beer at PgCon 2014 that this remains unresolved at
that point.

-- Simon Riggs                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: Request for vote to move forward with recovery.conf overhaul
Next
From: Tom Lane
Date:
Subject: Re: [COMMITTERS] pgsql: Add PF_PRINTF_ATTRIBUTE to on_exit_msg_fmt.