Re: Hint Bits and Write I/O - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: Hint Bits and Write I/O
Msg-id 1211963154.4489.312.camel@ebony.site
In response to Re: Hint Bits and Write I/O  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Tue, 2008-05-27 at 19:32 -0400, Tom Lane wrote:
> Simon Riggs <simon@2ndquadrant.com> writes:
> > My proposal is to have this as a two-stage process. When we set the hint
> > on a tuple in a clean buffer we mark it BM_DIRTY_HINTONLY, if not
> > already dirty. If we set a hint on a buffer that is BM_DIRTY_HINTONLY
> > then we mark it BM_DIRTY.
> 
> I wonder if it is worth actually counting the number of newly set hint
> bits, rather than just having a counter that saturates at two.  We could
> steal a byte from usage_count without making the buffer headers bigger.

That's the right place to start. We can instrument the backend like that
and then get some data about what actually happens. 
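
As a rough illustration of that kind of instrumentation, here is a minimal
sketch in C. It is a toy model, not PostgreSQL's buffer manager: the struct,
the toy_* names, the one-byte counter and the histogram buckets are all
assumptions made up for this example. It counts newly set hint bits per
buffer with a saturating counter and records the count when the buffer is
evicted, so we could see how many hints a typical page collects.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for a buffer header; NOT PostgreSQL's BufferDesc. */
#define TOY_HINT_COUNT_MAX UINT8_MAX
#define TOY_HIST_BUCKETS   4

typedef struct ToyBufferDesc
{
    uint8_t flags;
    uint8_t usage_count;
    uint8_t hint_count;     /* hints set since last reset; saturates at 255 */
} ToyBufferDesc;

/* Histogram of hint counts seen at eviction: 0, 1, 2-15, 16 or more. */
unsigned long toy_hint_hist[TOY_HIST_BUCKETS];

/* Call whenever a hint bit is newly set on a tuple in this buffer. */
void
toy_note_hint_set(ToyBufferDesc *buf)
{
    if (buf->hint_count < TOY_HINT_COUNT_MAX)
        buf->hint_count++;
}

/* Map a hint count to its histogram bucket. */
int
toy_hist_bucket(uint8_t n)
{
    if (n == 0)
        return 0;
    if (n == 1)
        return 1;
    if (n < 16)
        return 2;
    return 3;
}

/* Call when the buffer is evicted: record the count, then reset it. */
void
toy_record_on_evict(ToyBufferDesc *buf)
{
    int bucket = toy_hist_bucket(buf->hint_count);

    assert(bucket >= 0 && bucket < TOY_HIST_BUCKETS);
    toy_hint_hist[bucket]++;
    buf->hint_count = 0;
}
```

The bucket boundaries are arbitrary; the point is only that a one-byte
counter is enough to tell "a hint or two" apart from "most of the page".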

The other stuff is probably me just explaining it badly, so let's leave
it for now. You're right, it was too complex for a first cut.

> > If the bgwriter has time, it will write out BM_DIRTY_HINTONLY buffers,
> > though on a consistently busy server this should not occur.
> 
> What do you mean by "if it has time"?  How would it know that?
> 
> > This won't change the behaviour of first-read-after-copy. To improve
> > that behaviour, I suggest that we only move from BM_DIRTY_HINTONLY to
> > BM_DIRTY when we are setting the hint for a new xid. If we are just
> > setting the same xid over-and-over again then we should avoid setting
> > the page dirty. So when data has been loaded via COPY, we will just
> > check the status of the xid once, then scan the whole page using the
> > single-item transaction cache.
> 
> This doesn't make any sense to me.  What is a "new xid"?  And what is
> "setting the same xid over and over"?  If a page is full of occurrences
> of the same xid, that doesn't really mean that it's less useful to
> correctly hint each occurrence.
> 
> The whole proposal seems a bit overly complicated.  What we talked about
> at PGCon was simply not setting the dirtybit when setting a hint bit.
> There's a certain amount of self-optimization there: if a page
> continually receives hint bit updates, that also means it is getting
> pinned and hence its usage_count stays high, thus it will tend to stay
> in shared buffers until something happens to make it really dirty.
> (Although that argument might not hold water for a bulk seqscan: you'll
> have hinted all the tuples and then very possibly throw the page away
> immediately.  So counting the hints and eventually deciding we did
> enough to justify dirtying the page might be worth doing.)

Yes, we probably need to do something different for bulk seqscans.
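
To make the options concrete, here is a hedged sketch in C of "count the
hints and dirty the page once we have done enough". It is a toy model with
invented names (ToyBuf, toy_set_hint, TOY_BM_DIRTY, dirty_threshold), not
code from the backend. With a threshold of 0 hints never dirty the buffer
(the PGCon idea), with 1 the first hint dirties it, and with 2 it behaves
like the two-stage BM_DIRTY_HINTONLY proposal.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_BM_DIRTY 0x01u      /* toy flag, not the real BM_DIRTY */

typedef struct ToyBuf
{
    uint8_t flags;
    uint8_t hint_count;         /* hints set while clean; saturates at 255 */
} ToyBuf;

/*
 * Record that a hint bit was newly set in this buffer.
 * dirty_threshold == 0 means hints alone never dirty the buffer.
 * Returns true if this call is the one that made the buffer dirty.
 */
bool
toy_set_hint(ToyBuf *buf, uint8_t dirty_threshold)
{
    if (buf->flags & TOY_BM_DIRTY)
        return false;           /* already dirty, will be written anyway */

    if (buf->hint_count < UINT8_MAX)
        buf->hint_count++;

    if (dirty_threshold != 0 && buf->hint_count >= dirty_threshold)
    {
        buf->flags |= TOY_BM_DIRTY;
        return true;
    }
    return false;
}
```

A bulk seqscan could then pass a different threshold from ordinary access,
though what value makes sense is exactly what the instrumentation would
have to tell us.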

-- Simon Riggs           www.2ndQuadrant.com
PostgreSQL Training, Services and Support


