Re: [HACKERS] Clock with Adaptive Replacement - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: [HACKERS] Clock with Adaptive Replacement
Date
Msg-id CAH2-WzmbUWKvCqjDycpCOSF==PEswVf6WtVutgm9efohH0NfHA@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] Clock with Adaptive Replacement  (Peter Geoghegan <pg@bowt.ie>)
List pgsql-hackers
On Thu, May 3, 2018 at 5:16 PM, Peter Geoghegan <pg@bowt.ie> wrote:
> On Thu, May 3, 2018 at 12:37 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Wed, May 2, 2018 at 3:06 PM, Vladimir Sitnikov
>> <sitnikov.vladimir@gmail.com> wrote:
>>> Sample output can be seen here:
>>> https://github.com/vlsi/pgsqlstat/tree/pgsqlio#pgsqlio
>>
>> Neat.  Not sure what generated this trace, but note this part:

> I don't have time to check this out just now, but it seems like an
> excellent idea. It would be nice if it could be enhanced further, so
> you get some idea of what the blocks are without having to decode them
> yourself using tools like pageinspect.

Although I'm a Linux user, I added --enable-dtrace to my standard
release build configure alias several months back. I wanted to be able
to use the user space probes it adds with BCC [1], which is a really
nice tool (the dtrace/system tap probes seem to be totally generic).
If you have a fairly recent kernel, many really nice tricks are
possible with only modest effort. I've been meaning to polish up a
selection of one-liners I came up with several months back, to make
them available for others.

Here is a one-liner that traces buffer reads within the backend with PID 6617:

pg@bat:~/notes/setup$ sudo /usr/share/bcc/tools/trace -p 6617 --time
'u:/code/postgresql/root/install/bin/postgres:postgresql:buffer__read__start
"fork: %d, blockNum: %u, spcNode: %u, dbNode: %u, relNode: %u,
backend: %d, isExtend: %d" arg1, arg2, arg3, arg4, arg5, arg6, arg7'

If I run the query "select * from pgbench_accounts where aid = 555"
within that backend, I can see the following:

TIME     PID    TID    COMM         FUNC             -
12:17:28 6617   6617   postgres     buffer__read__start fork: 0,
blockNum: 290, spcNode: 1663, dbNode: 13100, relNode: 16404, backend:
-1, isExtend: 0
12:17:28 6617   6617   postgres     buffer__read__start fork: 0,
blockNum: 3, spcNode: 1663, dbNode: 13100, relNode: 16404, backend:
-1, isExtend: 0
12:17:28 6617   6617   postgres     buffer__read__start fork: 0,
blockNum: 2, spcNode: 1663, dbNode: 13100, relNode: 16404, backend:
-1, isExtend: 0
12:17:28 6617   6617   postgres     buffer__read__start fork: 0,
blockNum: 9, spcNode: 1663, dbNode: 13100, relNode: 16396, backend:
-1, isExtend: 0

This is pretty easy to interpret: 290 is the root page, 3 in another
internal page, and 2 in the leaf page. Finally, block 9 is from the
heap relation.

I haven't actually instrumented the amount of time each read takes
here, so it's not quite as good as your dtrace script. I'm sure that
that would be trivial, since BCC really does seem to make
customization very easy. I just don't want to spend more than 10
minutes on this today.

[1] https://github.com/iovisor/bcc
-- 
Peter Geoghegan


pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: [HACKERS] Parallel Append implementation
Next
From: Robert Haas
Date:
Subject: Re: parallel.sgml for Gather with InitPlans