On Thu, Aug 1, 2013 at 12:13 PM, Sergey Burladyan <eshkinkot@gmail.com> wrote:
> Jeff Janes <jeff.janes@gmail.com> writes:
>
>> I'd use strace to find what file handle is being read and written, and
>> lsof to figure out what file that is.
>
> I use strace, it is more read then write:
> $ cut -d '(' -f 1 /var/tmp/pg.trace | sort | uniq -c | sort -n
> 49 select
> 708 close
> 1021 open
> 7356 write
> 212744 read
> 219650 lseek
Based on your iotop (or whatever that was that you posted previously)
most of the reads must be coming from the file system cache.
>
> top reads:
> 7859 read(150 open("base/16444/17685.129", O_RDWR|O_CREAT, 0600) = 150
> 9513 read(149 open("base/16444/17685.128", O_RDWR|O_CREAT, 0600) = 149
> 10529 read(151 open("base/16444/17685.130", O_RDWR|O_CREAT, 0600) = 151
> 12155 read(152 open("base/16444/17685.131", O_RDWR|O_CREAT, 0600) = 152
> 12768 read(154 open("base/16444/17685.133", O_RDWR|O_CREAT, 0600) = 154
> 16210 read(153 open("base/16444/17685.132", O_RDWR|O_CREAT, 0600) = 153
>
> it is 'items' table:
> select relname from pg_class where relfilenode = 17685;
> relname
> ---------
> items
>
> each read is 8192 bytes, so for EXPLAIN query with two simple index scan, *without* ANALYZE postgres
> read (7859 + 9513 + 10529 + 12155 + 12768 + 16210) * 8192 = 565 526 528 bytes from it.
>
>> It looks like it is more write than read, which does seem strange.
>
> Why it read something for simple EXPLAIN, without real executing query? :-)
I figured it was reading some system catalogs or something. I don't
know why it would be reading the table files. Or writing much of
anything, either.
I think the next step would be to run gdb -p <pid> (but don't start
gdb until backend is in the middle of a slow explain), then:
break read
c
bt
Then repeat the c and bt combination a few more times, to build up a
dataset on what the call stack is which is causing the reads to
happen.
Cheers,
Jeff