Re: Slow count(*) again... - Mailing list pgsql-performance

From Mladen Gogala
Subject Re: Slow count(*) again...
Date
Msg-id 4CB11CC7.8070300@vmsinfo.com
In response to Slow count(*) again...  (Neil Whelchel <neil.whelchel@gmail.com>)
Responses Re: Slow count(*) again...  (Joe Conway <mail@joeconway.com>)
Re: Slow count(*) again...  (Craig Ringer <craig@postnewspapers.com.au>)
List pgsql-performance
Neil Whelchel wrote:
> I know that there have been many discussions on the slowness of count(*), even
> when an index is involved, because the visibility of the rows has to be
> checked. In the past I have seen many suggestions about using triggers and
> tables to keep track of counts, and while this works fine in a situation where
> you know what the report is going to be ahead of time, it is simply not an
> option when an unknown WHERE clause is to be used (dynamically generated).
> I ran into a fine example of this when I was searching this mailing list:
> "Searching in 856,646 pages took 13.48202 seconds. Site search powered by
> PostgreSQL 8.3." Obviously at some point count(*) came into play here, because
> the site made a list of pages (1 2 3 4 5 6 > next). I very commonly make a
> list of pages from search results, and the biggest time killer here is the
> count(*) portion. Even worse, I sometimes have to hit the database with
> two SELECT statements, one with OFFSET and LIMIT to get the page of results I
> need and another to get the total number of rows so I can estimate how many
> pages of results are available. The point I am driving at here is that since
> building a list of pages of results is such a common thing to do, there needs
> to be some specific high-speed way to do this in one query. Maybe an
> estimate(*) that works like count but gives an answer from the index without
> checking visibility? I am sure that this would be good enough to make a page
> list; it is really no big deal if it errs on the positive side, maybe the
> list of pages has an extra page off the end. I can live with that. What I
> can't live with is taking 13 seconds to get a page of results from 850,000
> rows in a table.
> -Neil-
>
>
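For reference, the two-query paging pattern Neil describes usually looks something
like the following sketch (the database, table, and column names are made up):

    # fetch one page of results, then issue a separate full count for the pager links
    psql -d mydb -c "SELECT * FROM posts WHERE body ILIKE '%term%' ORDER BY id LIMIT 50 OFFSET 200;"
    psql -d mydb -c "SELECT count(*) FROM posts WHERE body ILIKE '%term%';"   # this is the slow part
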
Unfortunately, the problem is in the rather primitive way PostgreSQL
does I/O. It didn't change in 9.0, so there is nothing you would gain by
upgrading. If you run strace -p <PID of the backend doing the sequential
scan> -o /tmp/pg.out -e read and inspect /tmp/pg.out when the query
finishes, you will notice a gazillion read requests, all of them 8192
bytes in size. That means PostgreSQL is reading the table block by
block, without any merging of the requests.
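For example, something along these lines (the PID, file descriptor, and data
shown are purely illustrative):

    # attach to the backend doing the sequential scan (PID is hypothetical)
    strace -p 12345 -o /tmp/pg.out -e read
    # /tmp/pg.out then fills with one 8192-byte read per table block:
    #   read(34, "\0\0\0\0\270..."..., 8192) = 8192
    #   read(34, "\0\0\0\0\300..."..., 8192) = 8192
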
You can alleviate the pain with OS-level tricks, like selecting the
deadline I/O scheduler in grub.conf and setting read-ahead on the
underlying block devices with blockdev, but there is nothing special
that can be done short of rewriting the way PostgreSQL does I/O. There
were rumors about asynchronous I/O in version 9.0, but that didn't
materialize. That is really strange to me, because PostgreSQL tables are
just files (or groups of files, once the table size exceeds 1GB). It
wouldn't be very hard to try reading 1MB at a time, and that would speed
up a full table scan significantly.
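
The OS-level knobs mentioned above would be set roughly like this (the device
name and read-ahead value are only examples; tune them for your own storage):

    # raise read-ahead on the device holding the data directory
    blockdev --getra /dev/sda            # current read-ahead, in 512-byte sectors
    blockdev --setra 4096 /dev/sda       # 4096 sectors = 2 MB of read-ahead
    # select the deadline elevator by adding it to the kernel line in grub.conf:
    #   kernel /vmlinuz-... ro root=... elevator=deadline

Read-ahead helps because the kernel then issues large sequential reads on
PostgreSQL's behalf, even though each individual read() still asks for only
8192 bytes.
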
The problem with single-block I/O is that there is a context switch for
each request, and the I/O scheduler has to work hard to merge the
requests appropriately. There is really no need for that: tables are
files, and reading through files is not a problem, even with much larger
blocks. In another database, whose name I will not mention, there is a
parameter, db_file_multiblock_read_count, which specifies how many
blocks will be read by a single read call when doing a full table scan.
PostgreSQL is in dire need of something similar, and it wouldn't even be
that hard to implement.
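
You can get a feel for the difference with dd against the file that backs a
table (the path below is hypothetical; it depends on your data directory and
the table's relfilenode, and the comparison is only meaningful with a cold
cache):

    F=/var/lib/pgsql/data/base/16384/16397   # hypothetical table file
    dd if=$F of=/dev/null bs=8k              # one read() per 8 kB block, as PostgreSQL does it
    dd if=$F of=/dev/null bs=1M              # same data in far fewer, larger read() calls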


--
Mladen Gogala
Sr. Oracle DBA
1500 Broadway
New York, NY 10036
(212) 329-5251
www.vmsinfo.com

