Thread: select and count efficiency (~35 mln rows)

select and count efficiency (~35 mln rows)

From
Grzegorz Kuczera
Date:
This is my first question here, so I would like to say hello to everyone:)

In my case, the problem appears when I want to fetch some data to inflate the table with it. First of all, I am counting the records from the table (for paging, over indexed column), then the select query is performed (with limit equal to 15).
I use the c3p0 library to manage the pool connections to the database and what I get is the timeout: both queries (separately) can take longer than 150 seconds. If I execute the query straight on the database (without usage of jdbc), it sometimes takes even 11 minutes to complete.

Month or two ago I vacuumed the database manually. Now I can see - in the last_vacuum and last_autovacuum from the pg_stat_all_tables - that the nulls are written there.

There are about 35 milion records in the table.

Details:
  • version - PostgreSQL 9.3.5 on x86_64-unknown-linux-gnu, compiled by gcc (Debian 4.7.2-5) 4.7.2, 64-bit
  • result of select pg_postmaster_start_time() - 2015-12-22 12:03:55.471436+01

And here is the question, which I posted on the Stackoverflow two months ago:
http://stackoverflow.com/questions/33009865/why-index-only-scan-is-taking-so-long

I would appreciate any help/tips.
Attachment

Re: select and count efficiency (~35 mln rows)

From
Andreas Kretschmer
Date:
Grzegorz Kuczera <grzegorz.kuczera@comarch.com> wrote:

> This is my first question here, so I would like to say hello to everyone:)
>
> In my case, the problem appears when I want to fetch some data to inflate the
> table with it. First of all, I am counting the records from the table (for
> paging, over indexed column), then the select query is performed (with limit
> equal to 15).

tl;dr.

Please read the links below:

http://www.depesz.com/2007/08/29/better-results-paging-in-postgresql-82/
https://wiki.postgresql.org/wiki/File:Pagination_Done_the_PostgreSQL_Way.pdf
https://coderwall.com/p/lkcaag

>
> There are about 35 milion records in the table.

do you really want to show all records, even with pagination? REALLY?

>
> And here is the question, which I posted on the Stackoverflow two months ago:
>  http://stackoverflow.com/questions/33009865/
> why-index-only-scan-is-taking-so-long


Because of MVCC / visibility a seq-scan is requisite.

For a guestimate about the number of rows you can look into pg_class.



Andreas
--
Really, I'm not out to destroy Microsoft. That will just be a completely
unintentional side effect.                              (Linus Torvalds)
"If I was god, I would recompile penguin with --enable-fly."   (unknown)
Kaufbach, Saxony, Germany, Europe.              N 51.05082°, E 13.56889°


Re: select and count efficiency (~35 mln rows)

From
Grzegorz Kuczera
Date:
> Please read the links below:
>
> http://www.depesz.com/2007/08/29/better-results-paging-in-postgresql-82/
> https://wiki.postgresql.org/wiki/File:Pagination_Done_the_PostgreSQL_Way.pdf
> https://coderwall.com/p/lkcaag

I am reading the articles right now.

>> >There are about 35 milion records in the table.
> do you really want to show all records, even with pagination? REALLY?
>

There are different accounts and rights in the system, most offen the
number of rows is narrowed down. But there is one account type, which
can see that many results.
And to be precise - problem appears even for the users, who are not able
to access the hole table.



Attachment