Re: Can I get the number of results plus the results with a single query? - Mailing list pgsql-general

From Bryn Llewellyn
Subject Re: Can I get the number of results plus the results with a single query?
Date
Msg-id 9BD56F4C-6781-427E-9E0A-BD7F2BD1CF89@yugabyte.com
Whole thread Raw
In response to Re: Can I get the number of results plus the results with a single query?  ("Peter J. Holzer" <hjp-pgsql@hjp.at>)
Responses Re: Can I get the number of results plus the results with a single query?
List pgsql-general
hjp-pgsql@hjp.at wrote:

That's not quite what I meant. I meant "I don't think there can be what you want with just one query",

The OP wants some kind of progress indicator. To be useful, such an indicator should be approximately linear in time. I.e. if your query returns 10000 rows in 5 minutes (yes, that's slow, but you don't need a progress bar for fast queries), it should display "0/10000" after 0 seconds, "33/10000" after 1 second, "2000/10000" after 1 minute, etc. That ideal is of course unrealistic, it's quite ok if it displays "0/unknown" fpr a few seconds and speeds up and slows down during execution. But if it displays "0/unknown" for 4 minutes and 55 seconds and then counts up to 10000 during the last 5 seconds, the progress
indicator is useless.

You are stuffing the whole result into an array and THEN counting the number of elements. So when you get to the count all of the work (except sending the result to the client) is already done, so there is little point in displaying a progress indicator.

I see, Peter. You’d read the OP’s mind. But I’d failed to. I saw the subject, you I assumed that the OP wanted the entire result set together with the count of the results. (After all, there’s no inflexions of “page” in the OP’s question.)

It seems, then, that the requirement is to page through results like a Google search for « SQL injection » does. The first page shows this

« About 29,400,000 results (0.45 seconds) »

And then it lets you choose other pages to see. That result of about 30 million is famously approximate. After all, nobody, would able be to disprove its correctness. And paging through the results from a SQL query in a stateless browser client has its own (in)famous characteristics.

It sounds like the OP wants a fast approximate count for a query whose restriction isn’t known until runtime. (His example uses just a single table). The planner makes estimates like this. Might something along these lines provide what’s being sought:

create table t(k int primary key);
insert into t(k) select generate_series(1, 1000000);
explain (costs true) select * from t where k between 20000 and 20050;

This is what it produced for me:

 Index Only Scan using t_pkey on t  (cost=0.42..5.39 rows=48 width=4)
   Index Cond: ((k >= 20000) AND (k <= 20050))

Seems to me that "rows=48” is a perfectly reasonable estimate…

pgsql-general by date:

Previous
From: "Peter J. Holzer"
Date:
Subject: Re: Can I get the number of results plus the results with a single query?
Next
From: Ted Toth
Date:
Subject: plpython questions