Re: Yet Another COUNT(*)...WHERE...question - Mailing list pgsql-general
From | Andrew Armstrong |
---|---|
Subject | Re: Yet Another COUNT(*)...WHERE...question |
Date | |
Msg-id | 001401c7df52$95b30c00$b6f3a305@plasma Whole thread Raw |
In response to | Re: Yet Another COUNT(*)...WHERE...question ("Scott Marlowe" <scott.marlowe@gmail.com>) |
List | pgsql-general |
I don't know how PSQL does it, but MySQL has an SQL_CALC_FOUND_ROWS extension which allows the query to also return how many rows exist without the LIMIT clause. Perhaps there is similar for PSQL (check LIMIT docs?) - Andrew -----Original Message----- From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Scott Marlowe Sent: Thursday, 16 August 2007 1:24 AM To: Phoenix Kiula Cc: Gregory Stark; Postgres General Subject: Re: [GENERAL] Yet Another COUNT(*)...WHERE...question On 8/15/07, Phoenix Kiula <phoenix.kiula@gmail.com> wrote: > On 15/08/07, Scott Marlowe <scott.marlowe@gmail.com> wrote: > > On 8/15/07, Phoenix Kiula <phoenix.kiula@gmail.com> wrote: > > > On 15/08/07, Gregory Stark <stark@enterprisedb.com> wrote: > > > > "Phoenix Kiula" <phoenix.kiula@gmail.com> writes: > > > > > > > > > I'm grappling with a lot of reporting code for our app that relies on > > > > > queries such as: > > > > > > > > > > SELECT COUNT(*) FROM TABLE WHERE ....(conditions)... > > > > >... > > > > > The number of such possibilities for multiple WHERE conditions is > > > > > infinite... > > > > > > > > Depends on the "conditions" bit. You can't solve all of the infinite > > > > possibilities -- well you can, just run the query above -- but if you want > to do better it's all about understanding your data. > > > > > > > > > I am not sure what the advice here is. The WHERE condition comes from > > > the indices. So if the query was not "COUNT(*)" but just a couple of > > > columns, the query executes in less than a second. Just that COUNT(*) > > > becomes horribly slow. > > > > Sorry, but I don't believe you. if you're doing a count(*) on the > > same dataset that returns in < 1 second, then the count(*) with the > > same where clause will run in < 1 second. I haven't seen pgsql do > > anything else. > > > > Sorry I was not clear. Imagine an Amazon.com search results page. It > has about 15 results on Page 1, then it shows "Page 1 of 190". > > To show each page, the query probably has a "LIMIT 15 OFFSET 0" for > Page 1. However, to calculate the total number of pages, they probably > do a separate counts query, because doing a "select *" and then > counting the number of rows returned would be even more inefficient > than a count(*). When I go to amazon.com I only ever get three pages of results. ever. Because they know that returning 190 pages is not that useful, as hardly anyone is going to wander through that many pages. Google, you'll notice says "Results 1 - 10 of about 5,610,000 for blacksmith" i.e. it's guesstimating as well. no reason for google to look at every single row for blacksmith to know that there's about 5.6 million. > So, in reporting, two queries are fairly common I would think, unless > I am missing something? Yes, optimization. :) You don't need an exact count to tell someone that there's more data and they can go to it. Note that if you are planning on doing things google sized, you'll need to do what they did, invent your own specialized database. For us mere mortals, it's quite likely that you can do something like: explain select * from table where field like 'abc%'; and then parse the explain output for an approximate number. ---------------------------(end of broadcast)--------------------------- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
pgsql-general by date: