Thread: More speed counting rows

More speed counting rows

From
Developer
Date:
Hello,

I am trying to optimize counting rows when using filters
(SELECT with one or more column predicates).

In this case I think PostgreSQL really counts all the rows,
resulting in unacceptable HTTP server response times of 4 seconds.
Triggers plus a stored-counts table do not seem acceptable here, because
I would need to store 1.5 million counting possibilities.

My question is:
Is there any method for an indirect count, such as ordered indexes plus a quadratic count?
Any module?
Any suggestions?

--
Publicidad y Servicios http://www.pas-world.com
Directorio http://www.precioventa.com
Tienda http://informatica.precioventa.com/es/
Autoridad certificadora http://ca.precioventa.com/es/


Re: More speed counting rows

From
Scott Marlowe
Date:
On Mon, Jul 27, 2009 at 3:06 AM, Developer<dev002@pas-world.com> wrote:
> Hello,
>
> I am trying to optimize counting rows when using filters
> (SELECT with one or more column predicates).
>
> In this case I think PostgreSQL really counts all the rows,
> resulting in unacceptable HTTP server response times of 4 seconds.
> Triggers plus a stored-counts table do not seem acceptable here, because
> I would need to store 1.5 million counting possibilities.
>
> My question is:
> Is there any method for an indirect count, such as ordered indexes plus a quadratic count?
> Any module?
> Any suggestions?

Postgres cannot count from indexes alone; it has to hit the table.
Rather than guessing at what pgsql is doing, use EXPLAIN ANALYZE
SELECT ... to see what your query is actually doing.  If it is having
to scan the table each time, then faster I/O or a different table
layout may be in order.
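A minimal sketch of that diagnosis, assuming a hypothetical items table
filtered on a category column:

```sql
-- Show the actual plan and timings for the filtered count.
-- A "Seq Scan" node in the output means the whole table is read.
EXPLAIN ANALYZE
SELECT count(*) FROM items WHERE category = 42;

-- An index on the filtered column may let the planner avoid
-- reading every row, though PostgreSQL still has to visit the
-- matching heap tuples to check visibility, so count(*) is
-- never completely free.
CREATE INDEX items_category_idx ON items (category);
```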

Re: More speed counting rows

From
Chris Ernst
Date:
Developer wrote:
> Hello,
>
> I am trying to optimize counting rows when using filters
> (SELECT with one or more column predicates).
>
> In this case I think PostgreSQL really counts all the rows,
> resulting in unacceptable HTTP server response times of 4 seconds.
> Triggers plus a stored-counts table do not seem acceptable here, because
> I would need to store 1.5 million counting possibilities.
>
> My question is:
> Is there any method for an indirect count, such as ordered indexes plus a quadratic count?
> Any module?
> Any suggestions?
>

I had a similar problem where HTTP requests triggered a count(*) over a
table that was growing rapidly.  The bigger the table got, the longer
the count took.  In my case, however, the counts only have to be a
reasonable estimate of the current state, so I solved this problem with
a count_sums table that gets updated every 30 minutes using a simple
perl script in a cron job.  The HTTP requests now trigger a very fast
select from a tiny, 9 row, 2 column table.

How "up to date" do the counts need to be?  If the count takes 4
seconds, can you run it every minute and store the counts in a table for
retrieval by the HTTP requests?  Or does it absolutely have to be the
exact count at the moment of the request?

If it needs to be more real-time, you could expand on this by adding
post insert/delete triggers that automatically update the counts table
to keep it current.  In my case it just wasn't necessary.
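A rough sketch of the cached-counts approach described above, with
hypothetical table, column, and filter names (the periodic refresh
would be driven from a cron job):

```sql
-- Tiny cache table: one row per filter whose count we need.
CREATE TABLE count_sums (
    filter_name text PRIMARY KEY,
    row_count   bigint NOT NULL,
    updated_at  timestamptz NOT NULL DEFAULT now()
);

-- Periodic refresh (e.g. from cron every 30 minutes):
UPDATE count_sums
SET row_count  = (SELECT count(*) FROM items WHERE category = 42),
    updated_at = now()
WHERE filter_name = 'category_42';

-- The HTTP request then does a fast single-row lookup
-- instead of a full count:
SELECT row_count FROM count_sums WHERE filter_name = 'category_42';
```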

    - Chris

Re: More speed counting rows

From
David Wilson
Date:
On Mon, Jul 27, 2009 at 5:06 AM, Developer<dev002@pas-world.com> wrote:
> Hello,
>
> I am trying to optimize counting rows when using filters
> (SELECT with one or more column predicates).
> My question is:
> Is there any method for an indirect count, such as ordered indexes plus a quadratic count?
> Any module?
> Any suggestions?

If all you need is a good-enough estimate, you can try reporting the
tuple count out of the stats tables. It'll only be as up to date as
autovacuum makes it, though. I do this in one app to give me ballpark
figures for some constantly-growing tables.
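One way to read those estimates, assuming ANALYZE or autovacuum has run
recently (the table name here is hypothetical):

```sql
-- Planner's row estimate, maintained by VACUUM/ANALYZE:
SELECT reltuples::bigint AS estimated_rows
FROM pg_class
WHERE relname = 'items';

-- Or the statistics collector's live-tuple count:
SELECT n_live_tup
FROM pg_stat_user_tables
WHERE relname = 'items';
```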


--
- David T. Wilson
david.t.wilson@gmail.com

Re: More speed counting rows

From
Developer
Date:
> How "up to date" do the counts need to be?  If the count takes 4
> seconds, can you run it every minute and store the counts in a table for
> retrieval by the HTTP requests?
Right now I am caching an integer count per filter in memory with a
timeout, but on a busy server the system would surely crash from running
out of memory (>700 MB for all combinations, if counts are added faster
than the timeout expires them).

> Or does it absolutely have to be the
> exact count at the moment of the request?
Some applications could fail without the exact number.


