Re: Performance Issues with count() - Mailing list pgsql-general

From Jan Wieck
Subject Re: Performance Issues with count()
Date
Msg-id 200204232325.g3NNPYT03436@saturn.janwieck.net
Whole thread Raw
In response to Performance Issues with count()  ("asdf asdasfa" <sjg@email.com>)
Responses Re: Performance Issues with count()  (Martijn van Oosterhout <kleptog@svana.org>)
List pgsql-general
asdf asdasfa wrote:
[text/html is unsupported, treating like TEXT/PLAIN]

> <DIV><FONT face=Arial size=2>
> <P class=MsoNormal style="MARGIN: 0in 0in 0pt"><FONT face="Times New Roman" size=3>Hi,</FONT></P>

    First  a  little  hint: mime mail with plain text and HTML is
    usually  already  unwanted  on  mailing  lists,  because   it
    increases  the  bandwidth  for  absolutely no benefit at all.
    Plus, alot of people don't even read HTML mail, so  HTML-only
    mail  like  yours  isn't  the  smartest  way to get access to
    developers knowledge here.

    Anyway, the problem is that your Perl script is optimized for
    your  purpose, while PostgreSQL has to use generic algorithms
    that work in ANY situation to solve the problem.

    In particular, the Perl script uses a  hash  table  with  one
    entry  for  each group. Now what happens if the input data is
    spread out and contains 1G groups?  It'll  simply  blow  away
    your  script because it runs out of memory. This behaviour is
    unacceptable for a database system, so  as  you  see  in  the
    Explain output, PostgreSQL sorts and groups the input data in
    temporary files before counting the rows.  Due to  that,  the
    PostgreSQL solution to the problem requires a gazillion of IO
    operations, but it'll work whatever the input data is,  given
    that there is enough disk space.


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #



pgsql-general by date:

Previous
From: "Robert J. Sanford, Jr."
Date:
Subject: client authentication protocols?
Next
From: Bruce Momjian
Date:
Subject: Re: nested transactions