Re: index does not improve performance - Mailing list pgsql-general

From Jason Earl
Subject Re: index does not improve performance
Date
Msg-id 1012841913.28236.21.camel@npa01zz001
Whole thread Raw
In response to Re: index does not improve performance  (Milos Prudek <milos.prudek@tiscali.cz>)
List pgsql-general
Sometimes it's horses for courses.  I am currently replacing a DOS
Paradox application and I have run up against some of the same
obstacles.  With PostgreSQL getting a count of the rows in a table is an
expensive procedure (requiring a sequential scan).  That's just the
nature of the beast.  If you really need to return the total number of
rows then about the only way to do it is to maintain that information in
another table.  Of course, that requires creating custom triggers for
insert and delete, and you get to worry about contention for this new
table (and deadlock, concurrency and a list of other problems).

I have a similar problem with a set of similar tables that store
caseweights.  These tables are all fairly large (the smallest has 16
million rows), and I am using commodity hardware.  This makes sequential
scans very painful.  Queries that return a small portion of the total
rows, however, return very fast.

So I simply created a set of summary tables that hold 15 minute
summaries of the statistics I need.  I populate these tables with small
Python scripts launched automatically using cron.  Now the tables that I
actually query are several orders of magnitude smaller than my raw data
tables, and my queries return almost instantly.

Since it would appear that you are primarily logging firewall data
(which shouldn't change after the fact), your application is a prime
suspect for this type of optimization.

Hope this was helpful,
Jason

On Sat, 2002-02-02 at 07:44, Milos Prudek wrote:
> > Unfortunately Milos an index isn't likely to help on this type of a
> > query.  It would appear that the value 'UDP' accounts for the vast
> > majority of the rows, and so an index loses most of its value.  You see,
> > it actually takes *longer* to return queries using the indexes if a
> > significant portion of the table is being touched, because the database
> > has to check both the index and the tuple.  One of the new features of
>
> Jason and Neil,
>
> Thank you very much for the detailed answer. I'm relatively new to SQL.
> My background is much more inferior system, FoxPro for DOS, which in
> ancient times used what they called "patented Rushmore technology". I
> think that FoxPro was able to return COUNT immediately, if index was
> used, no matter what number of records the condition was true for. I
> wonder if this is something that PostgreSQL will eventually be able to
> do, or if it is simply not technically possible for some reason.
>
>
> --
> Milos Prudek
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org



pgsql-general by date:

Previous
From: David A Dickson
Date:
Subject: Change size of varchar(20) field
Next
From: Frank Bax
Date:
Subject: Re: Change size of varchar(20) field