Re: Grave performance issues... - Mailing list pgsql-general

From steve boyle
Subject Re: Grave performance issues...
Date
Msg-id a1fdi7$1r8f$1@news.tht.net
Whole thread Raw
In response to Grave performance issues...  ("Ztream" <ztream@highrad.org>)
Responses Stored proceures in perl or Python  ("David Siebert" <david@eclipsecat.com>)
List pgsql-general
I might be missing something but have you considered either using a trigger
to update the Distribution table each time the data table is amended OR wrap
updates to the Distribution table using a function i.e. f_add_data(....)
that would keep the two tables in sync.  I think this would probably do away
with the need to carry out the background process altogether.

Also are you using the GroupMember table in the update statement?

hih

sb

"Ztream" <ztream@highrad.org> wrote in message
news:a0fr1a$1eil$1@news.tht.net...
> I am porting a system that I have been developing on an MS platform
> (including SQL Server) to a linux system with postgresql. Apart from the
> usual difficulties (and not being very knowledgeable in the unix world), I
> have now gotten most things to work. However, I am experiencing some
pretty
> serious performance issues.
> The system collects data from users thru a web interface, and also has a
> background process that does rather heavy work. I will here refer to but
one
> of the SQL statements, to keep things focused. The idea here is that the
> background process is, among other things, constructing/updating a table
> called Distribution from a table called Data, which then represents the
> statistic distribution of the information found in Data (using user
groups,
> too).
> The update-statement actually performing this looks like this:
>
> UPDATE Distribution
> SET Value =
> (
> SELECT COUNT(*) FROM Data INNER JOIN GroupMember ON Data.UserID =
> GroupMember.UserID
> WHERE ItemID = Distribution.ItemID AND Grade = Distribution.Grade AND
> WeightGroupID = Distribution.WeightGroupID
> )
>
> Distribution takes the form:
> CREATE TABLE Distribution (
>  Grade int NOT NULL,
>  ItemID numeric(18, 0) NOT NULL,
>  WeightGroupID numeric(18, 0) NOT NULL,
>  Value float NOT NULL,
>  PRIMARY KEY (Grade, ItemID, WeightGroupID)
> );
>
> I am unsure if I need to post the DDL descriptions of the other tables;
feel
> free to ask.
>
> Anyway, there are two seemingly distinct problems with this:
>
> *1
> As you can see, this UPDATE touches the entire table upon each execution,
> which in postgresql seems to mean that performance quickly deteriorates
> unless you are constantly vacuuming. To answer an obvious question, I *do*
> feel that this is a necessary thing to do - if I were to update only those
> rows that would acquire a new value, it would most probably turn out to be
> the majority of them as the system is expected to receive a steady flow of
> input data when launched.
> This operation also needs to be performed very often - ideally about every
> 10th second. I tried leaving the background process running for a few days
> without vacuuming during the holiday, and while testing today I found that
> the above query takes aproximately 2 minutes to complete - for a
> Distribution size of 600 rows! Also, neither of the other two tables
> referenced contained more than 30 rows.
>
> *2
> Even when the involved tables are freshly vacuumed (and vacuum analyzed),
> the above query using the data amount from (*1) takes about 4 seconds to
> complete. That should be compared to the about 50ms it takes to complete
for
> the same amount of data on Windows 2000 against SQL Server on a comparable
> machine.
>
> Needless to say, this is rather catastrophical, and I'm pretty much out of
> ideas. The performance values given above are those I got while executing
> the query directly in psql, with no other connections to the database, so
> this does not seem to me to be a problem with the system in itself. A
> configuration problem, perhaps? A bug? SQL ignorance on my part? I could
> surely use some help.
> As previously noted, I am not very comfortable with unix systems, and not
> exactly an SQL guru either.
> The linux distribution is some version of redhat, and the postgresql
version
> is 7.1.3.
>
> Any help or attempt to help would be greatly appreciated.
>
>  / Ztream
>
>



pgsql-general by date:

Previous
From:
Date:
Subject: Write/Read Encrypted Text
Next
From: "steve boyle"
Date:
Subject: Re: Adding constraint to existing table.