Stored proceures in perl or Python - Mailing list pgsql-general

From David Siebert
Subject Stored proceures in perl or Python
Date
Msg-id OJEIJALIHAIBMMBFLCOBAENJDMAA.david@eclipsecat.com
Whole thread Raw
In response to Re: Grave performance issues...  ("steve boyle" <boylesa@dial.pipex.com>)
List pgsql-general
I am running Postgres 7.1 on Suse. I got all the RPMs from Suse but I can
not find the libs to use perl or Python in Postgres. I can add tcl and
PG/pgsql with out any problem. any help would be great.


-----Original Message-----
From: pgsql-general-owner@postgresql.org
[mailto:pgsql-general-owner@postgresql.org]On Behalf Of steve boyle
Sent: Tuesday, January 08, 2002 12:25 PM
To: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Grave performance issues...


I might be missing something but have you considered either using a trigger
to update the Distribution table each time the data table is amended OR wrap
updates to the Distribution table using a function i.e. f_add_data(....)
that would keep the two tables in sync.  I think this would probably do away
with the need to carry out the background process altogether.

Also are you using the GroupMember table in the update statement?

hih

sb

"Ztream" <ztream@highrad.org> wrote in message
news:a0fr1a$1eil$1@news.tht.net...
> I am porting a system that I have been developing on an MS platform
> (including SQL Server) to a linux system with postgresql. Apart from the
> usual difficulties (and not being very knowledgeable in the unix world), I
> have now gotten most things to work. However, I am experiencing some
pretty
> serious performance issues.
> The system collects data from users thru a web interface, and also has a
> background process that does rather heavy work. I will here refer to but
one
> of the SQL statements, to keep things focused. The idea here is that the
> background process is, among other things, constructing/updating a table
> called Distribution from a table called Data, which then represents the
> statistic distribution of the information found in Data (using user
groups,
> too).
> The update-statement actually performing this looks like this:
>
> UPDATE Distribution
> SET Value =
> (
> SELECT COUNT(*) FROM Data INNER JOIN GroupMember ON Data.UserID =
> GroupMember.UserID
> WHERE ItemID = Distribution.ItemID AND Grade = Distribution.Grade AND
> WeightGroupID = Distribution.WeightGroupID
> )
>
> Distribution takes the form:
> CREATE TABLE Distribution (
>  Grade int NOT NULL,
>  ItemID numeric(18, 0) NOT NULL,
>  WeightGroupID numeric(18, 0) NOT NULL,
>  Value float NOT NULL,
>  PRIMARY KEY (Grade, ItemID, WeightGroupID)
> );
>
> I am unsure if I need to post the DDL descriptions of the other tables;
feel
> free to ask.
>
> Anyway, there are two seemingly distinct problems with this:
>
> *1
> As you can see, this UPDATE touches the entire table upon each execution,
> which in postgresql seems to mean that performance quickly deteriorates
> unless you are constantly vacuuming. To answer an obvious question, I *do*
> feel that this is a necessary thing to do - if I were to update only those
> rows that would acquire a new value, it would most probably turn out to be
> the majority of them as the system is expected to receive a steady flow of
> input data when launched.
> This operation also needs to be performed very often - ideally about every
> 10th second. I tried leaving the background process running for a few days
> without vacuuming during the holiday, and while testing today I found that
> the above query takes aproximately 2 minutes to complete - for a
> Distribution size of 600 rows! Also, neither of the other two tables
> referenced contained more than 30 rows.
>
> *2
> Even when the involved tables are freshly vacuumed (and vacuum analyzed),
> the above query using the data amount from (*1) takes about 4 seconds to
> complete. That should be compared to the about 50ms it takes to complete
for
> the same amount of data on Windows 2000 against SQL Server on a comparable
> machine.
>
> Needless to say, this is rather catastrophical, and I'm pretty much out of
> ideas. The performance values given above are those I got while executing
> the query directly in psql, with no other connections to the database, so
> this does not seem to me to be a problem with the system in itself. A
> configuration problem, perhaps? A bug? SQL ignorance on my part? I could
> surely use some help.
> As previously noted, I am not very comfortable with unix systems, and not
> exactly an SQL guru either.
> The linux distribution is some version of redhat, and the postgresql
version
> is 7.1.3.
>
> Any help or attempt to help would be greatly appreciated.
>
>  / Ztream
>
>



---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

http://archives.postgresql.org


pgsql-general by date:

Previous
From: "David Siebert"
Date:
Subject: Re: PostgreSQL Replication and more
Next
From: Troy.Campano@LibertyMutual.com
Date:
Subject: Re: gborg