Re: Grave performance issues... - Mailing list pgsql-general

From Mark kirkwood
Subject Re: Grave performance issues...
Date
Msg-id 1010134078.9615.24.camel@spikey.slithery.org
Whole thread Raw
In response to Grave performance issues...  ("Ztream" <ztream@highrad.org>)
List pgsql-general
Dear Z,

..decided to have a bit of a play with this,

I created tables _similar_ to yours -
(guessed what data and groupmember were like):

CREATE TABLE Distribution (
  Grade         integer NOT NULL,
  ItemID        integer NOT NULL,
  WeightGroupID integer NOT NULL,
  Value         float   NOT NULL
);

CREATE TABLE Groupmember (
  GroupID       integer NOT NULL,
  GroupName     varchar(30) NOT NULL,
  UserID        integer NOT NULL
);

CREATE TABLE Data (
  ItemID        integer NOT NULL,
  UserID        integer NOT NULL,
  weightGroupID integer NOT NULL,
  Grade         integer NOT NULL
);

and populated them with generated data : ( 1000, 100 and 10000 rows
respectively)

I created an index on distribution only:

CREATE INDEX dist_i1 ON distribution(Grade, ItemID, WeightGroupID);


initially the update :
UPDATE Distribution
SET Value =
(
  SELECT COUNT(*)
  FROM Data INNER JOIN GroupMember ON Data.UserID = GroupMember.UserID
  WHERE ItemID = Distribution.ItemID
  AND Grade = Distribution.Grade
  AND WeightGroupID = Distribution.WeightGroupID
);

took about 60s ( slow)

Adding 3 other indexes :

CREATE INDEX groupm_i1 ON groupmember(UserID);
--
CREATE INDEX data_i1 ON data(UserID);
CREATE INDEX data_i2 ON data(Grade,ItemID,WeightGroupID);

and analyzing reduced the elapsed time for the update to 2s ( not bad )

Now I have brutalized your schema (apologies...) in the interest of
making scripted data generation easy (typed everything as numbers if
possible), but hopefully the basic idea will be appropriate...

On that note, dont use "fancy" datatypes like numeric if integer will do
(as the simple ones are faster)

On the tuning front I set postgresql.conf parameters :

shared_buffers = 16000  # 128M of shared buffers
sort_mem       = 10240  #  10M of sort memory

You probably dont need the buffers that high - unless you expect the big
table(s) to have millions of rows....

(I used Pg 7.2b4 but the same ideas should work with 7.1.x)

Good luck

Mark



pgsql-general by date:

Previous
From: "Jeffrey W. Baker"
Date:
Subject: Re: PostgreSQL GUI
Next
From: Andrew Sullivan
Date:
Subject: More on CURRENT_TIME and changes