Re: Performance Concern - Mailing list pgsql-performance

From John Pagakis
Subject Re: Performance Concern
Date
Msg-id KKEBKDPPLALEFHBEAOCCGECODEAA.thebfh@toolsmythe.com
In response to Re: Performance Concern  (Rod Taylor <rbt@rbt.ca>)
List pgsql-performance
Thanks Rod.

While I was waiting for my post to make it to the list, I went ahead and
made the key an int.  That improved things a lot, but it was still pretty
slow.

This is weird:

I was testing in a query window thus:

UPDATE baz SET customer_id = '1234' WHERE baz_number IN ( SELECT baz_number
FROM baz WHERE customer_id IS NULL LIMIT 1000 );

With the table as I posted it, this took 3 1/2 minutes.  After dropping
baz_number as the primary key, adding an int4 baz_key as the key, and
adjusting the above query to use it, the time dropped to 1 1/2 minutes.
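
Roughly, the adjusted query looked like this (a sketch from memory --
baz_key is the new int4 key):

UPDATE baz SET customer_id = '1234'
WHERE baz_key IN ( SELECT baz_key FROM baz
                   WHERE customer_id IS NULL LIMIT 1000 );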

But I realized that was not how my app was going to be doing its updates,
so I wrote a little simulation in Java that gets a list of baz_keys where
customer_id is null and then iterates through the list one at a time,
attempting to UPDATE baz SET customer_id = '1234' WHERE baz_key =
<bazKeyFromList> AND customer_id IS NULL.  One thousand iterations took
only 37 seconds.
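
The simulation boiled down to something like this (a rough sketch, not the
actual code -- the connection details are placeholders):

import java.sql.*;
import java.util.*;

public class BazUpdateSim {
    public static void main(String[] args) throws Exception {
        // Placeholder connection info, not the real thing.
        Class.forName("org.postgresql.Driver");
        Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost/test", "user", "password");

        // Collect the keys of rows that still need a customer.
        List keys = new ArrayList();
        Statement st = conn.createStatement();
        ResultSet rs = st.executeQuery(
                "SELECT baz_key FROM baz WHERE customer_id IS NULL LIMIT 1000");
        while (rs.next()) {
            keys.add(new Integer(rs.getInt(1)));
        }
        rs.close();
        st.close();

        // Update one row at a time, the way the app will.
        PreparedStatement ps = conn.prepareStatement(
                "UPDATE baz SET customer_id = ? " +
                "WHERE baz_key = ? AND customer_id IS NULL");
        long start = System.currentTimeMillis();
        for (Iterator it = keys.iterator(); it.hasNext(); ) {
            ps.setString(1, "1234");
            ps.setInt(2, ((Integer) it.next()).intValue());
            ps.executeUpdate();
        }
        System.out.println("Elapsed ms: "
                + (System.currentTimeMillis() - start));
        ps.close();
        conn.close();
    }
}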

It would appear PostgreSQL is tuned towards single updates as opposed to
handing a big bunch off to the query engine.  Does that seem right?  Seems
odd to me.

Anyway, thanks for your response.  I'll add some indexes and see if I can't
shave that time down even further.
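
For the record, the indexes I have in mind are along these lines (index
names made up):

CREATE INDEX baz_customer_id_idx ON baz (customer_id);
CREATE INDEX baz_foobar_id_idx ON baz (foobar_id);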

__________________________________________________________________
John Pagakis
Email: john@pagakis.com


"If you can't beat them, arrange
 to have them beaten."
        -- George Carlin

This signature generated by
     ... and I Quote!!(tm) Copyright (c) 1999 SpaZmodic Frog Software, Inc.
                                              www.spazmodicfrog.com


-----Original Message-----
From: pgsql-performance-owner@postgresql.org
[mailto:pgsql-performance-owner@postgresql.org] On Behalf Of Rod Taylor
Sent: Friday, October 24, 2003 11:23 AM
To: john@pagakis.com
Cc: Postgresql Performance
Subject: Re: [PERFORM] Performance Concern


On Thu, 2003-10-23 at 08:21, John Pagakis wrote:
> Greetings.
>
> I have a table that will require 100,000 rows initially.
>
> Assume the following (some of the field names have been changed for
> confidentiality reasons):
>
> CREATE TABLE baz (
>     baz_number CHAR(15) NOT NULL,
>     customer_id CHAR(39),
>     foobar_id INTEGER,
>     is_cancelled BOOL DEFAULT false NOT NULL,
>     create_user VARCHAR(60) NOT NULL,
>     create_datetime TIMESTAMP DEFAULT 'now()' NOT NULL,
>     last_update_user VARCHAR(60) NOT NULL,
>     last_update_datetime TIMESTAMP DEFAULT 'now()' NOT NULL,
>     CONSTRAINT PK_baz PRIMARY KEY (baz_number)
> );
>
> ALTER TABLE baz
>     ADD FOREIGN KEY (customer_id) REFERENCES purchase (customer_id);
>
> ALTER TABLE baz
>     ADD FOREIGN KEY (foobar_id) REFERENCES foobar (foobar_id);
>
>
> Using JDBC, it took approximately one hour to insert 100,000 records.  I
> have an algorithm to generate a unique baz_number - it is a mixture of
> alpha and numerics.

Using an int for identification is certainly suggested; however, it sounds
like you may be short a few indexes on the foreign key'd fields.

EXPLAIN ANALYZE output is always nice..
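
For example, something along these lines (wrapped in a transaction so the
test update can be rolled back):

BEGIN;
EXPLAIN ANALYZE
    UPDATE baz SET customer_id = '1234' WHERE customer_id IS NULL;
ROLLBACK;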

