Re: Performance Concern - Mailing list pgsql-performance
From | John Pagakis
---|---
Subject | Re: Performance Concern
Date | 
Msg-id | KKEBKDPPLALEFHBEAOCCGECODEAA.thebfh@toolsmythe.com
In response to | Re: Performance Concern (Rod Taylor <rbt@rbt.ca>)
List | pgsql-performance
Thanks Rod.  While I was waiting for my post to make it, I went ahead and
made the key an int.  That improved things a lot, but it was still pretty
slow.

This is weird: I was testing in a query window thus:

UPDATE baz
   SET customer_id = '1234'
 WHERE baz_number IN
       ( SELECT baz_number
           FROM baz
          WHERE customer_id IS NULL
          LIMIT 1000 );

In the version of the table I posted this took 3 1/2 minutes.  By making
baz_number not part of the key, adding an int4 baz_key, and adjusting the
above query to use it, that dropped to 1 1/2 minutes.

But I realized that was not how my app was going to be updating, so I wrote
a little simulation in JAVA that gets a list of baz_keys where the
customer_id is null and then iterates through the list, one at a time,
attempting:

UPDATE baz
   SET customer_id = '1234'
 WHERE baz_key = <bazKeyFromList>
   AND customer_id IS NULL;

One thousand iterations took only 37 seconds.

It would appear PostgreSQL is tuned towards single updates as opposed to
handing a big batch off to the query engine.  Does that seem right?  Seems
odd to me.

Anyway, thanks for your response.  I'll add some indexes and see if I can't
shave that time down even further.

__________________________________________________________________
John Pagakis
Email: john@pagakis.com

"If you can't beat them, arrange to have them beaten."
        -- George Carlin

This signature generated by
     ... and I Quote!!(tm) Copyright (c) 1999 SpaZmodic Frog Software, Inc.
                                          www.spazmodicfrog.com


-----Original Message-----
From: pgsql-performance-owner@postgresql.org
[mailto:pgsql-performance-owner@postgresql.org] On Behalf Of Rod Taylor
Sent: Friday, October 24, 2003 11:23 AM
To: john@pagakis.com
Cc: Postgresql Performance
Subject: Re: [PERFORM] Performance Concern

On Thu, 2003-10-23 at 08:21, John Pagakis wrote:
> Greetings.
>
> I have a table that will require 100,000 rows initially.
>
> Assume the following (some of the field names have been changed for
> confidentiality reasons):
>
> CREATE TABLE baz (
>     baz_number           CHAR(15) NOT NULL,
>     customer_id          CHAR(39),
>     foobar_id            INTEGER,
>     is_cancelled         BOOL DEFAULT false NOT NULL,
>     create_user          VARCHAR(60) NOT NULL,
>     create_datetime      TIMESTAMP DEFAULT 'now()' NOT NULL,
>     last_update_user     VARCHAR(60) NOT NULL,
>     last_update_datetime TIMESTAMP DEFAULT 'now()' NOT NULL,
>     CONSTRAINT PK_baz PRIMARY KEY (baz_number)
> );
>
> ALTER TABLE baz
>     ADD FOREIGN KEY (customer_id) REFERENCES purchase (customer_id);
>
> ALTER TABLE baz
>     ADD FOREIGN KEY (foobar_id) REFERENCES foobar (foobar_id);
>
> Using JDBC, it took approximately one hour to insert 100,000 records.  I
> have an algorithm to generate a unique baz_number - it is a mixture of
> alpha and numerics.

Using an int for identification is certainly suggested, however it
sounds like you may be short a few indexes on the foreign key'd fields.

EXPLAIN ANALYZE output is always nice..
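P.S.  For the archives, the JAVA simulation boils down to roughly the
following sketch.  The connection details are placeholders and error
handling is omitted -- this is not the production code:

    import java.sql.*;
    import java.util.ArrayList;
    import java.util.List;

    public class BazUpdateSim {
        public static void main(String[] args) throws SQLException {
            // Placeholder connection info; assumes the PostgreSQL JDBC
            // driver is on the classpath.
            Connection conn = DriverManager.getConnection(
                    "jdbc:postgresql://localhost/testdb", "user", "password");

            // Fetch a batch of keys for rows with no customer yet.
            List<Integer> keys = new ArrayList<>();
            try (Statement st = conn.createStatement();
                 ResultSet rs = st.executeQuery(
                         "SELECT baz_key FROM baz " +
                         "WHERE customer_id IS NULL LIMIT 1000")) {
                while (rs.next()) {
                    keys.add(rs.getInt(1));
                }
            }

            // One UPDATE per key; the customer_id IS NULL guard keeps us
            // from stomping on a row somebody else claimed in between.
            String sql = "UPDATE baz SET customer_id = ? " +
                         "WHERE baz_key = ? AND customer_id IS NULL";
            try (PreparedStatement ps = conn.prepareStatement(sql)) {
                for (Integer key : keys) {
                    ps.setString(1, "1234");
                    ps.setInt(2, key);
                    ps.executeUpdate();
                }
            }
            conn.close();
        }
    }

Reusing the one PreparedStatement keeps the per-statement overhead down,
and running the whole loop in a single transaction (setAutoCommit(false)
before the loop, commit() after) would likely shave the 37 seconds down
further still.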