Re: Another weird one with an UPDATE - Mailing list pgsql-performance

From David Griffiths
Subject Re: Another weird one with an UPDATE
Date
Msg-id 05a401c3903f$a8524850$6501a8c0@griffiths2
Whole thread Raw
In response to Another weird one with an UPDATE  (David Griffiths <dgriffiths@boats.com>)
Responses Re: Another weird one with an UPDATE  (Stephan Szabo <sszabo@megazone.bigpanda.com>)
List pgsql-performance
Sorry - just found the FAQ (http://jamesthornton.com/postgres/FAQ/faq-english.html#4.22) on how IN is very slow.
 
So I rewrote the query:
 
\o ./data/temp.txt
SELECT current_timestamp;
UPDATE user_account SET last_name = 'abc'
WHERE EXISTS (SELECT ua.user_account_id FROM user_account ua, commercial_entity ce, commercial_service cs
WHERE ua.user_account_id = ce.user_account_id AND ce.commercial_entity_id = cs.commercial_entity_id);
SELECT current_timestamp;
\o
 
EXISTS is kind of a weird statement, and it doesn't appear to be identical (the number of rows updated was 72,000 rather than 3500). It also took 4 minutes to execute.
 
Is there any way around this other than breaking the query into two? As in:
 
pstmt1 = conn.preprareStatement("SELECT ua.user_account_id FROM user_account ua, commercial_entity ce, commercial_service cs
WHERE ua.user_account_id = ce.user_account_id AND ce.commercial_entity_id = cs.commercial_entity_id");
rset = pstmt1.executeQuery();
while (rset.next())
{
    pstmt2 = conn.prepareStatement("UPDATE user_account SET last_name = 'abc' WHERE user_account_id = ?");
    pstmt2.setLong(1, rset.getLong(1));
    ...
}
 
Unfort, that will be alot of data moved from Postgres->middle-tier (Weblogic/Resin), which is inefficient.
 
Anyone see another solution?
 
David.
----- Original Message -----
Sent: Saturday, October 11, 2003 12:44 PM
Subject: [PERFORM] Another weird one with an UPDATE

I am running an update-query to benchmark various databases; the postgres version is,
 
UPDATE user_account SET last_name = 'abc'
WHERE user_account_id IN (SELECT user_account_id FROM commercial_entity, commercial_service WHERE yw_account_id IS NULL
AND commercial_entity.commercial_entity_id = commercial_service.commercial_entity_id);
 
 
The inner query (the select), run by itself, takes about a second. Add the outer query (the update-portion), and the query dies. The machine has been vacuum-analzyed. Here is the explain-analyze:
 
benchtest=# EXPLAIN ANALYZE UPDATE user_account SET last_name = 'abc'
benchtest-# WHERE user_account_id IN (SELECT user_account_id FROM commercial_entity, commercial_service WHERE yw_account_id IS NULL
benchtest(# AND commercial_entity.commercial_entity_id = commercial_service.commercial_entity_id);
 Seq Scan on user_account  (cost=0.00..813608944.88 rows=36242 width=718) (actual time=15696258.98..16311130.29 rows=3075 loops=1)   Filter: (subplan)
   SubPlan
     ->  Materialize  (cost=11224.77..11224.77 rows=86952 width=36) (actual time=0.06..106.40 rows=84831 loops=72483)
           ->  Merge Join  (cost=0.00..11224.77 rows=86952 width=36) (actual time=0.21..1845.13 rows=85158 loops=1)
                 Merge Cond: ("outer".commercial_entity_id = "inner".commercial_entity_id)
                 ->  Index Scan using commercial_entity_pkey on commercial_entity  (cost=0.00..6787.27 rows=77862 width=24) (actual time=0.06..469.56 rows=78132 loops=1)
                       Filter: (yw_account_id IS NULL)
                 ->  Index Scan using comm_serv_comm_ent_id_i on commercial_service  (cost=0.00..2952.42 rows=88038 width=12) (actual time=0.03..444.80 rows=88038 loops=1)
 Total runtime: 16332976.21 msec
(10 rows)
 
Here are the relevant parts of the schema:
 
 
 
USER_ACCOUNT
 
            Column             |            Type             |          Modifiers
-------------------------------+-----------------------------+-----------------------------
 user_account_id               | numeric(10,0)               | not null
 first_name                    | character varying(100)      |
 last_name                     | character varying(100)      |
Indexes: user_account_pkey primary key btree (user_account_id),
         usr_acc_last_name_i btree (last_name),
Foreign Key constraints: $1 FOREIGN KEY (lang_id) REFERENCES lang(lang_id) ON UPDATE NO ACTION ON DELETE NO ACTION,
                         $2 FOREIGN KEY (source_id) REFERENCES source(source_id) ON UPDATE NO ACTION ON DELETE NO ACTION,
                         $3 FOREIGN KEY (user_role_id) REFERENCES user_role(user_role_id) ON UPDATE NO ACTION ON DELETE NO ACTION
 
 
 
COMMERCIAL_ENTITY
 
          Column           |            Type             |                          Modifiers
---------------------------+-----------------------------+-------------------------------------------------------------
 commercial_entity_id      | numeric(10,0)               | not null
 yw_account_id             | numeric(10,0)               |
Indexes: commercial_entity_pkey primary key btree (commercial_entity_id),
         comm_ent_yw_acc_id_i btree (yw_account_id)
Foreign Key constraints: $1 FOREIGN KEY (source_id) REFERENCES source(source_id) ON UPDATE NO ACTION ON DELETE NO ACTION,
                         $2 FOREIGN KEY (user_account_id) REFERENCES user_account(user_account_id) ON UPDATE NO ACTION ON DELETE NO ACTION
 
 
 
COMMERCIAL_SERVICE
 
        Column        |     Type      | Modifiers
----------------------+---------------+-----------
 commercial_entity_id | numeric(10,0) | not null
 service_type_id      | numeric(10,0) | not null
 source_id            | numeric(10,0) | not null
Indexes: commercial_service_pkey primary key btree (commercial_entity_id, service_type_id),
         comm_serv_comm_ent_id_i btree (commercial_entity_id),
         comm_serv_serv_type_id_i btree (service_type_id),
         comm_serv_source_id_i btree (source_id)
Foreign Key constraints: $1 FOREIGN KEY (commercial_entity_id) REFERENCES commercial_entity(commercial_entity_id) ON UPDATE NO ACTION ON DELETE NO ACTION,
                         $2 FOREIGN KEY (source_id) REFERENCES source(source_id) ON UPDATE NO ACTION ON DELETE NO ACTION,
                         $3 FOREIGN KEY (service_type_id) REFERENCES service_type(service_type_id) ON UPDATE NO ACTION ON DELETE NO ACTION
 
 
Here is the postgres.conf (or the variables that are not commented out):
 
tcpip_socket = true
max_connections = 500
shared_buffers = 32768          # min max_connections*2 or 16, 8KB each
wal_buffers = 128               # min 4, typically 8KB each
sort_mem = 4096                 # min 64, size in KB
effective_cache_size = 50000    # typically 8KB each
 
Is it a problem with "IN"?
 
David

pgsql-performance by date:

Previous
From: David Griffiths
Date:
Subject: Another weird one with an UPDATE
Next
From: Stephan Szabo
Date:
Subject: Re: Another weird one with an UPDATE