Thread: Very slow updates when using IN syntax subselect
If I do:

    select event_id from event join token using (token_number)
    where token_status=50 and reconciled=false limit 1;

Then:

    update event set reconciled=true where event_id={XXX};

It returns in about a second, or less. But if I do the same thing with the IN syntax:

    update event set reconciled=true where event_id in
      (select event_id from event join token using (token_number)
       where token_status=50 and reconciled=false LIMIT 1);

On a 4 CPU machine, 2 CPUs peg at 100%, and the request just eats CPU forever. Any clues what might be going on? Help would be much appreciated. I'm not seeing this on all my DBs... just the important ones.

PostgreSQL 7.4.1 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.2.3 20030502 (Red Hat Linux 3.2.3-20)
Bryce Nesbitt <bryce1@obviously.com> writes:
> update event set reconciled=true where event_id in
> (select event_id from event join token using (token_number)
> where token_status=50 and reconciled=false LIMIT 1);

> On a 4 CPU machine, 2 CPU's peg at 100%, and the request just eats CPU
> forever.

What does EXPLAIN show for this and for the base query?

> PostgreSQL 7.4.1 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.2.3

This may be your problem right here.  You are urgently in need of an
update in any case --- there are a lot of nasty bugs fixed since 7.4.1:
http://developer.postgresql.org/docs/postgres/release.html

			regards, tom lane
Tom Lane wrote:
> Bryce Nesbitt <bryce1@obviously.com> writes:
>> update event set reconciled=true where event_id in
>> (select event_id from event join token using (token_number)
>> where token_status=50 and reconciled=false LIMIT 1);
>>
>> On a 4 CPU machine, 2 CPU's peg at 100%, and the request just eats CPU
>> forever.
>
> What does EXPLAIN show for this and for the base query?

                                        QUERY PLAN
--------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..3.04 rows=1 width=8)
   ->  Seq Scan on event  (cost=0.00..0.00 rows=1 width=408)
         Filter: (reconciled = false)
   ->  Index Scan using token_token_number_key on token  (cost=0.00..3.03 rows=1 width=11)
         Index Cond: (("outer".token_number)::text = (token.token_number)::text)
         Filter: (token_status = 50)
(6 rows)

                                        QUERY PLAN
--------------------------------------------------------------------------------------------------
 Nested Loop IN Join  (cost=0.00..3.06 rows=1 width=616)
   Join Filter: ("outer".event_id = "inner".event_id)
   ->  Seq Scan on event  (cost=0.00..0.00 rows=1 width=616)
   ->  Nested Loop  (cost=0.00..3.04 rows=1 width=8)
         ->  Seq Scan on event  (cost=0.00..0.00 rows=1 width=408)
               Filter: (reconciled = false)
         ->  Index Scan using token_token_number_key on token  (cost=0.00..3.03 rows=1 width=11)
               Index Cond: (("outer".token_number)::text = (token.token_number)::text)
               Filter: (token_status = 50)
(9 rows)

stage=# select count(*) from event;
-----------
 116226

stage=# select count(*) from token;
-------
 8948
Bryce Nesbitt <bryce1@obviously.com> writes:
> Tom Lane wrote:
>> What does EXPLAIN show for this and for the base query?

>    ->  Seq Scan on event  (cost=0.00..0.00 rows=1 width=408)
>          Filter: (reconciled = false)

> select count(*) from event;
> -----------
> 116226

It seems pretty clear that you've never vacuumed nor analyzed these
tables ... else the planner would have some clue about their sizes.
Do that and then see what you get.

			regards, tom lane
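Concretely, the fix Tom is suggesting looks like this (a sketch using the table names from the thread; EXPLAIN ANALYZE is available in 7.4, and the expected improvement is an assumption until you run it yourself):

```sql
-- Rebuild planner statistics (and reclaim dead tuples) for both tables.
-- On 7.4 this must be run by hand or from cron; it is not automatic.
VACUUM ANALYZE event;
VACUUM ANALYZE token;

-- Re-check the plan: with real statistics the planner should no longer
-- estimate rows=1 for a seq scan of a 116,226-row table.
EXPLAIN ANALYZE
UPDATE event SET reconciled = true
 WHERE event_id IN (SELECT event_id
                      FROM event JOIN token USING (token_number)
                     WHERE token_status = 50 AND reconciled = false
                     LIMIT 1);
```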
Tom Lane wrote:
> Bryce Nesbitt <bryce1@obviously.com> writes:
>> Tom Lane wrote:
>>> What does EXPLAIN show for this and for the base query?

>> ->  Seq Scan on event  (cost=0.00..0.00 rows=1 width=408)
>>       Filter: (reconciled = false)

>> select count(*) from event;
>> -----------
>> 116226

> It seems pretty clear that you've never vacuumed nor analyzed these
> tables ... else the planner would have some clue about their sizes.
> Do that and then see what you get.

They occur in finite time.  That's good, thanks.  But jeeze, can't
postgres figure this out for itself?
Bryce Nesbitt wrote:
>> It seems pretty clear that you've never vacuumed nor analyzed these
>> tables ... else the planner would have some clue about their sizes.
>> Do that and then see what you get.
>
> They occur in finite time.  That's good, thanks.  But jeeze, can't
> postgres figure this out for itself?

I'm sure you wouldn't appreciate it if PostgreSQL did a full table scan
before each query to figure out the total size of the involved tables.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/
Peter Eisentraut wrote:
> Bryce Nesbitt wrote:
>>> It seems pretty clear that you've never vacuumed nor analyzed these
>>> tables ... else the planner would have some clue about their sizes.
>>> Do that and then see what you get.
>>
>> They occur in finite time.  That's good, thanks.  But jeeze, can't
>> postgres figure this out for itself?
>
> I'm sure you wouldn't appreciate it if PostgreSQL did a full table scan
> before each query to figure out the total size of the involved tables.

Nope.  But let's say the query optimizer thought the table had one row.
Then the query starts, and 111,000 rows later...  It seems that a
mismatch between the static table size and the actual one, counted as
you go, would be a quick check.  That could set a flag for later
background processing...

-Bryce
Peter Eisentraut <peter_e@gmx.net> writes:
> Bryce Nesbitt wrote:
>> They occur in finite time.  That's good, thanks.  But jeeze, can't
>> postgres figure this out for itself?

> I'm sure you wouldn't appreciate it if PostgreSQL did a full table scan
> before each query to figure out the total size of the involved tables.

It's also less than polite to complain about the behavior of
two-year-old releases, without making any effort to ascertain
whether more-current versions are smarter.

			regards, tom lane
Tom Lane wrote:
> Peter Eisentraut <peter_e@gmx.net> writes:
>> Bryce Nesbitt wrote:
>>> They occur in finite time.  That's good, thanks.  But jeeze, can't
>>> postgres figure this out for itself?
>>
>> I'm sure you wouldn't appreciate it if PostgreSQL did a full table scan
>> before each query to figure out the total size of the involved tables.
>
> It's also less than polite to complain about the behavior of
> two-year-old releases, without making any effort to ascertain
> whether more-current versions are smarter.

Sorry to offend.  Are current versions smarter?  The DB I was working
with still had row counts of 1 after tens of thousands of records had
been added.  With new versions, must the DB still be VACUUMed, or is
that a bit more automatic?
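For anyone reading this thread later: yes, it is more automatic now. An autovacuum daemon was integrated into the backend in PostgreSQL 8.1 and is enabled by default as of 8.3, so tables are vacuumed and analyzed in the background as they change. A minimal postgresql.conf fragment (a sketch only; the values shown are the shipped defaults, and thresholds should be tuned per workload):

```
# postgresql.conf (PostgreSQL 8.1+)
autovacuum = on                        # background VACUUM/ANALYZE daemon
autovacuum_naptime = 1min              # how often each database is checked
autovacuum_vacuum_scale_factor = 0.2   # vacuum after ~20% of a table changes
autovacuum_analyze_scale_factor = 0.1  # analyze after ~10% of a table changes
```

Note that autovacuum relies on the statistics collector, so row-level stats collection must also be enabled (it is by default in releases where autovacuum is on by default).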