Re: UPDATEDs slowing SELECTs in a fully cached database - Mailing list pgsql-performance
From | lars |
---|---|
Subject | Re: UPDATEDs slowing SELECTs in a fully cached database |
Date | |
Msg-id | 4E1B90AF.1020408@yahoo.com Whole thread Raw |
In response to | Re: UPDATEDs slowing SELECTs in a fully cached database (lars <lhofhansl@yahoo.com>) |
Responses |
Re: UPDATEDs slowing SELECTs in a fully cached database
(Ivan Voras <ivoras@freebsd.org>)
|
List | pgsql-performance |
On 07/11/2011 04:02 PM, lars wrote: > On 07/11/2011 02:43 PM, Merlin Moncure wrote: >> On Mon, Jul 11, 2011 at 2:16 PM, Kevin Grittner >> <Kevin.Grittner@wicourts.gov> wrote: >>> lars<lhofhansl@yahoo.com> wrote: >>> >>>> Stopping the UPDATEs, waiting for any CHECKPOINTs to finish, >>>> and then running the SELECTs indeed shows a similar slowdown. >>>> >>>> Interestingly I see very heavy WAL traffic while executing the >>>> SELECTs. (So I was confused as to what caused the WAL traffic). >>> Hint bit changes aren't logged, so if it was that you would be >>> seeing writes to the heap, but not to the WAL. Clean-up of dead >>> tuples is logged -- this is probably the result of pruning dead >>> tuples. You could probably reduce the impact on your SELECT >>> statements at least a little by making autovacuum more aggressive. >> yeah. In fact, I'd like to disable autovacuum completely just to >> confirm this. In particular I'd like to know if that removes wal >> traffic when only selects are going on. Another way to check is to >> throw some queries to pg_stat_activity during your select period and >> see if any non-select activity (like autovacum vacuum). Basically I'm >> suspicious there is more to this story. >> >> hint bit flusing causing i/o during SELECT is a typical complaint >> (especially on systems with underperformant i/o), but I'm suspicious >> if that's really the problem here. Since you are on a virtualized >> platform, I can't help but wonder if you are running into some >> bottleneck that you wouldn't see on native hardware. >> >> What's iowait during the slow period? >> >> merlin > Thanks Kevin and Merlin this is extremely helpful... > > Ok, that makes much more sense (WALing hint bits did not make sense). > > I disabled auto-vacuum and did four tests: > 1. Run a bunch of updates, stop that process, wait until checkpointing > is finished, and run the selects (as before). > 2. run VACUUM manually, then run the SELECTs > 3. Have the UPDATEs and SELECTs touch a mutually exclusive, random > sets of row (still in sets of 10000). > So the SELECTs are guaranteed not to select rows that were updated. > 4. Lastly, change the UPDATEs to update a non-indexed column. To rule > out Index maintenance. Still distinct set of rows. > > In the first case I see the same slowdown (from ~14ms to ~400-500ms). > pg_stat_activity shows no other load during that > time. I also see write activity only on the WAL volume. > > In the 2nd case after VACUUM is finished the time is back to 14ms. As > an aside: If I run the SELECTs while VACUUM is > running the slowdown is about the same as in the first case until > (apparently) VACUUM has cleaned up most of the table, > at which point the SELECTs become faster again (~50ms). > > In the 3rd case I see exactly the same behavior, which is interesting. > Both before VACUUM is run and after. > There's no guarantee obviously that distinct rows do not share the > same page of course especially since the index is > updated as part of this (but see the 4th case). > > In case 4 I still see the same issue. Again both before and after VACUUM. > > In all cases I see from pg_stat_bgwriter that no backend writes > buffers directly (but I think that only pertains to dirty buffers, and > not the WAL). > > So I think I have a partial answer to my initial question. > > However, that brings me to some other questions: > Why do SELECTs cause dead tuples to be pruned (as Kevin suggests)? > That even happens when the updates did not touch the selected rows(?) > And why does that slow down the SELECTs? (checkpointing activity on > the EBS volume holding the database for example > does not slow down SELECTs at all, only WAL activity does). Does the > selecting backend do that work itself? > > Lastly, is this documented somewhere? (I apologize if it is and I > missed it). If not I'd be happy to write a wiki entry for this. > > Oh, and iowait hovers around 20% when SELECTs are slow: avg-cpu: %user %nice %system %iowait %steal %idle 1.54 0.00 0.98 18.49 0.07 78.92 When SELECTs are fast it looks like this: avg-cpu: %user %nice %system %iowait %steal %idle 8.72 0.00 0.26 0.00 0.00 91.01 Note that this is a 12 core VM. So one core at 100% would show as 8.33% CPU.
pgsql-performance by date: