Re: VERY slow queries at random - Mailing list pgsql-performance
From | Gunther Mayer |
---|---|
Subject | Re: VERY slow queries at random |
Date | |
Msg-id | 46682524.7090406@gmail.com Whole thread Raw |
In response to | Re: VERY slow queries at random (Scott Marlowe <smarlowe@g2switchworks.com>) |
Responses |
Re: VERY slow queries at random
|
List | pgsql-performance |
Scott Marlowe wrote: > Gunther Mayer wrote: >> Hi there, >> >> We run a small ISP with a FreeBSD/freeradius/postgresql 8.2.4 backend >> and 200+ users. Authentication happens via UAM/hotspot and I see a lot >> of authorisation and accounting packets that are handled via PL/PGSQL >> functions directly in the database. >> >> Everything seems to work 100% except that a few times a day I see >> >> Jun 6 10:41:31 caligula postgres[57347]: [4-1] radiususer: LOG: >> duration: 19929.291 ms statement: SELECT fn_accounting_start(...) >> >> in my logs. I'm logging slow queries with log_min_duration_statement = >> 500 in my postgresql.conf. Sometimes another query runs equally slow or >> even slower (I've seen 139 seconds!!!) a few minutes before or after as >> well, but then everything is back to normal. >> >> Even though I haven't yet indexed my data I know that the system is >> performant because my largest table (the accounting one) only has 5000+ >> rows, the entire database is only a few MB's and I have plenty of memory >> (2GB), shared_buffers = 100MB and max_fsm_pages = 179200. Also from >> briefly enabling >> >> log_parser_stats = on >> log_planner_stats = on >> log_executor_stats = on >> >> I saw that most queries are 100% satisfied from cache so the disk >> doesn't even get hit. Finally, the problem seems unrelated to load >> because it happens at 4am just as likely as at peak traffic time. >> >> What the heck could cause such erratic behaviour? I suspect some type of >> resource problem but what and how could I dig deeper? > > Maybe your hard drive is set to spin down after a certain period of > idle, and since most all your data is coming from memory, then it > might be that on the rare occasion when it needs to hit the drive it's > not spun up anymore. I doubt that as a serious amount of logging is taking place on the box all the time which goes straight to disk. Also, no disk in the world would take more than a minute to spin up... > Maybe some other process is cranking up (cron jobs???) that are > chewing up all your I/O bandwidth? Hmm, I investigated that too but if that was the case the queries would run slow always at the same time of the day. > Hard to say. Anything in the system logs that would give you a hint? > Try correlating them by the time of the slow pgsql queries. Nothing relevant in the system logs at the time of the slow query appearing. I have in the mean time tweaked syslog-ng.conf such that as soon as it detects a "duration: <greater than 500>ms" log message it spawns top and top -m io and redirects the output to file. At least in that way I can check what's keeping the system busy immediately *after* a slow query has occured. Of course now Murphy's law has it that since I've done that (30 hours ago) not a single slow query has fired, but hey, I'll look at the results once I have them. On another note, autovacuum couldn't cause such issues, could it? I do have autovacuum enabled (autovacuum=on as well as stats_start_collector=on, stats_block_level = on and stats_row_level = on), is there any possibility that autovacuum is not as resource friendly as advertised? Gunther
pgsql-performance by date: