Thread: Why would writes to pgsql_tmp bottleneck at 1mb/s?
PG Hackers, What follows is iostat output from a TPC-H test on Solaris 10. The machine is creating indexes on a table which is 50G in size, so it needs to use pgsql_tmp for internal swapping: tty md15 sd1 sd2 sd3 cpu tin tout kps tps serv kps tps serv kps tps serv kps tps serv us sy wt id 0 84 22526 1211 1 1024 1 5 0 0 0 5634 337 1 30 8 0 61 0 242 24004 1337 1 1024 1 5 0 0 0 6007 355 1 33 8 0 59 0 85 22687 1277 1 1024 1 5 0 0 0 5656 322 1 31 8 0 62 0 85 20876 1099 1 1024 2 9 0 0 0 5185 292 1 28 7 0 64 md15 is WAL (pg_xlog). sd3 is PGDATA. sd1 i pgsql_tmp. As you can see, we're getting a nice 23mb/s peak for WAL (thanks to forcedirectio) and database writes peak at 6mb/s. However, pgsql_tmp, which is being used heavily, hovers around 1mb/s, and never goes above 1.5mb/s. This seems to be throttling the whole system. Any suggestions on why this should be? Do we have a performance bug in the pg_tmp code? -- --Josh Josh Berkus Aglio Database Solutions San Francisco
People: > As you can see, we're getting a nice 23mb/s peak for WAL (thanks to > forcedirectio) and database writes peak at 6mb/s. However, pgsql_tmp, > which is being used heavily, hovers around 1mb/s, and never goes above > 1.5mb/s. This seems to be throttling the whole system. Never mind, I'm a dork. I accidentally cut the "SET maintenance_work_mem = 2000000" out of my config file, and it was running with the default 1024K .... -- --Josh Josh Berkus Aglio Database Solutions San Francisco
Hmmmm..... > > As you can see, we're getting a nice 23mb/s peak for WAL (thanks to > > forcedirectio) and database writes peak at 6mb/s. However, pgsql_tmp, > > which is being used heavily, hovers around 1mb/s, and never goes above > > 1.5mb/s. This seems to be throttling the whole system. > > Never mind, I'm a dork. I accidentally cut the "SET maintenance_work_mem > = 2000000" out of my config file, and it was running with the default 1024K Maybe I'm not an idiot (really!) even with almost 2GB of maintenance_mem, PG still writes to pgsql_tmp no faster than 2MB/s. I think there may be an artificial bottleneck there. Question is, PostgreSQL, OS or hardware? Suggestions? -- --Josh Josh Berkus Aglio Database Solutions San Francisco
> Maybe I'm not an idiot (really!) even with almost 2GB of maintenance_mem, PG > still writes to pgsql_tmp no faster than 2MB/s. I think there may be an > artificial bottleneck there. Question is, PostgreSQL, OS or hardware? I'm curious: what is your cpu usage while this is happening? I've noticed similar slow index creation behaviour, but I did not make any connection to pgsql_temp (because it was not on a separate partition). I was indexing an oid field of a 700GB table and it took about four days on a 1.2GHz UltraSparcIII (solaris 9, 8GB core). I noticed that the one CPU that was pegged at near 100%, leading me to believe it was CPU bound. Odd thing is that the same operation on a 2GHz Pentium IV box (Linux) on the same data took about a day. Truss showed that a great majority of that time was in userland. -Aaron
Josh Berkus <josh@agliodbs.com> writes: > Maybe I'm not an idiot (really!) even with almost 2GB of maintenance_mem, PG > still writes to pgsql_tmp no faster than 2MB/s. I think there may be an > artificial bottleneck there. Question is, PostgreSQL, OS or hardware? AFAIR that's just fwrite() ... regards, tom lane
Tom, > > Maybe I'm not an idiot (really!) even with almost 2GB of > > maintenance_mem, PG still writes to pgsql_tmp no faster than 2MB/s. I > > think there may be an artificial bottleneck there. Question is, > > PostgreSQL, OS or hardware? > > AFAIR that's just fwrite() ... Well, are there any hacks to speed it up? It's about doubling the amount of time it takes to create an index on a very large table. -- --Josh Josh Berkus Aglio Database Solutions San Francisco
Josh Berkus <josh@agliodbs.com> writes: >> AFAIR that's just fwrite() ... > Well, are there any hacks to speed it up? It's about doubling the amount of > time it takes to create an index on a very large table. Huh? Doubled compared to what? regards, tom lane
Tom, > Huh? Doubled compared to what? Compared to how much data writing I can do to the database when pgsql_tmp isn't engaged. In other words, when pgsql_tmp isn't being written, database writing is 9mb/s. When pgsql_tmp gets engaged, that drops to 4mb/s. Alternatively, the WAL drive, which is the same hardware, will write at 10mb/s. -- --Josh Josh Berkus Aglio Database Solutions San Francisco