Re: ANALYZE sampling is too good - Mailing list pgsql-hackers
From | Mark Kirkwood |
---|---|
Subject | Re: ANALYZE sampling is too good |
Date | |
Msg-id | 52A686CD.6050104@catalyst.net.nz Whole thread Raw |
In response to | Re: ANALYZE sampling is too good (Mark Kirkwood <mark.kirkwood@catalyst.net.nz>) |
Responses |
Re: ANALYZE sampling is too good
|
List | pgsql-hackers |
On 10/12/13 15:32, Mark Kirkwood wrote: > On 10/12/13 15:17, Mark Kirkwood wrote: >> On 10/12/13 15:11, Mark Kirkwood wrote: >>> On 10/12/13 15:04, Mark Kirkwood wrote: >>>> On 10/12/13 13:53, Mark Kirkwood wrote: >>>>> On 10/12/13 13:20, Mark Kirkwood wrote: >>>>>> On 10/12/13 13:14, Mark Kirkwood wrote: >>>>>>> On 10/12/13 12:14, Heikki Linnakangas wrote: >>>>>>>> >>>>>>>> >>>>>>>> I took a stab at using posix_fadvise() in ANALYZE. It turned >>>>>>>> out to be very easy, patch attached. Your mileage may vary, but >>>>>>>> I'm seeing a nice gain from this on my laptop. Taking a 30000 >>>>>>>> page sample of a table with 717717 pages (ie. slightly larger >>>>>>>> than RAM), ANALYZE takes about 6 seconds without the patch, and >>>>>>>> less than a second with the patch, with >>>>>>>> effective_io_concurrency=10. If anyone with a good test data >>>>>>>> set loaded would like to test this and post some numbers, that >>>>>>>> would be great. >>>>>>>> >>>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> I did a test run: >>>>>>> >>>>>>> pgbench scale 2000 (pgbench_accounts approx 25GB). >>>>>>> postgres 9.4 >>>>>>> >>>>>>> i7 3.5Ghz Cpu >>>>>>> 16GB Ram >>>>>>> 500 GB Velociraptor 10K >>>>>>> >>>>>>> (cold os and pg cache both runs) >>>>>>> Without patch: ANALYZE pgbench_accounts 90s >>>>>>> With patch: ANALYZE pgbench_accounts 91s >>>>>>> >>>>>>> So I'm essentially seeing no difference :-( >>>>>> >>>>>> >>>>>> Arrg - sorry forgot the important bits: >>>>>> >>>>>> Ubuntu 13.10 (kernel 3.11.0-14) >>>>>> filesystem is ext4 >>>>>> >>>>>> >>>>>> >>>>> >>>>> Doing the same test as above, but on a 80GB Intel 520 (ext4 >>>>> filesystem mounted with discard): >>>>> >>>>> (cold os and pg cache both runs) >>>>> Without patch: ANALYZE pgbench_accounts 5s >>>>> With patch: ANALYZE pgbench_accounts 5s >>>>> >>>>> >>>>> >>>>> >>>>> >>>> >>>> Redoing the filesystem on the 520 as btrfs didn't seem to make any >>>> difference either: >>>> >>>> (cold os and pg cache both runs) >>>> Without patch: ANALYZE pgbench_accounts 6.4s >>>> With patch: ANALYZE pgbench_accounts 6.4s >>>> >>>> >>>> >>> >>> Ah - I have just realized I was not setting effective_io_concurrency >>> - so I'll redo the test. - Apologies. >>> >>> >> >> Redoing the test on the velociraptor gives me exactly the same >> numbers as before (effective_io_concurrency = 10 instead of 1). >> >> > > Good grief - repeating the test gave: > > Without patch: ANALYZE pgbench_accounts 90s > With patch: ANALYZE pgbench_accounts 42s > > pretty consistent *now*. No idea what was going on in the 1st run > (maybe I happened to have it running at the same time as a > checkpoint)? Anyway will stop now before creating more confusion. > > Just one more... The Intel 520 with ext4: Without patch: ANALYZE pgbench_accounts 5s With patch: ANALYZE pgbench_accounts 1s And double checking - With patch, but effective_io_concurrency = 1: ANALYZE pgbench_accounts 5s These results look more like Heikki's. Which suggests more benefit on SSD than spinning disks. Some more data points (apart from mine) would be good to see tho. Cheers Mark
pgsql-hackers by date: