Re: So, is COUNT(*) fast now? - Mailing list pgsql-hackers
From | Kääriäinen Anssi |
---|---|
Subject | Re: So, is COUNT(*) fast now? |
Date | |
Msg-id | BC19EF15D84DC143A22D6A8F2590F0A788641330BE@EXMAIL.stakes.fi Whole thread Raw |
In response to | So, is COUNT(*) fast now? (Robert Haas <robertmhaas@gmail.com>) |
Responses |
Re: So, is COUNT(*) fast now?
Re: So, is COUNT(*) fast now? |
List | pgsql-hackers |
Quoting Robert Haas: """ I tried this on my MacBook Pro this morning, using pgbench -i -s 500 to create a database about 7.5GB in size, and then using "SELECT sum(1) FROM pgbench_accounts" as a test query, on a build WITHOUT --enable-cassert. This machine has 4GB of memory, and I set shared_buffers = 400MB. (No, I'm not sure whether that's the optimal setting for shared_buffers for this machine.) """ I did some tests where I tried to compare the effect of having the index ordered tuples not be in the same order they are in the base table. The idea is to test what effect accessing the VM map randomly as opposed to sequential order has on performance. I suspect the above test access the VM in order (the accounts table is effectively clustered on the index used in the test). I might be mistaken here. My test setup is this: drop table if exists test; drop table if exists test2; create unlogged table test /* with (fillfactor = 10) */ as select generate_series(0, 20*1000*1000) as id; create index idx1 on test(id); vacuum test; create unlogged table test2 /* with (fillfactor = 10) */ as (select * from test order by random()); create index idx2 on test2(id); vacuum test2; Table size is around 600MB, index size is around 350MB and VM on-disk size is 16kB with default fillfactor. With fillfactor = 10, the VM size is 104 KB, and table size is around 6GB. The index size is the same. Results for the randomly ordered table: # select count(*) from test2; 14822.045 ms 14826.253 ms 14815.450 ms Results for the effectively clustered table: # select count(*) from test; 11761.890 ms 11767.926 ms 11810.900 ms Now, this test still has the benefit of fitting the VM easily into the L1 cache. Next, I did a ugly hack to get the table size large enough so that the VM will trash the L1 cache while still having somewhat reasonable test setup creation time. My harware is old, 1GB of memory, processor is Genuine Intel(R) CPU L2400 @ 1.66GHz. The L1 data cache size is 32kB on my. The hack is to simply set fillfactor to 10. The VM size is now 104kB, the table size is about 6.3 GB while the index size is still the same as in above test. Results for the randomly ordered table: # select count(*) from test2; 21606.683 ms 21829.063 ms 21637.434 ms Results for the effectively clustered table: # select count(*) from test; 11714.663 ms 11449.264 ms 11658.534 ms Now, the next step would be to trash the L2 cache (20GB table size should do this on Sandy Bridge, where L2 cache is 256KB). I don't have hardware to do that test. It is worth noting that the L2 cache is shared on Sandy Bridge, so it is likely that an index-only scan of a large enough table would slow down other processes, too. Without tests this is only FUD, though. The test would be to scan a 20GB table's index repeatedly in one process, and see how it affects standard in-memory pgbench results for other processes. Compare this with doing the same with a sequential scan process. Lessons learned (or what I learned, at least): - Clustering is important for index only scans. Picking a clustered index over non-clustered index will have a big performance effect. - Large table index-only scans are going to be more expensivecompared to sequential scan than what pgbench accounts tests suggests. I assume that the accounts table is effectivelyclustered on the index used. I haven't verified this. - There is the possibility that index-only scans willtrash the caches for other processes, too. Not tested, though. I am sure these results will vary significantly based on hardware used. I am also notorious for screwing up benchmarks, so verifying these results is recommended. You will need around 16GB of disk space for the fillfactor = 10 test. I would recommend you have more than 1GB of memory, otherwise creating the test setup can take some time... - Anssi Kääriäinen
pgsql-hackers by date: