On Sun, Oct 30, 2011 at 8:02 AM, Kääriäinen Anssi
<anssi.kaariainen@thl.fi> wrote:
> Table size is around 600MB, index size is around 350MB and VM on-disk
> size is 16kB with default fillfactor. With fillfactor = 10, the VM size is 104
> KB, and table size is around 6GB. The index size is the same.
What I think you're probably measuring here (oprofile would tell us
for sure) is that once the size of the table goes beyond about half a
gigabyte, it will have more than one page in the visibility map (each
8kB VM page, at one bit per heap page, covers 64K heap pages, i.e.
512MB of table). The
index-only scan code keeps the most recently used visibility map page
pinned to save on overhead, but if you're bouncing back and forth
between data in the first ~500MB of the table and data in the last
~100MB, each switch will result in dropping the current pin and
getting a new one, which figures to be fairly expensive. When the
table is only a little over 500MB, you're probably only changing VM
pages every couple of tuples, but with a 6GB table just about every
tuple will switch to a new VM page.
Now, maybe you're right and the CPU caches are the more significant
effect. But I wouldn't like to bet on it without seeing how much the
drop-and-get-new-pin operations are costing us.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company