Thread: Unexpected speed PLAIN vs. MAIN
I'm comparing speed of some queries against tables having the same data but different storage, and got an unexpected behavior. The tables have 2 integer fields and a PcPatch field ("p", custom type from pgPointCloud). There are no TOASTs involved (the toast table associated with the table with MAIN storage is empty, the table with PLAIN storage has no toast table). Running a SELECT count(p) takes 6261.699 ms on the table with MAIN storage and 18488.713 ms on the table with PLAIN storage. The number of buffer reads are about the same. Why would reading presence/absence of a value be faster from MAIN than from PLAIN storage ? The explain output: =# explain (analyze, verbose, buffers) select count(pa) from rtlidar_dim_main;Aggregate (cost=1202627.85..1202627.86 rows=1width=32) (actual time=6261.644..6261.644 rows=1 loops=1) Output: count(pa) Buffers: shared hit=32 read=1187659 -> Seq Scan on public.rtlidar_dim_main (cost=0.00..1199640.48 rows=1194948 width=32) (actual time=0.060..6105.566 rows=1194948loops=1) Output: id, source, pa Buffers: shared hit=32 read=1187659Total runtime: 6261.699 ms =# explain (analyze, verbose, buffers) select count(pa) from rtlidar_dim_plain;Aggregate (cost=1202627.85..1202627.86 rows=1width=32) (actual time=18473.973..18473.973 rows=1 loops=1) Output: count(pa) Buffers: shared hit=37 read=1187654 -> Seq Scan on public.rtlidar_dim_plain (cost=0.00..1199640.48 rows=1194948 width=32) (actual time=0.058..18247.974rows=1194948 loops=1) Output: id, source, pa Buffers: shared hit=37 read=1187654Totalruntime: 18474.028 ms The relation sizes: =# select pg_total_relation_size('rtlidar_dim_plain');9756426240 =# select pg_total_relation_size('rtlidar_dim_main');9756434432 --strk;
Sandro Santilli <strk@keybit.net> writes: > I'm comparing speed of some queries against tables having the same data > but different storage, and got an unexpected behavior. > The tables have 2 integer fields and a PcPatch field > ("p", custom type from pgPointCloud). > There are no TOASTs involved (the toast table associated with the table > with MAIN storage is empty, the table with PLAIN storage has no toast table). > Running a SELECT count(p) takes 6261.699 ms on the table with MAIN storage > and 18488.713 ms on the table with PLAIN storage. > The number of buffer reads are about the same. > Why would reading presence/absence of a value be faster from MAIN than > from PLAIN storage ? Hm ... MAIN allows in-line compression while PLAIN doesn't. But for count(), that would only make a difference if it resulted in a smaller physical table size, which it evidently didn't. My best guess is that the OS had many of the pages from rtlidar_dim_main sitting in OS disk cache, so that those "buffer reads" didn't all translate to physical I/O. Try flushing the OS cache immediately before each trial to get more-reproducible results. regards, tom lane
On Mon, May 04, 2015 at 01:50:45PM -0400, Tom Lane wrote: > Sandro Santilli <strk@keybit.net> writes: > > I'm comparing speed of some queries against tables having the same data > > but different storage, and got an unexpected behavior. > > > The tables have 2 integer fields and a PcPatch field > > ("p", custom type from pgPointCloud). > > > There are no TOASTs involved (the toast table associated with the table > > with MAIN storage is empty, the table with PLAIN storage has no toast table). > > > Running a SELECT count(p) takes 6261.699 ms on the table with MAIN storage > > and 18488.713 ms on the table with PLAIN storage. > > > The number of buffer reads are about the same. > > Why would reading presence/absence of a value be faster from MAIN than > > from PLAIN storage ? > > Hm ... MAIN allows in-line compression while PLAIN doesn't. But for > count(), that would only make a difference if it resulted in a smaller > physical table size, which it evidently didn't. > > My best guess is that the OS had many of the pages from rtlidar_dim_main > sitting in OS disk cache, so that those "buffer reads" didn't all > translate to physical I/O. Try flushing the OS cache immediately before > each trial to get more-reproducible results. Bingo, it was the OS disk cache. Thanks for the tip ! That cache (Linux) acts in mysterious ways, btw. After a new boot, with no explicit flushing, I obtained slow times in both tables (~18 secs) with queries in this order: PLAIN,MAIN,PLAIN,MAIN. Then 3 queries in a row against MAIN brought down its timing to 2, but after that no number of consecutive queries against PLAIN could do that. It took a disk flush (echo 3 > /proc/sys/vm/drop_caches; sync was not enough) to get the 18 seconds back on reading MAIN and allowing me to force caching PLAIN via consecutive calls... I'll play a bit with pgfincore to learn more. (http://git.postgresql.org/gitweb/?p=pgfincore.git;a=summary) --strk;