Re: GIN improvements part2: fast scan - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: GIN improvements part2: fast scan
Msg-id 52FF9CEC.90809@fuzzy.cz
In response to Re: GIN improvements part2: fast scan  (Alexander Korotkov <aekorotkov@gmail.com>)
List pgsql-hackers
On 9.2.2014 11:11, Alexander Korotkov wrote:
> On Fri, Feb 7, 2014 at 5:33 PM, Heikki Linnakangas
> <hlinnakangas@vmware.com <mailto:hlinnakangas@vmware.com>> wrote:
>
>     On 02/06/2014 01:22 PM, Alexander Korotkov wrote:
>
>         Difference is very small. For me, it looks ready for commit.
>
>
>     Great, committed!
>
>     Now, to review the catalog changes...
>
>
> I've rebased catalog changes with last master. Patch is attached. I've
> rerun my test suite with both last master ('committed') and attached
> patch ('ternary-consistent').
>
>          method         |       sum
> ------------------------+------------------
>  committed              | 143491.715000001
>  fast-scan-11           | 126916.111999999
>  fast-scan-light        |       137321.211
>  fast-scan-light-heikki | 138168.028000001
>  master                 |       446976.288
>  ternary-consistent     |       125923.514
>

Hi,

I've repeated the benchmarks. It took a few days to run them, but here
are the results, and IMHO they look 100% fine.

I've tested all the patches since 25/01/2014, mostly out of curiosity
but also for comparison with the current patches. So these are the
patches (name + date of the message with the patch):

  heikki-20140125      0001 + 0002 + 0003 + 0004
  heikki-20140126      load-all-entries-before-consistent-check-1
  alexander-20140127-1 0001 + 0002 + 0003 + 0004 + 0005
  alexander-20140127-2 0001 + 0002 + 0003 + 0004 + 0005 + 0006
  heikki-20140202      ternary-logic + binary-heap
                       + preconsistent-only-on-new-page
  alexander-20140203   fast-scan-10
  alexander-20140204   fast-scan-11
  alexander-20140205   fast-scan-light
  heikki-20140206      fast-scan-light-heikki1 (committed 07/02)
  alexander-20140209   ternary-consistent

I've tested both 9.3 and master for comparison. Package with all the
patches is available here: http://www.fuzzy.cz/tmp/gin/patches.tgz

The results are available on http://www.fuzzy.cz/tmp/gin/ as before.

I've tested these datasets:

  3-words-common
  3-words-common + ORDER BY
  3-words-medium
  3-words-medium + ORDER BY
  3-words-rare
  3-words-rare + ORDER BY
  6-words-common
  6-words-common + ORDER BY
  6-words-medium
  6-words-medium + ORDER BY
  6-words-rare
  6-words-rare + ORDER BY
  postgres-queries
  postgres-queries + ORDER BY

I.e. basically the same queries as before, except that I've added a
version without the "ORDER BY" clause. The main difference is that I
added a "postgres-queries" dataset with 33k real-world queries collected
from the search at postgresql.org.

Another improvement is that instead of a single measurement, I've run
the tests 10x, then threw away the first run and computed the average,
median, min, max and stddev. You can choose the value to plot under the
chart.
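
For anyone who wants to reproduce the aggregation, here is a minimal
sketch of it in Python. It is not the script used for these results. The
function name, the sample timings and the choice of the sample standard
deviation (rather than the population one) are assumptions made for
illustration only.

```python
import statistics


def summarize(timings):
    """Drop the first run, then summarize the remaining ones.

    `timings` holds the durations of repeated runs of one query.
    Hypothetical helper, not part of the original benchmark scripts.
    """
    if len(timings) < 3:
        raise ValueError("need at least three runs to summarize")
    kept = timings[1:]  # throw away the first run
    return {
        "avg": statistics.mean(kept),
        "median": statistics.median(kept),
        "min": min(kept),
        "max": max(kept),
        # Sample standard deviation; which variant was used is an assumption.
        "stddev": statistics.stdev(kept),
    }


# Made-up timings for ten runs of a single query; the first one is slower.
runs = [48.0, 12.0, 11.0, 13.0, 12.0, 11.0, 12.0, 14.0, 12.0, 11.0]
summary = summarize(runs)
print(summary)
```

Dropping the first run keeps a one-off slow measurement out of the
average, while the median and stddev show how stable the remaining nine
runs are.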

The files with results for the 'postgres-queries' dataset are ~70MB,
which makes viewing it on the web a major PITA (first it takes very long
to download, then it hogs the browser). So don't do that unless you want
to punish yourself for something bad you've done.

An alternative way to view the data is using simple gnuplot charts. In
that case get http://www.fuzzy.cz/tmp/gin/plots.tgz. There's always a
.plot and .data file for each dataset/value combination. The dataset is
always "speedup vs. master".
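
As a rough illustration of what "speedup vs. master" means, here is a
small Python sketch. It assumes the speedup is the master timing divided
by the patched timing (so values above 1 mean the patch is faster); the
exact formula behind the .data files is not spelled out here, so treat
that as an assumption. The totals are the sums from Alexander's table
quoted above, not numbers from my runs, and the function name is made up.

```python
def speedup_vs_master(master_total, patched_total):
    """Return how many times faster the patched build is than master.

    Assumes speedup = master / patched; hypothetical helper.
    """
    if patched_total <= 0:
        raise ValueError("patched total must be positive")
    return master_total / patched_total


# Sums taken from the table quoted at the top of this message.
totals = {
    "master": 446976.288,
    "committed": 143491.715000001,
    "ternary-consistent": 125923.514,
}

speedups = {
    name: speedup_vs_master(totals["master"], total)
    for name, total in totals.items()
}

for name, value in sorted(speedups.items()):
    print(f"{name:20s} {value:.2f}x")
```

With this definition master itself is always 1.0, which gives every chart
the same baseline.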


Looking at the postgres-queries results (attached), I see almost no
differences between these patches:

  alexander-20140204   fast-scan-11
  alexander-20140205   fast-scan-light
  heikki-20140206      fast-scan-light-heikki1 (committed 07/02)
  alexander-20140209   ternary-consistent

And the same is true for the other tests - see the attached gnuplot
charts for a few of the tests.

So IMHO this looks quite great, no need for worries. Let me know if you
have any questions / would like to see another chart.

regards
Tomas
