Re: Binary search in ScalarArrayOpExpr for OR'd constant arrays - Mailing list pgsql-hackers

From David Rowley
Subject Re: Binary search in ScalarArrayOpExpr for OR'd constant arrays
Date
Msg-id CAApHDvo_PzGm5EEHUjsEKPr+afj9KXse-iqrCNke5wJeTAit2g@mail.gmail.com
Whole thread Raw
In response to Re: Binary search in ScalarArrayOpExpr for OR'd constant arrays  (James Coleman <jtc331@gmail.com>)
Responses Re: Binary search in ScalarArrayOpExpr for OR'd constant arrays  (James Coleman <jtc331@gmail.com>)
Re: Binary search in ScalarArrayOpExpr for OR'd constant arrays  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
List pgsql-hackers
On Sat, 20 Mar 2021 at 09:41, James Coleman <jtc331@gmail.com> wrote:
> I've attached a cleaned up patch. Last CF it was listed in is
> https://commitfest.postgresql.org/29/2542/ -- what's the appropriate
> step to take here given it's an already existing patch, but not yet
> moved into recent CFs?

I had a look at this patch.  I like the idea of using a simplehash.h
hash table to hash the constant values so that repeat lookups can be
performed much more quickly, however, I'm a bit concerned that there
are quite a few places in the code where we often just execute a
ScalarArrayOpExpr once and I'm a bit worried that we'll slow down
expression evaluation of those cases.

The two cases that I have in mind are:

1. eval_const_expressions() where we use the executor to evaluate the
ScalarArrayOpExpr to see if the result is Const.
2. CHECK constraints with IN clauses and single-row INSERTs.

I tried to benchmark both of these but I'm struggling to get stable
enough performance for #2, even with fsync=off.  Sometimes I'm getting
results 2.5x slower than other runs.

For benchmarking #1 I'm also not too sure I'm getting stable enough
results for them to mean anything.

I was running:

create table a (a int);

bench.sql: explain select * from a where a
in(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16);

drowley@amd3990x:~$ pgbench -n -T 60 -j 64 -c 64 -f bench.sql -P 10 postgres
Master (6d41dd045):
tps = 992586.991045 (without initial connection time)
tps = 987964.990483 (without initial connection time)
tps = 994309.670918 (without initial connection time)

Master + v4-0001-Hash-lookup-const-arrays-in-OR-d-ScalarArrayOps.patch
tps = 956022.557626 (without initial connection time)
tps = 963043.352823 (without initial connection time)
tps = 968582.600100 (without initial connection time)

This puts the patched version about 3% slower. I'm not sure how much
of that is changes in the binary and noise and how much is the
needless hashtable build done for eval_const_expressions().

I wondered if we should make it the query planner's job of deciding if
the ScalarArrayOpExpr should be hashed or not.  I ended up with the
attached rough-cut patch that introduces HashedScalarArrayOpExpr and
has the query planner decide if it's going to replace
ScalarArrayOpExpr with these HashedScalarArrayOpExpr during
preprocess_expression().  I do think that we might want to consider
being a bit selective about when we do these replacements.  It seems
likely that we'd want to do this for EXPRKIND_QUAL and maybe
EXPRKIND_TARGET, but I imagine that converting ScalarArrayOpExpr to
HashedScalarArrayOpExpr for EXPRKIND_VALUES would be a waste of time
since those will just be executed once.

I tried the same above test with the
v4-0001-Hash-lookup-const-arrays-in-OR-d-ScalarArrayOps.patch plus the
attached rough-cut patch and got:

master + v4-0001-Hash-lookup-const-arrays-in-OR-d-ScalarArrayOps.patch
+ v5-0002-Rough-cut-patch-for-HashedScalarArrayOpExpr.patch
tps = 1167969.983173 (without initial connection time)
tps = 1199636.793314 (without initial connection time)
tps = 1190690.939963 (without initial connection time)

I can't really explain why this became faster. I was expecting it just
to reduce that slowdown of the v4 patch a little. I don't really see
any reason why it would become faster.  It's almost 20% faster which
seems like too much to just be fluctuations in code alignment in the
binary.

The attached patch is still missing the required changes to
llvmjit_expr.c. I think that was also missing from the original patch
too, however.

Also, I added HashedScalarArrayOpExpr to plannodes.h.  All other Expr
type nodes are in primnodes.h. However, I put HashedScalarArrayOpExpr
in plannodes.h because the parser does not generate this and it's not
going to be stored in the catalogue files anywhere. I'm not so sure
inventing a new Expr type node that only can be generated by the
planner is a good thing to do.

Anyway, wondering what you think of the idea of allowing the planner
to choose if it's going to hash or not?

It might also be good if someone else can check if they can get a bit
more stable performance results from benchmarking the patches.

(Also attached your v4 patch again just so anyone following along at
home does not need to hunt around for the correct set of patches to
apply to test this)

David

Attachment

pgsql-hackers by date:

Previous
From: Andrew Gierth
Date:
Subject: Re: ALTER TABLE ADD COLUMN fast default
Next
From: "houzj.fnst@fujitsu.com"
Date:
Subject: RE: Table refer leak in logical replication