Re: [HACKERS] path toward faster partition pruning - Mailing list pgsql-hackers

From Robert Haas
Subject Re: [HACKERS] path toward faster partition pruning
Date
Msg-id CA+TgmoaVLDLc8=YESRwD32gPhodU_ELmXyKs77gveiYp+JE4vQ@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] path toward faster partition pruning  (Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>)
Responses Re: [HACKERS] path toward faster partition pruning
List pgsql-hackers
+/*
+ * PartitionBoundCmpArg - Caller-defined argument to be passed to
+ *                                               partition_bound_cmp()
+ *
+ * The first (fixed) argument involved in a comparison is the partition bound
+ * found in the catalog, while an instance of the following struct describes
+ * either a new partition bound being compared against existing bounds
+ * (caller should set is_bound to true and set bound), or a new tuple's
+ * partition key specified in datums (caller should set ndatums to the number
+ * of valid datums that are passed in the array).
+ */
+typedef struct PartitionBoundCmpArg
+{
+       bool    is_bound;
+       union
+       {
+               PartitionListValue         *lbound;
+               PartitionRangeBound        *rbound;
+               PartitionHashBound         *hbound;
+       }       bound;
+
+       Datum  *datums;
+       int             ndatums;
+} PartitionBoundCmpArg;

This is a pretty strange definition.  datums/ndatums are never valid
at the same time as any of lbound/rbound/hbound, but are not included
in the union.  Also, is_bound doesn't tell you which of
rbound/lbound/hbound is actually valid.  Granted, the current calling
convention looks like a mess, too.  Apparently, the argument to
partition_bound_cmp is a PartitionBoundSpec if using hash
partitioning, a Datum if list partitioning, and either a
PartitionRangeBound or a Datum * if range partitioning depending on
the value of probe_is_bound, and I use the word "apparently" because
there are zero words of comments explaining what the argument to
partition_bound_cmp of type "void *" is supposed to mean.  I really
should have noticed that and insisted that it be fixed before
partitioning got committed.

Looking a bit further, there are exactly two calls to
partition_bound_cmp().  One is in partition_bound_bsearch() and the
other is in check_new_partition_bound(). Now, looking at this, both
the call to partition_bound_cmp() and every single call to
partition_bound_bsearch() are inside a switch branch where we've
dispatched on the partitioning type, which means that from code that
is already specialized by partitioning type we are calling generic
code which then turns back around and calls code that is specialized
by partitioning type.  Now, that could make sense if the generic code
is pretty complex, but here's it's basically just the logic to do a
bsearch.  It seems to me that a cleaner solution here would be to
duplicate that logic.  Then we could have...

static int partition_list_bsearch(PartitionKey key, PartitionBoundInfo
boundinfo,
                        Datum value, bool *is_equal);
static int partition_range_bsearch(PartitionKey key,
PartitionBoundInfo boundinfo,
                        PartitionRangeBound *probe);
static int partition_range_datum_bsearch(PartitionKey key,
PartitionBoundInfo boundinfo,
                        int nvalues, Datum *values);
static int partition_hash_bsearch(PartitionKey key, PartitionBoundInfo
boundinfo,
                        int modulus, int remainder, bool *is_equal);

...which would involve fewer branches at runtime and more type-safety
at compile time.  partition_hash_bsearch could directly call
partition_hbound_cmp, partition_list_bsearch could directly invoke
FunctionCall2Coll, partition_range_bsearch could directly call
partition_rbound_cmp, and partition_range_datum_bsearch could directly
call partition_rbound_datum_cmp.

All-in-all that seems a lot nicer to me than what we have here now.
IIUC, the purpose of this patch is to let you search on a prefix of
the partition keys, but I think that's really only possible for range
partitioning, and it seems like the proposed nvalues argument to
partition_range_datum_bsearch would give you what you need.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


pgsql-hackers by date:

Previous
From: Thomas Munro
Date:
Subject: Re: [Patch] Checksums for SLRU files
Next
From: Tomas Vondra
Date:
Subject: Re: PATCH: logical_work_mem and logical streaming of largein-progress transactions