Re: Allowing GIN array_ops to work on anyarray - Mailing list pgsql-hackers

From M Enrique
Subject Re: Allowing GIN array_ops to work on anyarray
Msg-id CADCw5Qbs3DknJugKqQeSc5O0SDR8=xK23HR0PN810VqR_mE4-Q@mail.gmail.com
In response to Allowing GIN array_ops to work on anyarray  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Allowing GIN array_ops to work on anyarray
List pgsql-hackers
This is awesome. I will build it and start using and testing it in my development environment. Thank you so much for making this change.

On Thu, Aug 11, 2016 at 11:33 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
In
https://www.postgresql.org/message-id/15293.1466536829@sss.pgh.pa.us
I speculated that it might not take too much to replace all the variants
of GIN array_ops with a single polymorphic opclass over anyarray.
Attached is a proposed patch that does that.

There are two bits of added functionality needed to make this work:

1. We need to abstract the storage type.  The patch does this by teaching
catalog/index.c to recognize an opckeytype specified as ANYELEMENT with an
opcintype of ANYARRAY, and doing the array element type lookup at index
creation time.

2. We need to abstract the key comparator.  The patch does this by
teaching gin/ginutil.c that if the opclass omits a GIN_COMPARE_PROC,
it should look up the default btree comparator for the index key type.

Both of these seem to me to be reasonable general-purpose behaviors with
potential application to other opclasses.
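The two fallbacks described above can be sketched as a toy model. This is not the patch's code: every name below (ToyType, ToyOpclass, toy_resolve_key_type, toy_resolve_comparator, and so on) is hypothetical, and a small enum stands in for the catalog OIDs that the real code in catalog/index.c and gin/ginutil.c would consult.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical type identifiers; real PostgreSQL uses catalog OIDs. */
typedef enum {
    TOY_NONE = 0,
    TOY_INT4, TOY_TEXT,
    TOY_INT4_ARRAY, TOY_TEXT_ARRAY,
    TOY_ANYARRAY, TOY_ANYELEMENT
} ToyType;

typedef int (*ToyCompare)(const void *a, const void *b);

/* Toy stand-in for an opclass entry. */
typedef struct {
    ToyType    opcintype;     /* declared input type */
    ToyType    opckeytype;    /* declared storage (key) type */
    ToyCompare compare_proc;  /* NULL when the opclass omits a comparator */
} ToyOpclass;

/* Map an array type to its element type (toy lookup table). */
ToyType toy_element_type(ToyType array_type)
{
    switch (array_type) {
        case TOY_INT4_ARRAY: return TOY_INT4;
        case TOY_TEXT_ARRAY: return TOY_TEXT;
        default:             return TOY_NONE;
    }
}

int toy_cmp_int4(const void *a, const void *b)
{
    int x = *(const int *) a;
    int y = *(const int *) b;
    return (x > y) - (x < y);
}

int toy_cmp_text(const void *a, const void *b)
{
    return strcmp((const char *) a, (const char *) b);
}

/* Stand-in for "the default btree comparator for the key type". */
ToyCompare toy_default_btree_cmp(ToyType key_type)
{
    switch (key_type) {
        case TOY_INT4: return toy_cmp_int4;
        case TOY_TEXT: return toy_cmp_text;
        default:       return NULL;
    }
}

/* Point 1: ANYELEMENT storage over ANYARRAY input resolves to the
 * element type of the actual column, decided at index creation time. */
ToyType toy_resolve_key_type(const ToyOpclass *opc, ToyType column_type)
{
    if (opc->opcintype == TOY_ANYARRAY && opc->opckeytype == TOY_ANYELEMENT)
        return toy_element_type(column_type);
    return opc->opckeytype;
}

/* Point 2: if the opclass has no comparator of its own, fall back to
 * the default btree comparator for the resolved key type. */
ToyCompare toy_resolve_comparator(const ToyOpclass *opc, ToyType key_type)
{
    if (opc->compare_proc != NULL)
        return opc->compare_proc;
    return toy_default_btree_cmp(key_type);
}
```

In this model, a single opclass declared as { TOY_ANYARRAY, TOY_ANYELEMENT, NULL } resolves to an int4 key with the int4 comparator for an int4[] column, and to a text key with the text comparator for a text[] column.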

In the aforementioned message I worried that a core opclass defined this
way might conflict with user-built opclasses for specific array types,
but it seems to work out fine without any additional tweaks: CREATE INDEX
already prefers an exact match if it finds one, and only falls back to
matching anyarray when it doesn't.  Also, all the replaced opclasses are
presently default for their types, which means that pg_dump won't print
them explicitly in CREATE INDEX commands, so we don't have a dump/reload
or pg_upgrade hazard from them disappearing.
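The "exact match first, anyarray as fallback" rule can be illustrated with a simplified selection loop. This is only a sketch under assumptions: DemoOpclass, DemoTypeId and demo_pick_default_opclass are hypothetical names, and the real default-opclass lookup used by CREATE INDEX considers more cases than this model does.

```c
#include <stddef.h>

/* Hypothetical type identifiers standing in for catalog OIDs. */
typedef enum {
    DEMO_INT4_ARRAY = 1,
    DEMO_TEXT_ARRAY,
    DEMO_ANYARRAY
} DemoTypeId;

/* Toy opclass entry: a name, its declared input type, and whether it
 * is marked as a default opclass. */
typedef struct {
    const char *name;
    DemoTypeId  input_type;
    int         is_default;
} DemoOpclass;

/* Return the default opclass for column_type: an exact input-type match
 * wins immediately; a polymorphic anyarray opclass is remembered and
 * used only if no exact match exists. Returns NULL if nothing fits. */
const DemoOpclass *demo_pick_default_opclass(const DemoOpclass *ops,
                                             size_t n,
                                             DemoTypeId column_type)
{
    const DemoOpclass *fallback = NULL;

    for (size_t i = 0; i < n; i++) {
        if (!ops[i].is_default)
            continue;
        if (ops[i].input_type == column_type)
            return &ops[i];               /* exact match preferred */
        if (ops[i].input_type == DEMO_ANYARRAY && fallback == NULL)
            fallback = &ops[i];           /* polymorphic fallback */
    }
    return fallback;
}
```

With a list holding a core anyarray opclass and a user-built default opclass for int4[], this model picks the user-built one for an int4[] column and the anyarray one for a text[] column, regardless of the order of the entries.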

A potential downside is that for an opclass defined this way, we add a
lookup_type_cache() call to each initGinState() call.  That's basically
just a single dynahash lookup once the caches are populated, so it's not
much added cost, but conceivably it could be measurable in bulk insert
operations.  If it does prove objectionable, my inclination would be to
look into ways to avoid the repetitive function lookups of initGinState,
perhaps by letting it cache that stuff in the index's relcache entry.
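The caching idea floated above amounts to memoizing the lookup results in a per-index structure. The following is a minimal sketch of that pattern only: SketchIndexCache, sketch_expensive_lookup and sketch_init_state are hypothetical names, a counter stands in for the cost of the repeated lookups, and nothing here reflects how the relcache is actually laid out.

```c
#include <stddef.h>

typedef int (*SketchCompare)(int a, int b);

/* Hypothetical per-index cache slot, standing in for data that could
 * live in the index's relcache entry. */
typedef struct {
    int           cached;   /* nonzero once compare has been filled in */
    SketchCompare compare;  /* looked-up comparator */
} SketchIndexCache;

/* Counts how many times the "expensive" lookup actually runs. */
int sketch_lookup_count = 0;

int sketch_cmp_int(int a, int b)
{
    return (a > b) - (a < b);
}

/* Stand-in for the type-cache and function lookups done on each
 * state initialization. */
SketchCompare sketch_expensive_lookup(void)
{
    sketch_lookup_count++;
    return sketch_cmp_int;
}

/* Initialize per-operation state, doing the lookup only on first use
 * and reusing the cached comparator afterwards. */
SketchCompare sketch_init_state(SketchIndexCache *cache)
{
    if (!cache->cached) {
        cache->compare = sketch_expensive_lookup();
        cache->cached = 1;
    }
    return cache->compare;
}
```

Repeated calls to sketch_init_state() on the same cache slot run the lookup once; a real implementation would also need to invalidate the slot when the underlying catalog entries change, which this sketch does not attempt.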

I'll put this on the September commitfest docket.

                        regards, tom lane
