Home > mailing lists

Re: [PATCH] Fix ScalarArrayOpExpr estimation for GIN indexes - Mailing list pgsql-hackers

From	Tom Lane
Subject	Re: [PATCH] Fix ScalarArrayOpExpr estimation for GIN indexes
Date	December 20, 2011 04:09:12
Msg-id	10309.1324357729@sss.pgh.pa.us Whole thread Raw
In response to	[PATCH] Fix ScalarArrayOpExpr estimation for GIN indexes (Marti Raudsepp <marti@juffo.org>)
Responses	Re: [PATCH] Fix ScalarArrayOpExpr estimation for GIN indexes
List	pgsql-hackers

Tree view

Marti Raudsepp <marti@juffo.org> writes:
> Since PostgreSQL 9.1, GIN has new cost estimation code. This code
> assumes that the only expression type it's going to see is OpExpr.
> However, ScalarArrayOpExpr has also been possible in earlier versions.
> Estimating col <op> ANY (<array>) queries segfaults in 9.1 if there's
> a GIN index on the column.

Ugh.  I think we subconsciously assumed that ScalarArrayOpExpr couldn't
appear here because GIN doesn't set amsearcharray, but of course that's
wrong.

> (It seems that RowCompareExpr and NullTest clauses are impossible for
> a GIN index -- at least my efforts to find such cases failed)

No, those are safe for the moment --- indxpath.c has a hard-wired
assumption that RowCompareExpr is only usable with btree, and NullTest
is only considered to be indexable if amsearchnulls is set.  Still,
it'd likely be better if this code ignored unrecognized qual expression
types rather than Assert'ing they're not there.  It's not like the cases
it *does* handle are done so perfectly that ignoring an unknown qual
could be said to bollix the estimate unreasonably.

> Attached is an attempted fix for the issue. I split out the code for
> the extract call and now run that for each array element, adding
> together the average of (partialEntriesInQuals, exactEntriesInQuals,
> searchEntriesInQuals) for each array element. After processing all
> quals, I multiply the entries by the number of array_scans (which is
> the product of all array lengths) to get the total cost.

> This required a fair bit of refactoring, but I tried to follow the
> patterns for OpExpr pretty strictly -- discounting scans over NULL
> elements, returning 0 cost when none of the array elements can match,
> accounting for cache effects when there are multiple scans, etc. But
> it's also possible that I have no idea what I'm really doing. :)

Hmm.  I am reminded of how utterly unreadable "diff -u" format is for
anything longer than single-line changes :-( ... but I think I don't
like this refactoring much.  Will take a closer look tomorrow.
        regards, tom lane

pgsql-hackers by date:

From: Robert Haas
Date: 20 December 2011, 04:00:56
Subject: Re: JSON for PG 9.2

From: Nikhil Sontakke
Date: 20 December 2011, 05:14:47
Subject: Re: Review: Non-inheritable check constraints

Re: [PATCH] Fix ScalarArrayOpExpr estimation for GIN indexes - Mailing list pgsql-hackers

Previous

Next