On Sun, May 27, 2018 at 12:45 PM, Jonathan S. Katz <jkatz@postgresql.org> wrote:
> Next, see bad.sql. 1.2MM sparsely clustered rows inserted, GiST indexes
> builds in about 30s on my machine. SP-GiST does not build at all, or at
> least I have been composing this email for about 10 minutes since I kicked
> off my latest and it has yet to terminate.
>
> I can understand this being an extreme case for SP-GiST as it’s better
> for data set that’s more densely clustered, but I wanted to provide
> this info to rule out whether or not this is a bug.

While I'm no SP-GiST expert, I strongly suspect that you've identified
a real bug here. Your test case has been running on my development
machine for 20 minutes now (this is server-class hardware).

I ran perf with your test case, and I see that the majority of
instructions are executed within these functions:

    22.88%  postgres  postgres  [.] spgdoinsert
    12.98%  postgres  postgres  [.] range_deserialize
    11.44%  postgres  postgres  [.] FunctionCall2Coll
    10.40%  postgres  postgres  [.] heap_tuple_untoast_attr
     8.62%  postgres  postgres  [.] spgExtractNodeLabels
     5.92%  postgres  postgres  [.] getQuadrant
     4.90%  postgres  postgres  [.] AllocSetAlloc

spgdoinsert() contains the following comment:

    /*
     * Bail out if query cancel is pending.  We must have this somewhere
     * in the loop since a broken opclass could produce an infinite
     * picksplit loop.
     */
    CHECK_FOR_INTERRUPTS();

Perhaps the problem is in the range type SP-GiST opclass support code:
it could have exactly this infinite picksplit loop problem. That's
perfectly consistent with what we see here.
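
To make the failure mode named in that comment concrete, here is a toy sketch in C of how a split rule can fail to make progress. It is not PostgreSQL code and does not claim to reproduce the range opclass's actual behaviour: the ToyRange type, the toy_* functions, and the deliberately poor "use the first item as centroid" rule are all hypothetical, chosen only to show what "a broken opclass could produce an infinite picksplit loop" can look like. The round limit stands in for the interrupt check.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy type: a range reduced to a 2D point (lower, upper). */
typedef struct
{
    double lower;
    double upper;
} ToyRange;

/* Quadrant (1..4) of r relative to centroid c. */
static int
toy_quadrant(ToyRange c, ToyRange r)
{
    if (r.lower >= c.lower)
        return (r.upper >= c.upper) ? 1 : 2;
    return (r.upper >= c.upper) ? 4 : 3;
}

/* Count how many distinct quadrants the items occupy around c. */
static int
toy_quadrants_used(ToyRange c, const ToyRange *items, size_t n)
{
    int seen[5] = {0};
    int used = 0;

    for (size_t i = 0; i < n; i++)
    {
        int q = toy_quadrant(c, items[i]);

        if (!seen[q])
        {
            seen[q] = 1;
            used++;
        }
    }
    return used;
}

/*
 * Repeatedly "split" the same set of items.  Returns the round in which
 * the items were first spread over more than one quadrant, or -1 if
 * max_rounds passed without progress.  Without the limit, the no-progress
 * case would loop forever.
 */
static int
toy_split_rounds(const ToyRange *items, size_t n, int max_rounds)
{
    assert(n > 0);

    for (int round = 1; round <= max_rounds; round++)
    {
        /* Deliberately poor (assumed) centroid rule: always the first item. */
        ToyRange c = items[0];

        if (toy_quadrants_used(c, items, n) > 1)
            return round;
        /* Every item landed in one quadrant: same input, same result next time. */
    }
    return -1;
}
```

In this toy, the input {1,2}, {3,4}, {5,6} puts every item in quadrant 1 relative to the first item, so toy_split_rounds() never makes progress and gives up at the round limit, whereas {3,4}, {1,2} separates in the first round. Whether anything similar happens in the real range opclass is exactly the open question.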
--
Peter Geoghegan