I wrote:
> My apologies for having let this slip through the cracks. I think
> I'd wanted to understand why the committed version of the GiST
> index tests doesn't expose the problem, and I never got time to
> study that. I still haven't tracked it down, but the proposed patch
> seems clearly safe so I've gone ahead and pushed it.
I couldn't quite let this go, and after some testing I see the issue.
The code coverage report correctly shows that we reach almost all of
gistProcessEmptyingQueue in the regression tests, but what it fails to
expose is that the MemoryContextReset call is only reached in calls
from gistEmptyAllBuffers, which occur only at the end of the index
build. We never get there from gistBufferingBuildInsert, which is
the troublesome path. I haven't grokked gistProcessEmptyingQueue
fully, but if the comment in gistBufferingBuildInsert is correct:
/* If we filled up (half of a) buffer, process buffer emptying. */
then this is just a matter of not having enough data volume --- as
indeed Egor's test case proves, if you break it down a little: the
failure doesn't happen in the index builds with 10k or 20k tuples,
you need 30k. The same was true in Alexander's original report.
So we could fix the lack of test coverage with something about
like this:
diff --git a/src/test/regress/sql/gist.sql b/src/test/regress/sql/gist.sql
index 6f1fc65..a42d3ee 100644
*** a/src/test/regress/sql/gist.sql
--- b/src/test/regress/sql/gist.sql
*************** select p from gist_tbl order by circle(p
*** 170,175 ****
--- 170,180 ----
select p from gist_tbl order by circle(p,1) <-> point(0,0) limit 1;
-- Force an index build using buffering.
+ insert into gist_tbl
+ select box(point(0.05*i, 0.05*i), point(0.05*i, 0.05*i)),
+ point(0.05*i, 0.05*i),
+ circle(point(0.05*i, 0.05*i), 1.0)
+ from generate_series(10001,30000) as i;
create index gist_tbl_box_index_forcing_buffering on gist_tbl using gist (p)
with (buffering=on, fillfactor=50);
The question is whether it's worth the extra test cycles forevermore.
On my machine, the time for the gist test script goes from ~400ms to
~600ms. That's still less than some of the concurrent scripts (brin,
privileges, and join_hash all take more than that for me), so maybe
it's fine. But it seems like a lot for a change that doesn't move the
raw code coverage results by even one line.
Thoughts?
regards, tom lane