Thread: BUG #12694: crash if the number of result rows is lower than gin_fuzzy_search_limit
BUG #12694: crash if the number of result rows is lower than gin_fuzzy_search_limit
From
olaf.gw@googlemail.com
Date:
The following bug has been logged on the website: Bug reference: 12694 Logged by: Olaf Gawenda Email address: olaf.gw@googlemail.com PostgreSQL version: 9.4.0 Operating system: Linux 3.2.0-4-amd64 #1 SMP Debian 3.2.65-1+deb7 Description: the following sequence of commands get a crash if the numer of result rows is lower than gin_fuzzy_search_limit: create table test (t text, ts_vec tsvector); insert into test (t) values (),(),(), ...; -- test data not posted update test set ts_vec = to_tsvector('english', t); create index on test using gin(ts_vec); analyze test; set enable_seqscan = off; set gin_fuzzy_search_limit = 1000; select t from test where ts_vec @@ to_tsquery('english', '...');
Re: BUG #12694: crash if the number of result rows is lower than gin_fuzzy_search_limit
From
Michael Paquier
Date:
On Thu, Jan 29, 2015 at 3:12 AM, <olaf.gw@googlemail.com> wrote: > Bug reference: 12694 > Logged by: Olaf Gawenda > Email address: olaf.gw@googlemail.com > PostgreSQL version: 9.4.0 > Operating system: Linux 3.2.0-4-amd64 #1 SMP Debian 3.2.65-1+deb7 > Description: > > the following sequence of commands get a crash if the numer of result rows > is lower than gin_fuzzy_search_limit: > > create table test (t text, ts_vec tsvector); > > insert into test (t) values (),(),(), ...; -- test data not posted > > update test set ts_vec = to_tsvector('english', t); > > create index on test using gin(ts_vec); > analyze test; > set enable_seqscan = off; > set gin_fuzzy_search_limit = 1000; > > select t from test where ts_vec @@ to_tsquery('english', '...'); This can be reproduced easily with a test case like that: create table aa as select array[(random() * 1000000)::int, (random() * 1000000)::int, (random() * 1000000)::int] as a from generate_series(1,10); create index aai on aa using gin(a); set gin_fuzzy_search_limit = 1; set enable_seqscan = off; select * from aa where a <@ array[1,2]; -- Michael
Re: BUG #12694: crash if the number of result rows is lower than gin_fuzzy_search_limit
From
Heikki Linnakangas
Date:
On 01/29/2015 03:09 PM, Michael Paquier wrote: > On Thu, Jan 29, 2015 at 3:12 AM, <olaf.gw@googlemail.com> wrote: >> Bug reference: 12694 >> Logged by: Olaf Gawenda >> Email address: olaf.gw@googlemail.com >> PostgreSQL version: 9.4.0 >> Operating system: Linux 3.2.0-4-amd64 #1 SMP Debian 3.2.65-1+deb7 >> Description: >> >> the following sequence of commands get a crash if the numer of result rows >> is lower than gin_fuzzy_search_limit: >> >> create table test (t text, ts_vec tsvector); >> >> insert into test (t) values (),(),(), ...; -- test data not posted >> >> update test set ts_vec = to_tsvector('english', t); >> >> create index on test using gin(ts_vec); >> analyze test; >> set enable_seqscan = off; >> set gin_fuzzy_search_limit = 1000; >> >> select t from test where ts_vec @@ to_tsquery('english', '...'); > > This can be reproduced easily with a test case like that: > create table aa as > select array[(random() * 1000000)::int, > (random() * 1000000)::int, > (random() * 1000000)::int] as a > from generate_series(1,10); > create index aai on aa using gin(a); > set gin_fuzzy_search_limit = 1; > set enable_seqscan = off; > select * from aa where a <@ array[1,2]; The problem is in startScan() function: > if (GinFuzzySearchLimit > 0) > { > /* > * If all of keys more than threshold we will try to reduce result, we > * hope (and only hope, for intersection operation of array our > * supposition isn't true), that total result will not more than > * minimal predictNumberResult. > */ > > for (i = 0; i < so->totalentries; i++) > if (so->entries[i]->predictNumberResult <= so->totalentries * GinFuzzySearchLimit) > return; > > for (i = 0; i < so->totalentries; i++) > if (so->entries[i]->predictNumberResult > so->totalentries * GinFuzzySearchLimit) > { > so->entries[i]->predictNumberResult /= so->totalentries; > so->entries[i]->reduceResult = TRUE; > } > } > > for (i = 0; i < so->nkeys; i++) > startScanKey(ginstate, so, so->keys + i); > } If the early return is taken, startScanKey() is not called, and many fields in the GinScanKey struct are left uninitialized. That causes the segfault later. This was not as big a problem before 9.4, because startScanKey() didn't do very much. It just reset a few fields, which in a new scan were reset already by ginNewScanKey(). But it is in fact possible to get an assertion failure on 9.3 too, if the plan contains a re-scan of GIN scan, and gin_fuzzy_search_limit is set. Attached is a script that does it. Not sure why, but I'm not seeing a segfault or assert failure on earlier branches. The plan of the segfaulting query looks identical between 9.2 and 9.3, so perhaps there have been some changes to the executor on how and when it calls rescan. Nevertheless, the code looks just as wrong on earlier branches, so I think it should be fixed all the way to 9.1 where that early return in startScan() was introduced. The fix is simple: make sure that startScanKey() is always called, by getting rid of the early return above. Attached. I'll apply this later today or tomorrow unless someone sees a problem with this. - Heikki
Attachment
Re: BUG #12694: crash if the number of result rows is lower than gin_fuzzy_search_limit
From
Heikki Linnakangas
Date:
On 01/29/2015 04:07 PM, Heikki Linnakangas wrote: > The problem is in startScan() function: > >> if (GinFuzzySearchLimit > 0) >> { >> /* >> * If all of keys more than threshold we will try to reduce result, we >> * hope (and only hope, for intersection operation of array our >> * supposition isn't true), that total result will not more than >> * minimal predictNumberResult. >> */ >> >> for (i = 0; i < so->totalentries; i++) >> if (so->entries[i]->predictNumberResult <= so->totalentries * GinFuzzySearchLimit) >> return; >> >> for (i = 0; i < so->totalentries; i++) >> if (so->entries[i]->predictNumberResult > so->totalentries * GinFuzzySearchLimit) >> { >> so->entries[i]->predictNumberResult /= so->totalentries; >> so->entries[i]->reduceResult = TRUE; >> } >> } >> >> for (i = 0; i < so->nkeys; i++) >> startScanKey(ginstate, so, so->keys + i); >> } > > If the early return is taken, startScanKey() is not called, and many > fields in the GinScanKey struct are left uninitialized. That causes the > segfault later. > > This was not as big a problem before 9.4, because startScanKey() didn't > do very much. It just reset a few fields, which in a new scan were reset > already by ginNewScanKey(). But it is in fact possible to get an > assertion failure on 9.3 too, if the plan contains a re-scan of GIN > scan, and gin_fuzzy_search_limit is set. Attached is a script that does > it. Not sure why, but I'm not seeing a segfault or assert failure on > earlier branches. The plan of the segfaulting query looks identical > between 9.2 and 9.3, so perhaps there have been some changes to the > executor on how and when it calls rescan. Never mind, I had some debugging code in place in my 9.3 branch that caused the assertion failure. Still, it looks broken on 9.3 and below, too. Either that, or GinNewKey() check in gingetbitmap is always true, and the startScanKey() calls were unnecessary. - Heikki
Re: BUG #12694: crash if the number of result rows is lower than gin_fuzzy_search_limit
From
Tom Lane
Date:
Heikki Linnakangas <hlinnakangas@vmware.com> writes: > The fix is simple: make sure that startScanKey() is always called, by > getting rid of the early return above. Attached. I'll apply this later > today or tomorrow unless someone sees a problem with this. Another, even simpler fix would be to just move the startScanKey() call loop to before the "if (GinFuzzySearchLimit > 0)" block. Is there a particular reason why it's a good idea to do things in the current order? It almost looks like a patch application error as it stands. With either fix, I concur that we should back-patch it. It's not at all clear how come older branches don't fail because of this. regards, tom lane
Re: BUG #12694: crash if the number of result rows is lower than gin_fuzzy_search_limit
From
Heikki Linnakangas
Date:
On 01/29/2015 05:26 PM, Tom Lane wrote: > Heikki Linnakangas <hlinnakangas@vmware.com> writes: >> The fix is simple: make sure that startScanKey() is always called, by >> getting rid of the early return above. Attached. I'll apply this later >> today or tomorrow unless someone sees a problem with this. > > Another, even simpler fix would be to just move the startScanKey() > call loop to before the "if (GinFuzzySearchLimit > 0)" block. > Is there a particular reason why it's a good idea to do things in > the current order? It almost looks like a patch application error > as it stands. The code in startScanKey() uses the predictNumberResult estimates, which the loop modifies, so it would change how that behaves. I'm not sure if it would actually be better that way though; it's not clear to me how the fuzzy search limit should interact with the fast scan code. It won't affect correctness either way. The for()-loop sets the so->entries[i]-reduceResult fields, so it makes sense to do it right after the startScanEntry() calls. Otherwise the logic would be "1. do some initialization on entries, 2. initialize keys, 3. do more initialization on entries", which would be a bit strange. <looks at the loop a bit more> Hmm, isn't the if-check in the second loop unnecessary? The first loop checks that so->entries[i]->predictNumberResult > so->totalentries * GinFuzzySearchLimit for all entries. If there are any where that doesn't hold, it returns (which is a bad idea). There's no need to check for the same condition again in the second loop. - Heikki
Re: BUG #12694: crash if the number of result rows is lower than gin_fuzzy_search_limit
From
Tom Lane
Date:
Heikki Linnakangas <hlinnakangas@vmware.com> writes: > On 01/29/2015 05:26 PM, Tom Lane wrote: >> Another, even simpler fix would be to just move the startScanKey() >> call loop to before the "if (GinFuzzySearchLimit > 0)" block. >> Is there a particular reason why it's a good idea to do things in >> the current order? It almost looks like a patch application error >> as it stands. > The code in startScanKey() uses the predictNumberResult estimates, which > the loop modifies, so it would change how that behaves. Oh! Sorry, I'd missed looking at what the qsort comparator did. > I'm not sure if > it would actually be better that way though; it's not clear to me how > the fuzzy search limit should interact with the fast scan code. Yeah, it might be better but it's not very clear what the implications are. Probably shouldn't touch that as part of an emergency bug fix. regards, tom lane
Re: BUG #12694: crash if the number of result rows is lower than gin_fuzzy_search_limit
From
Heikki Linnakangas
Date:
On 01/29/2015 06:04 PM, Tom Lane wrote: > Heikki Linnakangas <hlinnakangas@vmware.com> writes: >> On 01/29/2015 05:26 PM, Tom Lane wrote: >> I'm not sure if >> it would actually be better that way though; it's not clear to me how >> the fuzzy search limit should interact with the fast scan code. > > Yeah, it might be better but it's not very clear what the implications > are. Probably shouldn't touch that as part of an emergency bug fix. Committed and backpatched a minimal fix. For master and 9.4, I'm thinking of applying the attached. It makes it clear that startScan() is not used to re-start a scan with existing scan keys, but is always called on a newly initialized scan keys. It also plugs the obvious leaking of the arrays. I'll look at the other memory leaks separately, but this seems appropriate for 9.4. - Heikki
Attachment
Re: BUG #12694: crash if the number of result rows is lower than gin_fuzzy_search_limit
From
Tom Lane
Date:
Heikki Linnakangas <hlinnakangas@vmware.com> writes: > For master and 9.4, I'm thinking of applying the attached. It makes it > clear that startScan() is not used to re-start a scan with existing scan > keys, but is always called on a newly initialized scan keys. Looks reasonable to me, but should ginFreeScanKeys() null out the pointers after freeing them, to be sure we find any incorrect accesses? It might not be worth the trouble; but if you have any doubts at all about the order of operations this seems like a good safety feature. Also, in the department of nitpicks, I'd do this: { IndexScanDesc scan = (IndexScanDesc) PG_GETARG_POINTER(0); + GinScanOpaque so = (GinScanOpaque) scan->opaque; TIDBitmap *tbm = (TIDBitmap *) PG_GETARG_POINTER(1); int64 ntids; more like this: { IndexScanDesc scan = (IndexScanDesc) PG_GETARG_POINTER(0); TIDBitmap *tbm = (TIDBitmap *) PG_GETARG_POINTER(1); + GinScanOpaque so = (GinScanOpaque) scan->opaque; int64 ntids; I think of the PG_GETARG calls as being an ugly stepchild of a proper function header declaration, and as such, they should come first unless there is an unavoidable reason not to. regards, tom lane
Re: BUG #12694: crash if the number of result rows is lower than gin_fuzzy_search_limit
From
Heikki Linnakangas
Date:
On 01/29/2015 08:59 PM, Tom Lane wrote: > Heikki Linnakangas <hlinnakangas@vmware.com> writes: >> For master and 9.4, I'm thinking of applying the attached. It makes it >> clear that startScan() is not used to re-start a scan with existing scan >> keys, but is always called on a newly initialized scan keys. > > Looks reasonable to me, but should ginFreeScanKeys() null out the pointers > after freeing them, to be sure we find any incorrect accesses? It might > not be worth the trouble; but if you have any doubts at all about the > order of operations this seems like a good safety feature. Nah, I'm not worried about that. ginFreeScanKeys() frees the whole 'keys' array, so we'd have bigger problems if there was a reference-after-free. > Also, in the department of nitpicks, I'd do this: > > { > IndexScanDesc scan = (IndexScanDesc) PG_GETARG_POINTER(0); > + GinScanOpaque so = (GinScanOpaque) scan->opaque; > TIDBitmap *tbm = (TIDBitmap *) PG_GETARG_POINTER(1); > int64 ntids; > > more like this: > > { > IndexScanDesc scan = (IndexScanDesc) PG_GETARG_POINTER(0); > TIDBitmap *tbm = (TIDBitmap *) PG_GETARG_POINTER(1); > + GinScanOpaque so = (GinScanOpaque) scan->opaque; > int64 ntids; > > I think of the PG_GETARG calls as being an ugly stepchild of a proper > function header declaration, and as such, they should come first unless > there is an unavoidable reason not to. Ok, committed with that fix. - Heikki