Thread: Review: B-Tree emulation for GIN
Hi Teodor Sigaev,
I am getting server crash in contrib regression. May be I am doing something wrong here. Regression diff is attached.
BTW these queries work fine outside the regression.
--
Ibrar Ahmed
EnterpriseDB http://www.enterprisedb.com
I am getting server crash in contrib regression. May be I am doing something wrong here. Regression diff is attached.
BTW these queries work fine outside the regression.
--
Ibrar Ahmed
EnterpriseDB http://www.enterprisedb.com
Attachment
will see, may be it's needed to update the patch Ibrar Ahmed wrote: > Hi Teodor Sigaev, > > I am getting server crash in contrib regression. May be I am doing > something wrong here. Regression diff is attached. > > BTW these queries work fine outside the regression. > > > -- > Ibrar Ahmed > EnterpriseDB http://www.enterprisedb.com > > > ------------------------------------------------------------------------ > > -- Teodor Sigaev E-mail: teodor@sigaev.ru WWW: http://www.sigaev.ru/
Updated patch. Ibrar Ahmed wrote: > Hi Teodor Sigaev, > > I am getting server crash in contrib regression. May be I am doing > something wrong here. Regression diff is attached. > > BTW these queries work fine outside the regression. > > > -- > Ibrar Ahmed > EnterpriseDB http://www.enterprisedb.com > > > ------------------------------------------------------------------------ > > -- Teodor Sigaev E-mail: teodor@sigaev.ru WWW: http://www.sigaev.ru/
Attachment
Thanks,
--
Ibrar Ahmed
EnterpriseDB http://www.enterprisedb.com
On Fri, Dec 19, 2008 at 3:26 PM, Teodor Sigaev <teodor@sigaev.ru> wrote:
Updated patch.
Ibrar Ahmed wrote:------------------------------------------------------------------------Hi Teodor Sigaev,
I am getting server crash in contrib regression. May be I am doing something wrong here. Regression diff is attached.
BTW these queries work fine outside the regression.
--
Ibrar Ahmed
EnterpriseDB http://www.enterprisedb.com
--
Ibrar Ahmed
EnterpriseDB http://www.enterprisedb.com
On Fri, 2008-12-19 at 13:26 +0300, Teodor Sigaev wrote: > Updated patch. I have merged this with HEAD, written a brief document (which is mostly a copy of the btree-gist page), added the module to the contrib Makefile, and made some very minor changes. Patch attached. A couple comments: 1. Right now, to implement "less than" you need to start at the beginning of the index and scan until you reach the supplied query datum. This is because GIN doesn't support backwards scans ( http://archives.postgresql.org/pgsql-hackers/2008-07/msg01146.php ). Unfortunately, that means numeric is not supportable for btree-gin because there is no left-most value from which to start the scan. Do you see an easy workaround to support numeric? 2. Why do you create a shell type "int32" from btree_gin.sql? I tried doing a search-and-replace to use "int4" instead, and it seems to work fine (and eliminates the warnings). I left it with int32 in my version of the patch because I thought you may have some reason for using it. Regards, Jeff Davis
Attachment
On Sun, 2008-12-28 at 23:41 -0800, Jeff Davis wrote: > 2. Why do you create a shell type "int32" from btree_gin.sql? I tried > doing a search-and-replace to use "int4" instead, and it seems to work > fine (and eliminates the warnings). I left it with int32 in my version > of the patch because I thought you may have some reason for using it. > Attached an updated patch with this change. Regards, Jeff Davis
Attachment
> I have merged this with HEAD, written a brief document (which is mostly > a copy of the btree-gist page), added the module to the contrib > Makefile, and made some very minor changes. Patch attached. Thank you > > A couple comments: > > 1. Right now, to implement "less than" you need to start at the > beginning of the index and scan until you reach the supplied query > datum. This is because GIN doesn't support backwards scans > ( http://archives.postgresql.org/pgsql-hackers/2008-07/msg01146.php ). > > Unfortunately, that means numeric is not supportable for btree-gin > because there is no left-most value from which to start the scan. Do you > see an easy workaround to support numeric? Hmm, let we use minimal varlena struct (with size equal to VARHDRSZ) as left-most (and fake) value. It is never used for any actual argument except compare function, so, new compare function is provided. New version of patch is attached and it based on on your patch (btree-gin.20090111.patch.gz). > 2. Why do you create a shell type "int32" from btree_gin.sql? I tried > doing a search-and-replace to use "int4" instead, and it seems to work > fine (and eliminates the warnings). I left it with int32 in my version > of the patch because I thought you may have some reason for using it. My mistake, thank you for fix -- Teodor Sigaev E-mail: teodor@sigaev.ru WWW: http://www.sigaev.ru/
Attachment
On Fri, 2009-01-16 at 17:42 +0300, Teodor Sigaev wrote: > > Unfortunately, that means numeric is not supportable for btree-gin > > because there is no left-most value from which to start the scan. Do you > > see an easy workaround to support numeric? > Hmm, let we use minimal varlena struct (with size equal to VARHDRSZ) as > left-most (and fake) value. It is never used for any actual argument except > compare function, so, new compare function is provided. New version of patch is > attached and it based on on your patch (btree-gin.20090111.patch.gz). > Looks good to me. I updated the CommitFest wiki to point to this patch. I like the fact that this patch does not modify the numeric ADT. It still relies on the fact that the numeric type will never make use of the minimal varlena struct, however. I bring this up in case someone sees it as a problem. Thanks,Jeff Davis
Jeff Davis escribió: > I like the fact that this patch does not modify the numeric ADT. It > still relies on the fact that the numeric type will never make use of > the minimal varlena struct, however. I bring this up in case someone > sees it as a problem. Greg Stark was working on a patch to make certain values use the short representation. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
>> still relies on the fact that the numeric type will never make use of >> the minimal varlena struct, however. I bring this up in case someone >> sees it as a problem. > > Greg Stark was working on a patch to make certain values use the short > representation. Fake value contains only VARHDRSZ bytes. -- Teodor Sigaev E-mail: teodor@sigaev.ru WWW: http://www.sigaev.ru/
Jeff Davis <pgsql@j-davis.com> writes: > I like the fact that this patch does not modify the numeric ADT. It > still relies on the fact that the numeric type will never make use of > the minimal varlena struct, however. I bring this up in case someone > sees it as a problem. I'm pretty certain I recall Greg Stark recommending that we adopt something like that as the standard numeric representation of zero. It didn't get done yet, but it might happen someday. regards, tom lane
On Mon, 2009-01-19 at 11:35 -0500, Tom Lane wrote: > Jeff Davis <pgsql@j-davis.com> writes: > > I like the fact that this patch does not modify the numeric ADT. It > > still relies on the fact that the numeric type will never make use of > > the minimal varlena struct, however. I bring this up in case someone > > sees it as a problem. > > I'm pretty certain I recall Greg Stark recommending that we adopt > something like that as the standard numeric representation of zero. > It didn't get done yet, but it might happen someday. > Then we should use the previous version of the patch here: http://archives.postgresql.org/message-id/1231709713.25019.129.camel@jdavis Was there any talk of supporting a +/- infinity in numeric? If we did that, it would allow numeric to be supported for btree-gin. Regards,Jeff Davis
> I'm pretty certain I recall Greg Stark recommending that we adopt > something like that as the standard numeric representation of zero. > It didn't get done yet, but it might happen someday. Changes: - use NULL as left-most value. It's safe because NULL numeric value cannot be an argument for any function except gin_numeric_cmp and it cannot be returned in regular SQL query. - fix uninstall script for support numeric type. -- Teodor Sigaev E-mail: teodor@sigaev.ru WWW: http://www.sigaev.ru/
Attachment
On Mon, 2009-01-19 at 20:15 +0300, Teodor Sigaev wrote: > Changes: > - use NULL as left-most value. It's safe because NULL numeric value > cannot be an argument for any function except gin_numeric_cmp and it > cannot be returned in regular SQL query. gin_numeric_cmp() can be called from regular SQL. I missed this before, but that function will segfault if you call gin_numeric_cmp(NULL, 1) (in v0.7 at least). I know you mean a C NULL, not a SQL NULL, but it reminded me to test SQL NULL. And how does GIN handle SQL NULL values in the column? Does it index them at all, or just ignore them? Regards,Jeff Davis
> gin_numeric_cmp() can be called from regular SQL. I missed this before, > but that function will segfault if you call gin_numeric_cmp(NULL, 1) (in > v0.7 at least). Fixed, gin_numeric_cmp is marked as strict. > And how does GIN handle SQL NULL values in the column? Does it index > them at all, or just ignore them? SQL NULL: GIN doesn't support it (amindexnulls/amsearchnulls == false) C NULL: NULL-numeric could be returned only by gin_extract_query_numeric which cannot be called by user directly because of internal type of argument. GIN doesn't do anything with values returned by gin_extract_query_numeric except providing they as an argument for comparing functions. -- Teodor Sigaev E-mail: teodor@sigaev.ru WWW: http://www.sigaev.ru/
Attachment
On Mon, 2009-01-19 at 21:41 +0300, Teodor Sigaev wrote: > > And how does GIN handle SQL NULL values in the column? Does it index > > them at all, or just ignore them? > SQL NULL: GIN doesn't support it (amindexnulls/amsearchnulls == false) > C NULL: NULL-numeric could be returned only by gin_extract_query_numeric which > cannot be called by user directly because of internal type of argument. > GIN doesn't do anything with values returned by gin_extract_query_numeric except > providing they as an argument for comparing functions. Ok, I understand now. I will look at this later. Regards,Jeff Davis
On Mon, 2009-01-19 at 21:41 +0300, Teodor Sigaev wrote: > > gin_numeric_cmp() can be called from regular SQL. I missed this before, > > but that function will segfault if you call gin_numeric_cmp(NULL, 1) (in > > v0.7 at least). > > Fixed, gin_numeric_cmp is marked as strict. > > > And how does GIN handle SQL NULL values in the column? Does it index > > them at all, or just ignore them? > SQL NULL: GIN doesn't support it (amindexnulls/amsearchnulls == false) > C NULL: NULL-numeric could be returned only by gin_extract_query_numeric which > cannot be called by user directly because of internal type of argument. > GIN doesn't do anything with values returned by gin_extract_query_numeric except > providing they as an argument for comparing functions. Ok, looks good. I updated the wiki to show this as the latest version of the patch. Thanks,Jeff Davis
Looked at this a bit ... do you think it's really a good idea to remove the strategy number argument of comparePartial? The argument given in the docs for it is that it might be needed to determine when to end the scan, and that still seems plausible to me. The description of extractQuery's extra_data parameter seems confusing too. AFAICS it is incorrect, or at least misleading, to describe it as void ** extra_data[]; it is really void ***extra_data, because there is only one object there not an array. regards, tom lane
> Looked at this a bit ... do you think it's really a good idea to remove > the strategy number argument of comparePartial? The argument given in > the docs for it is that it might be needed to determine when to end the > scan, and that still seems plausible to me. Strategy number is not removed, it's replaced by pointer to opclass-specific data on per key basis. Actually, partial match feature is new for 8.4 and there is no any compatibility problem. None of existing opclasses (except contrib/btree_gin) uses strategy number in comparePartial. > The description of extractQuery's extra_data parameter seems confusing > too. AFAICS it is incorrect, or at least misleading, to describe it as > void ** extra_data[]; it is really void ***extra_data, because there is > only one object there not an array. It's really array, see fillScanKey() in ginscan.c, line 61 of patched file. extractQuery could provide data for each returned key. -- Teodor Sigaev E-mail: teodor@sigaev.ru WWW: http://www.sigaev.ru/
On Wed, 2009-02-04 at 20:22 +0300, Teodor Sigaev wrote: > > The description of extractQuery's extra_data parameter seems confusing > > too. AFAICS it is incorrect, or at least misleading, to describe it as > > void ** extra_data[]; it is really void ***extra_data, because there is > > only one object there not an array. > > It's really array, see fillScanKey() in ginscan.c, line 61 of patched file. > extractQuery could provide data for each returned key. > If I understand correctly, the extra_data parameter to extractQuery() should mean "a pointer to an array of void*". "void **extra_data[]" might be misinterpreted as "an array of void**". Would "void *(*extra_data)[]" be more descriptive? I'm not an expert on style questions like this, but it seems to match the parameter lists for comparePartial() and consistent(). "void ***extra_data" seems reasonable, too. Regards,Jeff Davis
On Mon, 2009-01-19 at 21:41 +0300, Teodor Sigaev wrote: > > gin_numeric_cmp() can be called from regular SQL. I missed this before, > > but that function will segfault if you call gin_numeric_cmp(NULL, 1) (in > > v0.7 at least). > > Fixed, gin_numeric_cmp is marked as strict. > > > And how does GIN handle SQL NULL values in the column? Does it index > > them at all, or just ignore them? > SQL NULL: GIN doesn't support it (amindexnulls/amsearchnulls == false) > C NULL: NULL-numeric could be returned only by gin_extract_query_numeric which > cannot be called by user directly because of internal type of argument. > GIN doesn't do anything with values returned by gin_extract_query_numeric except > providing they as an argument for comparing functions. Looking through the code again, gin_compare_prefix_##type looks a little confusing. Is there a reason for using: (data->strategy == BTLessStrategyNumber || data->strategy == BTLessEqualStrategyNumber ) ? PointerGetDatum(data->datum) : a rather than just using: PointerGetDatum(data->datum) Also, it might be a little less confusing if you used two separate variables rather than using "res" for two purposes. Regards,Jeff Davis
> Looking through the code again, gin_compare_prefix_##type looks a little > confusing. > > Is there a reason for using: > (data->strategy == BTLessStrategyNumber || > data->strategy == BTLessEqualStrategyNumber ) ? > PointerGetDatum(data->datum) : a > rather than just using: > PointerGetDatum(data->datum) Added comments: /* * Datum a is a value from extract_query method and for BTLess* * strategy it is a left-most value. So, use original datum from * QueryInfo to decide stop scanning on not. Datum b is always * from index. */ > Also, it might be a little less confusing if you used two separate > variables rather than using "res" for two purposes. done -- Teodor Sigaev E-mail: teodor@sigaev.ru WWW: http://www.sigaev.ru/
Attachment
The GIN_EXTRACT_VALUE macro returns a pointer to a static 'entries' variable. That doesn't seem safe. Is it really never possible to have to two GIN searches in a plan, both calling and usingthe value returned by extractValue simultaneously? In any case that seems like a pretty weak assumption. You might want to declare extra_data as just "void *", instead of an array of pointers. The data type implementation might want to store something there that's not per-key, but applies to the whole query. I see that you're passing it to comparePartial, but that seems to be just future-proofing. What kind of a data type are you envisioning that would make use of it? It seems that you could pass the same information in the partial key Datum itself that extractQuery returns. You're currently using it as a way to avoid some palloc's in gin_tsquery_consistent(). That seems like a pretty dirty hack. I doubt there's any meaningful performance advantage from that, but if there is, I think you could use a statically allocated array instead. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
> The GIN_EXTRACT_VALUE macro returns a pointer to a static 'entries' > variable. That doesn't seem safe. Is it really never possible to have to > two GIN searches in a plan, both calling and using the value returned > by extractValue simultaneously? In any case that seems like a pretty > weak assumption. Fixed. > You might want to declare extra_data as just "void *", instead of an > array of pointers. The data type implementation might want to store > something there that's not per-key, but applies to the whole query. I > see that you're passing it to comparePartial, but that seems to be just > future-proofing. What kind of a data type are you envisioning that would wildspeed module (http://www.sigaev.ru/cvsweb/cvsweb.cgi/wildspeed/) - for each key from it's needed to store some info. Right now it's coded directly in Datum, but it looks ugly (at least for me). It's possible to clarify interface by introducing new type: typedef void* OpaquePtr; Then, prototypes will be: extractQuery(..., OpaquePtr* extra_data[]) consistent(...., OpaquePtr extra_data[]) comparePartial(..., OpaquePtr extra_data) Or another option: partial match feature is new for 8.4, so we could change interface: typedef struct KeyData { bool pmatch, void *extra_data; } KeyData; Datum *extractQuery(Datum query, int32 *nkeys, StrategyNumber n, KeyData* data[]) bool consistent(bool check[], StrategyNumber n, Datum query, bool *recheck, KeyData data[]) comparePartial(Datum partial_key, Datum key, KeyData *data); > make use of it? It seems that you could pass the same information in the > partial key Datum itself that extractQuery returns. You're currently > using it as a way to avoid some palloc's in gin_tsquery_consistent(). > That seems like a pretty dirty hack. I doubt there's any meaningful > performance advantage from that, but if there is, I think you could use > a statically allocated array instead. It's easy to un-dirty that hack, but before I'd like to see your comments about thoughts above. -- Teodor Sigaev E-mail: teodor@sigaev.ru WWW: http://www.sigaev.ru/
Attachment
Teodor Sigaev wrote: >> You might want to declare extra_data as just "void *", instead of an >> array of pointers. The data type implementation might want to store >> something there that's not per-key, but applies to the whole query. I >> see that you're passing it to comparePartial, but that seems to be >> just future-proofing. What kind of a data type are you envisioning >> that would > > wildspeed module (http://www.sigaev.ru/cvsweb/cvsweb.cgi/wildspeed/) - > for each key from it's needed to store some info. Right now it's coded > directly in Datum, but it looks ugly (at least for me). Ok, I guess it's best to leave it as you had in the patch then. >> make use of it? It seems that you could pass the same information in >> the partial key Datum itself that extractQuery returns. You're >> currently using it as a way to avoid some palloc's in >> gin_tsquery_consistent(). That seems like a pretty dirty hack. I doubt >> there's any meaningful performance advantage from that, but if there >> is, I think you could use a statically allocated array instead. > > It's easy to un-dirty that hack, but before I'd like to see your > comments about thoughts above. Yeah, please revert that hack. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
>> It's easy to un-dirty that hack, but before I'd like to see your >> comments about thoughts above. > > Yeah, please revert that hack. Done. Also, I changed void *extra_data to Pointer extra_data and corresponding **extra_data and ***extra_data to simplify understanding. -- Teodor Sigaev E-mail: teodor@sigaev.ru WWW: http://www.sigaev.ru/
Attachment
[ back to this patch... ] Teodor Sigaev <teodor@sigaev.ru> writes: >> Looked at this a bit ... do you think it's really a good idea to remove >> the strategy number argument of comparePartial? The argument given in >> the docs for it is that it might be needed to determine when to end the >> scan, and that still seems plausible to me. > Strategy number is not removed, it's replaced by pointer to opclass-specific > data on per key basis. Actually, partial match feature is new for 8.4 and there > is no any compatibility problem. None of existing opclasses (except > contrib/btree_gin) uses strategy number in comparePartial. I'm still not real happy about omitting the strategy number from comparePartial's arguments. Yes, the extractQuery function can make up for that if it has to, but why should it have to? It seems to me to be inconsistent that the consistent function gets the strategy number but comparePartial doesn't. I know that there's not a backwards compatibility issue here, but it still seems to me that you're forcing opclasses to do things the hard way (allocating an extra_data entry) when they might only need information that's already part of the call signature for existing functions. regards, tom lane
BTW ... while I'm thinking about it: it seems to me to be a serious error that the consistent() function isn't given nkeys so that it can know the length of the arrays it's being handed. I suppose it's possible for it to re-deduce nkeys by examining the query datum, but that could be quite expensive; and it's certainly error-prone to have to keep extractQuery() and consistent() in sync on this. Since we are adding arguments to consistent() anyway for 8.4, I propose that its signature ought to be bool consistent(bool check[], StrategyNumber n, Datum query, int32 nkeys, bool *recheck, Pointer extra_data[]) where the first three arguments are what existed in 8.3. regards, tom lane
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes: Tom> BTW ... while I'm thinking about it: it seems to me to be aTom> serious error that the consistent() function isn't givennkeysTom> so that it can know the length of the arrays it's being handed.Tom> I suppose it's possible for it to re-deducenkeys by examiningTom> the query datum, but that could be quite expensive; and it'sTom> certainly error-prone tohave to keep extractQuery() andTom> consistent() in sync on this. I ran into exactly this problem in hstore; the omission of nkeys there is seriously annoying. -- Andrew (irc:RhodiumToad)
Andrew Gierth <andrew@tao11.riddles.org.uk> writes: > "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes: > Tom> BTW ... while I'm thinking about it: it seems to me to be a > Tom> serious error that the consistent() function isn't given nkeys > Tom> so that it can know the length of the arrays it's being handed. > I ran into exactly this problem in hstore; the omission of nkeys there > is seriously annoying. It's de-omitted ... regards, tom lane
Teodor Sigaev <teodor@sigaev.ru> writes: > [ btree_gin 0.12 ] Committed with some editorializations. There are still a few loose ends: * the question about zero-key queries that I mentioned before * After this new look at the code I think that matchPartialInPendingList is completely broken. Surely it's an infinite loop if the comparePartial function returns -1. I also don't understand why it stops short of examining the tuple at maxoff, or for that matter why it's doing any searching at all when the code that's calling it seems to be implementing a binary search. I think that part of the code needs more testing and commenting. * In keyGetItem(), it's not at all clear what the assumptions are for the contents of the entryRes[] array and for the interaction with a LossyPage entry. Is it intentional that you run through the array comparing to a LossyPage key before deciding to exit the loop? It seems like that might be affecting the behavior at the next call, but if so you're making some undocumented assumptions about the way comparison of a lossy-page pointer is going to behave. I thought it'd be cleaner to move the ItemPointerIsLossyPage check up to before that loop (right after the ItemPointerIsMax test) but the more I look at it the less sure I am if that would break things. That code needs more commenting too. * I'd also like to come to some agreement about getting rid of the fail-on-NULL-scankey problem in newScanKey(). As I noted in the comment there, we could make that work cleanly if we are willing to assume that all GIN-indexable operators are strict. We already assume the same for hash and btree operators, so it doesn't seem like a big problem to do this, but I wonder if there are any objections. regards, tom lane
On Wed, 2009-03-25 at 19:31 -0400, Tom Lane wrote: > * I'd also like to come to some agreement about getting rid of the > fail-on-NULL-scankey problem in newScanKey(). As I noted in the > comment there, we could make that work cleanly if we are willing to > assume that all GIN-indexable operators are strict. We already assume > the same for hash and btree operators, so it doesn't seem like a big > problem to do this, but I wonder if there are any objections. "IS NULL" is indexable in a btree and non-strict, so there is at least some precedent. Also, if extractQuery is non-strict, shouldn't we call it and see if it returns some useful keys? If so, I don't see a reason to assume that nothing matches. If the opclass author wants a search against NULL to mean "matches nothing", they can just make extractQuery non-strict and return -1. However, if extractQuery is strict or returns NULL, I'm fine with either an error or assuming "nothing matches". I don't see a functionality difference either way, so we should just document whatever seems to make the most sense. Regards,Jeff Davis
Jeff Davis <pgsql@j-davis.com> writes: > Also, if extractQuery is non-strict, shouldn't we call it and see if it > returns some useful keys? Perhaps. One risk factor for approaching it that way is that there are probably a lot of opclasses out there that haven't bothered to mark these functions strict, since it's never mattered before. (A handy example is that the brand new btree_gin opclasses did not bother, as submitted; though in a fit of paranoia I made them do so before committing.) If the extractQuery function isn't actually guarding against this, you'll get a crash. That's not a showstopper reason not to change, of course, but it does mean that I'd like to see an actual use case for a non-strict GIN index operator before taking any risk. Note that IS NULL isn't an operator, so even if we were to try to support it in GIN, that would be a different code path (just as it is in btree). regards, tom lane
> * After this new look at the code I think that matchPartialInPendingList > is completely broken. Surely it's an infinite loop if the > comparePartial function returns -1. I also don't understand why it fixed > stops short of examining the tuple at maxoff, or for that matter why > it's doing any searching at all when the code that's calling it seems > to be implementing a binary search. I think that part of the code > needs more testing and commenting. Tuples of rows are stored in pending list in the same order as in entry tree (order by (attrnum, Datum)). So, it's possible to use binary search. By test's binary search presents about 5-15% of performance. But heap's row could be one page or on several pages. If page contains several rows, each row is checked separately and tuples of one heap row are placed from pendingPosition->firstOffset up to pendingPosition->lastOffset. The page is marked by GIN_LIST_FULLROW flag if it contains all tuples of row(s) or it's a last page in page's chain containing single heap row. scanGetCandidate() always returns offset range for tuples of one heap row. Although it may return NOT all tuples of row, if so then collectDatumForItem() will ask scanGetCandidate() about next portion of tuples of that heap row. It's safe because tuple's of one heap row are continuosly stored in pending list. > * In keyGetItem(), it's not at all clear what the assumptions are for > the contents of the entryRes[] array and for the interaction with a > LossyPage entry. Is it intentional that you run through the array > comparing to a LossyPage key before deciding to exit the loop? It > seems like that might be affecting the behavior at the next call, > but if so you're making some undocumented assumptions about the way > comparison of a lossy-page pointer is going to behave. I thought it'd > be cleaner to move the ItemPointerIsLossyPage check up to before that > loop (right after the ItemPointerIsMax test) but the more I look at it > the less sure I am if that would break things. That code needs more > commenting too. entryRes[] array is used for two purposes: - as an argument for consistentFn - to store information about entries which was a minimal on previous call/loop. So, only that entries should be renewed by entryGetItem() call. Lossy page is encoded as ItemPointer with greatest offset (0xffff), so keyGetItem() could return items from page and then return the same page as lossy. This is not a problem because of usage of TIDBitmap to callect all results. So, algorithm of keyGetItem could be described as: 1 renew all entries which is equal to previous result (entryGetItems returns them in increasing order) 2 find minimal entries 3 fill up entryRes[] to new values and check lossy or call consistentFn > * I'd also like to come to some agreement about getting rid of the > fail-on-NULL-scankey problem in newScanKey(). As I noted in the > comment there, we could make that work cleanly if we are willing to > assume that all GIN-indexable operators are strict. We already assume > the same for hash and btree operators, so it doesn't seem like a big > problem to do this, but I wonder if there are any objections. Agree. I changed the GIN code, but don't know where is other places to change to fixate this agreement. -- Teodor Sigaev E-mail: teodor@sigaev.ru WWW: http://www.sigaev.ru/
Attachment
Teodor Sigaev <teodor@sigaev.ru> writes: > [ fixes for the GIN stuff I complained about before ] This all looks good to me, please apply. One little suggestion: ! /* ! * entryRes array is used for: ! * - as an argument for consistentFn ! * - entry->curItem with corresponding key->entryRes[i] == false are greater ! * than key->curItem, so next loop/call they should be renewed ! * by entryGetItem(). So, we need to set up an array before ! * checking of lossy page. ! */ pgindent will reflow this comment block, since it's not at the left margin. To keep the formatting looking good you'll need to add /*-------- >> * I'd also like to come to some agreement about getting rid of the >> fail-on-NULL-scankey problem in newScanKey(). As I noted in the >> comment there, we could make that work cleanly if we are willing to >> assume that all GIN-indexable operators are strict. We already assume >> the same for hash and btree operators, so it doesn't seem like a big >> problem to do this, but I wonder if there are any objections. > Agree. I changed the GIN code, but don't know where is other places > to change to fixate this agreement. I don't think there is anything else that needs to be done for that. regards, tom lane