Thread: Supporting = operator in gin/gist_trgm_ops
Hello, A french user recently complained that with an index created using gin_trgm_ops (or gist_trgm_ops), you can use the index with a clause like col LIKE 'something' but not col = 'something' even though both clauses are technically identical. That's clearly not a high priority thing to support, but looking at the code it seems to me that this could be achieved quite simply: just adding a new operator = in the opclass, with an operator strategy number that falls back doing exactly what LikeStrategyNumber is doing and that's it. There shouldn't be any wrong results, even using wildcards as the recheck will remove any incorrect one. Did I miss something? And if not would such a patch be welcome?
Julien Rouhaud <rjuju123@gmail.com> writes: > A french user recently complained that with an index created using > gin_trgm_ops (or gist_trgm_ops), you can use the index with a clause > like > col LIKE 'something' > but not > col = 'something' Huh, I'd supposed we did that already. > even though both clauses are technically identical. That's clearly > not a high priority thing to support, but looking at the code it seems > to me that this could be achieved quite simply: just adding a new > operator = in the opclass, with an operator strategy number that falls > back doing exactly what LikeStrategyNumber is doing and that's it. > There shouldn't be any wrong results, even using wildcards as the > recheck will remove any incorrect one. I think you may be overoptimistic about being able to use the identical code path without regard for LIKE wildcards; but certainly it should be possible to do this with not a lot of new code. +1. regards, tom lane
On Mon, Oct 26, 2020 at 5:03 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Julien Rouhaud <rjuju123@gmail.com> writes: > > A french user recently complained that with an index created using > > gin_trgm_ops (or gist_trgm_ops), you can use the index with a clause > > like > > col LIKE 'something' > > but not > > col = 'something' > > Huh, I'd supposed we did that already. > > > even though both clauses are technically identical. That's clearly > > not a high priority thing to support, but looking at the code it seems > > to me that this could be achieved quite simply: just adding a new > > operator = in the opclass, with an operator strategy number that falls > > back doing exactly what LikeStrategyNumber is doing and that's it. > > There shouldn't be any wrong results, even using wildcards as the > > recheck will remove any incorrect one. > > I think you may be overoptimistic about being able to use the identical > code path without regard for LIKE wildcards; but certainly it should be > possible to do this with not a lot of new code. +1. Well, that's what I was thinking too, but I tried all the possible wildcard combinations I could think of and I couldn't find any case yielding wrong results. As far as I can see the index scans return at least all the required rows, and all extraneous rows are correctly removed either by heap recheck or index recheck. I'm attaching a patch POC pach with regression tests covering those combinations. I also found a typo in the 1.4--1.5 pg_trgm upgrade script, so I'm also attaching a patch for that.
Attachment
On Mon, Oct 26, 2020 at 12:02 PM Julien Rouhaud <rjuju123@gmail.com> wrote: > > On Mon, Oct 26, 2020 at 5:03 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > > > Julien Rouhaud <rjuju123@gmail.com> writes: > > > A french user recently complained that with an index created using > > > gin_trgm_ops (or gist_trgm_ops), you can use the index with a clause > > > like > > > col LIKE 'something' > > > but not > > > col = 'something' > > > > Huh, I'd supposed we did that already. > > > > > even though both clauses are technically identical. That's clearly > > > not a high priority thing to support, but looking at the code it seems > > > to me that this could be achieved quite simply: just adding a new > > > operator = in the opclass, with an operator strategy number that falls > > > back doing exactly what LikeStrategyNumber is doing and that's it. > > > There shouldn't be any wrong results, even using wildcards as the > > > recheck will remove any incorrect one. > > > > I think you may be overoptimistic about being able to use the identical > > code path without regard for LIKE wildcards; but certainly it should be > > possible to do this with not a lot of new code. +1. > > Well, that's what I was thinking too, but I tried all the possible > wildcard combinations I could think of and I couldn't find any case > yielding wrong results. As far as I can see the index scans return at > least all the required rows, and all extraneous rows are correctly > removed either by heap recheck or index recheck. > > I'm attaching a patch POC pach with regression tests covering those > combinations. I also found a typo in the 1.4--1.5 pg_trgm upgrade > script, so I'm also attaching a patch for that. Oops, I forgot to git-add the 1.5--1.6.sql upgrade script in the previous patch.
Attachment
Julien Rouhaud <rjuju123@gmail.com> writes: > On Mon, Oct 26, 2020 at 5:03 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: >> I think you may be overoptimistic about being able to use the identical >> code path without regard for LIKE wildcards; but certainly it should be >> possible to do this with not a lot of new code. +1. > Well, that's what I was thinking too, but I tried all the possible > wildcard combinations I could think of and I couldn't find any case > yielding wrong results. As far as I can see the index scans return at > least all the required rows, and all extraneous rows are correctly > removed either by heap recheck or index recheck. But "does it get the right answers" isn't the only figure of merit. If the index scan visits far more rows than necessary, that's bad. Maybe it's OK given that we only make trigrams from alphanumerics, but I'm not quite sure. regards, tom lane
On Mon, Oct 26, 2020 at 12:19 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Julien Rouhaud <rjuju123@gmail.com> writes: > > On Mon, Oct 26, 2020 at 5:03 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > >> I think you may be overoptimistic about being able to use the identical > >> code path without regard for LIKE wildcards; but certainly it should be > >> possible to do this with not a lot of new code. +1. > > > Well, that's what I was thinking too, but I tried all the possible > > wildcard combinations I could think of and I couldn't find any case > > yielding wrong results. As far as I can see the index scans return at > > least all the required rows, and all extraneous rows are correctly > > removed either by heap recheck or index recheck. > > But "does it get the right answers" isn't the only figure of merit. > If the index scan visits far more rows than necessary, that's bad. > Maybe it's OK given that we only make trigrams from alphanumerics, > but I'm not quite sure. Ah, yes this might lead to bad performance if the "fake wildcard" matches too many rows, but this shouldn't be a very common use case, and the only alternative for that might be to create trigrams for non alphanumerics characters. I didn't try to do that because it would mean meaningful overhead for mainstream usage of pg_trgm, and would also mean on-disk format break. In my opinion supporting = should be a best effort, especially for such corner cases.
On Mon, Oct 26, 2020 at 7:38 AM Julien Rouhaud <rjuju123@gmail.com> wrote: > Ah, yes this might lead to bad performance if the "fake wildcard" > matches too many rows, but this shouldn't be a very common use case, > and the only alternative for that might be to create trigrams for non > alphanumerics characters. I didn't try to do that because it would > mean meaningful overhead for mainstream usage of pg_trgm, and would > also mean on-disk format break. In my opinion supporting = should be > a best effort, especially for such corner cases. It would be more efficient to generate trigrams for equal operator using generate_trgm() instead of generate_wildcard_trgm(). It some cases it would generate more trigrams. For instance generate_trgm() would generate '__a', '_ab', 'ab_' for '%ab%' while generate_wildcard_trgm() would generate nothing. Also I wonder how our costing would work if there are multiple indices of the same column. We should clearly prefer btree than pg_trgm gist/gin, and I believe our costing provides this. But we also should prefer btree_gist/btree_gin than pg_trgm gist/gin, and I'm not sure our costing provides this especially for gist. ------ Regards, Alexander Korotkov
The following review has been posted through the commitfest application: make installcheck-world: tested, passed Implements feature: tested, passed Spec compliant: not tested Documentation: not tested Hi, this patch implements a useful and missing feature. Thank you. It includes documentation, which to a non-native speaker as myself seems appropriate. It includes comprehensive tests that cover the implemented cases. In the thread Alexander has pointed out, quote: "It would be more efficient to generate trigrams for equal operator using generate_trgm() instead of generate_wildcard_trgm()" I will echo the sentiment, though from a slightly different and possibly not as important point of view. The method used to extract trigrams from the query should match the method used to extract trigrams from the values when they get added to the index. This is gin_extract_value_trgm() and is indeed using generate_trgm(). I have no opinion over Alexander's second comment regarding costing. I change the status to 'Waiting on Author', but please feel free to override my opinion if you feel I am wrong and reset it to 'Needs review'. Cheers, //Georgios The new status of this patch is: Waiting on Author
On Wed, Nov 11, 2020 at 8:34 PM Georgios Kokolatos <gkokolatos@protonmail.com> wrote: > > The following review has been posted through the commitfest application: > make installcheck-world: tested, passed > Implements feature: tested, passed > Spec compliant: not tested > Documentation: not tested > > Hi, > > this patch implements a useful and missing feature. Thank you. > > It includes documentation, which to a non-native speaker as myself seems appropriate. > It includes comprehensive tests that cover the implemented cases. > > In the thread Alexander has pointed out, quote: > "It would be more efficient to generate trigrams for equal operator > using generate_trgm() instead of generate_wildcard_trgm()" > > I will echo the sentiment, though from a slightly different and possibly not > as important point of view. The method used to extract trigrams from the query > should match the method used to extract trigrams from the values when they > get added to the index. This is gin_extract_value_trgm() and is indeed using > generate_trgm(). > > I have no opinion over Alexander's second comment regarding costing. > > I change the status to 'Waiting on Author', but please feel free to override > my opinion if you feel I am wrong and reset it to 'Needs review'. Thanks for the reminder Georgios! Thanks a lot Alexander for the review! Indeed, I should have used generate_trgm() rather than generate_wildcard_trgm(). IIUC, the rest of the code should still be doing the same as [I]LikeStrategyNumber. I attach a v3 with that modification. For the costing, I tried this naive dataset: CREATE TABLE t1 AS select md5(random()::text) AS val from generate_series(1, 100000); CREATE INDEX t1_btree ON t1 (val); CREATE INDEX t1_gist ON t1 USING gist (val gist_trgm_ops); Cost are like this (all default configuration, using any random existing entry): # EXPLAIN ANALYZE SELECT * FROM t1 where val = '8dcf324ce38428e4d27a363953ac1c51'; QUERY PLAN ------------------------------------------------------------------------------------------------------------------- Index Only Scan using t1_btree on t1 (cost=0.42..4.44 rows=1 width=33) (actual time=0.192..0.194 rows=1 loops=1) Index Cond: (val = '8dcf324ce38428e4d27a363953ac1c51'::text) Heap Fetches: 0 Planning Time: 0.133 ms Execution Time: 0.222 ms (5 rows) # EXPLAIN ANALYZE SELECT * FROM t1 where val = '8dcf324ce38428e4d27a363953ac1c51'; QUERY PLAN ------------------------------------------------------------------------------------------------------------- Index Scan using t1_gist on t1 (cost=0.28..8.30 rows=1 width=33) (actual time=0.542..2.359 rows=1 loops=1) Index Cond: (val = '8dcf324ce38428e4d27a363953ac1c51'::text) Planning Time: 0.189 ms Execution Time: 2.382 ms (4 rows) # EXPLAIN ANALYZE SELECT * FROM t1 where val = '8dcf324ce38428e4d27a363953ac1c51'; QUERY PLAN ----------------------------------------------------------------------------------------------------------------- Bitmap Heap Scan on t1 (cost=400.01..404.02 rows=1 width=33) (actual time=2.486..2.488 rows=1 loops=1) Recheck Cond: (val = '8dcf324ce38428e4d27a363953ac1c51'::text) Heap Blocks: exact=1 -> Bitmap Index Scan on t1_gin (cost=0.00..400.01 rows=1 width=0) (actual time=2.474..2.474 rows=1 loops=1) Index Cond: (val = '8dcf324ce38428e4d27a363953ac1c51'::text) Planning Time: 0.206 ms Execution Time: 2.611 ms So assuming that this dataset is representative enough, costing indeed prefers a btree index over a gist/gin index, which should avoid regression with this change. Gin is however quite off, likely because it's a bitmap index scan rather than an index scan, so gist is preferred in this scenario. That's not ideal, but I'm not sure that there are many people having both gin_trgm_ops and gist_trgm_ops.
Attachment
‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐ On Friday, November 13, 2020 10:50 AM, Julien Rouhaud <julien.rouhaud@free.fr> wrote: > On Wed, Nov 11, 2020 at 8:34 PM Georgios Kokolatos > gkokolatos@protonmail.com wrote: > > > The following review has been posted through the commitfest application: > > make installcheck-world: tested, passed > > Implements feature: tested, passed > > Spec compliant: not tested > > Documentation: not tested > > Hi, > > this patch implements a useful and missing feature. Thank you. > > It includes documentation, which to a non-native speaker as myself seems appropriate. > > It includes comprehensive tests that cover the implemented cases. > > In the thread Alexander has pointed out, quote: > > "It would be more efficient to generate trigrams for equal operator > > using generate_trgm() instead of generate_wildcard_trgm()" > > I will echo the sentiment, though from a slightly different and possibly not > > as important point of view. The method used to extract trigrams from the query > > should match the method used to extract trigrams from the values when they > > get added to the index. This is gin_extract_value_trgm() and is indeed using > > generate_trgm(). > > I have no opinion over Alexander's second comment regarding costing. > > I change the status to 'Waiting on Author', but please feel free to override > > my opinion if you feel I am wrong and reset it to 'Needs review'. > > Thanks for the reminder Georgios! Thanks a lot Alexander for the review! > > Indeed, I should have used generate_trgm() rather than > generate_wildcard_trgm(). IIUC, the rest of the code should still be > doing the same as [I]LikeStrategyNumber. I attach a v3 with that > modification. Great, thanks! v3 looks good. > > For the costing, I tried this naive dataset: > > CREATE TABLE t1 AS select md5(random()::text) AS val from > generate_series(1, 100000); > CREATE INDEX t1_btree ON t1 (val); > CREATE INDEX t1_gist ON t1 USING gist (val gist_trgm_ops); > > Cost are like this (all default configuration, using any random existing entry): > > EXPLAIN ANALYZE SELECT * FROM t1 where val = > > ============================================= > > '8dcf324ce38428e4d27a363953ac1c51'; > QUERY PLAN > > ----------------------------------------------- > > Index Only Scan using t1_btree on t1 (cost=0.42..4.44 rows=1 > width=33) (actual time=0.192..0.194 rows=1 loops=1) > Index Cond: (val = '8dcf324ce38428e4d27a363953ac1c51'::text) > Heap Fetches: 0 > Planning Time: 0.133 ms > Execution Time: 0.222 ms > (5 rows) > > EXPLAIN ANALYZE SELECT * FROM t1 where val = > > ============================================= > > '8dcf324ce38428e4d27a363953ac1c51'; > QUERY PLAN > > ----------------------------------------------- > > Index Scan using t1_gist on t1 (cost=0.28..8.30 rows=1 width=33) > (actual time=0.542..2.359 rows=1 loops=1) > Index Cond: (val = '8dcf324ce38428e4d27a363953ac1c51'::text) > Planning Time: 0.189 ms > Execution Time: 2.382 ms > (4 rows) > > EXPLAIN ANALYZE SELECT * FROM t1 where val = > > ============================================= > > '8dcf324ce38428e4d27a363953ac1c51'; > QUERY PLAN > > ----------------------------------------------- > > Bitmap Heap Scan on t1 (cost=400.01..404.02 rows=1 width=33) (actual > time=2.486..2.488 rows=1 loops=1) > Recheck Cond: (val = '8dcf324ce38428e4d27a363953ac1c51'::text) > Heap Blocks: exact=1 > -> Bitmap Index Scan on t1_gin (cost=0.00..400.01 rows=1 width=0) > (actual time=2.474..2.474 rows=1 loops=1) > Index Cond: (val = '8dcf324ce38428e4d27a363953ac1c51'::text) > Planning Time: 0.206 ms > Execution Time: 2.611 ms > > So assuming that this dataset is representative enough, costing indeed > prefers a btree index over a gist/gin index, which should avoid > regression with this change. It sounds reasonable, although I would leave it to a bit more cost savvy people to argue the point. > > Gin is however quite off, likely because it's a bitmap index scan > rather than an index scan, so gist is preferred in this scenario. > That's not ideal, but I'm not sure that there are many people having > both gin_trgm_ops and gist_trgm_ops. Same as above. In short, I think v3 of the patch looks good to change to 'RFC' status. Given the possible costing concerns, I will refrain from changing the status just yet, to give the opportunity to more reviewers to chime in. If in the next few days there are no more reviews, I will update the status to 'RFC' to move the patch forward. Thoughts? Cheers, //Georgios
Hi! On Fri, Nov 13, 2020 at 1:47 PM Georgios <gkokolatos@protonmail.com> wrote: > In short, I think v3 of the patch looks good to change to 'RFC' status. > Given the possible costing concerns, I will refrain from changing the > status just yet, to give the opportunity to more reviewers to chime in. > If in the next few days there are no more reviews, I will update the > status to 'RFC' to move the patch forward. > > Thoughts? I went through and revised this patch. I made the documentation statement less categorical. pg_trgm gist/gin indexes might have lower performance of equality operator search than B-tree. So, we can't claim the B-tree index is always not needed. Also, simple comparison operators are <, <=, >, >=, and they are not supported. I also have checked that btree_gist is preferred over pg_trgm gist index for equality search. Despite our gist cost estimate is quite dumb, it selects btree_gist index due to its lower size. So, this part also looks good to me. I'm going to push this if no objections. ------ Regards, Alexander Korotkov
Attachment
On 2020-11-14 06:30, Alexander Korotkov wrote: > [v4-0001-Handle-equality...in-contrib-pg_trgm.patch (~] > > I'm going to push this if no objections. > About the sgml, in doc/src/sgml/pgtrgm.sgml : Beginning in <productname>PostgreSQL</productname> 14, these indexes also support equality operator (simple comparison operators are not supported). should be: Beginning in <productname>PostgreSQL</productname> 14, these indexes also support the equality operator (simple comparison operators are not supported). (added 'the') And: Although these indexes might have lower the performance of equality operator search than regular B-tree indexes. should be (I think - please check the meaning) Although these indexes might have a lower performance with equality operator search than with regular B-tree indexes. I am not sure I understood this last sentence correctly. Does this mean the slower trgm index might be chosen over the faster btree? Thanks, Erik Rijkers
Hi, Erik! On Sat, Nov 14, 2020 at 11:37 AM Erik Rijkers <er@xs4all.nl> wrote: > On 2020-11-14 06:30, Alexander Korotkov wrote: > > > [v4-0001-Handle-equality...in-contrib-pg_trgm.patch (~] > > > > I'm going to push this if no objections. > > > > About the sgml, in doc/src/sgml/pgtrgm.sgml : > > > Beginning in <productname>PostgreSQL</productname> 14, these indexes > also support equality operator (simple comparison operators are not > supported). > > should be: > > Beginning in <productname>PostgreSQL</productname> 14, these indexes > also support the equality operator (simple comparison operators are not > supported). > > (added 'the') > > > And: > > Although these indexes might have lower the performance of equality > operator > search than regular B-tree indexes. > > should be (I think - please check the meaning) > > Although these indexes might have a lower performance with equality > operator > search than with regular B-tree indexes. > > > I am not sure I understood this last sentence correctly. Does this mean > the slower trgm index might be chosen over the faster btree? Thank you for your review. I mean searching for an equal string with pg_trgm GiST/GIN might be slower than the same search with B-tree. If you need both pg_trgm similarity/pattern search and equal search on your column, you have choice. You can run with a single pg_trgm index, but your search for an equal string wouldn't be as fast as with B-tree. Alternatively you can have two indexes: pg_trgm index for similarity/pattern search and B-tree index for equality search. Second option gives you a fast equality search, but the second B-tree index would take extra space and maintenance overhead. For equality search, the B-tree index should be chosen by the planner (and that was tested). ------ Regards, Alexander Korotkov
On Sat, Nov 14, 2020 at 6:07 PM Alexander Korotkov <aekorotkov@gmail.com> wrote: > > Hi, Erik! > > On Sat, Nov 14, 2020 at 11:37 AM Erik Rijkers <er@xs4all.nl> wrote: > > On 2020-11-14 06:30, Alexander Korotkov wrote: > > > > > [v4-0001-Handle-equality...in-contrib-pg_trgm.patch (~] > > > > > > I'm going to push this if no objections. > > > > > > > About the sgml, in doc/src/sgml/pgtrgm.sgml : > > > > > > Beginning in <productname>PostgreSQL</productname> 14, these indexes > > also support equality operator (simple comparison operators are not > > supported). > > > > should be: > > > > Beginning in <productname>PostgreSQL</productname> 14, these indexes > > also support the equality operator (simple comparison operators are not > > supported). > > > > (added 'the') > > > > > > And: > > > > Although these indexes might have lower the performance of equality > > operator > > search than regular B-tree indexes. > > > > should be (I think - please check the meaning) > > > > Although these indexes might have a lower performance with equality > > operator > > search than with regular B-tree indexes. > > > > > > I am not sure I understood this last sentence correctly. Does this mean > > the slower trgm index might be chosen over the faster btree? > > Thank you for your review. > > I mean searching for an equal string with pg_trgm GiST/GIN might be > slower than the same search with B-tree. If you need both pg_trgm > similarity/pattern search and equal search on your column, you have > choice. You can run with a single pg_trgm index, but your search for > an equal string wouldn't be as fast as with B-tree. Alternatively you > can have two indexes: pg_trgm index for similarity/pattern search and > B-tree index for equality search. Second option gives you a fast > equality search, but the second B-tree index would take extra space > and maintenance overhead. For equality search, the B-tree index > should be chosen by the planner (and that was tested). Thanks everyone for the review, and thanks Alexander for the modifications! I agree that it's important to document that those indexes may be less performant than btree indexes. I also agree that there's an extraneous "the" in the documentation. Maybe this rewrite could be better? Note that those indexes may not be as afficient as regulat B-tree indexes for equality operator. While at it, there's a small grammar error: case EqualStrategyNumber: - /* Wildcard search is inexact */ + /* Wildcard and equal search is inexact */ It should be /* Wildcard and equal search are inexact */
On 2020-11-14 12:53, Julien Rouhaud wrote: > On Sat, Nov 14, 2020 at 6:07 PM Alexander Korotkov > <aekorotkov@gmail.com> > > Note that those indexes may not be as afficient as regulat B-tree > indexes > for equality operator. 'afficient as regulat' should be 'efficient as regular' Sorry to be nitpicking - it's the one thing I'm really good at :P Erik
On Sat, Nov 14, 2020 at 7:58 PM Erik Rijkers <er@xs4all.nl> wrote: > > On 2020-11-14 12:53, Julien Rouhaud wrote: > > On Sat, Nov 14, 2020 at 6:07 PM Alexander Korotkov > > <aekorotkov@gmail.com> > > > > Note that those indexes may not be as afficient as regulat B-tree > > indexes > > for equality operator. > > > 'afficient as regulat' should be > 'efficient as regular' > > > Sorry to be nitpicking - it's the one thing I'm really good at :P Oops indeed :)
On Sat, Nov 14, 2020 at 8:26 PM Julien Rouhaud <julien.rouhaud@free.fr> wrote: > On Sat, Nov 14, 2020 at 7:58 PM Erik Rijkers <er@xs4all.nl> wrote: > > > > On 2020-11-14 12:53, Julien Rouhaud wrote: > > > On Sat, Nov 14, 2020 at 6:07 PM Alexander Korotkov > > > <aekorotkov@gmail.com> > > > > > > Note that those indexes may not be as afficient as regulat B-tree > > > indexes > > > for equality operator. > > > > > > 'afficient as regulat' should be > > 'efficient as regular' > > > > > > Sorry to be nitpicking - it's the one thing I'm really good at :P > > Oops indeed :) Pushed with all the corrections above. Thanks! ------ Regards, Alexander Korotkov
On Sun, Nov 15, 2020 at 1:55 PM Alexander Korotkov <aekorotkov@gmail.com> wrote: > > On Sat, Nov 14, 2020 at 8:26 PM Julien Rouhaud <julien.rouhaud@free.fr> wrote: > > On Sat, Nov 14, 2020 at 7:58 PM Erik Rijkers <er@xs4all.nl> wrote: > > > > > > On 2020-11-14 12:53, Julien Rouhaud wrote: > > > > On Sat, Nov 14, 2020 at 6:07 PM Alexander Korotkov > > > > <aekorotkov@gmail.com> > > > > > > > > Note that those indexes may not be as afficient as regulat B-tree > > > > indexes > > > > for equality operator. > > > > > > > > > 'afficient as regulat' should be > > > 'efficient as regular' > > > > > > > > > Sorry to be nitpicking - it's the one thing I'm really good at :P > > > > Oops indeed :) > > Pushed with all the corrections above. Thanks! Thanks a lot!
On 2020-11-15 06:55, Alexander Korotkov wrote: >> > Sorry to be nitpicking - it's the one thing I'm really good at :P Hi Alexander, The last touch... (you forgot the missing 'the') thanks! Erik Rijkers
Attachment
On Sat, Nov 14, 2020 at 12:31 AM Alexander Korotkov <aekorotkov@gmail.com> wrote:
I went through and revised this patch. I made the documentation
statement less categorical. pg_trgm gist/gin indexes might have lower
performance of equality operator search than B-tree. So, we can't
claim the B-tree index is always not needed. Also, simple comparison
operators are <, <=, >, >=, and they are not supported.
Is "simple comparison" here a well-known term of art? If I read the doc as committed (which doesn't include the sentence above), and if I didn't already know what it was saying, I would be left wondering which comparisons those are. Could we just say "inequality operators"?
Cheers,
Jeff
On Mon, Nov 16, 2020 at 2:13 AM Jeff Janes <jeff.janes@gmail.com> wrote: > On Sat, Nov 14, 2020 at 12:31 AM Alexander Korotkov <aekorotkov@gmail.com> wrote: >> I went through and revised this patch. I made the documentation >> statement less categorical. pg_trgm gist/gin indexes might have lower >> performance of equality operator search than B-tree. So, we can't >> claim the B-tree index is always not needed. Also, simple comparison >> operators are <, <=, >, >=, and they are not supported. > > Is "simple comparison" here a well-known term of art? If I read the doc as committed (which doesn't include the sentenceabove), and if I didn't already know what it was saying, I would be left wondering which comparisons those are. Could we just say "inequality operators"? You're right. "Simple comparison" is vague, let's replace it with "inequality". Pushed, thanks! ------ Regards, Alexander Korotkov