Thread: Avoid mix char with bool type in comparisons

Avoid mix char with bool type in comparisons

From
Ranier Vilela
Date:
Hi,

GIN Indexes:

Defines a type char GinTernaryValue with 3 values:
#define GIN_FALSE 0 /* item is not present / does not match */
#define GIN_TRUE 1 /* item is present / matches */
#define GIN_MAYBE 2 /* don't know if item is present / don't know
* if matches */

So, any use of this GinTernaryValue are:

1. if (key->entryRes[j]) be FALSE if GIN_FALSE
2. if (key->entryRes[j]) be TRUE if GIN_TRUE
3. if (key->entryRes[j]) be TRUE if GIN_MAYBE

So gin matchs can fail with GYN_MAYBE or I lost something?

regards,
Ranier Vilela
Attachment

Re: Avoid mix char with bool type in comparisons

From
Tom Lane
Date:
Ranier Vilela <ranier.vf@gmail.com> writes:
> So, any use of this GinTernaryValue are:

> 1. if (key->entryRes[j]) be FALSE if GIN_FALSE
> 2. if (key->entryRes[j]) be TRUE if GIN_TRUE
> 3. if (key->entryRes[j]) be TRUE if GIN_MAYBE

Yeah, that's how it's designed.  Unless you can point to a bug,
I do not think we ought to change this code.

            regards, tom lane



Re: Avoid mix char with bool type in comparisons

From
Ranier Vilela
Date:
Em qui., 6 de out. de 2022 às 20:52, Tom Lane <tgl@sss.pgh.pa.us> escreveu:
Ranier Vilela <ranier.vf@gmail.com> writes:
> So, any use of this GinTernaryValue are:

> 1. if (key->entryRes[j]) be FALSE if GIN_FALSE
> 2. if (key->entryRes[j]) be TRUE if GIN_TRUE
> 3. if (key->entryRes[j]) be TRUE if GIN_MAYBE

Yeah, that's how it's designed.  Unless you can point to a bug,
I do not think we ought to change this code.
Thanks for answering.

My main concerns is this point:
  /* If already matched on earlier page, do no extra work */
- if (key->entryRes[j])
+ if (key->entryRes[j] == GIN_TRUE)
  continue;

If GIN_MAYBE cases are erroneously ignored.

regards,
Ranier Vilela

Re: Avoid mix char with bool type in comparisons

From
Tom Lane
Date:
Ranier Vilela <ranier.vf@gmail.com> writes:
> My main concerns is this point:
>   /* If already matched on earlier page, do no extra work */
> - if (key->entryRes[j])
> + if (key->entryRes[j] == GIN_TRUE)
>   continue;

> If GIN_MAYBE cases are erroneously ignored.

So, if that's a bug, you should be able to produce a test case?

            regards, tom lane



Re: Avoid mix char with bool type in comparisons

From
Ranier Vilela
Date:
Em qui., 6 de out. de 2022 às 21:21, Tom Lane <tgl@sss.pgh.pa.us> escreveu:
Ranier Vilela <ranier.vf@gmail.com> writes:
> My main concerns is this point:
>   /* If already matched on earlier page, do no extra work */
> - if (key->entryRes[j])
> + if (key->entryRes[j] == GIN_TRUE)
>   continue;

> If GIN_MAYBE cases are erroneously ignored.

So, if that's a bug, you should be able to produce a test case?
No Tom, unfortunately I don't have the knowledge to create a test with GIN_MAYBE values.

With the patch, all current tests pass.
Either there are no bugs, or there are no tests that detect this specific case.
And I agree with you, without a test showing the bug,
there's not much chance of the patch progressing.

Unless someone with more knowledge can say that the patch improves robustness.

regards,
Ranie Vilela

Re: Avoid mix char with bool type in comparisons

From
Robert Haas
Date:
On Thu, Oct 6, 2022 at 8:35 PM Ranier Vilela <ranier.vf@gmail.com> wrote:
> No Tom, unfortunately I don't have the knowledge to create a test with GIN_MAYBE values.

Well then don't post.

Basically what you're saying is that you suspect there might be a
problem with this code but you don't really know that and you can't
test it. That's not a good enough reason to take up the time of
everyone on this list.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: Avoid mix char with bool type in comparisons

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Thu, Oct 6, 2022 at 8:35 PM Ranier Vilela <ranier.vf@gmail.com> wrote:
>> No Tom, unfortunately I don't have the knowledge to create a test with GIN_MAYBE values.

> Well then don't post.

> Basically what you're saying is that you suspect there might be a
> problem with this code but you don't really know that and you can't
> test it. That's not a good enough reason to take up the time of
> everyone on this list.

FWIW, I did take a look at this code, and I don't see any bug.
The entryRes[] array entries are indeed GinTernaryValue, but it's
obvious by inspection that matchPartialInPendingList only returns
true or false, therefore collectMatchesForHeapRow also only deals
in true or false, never maybe.  I do not think changing
matchPartialInPendingList to return ternary would be an improvement,
because then it'd be less obvious that it doesn't deal in maybe.

            regards, tom lane



Re: Avoid mix char with bool type in comparisons

From
Robert Haas
Date:
On Fri, Oct 7, 2022 at 11:40 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> FWIW, I did take a look at this code, and I don't see any bug.
> The entryRes[] array entries are indeed GinTernaryValue, but it's
> obvious by inspection that matchPartialInPendingList only returns
> true or false, therefore collectMatchesForHeapRow also only deals
> in true or false, never maybe.  I do not think changing
> matchPartialInPendingList to return ternary would be an improvement,
> because then it'd be less obvious that it doesn't deal in maybe.

I mean if the code isn't buggy, I'm glad, but I think there should
have been more substantial grounds for getting you to spend time
looking at it. It's not asking too much for people to produce a
non-zero amount of evidence that the thing they are worried about is
actually a problem.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: Avoid mix char with bool type in comparisons

From
Ranier Vilela
Date:
Em sex., 7 de out. de 2022 às 12:40, Tom Lane <tgl@sss.pgh.pa.us> escreveu:
Robert Haas <robertmhaas@gmail.com> writes:
> On Thu, Oct 6, 2022 at 8:35 PM Ranier Vilela <ranier.vf@gmail.com> wrote:
>> No Tom, unfortunately I don't have the knowledge to create a test with GIN_MAYBE values.

> Well then don't post.

> Basically what you're saying is that you suspect there might be a
> problem with this code but you don't really know that and you can't
> test it. That's not a good enough reason to take up the time of
> everyone on this list.

FWIW, I did take a look at this code, and I don't see any bug.
The entryRes[] array entries are indeed GinTernaryValue, but it's
obvious by inspection that matchPartialInPendingList only returns
true or false, therefore collectMatchesForHeapRow also only deals
in true or false, never maybe. 

Thanks for spending your time with this.

Anyway, it's not *true* that  collectMatchesForHeapRow deals
only "true" and "false", once that key->entryRes is initialized with GIN_FALSE not false.

/*
* Reset all entryRes and hasMatchKey flags
*/
for (i = 0; i < so->nkeys; i++)
{
GinScanKey key = so->keys + i;
memset(key->entryRes, GIN_FALSE, key->nentries);
}

Maybe only typo, that doesn't matter to anyone but some static analysis tools that alarm about these stupid things.

/*
* Reset all entryRes and hasMatchKey flags
*/
for (i = 0; i < so->nkeys; i++)
{
GinScanKey key = so->keys + i;
memset(key->entryRes, false, key->nentries);
}

I do not think changing
matchPartialInPendingList to return ternary would be an improvement,
because then it'd be less obvious that it doesn't deal in maybe.
On this point I don't quite agree with you, since for anyone wanting to read the code in gin.h,
they will think in terms of GIN_FALSE, GIN_TRUE and GIN_MAYBE,
and will have to spend some time thinking about why they are mixing char and bool types.

Besides that, it remains a trap, just waiting for someone to fall in the future.
if (key->entryRes[j]) be TRUE when GIN_MAYBE.

regards,
Ranier Vilela

Re: Avoid mix char with bool type in comparisons

From
Tom Lane
Date:
Ranier Vilela <ranier.vf@gmail.com> writes:
> Anyway, it's not *true* that  collectMatchesForHeapRow deals
> only "true" and "false", once that key->entryRes is initialized with
> GIN_FALSE not false.

Look: GinTernaryValue is carefully designed so that it is
representation-compatible with bool, including that GIN_FALSE is
identical to false and GIN_TRUE is identical to true.  I'm quite
uninterested in debating whether that's a good idea or not.
Moreover, there are a ton of other places besides this one where
the GIN code relies on that equivalence, so we'd have to do a
lot more than what's in this patch if we wanted to decouple that.

            regards, tom lane



Re: Avoid mix char with bool type in comparisons

From
Ranier Vilela
Date:
Em sex., 7 de out. de 2022 às 13:32, Robert Haas <robertmhaas@gmail.com> escreveu:
On Fri, Oct 7, 2022 at 11:40 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> FWIW, I did take a look at this code, and I don't see any bug.
> The entryRes[] array entries are indeed GinTernaryValue, but it's
> obvious by inspection that matchPartialInPendingList only returns
> true or false, therefore collectMatchesForHeapRow also only deals
> in true or false, never maybe.  I do not think changing
> matchPartialInPendingList to return ternary would be an improvement,
> because then it'd be less obvious that it doesn't deal in maybe.

I mean if the code isn't buggy, I'm glad, but I think there should
have been more substantial grounds for getting you to spend time
looking at it. It's not asking too much for people to produce a
non-zero amount of evidence that the thing they are worried about is
actually a problem.
Sorry if you think this is all just a waste of time.

I think that while the current code has no real bugs, that doesn't mean it doesn't have readability and style issues.
And that not being able to produce tests should not be an impediment to improving the current code.
I believe I have contributed much more than changing "fo" to "of" in comments.

Right now I have:
02/09/2022  15:58               593 0001-fix-typo-isnan-test-geo_ops.patch
02/09/2022  15:57             7.746 0001-fix-wrong-isnan-test-geo_ops.patch
11/07/2022  09:39             2.271 0001-Promove-unshadowing-of-two-variables-PGPROC-type.patch
11/07/2022  09:39             2.939 0001-Reduce-Wsign-compare-warnings-from-clang-12-compiler.patch
11/07/2022  09:39            15.377 0001-Refactoring-strlen-comparisons-with-zero.patch
02/09/2022  09:06             6.711 0002-avoid-small-issues-brin_minmax_multi.patch
11/07/2022  09:39             6.402 001-aset-reduces-memory-consumption.patch
01/07/2022  12:53             2.111 001-avoid-unecessary-MemSet-calls.patch
11/07/2022  09:39            59.662 001-improve-executor.patch
11/07/2022  09:39            69.155 001-improve-getsnapshot.patch
11/07/2022  09:39            15.465 001-improve-memory.patch
11/07/2022  09:39            24.130 001-improve-scability-procarray.patch
11/07/2022  09:39            74.693 001-improve-scaling.patch
22/05/2022  13:23             6.579 001-improve-sort.patch
11/07/2022  09:39           280.420 001-improve-table-open.patch
11/07/2022  09:39            11.451 001-reduces-memory-consumption.patch
11/07/2022  09:39             8.516 002-generation-reduces-memory-consumption.patch
11/07/2022  09:39             6.025 003-aset-reduces-memory-consumption.patch
11/07/2022  09:39             8.966 004-generation-reduces-memory-consumption_BUG.patch
04/09/2022  18:28            12.095 all.patch
05/10/2022  09:41             2.048 all2.patch
20/09/2022  10:59             1.513 all_20_09_2022.patch
09/10/2020  11:42               673 avoid_dereferencing_null_pointer.patch
29/09/2022  20:39               437 avoid_useless_reassign_lgosegno.patch
29/09/2022  20:43               418 avoid_useless_retesting_log_min_duration.patch
29/09/2022  20:44               625 avoid_useless_var_record.patch
11/07/2022  09:39            32.180 FAST-001-improve-scability.patch
11/07/2022  09:39            51.453 FAST-001-improve-sort.patch
11/07/2022  09:39            62.491 FAST2-001-improve-sort.patch
04/10/2022  08:22               493 fix_declaration_volatile_signal_pg_test_fsync.patch
29/09/2022  20:45               484 fix_declaration_volatile_signal_var.patch
25/08/2020  12:19             1.087 fix_dereference_null_statscmds.patch
26/06/2020  11:26             1.526 fix_null_deference_pquery.patch
28/08/2020  15:53               537 fix_null_memcmp_call.patch
25/08/2020  14:53               541 fix_possible_overflow_executils.patch
25/08/2020  14:17               757 fix_possible_overflow_nodeagg.patch
05/09/2020  10:45            14.049 fix_redudant_init.patch
05/09/2020  10:35               933 fix_redudant_initialization_arrayfuncs.patch
05/09/2020  10:47             2.403 fix_redudant_initialization_bklno_hash.patch
05/09/2020  10:07               793 fix_redudant_initialization_firstmissingnum_heaptuple.patch
05/09/2020  10:36               362 fix_redudant_initialization_formatting.patch
05/09/2020  10:08               406 fix_redudant_initialization_offsetnumber_gistutil.patch
05/09/2020  10:25               851 fix_redudant_initialization_parse_utilcmd.patch
05/09/2020  10:29               742 fix_redudant_initialization_procarray.patch
05/09/2020  10:30               604 fix_redudant_initialization_spell.patch
05/09/2020  10:16             1.157 fix_redudant_initialization_status_nbtsearch.patch
05/09/2020  10:21               537 fix_redudant_initialization_storage.patch
05/09/2020  10:28               531 fix_redudant_initialization_syslogger.patch
05/09/2020  10:31               878 fix_redudant_initialization_to_tsany.patch
05/09/2020  10:36               452 fix_redudant_initialization_tsrank.patch
05/09/2020  10:38             1.324 fix_redudant_initialization_tuplesort.patch
05/09/2020  10:34               428 fix_redudant_initialization_wparser_def.patch
05/09/2020  10:18               797 fix_redudant_prefix_spgtextproc.patch
05/09/2020  10:19               834 fix_redudant_waits_xlog.patch
25/08/2020  15:48             2.319 fix_unchecked_return_spi_connect.patch
09/10/2020  09:15               420 fix_uninitialized_var_flag_spell.patch
09/09/2022  11:25            68.543 fprintf_fixes.patch
09/09/2020  09:17            13.805 getsnapshotdata.patch
27/09/2022  16:05             4.083 head_27_09_2022.patch
24/08/2020  19:31            21.023 hugepage.patch
14/05/2022  20:32             6.545 improve_sort.patch
15/09/2022  11:50             6.327 patchs_16_09_2022.patch
05/10/2022  14:30            15.376 postgres_05_10_2022.patch
11/07/2022  16:25             2.068 postgres_executor.patch
29/06/2022  11:01            29.995 postgres_sort.patch
07/09/2020  22:07            25.449 prefetch.patch
14/09/2020  10:22            19.919 setvbuf.patch
14/09/2020  14:36            19.919 setvfbuf.patch
05/09/2022  13:40             7.857 string_fixes.patch
11/07/2022  09:39            34.130 strlen.patch
05/10/2022  09:42             2.048 style_use_compatible_var_type.patch
28/08/2020  10:19             5.155 unloop_toast_tuple_init.patch
14/09/2022  20:00             3.237 use-heapalloc-instead-deprecated-localalloc.patch
11/09/2020  11:47             3.733 v1-0001-simplified_read_binary_file.patch
07/07/2022  15:22           106.755 v1-0001-WIP-Replace-MemSet-calls-with-struct-initialization.patch
11/07/2022  09:39            14.918 v1-001-improve-memory.patch
28/05/2022  08:45            24.989 v1-001-improve-scability-procarray.patch
11/07/2022  09:39            14.972 v1-001-improve-scaling.patch
09/09/2022  11:26            68.543 v1-fprintf_fixes.patch
05/09/2022  21:42            20.586 v1-string_fixes.patch
11/07/2022  09:39            34.469 v10-001-improve-scability.patch
11/07/2022  09:39            32.229 v11-001-improve-scability.patch
11/07/2022  09:39            36.060 v12-001-improve-scability.patch
11/07/2022  09:39            53.064 v13-001-improve-scability.patch
11/07/2022  09:39            48.123 v14-001-improve-scability.patch
11/07/2022  09:39            35.443 v15-001-improve-scability.patch
09/08/2022  15:56            95.542 v2-0001-Improve-performance-of-and-reduce-overheads-of-me.patch
11/09/2020  16:58             4.228 v2-0001-simplified_read_binary_file.patch
11/07/2022  16:03           106.755 v2-0001-WIP-Replace-MemSet-calls-with-struct-initialization.patch
11/07/2022  09:39            17.894 v2-001-improve-memory.patch
11/07/2022  09:39           349.497 v2-001-improve-scability-procarray.patch
11/07/2022  09:39            69.227 v2-001-improve-scaling.patch
11/07/2022  09:39            18.709 v2-002-generation-reduces-memory-consumption.patch
11/07/2022  09:39            42.893 v2-002-improve-sort.patch
05/09/2022  23:16            52.064 v2-string_fixes.patch
11/09/2020  18:38             4.047 v3-0001-simplified_read_binary_file.patch
01/08/2022  13:52            26.670 v3-0001-WIP-Replace-MemSet-calls-with-struct-initialization.patch
11/07/2022  09:39           349.781 v3-001-improve-scability-procarray.patch
11/07/2022  09:39            45.707 v3-002-improve-sort.patch
05/09/2022  08:34             7.510 v3_avoid_referencing_out_of_bounds_array_elements.patch
15/09/2020  14:29             4.306 v4-0001-simplified_read_binary_file.patch
11/07/2022  09:39           352.181 v4-001-improve-scability.patch
11/07/2022  09:39            51.453 v4-002-improve-sort.patch
11/07/2022  09:39           354.611 v5-001-improve-scability.patch
11/07/2022  09:39            51.453 v5-002-improve-sort.patch
11/07/2022  09:39           355.739 v6-001-improve-scability.patch
11/07/2022  09:39            61.904 v6-002-improve-sort.patch
11/07/2022  09:39            13.547 v7-001-improve-scability.patch
11/07/2022  09:39            62.491 v7-002-improve-sort.patch
11/07/2022  09:39            27.800 v8-001-improve-scability.patch
11/07/2022  09:39            33.358 v9-001-improve-scability.patch
27/06/2020  11:17             7.754 windows_fixes_v1.patch
 
And it could contribute much, much more.

regards,
Ranier Vilela

Re: Avoid mix char with bool type in comparisons

From
Ranier Vilela
Date:
Em sex., 7 de out. de 2022 às 14:44, Tom Lane <tgl@sss.pgh.pa.us> escreveu:
Ranier Vilela <ranier.vf@gmail.com> writes:
> Anyway, it's not *true* that  collectMatchesForHeapRow deals
> only "true" and "false", once that key->entryRes is initialized with
> GIN_FALSE not false.

Look: GinTernaryValue is carefully designed so that it is
representation-compatible with bool, including that GIN_FALSE is
identical to false and GIN_TRUE is identical to true.  I'm quite
uninterested in debating whether that's a good idea or not.
Moreover, there are a ton of other places besides this one where
the GIN code relies on that equivalence, so we'd have to do a
lot more than what's in this patch if we wanted to decouple that.
Just now I checked all the places where GinTernaryValue is used and they all trust using GIN_TRUE, GIN_FALSE and GIN_MAYBE.
The only problematic place I found was at ginget.c and jsonb_gin.c

jsonb_gin.c:
The function execute_jsp_gin_node returns GIN_TRUE if the node->type is JSP_GIN_ENTRY and the value is GIN_MAYBE.
IMO, it should be GIN_FALSE?

case JSP_GIN_ENTRY:
{
int index = node->val.entryIndex;

if (ternary)
return ((GinTernaryValue *) check)[index];
else
return ((bool *) check)[index] ? GIN_TRUE : GIN_FALSE;
}

The array entryRes is used only in ginget.c and ginscan.c.

The GinTernaryValue is used only in:
File contrib\pg_trgm\trgm_gin.c:
File src\backend\access\gin\ginarrayproc.c:
File src\backend\access\gin\ginget.c:
File src\backend\access\gin\ginlogic.c:
File src\backend\access\gin\ginscan.c:
File src\backend\utils\adt\jsonb_gin.c:
File src\backend\utils\adt\tsginidx.c:

regards,
Ranier Vilela

Re: Avoid mix char with bool type in comparisons

From
Ranier Vilela
Date:
Em sex., 7 de out. de 2022 às 15:45, Ranier Vilela <ranier.vf@gmail.com> escreveu:
Em sex., 7 de out. de 2022 às 14:44, Tom Lane <tgl@sss.pgh.pa.us> escreveu:
Ranier Vilela <ranier.vf@gmail.com> writes:
> Anyway, it's not *true* that  collectMatchesForHeapRow deals
> only "true" and "false", once that key->entryRes is initialized with
> GIN_FALSE not false.

Look: GinTernaryValue is carefully designed so that it is
representation-compatible with bool, including that GIN_FALSE is
identical to false and GIN_TRUE is identical to true.  I'm quite
uninterested in debating whether that's a good idea or not.
Moreover, there are a ton of other places besides this one where
the GIN code relies on that equivalence, so we'd have to do a
lot more than what's in this patch if we wanted to decouple that.
Just now I checked all the places where GinTernaryValue is used and they all trust using GIN_TRUE, GIN_FALSE and GIN_MAYBE.
The only problematic place I found was at ginget.c and jsonb_gin.c

jsonb_gin.c:
The function execute_jsp_gin_node returns GIN_TRUE if the node->type is JSP_GIN_ENTRY and the value is GIN_MAYBE.
IMO, it should be GIN_FALSE?
I spend more time checking this and I think that answer is not.

So, in the patch attached it changes only ginget.c, to avoid mixing GinTernaryValue with bool.
And add comments to warn that GIN_MAYBE comparison is equal GIN_TRUE in jsonb_gin.c

I think that's all that needs to be changed at the moment.
I'm also glad there are no bugs, and IMO I hope the code has become more readable and secure.

regards,
Ranier Vilela
Attachment