Thread: Re: Simplify the logic a bit (src/bin/scripts/reindexdb.c)

Re: Simplify the logic a bit (src/bin/scripts/reindexdb.c)

From
Álvaro Herrera
Date:
On 2025-Feb-14, Ranier Vilela wrote:

> Attached is the prototype version v1.
> What do you think?

I think this is still a bit confused.  The new function's comment says
"prepare the list of tables to ..." but what it really receives is a
list of _indexes_ (as evidenced by the fact that they're compared to
pg_index.indexrelid).  So on input the user_list is an index list, and
on output it has been changed into a list of tables, and the list of
indexes is the function's return value?  That seems quite weird.  I
propose to change the API of this new function thus:

static void
get_parallel_tabidx_list(PGconn *conn,
        SimpleStringList *index_list,
        SimpleStringList *table_list,
        bool echo);

where index_list is inout and the table_list is just an output argument.

I would also remove the 'type' argument, because I don't see a need to
keep it.

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"After a quick R of TFM, all I can say is HOLY CR** THAT IS COOL! PostgreSQL was
amazing when I first started using it at 7.2, and I'm continually astounded by
learning new features and techniques made available by the continuing work of
the development team."
Berend Tober, http://archives.postgresql.org/pgsql-hackers/2007-08/msg01009.php



Re: Simplify the logic a bit (src/bin/scripts/reindexdb.c)

From
Ranier Vilela
Date:
Hi Álvaro.

Em qui., 27 de fev. de 2025 às 16:50, Álvaro Herrera <alvherre@alvh.no-ip.org> escreveu:
On 2025-Feb-14, Ranier Vilela wrote:

> Attached is the prototype version v1.
> What do you think?

I think this is still a bit confused.  The new function's comment says
"prepare the list of tables to ..." but what it really receives is a
list of _indexes_ (as evidenced by the fact that they're compared to
pg_index.indexrelid).  So on input the user_list is an index list, and
on output it has been changed into a list of tables, and the list of
indexes is the function's return value?  That seems quite weird.
  Yeah, I think that is confusing.

  I
propose to change the API of this new function thus:

static void
get_parallel_tabidx_list(PGconn *conn,
                SimpleStringList *index_list,
                SimpleStringList *table_list,
                bool echo);

where index_list is inout and the table_list is just an output argument.
Ok.
 

I would also remove the 'type' argument, because I don't see a need to
keep it.
Ok.

Thanks for your guidance.

v2 attached, please comment if you have any further objections.

best regards,
Ranier Vilela
Attachment

Re: Simplify the logic a bit (src/bin/scripts/reindexdb.c)

From
Álvaro Herrera
Date:
On 2025-Feb-28, Ranier Vilela wrote:

> v2 attached, please comment if you have any further objections.

I think running the tests would have been a good idea, as would checking
for compiler warnings.

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"I love the Postgres community. It's all about doing things _properly_. :-)"
(David Garamond)



Re: Simplify the logic a bit (src/bin/scripts/reindexdb.c)

From
Ranier Vilela
Date:
Em qui., 6 de mar. de 2025 às 15:51, Álvaro Herrera <alvherre@alvh.no-ip.org> escreveu:
On 2025-Feb-28, Ranier Vilela wrote:

> v2 attached, please comment if you have any further objections.

I think running the tests would have been a good idea, as would checking
for compiler warnings.
Hi.
Rebased for current head.

Thanks for the commiter 24503fa
 
best regards,
Ranier Vilela
Attachment

Re: Simplify the logic a bit (src/bin/scripts/reindexdb.c)

From
Álvaro Herrera
Date:
Hello

I find that this is still quite broken -- both the original, and your
patch.  I have already complained about a fundamental bug in the
original in [1].  In addition to what I reported there, in the unpatched
code I noticed that we're wasting memory and CPU by comparing the
qualified table name, when it would be faster to just store the table
OID and do the comparison based on that.  Because the REINDEX INDEX
query only needs to specify the *index* name, not the table name, we
don't need the table name to be stored in the indices_tables_list: we
can convert it into an OID list and store that instead.  The strcmp
becomes a numeric comparison.  Yay.  This is of course a pretty trivial,
almost meaningless change.

[1] https://postgr.es/m/202503071820.j25zn3lo4hvn@alvherre.pgsql

But your patch also makes the mistake that indices_table_list is passed
as a pointer by value, so the list that is filled by
get_parallel_tabidx_list() can never have any effect.  I wonder if you
tested this patch at all.  With your patch and the sample setup I posted
in [1], the program doesn't do anything.  It never calls REINDEX at all.
It is a dead program.  It's so bogus that in fact, there's no program,
it's just a bug that takes a lot of disk space.

For this to work, we need to pass that list (the output list) as a
pointer reference, so that the caller can _receive_ the output list.

We also get this compiler warning:

/pgsql/source/master/src/bin/scripts/reindexdb.c: In function ‘get_parallel_tabidx_list’:
/pgsql/source/master/src/bin/scripts/reindexdb.c:782:17: warning: this ‘if’ clause does not guard...
[-Wmisleading-indentation]
  782 |                 if (cell != index_list->head)
      |                 ^~
/pgsql/source/master/src/bin/scripts/reindexdb.c:785:25: note: ...this statement, but the latter is misleadingly
indentedas if it were guarded by the ‘if’
 
  785 |                         appendQualifiedRelation(&catalog_query, cell->val, conn, echo);
      |                         ^~~~~~~~~~~~~~~~~~~~~~~

Not to mention the 'git apply' warnings:

I schmee: master 0$ git apply /tmp/v3-simplifies-reindex-one-database-reindexdb.patch
/tmp/v3-simplifies-reindex-one-database-reindexdb.patch:191: space before tab in indent.
                              fmtQualifiedIdEnc(PQgetvalue(res, i, 1),
/tmp/v3-simplifies-reindex-one-database-reindexdb.patch:192: space before tab in indent.
                                                PQgetvalue(res, i, 0),
/tmp/v3-simplifies-reindex-one-database-reindexdb.patch:193: space before tab in indent.
                                                PQclientEncoding(conn)));
advertencia: 3 líneas agregan errores de espacios en blanco.


Anyway, my version of this is attached.  It fixes the problems with your
patch, but not Orlov's fundamental bug.  Without fixing that bug, this
program does not deserve this supposedly parallel mode that doesn't
actually deliver.  I wonder if we shouldn't just revert 47f99a407de2
completely.

You, on the other hand, really need to be much more careful with the
patches you submit. 

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"Selbst das größte Genie würde nicht weit kommen, wenn es
alles seinem eigenen Innern verdanken wollte." (Johann Wolfgang von Goethe)
               Ni aún el genio más grande llegaría muy lejos si
                    quisiera sacarlo todo de su propio interior.



Re: Simplify the logic a bit (src/bin/scripts/reindexdb.c)

From
Ranier Vilela
Date:
Em sex., 7 de mar. de 2025 às 15:54, Álvaro Herrera <alvherre@alvh.no-ip.org> escreveu:
Hello

I find that this is still quite broken -- both the original, and your
patch.  I have already complained about a fundamental bug in the
original in [1].  In addition to what I reported there, in the unpatched
code I noticed that we're wasting memory and CPU by comparing the
qualified table name, when it would be faster to just store the table
OID and do the comparison based on that.  Because the REINDEX INDEX
query only needs to specify the *index* name, not the table name, we
don't need the table name to be stored in the indices_tables_list: we
can convert it into an OID list and store that instead.  The strcmp
becomes a numeric comparison.  Yay.  This is of course a pretty trivial,
almost meaningless change.

[1] https://postgr.es/m/202503071820.j25zn3lo4hvn@alvherre.pgsql

But your patch also makes the mistake that indices_table_list is passed
as a pointer by value, so the list that is filled by
get_parallel_tabidx_list() can never have any effect.  I wonder if you
tested this patch at all.  With your patch and the sample setup I posted
in [1], the program doesn't do anything.  It never calls REINDEX at all.
It is a dead program.  It's so bogus that in fact, there's no program,
it's just a bug that takes a lot of disk space.

For this to work, we need to pass that list (the output list) as a
pointer reference, so that the caller can _receive_ the output list.

We also get this compiler warning:

/pgsql/source/master/src/bin/scripts/reindexdb.c: In function ‘get_parallel_tabidx_list’:
/pgsql/source/master/src/bin/scripts/reindexdb.c:782:17: warning: this ‘if’ clause does not guard... [-Wmisleading-indentation]
  782 |                 if (cell != index_list->head)
      |                 ^~
/pgsql/source/master/src/bin/scripts/reindexdb.c:785:25: note: ...this statement, but the latter is misleadingly indented as if it were guarded by the ‘if’
  785 |                         appendQualifiedRelation(&catalog_query, cell->val, conn, echo);
      |                         ^~~~~~~~~~~~~~~~~~~~~~~

Not to mention the 'git apply' warnings:

I schmee: master 0$ git apply /tmp/v3-simplifies-reindex-one-database-reindexdb.patch
/tmp/v3-simplifies-reindex-one-database-reindexdb.patch:191: space before tab in indent.
                                                         fmtQualifiedIdEnc(PQgetvalue(res, i, 1),
/tmp/v3-simplifies-reindex-one-database-reindexdb.patch:192: space before tab in indent.
                                                                                           PQgetvalue(res, i, 0),
/tmp/v3-simplifies-reindex-one-database-reindexdb.patch:193: space before tab in indent.
                                                                                           PQclientEncoding(conn)));
advertencia: 3 líneas agregan errores de espacios en blanco.


Anyway, my version of this is attached.  It fixes the problems with your
patch, but not Orlov's fundamental bug.  Without fixing that bug, this
program does not deserve this supposedly parallel mode that doesn't
actually deliver.  I wonder if we shouldn't just revert 47f99a407de2
completely.

You, on the other hand, really need to be much more careful with the
patches you submit.
Yes of course, thank you for making the necessary corrections.

best regards,
Ranier Vilela

Re: Simplify the logic a bit (src/bin/scripts/reindexdb.c)

From
Álvaro Herrera
Date:
On 2025-Mar-07, Álvaro Herrera wrote:

> Anyway, my version of this is attached.  It fixes the problems with your
> patch, but not Orlov's fundamental bug.

And of course I forgot to actually attach the patch.  Good grief.

-- 
Álvaro Herrera        Breisgau, Deutschland  —  https://www.EnterpriseDB.com/
"No deja de ser humillante para una persona de ingenio saber
que no hay tonto que no le pueda enseñar algo." (Jean B. Say)

Attachment

Re: Simplify the logic a bit (src/bin/scripts/reindexdb.c)

From
Ranier Vilela
Date:


Em sex., 7 de mar. de 2025 às 16:01, Álvaro Herrera <alvherre@alvh.no-ip.org> escreveu:
On 2025-Mar-07, Álvaro Herrera wrote:

> Anyway, my version of this is attached.  It fixes the problems with your
> patch, but not Orlov's fundamental bug.

And of course I forgot to actually attach the patch.  Good grief.
Test with your v4 patch, on Windows 64 bits.

results: 
reindexdb -U postgres -d postgres -j4 --echo -i foo1 -i foo2 -i bar1 -i bar2 -i baz1 -i baz2 -i baz3 -i baz4

Cut to show only REINDEX (order) cmds:

RESET search_path;
SELECT c.relname, ns.nspname
 FROM pg_catalog.pg_class c, pg_catalog.pg_namespace ns
 WHERE c.relnamespace OPERATOR(pg_catalog.=) ns.oid
  AND c.oid OPERATOR(pg_catalog.=) 'public.baz4'::pg_catalog.regclass;
SELECT pg_catalog.set_config('search_path', '', false);
REINDEX INDEX public.baz4;
RESET search_path;
SELECT c.relname, ns.nspname
 FROM pg_catalog.pg_class c, pg_catalog.pg_namespace ns
 WHERE c.relnamespace OPERATOR(pg_catalog.=) ns.oid
  AND c.oid OPERATOR(pg_catalog.=) 'public.baz1'::pg_catalog.regclass;
SELECT pg_catalog.set_config('search_path', '', false);
REINDEX INDEX public.baz1;
RESET search_path;
SELECT c.relname, ns.nspname
 FROM pg_catalog.pg_class c, pg_catalog.pg_namespace ns
 WHERE c.relnamespace OPERATOR(pg_catalog.=) ns.oid
  AND c.oid OPERATOR(pg_catalog.=) 'public.baz2'::pg_catalog.regclass;
SELECT pg_catalog.set_config('search_path', '', false);
REINDEX INDEX public.baz2;
RESET search_path;
SELECT c.relname, ns.nspname
 FROM pg_catalog.pg_class c, pg_catalog.pg_namespace ns
 WHERE c.relnamespace OPERATOR(pg_catalog.=) ns.oid
  AND c.oid OPERATOR(pg_catalog.=) 'public.baz3'::pg_catalog.regclass;
SELECT pg_catalog.set_config('search_path', '', false);
REINDEX INDEX public.baz3;
SELECT pg_catalog.set_config('search_path', '', false);
RESET search_path;
SELECT c.relname, ns.nspname
 FROM pg_catalog.pg_class c, pg_catalog.pg_namespace ns
 WHERE c.relnamespace OPERATOR(pg_catalog.=) ns.oid
  AND c.oid OPERATOR(pg_catalog.=) 'public.foo1'::pg_catalog.regclass;
SELECT pg_catalog.set_config('search_path', '', false);
REINDEX INDEX public.foo1;
RESET search_path;
SELECT c.relname, ns.nspname
 FROM pg_catalog.pg_class c, pg_catalog.pg_namespace ns
 WHERE c.relnamespace OPERATOR(pg_catalog.=) ns.oid
  AND c.oid OPERATOR(pg_catalog.=) 'public.foo2'::pg_catalog.regclass;
SELECT pg_catalog.set_config('search_path', '', false);
REINDEX INDEX public.foo2;
SELECT pg_catalog.set_config('search_path', '', false);
RESET search_path;
SELECT c.relname, ns.nspname
 FROM pg_catalog.pg_class c, pg_catalog.pg_namespace ns
 WHERE c.relnamespace OPERATOR(pg_catalog.=) ns.oid
  AND c.oid OPERATOR(pg_catalog.=) 'public.bar1'::pg_catalog.regclass;
SELECT pg_catalog.set_config('search_path', '', false);
REINDEX INDEX public.bar1;
RESET search_path;
SELECT c.relname, ns.nspname
 FROM pg_catalog.pg_class c, pg_catalog.pg_namespace ns
 WHERE c.relnamespace OPERATOR(pg_catalog.=) ns.oid
  AND c.oid OPERATOR(pg_catalog.=) 'public.bar2'::pg_catalog.regclass;
SELECT pg_catalog.set_config('search_path', '', false);
REINDEX INDEX public.bar2;

best regards,
Ranier Vilela

Re: Simplify the logic a bit (src/bin/scripts/reindexdb.c)

From
Ranier Vilela
Date:
Em sex., 7 de mar. de 2025 às 16:14, Ranier Vilela <ranier.vf@gmail.com> escreveu:


Em sex., 7 de mar. de 2025 às 16:01, Álvaro Herrera <alvherre@alvh.no-ip.org> escreveu:
On 2025-Mar-07, Álvaro Herrera wrote:

> Anyway, my version of this is attached.  It fixes the problems with your
> patch, but not Orlov's fundamental bug.

And of course I forgot to actually attach the patch.  Good grief.
Test with your v4 patch, on Windows 64 bits.
Rebased against 682c5be

Tests:
reindexdb -U postgres -d postgres -j4 --echo -i foo1 -i foo2 -i bar1 -i bar2 -i baz1 -i baz2 -i baz3 -i baz4 | grep REINDEX
File STDIN:
REINDEX INDEX public.baz4;
REINDEX INDEX public.baz1;
REINDEX INDEX public.baz2;
REINDEX INDEX public.baz3;
REINDEX INDEX public.foo1;
REINDEX INDEX public.foo2;
REINDEX INDEX public.bar1;
REINDEX INDEX public.bar2;

best regards,
Ranier Vilela
Attachment

Re: Simplify the logic a bit (src/bin/scripts/reindexdb.c)

From
Ranier Vilela
Date:

Em seg., 17 de mar. de 2025 às 16:17, Ranier Vilela <ranier.vf@gmail.com> escreveu:
Em sex., 7 de mar. de 2025 às 16:14, Ranier Vilela <ranier.vf@gmail.com> escreveu:


Em sex., 7 de mar. de 2025 às 16:01, Álvaro Herrera <alvherre@alvh.no-ip.org> escreveu:
On 2025-Mar-07, Álvaro Herrera wrote:

> Anyway, my version of this is attached.  It fixes the problems with your
> patch, but not Orlov's fundamental bug.

And of course I forgot to actually attach the patch.  Good grief.
Test with your v4 patch, on Windows 64 bits.
Rebased against 682c5be
Thanks Álvaro, for the commit.

best regards,
Ranier Vilela

Re: Simplify the logic a bit (src/bin/scripts/reindexdb.c)

From
Álvaro Herrera
Date:
On 2025-Mar-18, Ranier Vilela wrote:

> Thanks Álvaro, for the commit.

Thank you for the impetus.

-- 
Álvaro Herrera        Breisgau, Deutschland  —  https://www.EnterpriseDB.com/



Re: Simplify the logic a bit (src/bin/scripts/reindexdb.c)

From
Ranier Vilela
Date:
Em seg., 17 de mar. de 2025 às 16:17, Ranier Vilela <ranier.vf@gmail.com> escreveu:
Em sex., 7 de mar. de 2025 às 16:14, Ranier Vilela <ranier.vf@gmail.com> escreveu:


Em sex., 7 de mar. de 2025 às 16:01, Álvaro Herrera <alvherre@alvh.no-ip.org> escreveu:
On 2025-Mar-07, Álvaro Herrera wrote:

> Anyway, my version of this is attached.  It fixes the problems with your
> patch, but not Orlov's fundamental bug.

And of course I forgot to actually attach the patch.  Good grief.
Test with your v4 patch, on Windows 64 bits.
Rebased against 682c5be
Coverity reported two new warnings regarding reindexdb.

CID 1593911: (#1 of 1): Dereference after null check (FORWARD_NULL)
49. var_deref_op: Dereferencing null pointer indices_tables_cell. 

CID 1593910: (#1 of 1): Explicit null dereferenced (FORWARD_NULL)
9. var_deref_op: Dereferencing null pointer process_list. 

Attached is an attempt to fix.
 
best regards,
Ranier Vilela
Attachment