Re: query_id, pg_stat_activity, extended query protocol - Mailing list pgsql-hackers

From Sami Imseih
Subject Re: query_id, pg_stat_activity, extended query protocol
Date
Msg-id CAA5RZ0teFAahPnR9Xb3w7wh3chPF_cFfigYzY_pBHMdOsAm=JA@mail.gmail.com
Whole thread Raw
In response to Re: query_id, pg_stat_activity, extended query protocol  (Sami Imseih <samimseih@gmail.com>)
Responses Re: query_id, pg_stat_activity, extended query protocol
List pgsql-hackers
> Attached is the patch I am finishing with, with some new tests for
> BRIN and btree to force parallel builds with immutable expressions
> through functions.

glad to see the asserts are working as expected ad finding these issues.
I took a look at the patch and tested it. It looks good. My only concern
is the stability of using min_parallel_table_scan_size = 0. Will it always
guarantee parallel workers? Can we print some debugging that proves
a parallel worker was spun up?

Something like this I get with DEBUG1

postgres=*# CREATE INDEX btree_test_expr_idx ON btree_test_expr USING btree (btree_test_func());
DEBUG:  building index "btree_test_expr_idx" on table "btree_test_expr" with request for 1 parallel workers

Also, we can just set the max_parallel_maintenance_workers to 1.

What do you think?

Regards,

Sami
DEBUG:  building index "btree_test_expr_idx" on table "btree_test_expr" with request for 1 parallel workers



On Wed, Sep 25, 2024 at 8:08 PM Michael Paquier <michael@paquier.xyz> wrote:
On Wed, Sep 25, 2024 at 05:00:00PM +0300, Alexander Lakhin wrote:
> Please look at the script, which triggers Assert added by 24f520594:
> (assuming shared_preload_libraries=pg_stat_statements)

Or just compute_query_id = on.

> SELECT repeat('x', 100) INTO t FROM generate_series(1, 100000);
> CREATE FUNCTION f() RETURNS int LANGUAGE sql IMMUTABLE RETURN 0;
> CREATE INDEX ON t(f());
>
> TRAP: failed Assert("!IsQueryIdEnabled() || pgstat_get_my_query_id() != 0"), File: "execMain.c", Line: 300, PID: 1288609
> ExceptionalCondition at assert.c:52:13
> ExecutorRun at execMain.c:302:6
> postquel_getnext at functions.c:903:24
> fmgr_sql at functions.c:1198:15
> ExecInterpExpr at execExprInterp.c:746:8
> ExecInterpExprStillValid at execExprInterp.c:2034:1
> ExecEvalExprSwitchContext at executor.h:367:13

And this assertion is doing the job I want it to do, because it is
telling us that we are not setting a query ID when doing a parallel
btree build.  The query string that we would report at the beginning
of _bt_parallel_build_main() is passed down as a parameter, but not
the query ID.  Hence pg_stat_activity would report a NULL query ID
when spawning parallel workers in this cases, even if there is a query
string.

The same can be said for the parallel build for BRIN, that uses a lot
of logic taken from btree for there parallel parameters, and even
vacuum, as it goes through a parse analyze where its query ID would be
set. but that's missed in the workers.

See _bt_parallel_build_main(), _brin_parallel_build_main() and
parallel_vacuum_main() which are the entry point used by the workers
for all of them.  For BRIN, note that I can get the same failure with
the following query, based on the table of your previous test that
would spawn a worker:
CREATE INDEX foo ON t using brin(f());

The recovery test 027_stream_regress.pl not catching these failures
means that we don't have tests with an index expression for such
parallel builds, or the assertion would have triggered.  It looks like
this is just because we don't do a parallel btree build with an index
expression where we need to go through the executor to build its
IndexInfo.

Note that parallel workers launched by execParallel.c pass down the
query ID in a minimal PlannedStmt where we use pgstat_get_my_query_id,
so let's do the same for all these.

Attached is the patch I am finishing with, with some new tests for
BRIN and btree to force parallel builds with immutable expressions
through functions.  These fail the assertions in the recovery TAP
test.  It may be a good idea to keep these tests in the long-term
anyway.  It took me a few minutes to find out that
min_parallel_table_scan_size and max_parallel_maintenance_workers was
enough to force workers to spawn even if tables have no data, to make
the tests cheaper.

Thoughts or comments?
--
Michael

pgsql-hackers by date:

Previous
From: Thomas Munro
Date:
Subject: Re: AIO writes vs hint bits vs checksums
Next
From: "Imseih (AWS), Sami"
Date:
Subject: Re: query_id, pg_stat_activity, extended query protocol