Re: pg_background (and more parallelism infrastructure patches) - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: pg_background (and more parallelism infrastructure patches)
Date
Msg-id CAA4eK1J=X_Kuhm8XQD37+yQiqnYF04OsZy4KCRxWu4Wz1rYy5Q@mail.gmail.com
Whole thread Raw
In response to pg_background (and more parallelism infrastructure patches)  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: pg_background (and more parallelism infrastructure patches)
List pgsql-hackers
On Fri, Jul 25, 2014 at 11:41 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> Patch 4 adds infrastructure that allows one session to save all of its
> non-default GUC values and another session to reload those values.
> This was written by Amit Khandekar and Noah Misch.  It allows
> pg_background to start up the background worker with the same GUC
> settings that the launching process is using.  I intend this as a
> demonstration of how to synchronize any given piece of state between
> cooperating backends.  For real parallelism, we'll need to synchronize
> snapshots, combo CIDs, transaction state, and so on, in addition to
> GUCs.  But GUCs are ONE of the things that we'll need to synchronize
> in that context, and this patch shows the kind of API we're thinking
> about for these sorts of problems.

Don't we need some way to prohibit changing GUC by launching process,
once it has shared the existing GUC?


> Patch 6 is pg_background itself.  I'm quite pleased with how easily
> this came together.  The existing background worker, dsm, shm_toc, and
> shm_mq infrastructure handles most of the heavily lifting here -
> obviously with some exceptions addressed by the preceding patches.
> Again, this is the kind of set-up that I'm expecting will happen in a
> background worker used for actual parallelism - clearly, more state
> will need to be restored there than here, but nonetheless the general
> flow of the code here is about what I'm imagining, just with somewhat
> more different kinds of state.  Most of the work of writing this patch
> was actually figuring out how to execute the query itself; what I
> ended up with is mostly copied form exec_simple_query, but with some
> difference here and there.  I'm not sure if it would be
> possible/advisable to try to refactor to reduce duplication.

1. This patch generates warning on windows 
1>pg_background.obj : error LNK2001: unresolved external symbol StatementTimeout

You need to add PGDLLIMPORT for StatementTimeout

2.
CREATE FUNCTION pg_background_launch(sql pg_catalog.text,
  queue_size pg_catalog.int4 DEFAULT 65536)

Here shouldn't queue_size be pg_catalog.int8 as I could see
some related functions in test_shm_mq uses int8?

CREATE FUNCTION test_shm_mq(queue_size pg_catalog.int8,
CREATE FUNCTION test_shm_mq_pipelined(queue_size pg_catalog.int8,

Anyway I think corresponding C function doesn't use matching function
to extract the function args.

pg_background_launch(PG_FUNCTION_ARGS)
{
text   *sql = PG_GETARG_TEXT_PP(0);
int32 queue_size = PG_GETARG_INT64(1);

Here it should _INT32 variant to match with current sql definition,
otherwise it leads to below error.

postgres=# select pg_background_launch('vacuum verbose foo');
ERROR:  queue size must be at least 64 bytes

3.
Comparing execute_sql_string() and exec_simple_query(), I could see below
main differences: 
a. Allocate a new memory context different from message context
b. Start/commit control of transaction is outside execute_sql_string
c. enable/disable statement timeout is done from outside incase of execute_sql_string()
d. execute_sql_string() prohibits Start/Commit/Abort transaction statements.
e. execute_sql_string() doesn't log statements
f. execute_sql_string() uses binary format for result whereas exec_simple_query()
   uses TEXT as defult format
g. processed stat related info from caller incase of execute_sql_string().

Can't we handle all these or other changes inside exec_simple_query()
based on some parameter or may be a use a hook similar to what we
do in ProcessUtility?

Basically it looks bit odd to have a duplicate (mostly) copy of
exec_simple_query().

4.
Won't it be better if pg_background_worker_main() can look more
like PostgresMain() (in terms of handling different kind of messages),
so that it can be extended in future to handle parallel worker.


5.
pg_background_result()
{
..
/* Read and processes messages from the shared memory queue. */
}

Shouldn't the processing of messages be a separate function as
we do for pqParseInput3().


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

pgsql-hackers by date:

Previous
From: Etsuro Fujita
Date:
Subject: Re: inherit support for foreign tables
Next
From: Heikki Linnakangas
Date:
Subject: Re: inherit support for foreign tables