Re: [HACKERS] Passing values to a dynamic background worker - Mailing list pgsql-hackers

From Keith Fiske
Subject Re: [HACKERS] Passing values to a dynamic background worker
Date
Msg-id CAG1_KcBj52LpvVaFT4aBfttuf2i4DscBVPpOwwbNqsY8pzqQ_g@mail.gmail.com
In response to Re: [HACKERS] Passing values to a dynamic background worker  (Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>)
List pgsql-hackers


On Tue, Apr 18, 2017 at 5:40 AM, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote:
On 2017/04/18 18:12, Kyotaro HORIGUCHI wrote:
> At Mon, 17 Apr 2017 16:19:13 -0400, Keith Fiske wrote:
>> So after reading a recent thread on the steep learning curve for PG
>> internals [1], I figured I'd share where I've gotten stuck with this in a
>> new thread vs hijacking that one.
>>
>> One of the goals I had with pg_partman was to see if I could get the
>> partitioning python scripts redone as C functions using a dynamic
>> background worker to be able to commit in batches with a single call. My
>> thinking was to have a user-function that can accept arguments for things
>> like the interval value, batch size, and other arguments to the python
>> script, then start and stop a dynamic bgw for each batch so it can commit
>> after each one. The dynamic bgw would essentially just have to call the
>> already existing partition_data() plpgsql function, but I have to be able
>> to pass the argument values that the user gave down into the dynamic bgw.
>>
>> I've reached a roadblock in that bgw_main_arg can only accept a single
>> argument that must be passed by value for a dynamic bgw. I already worked
>> around this for passing the database name to my existing use of a bgw
>> for partition maintenance (pass a simple integer to use as an array
>> index). But I'm not sure how to do this for passing multiple values
>> in. I'm assuming this would be the place where I'd see about storing values
>> in shared memory to be able to re-use later? I'm not even sure if that's
>> the right approach, and if it is, where to even start to understand how to
>> do that.
>
> On the other hand, AFAICS, DSM doesn't seem well documented. I
> managed to find a related document in the Postgres Wiki, but it seems
> a bit old.
>
> https://wiki.postgresql.org/wiki/Parallel_Internal_Sort
>
> This is a little more complex than static shared memory, and it is
> *not* guaranteed to be mapped at the same address in every worker. You
> can see an example in LaunchParallelWorkers() and the related
> functions in parallel.c. The basic usage is as follows.
>
> - Create a segment:
>    dsm_segment *seg = dsm_create(size, 0);
> - Send its handle via bgw_main_arg (a Datum):
>    worker.bgw_main_arg = UInt32GetDatum(dsm_segment_handle(seg));
> - Attach the memory on the other side:
>    dsm_segment *seg = dsm_attach(DatumGetUInt32(main_arg));
>
> On both sides, the address of the attached shared memory is
> obtained using dsm_segment_address(seg).
>
> dsm_detach(seg) detaches the segment. Once all users of the segment
> have detached, it is destroyed.
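
A minimal sketch of how several arguments could be packed into such a segment, written in plain C so that it compiles without the PostgreSQL headers. The struct PartitionWorkerArgs, its fields, the length limit, and the helpers pack_args() and unpack_args() are hypothetical names chosen for illustration; they are not part of pg_partman or PostgreSQL. In a real extension the buffer would be the pointer returned by dsm_segment_address(seg) rather than ordinary memory. Because the segment may be mapped at a different address in each process, the block holds only fixed-size fields and inline strings, never pointers.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define WORKER_ARG_LEN        64            /* hypothetical limit on string arguments */
#define PARTITION_ARGS_MAGIC  0x50415254u   /* arbitrary sanity-check value */

/*
 * Hypothetical argument block stored at the start of the segment.
 * Strings are stored inline: a char * written by the launcher would be
 * meaningless in the worker, which may map the segment elsewhere.
 */
typedef struct PartitionWorkerArgs
{
    uint32_t magic;
    int32_t  batch_size;
    char     dbname[WORKER_ARG_LEN];
    char     parent_table[WORKER_ARG_LEN];
    char     interval[WORKER_ARG_LEN];
} PartitionWorkerArgs;

/* Copy a string into a fixed-size field; returns -1 if it does not fit. */
static int
copy_field(char *dst, const char *src)
{
    size_t len = strlen(src);

    if (len >= WORKER_ARG_LEN)
        return -1;
    memcpy(dst, src, len + 1);      /* include the terminating NUL */
    return 0;
}

/*
 * Launcher side: fill the block in segment memory.
 * Returns 0 on success, -1 if the segment is too small or a string is too long.
 */
static int
pack_args(void *segment, size_t segment_size,
          const char *dbname, const char *parent_table,
          const char *interval, int32_t batch_size)
{
    PartitionWorkerArgs *args = segment;

    if (segment_size < sizeof(PartitionWorkerArgs))
        return -1;

    memset(args, 0, sizeof(*args));
    if (copy_field(args->dbname, dbname) != 0 ||
        copy_field(args->parent_table, parent_table) != 0 ||
        copy_field(args->interval, interval) != 0)
        return -1;

    args->batch_size = batch_size;
    args->magic = PARTITION_ARGS_MAGIC;     /* set last: marks the block as complete */
    return 0;
}

/*
 * Worker side: validate the block and return a typed view of it,
 * or NULL if the segment does not hold a completed argument block.
 */
static const PartitionWorkerArgs *
unpack_args(const void *segment, size_t segment_size)
{
    const PartitionWorkerArgs *args = segment;

    if (segment_size < sizeof(PartitionWorkerArgs) ||
        args->magic != PARTITION_ARGS_MAGIC)
        return NULL;
    return args;
}
```

The launcher would call pack_args() on the segment address before registering the worker, and the worker would call unpack_args() right after attaching. The segment size passed to dsm_create() would then be sizeof(PartitionWorkerArgs).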

Perhaps the more modern DSA mechanism could be applicable here, too.

Some recent commits demonstrate DSA usage, such as the BRIN
autosummarization commit (7526e10224f) and the commit adding shared
iteration support to tidbitmap.c (98e6e89040a05).

Thanks,
Amit


Thank you both very much for the suggestions!

Keith
