Re: patch for parallel pg_dump - Mailing list pgsql-hackers

From Joachim Wieland
Subject Re: patch for parallel pg_dump
Date
Msg-id CACw0+13MaDaJadSMMVdrp8PJVapqUjj-kyca-HXM0qN5G6gKtQ@mail.gmail.com
Whole thread Raw
In response to Re: patch for parallel pg_dump  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: patch for parallel pg_dump  (Robert Haas <robertmhaas@gmail.com>)
Re: patch for parallel pg_dump  (Alvaro Herrera <alvherre@commandprompt.com>)
List pgsql-hackers
On Tue, Mar 13, 2012 at 1:53 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> What I mean is that the function ArchiveEntry() is defined in
> pg_backup_archiver.c, and it takes an argument called relpages, and
> the string "relpages" does not appear anywhere else in that file.

Uhm, that's kinda concerning, isn't it... fixed...


[...pgpipe...]
> Dunno.  Can we get an opinion on that from one of the Windows guys?
> Andrew, Magnus?

Waiting for the verdict here...


>> If a child terminates without leaving a message, the master will still
>> detect it and just say "a worker process died unexpectedly" (this part
>> was actually broken, but now it's fixed :-) )
>
> All that may be true, but I still don't see why it's right for this to
> apply in the cases where the worker thread says die_horribly(), but
> not in the cases where the worker says exit_horribly().

Hm, I'm not calling the error handler from exit_horribly because it
doesn't have the AH. It looks like the code assumes that
die_horribly() is called whenever AH is available and if not,
exit_horribly() should be called which eventually calls these
preregistered exit-hooks via exit_nicely() to clean up the connection.

I think we should somehow unify both functions, the code is not very
consistent in this respect, it also calls exit_horribly() when it has
AH available. See for example pg_backup_tar.c

Or is there another distinction between them? The question how to
clean it up basically brings us back to the question what to do about
global variables in general and for error handlers in particular.


>> Or we change fmtQualifiedId to take an int and then we always pass the
>> archive version instead of the Archive* ?
>
> Hmm, I think that might make sense.

Done.


>>> +enum escrow_action { GET, SET };
>>> +static void
>>> +parallel_error_handler_escrow_data(enum escrow_action act,
>>> ParallelState *pstate)
>>> +{
>>> +       static ParallelState *s_pstate = NULL;
>>> +
>>> +       if (act == SET)
>>> +               s_pstate = pstate;
>>> +       else
>>> +               *pstate = *s_pstate;
>>> +}
>>>
>>> This seems like a mighty complicated way to implement a global variable.
>>
>> Well, we talked about that before, when you complained that you
>> couldn't get rid of the global g_conn because of the exit handler.
>> You're right that in fact it is an indirect global variable here but
>> it's clearly limited to the use of the error handler and you can be
>> sure that nobody other than this function writes to it or accesses it
>> without calling this function.
>
> Sure, but since all the function does is write to it or access it,
> what good does that do me?

It encapsulates the variable so that it can only be used for one
specific use case.

Attaching a new version.

Attachment

pgsql-hackers by date:

Previous
From: Jeff Davis
Date:
Subject: Re: initdb and fsync
Next
From: Heikki Linnakangas
Date:
Subject: Re: Chronic performance issue with Replication Failover and FSM.