Re: patch for parallel pg_dump - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: patch for parallel pg_dump |
Date | |
Msg-id | CA+Tgmoa9=9yRvRZBLq1BdFzVL-K_3FC7uFNDuMoxBhdywd9Ltg@mail.gmail.com Whole thread Raw |
In response to | Re: patch for parallel pg_dump (Joachim Wieland <joe@mcknight.de>) |
Responses |
Re: patch for parallel pg_dump
Re: patch for parallel pg_dump |
List | pgsql-hackers |
On Tue, Apr 3, 2012 at 10:40 AM, Joachim Wieland <joe@mcknight.de> wrote: >> On a similar note, what's the point of changing struct Archive to have >> int numWorkers instead of int number_of_jobs, and furthermore >> shuffling the declaration around to a different part of the struct? > > number_of_jobs was in the struct RestoreOptions before, a structure > that is not used when doing a dump. I moved it to the Archive as it is > used by both dump and restore and since all other code talks about > "workers" I changed the name to "numWorkers". Gah. Somehow I feel that splitting up this patch into two pieces hasn't made anything any better. >> On another note, I am not sure that I like the messaging protocol >> you've designed. It seems to me that this has little chance of being >> reliable: >> >> + void (*on_exit_msg_func)(const char *modulename, const char *fmt, >> va_list ap) = vwrite_msg; >> >> I believe the idea here is that you're going to capture the dying gasp >> of the worker thread and send it to the master instead of printing it >> out. But that doesn't seem very reliable. There's code all over the >> place (and, in particular, in pg_dump.c) that assumes that we may >> write messages at any time, and in particular that we may emit >> multiple messages just before croaking. > > I guess you're talking about the code in pg_dump that reads in the > database schema and the details of all the different objects in the > schema. This code is run before forking off workers and is always > executed in the master. OK, but it seems like a pretty fragile assumption that none of the workers will ever manage to emit any other error messages. We don't rely on this kind of assumption in the backend, which is a lot better-structured and less spaghetti-like than the pg_dump code. >> Also, I like the idea of making it possible to use assertions in >> front-end code. But I think that if we're going to do that, we should >> make it work in general, not just for things that include >> pg_backup_archiver.h. > > I completely agree. Assertions helped a lot dealing with concurrent > code. How do you want to tackle this for now? Want me to create a > separate header pg_assert.h as part of my patch? Or is it okay to > factor it out later and include it from the general header then? I'll just go do it, barring objections. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
pgsql-hackers by date: