Re: [PoC PATCH] Parallel dump to /dev/null - Mailing list pgsql-hackers

From Michael Banck
Subject Re: [PoC PATCH] Parallel dump to /dev/null
Date
Msg-id 1520282511.22202.21.camel@credativ.de
Whole thread Raw
In response to Re: [PoC PATCH] Parallel dump to /dev/null  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
List pgsql-hackers
Hi Alvaro,

Am Montag, den 05.03.2018, 13:56 -0300 schrieb Alvaro Herrera:
> I made a few amendments (here's v5) and was ready to push, when I
> noticed that _StartBlobs() does not seem to be doing the right thing.
> Did you test this with a few large objects?

I did test it on the regression database, which leaves one or two large
objects behind:


mba@fock:~$ psql -h /tmp -p 65432 -d regression -c '\dl'
                Large objects
  ID   | Owner |         Description          
-------+-------+------------------------------
  3001 | mba   | testing comments
 36867 | mba   | I Wandered Lonely as a Cloud
 47229 | mba   | 
(3 rows)

What exactly is wrong with _StartBlobs()? It calls setFilePath() which
makes sure /dev/null is opened and not something else.

> The reason I noticed is I wondered about the amount of open() calls
> (plus zlib function calls) we would save by keeping an FD open to
> /dev/null, rather than opening it over and over for each object -- ie.,
> maybe not touch setFilePath() at all, if possible.  That looks perhaps
> too invasive, so maybe not.  But do audit other callsites that may open
> files.

I see your point; I've hacked together the attached additional PoC
patch, which keeps the dataFH open. The parallel case was tricky, I had
to add an additional flag to lclContext so that DEVNULL is only opened
once for data files cause I could not find a place where to set it for
parallel workers and it crashed otherwise.

This cuts down the number of /dev/null opens from 352 to 6 for a -j 2
dump of the regression database to /dev/null.

In my opinion, I think this is too evolved and possibly error-prone for
being just an optimization, so I'd drop that for v11 for now and maybe
revisit it later.


Michael

-- 
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax:  +49 2166 9901-100
Email: michael.banck@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer
Attachment

pgsql-hackers by date:

Previous
From: Thomas Munro
Date:
Subject: Re: select_parallel test failure: gather sometimes losing tuples(maybe during rescans)?
Next
From: Tom Lane
Date:
Subject: Re: PATCH: psql tab completion for SELECT