Thread: getting rid of "thread fork emulation" in pgbench?

getting rid of "thread fork emulation" in pgbench?

From
Fabien COELHO
Date:
Hello hackers,


This is take 2, I already suggested this 2 years ago...


There is a "thread fork emulation" hack which allows pgbench to be 
compiled without threads by emulating them with forks, obviously with 
limited capabilities. This feature is triggered by configuring postgresql 
with --disable-thread-safety.

I'm not aware of platforms which would still benefit from this feature (we 
are probably talking of a PostgreSQL 9.6 maybe released in 2016), and this 
thread emulation is a real pain for maintaining and extending it: some 
features cannot be implemented, or must be implemented twice, or must be 
implemented in a heavy way because it cannot be assumed to be in a 
threaded implementation.

My question is: would someone still object strongly to the removal of this 
"thread fork emulation" hack in pgbench?

That would mean arguing that it is more important than a simpler code base 
for pgbench, which would help both its maintenance and extension, and must 
be kept despite these costs, hence the "strongly".


Tom's opinion, 2 years ago, was to object strongly: 
http://www.postgresql.org/message-id/24846.1372352618@sss.pgh.pa.us

"I would object strongly to that, as it would represent a significant 
movement of the goalposts on what is required to build Postgres at all, ie 
platforms on which --enable-thread-safety is unavailable or expensive 
would be out in the cold.  Perhaps that set is approaching empty, but a 
project that's still standardized on C89 has little business making such a 
choice IMO."

However, about not providing a full feature under thread emulation:

"Perhaps that's worth doing.  I agree with Fabien that full support of
this feature in the process model is more trouble than it's worth,
though, and I wouldn't scream loudly if we just didn't support it.
--disable-thread-safety doesn't have to be entirely penalty-free."


Robert Haas did also object strongly:
http://www.postgresql.org/message-id/CA+TgmoaJawAM0A4N1RnsndFhTOYYbwnt+7N7XWLcW=n-uDC79Q@mail.gmail.com

"I don't believe that to be an acceptable restriction.  We generally 
require features to work on all platforms we support.  We have made 
occasional compromises there, but generally only when the restriction is 
fundamental to the platform rather than for developer convenience."


So the underlying question is, does Tom and Robert's opinion have changed 
from 2 years ago?

If not, I would really like to know at least *one* current platform for 
which --disable-thread-safety is required, just to know why I waste time 
over such things.

-- 
Fabien.



Re: getting rid of "thread fork emulation" in pgbench?

From
Tom Lane
Date:
Fabien COELHO <coelho@cri.ensmp.fr> writes:
> There is a "thread fork emulation" hack which allows pgbench to be 
> compiled without threads by emulating them with forks, obviously with 
> limited capabilities. This feature is triggered by configuring postgresql 
> with --disable-thread-safety.

> I'm not aware of platforms which would still benefit from this feature (we 
> are probably talking of a PostgreSQL 9.6 maybe released in 2016), and this 
> thread emulation is a real pain for maintaining and extending it: some 
> features cannot be implemented, or must be implemented twice, or must be 
> implemented in a heavy way because it cannot be assumed to be in a 
> threaded implementation.

> My question is: would someone still object strongly to the removal of this 
> "thread fork emulation" hack in pgbench?

By my count, the pthread-emulation code amounts to about 175 lines out of
the nearly 4000 in pgbench.c.  That's not an undue burden IMO (it's not,
for comparison's sake, very much more than the amount of WIN32-specific
code in that file).  And what will you do instead?  It would be fine
I think for pgbench to not allow --jobs different from 1 on a threadless
platform, but not for it to fail to work altogether.  It looks to me like
allowing it to compile without that code would take nearly as much
effort/mess as what's there now.

> So the underlying question is, does Tom and Robert's opinion have changed 
> from 2 years ago?

Not mine.
        regards, tom lane



Re: getting rid of "thread fork emulation" in pgbench?

From
Fabien COELHO
Date:
Hello Tom,

>> My question is: would someone still object strongly to the removal of this
>> "thread fork emulation" hack in pgbench?
>
> By my count, the pthread-emulation code amounts to about 175 lines out of
> the nearly 4000 in pgbench.c.  That's not an undue burden IMO (it's not,
> for comparison's sake, very much more than the amount of WIN32-specific
> code in that file).

From my point of view, the burden is how things can be implemented if you 
have to assume "no threads".

In the discussion at hand which motivates this renewed request, the 
question is whether external merge sorts implying rescanning and 
reprinting files over and over should be implemented in pgbench so as to 
recombine log files.

Now if we have threads, it is simpler (if the overhead is reasonable) that 
threads share a file handle and just generate one file, so there is no 
merging. That is not possible with processes at least for the aggregated 
logs because the thread data must be combined, or would require that 
pgbench use shared memory and so on, but then it is really a lot of bother 
for a limited feature.

The alternative is that the feature would not be available if no thread, 
but then people object to that, on principle.

> And what will you do instead? It would be fine I think for pgbench to 
> not allow --jobs different from 1 on a threadless platform, but not for 
> it to fail to work altogether.

Sure. No thread really means working with only one thread.

> It looks to me like allowing it to compile without that code would take 
> nearly as much effort/mess as what's there now.

My motivation is to simplify how things are done by simply assuming that
threads are available and can share data, esp for counters.

>> So the underlying question is, does Tom and Robert's opinion have changed
>> from 2 years ago?
>
> Not mine.

Ok! I'll ask again in 2017, and probably again in 2019:-)

-- 
Fabien.



Re: getting rid of "thread fork emulation" in pgbench?

From
Tom Lane
Date:
Fabien COELHO <coelho@cri.ensmp.fr> writes:
>> And what will you do instead? It would be fine I think for pgbench to 
>> not allow --jobs different from 1 on a threadless platform, but not for 
>> it to fail to work altogether.

> Sure. No thread really means working with only one thread.

>> It looks to me like allowing it to compile without that code would take 
>> nearly as much effort/mess as what's there now.

> My motivation is to simplify how things are done by simply assuming that
> threads are available and can share data, esp for counters.

pgbench does not get to "assume that threads are available", at least
not as long as the project as a whole supports --disable-thread-safety.

As I said, I wouldn't have a problem with restricting the --jobs setting
on threadless platforms, which seems like it would fix your problem since
you wouldn't need to have more than one process involved.
        regards, tom lane



Re: getting rid of "thread fork emulation" in pgbench?

From
Fabien COELHO
Date:
> As I said, I wouldn't have a problem with restricting the --jobs setting
> on threadless platforms, which seems like it would fix your problem since
> you wouldn't need to have more than one process involved.

Indeed!

So I undestand that "thread fork emulation" could eventually go from your 
point of view, provided that pgbench compiles and runs with one job.

I'll try to submit a patch to that effect, maybe to June CF.

-- 
Fabien.



Re: getting rid of "thread fork emulation" in pgbench?

From
Robert Haas
Date:
On Sun, Mar 29, 2015 at 1:49 PM, Fabien COELHO <coelho@cri.ensmp.fr> wrote:
> Now if we have threads, it is simpler (if the overhead is reasonable) that
> threads share a file handle and just generate one file, so there is no
> merging.

I really, really wish you'd stop arguing against the patch to allow
merging of pgbench logs in this basis.  There may or may not be other
reasons to reject that patch, but arguing that we can or should reject
it because of the hypothetical possibility that sharing a file handle
will work out just isn't right.  People who work hard to develop a
patch that does something useful don't deserve to have it kicked to
the curb because someone can imagine a development path that would
eventually obsolete it.  That patch deserves to be evaluated on its
own merits.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: getting rid of "thread fork emulation" in pgbench?

From
Fabien COELHO
Date:
Hello Robert,

> I really, really wish you'd stop arguing against the patch to allow
> merging of pgbench logs in this basis.

Hmmm. I'm lost. I thought that discussing how to best implement a feature 
was part of reviewing a patch.

> There may or may not be other reasons to reject that patch, but arguing 
> that we can or should reject it because of the hypothetical possibility 
> that sharing a file handle will work out just isn't right.

I must say that I do not understand why discussing options, and actually 
implementing and testing them for hours to see the performance impact, is 
bad, but it will save time next time.

As for pgbench thread forlk emulation induced complexity, I already paid 
for some stuff I contributed to pgbench, so I feel I'm entitled to rant 
about it:-)

> People who work hard to develop a patch

I know. I also spent quite some time on testing alternate implementations 
on this one.

> that does something useful don't deserve to have it kicked to the curb 
> because someone can imagine a development path that would eventually 
> obsolete it. That patch deserves to be evaluated on its own merits.

Sure. I did that in the first post. The code is heavy and awkward, include 
copy-pastes, the merge sorts (there are 2 of thems) complexity are bad, 
the different log formats are hard coded and must be updated in several 
place if modified or extended, ...

IMO this would be best outside of pgbench in a script, but this means that 
the script must be given indication about the file format which depends on 
the options used to run pgbench, which makes it awkward as well.

So the current status is between "rejected" and "returned with feedback", 
make your pick.

Now I think that the feature provides value and I am trying to help 
salvage it. I can do other things.

-- 
Fabien.



Re: getting rid of "thread fork emulation" in pgbench?

From
Robert Haas
Date:
On Mon, Mar 30, 2015 at 2:00 AM, Fabien COELHO <coelho@cri.ensmp.fr> wrote:
>> I really, really wish you'd stop arguing against the patch to allow
>> merging of pgbench logs in this basis.
>
> Hmmm. I'm lost. I thought that discussing how to best implement a feature
> was part of reviewing a patch.

Of course it is.  But the feature you are talking about and the
feature Tomas is talking about are two different features.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company