Thread: [HACKERS] multithreading in Batch/pipelining mode for libpq

[HACKERS] multithreading in Batch/pipelining mode for libpq

From: Ilya Roublev
Hello, 

And I'm wondering what the possibilities are to speed up the execution of asynchronous queries even further. What I need is to perform a huge number of inserts (or other sorts of queries that either return no results or whose results I'd like to ignore completely) very quickly, so the idea of using batch mode sounds rather good to me, taking into account https://blog.2ndquadrant.com/postgresql-latency-pipelining-batching/

So I'd like to create the corresponding prepared statement and execute all these queries asynchronously through it. What I'm trying to understand now (at least in general) is:
1) is it possible technically (possibly by changing some part of the libpq code) to ignore results (especially for queries like INSERT), handling separately the case where an error occurs?
2) if the answer to the previous question is negative, is it possible to send asynchronous queries in one thread while reading results in another thread? I understand that this immediately contradicts the requirement not to use the same PGconn object in different threads. But maybe some locking of the PGconn object is possible, so that its state cannot be modified in one thread while another holds the lock, and vice versa?

Naturally I failed to implement this without locking: the info in PGconn very quickly becomes inconsistent, the number of queries sent no longer corresponds to the number of results to be read, and so on. So first of all I'd like to know whether this is possible at all (perhaps with some changes to libpq). Sorry if my idea sounds rather naive, and thanks for your answers and advice.

If all this does not sound too naive and something can be done, I'd like to ask for some details, if you do not mind.

Thank you very much again in advance.

With my best regards,
Ilya

Re: [HACKERS] multithreading in Batch/pipelining mode for libpq

From: Craig Ringer


On 22 Apr. 2017 6:04 am, "Ilya Roublev" <iroublev@gmail.com> wrote:

1) is it possible technically (possibly by changing some part of the libpq code) to ignore results (especially for queries like INSERT), handling separately the case where an error occurs?

There is a patch out there to allow libpq result processing by callback, I think. It might be roughly what you want.

2) if the answer to the previous question is negative, is it possible to send asynchronous queries in one thread while reading results in another thread?

Not right now. libpq's state tracking wouldn't cope.

I imagine it could be modified to work with some significant refactoring. You'd need to track state with a shared fifo of some kind, where dispatch pushes queries onto the fifo as it sends them and receive pops them off as it reads their results.

I started on that for the batch mode stuff, but it's not in any way thread-safe there.
 

Naturally I failed to implement this without locking: the info in PGconn very quickly becomes inconsistent, the number of queries sent no longer corresponds to the number of results to be read, and so on. So first of all I'd like to know whether this is possible at all (perhaps with some changes to libpq).

Yeah, it's possible. The protocol can handle it, it's just libpq that can't.

Re: [HACKERS] multithreading in Batch/pipelining mode for libpq

From: Andres Freund
On 2017-04-22 09:14:50 +0800, Craig Ringer wrote:
> 2) if the answer to the previous question is negative, is it possible to
> send asynchronous queries in one thread while reading results in another
> thread?
> 
> 
> Not right now. libpq's state tracking wouldn't cope.
> 
> I imagine it could be modified to work with some significant refactoring.
> You'd need to track state with a shared fifo of some kind, where dispatch
> pushes queries onto the fifo as it sends them and receive pops them off
> as it reads their results.

FWIW, I think it'd be a *SERIOUSLY* bad idea trying to make individual
PGconn interactions threadsafe. It'd imply significant overhead in a lot
of situations, and programming it would have to become a lot more
complicated (since you need to synchronize command submission between
threads).   For almost all cases it's better to either use multiple
connections or use a coarse grained mutex around all of libpq.

- Andres



Re: [HACKERS] multithreading in Batch/pipelining mode for libpq

From: Greg Stark
On 21 April 2017 at 21:31, Ilya Roublev <iroublev@gmail.com> wrote:
> What I need is to make a huge amount of inserts

This may be a silly question but I assume you've already considered
using server-side COPY? That's the most efficient way to load a lot of
data currently.


-- 
greg