Thread: Libpq optimization
In the libpq COPY interface function PQputCopyData():

    /*
     * Check for NOTICE messages coming back from the server.  Since the
     * server might generate multiple notices during the COPY, we have to
     * consume those in a reasonably prompt fashion to prevent the comm
     * buffers from filling up and possibly blocking the server.
     */
    if (!PQconsumeInput(conn))
        return -1;              /* I/O failure */
    parseInput(conn);

I moved it to a different location, just a bit further down, after the check for "is the output buffer full and are we ready to flush?" in the same function:

    if ((conn->outBufSize - conn->outCount - 5) < nbytes)
    {
        <here>
        ....
        ....
    }

As the code comment suggests, it is extremely important to consume incoming messages from the server to prevent deadlock. However, we should only need to worry about it before sending data out. Most calls to PQputCopyData don't actually send any data; they just place it in the output buffer and return. Therefore we can perform the PQconsumeInput/parseInput step right before flushing, instead of reading from the server on every call to PQputCopyData even when nothing is sent (which is probably the case 99% of the time).

Right? Or am I missing something?

This change improves COPY performance.

thx
Alon.

"Alon Goldshuv" <agoldshuv@greenplum.com> writes:
> As the code comment suggests, it is extremely important to consume incoming
> messages from the server to prevent deadlock. However we should only worry
> about it before sending data out.

And, unfortunately, you've broken it. The pqFlush call visible in that routine is not the only place that may try to send data (see also pqPutMsgEnd).

regards, tom lane

Tom,

> And, unfortunately, you've broken it. The pqFlush call visible in that
> routine is not the only place that may try to send data (see also
> pqPutMsgEnd).

You are right, thanks for pointing that out.

Still, in pqPutMsgEnd we will be sending data only after 8K is reached, which is about once in 80 for a 100 byte row size...

Alon.
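
The "about once in 80" estimate can be checked with a small toy model. The sketch below is not libpq code: count_sends is a hypothetical helper, and it assumes that each 100-byte row carries 5 bytes of message overhead (suggested by the "- 5" in the buffer check quoted earlier in the thread) and that data is pushed out in whole multiples of the 8K threshold once that much has accumulated.

```c
#include <stddef.h>

/*
 * Toy model, NOT libpq code: count how many times a writer that buffers
 * fixed-size messages would actually push data to the socket.
 *
 * Assumptions (all illustrative):
 *   - every message occupies msg_len bytes in the output buffer,
 *   - nothing is sent until at least `threshold` bytes are buffered,
 *   - a send transmits whole multiples of `threshold` and keeps the
 *     remainder buffered,
 *   - threshold is greater than zero.
 */
static size_t
count_sends(size_t threshold, size_t msg_len, size_t n_msgs)
{
    size_t buffered = 0;    /* bytes currently waiting in the buffer */
    size_t sends = 0;       /* number of times data went out */

    for (size_t i = 0; i < n_msgs; i++)
    {
        buffered += msg_len;
        if (buffered >= threshold)
        {
            buffered %= threshold;  /* keep only the partial block */
            sends++;
        }
    }
    return sends;
}
```

Under these assumptions, count_sends(8192, 105, 8000) returns 102, that is, roughly one send per 78 rows, which is in line with the estimate above. It says nothing about whether skipping the input check on the remaining calls is safe; it only puts a number on how rarely data is actually sent.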