Thread: RE: [HACKERS] Postgres Speed or lack thereof
> The other thing that jumps out here is the unreasonably high position of
> recv(), which is called 962187 times. The script being read by psql was
> only 957186 characters. Evidently we're invoking a kernel recv() call
> once per character read from the frontend. I suspect this is an
> inefficiency introduced by Magnus Hagander's recent rewrite of backend
> libpq (see, I told you there was a reason for using stdio ;-)). We're
> gonna have to do something about that, though it's not as critical as
> the memory-allocation issue.

Could be because of that. I noticed that the backend calls pq_getchar() a
_lot_ of times, looping for reading a single character. It did that before
too. The difference was that pq_getchar() called fgetc() then, and calls
recv() now.

I don't know, maybe recv() is more expensive than fgetc()? But I really
can't see any reason it should be called more often now than before.

An interesting fact is that pq_getchar() doesn't show up at all. Could be
because it's fast, but still executed many times, right? Or it could be that
the 'inner loops' in pq_getchar(), pq_peekchar(), or pqGetNBytes() don't
work as expected. On my system (Linux 2.2), I only get one recv() call for
each entry into these functions - as it should be - might it be different on
yours?

Ok, so I give up, perhaps we need a buffer after all :-)

//Magnus
> Could be because of that. I noticed that the backend calls pq_getchar() a
> _lot_ of times, looping for reading a single character. It did that before
> too. The difference was that pq_getchar() called fgetc() then, and calls
> recv() now.
> I don't know, maybe recv() is more expensive than fgetc()? But I really
> can't see any reason it should be called more often now than before.
> An interesting fact is that pq_getchar() doesn't show up at all. Could be
> because it's fast, but still executed many times, right? Or it could be that
> the 'inner loops' in pq_getchar(), pq_peekchar(), or pqGetNBytes() don't
> work as expected. On my system (Linux 2.2), I only get one recv() call for
> each entry into these functions - as it should be - might it be different on
> yours?

It is very possible that fgetc() is a macro on your platform. See
/usr/include/stdio.h. If so, it has no function call overhead.

On BSD/OS, it used to be a macro, but now with threads, it is not. They
have a macro version, but it is under a different name.

--
Bruce Momjian                          |  http://www.op.net/~candle
maillist@candle.pha.pa.us              |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
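For anyone who would rather check this from code than read the header by
hand, here is a minimal sketch in C. It assumes only the standard <stdio.h>;
the helper names getc_is_macro() and fgetc_is_macro() are made up for this
example. Note that #ifdef only reports whether a macro of that name is
defined on the platform, not how cheap its expansion is.

```c
#include <stdio.h>

/*
 * Hypothetical helpers: report whether this platform's <stdio.h>
 * defines getc / fgetc as preprocessor macros.
 */
int
getc_is_macro(void)
{
#ifdef getc
    return 1;                   /* a macro named getc is defined */
#else
    return 0;                   /* getc is only available as a function */
#endif
}

int
fgetc_is_macro(void)
{
#ifdef fgetc
    return 1;                   /* a macro named fgetc is defined */
#else
    return 0;                   /* fgetc is only available as a function */
#endif
}
```

The results differ between C libraries, so no particular answer should be
expected; the point is only that the question can be settled per platform.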
Magnus Hagander <mha@sollentuna.net> writes:
> I don't know, maybe recv() is more expensive than fgetc()?

Vastly. recv() is a kernel call. You have the overhead of getting
control into the kernel, normally several times more expensive than a
function call; of passing parameters back and forth from user space to
kernel space (for example, the kernel will probably have to translate
and range-check the buffer pointer you pass it, to ensure you can't fool
the kernel into scribbling on some other process's memory); of verifying
that the descriptor number you pass is open and you have permission to
read it; and of finding the associated data buffer. Plus the scheduler
may run to reconsider whether to give control back to you, or switch off
to another user process. Etc etc etc.

fgetc() is a plain C function that normally just has to fetch the next
byte out of a buffer that's already been read into your address space
--- that is, when you're using stdio, you pay all the above-described
kernel interaction overhead once per bufferload, not once per character.
If you use getc(), which is allowed to be a macro, you don't even pay
the function-call overhead; that form is probably less than a dozen
instructions, except when the buffer is empty.

Judging by the profile numbers, recv()'ing a single character takes
close to 1400 instructions on my system.

> An interesting fact is that pq_getchar() doesn't show up at all. Could be
> because it's fast, but still executed many times, right?

Right, it doesn't run long enough to get itself into the top functions.
It's there though --- the dynamic profile shows:

-----------------------------------------------
                0.03   18.83    5001/5001        ReadCommand [27]
[28]    10.4    0.03   18.83    5001         SocketBackend [28]
                0.00   18.73    5000/5000        pq_getstr [30]
                0.01    0.09    5001/5001        pq_getnchar [303]
-----------------------------------------------
                0.00   18.73    5000/5000        SocketBackend [28]
[30]    10.4    0.00   18.73    5000         pq_getstr [30]
                0.11   18.62    5000/5000        pqGetString [29]
-----------------------------------------------
                0.11   18.62    5000/5000        pq_getstr [30]
[29]    10.4    0.11   18.62    5000         pqGetString [29]
                0.47   18.15  957186/957186      pq_getchar [31]
-----------------------------------------------
                0.47   18.15  957186/957186      pqGetString [29]
[31]    10.3    0.47   18.15  957186         pq_getchar [31]
               18.15    0.00  957186/962187      recv [32]
-----------------------------------------------
                0.09    0.00    5001/962187      pqGetNBytes [315]
               18.15    0.00  957186/962187      pq_getchar [31]
[32]    10.1   18.24    0.00  962187         recv [32]
-----------------------------------------------

In the old code with fgetc(), the execution time of fgetc() was probably
not much worse than pq_getchar --- i.e., about half a second, not 18 seconds,
for this test sequence...

What we need to do here is to re-introduce the buffering ability of
stdio into backend libpq. If you compare the current frontend libpq,
you'll notice that it reads or writes the socket a bufferload at a time,
not a character at a time.

			regards, tom lane
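As a rough illustration of the bufferload-at-a-time idea, here is a minimal
sketch in C. It is not the backend libpq code, nor the fix that was
eventually applied; InBuf, inbuf_init(), inbuf_getchar(), INBUF_SIZE and the
recv_calls counter are all hypothetical names invented for this example. The
only point is that recv() is called when the buffer is empty, and single
characters are otherwise handed out from user space.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

#define INBUF_SIZE 8192         /* assumed size, not taken from the thread */

/* Hypothetical input buffer wrapped around a connected socket. */
typedef struct
{
    int           fd;           /* socket descriptor to read from */
    size_t        pos;          /* index of the next unread byte */
    size_t        len;          /* number of valid bytes in data[] */
    unsigned long recv_calls;   /* kernel calls made so far (for measuring) */
    unsigned char data[INBUF_SIZE];
} InBuf;

void
inbuf_init(InBuf *b, int fd)
{
    assert(b != NULL);
    b->fd = fd;
    b->pos = 0;
    b->len = 0;
    b->recv_calls = 0;
}

/*
 * Return the next byte as an unsigned char value, or -1 on end of
 * stream or error.  recv() is only called when the buffer is empty.
 */
int
inbuf_getchar(InBuf *b)
{
    if (b->pos >= b->len)
    {
        ssize_t n;

        /* Refill: one kernel call for up to INBUF_SIZE bytes. */
        do
        {
            n = recv(b->fd, b->data, sizeof(b->data), 0);
            b->recv_calls++;
        } while (n < 0 && errno == EINTR);      /* retry if interrupted */

        if (n <= 0)
            return -1;          /* peer closed the connection, or error */

        b->len = (size_t) n;
        b->pos = 0;
    }
    return b->data[b->pos++];
}
```

With something along these lines, a 957186-character script would need on
the order of 957186 / 8192, or roughly 117, recv() calls if every call
filled the buffer. In practice it would be more, since recv() returns
whatever has arrived so far, but still far fewer than one call per
character.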
> What we need to do here is to re-introduce the buffering ability of
> stdio into backend libpq. If you compare the current frontend libpq,
> you'll notice that it reads or writes the socket a bufferload at a time,
> not a character at a time.

FYI, I never profile massive inserts. I usually profile just SELECT
statements on large tables.

--
Bruce Momjian