repeated pointless memmove() calls in pqReadData() - Mailing list pgsql-bugs
From | Brent Ewing |
---|---|
Subject | repeated pointless memmove() calls in pqReadData() |
Date | |
Msg-id | 20010522155708.A3583@nootka.mbt.washington.edu Whole thread Raw |
Responses |
Re: repeated pointless memmove() calls in pqReadData()
|
List | pgsql-bugs |
22 May 2001 Dear Sir; First, thank you for PostGres. It is great! Second, I am puzzled by repeated, apparently pointless calls to memmove() in the function pqReadData, which is in `src/interfaces/libpq/fe-misc.c'. I am running and looking at PostGres version 7.1.1 built on a Alpha running Digital UNIX. I use the Digital compilers to build PostGres. I created a table using the `script' -- Create genome_seq table -- -- genome_seq contains information extracted from NCBI H sapiens genome contigs -- -- table label -- ----------- -- accession -- version -- organism -- chromosome -- comments -- sequence -- length -- CREATE TABLE genome_seq ( accession text, version text, organism text, chromosome text, comments text, sequence text, length int4 ) -- GENOME_SEQ.TABLE (END) The `sequence' can be a long string. I have sequence strings of tens of millions of characters. I create a file of the rows and load the data using the `COPY FROM' SQL command for performance as suggested in the documents. (I believe this is irrelevant.) I am using PostGres and the libpq library calls to get the sequence quickly. It worked well for sequences of 150k characters but bogged down for the 10M character strings, taking a minute or so, which seemed excessive to me. I used top to watch system activity and noticed that the the client was getting essentially all the CPU cycles, and the backend was getting essentially nothing. I did not understand why the client needs to spend a lot of time with long text strings so I repeated this with the `psql' interface and found the same phenomenon, which makes sense because it uses libpq too. I ran both my program and psql under dbx and interrupted it repeatedly while it ate up CPU cycles and found the function stack looked like genbank_db=# select sequence from genome_seq where version = 'NT_009151.3'; Interrupt signal Interrupt at >*[memmove, 0x3ff81499998] subq r5, 0x8, r5 (dbx) where > 0 memmove(0x0, 0x0, 0x0, 0x0, 0x0) [0x3ff81499998] 1 pqReadData(0x0, 0x0, 0x3ffbffe8880, 0x14000c400, 0x0) [0x3ffbffeb16c] 2 PQgetResult(0x3ffbffe8880, 0x14000c400, 0x0, 0x100400000, 0x3ffbffe8aa0) [0x3ffbffe887c] 3 PQexec(0x14000a4c0, 0x0, 0x140000a00, 0x14000e3c0, 0x0) [0x3ffbffe8a9c] 4 SendQuery(0x0, 0x14000eb40, 0x12000b410, 0x14000eb40, 0x3b) [0x120009024] 5 MainLoop(0x14000d320, 0x3e00000000, 0x14000eb40, 0x100000000, 0x1) [0x12000b42c] 6 main(0x0, 0x0, 0x0, 0x0, 0x100000002) [0x12000d364] (dbx) which puzzled me. I added diagnostics to pqReadData() so the function looks like . . . int pqReadData(PGconn *conn) { int someread = 0; int nread; static int mncount = 0; static long onumbyte = 0; fprintf( stderr, "pqReadData: start\n" ); if (conn->sock < 0) { printfPQExpBuffer(&conn->errorMessage, "pqReadData() -- connection not open\n"); return -1; } /* Left-justify any data in the buffer to make room */ if (conn->inStart < conn->inEnd) { /* if( conn->inStart != 0 ) { */ memmove(conn->inBuffer, conn->inBuffer + conn->inStart, conn->inEnd - conn->inStart); ++mncount; fprintf( stderr, "pqReadData: count: %d memmove( dst: %ld src: %ld numbyt: %ld ) del_numbyt: %ld inStart: %ld\n", mncount, (long)conn->inBuffer, (long)( conn->inBuffer + conn->inStart ), (long)( conn->inEnd - conn->inStart ), (long)( conn->inEnd - conn->inStart ) - onumbyte, (long)( conn->inStart ) ); onumbyte = conn->inEnd - conn->inStart; /* } */ conn->inEnd -= conn->inStart; conn->inCursor -= conn->inStart; conn->inStart = 0; } . . . . I start up psql and make a query select sequence from genome_seq where version = 'NT_009151.3'; and the resulting stderr looks like startup: begin: stderr pqReadData: start pqReadData: start pqReadData: start pqReadData: start pqReadData: start pqReadData: count: 1 memmove( dst: 5368891392 src: 5368891421 numbyt: 8163 ) del_numbyt: 8163 inStart: 29 pqReadData: start pqReadData: count: 2 memmove( dst: 5368891392 src: 5368891392 numbyt: 16355 ) del_numbyt: 8192 inStart: 0 src/interfaces/libpq/fe-misc.c: realloc mem pqReadData: start pqReadData: count: 3 memmove( dst: 5381845472 src: 5381845472 numbyt: 24547 ) del_numbyt: 8192 inStart: 0 pqReadData: start pqReadData: count: 4 memmove( dst: 5381845472 src: 5381845472 numbyt: 32739 ) del_numbyt: 8192 inStart: 0 src/interfaces/libpq/fe-misc.c: realloc mem pqReadData: start pqReadData: count: 5 memmove( dst: 5381878272 src: 5381878272 numbyt: 40931 ) del_numbyt: 8192 inStart: 0 pqReadData: start pqReadData: count: 6 memmove( dst: 5381878272 src: 5381878272 numbyt: 49123 ) del_numbyt: 8192 inStart: 0 pqReadData: start pqReadData: count: 7 memmove( dst: 5381878272 src: 5381878272 numbyt: 57315 ) del_numbyt: 8192 inStart: 0 pqReadData: start pqReadData: count: 8 memmove( dst: 5381878272 src: 5381878272 numbyt: 65507 ) del_numbyt: 8192 inStart: 0 src/interfaces/libpq/fe-misc.c: realloc mem pqReadData: start pqReadData: count: 9 memmove( dst: 5381943840 src: 5381943840 numbyt: 73699 ) del_numbyt: 8192 inStart: 0 pqReadData: start pqReadData: count: 10 memmove( dst: 5381943840 src: 5381943840 numbyt: 81891 ) del_numbyt: 8192 inStart: 0 pqReadData: start pqReadData: count: 11 memmove( dst: 5381943840 src: 5381943840 numbyt: 90083 ) del_numbyt: 8192 inStart: 0 pqReadData: start pqReadData: count: 12 memmove( dst: 5381943840 src: 5381943840 numbyt: 98275 ) del_numbyt: 8192 inStart: 0 pqReadData: start pqReadData: count: 13 memmove( dst: 5381943840 src: 5381943840 numbyt: 106467 ) del_numbyt: 8192 inStart: 0 . . . (many lines omitted) . pqReadData: start pqReadData: count: 1562 memmove( dst: 5398590208 src: 5398590208 numbyt: 12804067 ) del_numbyt: 8192 inStart: 0 pqReadData: start pqReadData: count: 1563 memmove( dst: 5398590208 src: 5398590208 numbyt: 12812259 ) del_numbyt: 8192 inStart: 0 pqReadData: start pqReadData: count: 1564 memmove( dst: 5398590208 src: 5398590208 numbyt: 12820451 ) del_numbyt: 8192 inStart: 0 pqReadData: start pqReadData: count: 1565 memmove( dst: 5398590208 src: 5398590208 numbyt: 12828643 ) del_numbyt: 8192 inStart: 0 pqReadData: start pqReadData: count: 1566 memmove( dst: 5398590208 src: 5398590208 numbyt: 12836835 ) del_numbyt: 8192 inStart: 0 pqReadData: start pqReadData: count: 1567 memmove( dst: 5398590208 src: 5398590208 numbyt: 12845027 ) del_numbyt: 8192 inStart: 0 pqReadData: start pqReadData: count: 1568 memmove( dst: 5398590208 src: 5398590208 numbyt: 12853219 ) del_numbyt: 8192 inStart: 0 pqReadData: start pqReadData: count: 1569 memmove( dst: 5398590208 src: 5398590208 numbyt: 12861411 ) del_numbyt: 8192 inStart: 0 pqReadData: start pqReadData: count: 1570 memmove( dst: 5398590208 src: 5398590208 numbyt: 12869603 ) del_numbyt: 8192 inStart: 0 pqReadData: start pqReadData: count: 1571 memmove( dst: 5398590208 src: 5398590208 numbyt: 12877795 ) del_numbyt: 8192 inStart: 0 pqReadData: start pqReadData: count: 1572 memmove( dst: 5398590208 src: 5398590208 numbyt: 12885987 ) del_numbyt: 8192 inStart: 0 pqReadData: start pqReadData: count: 1573 memmove( dst: 5398590208 src: 5398590208 numbyt: 12894179 ) del_numbyt: 8192 inStart: 0 pqReadData: start pqReadData: count: 1574 memmove( dst: 5398590208 src: 5398590208 numbyt: 12902371 ) del_numbyt: 8192 inStart: 0 Incidentally, the length of sequence is 12908986 characters. It appears to me that the memmove is moving data from and to the same memory locations ~1500 times incrementing the number of bytes moved by 8K each time. I tried skipping the memmove when the conn->inStart == 0 and found that the output sequence is identical to when it is not skipped, as I expected. I appreciate your consideration. Best Wishes, Brent Ewing -- -------------------------------- Brent Ewing bge@u.washington.edu [206] 616 6040 University of Washington Box 352145 Seattle, Washington 98195-2145 --------------------------------
pgsql-bugs by date: