Thread: BUG #7914: pg_dump aborts occasionally
The following bug has been logged on the website:

Bug reference:      7914
Logged by:          Shin-ichi MORITA
Email address:      s-morita@beingcorp.co.jp
PostgreSQL version: 8.4.16
Operating system:   Windows Server 2012
Description:

Hi great team,

I encountered a strange error during pg_dump as below:

----
pg_dump: Dumping the contents of table "t_file_data" failed: PQgetCopyData() failed.
pg_dump: Error message from server: lost synchronization with server: got message type "d", length 21861
pg_dump: The command was: COPY public.t_file_data (c_file_info_id, c_page_no, c_data) TO stdout;
pg_dump: *** aborted because of error
----

It seems that large tables tend to cause this to happen.

In order to figure out what happens, I looked into
src/interfaces/libpq/fe-protocol3.c. I found that it is possible for
PGconn::inBuffer to be unintentionally enlarged during execution of
getCopyDataMessage(). Please look at fe-protocol3.c:1432:

1425:         avail = conn->inEnd - conn->inCursor;
1426:         if (avail < msgLength - 4)
1427:         {
   :
1432:             if (pqCheckInBufferSpace(conn->inCursor + (size_t) msgLength - 4,

When several messages have already been processed and avail < msgLength - 4,
inBuffer will be enlarged without left-justification. I think that
pqCheckInBufferSpace() should be called only when inBuffer is left-justified,
like this:

1432:             if (conn->inStart == 0 && pqCheckInBufferSpace(conn->inCursor + (size_t) msgLength - 4,

When inBuffer is not left-justified at that moment, line 1432 will be
executed again after inBuffer is left-justified in pqReadData(), which is
called by pqGetCopyData3(). Please check this out. Sorry if I am mistaken.
Note that I did not consider the async case.

Thanks and best regards,
s-morita@beingcorp.co.jp writes:
> I encountered a strange error during pg_dump as below:
> ----
> pg_dump: Dumping the contents of table "t_file_data" failed: PQgetCopyData()
> failed.
> pg_dump: Error message from server: lost synchronization with server: got
> message type "d", length 21861

Hm.  Can you create a reproducible test case for this?

> I think that pqCheckInBufferSpace() should be called only when inBuffer is
> left-justified like this:
> 1432: if (conn->inStart == 0 && pqCheckInBufferSpace(conn->inCursor +
> (size_t) msgLength - 4,

I think changing that code that way would break the considerations
mentioned in pqParseInput3; nor is it clear why this would help, anyway.
It doesn't seem likely that the buffer is so large that it can't be
enlarged by another 20K.

			regards, tom lane
Hi Tom,

> s-morita@beingcorp.co.jp writes:
>> I encountered a strange error during pg_dump as below:
>> ----
>> pg_dump: Dumping the contents of table "t_file_data" failed:
>> PQgetCopyData() failed.
>> pg_dump: Error message from server: lost synchronization with server: got
>> message type "d", length 21861
>
> Hm.  Can you create a reproducible test case for this?

This issue does not always happen. It seems to depend on the environment
and the load. My test environment is actually a virtual one as follows:

Hypervisor: VMware vSphere Hypervisor 5.1
Allocated vCPUs: 1
Allocated Memory: 4GB
OS: Windows Server 2012

1. Create test data using the script below.
2. Run pg_dump:
   pg_dump --host localhost --port 5432 --username postgres
   --format custom --verbose --file "..." --table public.t_data test_db

----
CREATE TABLE t_data
(
  c_id integer NOT NULL,
  c_data bytea,
  CONSTRAINT t_data_pkey PRIMARY KEY (c_id)
)
WITH (
  OIDS=FALSE
);

CREATE LANGUAGE plpgsql;

CREATE FUNCTION create_data(n INT) RETURNS VOID AS $$
BEGIN
  FOR i IN 1..n LOOP
    INSERT INTO t_data VALUES (i, rpad('', 20000, 'x')::bytea);
  END LOOP;
END;
$$ LANGUAGE plpgsql;

SELECT create_data(300000);
----

>> I think that pqCheckInBufferSpace() should be called only when inBuffer
>> is left-justified like this:
>> 1432: if (conn->inStart == 0 && pqCheckInBufferSpace(conn->inCursor +
>> (size_t) msgLength - 4,
>
> I think changing that code that way would break the considerations
> mentioned in pqParseInput3; nor is it clear why this would help, anyway.
> It doesn't seem likely that the buffer is so large that it can't be
> enlarged by another 20K.

I think this issue happens when pg_dump is slower than the backend for
some reason. Actually, pg_dump's memory consumption grows beyond 500MB
when this issue is happening. If pg_dump is slower than the backend and
each message is not exactly 8K, inBuffer will frequently overflow in
getCopyDataMessage().
This will cause frequent calls to pqCheckInBufferSpace() without
left-justification. Each call will double inBuffer, so inBuffer will grow
rapidly even if each message is only 20K.

> regards, tom lane

Regards,
Shin-ichi MORITA
"Shin-ichi MORITA" <s-morita@beingcorp.co.jp> writes:
> Hi Tom,
>> Hm.  Can you create a reproducible test case for this?

> I think this issue happens when pg_dump is slower than the backend
> for some reason.

If so, perhaps injecting a sleep() delay into the right place in pg_dump
or libpq would make it reproducible?  I wouldn't have any problem
crediting a test with such modified code as being a valid test.

			regards, tom lane
>>> Hm.  Can you create a reproducible test case for this?

>> I think this issue happens when pg_dump is slower than the backend
>> for some reason.
>
> If so, perhaps injecting a sleep() delay into the right place in pg_dump
> or libpq would make it reproducible?  I wouldn't have any problem
> crediting a test with such modified code as being a valid test.

I think it's a good idea. Although I cannot confirm this by actually
modifying the code right now, somewhere in the loop calling
PQgetCopyData() in pg_dump.c would be good:

pg_dump.c:
1348:   for (;;)
1349:   {
1350:       ret = PQgetCopyData(conn, &copybuf, 0);
1351:
1352:       if (ret < 0)
1353:           break;          /* done or error */
1354:
1355:       if (copybuf)
1356:       {
1357:           WriteData(fout, copybuf, ret);
1358:           PQfreemem(copybuf);
1359:       }

            sleep(1);           /* maybe */

An alternative way would be running pg_dump with a lower priority.
Actually, I can reproduce this issue by setting the priority of pg_dump
to Low using Windows Task Manager on a single-processor environment.

Regards,
Shin-ichi MORITA
"Shin-ichi MORITA" <s-morita@beingcorp.co.jp> writes:
>> If so, perhaps injecting a sleep() delay into the right place in pg_dump
>> or libpq would make it reproducible?

> An alternative way would be running pg_dump with a lower priority.
> Actually, I can reproduce this issue by setting the priority of
> pg_dump to Low using Windows Task Manager
> on the "single processor" environment.

I tried nice'ing pg_dump without any success in making it bloat memory.
I'm suspicious that there's something Windows-specific in what you're
seeing, because it's fairly hard to credit that nobody's seen this
problem in the ten years or so that libpq has been doing it like that.

Can anyone else manage to reproduce a similar behavior?

			regards, tom lane
> I tried nice'ing pg_dump without any success in making it bloat memory.
> I'm suspicious that there's something Windows-specific in what you're
> seeing, because it's fairly hard to credit that nobody's seen this
> problem in the ten years or so that libpq has been doing it like that.
>
> Can anyone else manage to reproduce a similar behavior?

I tried several other conditions as well, including Linux, 32-bit OS,
non-virtual environments, and so on. However, so far I have not been able
to reproduce this issue with any conditions other than the following
combination:

Hypervisor: VMware vSphere Hypervisor 5.1
Allocated vCPUs: 1
Allocated Memory: more than 2GB
OS: Windows Server 2012 or 2008R2 (64bit)
Run pg_dump under high load, or run pg_dump with the Low priority
using Windows Task Manager

This might be a rare case. By the way, we have already worked around this
issue by adding --inserts.

Thank you.

Best regards,
Shin-ichi MORITA
I wrote:
> "Shin-ichi MORITA" <s-morita@beingcorp.co.jp> writes:
>>> If so, perhaps injecting a sleep() delay into the right place in pg_dump
>>> or libpq would make it reproducible?

>> An alternative way would be running pg_dump with a lower priority.
>> Actually, I can reproduce this issue by setting the priority of
>> pg_dump to Low using Windows Task Manager
>> on the "single processor" environment.

> I tried nice'ing pg_dump without any success in making it bloat memory.
> I'm suspicious that there's something Windows-specific in what you're
> seeing, because it's fairly hard to credit that nobody's seen this
> problem in the ten years or so that libpq has been doing it like that.

> Can anyone else manage to reproduce a similar behavior?

I started thinking about this issue again after seeing a similar report at
http://www.postgresql.org/message-id/flat/850BF81CE7324F81AF2A44C8660CF2EE@dell2
and this time I think I see the problem.  Look at the loop in pqReadData()
created by this code:

	/*
	 * Hack to deal with the fact that some kernels will only give us back
	 * 1 packet per recv() call, even if we asked for more and there is
	 * more available.  If it looks like we are reading a long message,
	 * loop back to recv() again immediately, until we run out of data or
	 * buffer space.  Without this, the block-and-restart behavior of
	 * libpq's higher levels leads to O(N^2) performance on long messages.
	 *
	 * Since we left-justified the data above, conn->inEnd gives the
	 * amount of data already read in the current message.  We consider
	 * the message "long" once we have acquired 32k ...
	 */
	if (conn->inEnd > 32768 &&
		(conn->inBufSize - conn->inEnd) >= 8192)
	{
		someread = 1;
		goto retry3;
	}

Once we have more than 32K in the input buffer, this logic will loop until
we either fill the buffer (to within 8K anyway) or get ahead of the sender
(so that recv() returns zero).  Then, we go and process all the message(s)
that we just read.
If what we have includes some message(s) and then an incomplete message at
the end of the buffer, we'll call pqCheckInBufferSpace, which is quite
likely to result in doubling the input buffer size, even though the
incomplete message might be much smaller than the buffer size.  That's fine
so far, but then when we go back to pqReadData, it's capable of filling
that enlarged buffer again.  Lather, rinse, repeat.

I've been able to reproduce buffer bloat by inserting some delay into this
loop (I put "usleep(10000);" in front of the goto above) and then sending a
steady stream of circa-50KB data messages.

To see any significant buffer bloat in practice, you need an environment
where the sender + network is consistently faster than the receiver.
That's fairly unlikely if the sender and receiver are on different
machines, just because that recv() loop in pqReadData is probably faster
than the network connection.  It seems also unlikely when they're on the
same machine: that recv() loop is certainly faster than any data-sending
loop in the server, so you'd need some scheduler effect that consistently
gives the backend process priority over the pg_dump process, which is
exactly the opposite of what generally happens on Unix machines.  Perhaps
Windows schedules differently?  Or maybe the reason we're seeing this now,
after not seeing it for many years, is that networks are getting faster.

Anyway, I still don't like your original suggestion of simply skipping the
pqCheckInBufferSpace call; that's just losing the knowledge we have of how
big the next message will be.  Rather I think what we are missing here is
that we should deduct the excess space to the left of inStart, and/or
forcibly left-justify the buffer, before deciding that we need to enlarge
the buffer.  I think it'd be safe for pqCheckInBufferSpace to do that
(ie move the data within the buffer), though I need to look at the callers
more closely to be sure.

			regards, tom lane
Hi Tom,

Thank you for the information. And I'm sorry for not replying sooner.
It took time to recall this issue.

(2014/05/08 8:58), Tom Lane wrote:
> I started thinking about this issue again after seeing a similar report at
> http://www.postgresql.org/message-id/flat/850BF81CE7324F81AF2A44C8660CF2EE@dell2
> and this time I think I see the problem.  Look at the loop in pqReadData()
> created by this code:
>
> I've been able to reproduce buffer bloat by inserting some delay into
> this loop (I put "usleep(10000);" in front of the goto above) and then
> sending a steady stream of circa-50KB data messages.

Oh, I'm glad to hear that. Thank you for the explanation.

> exactly the opposite of what generally happens on Unix machines.  Perhaps
> Windows schedules differently?  Or maybe the reason we're seeing this now,

It might be so. Although I'm not sure why it occurred, I think our
conditions were rare:

Hypervisor: VMware vSphere Hypervisor 5.1
Allocated vCPUs: 1
OS: Windows Server 2012 or 2008R2 (64bit)
Run pg_dump under high load, or run pg_dump with the Low priority
using Windows Task Manager

> Anyway, I still don't like your original suggestion of simply skipping
> the pqCheckInBufferSpace call; that's just losing the knowledge we have of
> how big the next message will be.  Rather I think what we are missing here
> is that we should deduct the excess space to the left of inStart, and/or
> forcibly left-justify the buffer, before deciding that we need to enlarge
> the buffer.  I think it'd be safe for pqCheckInBufferSpace to do that
> (ie move the data within the buffer), though I need to look at the callers
> more closely to be sure.

I agree with you. The new pqCheckInBufferSpace would update inStart,
inEnd, and inCursor. I think it would be OK unless the callers assume
they won't be updated.

Thank you.

Shin-ichi