Thread: TRUNCATE HANGS
Hello, I'm running into a problem with truncate causing my test application to fail. I've noticed some similar problems posted but noting that quite lines up with what I'm seeing. I have a partitioned table. Storing m records per partition, where I use a sequence table to keep track of the over all record count and a sequence table to track records per partition, using is_called column to indicate data in the partition. When an insert occurs to partition N, the sequence table for partition N+1 (wrap to zero as needed) is checked. If is_called is true, partition N+1 is truncated. The insert to partition N is based on the total record count. To invoke the truncation, a truncate list table is update. This table is read by a scheduled process (cron) which actually invokes TRUNCATE on the named table. For test purposes, the scope is 10 million records, 900K per partition. The test itself is two threads. Thread A is inserting records in a fairly tight loop. About 3k per second. Thread B is looks for the MIN, and MAX keys then randomly selects a chunk from the table set. There is no correlation between the record key and record count, CHECK constraints will not apply. For my purposes pg index searches to read the data without CHECK is reasonable. Running this test, a deadlock can be created without fail. Postgres processes always look like: postgres 16989 26837 64 11:20 ? 00:42:06 postgres: postgres uisdb 127.0.0.1(34404) INSERT waiting postgres 16990 26837 44 11:20 ? 00:28:51 postgres: postgres uisdb 127.0.0.1(34405) idle in transaction postgres 16991 26837 5 11:20 ? 00:03:43 postgres: postgres uisdb 127.0.0.1(34406) PARSE waiting root 16992 28129 0 11:20 pts/3 00:00:00 psql -U postgres -d uisdb postgres 16993 26837 0 11:20 ? 00:00:00 postgres: postgres uisdb [local] idle postgres 17801 17800 0 12:09 ? 00:00:00 /bin/bash /var/lib/pgsql/data/scripts/partition_rolling.sh postgres 17813 17801 0 12:09 ? 00:00:00 psql -U postgres -d uisdb -t --command TRUNCATE ONLY contact_reports_part0; postgres 17814 26837 0 12:09 ? 00:00:00 postgres: postgres uisdb [local] TRUNCATE TABLE waiting root 18074 10513 0 12:25 pts/4 00:00:00 grep post postgres 26837 1 0 Dec03 ? 00:00:02 /usr/bin/postmaster -p 5432 -D /var/lib/pgsql/data postgres 26839 26837 0 Dec03 ? 00:00:00 postgres: logger process postgres 26841 26837 0 Dec03 ? 00:00:14 postgres: writer process postgres 26842 26837 0 Dec03 ? 00:00:03 postgres: wal writer process postgres 26843 26837 0 Dec03 ? 00:00:00 postgres: autovacuum launcher process postgres 26844 26837 0 Dec03 ? 00:00:02 postgres: stats collector process I'm new to pg, still working thru pg_locks etc. However, I can line up the pg_locks where not granted to these insert, pending and truncate processes above. When I kill the 'idle' process, things unblock, but I'm at loss for what is the underlying cause of this problem. Thanks in advance for any suggestions. Tom -- View this message in context: http://postgresql.1045698.n5.nabble.com/TRUNCATE-HANGS-tp3292333p3292333.html Sent from the PostgreSQL - bugs mailing list archive at Nabble.com.
tmoore wrote: > Running this test, a deadlock can be created without fail. You haven't shown any evidence of a deadlock -- just blocking. That's not at all the same thing. > postgres 16990 26837 44 11:20 ? 00:28:51 postgres: postgres uisdb > 127.0.0.1(34405) idle in transaction > postgres 16993 26837 0 11:20 ? 00:00:00 postgres: postgres > uisdb [local] idle > When I kill the 'idle' process, things unblock Are you sure it isn't the "idle in transaction" one that matters? What happens if you just COMMIT it? > I'm at loss for what is the underlying cause of this problem. You should be able to tell from pg_locks which transaction is blocking what. This doesn't look like a bug. If you still have problems solving this, you should probably start a new thread on pgsql-general or pgsql-novice. If you can create a small self-contained test case, which starts with creating and populating tables, so that others can reproduce your issue, you'll be able to get more specific help. -Kevin
Can I ask you to elaborate on one of your comments before I move to the general thread ? You mention committing the 'idle thread'. The test is not doing any transaction based work, the write functions just do sql insert, no begin commit blocks at the application level. Any tips on interpreting pg_locks ? Last thing, the test was modified such that only insert and truncate activity was done, no selects. No blocking occurred. Tom -- View this message in context: http://postgresql.1045698.n5.nabble.com/TRUNCATE-HANGS-tp3292333p3292403.html Sent from the PostgreSQL - bugs mailing list archive at Nabble.com.
tmoore wrote: > The test is not doing any transaction based work, the write > functions just do sql insert, no begin commit blocks at the > application level. Well something is starting transactions; otherwise you wouldn't have a transaction sitting "idle in transaction". Are you perhaps accessing the database through Java with setAutoCommit(false)? (This is one of many ways you could have a transaction pending.) Whatever is starting the transactions is connecting through the local loopback address, since that connection is showing IP address 127.0.0.1. It's connecting to uisdb as the postgres user. Other connections are not using TCP connections, which is why I suspected it might be a Java process -- the JDBC driver always uses a TCP connection. > Any tips on interpreting pg_locks ? http://wiki.postgresql.org/wiki/Lock_Monitoring > No blocking occurred. I'm not following you. Originally you were saying that there was a deadlock, which is a set of blocks which form a cycle. You also said that things were stuck until you killed a particular connection, at which point things got moving again, which at least suggests the possibility of blocking. One of the connections you showed reported that it was waiting, which I think means it is blocked. You showed one process idle in transaction, and most locks are released at transaction completion. Why do you say there was no blocking? -Kevin
Thanks for the link. Yes, the test is java with autocommit false, on the localhost is involved. To clarify the no blocking comment, let me describe the test again. Three threads of executions, java writer, java reader, and psql via cron for truncation. As an experiment, the java reader was disabled and the test did not block. Tom -- View this message in context: http://postgresql.1045698.n5.nabble.com/TRUNCATE-HANGS-tp3292333p3292460.html Sent from the PostgreSQL - bugs mailing list archive at Nabble.com.
I'm a little tired. I just indicated autocommit false, that was incorrect. Autocommit is true. Sorry. Thanks again for the comment.s Tom -- View this message in context: http://postgresql.1045698.n5.nabble.com/TRUNCATE-HANGS-tp3292333p3292468.html Sent from the PostgreSQL - bugs mailing list archive at Nabble.com.
tmoore wrote: > I'm a little tired. I just indicated autocommit false, that was > incorrect. Autocommit is true. I would double-check that. As I said, what you showed us indicates that something logging in through TCP on localhost is using transactions, and not committing them when needed. > Three threads of executions, java writer, java reader, and psql via > cron for truncation. As an experiment, the java reader was disabled > and the test did not block. TRUNCATE will block or be blocked by just about anything, read or write. A reading transaction won't release the conflicting lock until the transaction ends. -Kevin
I did that, and then realized it was time to call it an afternoon ... Your comments have been very helpful. Thanks Tom -- View this message in context: http://postgresql.1045698.n5.nabble.com/TRUNCATE-HANGS-tp3292333p3294007.html Sent from the PostgreSQL - bugs mailing list archive at Nabble.com.