Thread: Query using partitioned table hangs

Query using partitioned table hangs

From
Rob W
Date:
A re-post, since I'm really stuck on this and could use some advice on how to troubleshoot this...

I have an app that was previously using a large unpartitioned table with no problems. I partitioned this table and am
nowexperiencing intermittent hangs when inserting data into the partitioned table. The stored procedure that does the
insertseems to run to completion even when it 'hangs'. There are no messages in the log file, no errors/exceptions that
I'maware of, and I'm at a loss as to what is causing this, so any help would be much appreciated. 

Here's the details:

PostgreSQL version: 8.4
OS: Reproduced on both, RHEL 5.3 and OS X 10.5
Application: Java 1.6, using PostgreSQL 8.4 JDBC type 4 driver.

The application invokes a pl/pgsql stored procedure (with autocommit set to true). This sproc does several inserts -
seebelow for the code. This works fine with a single unpartitioned table, even for large inserts. 

With the table partitioned, it hangs intermittently, usually occurs within 20 - 30 minutes of running the application,
afterinvoking the sproc ~25 times. When it hangs, it hangs indefinitely - I know because I inadvertently left it in
'hung'state for 24 hours, so it's not just slow. 

The last command of the sproc truncates the staging table, raw_data. The table raw_data is empty when the query is
hung,which suggests that the sproc is running to completion. I tried sending a SIGHUP to the postmaster process (kill
-1)while the sproc was hung, thinking it might be a thread blocked somewhere, but that had no effect. 

The relevant tables, queries etc are listed below:


-- The parent table
CREATE TABLE event (
    id BIGSERIAL PRIMARY KEY,
    evt_time TIMESTAMP WITH TIME ZONE NOT NULL,
    fk_host INTEGER REFERENCES dim_host NOT NULL,
    fk_user INTEGER REFERENCES dim_user
);


-- example of a child table:
CREATE TABLE event_y2009m09
(
  CONSTRAINT event_y2009m09_evt_time_check CHECK (evt_time >= '2009-09-01'::date AND evt_time < '2009-10-01'::date)
)
INHERITS (event)
WITH (
  OIDS=FALSE
);


-- Example of trigger function used
CREATE OR REPLACE FUNCTION event_insert_trigger()
  RETURNS trigger AS
$BODY$
BEGIN
    IF ( NEW.evt_time >= DATE '2009-08-01' AND NEW.evt_time < DATE '2009-09-01' ) THEN
        INSERT INTO event_y2009m08 VALUES (NEW.*);
    ELSIF ( NEW.evt_time >= DATE '2009-09-01' AND NEW.evt_time < DATE '2009-10-01' ) THEN
        INSERT INTO event_y2009m09 VALUES (NEW.*);
    ELSE
        RAISE EXCEPTION 'Date out of range.';
    END IF;
    RETURN NULL;
END;
$$
LANGUAGE plpgsql;


-- The sproc that hangs
CREATE OR REPLACE FUNCTION normalize_data()
  RETURNS void AS
$BODY$
DECLARE
    str_value text;
    match record;
BEGIN
    -- Run analyze to keep statistics up to date
    -- raw_data contains ~60,000 records at this point
    ANALYZE raw_data;

    INSERT INTO dim_host (name_str)
    SELECT DISTINCT host_name FROM raw_data
    EXCEPT
    SELECT name_str FROM dim_host;
    ANALYZE dim_host;

    -- Do a few more inserts like the above

    -- Then copy all the records from the staging table
    -- Simplified for brevity - real query is a 12-way join
    -- All FK constraints on table event are dropped before
    -- we run this query.
    -- All appropriate partitions and triggers have been created
    -- in advance of running this query.

    INSERT INTO event (evt_time, fk_host, fk_user)
    SELECT r.evt_time, dim_host.id, dim_user.id
    FROM raw_data as r
    JOIN dim_host ON r.host_name = dim_host.name_str
    LEFT JOIN dim_user ON r.user_name = dim_user.user_name;

    TRUNCATE raw_data;

    RETURN;
END;
$$ LANGUAGE plpgsql;


Re: Query using partitioned table hangs

From
Tom Lane
Date:
Rob W <digital_illuminati@yahoo.com> writes:
> A re-post, since I'm really stuck on this and could use some advice on how to troubleshoot this...

Have you looked into pg_locks to see if it's blocked waiting for a lock?
The TRUNCATE in particular would require exclusive lock on the table,
so it could be waiting for some other process that's touched the table.

            regards, tom lane

Re: Query using partitioned table hangs

From
Rob W
Date:
--- On Mon, 12/7/09, Tom Lane <wrote:
> Have you looked into pg_locks to see if it's blocked
> waiting for a lock?
> The TRUNCATE in particular would require exclusive lock on
> the table, so it could be waiting for some other process
> that's touched the table.

Thanks Tom - while pg_locks did not reveal any locks, you gave me the clue that solved the mystery: the truncate
command.

pg_locks show no locks at all on this table, but just in case, I tried removing an ANLAYZE on the table being
truncated,that was being done right at the beginning of the sproc, and should have finished long before the truncate
wasbeing run, and hey presto no deadlocks. 

A couple of weird things about this: the fact the pg_locks showed no locks, and secondly the fact that it used to work
finebefore we started partitioning an unrelated table.