Re: EXPLAIN (no ANALYZE) taking an hour for INSERT FROM SELECT - Mailing list pgsql-performance

From Tomas Vondra
Subject Re: EXPLAIN (no ANALYZE) taking an hour for INSERT FROM SELECT
Date
Msg-id 54F8F2BB.2010402@2ndquadrant.com
Whole thread Raw
In response to Re: EXPLAIN (no ANALYZE) taking an hour for INSERT FROM SELECT  (Gunnlaugur Thor Briem <gunnlaugur@gmail.com>)
Responses Re: EXPLAIN (no ANALYZE) taking an hour for INSERT FROM SELECT  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-performance
Hi,

On 5.3.2015 16:01, Gunnlaugur Thor Briem wrote:
> Hi, thanks for your follow-up questions.
>
> - postgres version is 9.1.13
> - the number of rows (in this latest instance) is 28,474,842
> - I've clustered and vacuum-full-ed and analyzed this table frequently,
> attempting to troubleshoot this. (Running vacuum full on the whole
> catalog seems a little excessive, and unlikely to help.)
> - no other processes are likely to be interfering; nothing other than
> PostgreSQL runs on this machine (except for normal OS processes and New
> Relic server monitoring service); concurrent activity in PostgreSQL is
> low-level and unrelated, and this effect is observed systematically
> whenever this kind of operation is performed on this table
> - no override for this table; the system default_statistics_target is
> 100 (the default)
> - yes, an ANALYZE is performed on the temp table after the COPY and
> before the INSERT
> - no index on the temp table (but I'm scanning the whole thing anyway).
> There are indexes on f_foo as detailed in my original post, and I expect
> the PK to make the WHERE NOT EXISTS filtering efficient (as it filters
> on exactly all columns of the PK) ... but even if it didn't, I would
> expect that to only slow down the actual insert execution, not the EXPLAIN.

The only thing I can think of is some sort of memory exhaustion,
resulting in swapping out large amounts of memory. That'd explain the
I/O load. Can you run something like vmstat to see if this really is swap?

The fact that plain INSERT does not do that contradicts that, as it
should be able to plan either both queries (plain and EXPLAIN), or none
of them.

Can you prepare a self-contained test case? I.e. a script that
demonstrates the issue? I tried to reproduce the issue using the
information provided so far, but unsuccessfully :-(

Even if you could reproduce the problem on another machine (because of
keeping the data internal) on a server with debug symbols and see where
most of the time is spent (e.g. using 'perf top'), that'd be useful.

regards

--
Tomas Vondra                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


pgsql-performance by date:

Previous
From: Marc Mamin
Date:
Subject: Re: EXPLAIN (no ANALYZE) taking an hour for INSERT FROM SELECT
Next
From: Tom Lane
Date:
Subject: Re: EXPLAIN (no ANALYZE) taking an hour for INSERT FROM SELECT