Thread: Problem with insert into select...

Problem with insert into select...

From
stephen farrell
Date:
I'm having a problem with a queyr like:  INSERT INTO FACT (x,x,x,x,x,x)
SELECT a.key,b.key,c.key,d.key,e.key,f.key from x,a,b,c,d,e,f where x=a
and x=b ....   -- postgres7.4 is running out of memory.  I'm not sure
why this would happen -- does it buffer the subselect before doing the
insert?

Things are pretty big scale: 3gb ram, 32768 shared buffers, 700gb disk,
millions of rows in the tables.





Re: Problem with insert into select...

From
Tom Lane
Date:
stephen farrell <sfarrell@almaden.ibm.com> writes:
> I'm having a problem with a queyr like:  INSERT INTO FACT (x,x,x,x,x,x)
> SELECT a.key,b.key,c.key,d.key,e.key,f.key from x,a,b,c,d,e,f where x=a
> and x=b ....   -- postgres7.4 is running out of memory.  I'm not sure
> why this would happen -- does it buffer the subselect before doing the
> insert?

What does EXPLAIN show for the query?  And we need to see the exact
query and table definitions, not abstractions.

            regards, tom lane

Re: Problem with insert into select...

From
stephen farrell
Date:
Ok -- so we created indexes and it was able to complete successfully.
But why would creating indexes affect the memory footprint, and should it?


Does it buffer the sub-select before doing the insert, or does it do the
insert record-by-record?


See correspondence below for details:


Steve,

         With the indexes created it worked.  It took about 4 hours, but
it inserted all of the records.

    stephen farrell <sfarrell@almaden.ibm.com>

11/20/2003 05:22 PM

         To:        James Rhodes/Almaden/IBM@IBMUS
         cc:
         Subject:        Re: [Fwd: Re: [PERFORM] Problem with insert
into select...]



if you do "explain" before the sql statement (e.g., "explain select *
from foo"), it'll tell you the query plan.

James Rhodes wrote:
 >
 > Steve,
 >
 >         Here is the detailed structure of the tables and the query that
 > is failing (the "INSERT INTO FACT" query) and I attached the logfile.
 >  Also what is EXPLAIN???
 >
 > CREATE TABLE RAW ( RAW_KEY  serial, PATNO_TEXT VARCHAR (9),
 > APPDATE_DATETIME VARCHAR (11), ISDATE_DATETIME VARCHAR (11),
 > WHATEVERSNO_TEXT VARCHAR (5), WHATEVERSNO_NUMBER VARCHAR (6), APPNO_TEXT
 > VARCHAR (10), TITLE_TEXT TEXT, USCLASS_TEXT VARCHAR (14),
 > USCLASS_TEXTLIST_TEXT TEXT, AUTHORCODE_TEXT VARCHAR (9),
 > AUTHORNORM_TEXT VARCHAR (195), AUTHOR_TEXT VARCHAR (212),
 > AUTHOR_TEXTLIST_TEXT TEXT, AUTHORADDRESS_TEXT VARCHAR (84),
 > AUTHORADDRESS_TEXTLIST_TEXT TEXT, INVENTOR_TEXT VARCHAR (50),
 > INVENTOR_TEXTLIST_TEXT TEXT, INVENTORADDRESS_TEXT VARCHAR (90),
 > INVENTORADDRESS_TEXTLIST_TEXT TEXT, AGENT_TEXT TEXT, AGENT_TEXTLIST_TEXT
 > TEXT, USSEARCHFIELD_TEXT VARCHAR (26), USSEARCHFIELD_TEXTLIST_TEXT
 > VARCHAR (150), USREFISDATE_TEXT VARCHAR (13), USREFISDATE_TEXTLIST_TEXT
 > TEXT, USREFNAME_TEXT VARCHAR (34), USREFNAME_TEXTLIST_TEXT TEXT,
 > ABSTRACT_TEXT TEXT, ABSTRACT_TEXTLIST_TEXT TEXT, ABSTRACT_RICHTEXT_PAR
 > TEXT, WHATEVERS_RICHTEXT_PAR TEXT, USREFPATNO_RICHTEXT_PAR TEXT, PRIMARY
 > KEY(RAW_KEY));
 >
 >
 > CREATE TABLE ISSUE_TIME (
 > TAB_KEY  serial,
 > ISDATE_DATETIME varchar (8),
 > MONTH INT,
 > DAY INT,
 > YEAR INT
 > , PRIMARY KEY(TAB_KEY))
 >
 > CREATE TABLE SOMETHING_NUMBER (
 > TAB_KEY  serial,
 > PATNO_TEXT varchar (7)
 > , PRIMARY KEY(TAB_KEY))
 >
 > CREATE TABLE APP_TIME (
 > TAB_KEY  serial,
 > APPDATE_DATETIME varchar (8),
 > MONTH INT,
 > DAY INT,
 > YEAR INT
 > , PRIMARY KEY(TAB_KEY))
 >
 > CREATE TABLE AUTHOR (
 > TAB_KEY  serial,
 > CODE varchar (6),
 > AUTHOR text
 > , PRIMARY KEY(TAB_KEY))
 >
 > CREATE TABLE APPLICATION_NUMBER (
 > TAB_KEY  serial,
 > APPNO_TEXT varchar (7)
 > , PRIMARY KEY(TAB_KEY))
 >
 > CREATE TABLE WHATEVERS (
 > TAB_KEY  serial,
 > abstract_richtext_par text,
 > WHATEVERS_richtext_par text,
 > raw_key int,
 > title_text text
 > , PRIMARY KEY(TAB_KEY))
 >
 > CREATE TABLE FACT (DYN_DIM1 BIGINT, DYN_DIM2 BIGINT,DYN_DIM3
 > BIGINT,ISSUE_TIME BIGINT, SOMETHING_NUMBER BIGINT, APP_TIME BIGINT,
 > AUTHOR BIGINT, APPLICATION_NUMBER BIGINT, WHATEVERS BIGINT)
 >
 > INSERT INTO FACT (ISSUE_TIME, SOMETHING_NUMBER, APP_TIME, AUTHOR,
 > APPLICATION_NUMBER, WHATEVERS)  SELECT ISSUE_TIME.TAB_KEY,
 > SOMETHING_NUMBER.TAB_KEY, APP_TIME.TAB_KEY, AUTHOR.TAB_KEY,
 > APPLICATION_NUMBER.TAB_KEY, WHATEVERS.TAB_KEY  FROM ISSUE_TIME,
 > SOMETHING_NUMBER, APP_TIME, AUTHOR, APPLICATION_NUMBER, WHATEVERS, raw
 >  WHERE ISSUE_TIME.ISDATE_DATETIME=raw.ISDATE_DATETIME  AND
 > SOMETHING_NUMBER.PATNO_TEXT=raw.PATNO_TEXT  AND
 > APP_TIME.APPDATE_DATETIME=raw.APPDATE_DATETIME  AND
 > AUTHOR.CODE=AUTHORCODE_TEXT AND AUTHOR.AUTHOR=(AUTHOR_TEXT ||
 >  ' | ' || AUTHOR_TEXTLIST_TEXT) AND
 > APPLICATION_NUMBER.APPNO_TEXT=raw.APPNO_TEXT  AND
 > WHATEVERS.raw_key=raw.raw_key

Tom Lane wrote:
> stephen farrell <sfarrell@almaden.ibm.com> writes:
>
>>I'm having a problem with a queyr like:  INSERT INTO FACT (x,x,x,x,x,x)
>>SELECT a.key,b.key,c.key,d.key,e.key,f.key from x,a,b,c,d,e,f where x=a
>>and x=b ....   -- postgres7.4 is running out of memory.  I'm not sure
>>why this would happen -- does it buffer the subselect before doing the
>>insert?
>
>
> What does EXPLAIN show for the query?  And we need to see the exact
> query and table definitions, not abstractions.
>
>             regards, tom lane


Re: Problem with insert into select...

From
Neil Conway
Date:
stephen farrell <sfarrell@almaden.ibm.com> writes:
> With the indexes created it worked.  It took about 4 hours, but it
> inserted all of the records.

Has this been satisfactorily resolved?

If not, can you post an EXPLAIN ANALYZE for the failing query, as Tom
asked earlier?

-Neil