Re: Problem with insert into select... - Mailing list pgsql-performance

From stephen farrell
Subject Re: Problem with insert into select...
Date
Msg-id 3FBD72D5.1040202@almaden.ibm.com
Whole thread Raw
In response to Re: Problem with insert into select...  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Problem with insert into select...
List pgsql-performance
Ok -- so we created indexes and it was able to complete successfully.
But why would creating indexes affect the memory footprint, and should it?


Does it buffer the sub-select before doing the insert, or does it do the
insert record-by-record?


See correspondence below for details:


Steve,

         With the indexes created it worked.  It took about 4 hours, but
it inserted all of the records.

    stephen farrell <sfarrell@almaden.ibm.com>

11/20/2003 05:22 PM

         To:        James Rhodes/Almaden/IBM@IBMUS
         cc:
         Subject:        Re: [Fwd: Re: [PERFORM] Problem with insert
into select...]



if you do "explain" before the sql statement (e.g., "explain select *
from foo"), it'll tell you the query plan.

James Rhodes wrote:
 >
 > Steve,
 >
 >         Here is the detailed structure of the tables and the query that
 > is failing (the "INSERT INTO FACT" query) and I attached the logfile.
 >  Also what is EXPLAIN???
 >
 > CREATE TABLE RAW ( RAW_KEY  serial, PATNO_TEXT VARCHAR (9),
 > APPDATE_DATETIME VARCHAR (11), ISDATE_DATETIME VARCHAR (11),
 > WHATEVERSNO_TEXT VARCHAR (5), WHATEVERSNO_NUMBER VARCHAR (6), APPNO_TEXT
 > VARCHAR (10), TITLE_TEXT TEXT, USCLASS_TEXT VARCHAR (14),
 > USCLASS_TEXTLIST_TEXT TEXT, AUTHORCODE_TEXT VARCHAR (9),
 > AUTHORNORM_TEXT VARCHAR (195), AUTHOR_TEXT VARCHAR (212),
 > AUTHOR_TEXTLIST_TEXT TEXT, AUTHORADDRESS_TEXT VARCHAR (84),
 > AUTHORADDRESS_TEXTLIST_TEXT TEXT, INVENTOR_TEXT VARCHAR (50),
 > INVENTOR_TEXTLIST_TEXT TEXT, INVENTORADDRESS_TEXT VARCHAR (90),
 > INVENTORADDRESS_TEXTLIST_TEXT TEXT, AGENT_TEXT TEXT, AGENT_TEXTLIST_TEXT
 > TEXT, USSEARCHFIELD_TEXT VARCHAR (26), USSEARCHFIELD_TEXTLIST_TEXT
 > VARCHAR (150), USREFISDATE_TEXT VARCHAR (13), USREFISDATE_TEXTLIST_TEXT
 > TEXT, USREFNAME_TEXT VARCHAR (34), USREFNAME_TEXTLIST_TEXT TEXT,
 > ABSTRACT_TEXT TEXT, ABSTRACT_TEXTLIST_TEXT TEXT, ABSTRACT_RICHTEXT_PAR
 > TEXT, WHATEVERS_RICHTEXT_PAR TEXT, USREFPATNO_RICHTEXT_PAR TEXT, PRIMARY
 > KEY(RAW_KEY));
 >
 >
 > CREATE TABLE ISSUE_TIME (
 > TAB_KEY  serial,
 > ISDATE_DATETIME varchar (8),
 > MONTH INT,
 > DAY INT,
 > YEAR INT
 > , PRIMARY KEY(TAB_KEY))
 >
 > CREATE TABLE SOMETHING_NUMBER (
 > TAB_KEY  serial,
 > PATNO_TEXT varchar (7)
 > , PRIMARY KEY(TAB_KEY))
 >
 > CREATE TABLE APP_TIME (
 > TAB_KEY  serial,
 > APPDATE_DATETIME varchar (8),
 > MONTH INT,
 > DAY INT,
 > YEAR INT
 > , PRIMARY KEY(TAB_KEY))
 >
 > CREATE TABLE AUTHOR (
 > TAB_KEY  serial,
 > CODE varchar (6),
 > AUTHOR text
 > , PRIMARY KEY(TAB_KEY))
 >
 > CREATE TABLE APPLICATION_NUMBER (
 > TAB_KEY  serial,
 > APPNO_TEXT varchar (7)
 > , PRIMARY KEY(TAB_KEY))
 >
 > CREATE TABLE WHATEVERS (
 > TAB_KEY  serial,
 > abstract_richtext_par text,
 > WHATEVERS_richtext_par text,
 > raw_key int,
 > title_text text
 > , PRIMARY KEY(TAB_KEY))
 >
 > CREATE TABLE FACT (DYN_DIM1 BIGINT, DYN_DIM2 BIGINT,DYN_DIM3
 > BIGINT,ISSUE_TIME BIGINT, SOMETHING_NUMBER BIGINT, APP_TIME BIGINT,
 > AUTHOR BIGINT, APPLICATION_NUMBER BIGINT, WHATEVERS BIGINT)
 >
 > INSERT INTO FACT (ISSUE_TIME, SOMETHING_NUMBER, APP_TIME, AUTHOR,
 > APPLICATION_NUMBER, WHATEVERS)  SELECT ISSUE_TIME.TAB_KEY,
 > SOMETHING_NUMBER.TAB_KEY, APP_TIME.TAB_KEY, AUTHOR.TAB_KEY,
 > APPLICATION_NUMBER.TAB_KEY, WHATEVERS.TAB_KEY  FROM ISSUE_TIME,
 > SOMETHING_NUMBER, APP_TIME, AUTHOR, APPLICATION_NUMBER, WHATEVERS, raw
 >  WHERE ISSUE_TIME.ISDATE_DATETIME=raw.ISDATE_DATETIME  AND
 > SOMETHING_NUMBER.PATNO_TEXT=raw.PATNO_TEXT  AND
 > APP_TIME.APPDATE_DATETIME=raw.APPDATE_DATETIME  AND
 > AUTHOR.CODE=AUTHORCODE_TEXT AND AUTHOR.AUTHOR=(AUTHOR_TEXT ||
 >  ' | ' || AUTHOR_TEXTLIST_TEXT) AND
 > APPLICATION_NUMBER.APPNO_TEXT=raw.APPNO_TEXT  AND
 > WHATEVERS.raw_key=raw.raw_key

Tom Lane wrote:
> stephen farrell <sfarrell@almaden.ibm.com> writes:
>
>>I'm having a problem with a queyr like:  INSERT INTO FACT (x,x,x,x,x,x)
>>SELECT a.key,b.key,c.key,d.key,e.key,f.key from x,a,b,c,d,e,f where x=a
>>and x=b ....   -- postgres7.4 is running out of memory.  I'm not sure
>>why this would happen -- does it buffer the subselect before doing the
>>insert?
>
>
> What does EXPLAIN show for the query?  And we need to see the exact
> query and table definitions, not abstractions.
>
>             regards, tom lane


pgsql-performance by date:

Previous
From: Tom Lane
Date:
Subject: Re: duration logging setting in 7.4
Next
From: Josh Berkus
Date:
Subject: Re: [HACKERS] More detail on settings for pgavd?