Re: 8.3rc1 Out of memory when performing update - Mailing list pgsql-performance

From Stephen Denne
Subject Re: 8.3rc1 Out of memory when performing update
Date
Msg-id F0238EBA67824444BC1CB4700960CB480482B5B7@dmpeints002.isotach.com
Whole thread Raw
In response to Re: 8.3rc1 Out of memory when performing update  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: 8.3rc1 Out of memory when performing update
List pgsql-performance
> "Stephen Denne" <Stephen.Denne@datamail.co.nz> writes:
> > A simple update query, over roughly 17 million rows,
> > populating a newly added column in a table, resulted in an
> > out of memory error when the process memory usage reached
> > 2GB. Could this be due to a poor choice of some configuration
> > parameter, or is there a limit on how many rows I can update
> > in a single statement?
>
> Do you have any triggers or foreign keys on that table?  For that
> matter, let's see its whole schema definition.
>
>             regards, tom lane

No triggers on that table, one primary key, one foreign key, two indexes.
The foreign key references a primary key which is also an integer.
No other tables which reference document_file.
No inherited tables.
There are as many document_file rows as there are rows in the document table,
document_file.document_id is unique, though not constrained.
(Designed as a one to many relationship, but only ever used as one to one.)


I altered the update statement slightly, and reran the query.

I disabled autovacuum after a while and cancelled the autovacuum process that was trying to vacuum analyze
document_file.

The altered query has been running over 3 hours now,
without using lots of memory (38M private bytes).
2046 temp files were created (2.54GB worth),
which have recently changed from slowly growing in size
to very very slowly reducing in number.


Altered query that has not crashed:
UPDATE ONLY document_file AS df SET document_type_id = d.document_type_id FROM document AS d WHERE d.id = document_id;

Hash Join  (cost=674810.80..6701669.63 rows=16972702 width=621)
  Hash Cond: (df.document_id = d.id)
  ->  Seq Scan on document_file df  (cost=0.00..750298.65 rows=27702365 width=617)
  ->  Hash  (cost=396352.02..396352.02 rows=16972702 width=8)
        ->  Seq Scan on document d  (cost=0.00..396352.02 rows=16972702 width=8)


c.f. original (re-explained):
UPDATE document_file SET document_type_id = (SELECT document_type_id FROM document d where d.id = document_id);

Seq Scan on document_file  (cost=0.00..281183329.64 rows=27702834 width=617)
  SubPlan
    ->  Index Scan using pk_document_id on document d  (cost=0.00..10.12 rows=1 width=4)
          Index Cond: (id = $0)



Schema as reported by pgadmin:

CREATE TABLE document_file
(
  id integer NOT NULL DEFAULT nextval(('document_file_seq'::text)::regclass),
  document_id integer NOT NULL,
  archive_directory_location character varying(255) NOT NULL,
  mime_type character varying(255),
  file_name character varying(255) NOT NULL,
  life_cycle_status character varying(255),
  version integer DEFAULT 0,
  is_current boolean DEFAULT true,
  file_size integer NOT NULL,
  document_type_id integer,
  CONSTRAINT pk_document_file_id PRIMARY KEY (id),
  CONSTRAINT fk_document_id FOREIGN KEY (document_id)
      REFERENCES document (id) MATCH SIMPLE
      ON UPDATE NO ACTION ON DELETE CASCADE
)
WITH (OIDS=FALSE);
ALTER TABLE document_file OWNER TO postgres;
GRANT ALL ON TABLE document_file TO postgres;
GRANT ALL ON TABLE document_file TO vapps;
GRANT ALL ON TABLE document_file TO vrconfig;

CREATE INDEX location_ix
  ON document_file
  USING btree
  (archive_directory_location);

CREATE INDEX tc_file_document
  ON document_file
  USING btree
  (document_id);


Disclaimer:
At the Datamail Group we value team commitment, respect, achievement, customer focus, and courage. This email with any
attachmentsis confidential and may be subject to legal privilege.  If it is not intended for you please advise by reply
immediately,destroy it and do not copy, disclose or use it in any way. 

__________________________________________________________________
  This email has been scanned by the DMZGlobal Business Quality
              Electronic Messaging Suite.
Please see http://www.dmzglobal.com/services/bqem.htm for details.
__________________________________________________________________


pgsql-performance by date:

Previous
From: Tom Lane
Date:
Subject: Re: 8.3rc1 Out of memory when performing update
Next
From: Tom Lane
Date:
Subject: Re: 8.3rc1 Out of memory when performing update