Re: Running update in chunks? - Mailing list pgsql-general
From | Kevin Grittner |
---|---|
Subject | Re: Running update in chunks? |
Date | |
Msg-id | 20130122022150.120590@gmx.com Whole thread Raw |
In response to | Running update in chunks? (Tim Uckun <timuckun@gmail.com>) |
Responses |
Re: Running update in chunks?
Re: Running update in chunks? Re: Running update in chunks? |
List | pgsql-general |
Kevin Grittner wrote: > update imports set make_id = 0 > > Query returned successfully: 98834 rows affected, 45860 ms execution time. For difficult problems, there is nothing like a self-contained test case, that someone else can run to see the issue. Here's a starting point: create extension if not exists hstore; create schema cars; drop table if exists cars.imports; CREATE TABLE cars.imports ( id serial NOT NULL, target_id integer, batch_id integer, make_id integer, model_id integer, date timestamp without time zone, division_id integer, dealer_id integer, data hstore, created_at timestamp without time zone NOT NULL, updated_at timestamp without time zone NOT NULL, CONSTRAINT imports_pkey PRIMARY KEY (id) ); CREATE INDEX index_imports_on_data ON cars.imports USING gin (data); CREATE INDEX index_imports_on_dealer_id ON cars.imports USING btree (dealer_id); CREATE INDEX index_imports_on_division_id ON cars.imports USING btree (division_id); CREATE INDEX index_imports_on_make_id ON cars.imports USING btree (make_id); CREATE INDEX index_imports_on_model_id ON cars.imports USING btree (model_id); insert into cars.imports (created_at, updated_at) select now(), now() from (select generate_series(1, 100000)) x; vacuum freeze analyze; \timing on update cars.imports set make_id = 0; vacuum analyze; update cars.imports set make_id = 0; ... and here's what I get when I run it on my desktop computer with ordinary disk drives and a completely default configuration: test=# create extension if not exists hstore; CREATE EXTENSION Time: 48.032 ms test=# create schema cars; CREATE SCHEMA Time: 8.150 ms test=# drop table if exists cars.imports; NOTICE: table "imports" does not exist, skipping DROP TABLE Time: 0.205 ms test=# CREATE TABLE cars.imports test-# ( test(# id serial NOT NULL, test(# target_id integer, test(# batch_id integer, test(# make_id integer, test(# model_id integer, test(# date timestamp without time zone, test(# division_id integer, test(# dealer_id integer, test(# data hstore, test(# created_at timestamp without time zone NOT NULL, test(# updated_at timestamp without time zone NOT NULL, test(# CONSTRAINT imports_pkey PRIMARY KEY (id) test(# ); NOTICE: CREATE TABLE will create implicit sequence "imports_id_seq" for serial column "imports.id" NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "imports_pkey" for table "imports" CREATE TABLE Time: 152.677 ms test=# CREATE INDEX index_imports_on_data ON cars.imports USING gin (data); CREATE INDEX Time: 6.391 ms test=# CREATE INDEX index_imports_on_dealer_id ON cars.imports USING btree (dealer_id); CREATE INDEX Time: 64.668 ms test=# CREATE INDEX index_imports_on_division_id ON cars.imports USING btree (division_id); CREATE INDEX Time: 65.573 ms test=# CREATE INDEX index_imports_on_make_id ON cars.imports USING btree (make_id); CREATE INDEX Time: 64.959 ms test=# CREATE INDEX index_imports_on_model_id ON cars.imports USING btree (model_id); CREATE INDEX Time: 64.906 ms test=# insert into cars.imports (created_at, updated_at) select now(), now() from (select generate_series(1, 100000)) x; INSERT 0 100000 Time: 2516.559 ms test=# vacuum freeze analyze; VACUUM Time: 3357.778 ms test=# \timing on Timing is on. test=# update cars.imports set make_id = 0; UPDATE 100000 Time: 2937.241 ms test=# vacuum analyze; VACUUM Time: 2097.426 ms test=# update cars.imports set make_id = 0; UPDATE 100000 Time: 3935.939 ms Ubuntu 12.10 i7-3770 CPU @ 3.40GHz with 16GB RAM Linux Kevin-Desktop 3.5.0-22-generic #34-Ubuntu SMP Tue Jan 8 21:47:00 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux A pair of WDC WD10EALX-009BA0 (15.01H51) 7200 RPM drives in RAID 1. PostgreSQL 9.2.2 on x86_64-unknown-linux-gnu, compiled by gcc (Ubuntu/Linaro 4.7.2-2ubuntu1) 4.7.2, 64-bit test=# SELECT name, current_setting(name), source test-# FROM pg_settings test-# WHERE source NOT IN ('default', 'override'); name | current_setting | source ----------------------------+--------------------+---------------------- application_name | psql | client client_encoding | UTF8 | client DateStyle | ISO, MDY | configuration file default_text_search_config | pg_catalog.english | configuration file lc_messages | en_US.UTF-8 | configuration file lc_monetary | en_US.UTF-8 | configuration file lc_numeric | en_US.UTF-8 | configuration file lc_time | en_US.UTF-8 | configuration file log_timezone | US/Central | configuration file max_connections | 100 | configuration file max_stack_depth | 2MB | environment variable shared_buffers | 32MB | configuration file TimeZone | US/Central | configuration file (13 rows) I did absolutely no tuning from the default configuration. So, what timings do you get if you run the identical script? Is there something you can do to the above script (maybe in terms of populating data) which will cause the performance you see? -Kevin
pgsql-general by date: