slow query on tables with new columns added. - Mailing list pgsql-performance

From M. D.
Subject slow query on tables with new columns added.
Date
Msg-id 4E7CC6B9.2040404@turnkey.bz
Whole thread Raw
Responses Re: slow query on tables with new columns added.  (Filip Rembiałkowski <plk.zuber@gmail.com>)
List pgsql-performance
Hi everyone,

I did a software upgrade, and with it came a new feature where when
selecting a customer it queries for the sum of a few columns.  This
takes 7 seconds for the 'Cash Sale' customer - by far the most active
customer. I'd like to see if it's possible to get it down a bit by
changing settings.

Query:
explain analyse select sum(item_points),sum(disc_points) from invoice
left join gltx on invoice.invoice_id = gltx.gltx_id
where gltx.inactive_on is null and gltx.posted = 'Y' and
gltx.customer_id = 'A0ZQ2gsACIsEKLI638ikyg'

item_points and disc_points are the 2 columns added, so they are mostly 0.

table info:
CREATE TABLE gltx   -- rows: 894,712
(
   gltx_id character(22) NOT NULL,
   "version" integer NOT NULL,
   created_by character varying(16) NOT NULL,
   updated_by character varying(16),
   inactive_by character varying(16),
   created_on date NOT NULL,
   updated_on date,
   inactive_on date,
   external_id numeric(14,0),
   data_type integer NOT NULL,
   "number" character varying(14) NOT NULL,
   reference_str character varying(14),
   post_date date NOT NULL,
   post_time time without time zone NOT NULL,
   work_date date NOT NULL,
   memo text,
   customer_id character(22),
   vendor_id character(22),
   station_id character(22),
   employee_id character(22),
   store_id character(22) NOT NULL,
   shift_id character(22),
   link_id character(22),
   link_num integer NOT NULL,
   printed character(1) NOT NULL,
   paid character(1) NOT NULL,
   posted character(1) NOT NULL,
   amount numeric(18,4) NOT NULL,
   card_amt numeric(18,4) NOT NULL,
   paid_amt numeric(18,4) NOT NULL,
   paid_date date,
   due_date date,
   CONSTRAINT gltx_pkey PRIMARY KEY (gltx_id),
   CONSTRAINT gltx_c0 FOREIGN KEY (customer_id)
       REFERENCES customer (customer_id) MATCH SIMPLE
       ON UPDATE NO ACTION ON DELETE NO ACTION,
   CONSTRAINT gltx_c1 FOREIGN KEY (vendor_id)
       REFERENCES vendor (vendor_id) MATCH SIMPLE
       ON UPDATE NO ACTION ON DELETE NO ACTION,
   CONSTRAINT gltx_c2 FOREIGN KEY (station_id)
       REFERENCES station (station_id) MATCH SIMPLE
       ON UPDATE NO ACTION ON DELETE NO ACTION,
   CONSTRAINT gltx_c3 FOREIGN KEY (employee_id)
       REFERENCES employee (employee_id) MATCH SIMPLE
       ON UPDATE NO ACTION ON DELETE NO ACTION,
   CONSTRAINT gltx_c4 FOREIGN KEY (store_id)
       REFERENCES store (store_id) MATCH SIMPLE
       ON UPDATE NO ACTION ON DELETE NO ACTION,
   CONSTRAINT gltx_c5 FOREIGN KEY (shift_id)
       REFERENCES shift (shift_id) MATCH SIMPLE
       ON UPDATE NO ACTION ON DELETE NO ACTION,
   CONSTRAINT gltx_c6 FOREIGN KEY (link_id)
       REFERENCES gltx (gltx_id) MATCH SIMPLE
       ON UPDATE NO ACTION ON DELETE SET NULL
)
WITH (
   OIDS=FALSE
);
ALTER TABLE gltx OWNER TO quasar;
GRANT ALL ON TABLE gltx TO quasar;

CREATE INDEX gltx_i0
   ON gltx
   USING btree
   (data_type);

CREATE INDEX gltx_i1
   ON gltx
   USING btree
   (post_date);

CREATE INDEX gltx_i2
   ON gltx
   USING btree
   (number);

CREATE INDEX gltx_i3
   ON gltx
   USING btree
   (data_type, number);

CREATE INDEX gltx_i4
   ON gltx
   USING btree
   (customer_id, paid);

CREATE INDEX gltx_i5
   ON gltx
   USING btree
   (vendor_id, paid);

CREATE INDEX gltx_i6
   ON gltx
   USING btree
   (work_date);

CREATE INDEX gltx_i7
   ON gltx
   USING btree
   (link_id);


CREATE TABLE invoice  -- 623,270 rows
(
   invoice_id character(22) NOT NULL,
   ship_id character(22),
   ship_via character varying(20),
   term_id character(22),
   promised_date date,
   tax_exempt_id character(22),
   customer_addr text,
   ship_addr text,
   comments text,
   item_points numeric(14,0) NOT NULL,
   disc_points numeric(14,0) NOT NULL,
   CONSTRAINT invoice_pkey PRIMARY KEY (invoice_id),
   CONSTRAINT invoice_c0 FOREIGN KEY (invoice_id)
       REFERENCES gltx (gltx_id) MATCH SIMPLE
       ON UPDATE NO ACTION ON DELETE CASCADE,
   CONSTRAINT invoice_c1 FOREIGN KEY (ship_id)
       REFERENCES customer (customer_id) MATCH SIMPLE
       ON UPDATE NO ACTION ON DELETE NO ACTION,
   CONSTRAINT invoice_c2 FOREIGN KEY (term_id)
       REFERENCES term (term_id) MATCH SIMPLE
       ON UPDATE NO ACTION ON DELETE NO ACTION,
   CONSTRAINT invoice_c3 FOREIGN KEY (tax_exempt_id)
       REFERENCES tax (tax_id) MATCH SIMPLE
       ON UPDATE NO ACTION ON DELETE NO ACTION
)
WITH (
   OIDS=FALSE
);

Both tables have mostly writes, some updates, very few deletes.

Explain analyse: (http://explain.depesz.com/s/SYW)

Aggregate  (cost=179199.52..179199.53 rows=1 width=10) (actual time=7520.922..7520.924 rows=1 loops=1)
   ->   Merge Join  (cost=9878.78..177265.66 rows=386771 width=10) (actual time=104.651..6690.194 rows=361463 loops=1)
         Merge Cond: (invoice.invoice_id = gltx.gltx_id)
         ->   Index Scan using invoice_pkey on invoice  (cost=0.00..86222.54 rows=623273 width=33) (actual
time=0.010..1316.507rows=623273 loops=1) 
         ->   Index Scan using gltx_pkey on gltx  (cost=0.00..108798.53 rows=386771 width=23) (actual
time=104.588..1822.886rows=361464 loops=1) 
               Filter: ((gltx.inactive_on IS NULL) AND (gltx.posted = 'Y'::bpchar) AND (gltx.customer_id =
'A0ZQ2gsACIsEKLI638ikyg'::bpchar))
Total runtime: 7521.026 ms


PostgreSQL: 9.0.4 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-50), 64-bit  -
selfcompiled 
Linux: Linux server.domain.lan 2.6.18-238.12.1.el5xen #1 SMP Tue May 31 13:35:45 EDT 2011 x86_64 x86_64 x86_64
GNU/Linux

Hardware: single CPU: model name      : Intel(R) Xeon(R) CPU           E5335  @ 2.00GHz
RAM: 8GB
DB Size: 5876MB
HDs: Raid 1 Sata drives - dell PowerEdge 1900 - lower middle class server


Postgres config:
max_connections = 200                   #it's a bit high I know, but most connections are idle
shared_buffers = 2048MB                 #
work_mem = 8MB                          # tried up to 32MB, but no diff
maintenance_work_mem = 16MB             #
bgwriter_delay = 2000ms                 #
checkpoint_segments = 15                #
checkpoint_completion_target = 0.8      #
seq_page_cost = 5.0                     #
random_page_cost = 2.5                  #
effective_cache_size = 2048MB        # just upgraded to 2GB. had another aggressive memory using program before, so did
notwant to have this high 
log_destination = 'stderr'              #
logging_collector = off         #
log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log' #
log_rotation_age = 1d                   #
log_min_duration_statement = 10000      #
log_line_prefix='%t:%r:%u@%d:[%p]: '                    #
track_activities = on
track_counts = on
track_activity_query_size = 1024        #
autovacuum = on                     #
autovacuum_max_workers = 5              #
datestyle = 'iso, mdy'
lc_messages = 'en_US.UTF-8'                     #
lc_monetary = 'en_US.UTF-8'                     #
lc_numeric = 'en_US.UTF-8'                      #
lc_time = 'en_US.UTF-8'                         #
default_text_search_config = 'pg_catalog.english'




pgsql-performance by date:

Previous
From: Gunnlaugur Þór Briem
Date:
Subject: Re: Constraint exclusion on UNION ALL subqueries with WHERE conditions
Next
From: Filip Rembiałkowski
Date:
Subject: Re: slow query on tables with new columns added.