Thread: 57 minute SELECT
Howdy,
I’m going to post this in 2 parts as I think it’s too big for 1 post.
Environment:
PG 8.4.17
Linux Ubuntu 10.04
Total RAM – 1G
Things that have been performed:
1. Explain on SELECT.
2. ANALYZE database.
3. VACUUM database.
4. shared_buffers = 256M
5. effective_cache_size = 768M
6. work_mem = 512M
Table DDL:
nms=# \d syslog
View "public.syslog"
Column | Type | Modifiers
----------+-----------------------------+-----------
ip | inet |
facility | character varying(10) |
level | character varying(10) |
datetime | timestamp without time zone |
program | character varying(25) |
msg | text |
seq | bigint |
View definition:
SELECT syslog_master.ip, syslog_master.facility, syslog_master.level, syslog_master.datetime, syslog_master.program, syslog_master.msg, syslog_master.seq
FROM syslog_master;
Rules:
syslog_insert_201308 AS
ON INSERT TO syslog
WHERE new.datetime >= '2013-08-01'::date AND new.datetime < '2013-09-01'::date DO INSTEAD INSERT INTO syslog_201308 (ip, facility, level, datetime, program, msg)
VALUES (new.ip, new.facility, new.level, new.datetime, new.program, new.msg)
syslog_insert_201309 AS
ON INSERT TO syslog
WHERE new.datetime >= '2013-09-01'::date AND new.datetime < '2013-10-01'::date DO INSTEAD INSERT INTO syslog_201309 (ip, facility, level, datetime, program, msg)
VALUES (new.ip, new.facility, new.level, new.datetime, new.program, new.msg)
syslog_insert_201310 AS
ON INSERT TO syslog
WHERE new.datetime >= '2013-10-01'::date AND new.datetime < '2013-11-01'::date DO INSTEAD INSERT INTO syslog_201310 (ip, facility, level, datetime, program, msg)
VALUES (new.ip, new.facility, new.level, new.datetime, new.program, new.msg)
syslog_insert_null AS
ON INSERT TO syslog DO INSTEAD NOTHING
nms=#
nms=# \d devices
hostname | character varying(20) |
hostpop | character varying(20) |
hostgroup | character varying(20) |
rack | character varying(10) |
asset | character varying(10) |
ip | inet |
snmprw | character varying(20) |
snmpro | character varying(20) |
snmpver | character varying(3) |
console | character varying(20) |
psu1 | character varying(20) |
psu2 | character varying(20) |
psu3 | character varying(20) |
psu4 | character varying(20) |
alias1 | character varying(20) |
alias2 | character varying(20) |
failure | character varying(255) |
modified | timestamp without time zone | not null default now()
modified_by | character varying(20) |
active | character(1) | default 't'::bpchar
rad_secret | character varying(20) |
rad_atr | character varying(40) |
snmpdev | integer |
netflow | text |
cpu | integer |
temp | integer |
firmware_type_id | bigint | default 1
Indexes:
"id_pkey" PRIMARY KEY, btree (id)
"devices_active_index" btree (active)
"devices_failure" btree (failure)
"devices_hostgroup" btree (hostgroup)
"devices_hostname" btree (hostname)
"devices_hostpop" btree (hostpop)
"devices_ip_index" btree (ip)
"devices_snmprw" btree (snmprw)
Foreign-key constraints:
"devices_firmware_type_id_fkey" FOREIGN KEY (firmware_type_id) REFERENCES firmware_type(id)
Referenced by:
TABLE "ac_attributes" CONSTRAINT "ac_attributes_id_fkey" FOREIGN KEY (id) REFERENCES devices(id) ON DELETE CASCADE
TABLE "acls_matrix" CONSTRAINT "acls_matrix_device_id_fkey" FOREIGN KEY (device_id) REFERENCES devices(id) ON UPDATE CASCADE ON DELETE CASCADE
TABLE "ip_local_pool_aggregates" CONSTRAINT "ip_local_pool_aggregates_host_fkey" FOREIGN KEY (host) REFERENCES devices(id)
TABLE "ipsla_instances" CONSTRAINT "ipsla_instances_host_fkey" FOREIGN KEY (host) REFERENCES devices(id) ON DELETE CASCADE
TABLE "lns_attributes" CONSTRAINT "lns_attributes_id_fkey" FOREIGN KEY (id) REFERENCES devices(id) ON DELETE CASCADE
(END)
nms=# \d mongroups
Table "public.mongroups"
Column | Type | Modifiers
------------+-----------------------+-----------
hostgroup | character varying(20) |
locale | text |
department | character varying(20) |
Indexes:
"ukey_hostgroup_department" UNIQUE, btree (hostgroup, department)
nms=#
<end part I>
Thank you,
Sam
Ok, let’s try 3 parts:
Table counts:
syslog – 150200285
devices – 3291
mongroups – 71
The query:
SELECT syslog.ip,
syslog.msg,
syslog.datetime,
devices.hostname,
devices.hostpop
FROM syslog,
devices
WHERE syslog.ip IN
(SELECT ip
FROM devices,
mongroups
WHERE (active = 't'
OR active = 's')
AND devices.hostgroup = mongroups.hostgroup
AND devices.hostname || '.' || devices.hostpop ~* E'pe1.mel4'
AND devices.id != '1291')
AND datetime <= '2013-08-01 00:00:00'
AND datetime >= '2013-04-12 00:00:00'
AND syslog.ip = devices.ip
AND (devices.active = 't'
OR devices.active = 's');
<end part II>
Thank you,
Sam
From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of Samuel Stearns
Sent: Thursday, 3 October 2013 10:26 AM
To: pgsql-performance@postgresql.org
Subject: [PERFORM] 57 minute SELECT
Howdy,
I’m going to post this in 2 parts as I think it’s too big for 1 post.
Environment:
PG 8.4.17
Linux Ubuntu 10.04
Total RAM – 1G
Things that have been performed:
1. Explain on SELECT.
2. ANALYZE database.
3. VACUUM database.
4. shared_buffers = 256M
5. effective_cache_size = 768M
6. work_mem = 512M
Table DDL:
nms=# \d syslog
View "public.syslog"
Column | Type | Modifiers
----------+-----------------------------+-----------
ip | inet |
facility | character varying(10) |
level | character varying(10) |
datetime | timestamp without time zone |
program | character varying(25) |
msg | text |
seq | bigint |
View definition:
SELECT syslog_master.ip, syslog_master.facility, syslog_master.level, syslog_master.datetime, syslog_master.program, syslog_master.msg, syslog_master.seq
FROM syslog_master;
Rules:
syslog_insert_201308 AS
ON INSERT TO syslog
WHERE new.datetime >= '2013-08-01'::date AND new.datetime < '2013-09-01'::date DO INSTEAD INSERT INTO syslog_201308 (ip, facility, level, datetime, program, msg)
VALUES (new.ip, new.facility, new.level, new.datetime, new.program, new.msg)
syslog_insert_201309 AS
ON INSERT TO syslog
WHERE new.datetime >= '2013-09-01'::date AND new.datetime < '2013-10-01'::date DO INSTEAD INSERT INTO syslog_201309 (ip, facility, level, datetime, program, msg)
VALUES (new.ip, new.facility, new.level, new.datetime, new.program, new.msg)
syslog_insert_201310 AS
ON INSERT TO syslog
WHERE new.datetime >= '2013-10-01'::date AND new.datetime < '2013-11-01'::date DO INSTEAD INSERT INTO syslog_201310 (ip, facility, level, datetime, program, msg)
VALUES (new.ip, new.facility, new.level, new.datetime, new.program, new.msg)
syslog_insert_null AS
ON INSERT TO syslog DO INSTEAD NOTHING
nms=#
nms=# \d devices
hostname | character varying(20) |
hostpop | character varying(20) |
hostgroup | character varying(20) |
rack | character varying(10) |
asset | character varying(10) |
ip | inet |
snmprw | character varying(20) |
snmpro | character varying(20) |
snmpver | character varying(3) |
console | character varying(20) |
psu1 | character varying(20) |
psu2 | character varying(20) |
psu3 | character varying(20) |
psu4 | character varying(20) |
alias1 | character varying(20) |
alias2 | character varying(20) |
failure | character varying(255) |
modified | timestamp without time zone | not null default now()
modified_by | character varying(20) |
active | character(1) | default 't'::bpchar
rad_secret | character varying(20) |
rad_atr | character varying(40) |
snmpdev | integer |
netflow | text |
cpu | integer |
temp | integer |
firmware_type_id | bigint | default 1
Indexes:
"id_pkey" PRIMARY KEY, btree (id)
"devices_active_index" btree (active)
"devices_failure" btree (failure)
"devices_hostgroup" btree (hostgroup)
"devices_hostname" btree (hostname)
"devices_hostpop" btree (hostpop)
"devices_ip_index" btree (ip)
"devices_snmprw" btree (snmprw)
Foreign-key constraints:
"devices_firmware_type_id_fkey" FOREIGN KEY (firmware_type_id) REFERENCES firmware_type(id)
Referenced by:
TABLE "ac_attributes" CONSTRAINT "ac_attributes_id_fkey" FOREIGN KEY (id) REFERENCES devices(id) ON DELETE CASCADE
TABLE "acls_matrix" CONSTRAINT "acls_matrix_device_id_fkey" FOREIGN KEY (device_id) REFERENCES devices(id) ON UPDATE CASCADE ON DELETE CASCADE
TABLE "ip_local_pool_aggregates" CONSTRAINT "ip_local_pool_aggregates_host_fkey" FOREIGN KEY (host) REFERENCES devices(id)
TABLE "ipsla_instances" CONSTRAINT "ipsla_instances_host_fkey" FOREIGN KEY (host) REFERENCES devices(id) ON DELETE CASCADE
TABLE "lns_attributes" CONSTRAINT "lns_attributes_id_fkey" FOREIGN KEY (id) REFERENCES devices(id) ON DELETE CASCADE
(END)
nms=# \d mongroups
Table "public.mongroups"
Column | Type | Modifiers
------------+-----------------------+-----------
hostgroup | character varying(20) |
locale | text |
department | character varying(20) |
Indexes:
"ukey_hostgroup_department" UNIQUE, btree (hostgroup, department)
nms=#
<end part I>
Thank you,
Sam
Samuel Stearns-2 wrote > Total RAM - 1G > > > 1. Explain on SELECT. So either this is a typo (1 GB of RAM) or your query is likely ending up I/O bound. You should probably provide EXPLAIN and EXPLAIN (ANALYZE) output since even with the schema it is impossible for someone to see what the planner is proposing for a multiple-million record source table that is going to be empty if all someone does is create the schema. For my money it is also helpful to actual write some prose describing what you are providing and seeing and not just toss some settings and schema out there. http://wiki.postgresql.org/wiki/SlowQueryQuestions David J. -- View this message in context: http://postgresql.1045698.n5.nabble.com/57-minute-SELECT-tp5773169p5773174.html Sent from the PostgreSQL - performance mailing list archive at Nabble.com.
The last part, the EXPLAIN, is too big to send. Is there an alternative way I can get it too you, other than chopping it up and sending in multiple parts?
Thank you,
Sam
From: Samuel Stearns
Sent: Thursday, 3 October 2013 10:34 AM
To: Samuel Stearns; pgsql-performance@postgresql.org
Subject: RE: 57 minute SELECT
Ok, let’s try 3 parts:
Table counts:
syslog – 150200285
devices – 3291
mongroups – 71
The query:
SELECT syslog.ip,
syslog.msg,
syslog.datetime,
devices.hostname,
devices.hostpop
FROM syslog,
devices
WHERE syslog.ip IN
(SELECT ip
FROM devices,
mongroups
WHERE (active = 't'
OR active = 's')
AND devices.hostgroup = mongroups.hostgroup
AND devices.hostname || '.' || devices.hostpop ~* E'pe1.mel4'
AND devices.id != '1291')
AND datetime <= '2013-08-01 00:00:00'
AND datetime >= '2013-04-12 00:00:00'
AND syslog.ip = devices.ip
AND (devices.active = 't'
OR devices.active = 's');
<end part II>
Thank you,
Sam
From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of Samuel Stearns
Sent: Thursday, 3 October 2013 10:26 AM
To: pgsql-performance@postgresql.org
Subject: [PERFORM] 57 minute SELECT
Howdy,
I’m going to post this in 2 parts as I think it’s too big for 1 post.
Environment:
PG 8.4.17
Linux Ubuntu 10.04
Total RAM – 1G
Things that have been performed:
1. Explain on SELECT.
2. ANALYZE database.
3. VACUUM database.
4. shared_buffers = 256M
5. effective_cache_size = 768M
6. work_mem = 512M
Table DDL:
nms=# \d syslog
View "public.syslog"
Column | Type | Modifiers
----------+-----------------------------+-----------
ip | inet |
facility | character varying(10) |
level | character varying(10) |
datetime | timestamp without time zone |
program | character varying(25) |
msg | text |
seq | bigint |
View definition:
SELECT syslog_master.ip, syslog_master.facility, syslog_master.level, syslog_master.datetime, syslog_master.program, syslog_master.msg, syslog_master.seq
FROM syslog_master;
Rules:
syslog_insert_201308 AS
ON INSERT TO syslog
WHERE new.datetime >= '2013-08-01'::date AND new.datetime < '2013-09-01'::date DO INSTEAD INSERT INTO syslog_201308 (ip, facility, level, datetime, program, msg)
VALUES (new.ip, new.facility, new.level, new.datetime, new.program, new.msg)
syslog_insert_201309 AS
ON INSERT TO syslog
WHERE new.datetime >= '2013-09-01'::date AND new.datetime < '2013-10-01'::date DO INSTEAD INSERT INTO syslog_201309 (ip, facility, level, datetime, program, msg)
VALUES (new.ip, new.facility, new.level, new.datetime, new.program, new.msg)
syslog_insert_201310 AS
ON INSERT TO syslog
WHERE new.datetime >= '2013-10-01'::date AND new.datetime < '2013-11-01'::date DO INSTEAD INSERT INTO syslog_201310 (ip, facility, level, datetime, program, msg)
VALUES (new.ip, new.facility, new.level, new.datetime, new.program, new.msg)
syslog_insert_null AS
ON INSERT TO syslog DO INSTEAD NOTHING
nms=#
nms=# \d devices
hostname | character varying(20) |
hostpop | character varying(20) |
hostgroup | character varying(20) |
rack | character varying(10) |
asset | character varying(10) |
ip | inet |
snmprw | character varying(20) |
snmpro | character varying(20) |
snmpver | character varying(3) |
console | character varying(20) |
psu1 | character varying(20) |
psu2 | character varying(20) |
psu3 | character varying(20) |
psu4 | character varying(20) |
alias1 | character varying(20) |
alias2 | character varying(20) |
failure | character varying(255) |
modified | timestamp without time zone | not null default now()
modified_by | character varying(20) |
active | character(1) | default 't'::bpchar
rad_secret | character varying(20) |
rad_atr | character varying(40) |
snmpdev | integer |
netflow | text |
cpu | integer |
temp | integer |
firmware_type_id | bigint | default 1
Indexes:
"id_pkey" PRIMARY KEY, btree (id)
"devices_active_index" btree (active)
"devices_failure" btree (failure)
"devices_hostgroup" btree (hostgroup)
"devices_hostname" btree (hostname)
"devices_hostpop" btree (hostpop)
"devices_ip_index" btree (ip)
"devices_snmprw" btree (snmprw)
Foreign-key constraints:
"devices_firmware_type_id_fkey" FOREIGN KEY (firmware_type_id) REFERENCES firmware_type(id)
Referenced by:
TABLE "ac_attributes" CONSTRAINT "ac_attributes_id_fkey" FOREIGN KEY (id) REFERENCES devices(id) ON DELETE CASCADE
TABLE "acls_matrix" CONSTRAINT "acls_matrix_device_id_fkey" FOREIGN KEY (device_id) REFERENCES devices(id) ON UPDATE CASCADE ON DELETE CASCADE
TABLE "ip_local_pool_aggregates" CONSTRAINT "ip_local_pool_aggregates_host_fkey" FOREIGN KEY (host) REFERENCES devices(id)
TABLE "ipsla_instances" CONSTRAINT "ipsla_instances_host_fkey" FOREIGN KEY (host) REFERENCES devices(id) ON DELETE CASCADE
TABLE "lns_attributes" CONSTRAINT "lns_attributes_id_fkey" FOREIGN KEY (id) REFERENCES devices(id) ON DELETE CASCADE
(END)
nms=# \d mongroups
Table "public.mongroups"
Column | Type | Modifiers
------------+-----------------------+-----------
hostgroup | character varying(20) |
locale | text |
department | character varying(20) |
Indexes:
"ukey_hostgroup_department" UNIQUE, btree (hostgroup, department)
nms=#
<end part I>
Thank you,
Sam
Total RAM = 1G is correct This query executes as the result of a search from our Network Management System Device Audit web tool where the date rangeis large and is focused on a specific device. I was thinking it should execute more quickly since syslog.ip has an index and we're not performing any textual matching. EXPLAIN: QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Hash Join (cost=408.53..1962721.39 rows=98068 width=126) (actual time=30121.265..3419306.752 rows=1929714 loops=1) Hash Cond: (public.syslog_master.ip = public.devices.ip) -> Nested Loop (cost=209.67..1120466.90 rows=37316116 width=122) (actual time=30117.845..3416690.561 rows=1929714 loops=1) Join Filter: (public.syslog_master.ip = public.devices.ip) -> HashAggregate (cost=205.40..205.41 rows=1 width=7) (actual time=5.133..5.142 rows=1 loops=1) -> Nested Loop (cost=0.00..205.40 rows=1 width=7) (actual time=4.117..5.124 rows=1 loops=1) Join Filter: ((public.devices.hostgroup)::text = (mongroups.hostgroup)::text) -> Seq Scan on devices (cost=0.00..202.80 rows=1 width=14) (actual time=4.088..5.075 rows=1 loops=1) Filter: ((id <> 1291) AND ((active = 't'::bpchar) OR (active = 's'::bpchar)) AND ((((hostname)::text|| '.'::text) || (hostpop)::text) ~* 'pe1.mel4'::text)) -> Seq Scan on mongroups (cost=0.00..1.71 rows=71 width=6) (actual time=0.009..0.017 rows=71 loops=1) -> Append (cost=4.27..1114378.69 rows=470624 width=115) (actual time=30112.646..3415201.052 rows=1929766 loops=1) -> Bitmap Heap Scan on syslog_master (cost=4.27..9.61 rows=2 width=72) (actual time=0.027..0.027 rows=0loops=1) Recheck Cond: (public.syslog_master.ip = public.devices.ip) Filter: ((public.syslog_master.datetime <= '2013-08-01 00:00:00'::timestamp without time zone) AND (public.syslog_master.datetime>= '2013-04-12 00:00:00'::timestamp without time zone)) -> Bitmap Index Scan on syslog_master_ip_idx (cost=0.00..4.27 rows=2 width=0) (actual time=0.019..0.019rows=0 loops=1) Index Cond: (public.syslog_master.ip = public.devices.ip) -> Bitmap Heap Scan on syslog_201307 syslog_master (cost=4175.37..355209.50 rows=150004 width=112) (actualtime=30112.618..686289.128 rows=297015 loops=1) Recheck Cond: (public.syslog_master.ip = public.devices.ip) Filter: ((public.syslog_master.datetime <= '2013-08-01 00:00:00'::timestamp without time zone) AND (public.syslog_master.datetime>= '2013-04-12 00:00:00'::timestamp without time -----Original Message----- From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of David Johnston Sent: Thursday, 3 October 2013 10:47 AM To: pgsql-performance@postgresql.org Subject: Re: [PERFORM] 57 minute SELECT Samuel Stearns-2 wrote > Total RAM - 1G > > > 1. Explain on SELECT. So either this is a typo (1 GB of RAM) or your query is likely ending up I/O bound. You should probably provide EXPLAIN and EXPLAIN (ANALYZE) output since even with the schema it is impossible for someoneto see what the planner is proposing for a multiple-million record source table that is going to be empty if all someonedoes is create the schema. For my money it is also helpful to actual write some prose describing what you are providing and seeing and not just tosssome settings and schema out there. http://wiki.postgresql.org/wiki/SlowQueryQuestions David J. -- View this message in context: http://postgresql.1045698.n5.nabble.com/57-minute-SELECT-tp5773169p5773174.html Sent from the PostgreSQL - performance mailing list archive at Nabble.com. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
zone)) -> Bitmap Index Scan on syslog_master_ip_idx (cost=0.00..4.27 rows=2 width=0) (actual time=0.019..0.019rows=0 loops=1) Index Cond: (public.syslog_master.ip = public.devices.ip) -> Bitmap Heap Scan on syslog_201307 syslog_master (cost=4175.37..355209.50 rows=150004 width=112) (actualtime=30112.618..686289.128 rows=297015 loops=1) Recheck Cond: (public.syslog_master.ip = public.devices.ip) Filter: ((public.syslog_master.datetime <= '2013-08-01 00:00:00'::timestamp without time zone) AND (public.syslog_master.datetime>= '2013-04-12 00:00:00'::timestamp without time zone)) -> Bitmap Index Scan on syslog_201307_ip_idx (cost=0.00..4137.88 rows=150004 width=0) (actual time=30040.703..30040.703rows=297015 loops=1) Index Cond: (public.syslog_master.ip = public.devices.ip) -> Index Scan using syslog_201308_datetime_idx on syslog_201308 syslog_master (cost=0.00..8.46 rows=1 width=108)(actual time=30.022..124.656 rows=52 loops=1) Index Cond: ((public.syslog_master.datetime <= '2013-08-01 00:00:00'::timestamp without time zone) AND(public.syslog_master.datetime >= '2013-04-12 00:00:00'::timestamp without time zone)) -> Bitmap Heap Scan on syslog_201304 syslog_master (cost=1829.05..235809.88 rows=98010 width=117) (actualtime=1049.606..875045.663 rows=320488 loops=1) Recheck Cond: (public.syslog_master.ip = public.devices.ip) Filter: ((public.syslog_master.datetime <= '2013-08-01 00:00:00'::timestamp without time zone) AND (public.syslog_master.datetime>= '2013-04-12 00:00:00'::timestamp without time zone)) -> Bitmap Index Scan on syslog_201304_ip_idx (cost=0.00..1813.37 rows=98010 width=0) (actual time=984.401..984.401rows=505789 loops=1) Index Cond: (public.syslog_master.ip = public.devices.ip) -> Bitmap Heap Scan on syslog_201305 syslog_master (cost=2157.14..264759.11 rows=114937 width=115) (actualtime=926.035..910323.922 rows=520315 loops=1) Recheck Cond: (public.syslog_master.ip = public.devices.ip) Filter: ((public.syslog_master.datetime <= '2013-08-01 00:00:00'::timestamp without time zone) AND (public.syslog_master.datetime>= '2013-04-12 00:00:00'::timestamp without time zone)) -> Bitmap Index Scan on syslog_201305_ip_idx (cost=0.00..2128.41 rows=114937 width=0) (actual time=864.109..864.109rows=520315 loops=1) Index Cond: (public.syslog_master.ip = public.devices.ip) -> Bitmap Heap Scan on syslog_201306 syslog_master (cost=2020.92..258582.12 rows=107670 width=117) (actualtime=1948.265..942909.424 rows=791896 loops=1) Recheck Cond: (public.syslog_master.ip = public.devices.ip) Filter: ((public.syslog_master.datetime <= '2013-08-01 00:00:00'::timestamp without time zone) AND (public.syslog_master.datetime>= '2013-04-12 00:00:00'::timestamp without time zone)) -> Bitmap Index Scan on syslog_201306_ip_idx (cost=0.00..1994.01 rows=107670 width=0) (actual time=1896.295..1896.295rows=791896 loops=1) Index Cond: (public.syslog_master.ip = public.devices.ip) -> Hash (cost=170.08..170.08 rows=2303 width=18) (actual time=3.398..3.398 rows=2386 loops=1) -> Seq Scan on devices (cost=0.00..170.08 rows=2303 width=18) (actual time=0.017..2.407 rows=2387 loops=1) Filter: ((active = 't'::bpchar) OR (active = 's'::bpchar)) Total runtime: 3419878.638 ms Thank you, Sam -----Original Message----- From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of Samuel Stearns Sent: Thursday, 3 October 2013 11:00 AM To: David Johnston; pgsql-performance@postgresql.org Subject: Re: [PERFORM] 57 minute SELECT Total RAM = 1G is correct This query executes as the result of a search from our Network Management System Device Audit web tool where the date rangeis large and is focused on a specific device. I was thinking it should execute more quickly since syslog.ip has an index and we're not performing any textual matching. EXPLAIN: QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Hash Join (cost=408.53..1962721.39 rows=98068 width=126) (actual time=30121.265..3419306.752 rows=1929714 loops=1) Hash Cond: (public.syslog_master.ip = public.devices.ip) -> Nested Loop (cost=209.67..1120466.90 rows=37316116 width=122) (actual time=30117.845..3416690.561 rows=1929714 loops=1) Join Filter: (public.syslog_master.ip = public.devices.ip) -> HashAggregate (cost=205.40..205.41 rows=1 width=7) (actual time=5.133..5.142 rows=1 loops=1) -> Nested Loop (cost=0.00..205.40 rows=1 width=7) (actual time=4.117..5.124 rows=1 loops=1) Join Filter: ((public.devices.hostgroup)::text = (mongroups.hostgroup)::text) -> Seq Scan on devices (cost=0.00..202.80 rows=1 width=14) (actual time=4.088..5.075 rows=1 loops=1) Filter: ((id <> 1291) AND ((active = 't'::bpchar) OR (active = 's'::bpchar)) AND ((((hostname)::text|| '.'::text) || (hostpop)::text) ~* 'pe1.mel4'::text)) -> Seq Scan on mongroups (cost=0.00..1.71 rows=71 width=6) (actual time=0.009..0.017 rows=71 loops=1) -> Append (cost=4.27..1114378.69 rows=470624 width=115) (actual time=30112.646..3415201.052 rows=1929766 loops=1) -> Bitmap Heap Scan on syslog_master (cost=4.27..9.61 rows=2 width=72) (actual time=0.027..0.027 rows=0loops=1) Recheck Cond: (public.syslog_master.ip = public.devices.ip) Filter: ((public.syslog_master.datetime <= '2013-08-01 00:00:00'::timestamp without time zone) AND (public.syslog_master.datetime>= '2013-04-12 00:00:00'::timestamp without time zone)) -> Bitmap Index Scan on syslog_master_ip_idx (cost=0.00..4.27 rows=2 width=0) (actual time=0.019..0.019rows=0 loops=1) Index Cond: (public.syslog_master.ip = public.devices.ip) -> Bitmap Heap Scan on syslog_201307 syslog_master (cost=4175.37..355209.50 rows=150004 width=112) (actualtime=30112.618..686289.128 rows=297015 loops=1) Recheck Cond: (public.syslog_master.ip = public.devices.ip) Filter: ((public.syslog_master.datetime <= '2013-08-01 00:00:00'::timestamp without time zone) AND (public.syslog_master.datetime>= '2013-04-12 00:00:00'::timestamp without time -----Original Message----- From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of David Johnston Sent: Thursday, 3 October 2013 10:47 AM To: pgsql-performance@postgresql.org Subject: Re: [PERFORM] 57 minute SELECT Samuel Stearns-2 wrote > Total RAM - 1G > > > 1. Explain on SELECT. So either this is a typo (1 GB of RAM) or your query is likely ending up I/O bound. You should probably provide EXPLAIN and EXPLAIN (ANALYZE) output since even with the schema it is impossible for someoneto see what the planner is proposing for a multiple-million record source table that is going to be empty if all someonedoes is create the schema. For my money it is also helpful to actual write some prose describing what you are providing and seeing and not just tosssome settings and schema out there. http://wiki.postgresql.org/wiki/SlowQueryQuestions David J. -- View this message in context: http://postgresql.1045698.n5.nabble.com/57-minute-SELECT-tp5773169p5773174.html Sent from the PostgreSQL - performance mailing list archive at Nabble.com. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
On Wed, Oct 2, 2013 at 10:17 PM, Samuel Stearns <sstearns@staff.iinet.net.au> wrote: > The last part, the EXPLAIN, is too big to send. Is there an alternative way > I can get it too you, other than chopping it up and sending in multiple > parts? Try explain.depesz.com On Wed, Oct 2, 2013 at 10:30 PM, Samuel Stearns <sstearns@staff.iinet.net.au> wrote: > > EXPLAIN: > > QUERY PLAN > ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- > Hash Join (cost=408.53..1962721.39 rows=98068 width=126) (actual time=30121.265..3419306.752 rows=1929714 loops=1) > Hash Cond: (public.syslog_master.ip = public.devices.ip) So your query is returning 2M rows. I think you should try lowering work_mem. 512M seems oversized for a query this complex on a system with 1G. You may be thrashing the OS cache. Also, you seem to have a problem with constraint exclusion. Some of those bitmap heap scans aren't necessary, and the planner should know it. Are you missing the corresponding CHECK constraints on datetime?
Thanks, Claudio: http://explain.depesz.com/s/WJQx -----Original Message----- From: Claudio Freire [mailto:klaussfreire@gmail.com] Sent: Thursday, 3 October 2013 11:16 AM To: Samuel Stearns Cc: David Johnston; pgsql-performance@postgresql.org Subject: Re: [PERFORM] 57 minute SELECT On Wed, Oct 2, 2013 at 10:17 PM, Samuel Stearns <sstearns@staff.iinet.net.au> wrote: > The last part, the EXPLAIN, is too big to send. Is there an > alternative way I can get it too you, other than chopping it up and > sending in multiple parts? Try explain.depesz.com On Wed, Oct 2, 2013 at 10:30 PM, Samuel Stearns <sstearns@staff.iinet.net.au> wrote: > > EXPLAIN: > > QUERY PLAN > ---------------------------------------------------------------------- > ---------------------------------------------------------------------- > ---------------------------------------------------------------------- > - Hash Join (cost=408.53..1962721.39 rows=98068 width=126) (actual > time=30121.265..3419306.752 rows=1929714 loops=1) > Hash Cond: (public.syslog_master.ip = public.devices.ip) So your query is returning 2M rows. I think you should try lowering work_mem. 512M seems oversized for a query this complex on a system with 1G. You may be thrashingthe OS cache. Also, you seem to have a problem with constraint exclusion. Some of those bitmap heap scans aren't necessary, and the plannershould know it. Are you missing the corresponding CHECK constraints on datetime?
Samuel Stearns-2 wrote > EXPLAIN: > > QUERY PLAN > ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- > Hash Join (cost=408.53..1962721.39 rows=98068 width=126) (actual > time=30121.265..3419306.752 rows=1929714 loops=1) You are selecting and returning 2 million records...how fast do you want this to run? For some reason I read 57 seconds initially - I guess 57 minutes is a bit much...but the most obvious solution is RAM. Might want to include buffers output in the explain as well but: I'm doubting the contents of your result fit into the server memory so your disk is involved which will severely slow down processing. Hopefully someone more knowledgeable and experience will chime in to help you. David J. -- View this message in context: http://postgresql.1045698.n5.nabble.com/57-minute-SELECT-tp5773169p5773187.html Sent from the PostgreSQL - performance mailing list archive at Nabble.com.
Thanks, David. Can't run EXPLAIN (ANALYZE, BUFFERS) as I'm on 8.4. -----Original Message----- From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of David Johnston Sent: Thursday, 3 October 2013 11:19 AM To: pgsql-performance@postgresql.org Subject: Re: [PERFORM] 57 minute SELECT Samuel Stearns-2 wrote > EXPLAIN: > > QUERY PLAN > ---------------------------------------------------------------------- > ---------------------------------------------------------------------- > ---------------------------------------------------------------------- > - Hash Join (cost=408.53..1962721.39 rows=98068 width=126) (actual > time=30121.265..3419306.752 rows=1929714 loops=1) You are selecting and returning 2 million records...how fast do you want this to run? For some reason I read 57 secondsinitially - I guess 57 minutes is a bit much...but the most obvious solution is RAM. Might want to include buffers output in the explain as well but: I'm doubting the contents of your result fit into the server memory so your disk is involved which will severely slow downprocessing. Hopefully someone more knowledgeable and experience will chime in to help you. David J. -- View this message in context: http://postgresql.1045698.n5.nabble.com/57-minute-SELECT-tp5773169p5773187.html Sent from the PostgreSQL - performance mailing list archive at Nabble.com. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
On Wed, Oct 2, 2013 at 10:47 PM, Samuel Stearns <sstearns@staff.iinet.net.au> wrote: > Thanks, Claudio: > > http://explain.depesz.com/s/WJQx If you have a test database, and if it doesn't hurt other queries of course, try clustering on the ip index. I believe your problem is that the index isn't helping much, it's probably hurting you in fact. If you cluster over ip, however, the scan will go almost sequentially, and there will be no wasted bytes in the pages fetched, which will be much friendlier on your I/O subsystem. If I were in your shoes, I'd cluster each of the monthly tables as they become inactive.
Thanks, Claudio. I'll have a look at the clustering. We have also noticed that the same query with a datetime range of 3 hours (rather than 4 months) runs in just 30 seconds: AND datetime <= '2013-10-03 10:03:49' AND datetime >= '2013-10-03 07:03:49' -----Original Message----- From: Claudio Freire [mailto:klaussfreire@gmail.com] Sent: Thursday, 3 October 2013 1:44 PM To: Samuel Stearns Cc: David Johnston; pgsql-performance@postgresql.org Subject: Re: [PERFORM] 57 minute SELECT On Wed, Oct 2, 2013 at 10:47 PM, Samuel Stearns <sstearns@staff.iinet.net.au> wrote: > Thanks, Claudio: > > http://explain.depesz.com/s/WJQx If you have a test database, and if it doesn't hurt other queries of course, try clustering on the ip index. I believe your problem is that the index isn't helping much, it's probably hurting you in fact. If you cluster over ip, however,the scan will go almost sequentially, and there will be no wasted bytes in the pages fetched, which will be muchfriendlier on your I/O subsystem. If I were in your shoes, I'd cluster each of the monthly tables as they become inactive.
On 03/10/2013 03:17, Samuel Stearns wrote: > The last part, the EXPLAIN, is too big to send. Is there an alternative > way I can get it too you, other than chopping it up and sending in > multiple parts? The usual way is via http://explain.depesz.com/ .
Attachment
On Thu, Oct 03, 2013 at 04:19:29AM +0000, Samuel Stearns wrote: > Thanks, Claudio. > > I'll have a look at the clustering. > > We have also noticed that the same query with a datetime range of 3 hours (rather than 4 months) runs in just 30 seconds: > > AND datetime <= '2013-10-03 10:03:49' > AND datetime >= '2013-10-03 07:03:49' > Hi Samuel, That is even worse performance relatively. 30s for a 3 hour range equals 28800s for a 4 month (2880 hours) range, or 8 hours. I definitely would consider clustering. Regards, Ken
On Thu, Oct 03, 2013 at 01:47:29AM +0000, Samuel Stearns wrote: - Thanks, Claudio: - - http://explain.depesz.com/s/WJQx You're spending a lot of time in the hash join which can kill a system with low ram. You may, just for fun, want to try the query with enable_hashjoin=false.
On Thu, Oct 03, 2013 at 09:20:52AM -0700, David Kerr wrote: - On Thu, Oct 03, 2013 at 01:47:29AM +0000, Samuel Stearns wrote: - - Thanks, Claudio: - - - - http://explain.depesz.com/s/WJQx - - You're spending a lot of time in the hash join which can kill a system with - low ram. - - You may, just for fun, want to try the query with enable_hashjoin=false. Sorry, ignore that, more coffe before mailing lists for me.
Thanks for all the advice here. I'll look at setting up something in a test environment and play with the clustering. Testinghow other queries perform against the clustering, also. Thank you! Sam -----Original Message----- From: David Kerr [mailto:dmk@mr-paradox.net] Sent: Friday, 4 October 2013 1:55 AM To: Samuel Stearns Cc: Claudio Freire; David Johnston; pgsql-performance@postgresql.org Subject: Re: [PERFORM] 57 minute SELECT On Thu, Oct 03, 2013 at 09:20:52AM -0700, David Kerr wrote: - On Thu, Oct 03, 2013 at 01:47:29AM +0000, Samuel Stearns wrote: - - Thanks, Claudio: - - - - http://explain.depesz.com/s/WJQx - - You're spending a lot of time in the hash join which can kill a system with - low ram. - - You may, just for fun, want to try the query with enable_hashjoin=false. Sorry, ignore that, more coffe before mailing lists for me.
Missed the 2nd part of Claudio's reply here. I actually tried different settings of work_mem up to 512M which didn't make any difference. Check constraints appear to be there: nms=# \d syslog_201304 Table "public.syslog_201304" Column | Type | Modifiers ----------+-----------------------------+------------------------------------------------------------- ip | inet | facility | character varying(10) | level | character varying(10) | datetime | timestamp without time zone | program | character varying(25) | msg | text | seq | bigint | not null default nextval('syslog_master_seq_seq'::regclass) Indexes: "syslog_201304_datetime_idx" btree (datetime) "syslog_201304_ip_idx" btree (ip) "syslog_201304_seq_idx" btree (seq) Check constraints: "syslog_201304_datetime_check" CHECK (datetime >= '2013-04-01'::date AND datetime < '2013-05-01'::date) Inherits: syslog_master nms=# -----Original Message----- From: Claudio Freire [mailto:klaussfreire@gmail.com] Sent: Thursday, 3 October 2013 11:16 AM To: Samuel Stearns Cc: David Johnston; pgsql-performance@postgresql.org Subject: Re: [PERFORM] 57 minute SELECT On Wed, Oct 2, 2013 at 10:17 PM, Samuel Stearns <sstearns@staff.iinet.net.au> wrote: > The last part, the EXPLAIN, is too big to send. Is there an > alternative way I can get it too you, other than chopping it up and > sending in multiple parts? Try explain.depesz.com On Wed, Oct 2, 2013 at 10:30 PM, Samuel Stearns <sstearns@staff.iinet.net.au> wrote: > > EXPLAIN: > > QUERY PLAN > ---------------------------------------------------------------------- > ---------------------------------------------------------------------- > ---------------------------------------------------------------------- > - Hash Join (cost=408.53..1962721.39 rows=98068 width=126) (actual > time=30121.265..3419306.752 rows=1929714 loops=1) > Hash Cond: (public.syslog_master.ip = public.devices.ip) So your query is returning 2M rows. I think you should try lowering work_mem. 512M seems oversized for a query this complex on a system with 1G. You may be thrashingthe OS cache. Also, you seem to have a problem with constraint exclusion. Some of those bitmap heap scans aren't necessary, and the plannershould know it. Are you missing the corresponding CHECK constraints on datetime?