Thread: How to optimize monstrous query, sorts instead of using index
I've used indexes to speed up my queries but this query escapes me. I'm curious if someone can suggest an index or a way to modify the query to use the index. The query is: select ms.averageconnecttimex as ms_averageconnecttime, ms.averagedurationx as ms_averageduration, ms.datex as ms_date, ms.idx as ms_id, ms.statusstringx as ms_statusstring, ms.statusx as ms_status, msi.actualcontentx as msi_actualcontent, msi.connecttimex as msi_connecttime, msi.correctcontentx as msi_correctcontent, msi.datex as msi_date, msi.descriptionx as msi_description, msi.durationx as msi_duration, msi.errorcontentx as msi_errorcontent, msi.idx as msi_id, msi.monitorlocationx as msi_monitorlocation, msi.statusstringx as msi_statusstring, msi.statusx as msi_status from monitorstatusx ms, monitorstatusitemx msi where monitorx.idx = 'M-TEST_1444-TEST_00_10560561260561463219352' AND monitorx.jdoidx = ms.monitorx AND ms.datex >= '2003-06-20 08:57:21.36' AND ms.datex <= '2003-06-29 08:57:21.36' AND ms.jdoidx = monitorstatus_statusitemsx.jdoidx AND monitorstatus_statusitemsx.statusitemsx = monitorstatusitemlistd8ea58a5x.jdoidx AND monitorstatusitemlistd8ea58a5x.statusitemlistx = msi.jdoidx ORDER BY ms_date DESC; Here is the result of explain: Sort (cost=9498.85..9500.16 rows=525 width=788) Sort Key: ms.datex -> Nested Loop (cost=0.00..9475.15 rows=525 width=788) -> Nested Loop (cost=0.00..7887.59 rows=525 width=123) -> Nested Loop (cost=0.00..6300.03 rows=525 width=107) -> Nested Loop (cost=0.00..4712.02 rows=525 width=91) -> Index Scan using monitorx_id_index on monitorx (cost=0.00..5.37 rows=1 width=8) Index Cond: (idx = 'M-TEST_1444-TEST_00_10560561260561463219352'::character varying) -> Index Scan using monitorstatusxmonitori on monitorstatusx ms (cost=0.00..4695.65 rows=880 width=83) Index Cond: ("outer".jdoidx = ms.monitorx) Filter: ((datex >= '2003-06-20 08:57:21.36'::timestamp without time zone) AND (datex <= '2003-06-29 08:57:21.36'::timestamp without time zone)) -> Index Scan using monitorstatus_stjdoidb742c9b3i on monitorstatus_statusitemsx (cost=0.00..3.01 rows=1 width=16) Index Cond: ("outer".jdoidx = monitorstatus_statusitemsx.jdoidx) -> Index Scan using monitorstatusitejdoid7db0befci on monitorstatusitemlistd8ea58a5x (cost=0.00..3.01 rows=1 width=16) Index Cond: ("outer".statusitemsx = monitorstatusitemlistd8ea58a5x.jdoidx) -> Index Scan using monitorstatusitemx_pkey on monitorstatusitemx msi (cost=0.00..3.01 rows=1 width=665) Index Cond: ("outer".statusitemlistx = msi.jdoidx) (17 rows) As you can see, it's doing a sort on ms.datex. I created an index on the monitorstatusx (ms) table for the datex, but it doesn't use it. Is it possible to create an index to prevent this sort? Thanks, Michael Michael Mattox cunparis@yahoo.fr / http://www.advweb.com/michael
Is this 7.3.x? Can we see explain analyze output for the query? On Wed, 2003-06-25 at 07:46, Michael Mattox wrote: > I've used indexes to speed up my queries but this query escapes me. I'm > curious if someone can suggest an index or a way to modify the query to use > the index. The query is: > > select ms.averageconnecttimex as ms_averageconnecttime, ms.averagedurationx > as ms_averageduration, ms.datex as ms_date, ms.idx as ms_id, > ms.statusstringx as ms_statusstring, ms.statusx as ms_status, > msi.actualcontentx as msi_actualcontent, msi.connecttimex as > msi_connecttime, msi.correctcontentx as msi_correctcontent, msi.datex as > msi_date, msi.descriptionx as msi_description, msi.durationx as > msi_duration, msi.errorcontentx as msi_errorcontent, msi.idx as msi_id, > msi.monitorlocationx as msi_monitorlocation, msi.statusstringx as > msi_statusstring, msi.statusx as msi_status from monitorstatusx ms, > monitorstatusitemx msi where monitorx.idx = > 'M-TEST_1444-TEST_00_10560561260561463219352' AND monitorx.jdoidx = > ms.monitorx AND ms.datex >= '2003-06-20 08:57:21.36' AND ms.datex <= > '2003-06-29 08:57:21.36' AND ms.jdoidx = monitorstatus_statusitemsx.jdoidx > AND monitorstatus_statusitemsx.statusitemsx = > monitorstatusitemlistd8ea58a5x.jdoidx AND > monitorstatusitemlistd8ea58a5x.statusitemlistx = msi.jdoidx ORDER BY ms_date > DESC; > > Here is the result of explain: > > Sort (cost=9498.85..9500.16 rows=525 width=788) > Sort Key: ms.datex > -> Nested Loop (cost=0.00..9475.15 rows=525 width=788) > -> Nested Loop (cost=0.00..7887.59 rows=525 width=123) > -> Nested Loop (cost=0.00..6300.03 rows=525 width=107) > -> Nested Loop (cost=0.00..4712.02 rows=525 width=91) > -> Index Scan using monitorx_id_index on > monitorx (cost=0.00..5.37 rows=1 width=8) > Index Cond: (idx = > 'M-TEST_1444-TEST_00_10560561260561463219352'::character varying) > -> Index Scan using monitorstatusxmonitori on > monitorstatusx ms (cost=0.00..4695.65 rows=880 width=83) > Index Cond: ("outer".jdoidx = ms.monitorx) > Filter: ((datex >= '2003-06-20 > 08:57:21.36'::timestamp without time zone) AND (datex <= '2003-06-29 > 08:57:21.36'::timestamp without time zone)) > -> Index Scan using monitorstatus_stjdoidb742c9b3i on > monitorstatus_statusitemsx (cost=0.00..3.01 rows=1 width=16) > Index Cond: ("outer".jdoidx = > monitorstatus_statusitemsx.jdoidx) > -> Index Scan using monitorstatusitejdoid7db0befci on > monitorstatusitemlistd8ea58a5x (cost=0.00..3.01 rows=1 width=16) > Index Cond: ("outer".statusitemsx = > monitorstatusitemlistd8ea58a5x.jdoidx) > -> Index Scan using monitorstatusitemx_pkey on monitorstatusitemx > msi (cost=0.00..3.01 rows=1 width=665) > Index Cond: ("outer".statusitemlistx = msi.jdoidx) > (17 rows) > > As you can see, it's doing a sort on ms.datex. I created an index on the > monitorstatusx (ms) table for the datex, but it doesn't use it. Is it > possible to create an index to prevent this sort? > > Thanks, > Michael > > > Michael Mattox > cunparis@yahoo.fr / http://www.advweb.com/michael > > > > > ---------------------------(end of broadcast)--------------------------- > TIP 3: if posting/reading through Usenet, please send an appropriate > subscribe-nomail command to majordomo@postgresql.org so that your > message can get through to the mailing list cleanly -- Rod Taylor <rbt@rbt.ca> PGP Key: http://www.rbt.ca/rbtpub.asc
Attachment
Sorry, I neglected to say the version, yes I'm using Postgres 7.3.2 on Linux. Here's the output of explain analyze. The query typically takes 0-4 seconds depending on the time frame. It's run very frequently especially to process the nightly reports. veriguard=# explain analyze select ms.averageconnecttimex as ms_averageconnecttime, ms.averagedurationx as ms_averageduration, ms.datex as ms_date, ms.idx as ms_id, ms.statusstringx as ms_statusstring, ms.statusx as ms_status, msi.actualcontentx as msi_actualcontent, msi.connecttimex as msi_connecttime, msi.correctcontentx as msi_correctcontent, msi.datex as msi_date, msi.descriptionx as msi_description, msi.durationx as msi_duration, msi.errorcontentx as msi_errorcontent, msi.idx as msi_id, msi.monitorlocationx as msi_monitorlocation, msi.statusstringx as msi_statusstring, msi.statusx as msi_status from monitorstatusx ms, monitorstatusitemx msi where monitorx.idx = 'M-TEST_1444-TEST_00_10560561260561463219352' AND monitorx.jdoidx = ms.monitorx AND ms.datex >= '2003-06-20 08:57:21.36' AND ms.datex <= '2003-06-29 08:57:21.36' AND ms.jdoidx = monitorstatus_statusitemsx.jdoidx AND monitorstatus_statusitemsx.statusitemsx = monitorstatusitemlistd8ea58a5x.jdoidx AND monitorstatusitemlistd8ea58a5x.statusitemlistx = msi.jdoidx ORDER BY ms_date DESC; QUERY PLAN ---------------------------------------------------------------------------- ---------------------------------------------------------------------------- ---------------------------- Sort (cost=9498.96..9500.27 rows=525 width=788) (actual time=6720.91..6721.44 rows=623 loops=1) Sort Key: ms.datex -> Nested Loop (cost=0.00..9475.26 rows=525 width=788) (actual time=145.16..6718.65 rows=623 loops=1) -> Nested Loop (cost=0.00..7887.69 rows=525 width=123) (actual time=126.84..4528.85 rows=623 loops=1) -> Nested Loop (cost=0.00..6300.13 rows=525 width=107) (actual time=95.37..3470.55 rows=623 loops=1) -> Nested Loop (cost=0.00..4712.13 rows=525 width=91) (actual time=40.44..1892.06 rows=625 loops=1) -> Index Scan using monitorx_id_index on monitorx (cost=0.00..5.48 rows=1 width=8) (actual time=0.25..19.90 rows=1 loops=1) Index Cond: (idx = 'M-TEST_1444-TEST_00_10560561260561463219352'::character varying) -> Index Scan using monitorstatusxmonitori on monitorstatusx ms (cost=0.00..4695.65 rows=880 width=83) (actual time=40.17..1868.12 rows=625 loops=1) Index Cond: ("outer".jdoidx = ms.monitorx) Filter: ((datex >= '2003-06-20 08:57:21.36'::timestamp without time zone) AND (datex <= '2003-06-29 08:57:21.36'::timestamp without time zone)) -> Index Scan using monitorstatus_stjdoidb742c9b3i on monitorstatus_statusitemsx (cost=0.00..3.01 rows=1 width=16) (actual time=2.51..2.51 rows=1 loops=625) Index Cond: ("outer".jdoidx = monitorstatus_statusitemsx.jdoidx) -> Index Scan using monitorstatusitejdoid7db0befci on monitorstatusitemlistd8ea58a5x (cost=0.00..3.01 rows=1 width=16) (actual time=1.68..1.69 rows=1 loops=623) Index Cond: ("outer".statusitemsx = monitorstatusitemlistd8ea58a5x.jdoidx) -> Index Scan using monitorstatusitemx_pkey on monitorstatusitemx msi (cost=0.00..3.01 rows=1 width=665) (actual time=3.50..3.50 rows=1 loops=623) Index Cond: ("outer".statusitemlistx = msi.jdoidx) Total runtime: 6722.43 msec (18 rows)
> Here's the output of explain analyze. The query typically takes 0-4 seconds > depending on the time frame. It's run very frequently especially to process > the nightly reports. The plan picked seems reasonable (estimated costs / tuples is close to actual). I think the biggest hit is this index scan. Thats a substantial cost to pull out less than a thousand lines: -> Index Scan using monitorstatusxmonitori on monitorstatusx ms (cost=0.00..4695.65 rows=880 width=83) (actual time=40.17..1868.12 rows=625 loops=1) Index Cond: ("outer".jdoidx = ms.monitorx) Filter: ((datex >= '2003-06-20 08:57:21.36'::timestamp without time zone) AND (datex <= '2003-06-29 08:57:21.36'::timestamp without time zone)) Are jdoidx and monitorx integers? You might try a multi-column index on (ms.monitorx, ms.datex). Are monitorx assigned roughly ordered by date? It must be, otherwise the sort step would not be so cheap (hardly any impact on the query -- see actual cost number). The multi-column index above should give you a bit of a boost. Depending on the data in the table, the index (ms.datex, monitorx) may give better results along with a single index on (ms.monitorx) as you currently have. It's not very likely though. -- Rod Taylor <rbt@rbt.ca> PGP Key: http://www.rbt.ca/rbtpub.asc
Attachment
> Are jdoidx and monitorx integers? Yes both are integers: -- Table: public.monitorstatusx CREATE TABLE public.monitorstatusx ( averageconnecttimex numeric(65535, 65532), averagedurationx numeric(65535, 65532), datex timestamp, idx varchar(255), jdoclassx varchar(255), jdoidx int8 NOT NULL, jdolockx int4, monitorx int8, statusstringx varchar(255), statusx varchar(255), CONSTRAINT monitorstatusx_pkey PRIMARY KEY (jdoidx) ) WITH OIDS; > You might try a multi-column index on (ms.monitorx, ms.datex). Just tried it, it didn't prevent the sort. But it sounds like the sort isn't the problem, correct? -- Index: public.monitorstatusx_datex_monitorx_index CREATE INDEX monitorstatusx_datex_monitorx_index ON monitorstatusx USING btree (monitorx, datex); veriguard=# explain analyze select ms.averageconnecttimex as ms_averageconnecttime, ms.averagedurationx as ms_averageduration, ms.datex as ms_date, ms.idx as ms_id, ms.statusstringx as ms_statusstring, ms.statusx as ms_status, msi.actualcontentx as msi_actualcontent, msi.connecttimex as msi_connecttime, msi.correctcontentx as msi_correctcontent, msi.datex as msi_date, msi.descriptionx as msi_description, msi.durationx as msi_duration, msi.errorcontentx as msi_errorcontent, msi.idx as msi_id, msi.monitorlocationx as msi_monitorlocation, msi.statusstringx as msi_statusstring, msi.statusx as msi_status from monitorstatusx ms, monitorstatusitemx msi where monitorx.idx = 'M-TEST_1444-TEST_00_10560561260561463219352' AND monitorx.jdoidx = ms.monitorx AND ms.datex >= '2003-06-20 08:57:21.36' AND ms.datex <= '2003-06-29 08:57:21.36' AND ms.jdoidx = monitorstatus_statusitemsx.jdoidx AND monitorstatus_statusitemsx.statusitemsx = monitorstatusitemlistd8ea58a5x.jdoidx AND monitorstatusitemlistd8ea58a5x.statusitemlistx = msi.jdoidx ORDER BY ms_date DESC; QUERY PLAN ---------------------------------------------------------------------------- ---------------------------------------------------------------------------- ---------------------------------------------------------------------- Sort (cost=6014.53..6015.86 rows=529 width=788) (actual time=4286.35..4286.88 rows=626 loops=1) Sort Key: ms.datex -> Nested Loop (cost=0.00..5990.59 rows=529 width=788) (actual time=131.57..4283.76 rows=626 loops=1) -> Nested Loop (cost=0.00..4388.44 rows=529 width=123) (actual time=106.23..3398.54 rows=626 loops=1) -> Nested Loop (cost=0.00..2786.29 rows=529 width=107) (actual time=90.29..2518.20 rows=626 loops=1) -> Nested Loop (cost=0.00..1175.81 rows=532 width=91) (actual time=55.15..1345.88 rows=628 loops=1) -> Index Scan using monitorx_id_index on monitorx (cost=0.00..5.36 rows=1 width=8) (actual time=54.94..55.03 rows=1 loops=1) Index Cond: (idx = 'M-TEST_1444-TEST_00_10560561260561463219352'::character varying) -> Index Scan using monitorstatusx_datex_monitorx_index on monitorstatusx ms (cost=0.00..1159.33 rows=890 width=83) (actual time=0.19..1287.02 rows=628 loops=1) Index Cond: (("outer".jdoidx = ms.monitorx) AND (ms.datex >= '2003-06-20 08:57:21.36'::timestamp without time zone) AND (ms.datex <= '2003-06-29 08:57:21.36'::timestamp without time zone)) -> Index Scan using monitorstatus_stjdoidb742c9b3i on monitorstatus_statusitemsx (cost=0.00..3.01 rows=1 width=16) (actual time=1.85..1.86 rows=1 loops=628) Index Cond: ("outer".jdoidx = monitorstatus_statusitemsx.jdoidx) -> Index Scan using monitorstatusitejdoid7db0befci on monitorstatusitemlistd8ea58a5x (cost=0.00..3.01 rows=1 width=16) (actual time=1.39..1.39 rows=1 loops=626) Index Cond: ("outer".statusitemsx = monitorstatusitemlistd8ea58a5x.jdoidx) -> Index Scan using monitorstatusitemx_pkey on monitorstatusitemx msi (cost=0.00..3.01 rows=1 width=665) (actual time=1.40..1.40 rows=1 loops=626) Index Cond: ("outer".statusitemlistx = msi.jdoidx) Total runtime: 4288.71 msec (17 rows) veriguard=# > Are monitorx assigned roughly ordered by date? It must be, otherwise > the sort step would not be so cheap (hardly any impact on the query -- > see actual cost number). The multi-column index above should give you a > bit of a boost. monitorx is a foreign key to the monitorx table. If the query can't be optimized it's OK, I can live it the speed. I just couldn't figure out why it'd sort on datex if I had an index on datex. Thanks, Michael
> > You might try a multi-column index on (ms.monitorx, ms.datex). > > Just tried it, it didn't prevent the sort. But it sounds like the sort > isn't the problem, correct? The sort isn't actually doing any sorting, so it's virtually free. The sort is taking less than 3ms as the data is already 99% sorted due to the correlation between datex and monitorx. For similar reasons, the datex index will not be used, as it has no advantage to being used. > -> Index Scan using > monitorstatusx_datex_monitorx_index on monitorstatusx ms > (cost=0.00..1159.33 rows=890 width=83) (actual time=0.19..1287.02 rows=628 > loops=1) > Index Cond: (("outer".jdoidx = ms.monitorx) > AND (ms.datex >= '2003-06-20 08:57:21.36'::timestamp without time zone) AND > (ms.datex <= '2003-06-29 08:57:21.36'::timestamp without time zone)) You can see that it used the new multi-key index for both items, rather than finding for monitorx, then filtering out unwanted results by datex. It doesn't appear to have made much difference (looks like data was partially cached for this new run), but it changed a bit for the better. I'm afraid thats the best I can do on the query itself I think. Oh, and using tables in your where clause that aren't in the from clause is non-portable and often hides bugs: from monitorstatusx ms , monitorstatusitemx msi where monitorx.idx = 'M-TEST_1444-TEST_00_10560561260561463219352' Are you sure you sure you don't have any duplicated constraints by pulling information in from other tables that you don't need to? Removing some of those nested loops would make a significant impact to the results. -- Rod Taylor <rbt@rbt.ca> PGP Key: http://www.rbt.ca/rbtpub.asc
Attachment
> Oh, and using tables in your where clause that aren't in the from clause > is non-portable and often hides bugs: > > from monitorstatusx ms > , monitorstatusitemx msi > where monitorx.idx = 'M-TEST_1444-TEST_00_10560561260561463219352' > > Are you sure you sure you don't have any duplicated constraints by > pulling information in from other tables that you don't need to? > Removing some of those nested loops would make a significant impact to > the results. I didn't notice that before, thanks for pointing that out. I just tried adding monitorx.idx to the select and it ended up making my query take several minutes long. Any ideas how I can fix this and keep my performance? new query: veriguard=# explain select m.idx, ms.averageconnecttimex as ms_averageconnecttime, ms.averagedurationx as ms_averageduration, ms.datex as ms_date, ms.idx as ms_id, ms.statusstringx as ms_statusstring, ms.statusx as ms_status, msi.actualcontentx as msi_actualcontent, msi.connecttimex as msi_connecttime, msi.correctcontentx as msi_correctcontent, msi.datex as msi_date, msi.descriptionx as msi_description, msi.durationx as msi_duration, msi.errorcontentx as msi_errorcontent, msi.idx as msi_id, msi.monitorlocationx as msi_monitorlocation, msi.statusstringx as msi_statusstring, msi.statusx as msi_status from monitorx m, monitorstatusx ms, monitorstatusitemx msi where m.idx = 'M-TEST_1444-TEST_00_10560561260561463219352' AND monitorx.jdoidx = ms.monitorx AND ms.datex >= '2003-06-20 08:57:21.36' AND ms.datex <= '2003-06-29 08:57:21.36' AND ms.jdoidx = monitorstatus_statusitemsx.jdoidx AND monitorstatus_statusitemsx.statusitemsx = monitorstatusitemlistd8ea58a5x.jdoidx AND monitorstatusitemlistd8ea58a5x.statusitemlistx = msi.jdoidx ORDER BY ms_date DESC; QUERY PLAN ---------------------------------------------------------------------------- ---------------------------------------------------------------------------- ------------------------------------- Sort (cost=1653384.42..1655402.97 rows=807418 width=826) Sort Key: ms.datex -> Hash Join (cost=820308.66..1112670.42 rows=807418 width=826) Hash Cond: ("outer".monitorx = "inner".jdoidx) -> Merge Join (cost=820132.71..1098364.65 rows=807418 width=780) Merge Cond: ("outer".jdoidx = "inner".statusitemlistx) -> Index Scan using monitorstatusitemx_pkey on monitorstatusitemx msi (cost=0.00..247616.27 rows=6596084 width=665) -> Sort (cost=820132.71..822151.59 rows=807554 width=115) Sort Key: monitorstatusitemlistd8ea58a5x.statusitemlistx -> Hash Join (cost=461310.87..685820.13 rows=807554 width=115) Hash Cond: ("outer".jdoidx = "inner".statusitemsx) -> Seq Scan on monitorstatusitemlistd8ea58a5x (cost=0.00..104778.90 rows=6597190 width=16) -> Hash (cost=447067.98..447067.98 rows=807554 width=99) -> Merge Join (cost=0.00..447067.98 rows=807554 width=99) Merge Cond: ("outer".jdoidx = "inner".jdoidx) -> Index Scan using monitorstatusx_pkey on monitorstatusx ms (cost=0.00..272308.56 rows=811754 width=83) Filter: ((datex >= '2003-06-20 08:57:21.36'::timestamp without time zone) AND (datex <= '2003-06-29 08:57:21.36'::timestamp without time zone)) -> Index Scan using monitorstatus_stjdoidb742c9b3i on monitorstatus_statusitemsx (cost=0.00..146215.58 rows=6596680 width=16) -> Hash (cost=172.22..172.22 rows=1493 width=46) -> Nested Loop (cost=0.00..172.22 rows=1493 width=46) -> Index Scan using monitorx_id_index on monitorx m (cost=0.00..5.36 rows=1 width=38) Index Cond: (idx = 'M-TEST_1444-TEST_00_10560561260561463219352'::character varying) -> Seq Scan on monitorx (cost=0.00..151.93 rows=1493 width=8) (23 rows) old query: veriguard=# explain select ms.averageconnecttimex as ms_averageconnecttime, ms.averagedurationx as ms_averageduration, ms.datex as ms_date, ms.idx as ms_id, ms.statusstringx as ms_statusstring, ms.statusx as ms_status, msi.actualcontentx as msi_actualcontent, msi.connecttimex as msi_connecttime, msi.correctcontentx as msi_correctcontent, msi.datex as msi_date, msi.descriptionx as msi_description, msi.durationx as msi_duration, msi.errorcontentx as msi_errorcontent, msi.idx as msi_id, msi.monitorlocationx as msi_monitorlocation, msi.statusstringx as msi_statusstring, msi.statusx as msi_status from monitorstatusx ms, monitorstatusitemx msi where monitorx.idx = 'M-TEST_1444-TEST_00_10560561260561463219352' AND monitorx.jdoidx = ms.monitorx AND ms.datex >= '2003-06-20 08:57:21.36' AND ms.datex <= '2003-06-29 08:57:21.36' AND ms.jdoidx = monitorstatus_statusitemsx.jdoidx AND monitorstatus_statusitemsx.statusitemsx = monitorstatusitemlistd8ea58a5x.jdoidx AND monitorstatusitemlistd8ea58a5x.statusitemlistx = msi.jdoidx ORDER BY ms_date DESC; QUERY PLAN ---------------------------------------------------------------------------- ---------------------------------------------------------------------------- ------------------------- Sort (cost=9590.52..9591.87 rows=541 width=788) Sort Key: ms.datex -> Nested Loop (cost=0.00..9565.97 rows=541 width=788) -> Nested Loop (cost=0.00..7929.22 rows=541 width=123) -> Nested Loop (cost=0.00..6292.48 rows=541 width=107) -> Nested Loop (cost=0.00..4647.22 rows=544 width=91) -> Index Scan using monitorx_id_index on monitorx (cost=0.00..5.36 rows=1 width=8) Index Cond: (idx = 'M-TEST_1444-TEST_00_10560561260561463219352'::character varying) -> Index Scan using monitorstatusxmonitori on monitorstatusx ms (cost=0.00..4630.29 rows=926 width=83) Index Cond: ("outer".jdoidx = ms.monitorx) Filter: ((datex >= '2003-06-20 08:57:21.36'::timestamp without time zone) AND (datex <= '2003-06-29 08:57:21.36'::timestamp without time zone)) -> Index Scan using monitorstatus_stjdoidb742c9b3i on monitorstatus_statusitemsx (cost=0.00..3.01 rows=1 width=16) Index Cond: ("outer".jdoidx = monitorstatus_statusitemsx.jdoidx) -> Index Scan using monitorstatusitejdoid7db0befci on monitorstatusitemlistd8ea58a5x (cost=0.00..3.01 rows=1 width=16) Index Cond: ("outer".statusitemsx = monitorstatusitemlistd8ea58a5x.jdoidx) -> Index Scan using monitorstatusitemx_pkey on monitorstatusitemx msi (cost=0.00..3.01 rows=1 width=665) Index Cond: ("outer".statusitemlistx = msi.jdoidx) (17 rows) veriguard=#
Michael, This whole query looks like a mess to me. Since I don't know the exact model and the table stats, I don't even try to rewrite your query, however, here are the weak points I can think of: * as Rod pointed out, there are more tables in WHERE that aren't in FROM. This can be a bug, but the very least, it makes the query far less readable. These are: monitorx monitorstatus_statusitemsx.jdoidx monitorstatusitemlistd8ea58a5x.jdoidx * there are 3 index scans that basically steal your time. They are 1.6..3.5 ms x 625 ~ 1..2 sec each (or I'm reading exp ana wrong, I'm not an expert indeed): - Index Scan using monitorstatus_stjdoidb742c9b3i on monitorstatus_statusitemsx (cost=0.00..3.01 rows=1 width=16) (actual time=2.51..2.51 rows=1 loops=625) Index Cond: ("outer".jdoidx = monitorstatus_statusitemsx.jdoidx) - Index Scan using monitorstatusitejdoid7db0befci on monitorstatusitemlistd8ea58a5x (cost=0.00..3.01 rows=1 width=16) (actual time=1.68..1.69 rows=1 loops=623) Index Cond: ("outer".statusitemsx = monitorstatusitemlistd8ea58a5x.jdoidx) - Index Scan using monitorstatusitemx_pkey on monitorstatusitemx msi (cost=0.00..3.01 rows=1 width=665) (actual time=3.50..3.50 rows=1 loops=623) Index Cond: ("outer".statusitemlistx = msi.jdoidx) * another killer index: I think this one takes about the rest of the time (i.e. 3-4 secs): -> Index Scan using monitorstatusxmonitori on monitorstatusx ms (cost=0.00..4695.65 rows=880 width=83) (actual time=40.17..1868.12 rows=625 loops=1) Index Cond: ("outer".jdoidx = ms.monitorx) Filter: ((datex >= '2003-06-20 08:57:21.36'::timestamp without time zone) AND (datex <= '2003-06-29 08:57:21.36'::timestamp without time zone)) Since the number of rows probably can't be reduced (as I read it, the query actually returned that many rows), I'd think about clever joins in the FROM part and fewer tables, to use fewer index scans. Finally, decided to do an ad-hoc adjustment. Try this, or (wild guess) try to completely eliminate the WHERE part by subselects on ms and monitorx. This may be faster, slower, or even give different results, based on whether I guessed the 1:N relationships right or not. G. ------------------------------- cut here ------------------------------- select ms.averageconnecttimex as ms_averageconnecttime, ms.averagedurationx as ms_averageduration, ms.datex as ms_date, ms.idx as ms_id, ms.statusstringx as ms_statusstring, ms.statusx as ms_status, msi.actualcontentx as msi_actualcontent, msi.connecttimex as msi_connecttime, msi.correctcontentx as msi_correctcontent, msi.datex as msi_date, msi.descriptionx as msi_description, msi.durationx as msi_duration, msi.errorcontentx as msi_errorcontent, msi.idx as msi_id, msi.monitorlocationx as msi_monitorlocation, msi.statusstringx as msi_statusstring, msi.statusx as msi_status from monitorstatusx ms LEFT JOIN monitorx ON (monitorx.jdoidx = ms.monitorx) LEFT JOIN monitorstatus_statusitemsx ms_si ON (ms.jdoidx = ms_si.jdoidx) LEFT JOIN monitorstatusitemlistd8ea58a5x msil ON (ms_si.statusitemsx = msil.jdoidx) LEFT JOIN monitorstatusitemx msi ON (msil.statusitemlistx = msi.jdoidx) where monitorx.idx = 'M-TEST_1444-TEST_00_10560561260561463219352' ms.datex >= '2003-06-20 08:57:21.36' AND ms.datex <= '2003-06-29 08:57:21.36' ------------------------------- cut here -------------------------------
> I didn't notice that before, thanks for pointing that out. I just tried > adding monitorx.idx to the select and it ended up making my query take > several minutes long. Any ideas how I can fix this and keep my performance? By using it aliased and non-aliased (2 different references to the same table) you've caused it to join itself. Try this: SELECT m.idx , ms.averageconnecttimex AS ms_averageconnecttime , ms.averagedurationx AS ms_averageduration , ms.datex AS ms_date , ms.idx AS ms_id , ms.statusstringx AS ms_statusstring , ms.statusx AS ms_status , msi.actualcontentx AS msi_actualcontent , msi.connecttimex AS msi_connecttime , msi.correctcontentx AS msi_correctcontent , msi.datex AS msi_date , msi.descriptionx AS msi_description , msi.durationx AS msi_duration , msi.errorcontentx AS msi_errorcontent , msi.idx AS msi_id , msi.monitorlocationx AS msi_monitorlocation , msi.statusstringx AS msi_statusstring , msi.statusx AS msi_status FROM monitorstatusx AS ms , monitorstatusitemx AS msi , monitorx AS mx , monitorstatus_statusitemsx AS mssisx , monitorstatusitemlistd8ea58a5x AS litem WHERE ms.jdoidx = mssisx.jdoidx AND mssisx.statusitemsx = litem.jdoidx AND litem.statusitemlistx = msi.jdoidx AND mx.jdoidx = ms.monitorx AND ms.datex BETWEEN '2003-06-20 08:57:21.36' AND '2003-06-29 08:57:21.36' AND m.idx = 'M-TEST_1444-TEST_00_10560561260561463219352' ORDER BY ms.datex DESC; -- Rod Taylor <rbt@rbt.ca> PGP Key: http://www.rbt.ca/rbtpub.asc
Attachment
Michael, Actually, you missed an alias :) the select now returned 800k rows! (according to explain) pointed it out below. See my prev mail for more. If it's possible, try your query on a backend and look for notices like "Adding missing FROM clause for table ..." G. ------------------------------- cut here ------------------------------- ----- Original Message ----- From: "Michael Mattox" <michael.mattox@verideon.com> Cc: "Postgresql Performance" <pgsql-performance@postgresql.org> Sent: Wednesday, June 25, 2003 4:09 PM > from monitorx m, monitorstatusx ms, monitorstatusitemx msi > where m.idx = 'M-TEST_1444-TEST_00_10560561260561463219352' AND > monitorx.jdoidx = ms.monitorx AND ^^^^^^^^ substitute the same alias "m" here.
> Finally, decided to do an ad-hoc adjustment. Try this, or (wild guess) try > to completely eliminate the WHERE part by subselects on ms and monitorx. > > This may be faster, slower, or even give different results, based > on whether > I guessed the 1:N relationships right or not. It's much slower but I appreciate you taking the time to try. I'm pretty new to SQL so I must admin this query is very confusing for me. I'm using Java Data Objects (JDO, an O/R mapping framework) but the implementation I'm using (Kodo) isn't smart enough to do all the joins efficiently, which is why I had to rewrite this query by hand. Here's the output: veriguard=# explain select ms.averageconnecttimex as ms_averageconnecttime, ms.averagedurationx as ms_averageduration, ms.datex as ms_date, ms.idx as ms_id, ms.statusstringx as ms_statusstring, ms.statusx as ms_status, msi.actualcontentx as msi_actualcontent, msi.connecttimex as msi_connecttime, msi.correctcontentx as msi_correctcontent, msi.datex as msi_date, msi.descriptionx as msi_description, msi.durationx as msi_duration, msi.errorcontentx as msi_errorcontent, msi.idx as msi_id, msi.monitorlocationx as msi_monitorlocation, msi.statusstringx as msi_statusstring, msi.statusx as msi_status from monitorstatusx ms LEFT JOIN monitorx ON (monitorx.jdoidx = ms.monitorx) LEFT JOIN monitorstatus_statusitemsx ms_si ON (ms.jdoidx = ms_si.jdoidx) LEFT JOIN monitorstatusitemlistd8ea58a5x msil ON (ms_si.statusitemsx = msil.jdoidx) LEFT JOIN monitorstatusitemx msi ON (msil.statusitemlistx = msi.jdoidx) where monitorx.idx = 'M-TEST_1444-TEST_00_10560561260561463219352' AND ms.datex >= '2003-06-20 08:57:21.36' AND ms.datex <= '2003-06-29 08:57:21.36'; QUERY PLAN ---------------------------------------------------------------------------- ---------------------------------------------------------------------------- ------------------------------------- Merge Join (cost=1006209.47..1283529.68 rows=751715 width=826) Merge Cond: ("outer".jdoidx = "inner".statusitemlistx) -> Index Scan using monitorstatusitemx_pkey on monitorstatusitemx msi (cost=0.00..247679.64 rows=6595427 width=665) -> Sort (cost=1006209.47..1008088.76 rows=751715 width=161) Sort Key: msil.statusitemlistx -> Merge Join (cost=697910.17..864079.59 rows=751715 width=161) Merge Cond: ("outer".jdoidx = "inner".statusitemsx) -> Index Scan using monitorstatusitejdoid7db0befci on monitorstatusitemlistd8ea58a5x msil (cost=0.00..136564.80 rows=6595427 width=16) -> Sort (cost=697910.17..699789.46 rows=751715 width=145) Sort Key: ms_si.statusitemsx -> Merge Join (cost=385727.49..561594.96 rows=751715 width=145) Merge Cond: ("outer".jdoidx = "inner".jdoidx) -> Index Scan using monitorstatus_stjdoidb742c9b3i on monitorstatus_statusitemsx ms_si (cost=0.00..146268.80 rows=6595427 width=16) -> Sort (cost=385727.49..387606.78 rows=751715 width=129) Sort Key: ms.jdoidx -> Hash Join (cost=155.66..255240.65 rows=751715 width=129) Hash Cond: ("outer".monitorx = "inner".jdoidx) Filter: ("inner".idx = 'M-TEST_1444-TEST_00_10560561260561463219352'::character varying) -> Seq Scan on monitorstatusx ms (cost=0.00..240050.69 rows=751715 width=83) Filter: ((datex >= '2003-06-20 08:57:21.36'::timestamp without time zone) AND (datex <= '2003-06-29 08:57:21.36'::timestamp without time zone)) -> Hash (cost=151.93..151.93 rows=1493 width=46) -> Seq Scan on monitorx (cost=0.00..151.93 rows=1493 width=46) (22 rows) veriguard=#
With a slight correction (you had m & mx so I changed them to be all mx, I hope this is what you intended) this query works. It's exactly the same speed, but it doesn't give me the warnings I was getting: NOTICE: Adding missing FROM-clause entry for table "monitorx" NOTICE: Adding missing FROM-clause entry for table "monitorstatus_statusitemsx" NOTICE: Adding missing FROM-clause entry for table "monitorstatusitemlistd8ea58a5x" I never knew what those were from, I even searched Google trying to find out and I couldn't understand it so I gave up. Thanks for pointing this out for me, and thanks for fixing my query. Michael > -----Original Message----- > From: pgsql-performance-owner@postgresql.org > [mailto:pgsql-performance-owner@postgresql.org]On Behalf Of Rod Taylor > Sent: Wednesday, June 25, 2003 4:28 PM > To: michael.mattox@verideon.com > Cc: Postgresql Performance > Subject: Re: [PERFORM] How to optimize monstrous query, sorts instead of > > > > I didn't notice that before, thanks for pointing that out. I just tried > > adding monitorx.idx to the select and it ended up making my query take > > several minutes long. Any ideas how I can fix this and keep my > performance? > > By using it aliased and non-aliased (2 different references to the same > table) you've caused it to join itself. > > Try this: > > SELECT m.idx > , ms.averageconnecttimex AS ms_averageconnecttime > , ms.averagedurationx AS ms_averageduration > , ms.datex AS ms_date > , ms.idx AS ms_id > , ms.statusstringx AS ms_statusstring > , ms.statusx AS ms_status > , msi.actualcontentx AS msi_actualcontent > , msi.connecttimex AS msi_connecttime > , msi.correctcontentx AS msi_correctcontent > , msi.datex AS msi_date > , msi.descriptionx AS msi_description > , msi.durationx AS msi_duration > , msi.errorcontentx AS msi_errorcontent > , msi.idx AS msi_id > , msi.monitorlocationx AS msi_monitorlocation > , msi.statusstringx AS msi_statusstring > , msi.statusx AS msi_status > > FROM monitorstatusx AS ms > , monitorstatusitemx AS msi > > , monitorx AS mx > , monitorstatus_statusitemsx AS mssisx > , monitorstatusitemlistd8ea58a5x AS litem > > WHERE ms.jdoidx = mssisx.jdoidx > AND mssisx.statusitemsx = litem.jdoidx > AND litem.statusitemlistx = msi.jdoidx > AND mx.jdoidx = ms.monitorx > AND ms.datex BETWEEN '2003-06-20 08:57:21.36' > AND '2003-06-29 08:57:21.36' > AND m.idx = 'M-TEST_1444-TEST_00_10560561260561463219352' > > ORDER BY ms.datex DESC; > > -- > Rod Taylor <rbt@rbt.ca> > > PGP Key: http://www.rbt.ca/rbtpub.asc >
Rod Taylor <rbt@rbt.ca> writes: >> monitorstatusx_datex_monitorx_index on monitorstatusx ms >> (cost=3D0.00..1159.33 rows=3D890 width=3D83) (actual time=3D0.19..1287.02= > rows=3D628 >> loops=3D1) >> Index Cond: (("outer".jdoidx =3D ms.moni= > torx) >> AND (ms.datex >=3D '2003-06-20 08:57:21.36'::timestamp without time zone)= > AND >> (ms.datex <=3D '2003-06-29 08:57:21.36'::timestamp without time zone)) > You can see that it used the new multi-key index for both items, rather > than finding for monitorx, then filtering out unwanted results by datex. What is the column ordering of the combined index? Unless datex is the first column, there is no chance of using it to create the required sort order anyway. I think this index condition is suggesting that monitorx is the first column. However, I agree with Rod's point that "avoid the sort" is not the mindset to use to optimize this query. The joins are the problem. You might try forcing different join types (see enable_nestloop and friends) to get an idea of whether a different plan is likely to help. regards, tom lane
"Michael Mattox" <michael.mattox@verideon.com> writes: > It's much slower but I appreciate you taking the time to try. I'm pretty > new to SQL so I must admin this query is very confusing for me. I'm using > Java Data Objects (JDO, an O/R mapping framework) but the implementation I'm > using (Kodo) isn't smart enough to do all the joins efficiently, which is > why I had to rewrite this query by hand. It wasn't till I read that :-( that I noticed that you were doing nested left joins. Fooling with the join order may be your best route to a solution --- have you read http://www.postgresql.org/docs/view.php?version=7.3&idoc=0&file=explicit-joins.html regards, tom lane
> Try this: Rod, you improved my query last week (thank you very much) but I'm not sure why but my performance is getting worse. I think I know what happened, when I did my load testing I created data that all had the same date, so sorting on the date was very fast. But now I've been running the system for a few weeks I've got a range of dates and now the sort is very costly. I'm curious if it's possible to optimize this with an index? I've tried creating some indexes but they're never used. explain analyze SELECT mx.idx , ms.averageconnecttimex AS ms_averageconnecttime , ms.averagedurationx AS ms_averageduration , ms.datex AS ms_date , ms.idx AS ms_id , ms.statusstringx AS ms_statusstring , ms.statusx AS ms_status , msi.actualcontentx AS msi_actualcontent , msi.connecttimex AS msi_connecttime , msi.correctcontentx AS msi_correctcontent , msi.datex AS msi_date , msi.descriptionx AS msi_description , msi.durationx AS msi_duration , msi.errorcontentx AS msi_errorcontent , msi.idx AS msi_id , msi.monitorlocationx AS msi_monitorlocation , msi.statusstringx AS msi_statusstring , msi.statusx AS msi_status FROM monitorstatusx AS ms , monitorstatusitemx AS msi , monitorx AS mx , monitorstatus_statusitemsx AS mssisx , monitorstatusitemlistd8ea58a5x AS litem WHERE ms.jdoidx = mssisx.jdoidx AND mssisx.statusitemsx = litem.jdoidx AND litem.statusitemlistx = msi.jdoidx AND mx.jdoidx = ms.monitorx AND ms.datex BETWEEN '2003-07-01 00:00:00.000000+01' AND '2003-07-01 23:59:59.000000+01' AND mx.idx = 'M-TEST_150-TEST_01_10560776551771895174239' ORDER BY ms.datex DESC; QUERY PLAN ---------------------------------------------------------------------------- ---------------------------------------------------------------------------- ----------------------------------- Sort (cost=6882.84..6883.08 rows=97 width=827) (actual time=16712.46..16712.65 rows=225 loops=1) Sort Key: ms.datex -> Nested Loop (cost=0.00..6879.66 rows=97 width=827) (actual time=4413.12..16711.62 rows=225 loops=1) -> Nested Loop (cost=0.00..6587.53 rows=97 width=162) (actual time=4406.06..15941.16 rows=225 loops=1) -> Nested Loop (cost=0.00..6295.38 rows=97 width=146) (actual time=4383.59..15424.96 rows=225 loops=1) -> Nested Loop (cost=0.00..6003.22 rows=97 width=130) (actual time=4383.53..14938.02 rows=225 loops=1) -> Index Scan using monitorx_id_index on monitorx mx (cost=0.00..5.01 rows=1 width=46) (actual time=0.13..0.21 rows=1 loops=1) Index Cond: (idx = 'M-TEST_150-TEST_01_10560776551771895174239'::character varying) -> Index Scan using monitorstatusxmonitori on monitorstatusx ms (cost=0.00..5996.18 rows=163 width=84) (actual time=4383.38..14936.39 rows=225 loops=1) Index Cond: ("outer".jdoidx = ms.monitorx) Filter: ((datex >= '2003-07-01 00:00:00'::timestamp without time zone) AND (datex <= '2003-07-01 23:59:59'::timestamp without time zone)) -> Index Scan using monitorstatus_stjdoidb742c9b3i on monitorstatus_statusitemsx mssisx (cost=0.00..3.01 rows=1 width=16) (actual time=2.15..2.15 rows=1 loops=225) Index Cond: ("outer".jdoidx = mssisx.jdoidx) -> Index Scan using monitorstatusitejdoid7db0befci on monitorstatusitemlistd8ea58a5x litem (cost=0.00..3.01 rows=1 width=16) (actual time=2.28..2.28 rows=1 loops=225) Index Cond: ("outer".statusitemsx = litem.jdoidx) -> Index Scan using monitorstatusitemx_pkey on monitorstatusitemx msi (cost=0.00..3.01 rows=1 width=665) (actual time=3.41..3.41 rows=1 loops=225) Index Cond: ("outer".statusitemlistx = msi.jdoidx) Total runtime: 16713.25 msec (18 rows) As you can see it takes 16 seconds to return only 18 rows. The monitorstatusx table has over 7 million rows, and for each monitor status there's one row in each of the monitorstatusitemx and the join tables. So I think the size of the database is just too high for this sort. I run my reports offline, but what I'm finding is that at 16 seconds per report, the reports aren't finished by morning. My postgresql.conf is attached in case I have it configured incorrectly. Thanks, Michael
> My postgresql.conf is attached in case I have it configured incorrectly. Forgot my postgres.conf..
Attachment
On Wed, 2003-07-02 at 10:28, Michael Mattox wrote: > > My postgresql.conf is attached in case I have it configured incorrectly. > > Forgot my postgres.conf.. Shared buffers is probably too high. How much memory in this machine? Is there anything else running aside from PostgreSQL? What does top say about cached / buffered data (number) I see you reduced the random_page_cost to 1.5. Why did you do this (how is your disk subsystem configured)? -- Rod Taylor <rbt@rbt.ca> PGP Key: http://www.rbt.ca/rbtpub.asc
Attachment
On Wed, 2003-07-02 at 10:24, Michael Mattox wrote: > > Try this: > > Rod, you improved my query last week (thank you very much) but I'm not sure > why but my performance is getting worse. I think I know what happened, when > I did my load testing I created data that all had the same date, so sorting > on the date was very fast. But now I've been running the system for a few > weeks I've got a range of dates and now the sort is very costly. I'm > curious if it's possible to optimize this with an index? I've tried > creating some indexes but they're never used. Standard questions, did you VACUUM? Regularly? Want to try again and send us the output from VACUUM VERBOSE? Sounds like you created a ton of test data, then removed a bunch? Did you REINDEX that table? During normal use, what is your query spread like? Mostly selects with some inserts? Any updates or deletes? How often to updates or deletes come in, and how many rows do they effect? > -> Index Scan using monitorstatusxmonitori on > monitorstatusx ms (cost=0.00..5996.18 rows=163 width=84) (actual > time=4383.38..14936.39 rows=225 loops=1) > Index Cond: ("outer".jdoidx = ms.monitorx) > Filter: ((datex >= '2003-07-01 > 00:00:00'::timestamp without time zone) AND (datex <= '2003-07-01 > 23:59:59'::timestamp without time zone)) The above index scan is taking a vast majority of the time (nearly 15 seconds of the 16 second total -- stop thinking about sorts!).. What happened to the index on monitorx and datex? -- Rod Taylor <rbt@rbt.ca> PGP Key: http://www.rbt.ca/rbtpub.asc
Attachment
> Shared buffers is probably too high. How much memory in this machine? > Is there anything else running aside from PostgreSQL? What does top say > about cached / buffered data (number) I was using the 25% of RAM guideline posted recently. The machine has 1.5gig but it also has a couple other java applications running on it including tomcat. 1:56pm up 6 days, 2:58, 6 users, load average: 2.60, 2.07, 1.78 193 processes: 191 sleeping, 2 running, 0 zombie, 0 stopped CPU0 states: 14.0% user, 9.0% system, 0.0% nice, 75.1% idle CPU1 states: 31.0% user, 0.1% system, 0.0% nice, 67.0% idle CPU2 states: 5.0% user, 0.1% system, 0.1% nice, 93.0% idle CPU3 states: 0.0% user, 0.1% system, 0.1% nice, 98.0% idle Mem: 1547572K av, 1537848K used, 9724K free, 0K shrd, 25104K buff Swap: 1044216K av, 51352K used, 992864K free 1245460K cached PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME COMMAND 10184 veriguar 9 0 94760 83M 2612 S 36.5 5.5 13:29 java 8990 postgres 17 0 54864 53M 53096 R 11.5 3.5 0:00 postmaster 8988 veriguar 11 0 1164 1164 836 R 1.9 0.0 0:00 top 10161 veriguar 13 5 69504 60M 2600 S N 0.9 3.9 13:11 java 10206 veriguar 13 5 27952 23M 2580 S N 0.9 1.5 7:21 java 10699 postgres 9 0 31656 30M 30396 S 0.9 2.0 0:02 postmaster total used free shared buffers cached Mem: 1547572 1532024 15548 0 23820 1239024 -/+ buffers/cache: 269180 1278392 Swap: 1044216 51368 992848 > I see you reduced the random_page_cost to 1.5. Why did you do this (how > is your disk subsystem configured)? Someone suggested I lower it to 1.5 or 1.0, not sure what the reasoning was. The disks are both SCSI 10,000 RPM. My data directory is on one disk by itself, and the pg_xlog is on the other disk as well as the operating system and everything else. I was told it's best to have them on seperate disks, however I'm wondering because my system only has two disks and the one with the operating system isn't big enough to hold my database therefore I must put my DB on the 2nd disk and if pg_xlog is to be separate, it has to be with the OS & Java apps. > Standard questions, did you VACUUM? Regularly? Want to try again and > send us the output from VACUUM VERBOSE? I vacuum the monitor table every 5 minutes and I do a vacuum full analyze every night at midnight (cron job). I just did a vacuum verbose, output is attached. > Sounds like you created a ton of test data, then removed a bunch? Did > you REINDEX that table? I haven't deleted any of the data, I've been continuously adding new data. I added about 6 million rows at once, and they all had the same date. Since then my application has been stress testing over about 2 weeks now so there's now 7693057 rows in monitorstatusx and monitorstatusitemx as well as the necessary rows for the join tables. > During normal use, what is your query spread like? Mostly selects with > some inserts? Any updates or deletes? How often to updates or deletes > come in, and how many rows do they effect? There is a query on monitorx by datex every 10 seconds (monitors are updated every 5 minutes, so every 10 seconds I get the monitors that are due for an update). Each monitor is then saved with its status field modified, and a new status item is inserted. This happens every 5 minutes. There are 8-16 monitors being run in parallel, although typically it's 8 or less. This is the main application. The reporting application does a few queries but nothing major except the query that is the subject of this email. It's the reporting app that is slow due to this one big query. Finally the web app executes the same query as the reporting app, except there is a lot less data to be returned since it's only for the current day. > > -> Index Scan using > monitorstatusxmonitori on > > monitorstatusx ms (cost=0.00..5996.18 rows=163 width=84) (actual > > time=4383.38..14936.39 rows=225 loops=1) > > Index Cond: ("outer".jdoidx = > ms.monitorx) > > Filter: ((datex >= '2003-07-01 > > 00:00:00'::timestamp without time zone) AND (datex <= '2003-07-01 > > 23:59:59'::timestamp without time zone)) > > The above index scan is taking a vast majority of the time (nearly 15 > seconds of the 16 second total -- stop thinking about sorts!).. What > happened to the index on monitorx and datex? I just did reindex table monitorstatux; which didn't help, in fact query times went up. I then did create index monitorstatus_monitor_date_i on monitorstatusx(monitorx, datex); and this seemed to help a little: QUERY PLAN ---------------------------------------------------------------------------- ---------------------------------------------------------------------------- ---------------------------------------------------------------- Sort (cost=1133.13..1133.38 rows=98 width=827) (actual time=9754.06..9754.25 rows=226 loops=1) Sort Key: ms.datex -> Nested Loop (cost=0.00..1129.90 rows=98 width=827) (actual time=50.81..9753.17 rows=226 loops=1) -> Nested Loop (cost=0.00..833.47 rows=98 width=162) (actual time=50.74..7149.28 rows=226 loops=1) -> Nested Loop (cost=0.00..537.04 rows=98 width=146) (actual time=50.67..4774.45 rows=226 loops=1) -> Nested Loop (cost=0.00..240.44 rows=98 width=130) (actual time=50.61..1515.10 rows=226 loops=1) -> Index Scan using monitorx_id_index on monitorx mx (cost=0.00..3.45 rows=1 width=46) (actual time=0.09..0.11 rows=1 loops=1) Index Cond: (idx = 'M-TEST_170-TEST_00_10560857890510173779233'::character varying) -> Index Scan using monitorstatus_monitor_date_i on monitorstatusx ms (cost=0.00..234.93 rows=165 width=84) (actual time=50.51..1513.21 rows=226 loops=1) Index Cond: (("outer".jdoidx = ms.monitorx) AND (ms.datex >= '2003-07-01 00:00:00'::timestamp without time zone) AND (ms.datex <= '2003-07-01 23:59:59'::timestamp without time zone)) -> Index Scan using monitorstatus_stjdoidb742c9b3i on monitorstatus_statusitemsx mssisx (cost=0.00..3.01 rows=1 width=16) (actual time=14.40..14.41 rows=1 loops=226) Index Cond: ("outer".jdoidx = mssisx.jdoidx) -> Index Scan using monitorstatusitejdoid7db0befci on monitorstatusitemlistd8ea58a5x litem (cost=0.00..3.01 rows=1 width=16) (actual time=10.49..10.49 rows=1 loops=226) Index Cond: ("outer".statusitemsx = litem.jdoidx) -> Index Scan using monitorstatusitemx_pkey on monitorstatusitemx msi (cost=0.00..3.01 rows=1 width=665) (actual time=11.50..11.50 rows=1 loops=226) Index Cond: ("outer".statusitemlistx = msi.jdoidx) Total runtime: 9754.64 msec (17 rows) Before I guess the index with monitorx,datex didn't do much because all the data had the same date. But now that I have over 2 weeks of real data, it makes a difference. Thanks, Michael