Re: (VERY) Slow Query - PostgreSQL 9.2 - Mailing list pgsql-general
From | Bill Moran |
---|---|
Subject | Re: (VERY) Slow Query - PostgreSQL 9.2 |
Date | |
Msg-id | 20160503063332.42dd365fa687af5bd8c4e862@potentialtech.com Whole thread Raw |
In response to | (VERY) Slow Query - PostgreSQL 9.2 ("drum.lucas@gmail.com" <drum.lucas@gmail.com>) |
List | pgsql-general |
On Tue, 3 May 2016 21:55:21 +1200 "drum.lucas@gmail.com" <drum.lucas@gmail.com> wrote: > Hi all, > > I'm trying to get the query below a better performance.. but just don't > know what else I can do... > > Please, have a look and let me know if you can help somehow.. also.. if you > need some extra data jet ask me please. > > * Note that the gorfs.inode_segments table is 1.7TB size > > I have the following Query: > > explain analyzeSELECT split_part(full_path, '/', 4)::INT AS account_id, > split_part(full_path, '/', 6)::INT AS note_id, > split_part(full_path, '/', 9)::TEXT AS variation, > st_size, > segment_index, > reverse(split_part(reverse(full_path), '/', 1)) as file_name, > i.st_ino, > full_path, > (i.st_size / 1000000::FLOAT)::NUMERIC(5,2) || 'MB' AS > size_mbFROM gorfs.inodes iJOIN gorfs.inode_segments s > ON i.st_ino = s.st_ino_targetWHERE > i.checksum_md5 IS NOT NULL > AND > AND i.st_size > 0; > split_part(s.full_path, '/', 4)::INT IN ( > SELECT account.id > FROM public.ja_clients AS account > WHERE > NOT ( > ((account.last_sub_pay > EXTRACT('epoch' FROM > (transaction_timestamp() - CAST('4 Months' AS INTERVAL)))) AND > (account.price_model > 0)) OR > (account.regdate > EXTRACT('epoch' FROM > (transaction_timestamp() - CAST('3 Month' AS INTERVAL)))) OR > (((account.price_model = 0) AND (account.jobcredits > > 0)) AND (account.last_login > EXTRACT('epoch' FROM > (transaction_timestamp() - CAST('4 Month' AS INTERVAL))))) > ) LIMIT 100); > > > - Explain analyze link: http://explain.depesz.com/s/Oc6 > > The query is taking ages, and I can't get the problem solved. > > These are the index I've already created on the inode_segments table: > > Indexes: > "ix_account_id_from_full_path" "btree" > (("split_part"("full_path"::"text", '/'::"text", 4)::integer)) WHERE > "full_path"::"text" ~ > '^/userfiles/account/[0-9]+/[a-z]+/[0-9]+'::"text" > "ix_inode_segments_ja_files_lookup" "btree" ((CASE > WHEN "full_path"::"text" ~ '/[^/]*\.[^/]*$'::"text" THEN > "upper"("regexp_replace"("full_path"::"text", '.*\.'::"text", > ''::"text", 'g'::"text")) > ELSE NULL::"text"END)) WHERE > "gorfs"."is_kaminski_note_path"("full_path"::"text") > "ix_inode_segments_notes_clientids" "btree" > (("split_part"("full_path"::"text", '/'::"text", 4)::integer)) WHERE > "gorfs"."is_kaminski_note_path"("full_path"::"text") > "ix_inode_segments_notes_clientids2" "btree" ("full_path") > "ix_inode_segments_notes_fileids" "btree" > (("split_part"("full_path"::"text", '/'::"text", 8)::integer)) WHERE > "gorfs"."is_kaminski_note_path"("full_path"::"text") > "ix_inode_segments_notes_noteids" "btree" > ((NULLIF("split_part"("full_path"::"text", '/'::"text", 6), > 'unassigned'::"text")::integer)) WHERE > "gorfs"."is_kaminski_note_path"("full_path"::"text") > > These are the index I've already created on the inodes table: > > Indexes: > "ix_inodes_checksum_st_size" "btree" ("checksum_md5", "st_size") > WHERE "checksum_md5" IS NOT NULL > > *Question:* > > What else can I do to improve the Performance of the Query? From the explain, it looks like the biggest pain point is the inode_segments table, specifically, this condition: s.full_path ~ '^/userfiles/account/[0-9]+/[a-z]+/[0-9]+' It's doing a full scan of every record in that table, which is a large number, and that regex can't be cheap over that kind of volume. If you do: SELECT count(*) FROM inode_segments WHERE full_path ~ '^/userfiles/account/[0-9]+/[a-z]+/[0-9]+'; how many tuples actually match that condition? If the number is a large percentage of the total table, then I'm not sure how to help you, but if the percentage is small, you might be able to speed things up by adding an index: CREATE INDEX is_fp_trunc_idx ON inode_segments(substring(full_path FROM 1 FOR 19)); Then adding this condition to the where clause: substring(s.full_path FROM 1 FOR 19) = '/userfiles/account/' There are other index combinations that may help as well, depending on the nature of the values in that table, but, in general, anything you can do to reduce the number of records that have to be examined in that table is liable to speed things up. -- Bill Moran
pgsql-general by date: