Thread: Query with large number of joins

Query with large number of joins

From
Marco Di Cesare
Date:

We are using Postgres for the first time after being SQLServer users for a long time, so forgive us for being noobs.

We are using a BI tool that generates a query with an unusually large number of joins. My understanding is that with this many joins the Postgres query planner can't possibly use an exhaustive search, so it drops into a heuristic algorithm. Unfortunately, the query runs quite slowly (~35 seconds) and seems to ignore using primary keys and indexes where available.

 

Query plan here (sorry had to anonymize):

http://explain.depesz.com/s/Uml

 

Line 30 is one of the pain points where a full table scan is running on 4.2 million rows even though there are indexes on oscar_bravo.foxtrot_four and oscar_charlie.foxtrot_four

 

We've tried to play around with the join_collapse_limit value by upping it from the default of 8 to 10 or 12 but it doesn't seem to help much. Cranking the value up to an unreasonable value of 20 does shave some seconds off the query time but not substantially (explain plan with the value set to 20: http://explain.depesz.com/s/sW6).
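(The experiments were along these lines, per-session; a sketch where the placeholder stands in for the BI tool's generated query:)

    SET join_collapse_limit = 20;
    EXPLAIN (ANALYZE, BUFFERS) <generated query>;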

 

We haven't tried playing around with the geqo_threshold at this point.

 

Any thoughts on ways to speed up the run time of this query, or any other Postgres settings we should be aware of when dealing with this unusually large number of joins?

 

Thanks in advance

Marco Di Cesare

 

Re: Query with large number of joins

From
Tom Lane
Date:
Marco Di Cesare <Marco.DiCesare@pointclickcare.com> writes:
> We are using a BI tool that generates a query with an unusually large
> number of joins. My understanding is that with this many joins the
> Postgres query planner can't possibly use an exhaustive search, so it
> drops into a heuristic algorithm. Unfortunately, the query runs quite
> slowly (~35 seconds) and seems to ignore using primary keys and indexes
> where available.

> Query plan here (sorry had to anonymize):
> http://explain.depesz.com/s/Uml

It's difficult to make any detailed comments when you've shown us only an
allegedly-bad query plan, and not either the query itself or the table
definitions.

However, it appears to me that the query plan is aggregating over a rather
large number of join rows, and there are very few constraints that would
allow eliminating rows.  So I'm not at all sure there is a significantly
better plan available.  Are you claiming this query was instantaneous
on SQL Server?

The only thing that jumps out at me as possibly improvable is that with
a further increase in work_mem, you could probably get it to change the
last aggregation step from Sort+GroupAggregate into HashAggregate,
which'd likely run faster ... assuming you can spare some more memory.

            regards, tom lane


Re: Query with large number of joins

From
Felipe Santos
Date:


2014-10-20 21:59 GMT-02:00 Tom Lane <tgl@sss.pgh.pa.us>:
> The only thing that jumps out at me as possibly improvable is that with
> a further increase in work_mem, you could probably get it to change the
> last aggregation step from Sort+GroupAggregate into HashAggregate,
> which'd likely run faster ... assuming you can spare some more memory.



Hi,

As Tom said, work_mem seems a nice place to start.
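For instance, a quick session-level experiment (512MB is only an example value; work_mem is allocated per sort/hash node, so size it with care):

    SET work_mem = '512MB';
    EXPLAIN (ANALYZE, BUFFERS) <generated query>;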

Here are other considerations you might take into account:

There's also the opportunity to tune the query itself (if it's not automatically generated by your BI tool). You can often speed up a query by joining against filtered sub-selects instead of joining the entire tables.
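Roughly along these lines, reusing the anonymized names from the original post (an untested sketch):

    SELECT "seven_kilo"."november_lima", "oc"."foxtrot_charlie"
    FROM "public"."seven_kilo" "seven_kilo"
    INNER JOIN (
        -- filter the rows before the join instead of joining the whole table
        SELECT "lima", "foxtrot_four", "foxtrot_charlie"
        FROM "public"."oscar_charlie"
        WHERE "five" = 6
          AND "whiskey" = 'four_romeo'
          AND "charlie_romeo" = 2014
    ) "oc" ON ("seven_kilo"."lima" = "oc"."lima");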

BR

Felipe

Re: Query with large number of joins

From
Merlin Moncure
Date:
On Mon, Oct 20, 2014 at 3:32 PM, Marco Di Cesare
<Marco.DiCesare@pointclickcare.com> wrote:
> Line 30 is one of the pain points where a full table scan is running on 4.2
> million rows even though there are indexes on oscar_bravo.foxtrot_four and
> oscar_charlie.foxtrot_four
>
>
>
> We've tried to play around with the join_collapse_limit value by upping it
> from the default of 8 to 10 or 12 but it doesn't seem to help much. Cranking
> the value up to an unreasonable value of 20 does shave some seconds off the
> query time but not substantially (explain plan with the value set to 20:
> http://explain.depesz.com/s/sW6).

You always have the option of disabling geqo completely.

However, in this case, can you fetch out the relevant fields for
"oscar_bravo" that are participating in the join?  I'd like to see the
field name/type in the source table and the destination table.  Also,
I'd like to see the index definition and the snippet of the query that
presents the join condition.

You can encourage the server to favor index scans vs seq scans by
lowering 'random_page_cost'.  The nuclear option is to disable
sequential scans completely (which is generally a bad idea but can be
useful to try and fetch out queries that are inadvertently forced into
a seqscan for some reason).
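All of these can be tried per-session before running the query, e.g. (the numbers are only illustrative):

    SET geqo = off;               -- skip the genetic optimizer entirely
    SET random_page_cost = 1.5;   -- default is 4.0; lower values favor index scans
    SET enable_seqscan = off;     -- the nuclear option: for diagnosis only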

merlin


Re: Query with large number of joins

From
Marco Di Cesare
Date:
I did not mean to imply this works any better on SQL Server. We never tried. I just meant to say this is the first time we are using Postgres, so we don't have much experience with it.

We tried with work_mem set to 1GB (even as high as 3GB), but it didn't change the GroupAggregate and Sort steps or the query run time.

Sorry, I had to sanitize the query and a few of the relevant tables so hopefully I got it all right. 

SELECT
    "foxtrot_india"."juliet_alpha",
    "foxtrot_india"."foxtrot_yankee",
    "foxtrot_india"."hotel_sierra",
    "foxtrot_india"."juliet_alpha",
    "foxtrot_india"."bravo_romeo",
    "oscar_bravo"."golf_foxtrot",
    "seven_kilo"."november_lima",
    "foxtrot_india"."echo_six",
    "uniform_six"."seven_six",
    "oscar_charlie"."foxtrot_charlie",
    COUNT(DISTINCT "foxtrot_india"."bravo_romeo")
FROM
    "public"."seven_kilo" "seven_kilo"
    INNER JOIN "public"."papa_sierra" "papa_sierra" ON ("seven_kilo"."golf_bravo" = "papa_sierra"."golf_bravo")
    LEFT JOIN "public"."golf_two" "golf_two" ON ("seven_kilo"."lima" = "golf_two"."lima")
    LEFT JOIN "public"."bravo_xray" "bravo_xray" ON ("seven_kilo"."lima" = "bravo_xray"."lima")
    LEFT JOIN "public"."foo1" "foo1" ON (("seven_kilo"."bar1" = "foo1"."bar1") AND ("seven_kilo"."golf_bravo" = "foo1"."golf_bravo"))
    INNER JOIN "public"."oscar_charlie" "oscar_charlie" ON ("seven_kilo"."lima" = "oscar_charlie"."lima")
    INNER JOIN "public"."oscar_bravo" "oscar_bravo" ON ("oscar_charlie"."foxtrot_four" = "oscar_bravo"."foxtrot_four")
    INNER JOIN "public"."foxtrot_india" "foxtrot_india" ON ("oscar_bravo"."sierra" = "foxtrot_india"."sierra")
    INNER JOIN "public"."hotel_romeo" "hotel_romeo" ON ("oscar_charlie"."foxtrot_charlie" = "hotel_romeo"."foxtrot_charlie")
    INNER JOIN "public"."uniform_six" "uniform_six" ON ("hotel_romeo"."hotel_lima" = "uniform_six"."hotel_lima")
    LEFT JOIN "public"."lookup" "foo2" ON ("foxtrot_india"."bar2" = "foo2"."lookup_id")
    LEFT JOIN "public"."uniform_two" "uniform_two" ON ("foxtrot_india"."sierra" = "uniform_two"."sierra")
    INNER JOIN "public"."lookup" "four_xray" ON ("uniform_two"."quebec" = "four_xray"."quebec")
    LEFT JOIN "public"."papa_four" "papa_four" ON ("foxtrot_india"."sierra" = "papa_four"."sierra")
    INNER JOIN "public"."lookup" "romeo_bravo" ON ("papa_four"."quebec" = "romeo_bravo"."quebec")
    LEFT JOIN "public"."juliet_two" "juliet_two" ON ("foxtrot_india"."sierra" = "juliet_two"."sierra")
    INNER JOIN "public"."lookup" "four_delta" ON ("juliet_two"."quebec" = "four_delta"."quebec")
    LEFT JOIN "public"."foo3" "foo3" ON ("foxtrot_india"."bar3" = "foo3"."bar3")
    INNER JOIN "public"."xray" "xray" ON ("seven_kilo"."lima" = "xray"."lima")
    INNER JOIN "public"."romeo_echo" "romeo_echo" ON ("xray"."echo_sierra" = "romeo_echo"."echo_sierra")
WHERE
    ("xray"."echo_sierra" = 'november_foxtrot')
    AND ("romeo_echo"."hotel_oscar" = 'zulu')
    AND ("oscar_charlie"."five" = 6)
    AND ("oscar_charlie"."whiskey" = 'four_romeo')
    AND ("oscar_charlie"."charlie_romeo" = 2014)
GROUP BY 1, 2, 3, 4, 5, 6, 7, 8, 9, 10


                                    Table "public.oscar_bravo"
         Column          |         Type          | Modifiers | Storage  | Stats target | Description

-------------------------+-----------------------+-----------+----------+--------------+------------
-
 foxtrot_four            | character varying(60) | not null  | extended |              |
 sierra                     | character varying(40) | not null  | extended |              |
 foo                     | boolean               | not null  | plain    |              |
 bar                 | numeric(3,2)          |           | main     |              |
 baz                     | integer               | not null  | plain    |              |
 
Indexes:
    "foxtrot_four_sierra_PK_IX" PRIMARY KEY, btree (foxtrot_four, sierra)
    "foxtrot_four_idx" btree (foxtrot_four)
    "sierra_idx" btree (sierra) CLUSTER

Foreign-key constraints:
    "sierra_FK" FOREIGN KEY (sierra) REFERENCES foxtrot_india(sierra)
    "foxtrot_four_FK" FOREIGN KEY (foxtrot_four) REFERENCES oscar_charlie(foxtrot_four )
Has OIDs: no


                                   Table "public.oscar_charlie"
      Column       |         Type          | Modifiers | Storage  | Stats target | Description
-------------------+-----------------------+-----------+----------+--------------+-------------
 foxtrot_four      | character varying(60) | not null  | extended |              |
 foxtrot_charlie   | character varying(10) | not null  | extended |              |
 lima               | character varying(30) | not null  | extended |              |
 whiskey            | character varying(3)  | not null  | extended |              |
 charlie_romeo     | numeric(4,0)          | not null  | main     |              |
 five                 | numeric(2,0)          | not null  | main     |              |
 period_end_date   | date                  | not null  | plain    |              |
 qm_score          | numeric(5,2)          | not null  | main     |              |
 revision_date     | date                  | not null  | plain    |              |
Indexes:
    "foxtrot_four_PK_IX" PRIMARY KEY, btree (foxtrot_four)
    "foxtrot_charlie_UQ_IX" UNIQUE CONSTRAINT, btree (foxtrot_charlie, lima, whiskey, charlie_romeo, five)
    "target_period_idx" btree (five, whiskey, charlie_romeo) CLUSTER
    "foxtrot_charlie_idx" btree (foxtrot_charlie)
Foreign-key constraints:
Referenced by:
    TABLE "oscar_bravo" CONSTRAINT "foxtrot_four_FK" FOREIGN KEY (foxtrot_four) REFERENCES oscar_charlie(foxtrot_four)
Has OIDs: no


                                   Table "public.foxtrot_india"
        Column        |          Type          | Modifiers | Storage  | Stats target | Description
----------------------+------------------------+-----------+----------+--------------+-------------
 sierra                  | character varying(40)  | not null  | extended |              |
 lima                  | character varying(30)  | not null  | extended |              |
 global_client_id     | character varying(40)  | not null  | extended |              |
 org_assess_id        | integer                |           | plain    |              |
 org_client_id        | integer                |           | plain    |              |
 assess_ref_date      | date                   | not null  | plain    |              |
 assess_type          | character varying(10)  | not null  | extended |              |
 client_name          | character varying(100) | not null  | extended |              |
 gender               | character(1)           | not null  | extended |              |
 date_of_birth        | date                   | not null  | plain    |              |
 room                 | character varying(10)  | not null  | extended |              |
 bed                  | character varying(10)  |           | extended |              |
 org_floor_id         | integer                |           | plain    |              |
 org_unit_id          | integer                |           | plain    |              |
 global_physician_id  | character varying(20)  |           | extended |              |
 payer_type_lookup_id | character varying(10)  |           | extended |              |
 mds_version          | character varying(10)  | not null  | extended |              |
Indexes:
    "sierra_PK_IX" PRIMARY KEY, btree (sierra)
    "lima_FK_IX" hash (lima)    

Referenced by:
    TABLE "oscar_bravo" CONSTRAINT "oscar_sierra_FK" FOREIGN KEY (sierra) REFERENCES foxtrot_india(sierra)
Has OIDs: no


We did also attempt to change the PKs from large varchars to ints, but it didn't make any noticeable difference. If you are wondering why we went with that data type, it's a long story. :-)
 



Re: Query with large number of joins

From
Andrew Dunstan
Date:
On 10/21/2014 12:09 PM, Marco Di Cesare wrote:
> I did not mean to imply this works any better on SQL Server. We never
> tried. I just meant to say this is the first time we are using Postgres,
> so we don't have much experience with it.
> [...]


Please don't top-post on the PostgreSQL lists. See
<http://idallen.com/topposting.html>

Have you tried a) either turning off geqo or setting geqo_threshold
fairly high, or b) setting join_collapse_limit fairly high? Assuming
all the above join targets are tables and not views, setting it to
something like 25 should do the trick.
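Per-session, that would look something like this (a sketch; either of the first two statements addresses geqo):

    SET geqo = off;                  -- or: SET geqo_threshold = 25;
    SET join_collapse_limit = 25;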

You also haven't told us what settings you have for things like
effective_cache_size, which can dramatically affect query plans.

cheers

andrew




Re: Query with large number of joins

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> Have you tried a) either turning off geqo or setting geqo_threshold
> fairly high, or b) setting join_collapse_limit fairly high? Assuming
> all the above join targets are tables and not views, setting it to
> something like 25 should do the trick.

You'd have to do both, I think, to get an exhaustive plan search.

In any case, this query is going to result in full table scans of most
of the tables, because there just aren't very many WHERE constraints;
so expecting it to run instantaneously is a pipe dream.  I'm not sure
that there's a significantly better plan to be had.

            regards, tom lane


Re: Query with large number of joins

From
Tom Lane
Date:
Marco Di Cesare <Marco.DiCesare@pointclickcare.com> writes:
>                 COUNT(DISTINCT "foxtrot_india"."bravo_romeo")

Ah.  That explains why the planner doesn't want to use a hash aggregation
step --- DISTINCT aggregates aren't supported with those.

            regards, tom lane


Re: Query with large number of joins

From
Marco Di Cesare
Date:
On 10/21/2014 12:31 PM, Andrew Dunstan wrote:
> Please don't top-post on the PostgreSQL lists. See <http://idallen.com/topposting.html>

Oops, sorry.

> Have you tried a) either turning off geqo or setting geqo_threshold
> fairly high, or b) setting join_collapse_limit fairly high? Assuming
> all the above join targets are tables and not views, setting it to
> something like 25 should do the trick.

I did try various combinations of these settings but none yielded any significant query run time improvements. 

> You also haven't told us what settings you have for things like
> effective_cache_size, which can dramatically affect query plans.

effective_cache_size = 4096MB

I tried bumping this up as well but again no significant query run time improvements.  



Re: Query with large number of joins

From
Marco Di Cesare
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> Have you tried a) either turning off geqo or setting geqo_threshold
> fairly high, or b) setting join_collapse_limit fairly high? Assuming
> all the above join targets are tables and not views, setting it to
> something like 25 should do the trick.

Tom Lane <tgl@sss.pgh.pa.us> writes:
> You'd have to do both, I think, to get an exhaustive plan search.

> In any case, this query is going to result in full table scans of most
> of the tables, because there just aren't very many WHERE constraints;
> so expecting it to run instantaneously is a pipe dream.  I'm not sure
> that there's a significantly better plan to be had.
>
>             regards, tom lane

I get that same feeling. Just wanted to be sure there was nothing obvious in terms of settings we might have missed.

The BI tool we use wants to load as much raw data as needed and then apply filters (WHERE clauses) on top of that. The numerous joins support those filters, and a good number of those joins are to one-to-many tables, causing a Cartesian product.


Re: Query with large number of joins

From
Igor Neyman
Date:

-----Original Message-----
From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of Marco Di Cesare
Sent: Tuesday, October 21, 2014 4:03 PM
To: Andrew Dunstan; Merlin Moncure
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Query with large number of joins

effective_cache_size = 4096MB

I tried bumping this up as well but again no significant query run time improvements.



Marco,

Didn't you mention that you have something like 48GB of RAM?
In that case (if it's a dedicated db server), you should try setting effective_cache_size to around 40GB (not 4GB).
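For testing, that can be set per-session before touching postgresql.conf:

    SET effective_cache_size = '40GB';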

Regards,
Igor Neyman

Re: Query with large number of joins

From
Merlin Moncure
Date:
On Tue, Oct 21, 2014 at 11:50 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Marco Di Cesare <Marco.DiCesare@pointclickcare.com> writes:
>>                 COUNT(DISTINCT "foxtrot_india"."bravo_romeo")
>
> Ah.  That explains why the planner doesn't want to use a hash aggregation
> step --- DISTINCT aggregates aren't supported with those.

Yup.  With this query, the planner statistics are pretty good for the
most part.  Considering that the query is generated and the amount of
data is significant, the runtime isn't too bad.  The query could be
rewritten to support a hash aggregate...
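For instance (an untested sketch with placeholder names, since the real query groups by ten columns): push the DISTINCT into a subquery so the outer aggregate becomes a plain COUNT, which can use hash aggregation:

    SELECT grp_1, grp_2, COUNT(bravo_romeo)          -- counts pre-deduplicated rows
    FROM (SELECT DISTINCT t.grp_1, t.grp_2, t.bravo_romeo
          FROM big_join t                            -- stands in for the original FROM/WHERE
         ) sub
    GROUP BY 1, 2;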

merlin