Thread: Performance difference between Slon master and slave

Performance difference between Slon master and slave

From: Matthew Lunnon
Hello all, I hope someone can help me with this.

Postgres 9.4.4
Slon 2.2.4
Linux

I am using Slony-I to replicate a production database which is in the order of 70GB. I have a reasonably complex select query that runs in 40 seconds on the master but takes in the region of 30-40 minutes on the slave. The postgres configurations are identical and the machines have similar specifications (12-core hyper-threaded HP servers; the slave has slightly less RAM, 132GB vs 148GB). The server running the slave database has a higher load than the one running the master, though the load average on the slave machine was low (1-2) when running the test, and the postgres process on the slave runs at 100% of a single CPU with very little iowait on the server.

Inspecting the execution plans shows some differences: for example, the slave uses a HashAggregate where the master simply groups, and the ordering of the sub-plans also differs. Armed with this knowledge I set enable_hashagg to off and ran the query again; it now takes 53 seconds on the slave, which is a much more acceptable difference, and the execution plans now look very similar (one remaining difference being a HashAggregate on the master that is now missing from the slave, which may account for the extra 13 seconds). I have isolated a much simpler query, detailed below with both execution plans, which shows the difference on line 4. I would rather not disable hash aggregation on the slave as this might have other consequences, so this raises two questions. Firstly, is there anything I can do to stop the slave choosing this plan without turning hash aggregation off? And secondly, why is the slave behaving differently to the master?
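For anyone wanting to reproduce the test, the hash aggregation experiment can be scoped to a single transaction with SET LOCAL rather than changed cluster-wide; a minimal sketch against the simpler query below:

begin;
set local enable_hashagg = off;   -- reverts automatically at commit or rollback
explain (analyze, buffers)
  select ... ;                    -- the CTE query shown below
commit;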

Thanks in advance for any help.

Cheers
Matthew

explain
with my_view_booking_pax_breakdown as (
SELECT bev.booking_id,
    ( SELECT count(*) AS count
           FROM passenger_version
          WHERE passenger_version.current_version = 'T'::bpchar AND passenger_version.deleted = 'F'::bpchar AND passenger_version.indicative_pax_type = 'A'::bpchar AND passenger_version.booking_id = bev.booking_id) AS adult_count,
    ( SELECT count(*) AS count
           FROM passenger_version
          WHERE passenger_version.current_version = 'T'::bpchar AND passenger_version.deleted = 'F'::bpchar AND passenger_version.indicative_pax_type = 'C'::bpchar AND passenger_version.booking_id = bev.booking_id) AS child_count,
    ( SELECT count(*) AS count
           FROM passenger_version
          WHERE passenger_version.current_version = 'T'::bpchar AND passenger_version.deleted = 'F'::bpchar AND passenger_version.indicative_pax_type = 'I'::bpchar AND passenger_version.booking_id = bev.booking_id) AS infant_count
   FROM booking_expanded_version bev
  GROUP BY bev.booking_id
)
select * from "my_view_booking_pax_breakdown" "view_booking_pax_breakdown"
    INNER JOIN "booking"."booking_expanded_version" "booking_expanded_version" ON "view_booking_pax_breakdown"."booking_id"="booking_expanded_version"."booking_id"
    
Master

"Merge Join  (cost=5569138.32..6158794.12 rows=2461265 width=1375)"
"  Merge Cond: (booking_expanded_version.booking_id = view_booking_pax_breakdown.booking_id)"
"  CTE my_view_booking_pax_breakdown"
"    ->  Group  (cost=0.43..5545692.19 rows=215891 width=4)"
"          Group Key: bev.booking_id"
"          ->  Index Only Scan using booking_expanded_version_booking_idx on booking_expanded_version bev  (cost=0.43..64607.40 rows=2461265 width=4)"
"          SubPlan 1"
"            ->  Aggregate  (cost=8.57..8.58 rows=1 width=0)"
"                  ->  Index Scan using passenger_version_idx_4 on passenger_version  (cost=0.43..8.55 rows=5 width=0)"
"                        Index Cond: (booking_id = bev.booking_id)"
"          SubPlan 2"
"            ->  Aggregate  (cost=8.45..8.46 rows=1 width=0)"
"                  ->  Index Scan using passenger_version_idx_3 on passenger_version passenger_version_1  (cost=0.42..8.45 rows=1 width=0)"
"                        Index Cond: (booking_id = bev.booking_id)"
"          SubPlan 3"
"            ->  Aggregate  (cost=8.31..8.32 rows=1 width=0)"
"                  ->  Index Scan using passenger_version_idx_2 on passenger_version passenger_version_2  (cost=0.29..8.31 rows=1 width=0)"
"                        Index Cond: (booking_id = bev.booking_id)"
"  ->  Index Scan using booking_expanded_version_booking_idx on booking_expanded_version  (cost=0.43..546584.09 rows=2461265 width=1347)"
"  ->  Sort  (cost=23445.70..23985.43 rows=215891 width=28)"
"        Sort Key: view_booking_pax_breakdown.booking_id"
"        ->  CTE Scan on my_view_booking_pax_breakdown view_booking_pax_breakdown  (cost=0.00..4317.82 rows=215891 width=28)"

Slave

"Merge Join  (cost=6168518.91..6764756.86 rows=2505042 width=1299)"
"  Merge Cond: (booking_expanded_version.booking_id = view_booking_pax_breakdown.booking_id)"
"  CTE my_view_booking_pax_breakdown"
"    ->  HashAggregate  (cost=212185.03..6142965.53 rows=234040 width=4)"
"          Group Key: bev.booking_id"
"          ->  Seq Scan on booking_expanded_version bev  (cost=0.00..205922.42 rows=2505042 width=4)"
"          SubPlan 1"
"            ->  Aggregate  (cost=8.54..8.55 rows=1 width=0)"
"                  ->  Index Scan using passenger_version_idx_4 on passenger_version  (cost=0.43..8.53 rows=4 width=0)"
"                        Index Cond: (booking_id = bev.booking_id)"
"          SubPlan 2"
"            ->  Aggregate  (cost=8.45..8.46 rows=1 width=0)"
"                  ->  Index Scan using passenger_version_idx_3 on passenger_version passenger_version_1  (cost=0.42..8.45 rows=1 width=0)"
"                        Index Cond: (booking_id = bev.booking_id)"
"          SubPlan 3"
"            ->  Aggregate  (cost=8.31..8.32 rows=1 width=0)"
"                  ->  Index Scan using passenger_version_idx_2 on passenger_version passenger_version_2  (cost=0.29..8.31 rows=1 width=0)"
"                        Index Cond: (booking_id = bev.booking_id)"
"  ->  Index Scan using booking_expanded_version_booking_idx on booking_expanded_version  (cost=0.43..552400.15 rows=2505042 width=1271)"
"  ->  Sort  (cost=25552.95..26138.05 rows=234040 width=28)"
"        Sort Key: view_booking_pax_breakdown.booking_id"
"        ->  CTE Scan on my_view_booking_pax_breakdown view_booking_pax_breakdown  (cost=0.00..4680.80 rows=234040 width=28)"




Re: Performance difference between Slon master and slave

From: Jim Nasby
On 12/14/15 11:16 AM, Matthew Lunnon wrote:
> Inspecting the execution plan shows that there are some differences, for
> example, the slave is using a HashAggregate when the master is simply
> grouping. There also seems to be a difference with the ordering of the
> sub plans.

Have you tried analyzing the tables on the slave?

Also, keep in mind that the first time you access rows on a Slony slave
after they're replicated Postgres will need to write hint bits out,
which will take some time. But that's clearly not the issue here.
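A quick way to confirm when those tables last received fresh statistics (and, if needed, to write hint bits and the visibility map in bulk) might look like the following; the table names are taken from the query earlier in the thread:

select relname, last_vacuum, last_autovacuum, last_analyze, last_autoanalyze
from pg_stat_user_tables
where relname in ('booking_expanded_version', 'passenger_version');

-- a plain VACUUM also sets hint bits and updates the visibility map
vacuum (analyze, verbose) booking_expanded_version;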
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


Re: Performance difference between Slon master and slave

From: Matthew Lunnon
Hi Jim,
Thanks for your response. Yes, the tables have been analysed, and I have also re-indexed and vacuumed the slave database.
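One related thing that may be worth comparing after the vacuum is the visibility map coverage on each node, since the index-only scan seen in the master's plan only pays off when most pages are marked all-visible; a minimal check, assuming the table lives in the booking schema as in the original join:

select relpages, relallvisible
from pg_class
where oid = 'booking.booking_expanded_version'::regclass;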

Regards
Matthew

Sent from my iPad

> On 14 Dec 2015, at 17:49, Jim Nasby <Jim.Nasby@BlueTreble.com> wrote:
>
>> On 12/14/15 11:16 AM, Matthew Lunnon wrote:
>> Inspecting the execution plan shows that there are some differences, for
>> example, the slave is using a HashAggregate when the master is simply
>> grouping. There also seems to be a difference with the ordering of the
>> sub plans.
>
> Have you tried analyzing the tables on the slave?
>
> Also, keep in mind that the first time you access rows on a Slony slave after they're replicated Postgres will need
> to write hint bits out, which will take some time. But that's clearly not the issue here.
> --
> Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
> Experts in Analytics, Data Architecture and PostgreSQL
> Data in Trouble? Get it in Treble! http://BlueTreble.com

