Thread: Performance across multiple schemas

Performance across multiple schemas

From

Robins

Date:

28 August 2007, 10:05:31

Hi,

I have just reorganized a relatively decent sized query such that its constituent functions / tables are now spread over 3-4 schemas.

However, the query has for some reason now become very slow (earlier used to take about 20 seconds, now takes about 500 seconds). The explain analyse (given below) doesn't help either.

(Of what I did try, reducing the number of functions made the query faster, which frankly doesn't help me at all. Sadly removing the functions one-by-one led me to two of them which were taking a lot of time (the 3rd last and the 4th last) but their reason is to me still unknown. Besides, even after removing these two fields the query is still painfully slow as compared to its previous performance).

All functions are STABLE (but that shouldnt matter because this analyse was specifically done for 1 row).
Most functions are in the 'processing' schema and most tables are in the fundsys1 schema.
Almost all the required fields are indexed (It was working fast enough earlier, so I dont think that should be an issue).
Did a VACUUM ANALYSE before running this query.
The NULL with COALESCE is just a temporary hack to replace a variable with NULL to run this query for a small set.

Could someone confirm as to whether a query across multiple schemas is known to have any kind of a degraded performance ?
Any other ideas ?

======================================
"Nested Loop (cost=206.15..246.63 rows=37 width=16) (actual time=362.139..296937.587 rows=841 loops=1)"
" -> Merge Join (cost= 206.15..206.33 rows=1 width=12) (actual time=12.817..12.832 rows=1 loops=1)"
"        Merge Cond: (main.scheme_code = jn_set_schemecode.scheme_code)"
"        -> Sort (cost=201.24..201.31 rows=27 width=12) (actual time= 12.672..12.683 rows=8 loops=1)"
"              Sort Key: main.variant_scheme_code"
"              -> Seq Scan on main (cost=0.00..200.60 rows=27 width=12) (actual time=0.029..6.728 rows=2593 loops=1)"
"                    Filter: (variant_scheme_code = scheme_code)"
"        -> Sort (cost=4.91..4.93 rows=9 width=4) (actual time=0.107..0.110 rows=1 loops=1)"
"              Sort Key: jn_set_schemecode.scheme_code"
"              -> Seq Scan on jn_set_schemecode (cost=0.00..4.76 rows=9 width=4) (actual time=0.074..0.076 rows=1 loops=1)"
"                    Filter: (set_id = 10)"
" -> Seq Scan on "month" (cost= 0.00..25.41 rows=841 width=4) (actual time=0.033..3.049 rows=841 loops=1)"
"Total runtime: 296939.886 ms"

======================================
SELECT
        main.scheme_code,
            (
            (processing.fund_month_end_mean(main.scheme_code, '2005-1-1'::date, '2007-6-30'::date)*12) -
            (processing.risk_free_index_month_end_mean('2005-1-1'::date, '2007-6-30'::date) * 12)
            )/(processing.fund_month_end_stddev_pop(main.scheme_code, '2005-1-1'::date, '2007-6-30'::date)*sqrt(12)),

            processing.fund_month_end_stddev_pop(main.scheme_code , '2005-1-1'::date, '2007-6-30'::date) ,

                (
                    (processing.covariance_fund_index_monthly(main.scheme_code, COALESCE(NULL, stated_index), '2005-1-1'::date, '2007-6-30'::date)*12)/
                        (processing.fund_month_end_stddev_pop(main.scheme_code, '2005-1-1'::date, '2007-6-30'::date)*sqrt(12) *
                        processing.index_month_end_stddev_pop(COALESCE(NULL, stated_index), '2005-1-1'::date, '2007-6-30'::date)*sqrt(12))
                ),

            processing.information_ratio_monthly(main.scheme_code, COALESCE(NULL, stated_index), '2005-1-1'::date, '2007-6-30'::date) ,
            (
                (processing.fund_month_end_mean (main.scheme_code, '2005-1-1'::date, '2007-6-30'::date)*12) -
                ((processing.risk_free_index_month_end_mean('2005-1-1'::date, '2007-6-30'::date) * 12) +
                    ((
                        (processing.covariance_fund_index_monthly(main.scheme_code, COALESCE(NULL, stated_index), '2005-1-1'::date, '2007-6-30'::date)*12) /
                        (processing.index_month_end_variance (COALESCE(NULL, stated_index), '2005-1-1'::date, '2007-6-30'::date)* 12)
                    )*
                    (
                        (processing.index_month_end_mean(COALESCE(NULL, stated_index), '2005-1-1'::date, '2007-6-30'::date)*12) -
                        (processing.risk_free_index_month_end_mean('2005-1-1'::date, '2007-6-30'::date) * 12)
                    )
                    )
                ),
                (
                    (processing.covariance_fund_index_monthly(main.scheme_code, COALESCE(NULL, stated_index), '2005-1-1'::date, '2007-6-30'::date)*12) /
                    (processing.index_month_end_variance (COALESCE(NULL, stated_index), '2005-1-1'::date, '2007-6-30'::date)* 12)
                ),
            processing.upside_capture_ratio_monthly(main.scheme_code, COALESCE(NULL, stated_index), '2005-1-1'::date, '2007-6-30'::date) ,
            processing.downside_capture_ratio_monthly(main.scheme_code, COALESCE(NULL, stated_index), '2005-1-1'::date, '2007-6-30'::date) ,
            processing.fund_return(main.scheme_code, '2007-6-30'::date, '1 year', true) ,
            processing.fund_return(main.scheme_code, '2007-6-30'::date, '2 years', true) ,
            processing.fund_return(main.scheme_code, '2007-6-30'::date, '3 years', true) ,
            processing.fund_return(main.scheme_code, '2007-6-30'::date, '5 years', true) ,
            processing.rolling_return(main.scheme_code, '2007-6-30'::date, '1 year', '1 month', '1 day') ,
            processing.calendar_year_return(main.scheme_code, (extract(year from now()))::integer) ,
            processing.calendar_year_return(main.scheme_code, (extract(year from now()) - 1)::integer),
            processing.calendar_year_return(main.scheme_code, (extract(year from now()) - 2)::integer),
            processing.days_to_liquidate(main.scheme_code, '2007-6-30'::date) as days_to_liquidate,
            processing.deviation_from_index (main.scheme_code, COALESCE(NULL, stated_index), '2007-6-30'::date) ,
            (SELECT index_full_name FROM fundsys1.fs_indices INNER JOIN fundsys1.main ON main.stated_index = index_code where main.scheme_code = jn_set_schemecode.scheme_code),
            (SELECT stated_index FROM fundsys1.main where main.scheme_code = jn_set_schemecode.scheme_code),
            processing.number_of_companies_in_index(jn_set_schemecode.scheme_code, lookup_tables.month.month_end_date),
            processing.percentage_of_assets_in_stocks_as_in_benchmark(jn_set_schemecode.scheme_code, lookup_tables.month.month_end_date)
        FROM lookup_tables.month, fundsys1.main
            INNER JOIN output.jn_set_schemecode ON jn_set_schemecode.scheme_code = main.scheme_code
        WHERE jn_set_schemecode.set_id=10
            AND main.variant_scheme_code = main.scheme_code
        ORDER BY main.scheme_code
======================================

Thanks
Robins Tharakan

Re: Performance across multiple schemas

From

Robins

Date:

28 August 2007, 10:44:17

Oops!
Guess I shot myself in the foot there.

It seems to be an SQL issue and not really a PG problem... Sorry for bothering you all.

However, now that we are here, could anyone tell if you would advise for multiple schemas (in PG) while designing the database structure ?

Thanks
Robins Tharakan

On 8/28/07, Robins <tharakan@gmail.com > wrote:

Hi,

I have just reorganized a relatively decent sized query such that its constituent functions / tables are now spread over 3-4 schemas.

However, the query has for some reason now become very slow (earlier used to take about 20 seconds, now takes about 500 seconds). The explain analyse (given below) doesn't help either.

(Of what I did try, reducing the number of functions made the query faster, which frankly doesn't help me at all. Sadly removing the functions one-by-one led me to two of them which were taking a lot of time (the 3rd last and the 4th last) but their reason is to me still unknown. Besides, even after removing these two fields the query is still painfully slow as compared to its previous performance).

All functions are STABLE (but that shouldnt matter because this analyse was specifically done for 1 row).
Most functions are in the 'processing' schema and most tables are in the fundsys1 schema.
Almost all the required fields are indexed (It was working fast enough earlier, so I dont think that should be an issue).
Did a VACUUM ANALYSE before running this query.
The NULL with COALESCE is just a temporary hack to replace a variable with NULL to run this query for a small set.

Could someone confirm as to whether a query across multiple schemas is known to have any kind of a degraded performance ?
Any other ideas ?

======================================
"Nested Loop (cost=206.15..246.63 rows=37 width=16) (actual time=362.139..296937.587 rows=841 loops=1)"
" -> Merge Join (cost= 206.15..206.33 rows=1 width=12) (actual time=12.817..12.832 rows=1 loops=1)"
"        Merge Cond: (main.scheme_code = jn_set_schemecode.scheme_code)"
"        -> Sort (cost=201.24..201.31 rows=27 width=12) (actual time= 12.672..12.683 rows=8 loops=1)"
"              Sort Key: main.variant_scheme_code"
"              -> Seq Scan on main (cost=0.00..200.60 rows=27 width=12) (actual time=0.029..6.728 rows=2593 loops=1)"
"                    Filter: (variant_scheme_code = scheme_code)"
"        -> Sort (cost=4.91..4.93 rows=9 width=4) (actual time=0.107..0.110 rows=1 loops=1)"
"              Sort Key: jn_set_schemecode.scheme_code"
"              -> Seq Scan on jn_set_schemecode (cost=0.00..4.76 rows=9 width=4) (actual time=0.074..0.076 rows=1 loops=1)"
"                    Filter: (set_id = 10)"
" -> Seq Scan on "month" (cost= 0.00..25.41 rows=841 width=4) (actual time=0.033..3.049 rows=841 loops=1)"
"Total runtime: 296939.886 ms"

======================================
SELECT
        main.scheme_code,
            (
            (processing.fund_month_end_mean(main.scheme_code, '2005-1-1'::date, '2007-6-30'::date)*12) -
            (processing.risk_free_index_month_end_mean('2005-1-1'::date, '2007-6-30'::date) * 12)
            )/(processing.fund_month_end_stddev_pop(main.scheme_code, '2005-1-1'::date, '2007-6-30'::date)*sqrt(12)),

            processing.fund_month_end_stddev_pop(main.scheme_code , '2005-1-1'::date, '2007-6-30'::date) ,

                (
                    (processing.covariance_fund_index_monthly(main.scheme_code, COALESCE(NULL, stated_index), '2005-1-1'::date, '2007-6-30'::date)*12)/
                        (processing.fund_month_end_stddev_pop(main.scheme_code, '2005-1-1'::date, '2007-6-30'::date)*sqrt(12) *
                        processing.index_month_end_stddev_pop(COALESCE(NULL, stated_index), '2005-1-1'::date, '2007-6-30'::date)*sqrt(12))
                ),

            processing.information_ratio_monthly(main.scheme_code, COALESCE(NULL, stated_index), '2005-1-1'::date, '2007-6-30'::date) ,
            (
                (processing.fund_month_end_mean (main.scheme_code, '2005-1-1'::date, '2007-6-30'::date)*12) -
                ((processing.risk_free_index_month_end_mean('2005-1-1'::date, '2007-6-30'::date) * 12) +
                    ((
                        (processing.covariance_fund_index_monthly(main.scheme_code, COALESCE(NULL, stated_index), '2005-1-1'::date, '2007-6-30'::date)*12) /
                        (processing.index_month_end_variance (COALESCE(NULL, stated_index), '2005-1-1'::date, '2007-6-30'::date)* 12)
                    )*
                    (
                        (processing.index_month_end_mean(COALESCE(NULL, stated_index), '2005-1-1'::date, '2007-6-30'::date)*12) -
                        (processing.risk_free_index_month_end_mean('2005-1-1'::date, '2007-6-30'::date) * 12)
                    )
                    )
                ),
                (
                    (processing.covariance_fund_index_monthly(main.scheme_code, COALESCE(NULL, stated_index), '2005-1-1'::date, '2007-6-30'::date)*12) /
                    (processing.index_month_end_variance (COALESCE(NULL, stated_index), '2005-1-1'::date, '2007-6-30'::date)* 12)
                ),
            processing.upside_capture_ratio_monthly(main.scheme_code, COALESCE(NULL, stated_index), '2005-1-1'::date, '2007-6-30'::date) ,
            processing.downside_capture_ratio_monthly(main.scheme_code, COALESCE(NULL, stated_index), '2005-1-1'::date, '2007-6-30'::date) ,
            processing.fund_return(main.scheme_code, '2007-6-30'::date, '1 year', true) ,
            processing.fund_return(main.scheme_code, '2007-6-30'::date, '2 years', true) ,
            processing.fund_return(main.scheme_code, '2007-6-30'::date, '3 years', true) ,
            processing.fund_return(main.scheme_code, '2007-6-30'::date, '5 years', true) ,
            processing.rolling_return(main.scheme_code, '2007-6-30'::date, '1 year', '1 month', '1 day') ,
            processing.calendar_year_return(main.scheme_code, (extract(year from now()))::integer) ,
            processing.calendar_year_return(main.scheme_code, (extract(year from now()) - 1)::integer),
            processing.calendar_year_return(main.scheme_code, (extract(year from now()) - 2)::integer),
            processing.days_to_liquidate(main.scheme_code, '2007-6-30'::date) as days_to_liquidate,
            processing.deviation_from_index (main.scheme_code, COALESCE(NULL, stated_index), '2007-6-30'::date) ,
            (SELECT index_full_name FROM fundsys1.fs_indices INNER JOIN fundsys1.main ON main.stated_index = index_code where main.scheme_code = jn_set_schemecode.scheme_code),
            (SELECT stated_index FROM fundsys1.main where main.scheme_code = jn_set_schemecode.scheme_code),
            processing.number_of_companies_in_index(jn_set_schemecode.scheme_code, lookup_tables.month.month_end_date),
            processing.percentage_of_assets_in_stocks_as_in_benchmark(jn_set_schemecode.scheme_code, lookup_tables.month.month_end_date)
        FROM lookup_tables.month, fundsys1.main
            INNER JOIN output.jn_set_schemecode ON jn_set_schemecode.scheme_code = main.scheme_code
        WHERE jn_set_schemecode.set_id=10
            AND main.variant_scheme_code = main.scheme_code
        ORDER BY main.scheme_code
======================================

Thanks
Robins Tharakan

--
Robins

Re: Performance across multiple schemas

From

Tom Lane

Date:

28 August 2007, 10:50:03

Robins <tharakan@gmail.com> writes:
> Could someone confirm as to whether a query across multiple schemas is known
> to have any kind of a degraded performance ?

Schemas are utterly, utterly irrelevant to performance.

I'm guessing you missed analyzing one of the tables, or forgot an index,
or something like that.  Also, if you did anything "cute" like use the
same table name in more than one schema, you need to check the
possibility that some query is selecting the wrong one of the tables.

The explain output you showed is no help because the expense is
evidently down inside one of the functions in the SELECT output list.

One thing you should probably try before getting too frantic is
re-ANALYZEing all the tables and then starting a fresh session to
clear any cached plans inside the functions.  If it's still slow
then it'd be worth digging deeper.

            regards, tom lane

Re: Performance across multiple schemas

From

Robins

Date:

28 August 2007, 23:45:18

Thanks Tom,

Exactly what I did, when I realised that there was an extra Table in the FROM with no conditions set.

Well anyway, this did clear my doubts about whether schema affects performance at all.

Robins

On 8/29/07, Robins Tharakan <robins@pobox.com> wrote:

Thanks Tom,

Exactly what I did, when I realised that there was an extra Table in the FROM with no conditions set.

Well anyway, this did clear my doubts about whether schema affects performance at all.

Robins

On 8/28/07, Tom Lane < tgl@sss.pgh.pa.us> wrote:

Schemas are utterly, utterly irrelevant to performance.

I'm guessing you missed analyzing one of the tables, or forgot an index,
or something like that.  Also, if you did anything "cute" like use the
same table name in more than one schema, you need to check the
possibility that some query is selecting the wrong one of the tables.

The explain output you showed is no help because the expense is
evidently down inside one of the functions in the SELECT output list.

One thing you should probably try before getting too frantic is
re-ANALYZEing all the tables and then starting a fresh session to
clear any cached plans inside the functions.  If it's still slow
then it'd be worth digging deeper.

                        regards, tom lane

--
Robins

Re: Performance across multiple schemas

From

"Robins Tharakan"

Date:

29 August 2007, 14:00:27

On 8/28/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Schemas are utterly, utterly irrelevant to performance.

I'm guessing you missed analyzing one of the tables, or forgot an index,
or something like that.  Also, if you did anything "cute" like use the
same table name in more than one schema, you need to check the
possibility that some query is selecting the wrong one of the tables.

The explain output you showed is no help because the expense is
evidently down inside one of the functions in the SELECT output list.

One thing you should probably try before getting too frantic is
re-ANALYZEing all the tables and then starting a fresh session to
clear any cached plans inside the functions.  If it's still slow
then it'd be worth digging deeper.

                        regards, tom lane