Thread: Queries within a function

Queries within a function

From
Mridula Mahadevan
Date:

Hi,

 I am running a bunch of queries within a function, creating some temp tables and populating them. When the data exceeds, say, 100k rows, the queries get really slow and time out (30 min). When these are run outside of a transaction (in auto-commit mode), they run in a few seconds. Any ideas on what may be going on, and any postgresql.conf parameters etc. that might help?

Thanks

Re: Queries within a function

From
Pavel Stehule
Date:
Hello

look on http://blog.endpoint.com/2008/12/why-is-my-function-slow.html

Regards
Pavel Stehule

2010/2/2 Mridula Mahadevan <mmahadevan@stratify.com>:
> Hi,
>
>  I am running a bunch of queries within a function, creating some temp
> tables and populating them. When the data exceeds, say, 100k rows, the
> queries get really slow and time out (30 min). When these are run
> outside of a transaction (in auto-commit mode), they run in a few
> seconds. Any ideas on what may be going on, and any postgresql.conf
> parameters etc. that might help?
>
> Thanks

Re: Queries within a function

From
Tom Lane
Date:
Mridula Mahadevan <mmahadevan@stratify.com> writes:
>  I am running a bunch of queries within a function, creating some temp
> tables and populating them. When the data exceeds, say, 100k rows, the
> queries get really slow and time out (30 min). When these are run
> outside of a transaction (in auto-commit mode), they run in a few
> seconds. Any ideas on what may be going on, and any postgresql.conf
> parameters etc. that might help?

I'll bet the function is caching query plans that stop being appropriate
once the table grows in size.  You might have to resort to using
EXECUTE, although if you're on 8.4 DISCARD PLANS ought to help too.

            regards, tom lane
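
For illustration, a minimal PL/pgSQL sketch of the EXECUTE approach Tom describes (the function and table names here are hypothetical, not from the thread): a query issued via EXECUTE is re-planned on every execution, so it picks up the temp table's current statistics instead of reusing a plan cached while the table was still empty.

CREATE OR REPLACE FUNCTION count_promoted() RETURNS bigint AS $$
DECLARE
    n bigint;
BEGIN
    -- tmp_docs is a hypothetical temp table created earlier in the session.
    -- A query written inline here would be planned once and the plan cached;
    -- EXECUTE forces a fresh plan on every call, using current statistics.
    EXECUTE 'SELECT count(*) FROM tmp_docs WHERE promoted = 1' INTO n;
    RETURN n;
END;
$$ LANGUAGE plpgsql;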

Re: Queries within a function

From
"Ing. Marcos Orti­z Valmaseda"
Date:
Mridula Mahadevan wrote:
>
> Hi,
>
>  I am running a bunch of queries within a function, creating some temp
> tables and populating them. When the data exceeds, say, 100k rows, the
> queries get really slow and time out (30 min). When these are run
> outside of a transaction (in auto-commit mode), they run in a few
> seconds. Any ideas on what may be going on, and any postgresql.conf
> parameters etc. that might help?
>
> Thanks
>
Could you post here the result of the EXPLAIN command for the query?
Could you post here the postgresql.conf parameters that you have on your box?

Regards


--
--------------------------------------------------------------------------------
"Para ser realmente grande, hay que estar con la gente, no por encima de ella."
                                                                   Montesquieu
Ing. Marcos Luís Ortíz Valmaseda
PostgreSQL System DBA && DWH -- BI Apprentice

Centro de Tecnologías de Almacenamiento y Análisis de Datos (CENTALAD)
Universidad de las Ciencias Informáticas

Linux User # 418229

-- PostgreSQL --
"TIP 4: No hagas 'kill -9' a postmaster"
http://www.postgresql-es.org
http://www.postgresql.org
http://www.planetpostgresql.org

-- DWH + BI --
The Data WareHousing Institute
http://www.tdwi.org
http://www.tdwi.org/cbip
---------------------------------------------------------------------------------



Re: Queries within a function

From
Mridula Mahadevan
Date:
Thanks Tom and Marcos.

More details -

I have 2 temp tables
Table a -
CREATE TABLE a (
    id int PRIMARY KEY,
    promoted int,
    unq_str varchar
);

Table b -
CREATE TABLE b (
    id int PRIMARY KEY,
    dup_id int
) TABLESPACE tblspc_tmp;

And this is my insert statement

INSERT INTO b
SELECT a2.id,
       (SELECT MIN(a1.id) FROM a a1
        WHERE a1.unq_str = a2.unq_str AND a1.promoted = 1) AS dup_id
FROM a a2
WHERE a2.promoted = 0;


Explain -

"Seq Scan on a a2  (cost=0.00..517148464.79 rows=126735 width=12)"
"  Filter: (promoted = 0)"
"  SubPlan"
"    ->  Aggregate  (cost=4080.51..4080.52 rows=1 width=4)"
"          ->  Seq Scan on a a1 (cost=0.00..4080.51 rows=1 width=4)"
"                Filter: (((unq_str)::text = ($0)::text) AND (promoted = 1))"



Postgresql.conf options -


# RESOURCE USAGE (except WAL)
#------------------------------------------------------------------------------

# - Memory -

shared_buffers = 128MB                  # min 128kB or max_connections*16kB
# (change requires restart)  (Changed from 24 MB to 128 MB)
#temp_buffers = 128MB                   # min 800kB
max_prepared_transactions = 10  # can be 0 or more (changed from 5 to 20)
                                        # (change requires restart)
# Note:  Increasing max_prepared_transactions costs ~600 bytes of shared memory
# per transaction slot, plus lock space (see max_locks_per_transaction).
work_mem = 5MB                          # min 64kB (Changed from 1MB to 5 MB)
#maintenance_work_mem = 16MB            # min 1MB
#max_stack_depth = 2MB                  # min 100kB

# - Free Space Map -

max_fsm_pages = 153600                  # min max_fsm_relations*16, 6 bytes each
                                        # (change requires restart)
#max_fsm_relations = 1000               # min 100, ~70 bytes each
                                        # (change requires restart)

# - Kernel Resource Usage -

#max_files_per_process = 1000           # min 25
                                        # (change requires restart)
#shared_preload_libraries = ''          # (change requires restart)

# - Cost-Based Vacuum Delay -

#vacuum_cost_delay = 0                  # 0-1000 milliseconds
#vacuum_cost_page_hit = 1               # 0-10000 credits
#vacuum_cost_page_miss = 10             # 0-10000 credits
#vacuum_cost_page_dirty = 20            # 0-10000 credits
#vacuum_cost_limit = 200                # 1-10000 credits

# - Background Writer -

#bgwriter_delay = 200ms                 # 10-10000ms between rounds
#bgwriter_lru_maxpages = 100            # 0-1000 max buffers written/round
#bgwriter_lru_multiplier = 2.0          # 0-10.0 multiplier on buffers scanned/round

#------------------------------------------------------------------------------
# WRITE AHEAD LOG
#------------------------------------------------------------------------------

# - Settings -

#fsync = on                             # turns forced synchronization on or off
#synchronous_commit = on                # immediate fsync at commit
#wal_sync_method = fsync                # the default is the first option
                                        # supported by the operating system:
                                        #   open_datasync
                                        #   fdatasync
                                        #   fsync
                                        #   fsync_writethrough
                                        #   open_sync
#full_page_writes = on                  # recover from partial page writes
#wal_buffers = 64kB                     # min 32kB
                                        # (change requires restart)
#wal_writer_delay = 200ms               # 1-10000 milliseconds

commit_delay = 5000                     # range 0-100000, in microseconds
                                        # (changed from 0.5 to 5000)
#commit_siblings = 5                    # range 1-1000

# - Checkpoints -

checkpoint_segments = 10                # in logfile segments, min 1, 16MB each
#checkpoint_timeout = 5min              # range 30s-1h
#checkpoint_completion_target = 0.5     # checkpoint target duration, 0.0 - 1.0
#checkpoint_warning = 30s               # 0 is off

# - Archiving -

#archive_mode = off             # allows archiving to be done
                                # (change requires restart)
#archive_command = ''           # command to use to archive a logfile segment
#archive_timeout = 0            # force a logfile segment switch after this
                                # time; 0 is off


#------------------------------------------------------------------------------
# QUERY TUNING
#------------------------------------------------------------------------------

# - Planner Method Configuration -

#enable_bitmapscan = on
#enable_hashagg = on
#enable_hashjoin = on
#enable_indexscan = on
#enable_mergejoin = on
#enable_nestloop = on
enable_seqscan = on
#enable_sort = on
#enable_tidscan = on

# - Planner Cost Constants -

#seq_page_cost = 1.0                    # measured on an arbitrary scale
#random_page_cost = 4.0                 # same scale as above
#cpu_tuple_cost = 0.01                  # same scale as above
#cpu_index_tuple_cost = 0.005           # same scale as above
#cpu_operator_cost = 0.0025             # same scale as above
effective_cache_size = 512MB            # (Changed from 128 MB to 256 MB)

# - Genetic Query Optimizer -

#geqo = on
#geqo_threshold = 12
#geqo_effort = 5                        # range 1-10
#geqo_pool_size = 0                     # selects default based on effort
#geqo_generations = 0                   # selects default based on effort
#geqo_selection_bias = 2.0              # range 1.5-2.0

# - Other Planner Options -

default_statistics_target = 100         # range 1-1000 (changed from 10 to 100)
#constraint_exclusion = off
#from_collapse_limit = 8
#join_collapse_limit = 8                # 1 disables collapsing of explicit
                                        # JOIN clauses



-----Original Message-----
From: "Ing. Marcos Ortiz Valmaseda" [mailto:mlortiz@uci.cu]
Sent: Tuesday, February 02, 2010 11:59 AM
To: Mridula Mahadevan
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Queries within a function

Mridula Mahadevan wrote:
>
> Hi,
>
>  I am running a bunch of queries within a function, creating some temp
> tables and populating them. When the data exceeds, say, 100k rows, the
> queries get really slow and time out (30 min). When these are run
> outside of a transaction (in auto-commit mode), they run in a few
> seconds. Any ideas on what may be going on, and any postgresql.conf
> parameters etc. that might help?
>
> Thanks
>
Could you post here the result of the EXPLAIN command for the query?
Could you post here the postgresql.conf parameters that you have on your box?

Regards


--
--------------------------------------------------------------------------------
"Para ser realmente grande, hay que estar con la gente, no por encima de ella."
                                                                   Montesquieu Ing. Marcos Luís Ortíz Valmaseda
PostgreSQLSystem DBA && DWH -- BI Apprentice 

Centro de Tecnologías de Almacenamiento y Análisis de Datos (CENTALAD) Universidad de las Ciencias Informáticas

Linux User # 418229

-- PostgreSQL --
"TIP 4: No hagas 'kill -9' a postmaster"
http://www.postgresql-es.org
http://www.postgresql.org
http://www.planetpostgresql.org

-- DWH + BI --
The Data WareHousing Institute
http://www.tdwi.org
http://www.tdwi.org/cbip
---------------------------------------------------------------------------------


Re: Queries within a function

From
Mridula Mahadevan
Date:
Tom,
 I cannot use EXECUTE because all these are temp tables with ON COMMIT DROP
within a function. This must have something to do with running it in a
transaction: when I run them in auto-commit mode (without ON COMMIT DROP), the
queries return in a few seconds.


-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: Tuesday, February 02, 2010 11:28 AM
To: Mridula Mahadevan
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Queries within a function

Mridula Mahadevan <mmahadevan@stratify.com> writes:
>  I am running a bunch of queries within a function, creating some temp
> tables and populating them. When the data exceeds, say, 100k rows, the
> queries get really slow and time out (30 min). When these are run
> outside of a transaction (in auto-commit mode), they run in a few
> seconds. Any ideas on what may be going on, and any postgresql.conf
> parameters etc. that might help?

I'll bet the function is caching query plans that stop being appropriate
once the table grows in size.  You might have to resort to using
EXECUTE, although if you're on 8.4 DISCARD PLANS ought to help too.

            regards, tom lane

Re: Queries within a function

From
"ramasubramanian"
Date:
Hi,
Try using dynamic SQL. The query will be faster in a function.
Regards,
Ram
----- Original Message -----
From: "Mridula Mahadevan" <mmahadevan@stratify.com>
To: "Tom Lane" <tgl@sss.pgh.pa.us>
Cc: <pgsql-performance@postgresql.org>
Sent: Wednesday, February 03, 2010 2:23 AM
Subject: Re: [PERFORM] Queries within a function


Tom,
 I cannot use EXECUTE because all these are temp tables with ON COMMIT
DROP within a function. This must have something to do with running it
in a transaction: when I run them in auto-commit mode (without ON COMMIT
DROP), the queries return in a few seconds.


-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: Tuesday, February 02, 2010 11:28 AM
To: Mridula Mahadevan
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Queries within a function

Mridula Mahadevan <mmahadevan@stratify.com> writes:
>  I am running a bunch of queries within a function, creating some temp
> tables and populating them. When the data exceeds, say, 100k rows, the
> queries get really slow and time out (30 min). When these are run
> outside of a transaction (in auto-commit mode), they run in a few
> seconds. Any ideas on what may be going on, and any postgresql.conf
> parameters etc. that might help?

I'll bet the function is caching query plans that stop being appropriate
once the table grows in size.  You might have to resort to using
EXECUTE, although if you're on 8.4 DISCARD PLANS ought to help too.

regards, tom lane

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance



Re: Queries within a function

From
Віталій Тимчишин
Date:


2010/2/2 Mridula Mahadevan <mmahadevan@stratify.com>

Hi,

 I am running a bunch of queries within a function, creating some temp tables and populating them. When the data exceeds, say, 100k rows, the queries get really slow and time out (30 min). When these are run outside of a transaction (in auto-commit mode), they run in a few seconds. Any ideas on what may be going on, and any postgresql.conf parameters etc. that might help?

Thanks

Have you tried analyzing the temp tables after you've populated them? Because AFAIK PostgreSQL won't do it automatically for tables that are created, filled, and then used in the same transaction.
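
A minimal sketch of that suggestion, reusing the column layout from earlier in the thread (the populating query here is made-up filler): run ANALYZE on the temp table between filling it and querying it, inside the same function.

CREATE OR REPLACE FUNCTION dedupe_demo() RETURNS void AS $$
BEGIN
    CREATE TEMP TABLE a (id int PRIMARY KEY, promoted int, unq_str varchar)
        ON COMMIT DROP;
    -- Hypothetical fill, standing in for the real populating queries.
    INSERT INTO a
        SELECT g, g % 2, 'str' || (g % 1000)::text
        FROM generate_series(1, 100000) g;
    -- Without this, the queries below are planned with empty-table statistics.
    ANALYZE a;
    -- ... the expensive aggregated queries go here ...
END;
$$ LANGUAGE plpgsql;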

Re: Queries within a function

From
Mridula Mahadevan
Date:

Thank you all, you were right about the ANALYZE. The insert statement with an aggregated subquery had a problem because it was being planned against an empty table.

I had to change the queries to do a simple insert, then an ANALYZE on the table, followed by an update with an aggregated subquery. That goes through very fast.
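
Roughly, with the table definitions from earlier in the thread, the reworked sequence might look like the following sketch (not the exact production code):

-- Step 1: a plain insert, with no aggregated subquery yet.
INSERT INTO b (id)
SELECT id FROM a WHERE promoted = 0;

-- Step 2: refresh statistics now that the tables hold real rows.
ANALYZE a;
ANALYZE b;

-- Step 3: fill in dup_id with the aggregated subquery as an UPDATE.
UPDATE b
SET dup_id = (SELECT MIN(a1.id)
              FROM a a1, a a2
              WHERE a2.id = b.id
                AND a1.unq_str = a2.unq_str
                AND a1.promoted = 1);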

 

-mridula

 

From: Віталій Тимчишин [mailto:tivv00@gmail.com]
Sent: Wednesday, February 03, 2010 8:11 AM
To: Mridula Mahadevan
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Queries within a function

 

 

2010/2/2 Mridula Mahadevan <mmahadevan@stratify.com>

Hi,

 I am running a bunch of queries within a function, creating some temp tables and populating them. When the data exceeds, say, 100k rows, the queries get really slow and time out (30 min). When these are run outside of a transaction (in auto-commit mode), they run in a few seconds. Any ideas on what may be going on, and any postgresql.conf parameters etc. that might help?

Thanks

Have you tried analyzing the temp tables after you've populated them? Because AFAIK PostgreSQL won't do it automatically for tables that are created, filled, and then used in the same transaction.