Re: Weird performance issue with custom function with a for loop. - Mailing list pgsql-general

From Nicos Panayides
Subject Re: Weird performance issue with custom function with a for loop.
Date
Msg-id 4D47DBD8.6050501@magneta.com.cy
Whole thread Raw
In response to Re: Weird performance issue with custom function with a for loop.  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Weird performance issue with custom function with a for loop.
List pgsql-general
On 01/31/2011 08:48 PM, Tom Lane wrote:
> Nicos Panayides<nicos@magneta.com.cy>  writes:
>
>> the following function takes forever to execute as is. I see 'Inserting
>> original actions in temporary table' and nothing after that. If i
>> replace orig_user_id in the FOR loop with 1811 (the same orig_user_id
>> passed as the function parameter) it returns immediately correctly (the
>> table has indices so it's very fast).
>>
> It seems likely that you're getting a different plan for the generic
> case because that user id isn't representative of the overall average
> for the column.  You could investigate by explaining a parameterized
> query:
>
> PREPARE foo (bigint) AS
>    SELECT ... WHERE game_round_actions.user_id = $1 ... ;
> EXPLAIN EXECUTE foo(1811);
>
> (To really see exactly what's happening, you'd probably need to
> parameterize for each of the plpgsql variables used in the query;
> I'm suspicious that the BETWEEN might be contributing to the
> issue as well.)
>
> Possibly increasing the stats target for the user id column would help,
> but it's hard to be sure without knowing what its distribution is like.
>
>             regards, tom lane
>
I tried the prepared statement with both $1 and 1811 for user_id and
here's the plans I got:

"Sort  (cost=51704688.71..51704689.50 rows=314 width=57)"
"  Sort Key: game_round_actions.action_time, game_round_actions.action_id"
"  ->  Nested Loop  (cost=0.00..51704675.69 rows=314 width=57)"
"        ->  Seq Scan on game_round_actions  (cost=0.00..51702078.26
rows=314 width=53)"
"              Filter: ((action_time >= $2) AND (action_time <= $3) AND
(sub_action_id = 0) AND (user_id = $1))"
"        ->  Index Scan using "PK_game_table" on game_tables
(cost=0.00..8.26 rows=1 width=12)"
"              Index Cond: (game_tables.table_id =
game_round_actions.table_id)"
"              Filter: (game_tables.game_type_id <> ANY ($4))"

"Sort  (cost=226660.58..226661.33 rows=300 width=57)"
"  Sort Key: game_round_actions.action_time, game_round_actions.action_id"
"  ->  Nested Loop  (cost=0.00..226648.24 rows=300 width=57)"
"        ->  Index Scan using i_session on game_round_actions
(cost=0.00..224166.97 rows=300 width=53)"
"              Index Cond: ((action_time >= $2) AND (action_time <= $3))"
"              Filter: (user_id = 1811)"
"        ->  Index Scan using "PK_game_table" on game_tables
(cost=0.00..8.26 rows=1 width=12)"
"              Index Cond: (game_tables.table_id =
game_round_actions.table_id)"
"              Filter: (game_tables.game_type_id <> ANY ($4))"


Here's the table definition:

CREATE TABLE game_round_actions
(
   table_id bigint NOT NULL,
   round_id integer NOT NULL,
   action_id integer NOT NULL,
   seat_id integer NOT NULL,
   action_desc character varying(20) NOT NULL,
   action_area character varying(100),
   amount numeric(16,6),
   action_value character varying(100),
   action_time timestamp without time zone NOT NULL DEFAULT
CURRENT_TIMESTAMP AT TIME ZONE 'UTC',
   user_id bigint,
   sub_action_id integer NOT NULL, -- Sub action id is 0 for the root
actions. >0 for generated actions.

   CONSTRAINT "PK_game_round_actions" PRIMARY KEY (table_id, round_id,
action_id, sub_action_id),
   CONSTRAINT fk_game_round_actions_round FOREIGN KEY (table_id, round_id)
       REFERENCES game_rounds (table_id, round_id) MATCH SIMPLE
       ON UPDATE NO ACTION ON DELETE CASCADE,
   CONSTRAINT fk_game_round_actions_user FOREIGN KEY (table_id, user_id)
       REFERENCES game_table_users (table_id, user_id) MATCH SIMPLE
       ON UPDATE NO ACTION ON DELETE NO ACTION
)
WITH (
   OIDS=FALSE
);

CREATE INDEX i_session
   ON game_round_actions
   USING btree
   (action_time)
   WHERE user_id <> 0 AND sub_action_id = 0;


The table contains 1 655 528 000 rows (estimated) and there are about
10000 unique user_ids. The data spans about 2 years.
Shouldn't postgres realise that in both cases user_id is compared
against a constant value and chose the same plan?

How do I increase the stats target for the column?

--
Regards,

Nicos Panayides
IT Manager

Magneta Technologies Ltd
Tel: +357 22721919, 22317400
Fax: +357 22721917
Web: http://www.magneta.eu


pgsql-general by date:

Previous
From: John R Pierce
Date:
Subject: Re: Serial Vs Sequence
Next
From: "Yngve Nysaeter Pettersen"
Date:
Subject: Select for update with offset interferes with concurrent transactions