Thread: getting ERROR: stack depth limit exceeded on a WHERE IN query on a view

getting ERROR: stack depth limit exceeded on a WHERE IN query on a view

From
Aditya Rastogi
Date:
Hi ,

I am trying to run the following query on a view that is basically composed of 3 table joins . The query includes a WHERE IN clause which searches for the list of pairs specified in the WHERE IN expression inside one of the underlying tables of the query and calculates the count of the matches. The query is similar to the following query:

       select count(*) from gui_die_summary where (x_coord, y_coord) in ((25,5),(41,13),(25,7),(28,3),(25,8),(34,7),(26,6),(21,10)); ,

only that the list of pairs specified in the in clause is pretty large - around 5000-4000 pairs and that's when I get the stack depth limit exceed error. the columns x_coord and y_coord are contained in the table device_info which is one of the tables used in the join . The database contains an index on device_info(x_coord,y_coord) . The query plan for the above query(and similarly for the query with a larger number of pairs in the  WHERE IN expression) does use that index :

      explain analyze select count(*) from gui_die_summary where (x_coord, y_coord) in ((25,5),(41,13),(25,7),(28,3),(25,8),(34,7),(26,6),(21,10));

                                                                                                                                                                                                          QUERY PLAN                        
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=2517.41..2517.42 rows=1 width=0) (actual time=33.006..33.006 rows=1 loops=1)
   ->  Sort  (cost=2517.25..2517.28 rows=10 width=82) (actual time=32.996..32.999 rows=21 loops=1)
         Sort Key: (CASE WHEN (diagnose_runlog.scandie IS TRUE) THEN 'yes'::text ELSE 'no'::text END), diagnose_runlog.num_total_suspects, ((((diagnose_runlog.total_tfsf)::double precision * 100::double precision) / (CASE WHEN (diagnose_runlog.num_simulated = 0) THEN diagnose_runlog.total_tfsf ELSE diagnose_runlog.num_simulated END)::double precision))
         Sort Method:  quicksort  Memory: 27kB
         ->  Hash Join  (cost=2408.43..2517.09 rows=10 width=82) (actual time=24.919..32.929 rows=21 loops=1)
               Hash Cond: (COALESCE(failset_scenarios.device_id, partition_runlog.device_id) = device_info.device_id)
               ->  Merge Full Join  (cost=2346.37..2432.98 rows=5818 width=49) (actual time=23.880..31.658 rows=5788 loops=1)
                     Merge Cond: ((failset_scenarios.scenario_id = partition_runlog.scenario_id) AND (failset_scenarios.device_id = partition_runlog.device_id))
                     ->  Sort  (cost=593.99..608.53 rows=5818 width=8) (actual time=3.475..4.083 rows=5788 loops=1)
                           Sort Key: failset_scenarios.scenario_id, failset_scenarios.device_id
                           Sort Method:  quicksort  Memory: 464kB
                           ->  Seq Scan on failset_scenarios  (cost=0.00..230.18 rows=5818 width=8) (actual time=0.004..1.866 rows=5788 loops=1)
                     ->  Sort  (cost=1752.38..1766.70 rows=5729 width=49) (actual time=20.397..21.086 rows=5729 loops=1)
                           Sort Key: partition_runlog.scenario_id, partition_runlog.device_id
                           Sort Method:  quicksort  Memory: 998kB
                           ->  Hash Join  (cost=649.90..1394.77 rows=5729 width=49) (actual time=4.923..13.704 rows=5729 loops=1)
                                 Hash Cond: (diagnose_runlog.part_id = partition_runlog.part_id)
                                 ->  Seq Scan on diagnose_runlog  (cost=0.00..630.29 rows=5729 width=45) (actual time=0.003..1.633 rows=5729 loops=1)
                                 ->  Hash  (cost=578.29..578.29 rows=5729 width=12) (actual time=4.906..4.906 rows=5729 loops=1)
                                       ->  Seq Scan on partition_runlog  (cost=0.00..578.29 rows=5729 width=12) (actual time=0.002..2.659 rows=5729 loops=1)
               ->  Hash  (cost=61.94..61.94 rows=10 width=37) (actual time=0.108..0.108 rows=21 loops=1)
                     ->  Bitmap Heap Scan on device_info  (cost=34.12..61.94 rows=10 width=37) (actual time=0.063..0.089 rows=21 loops=1)
                           Recheck Cond: (((x_coord = 25) AND (y_coord = 5)) OR ((x_coord = 41) AND (y_coord = 13)) OR ((x_coord = 25) AND (y_coord = 7)) OR ((x_coord = 28) AND (y_coord = 3)) OR ((x_coord = 25) AND (y_coord = 8)) OR ((x_coord = 34) AND (y_coord = 7)) OR ((x_coord = 26) AND (y_coord = 6)) OR ((x_coord = 21) AND (y_coord = 10)))
                           ->  BitmapOr  (cost=34.12..34.12 rows=10 width=0) (actual time=0.053..0.053 rows=0 loops=1)
                                 ->  Bitmap Index Scan on x_y_idx  (cost=0.00..4.26 rows=1 width=0) (actual time=0.020..0.020 rows=3 loops=1)
                                       Index Cond: ((x_coord = 25) AND (y_coord = 5))
                                 ->  Bitmap Index Scan on x_y_idx  (cost=0.00..4.26 rows=1 width=0) (actual time=0.005..0.005 rows=1 loops=1)
                                       Index Cond: ((x_coord = 41) AND (y_coord = 13))
                                 ->  Bitmap Index Scan on x_y_idx  (cost=0.00..4.26 rows=1 width=0) (actual time=0.005..0.005 rows=3 loops=1)
                                       Index Cond: ((x_coord = 25) AND (y_coord = 7))
                                 ->  Bitmap Index Scan on x_y_idx  (cost=0.00..4.27 rows=2 width=0) (actual time=0.005..0.005 rows=6 loops=1)
                                       Index Cond: ((x_coord = 28) AND (y_coord = 3))
                                 ->  Bitmap Index Scan on x_y_idx  (cost=0.00..4.26 rows=1 width=0) (actual time=0.003..0.003 rows=1 loops=1)
                                       Index Cond: ((x_coord = 25) AND (y_coord = 8))
                                 ->  Bitmap Index Scan on x_y_idx  (cost=0.00..4.26 rows=1 width=0) (actual time=0.005..0.005 rows=1 loops=1)
                                       Index Cond: ((x_coord = 34) AND (y_coord = 7))
                                 ->  Bitmap Index Scan on x_y_idx  (cost=0.00..4.26 rows=1 width=0) (actual time=0.004..0.004 rows=2 loops=1)
                                       Index Cond: ((x_coord = 26) AND (y_coord = 6))
                                 ->  Bitmap Index Scan on x_y_idx  (cost=0.00..4.26 rows=1 width=0) (actual time=0.005..0.005 rows=4 loops=1)
                                       Index Cond: ((x_coord = 21) AND (y_coord = 10))
 Total runtime: 33.196 ms
(41 rows)


However, when i do just a select(*) on the same view , with the same WHERE IN expression (with a large number of pairs), I do not get the stack depth limit exceeded error. Please help me understand the origin of this error and how can I change my query/schema in order to make such queries against the database.

Thanks
Aditya


Re: getting ERROR: stack depth limit exceeded on a WHERE IN query on a view

From
Tom Lane
Date:
Aditya Rastogi <adirastogi@outlook.com> writes:
> The query is similar to the following query:
>        select count(*) from gui_die_summary where (x_coord, y_coord) in
((25,5),(41,13),(25,7),(28,3),(25,8),(34,7),(26,6),(21,10));, 
> only that the list of pairs specified in the in clause is pretty large - around 5000-4000 pairs and that's when I get
thestack depth limit exceed error. 

Even without the stack depth issue, that would perform pretty horridly for
so many pairs.

Do you know that the pairs are all distinct, so that you don't really need
the duplicate-elimination behavior of IN?  If so, you could recast this
like so:

select count(*) from
  gui_die_summary,
  (values (25,5),(41,13),(25,7),(28,3),(25,8),(34,7),(26,6),(21,10)) v(vx,vy)
where (x_coord, y_coord) = (vx, vy);

which should work better for large numbers of pairs.

            regards, tom lane


Re: getting ERROR: stack depth limit exceeded on a WHERE IN query on a view

From
Aditya Rastogi
Date:

Thanks Tom, I'll try rewriting the query, with the distinct list of pairs. But I am still curious to know two things and would really appreciate if you could give me some pointers to help me understand them:
     
1. How does the stack depth come into play while evaluating this query ? What part of the query makes recursive calls ?
2. What would be the recommended approach to make such queries ? I tried putting the pairs in a temporary table and then joined it against the view to get the relevant tuples, which seemed to work.

Thanks,
Aditya

> From: tgl@sss.pgh.pa.us
> To: adirastogi@outlook.com
> CC: pgsql-novice@postgresql.org
> Subject: Re: [NOVICE] getting ERROR: stack depth limit exceeded on a WHERE IN query on a view
> Date: Mon, 24 Feb 2014 11:40:06 -0500
>
> Aditya Rastogi <adirastogi@outlook.com> writes:
> > The query is similar to the following query:
> > select count(*) from gui_die_summary where (x_coord, y_coord) in ((25,5),(41,13),(25,7),(28,3),(25,8),(34,7),(26,6),(21,10)); ,
> > only that the list of pairs specified in the in clause is pretty large - around 5000-4000 pairs and that's when I get the stack depth limit exceed error.
>
> Even without the stack depth issue, that would perform pretty horridly for
> so many pairs.
>
> Do you know that the pairs are all distinct, so that you don't really need
> the duplicate-elimination behavior of IN? If so, you could recast this
> like so:
>
> select count(*) from
> gui_die_summary,
> (values (25,5),(41,13),(25,7),(28,3),(25,8),(34,7),(26,6),(21,10)) v(vx,vy)
> where (x_coord, y_coord) = (vx, vy);
>
> which should work better for large numbers of pairs.
>
> regards, tom lane
>
>
> --
> Sent via pgsql-novice mailing list (pgsql-novice@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-novice

Re: getting ERROR: stack depth limit exceeded on a WHERE IN query on a view

From
Tom Lane
Date:
Aditya Rastogi <adirastogi@outlook.com> writes:
> Thanks Tom, I'll try rewriting the query, with the distinct list of
pairs. But I am still curious to know two things and would really
appreciate if you could give me some pointers to help me understand them:
1. How does the stack depth come into play while evaluating this query ?

The IN clause is rewritten into

  ((((x_coord, y_coord) = (25,5) OR (x_coord, y_coord) = (...)) OR (x_coord, y_coord) = (...)) OR ...)

that is, you've got thousands of nested OR constructs, and what's failing
is parser processing of that nest.

We could possibly dodge the stack problem by flattening the output of
transformAExprIn to an N-way OR instead of a nest of binary ORs.
I've not experimented with that though.  In any case, it'd just move the
performance issue someplace else --- you'd still have a situation where
each of those row equality clauses is processed separately for parsing and
planning purposes.  That's intentional in case some of them are not like
the others, but in this example they are all pretty much equivalent so
you're just wasting cycles.

            regards, tom lane