Thread: Curious problem of using BETWEEN with start and end being the same versus EQUALS '='

Curious problem of using BETWEEN with start and end being the same versus EQUALS '='

From
Venky Kandaswamy
Date:
All,
   On 9.1, I am running into a curious issue. I will explain the issue in high level terms using psuedo SQL statements. Consider a SQL statement:

SELECT a, b, c FROM tab WHERE a = value1;

- This does an index scan followed by a merge join and takes about 37 secs to execute

If I change the query to:

SELECT a, b, c FROM tab WHERE a BETWEEN value1 AND value1; -- The start and end are the same

- This query does a index scan and nested loop join and takes forever to run (as a matter of fact, it has never finished even after 8+ hours!!!!)

My actual query:

First Run with BETWEEN clause range of 4 days (Execution time = about 5 mins)

  select a.date_id Date,
    a.page_group page_group,
    a.page page,
    ...  --{snipped for brevity}
 from bi2003.alps_agg a left outer join bi2003.event_agg b
    on (a.date_id = b.date_id and upper(a.adc_visit) = upper(bicommon.get_value('adc_visit', b.vcset)))
 WHERE a.date_id between 20120228 and 20120302
   and b.date_id between 20120228 and 20120302
   and a.page = 'ddi_671'
 group by 1,2,3,4,5,6,7,8,9;

Explain Analyze:

 "GroupAggregate  (cost=1214313.29..1923536.69 rows=274096 width=3040) (actual time=275946.288..312348.732 rows=861 loops=1)"
"  Output: a.date_id, a.page_group, a.page, a.int_alloc_type, (bicommon.get_value('browserfamily'::text, b.vcset)), (bicommon.get_value('trafficsource'::text, b.vcset)), (bicommon.get_value('e671_hl_p1'::text, a.componentset)), (bicommon.get_value('e671_img_p1'::text, a.componentset)), (bicommon.get_value('e671_formLabels'::text, a.componentset)), count(a.adc_visit), sum((bicommon.get_value('revenue'::text, b.eventvalueset))::numeric(9,0)), sum((bicommon.get_value('Impression'::text, b.eventcountset))::numeric(9,0)), sum((bicommon.get_value('page1submit'::text, b.eventcountset))::numeric(9,0)), sum((bicommon.get_value('conversion'::text, b.eventcountset))::numeric(9,0)), sum((bicommon.get_value('lead'::text, b.eventcountset))::numeric(9,0))"
"  ->  Sort  (cost=1214313.29..1214998.53 rows=274096 width=3040) (actual time=275785.085..293169.601 rows=147480 loops=1)"
"        Output: a.date_id, a.page_group, a.page, a.int_alloc_type, b.vcset, a.componentset, a.adc_visit, b.eventvalueset, b.eventcountset, (bicommon.get_value('browserfamily'::text, b.vcset)), (bicommon.get_value('trafficsource'::text, b.vcset)), (bicommon.get_value('e671_hl_p1'::text, a.componentset)), (bicommon.get_value('e671_img_p1'::text, a.componentset)), (bicommon.get_value('e671_formLabels'::text, a.componentset))"
"        Sort Key: a.date_id, a.page_group, a.page, a.int_alloc_type, (bicommon.get_value('browserfamily'::text, b.vcset)), (bicommon.get_value('trafficsource'::text, b.vcset)), (bicommon.get_value('e671_hl_p1'::text, a.componentset)), (bicommon.get_value('e671_img_p1'::text, a.componentset)), (bicommon.get_value('e671_formLabels'::text, a.componentset))"
"        Sort Method: external merge  Disk: 447168kB"
"        ->  Merge Join  (cost=380263.82..830740.00 rows=274096 width=3040) (actual time=81438.225..257963.708 rows=147480 loops=1)"
"              Output: a.date_id, a.page_group, a.page, a.int_alloc_type, b.vcset, a.componentset, a.adc_visit, b.eventvalueset, b.eventcountset, bicommon.get_value('browserfamily'::text, b.vcset), bicommon.get_value('trafficsource'::text, b.vcset), bicommon.get_value('e671_hl_p1'::text, a.componentset), bicommon.get_value('e671_img_p1'::text, a.componentset), bicommon.get_value('e671_formLabels'::text, a.componentset)"
"              Merge Cond: (((upper((a.adc_visit)::text)) = (upper(bicommon.get_value('adc_visit'::text, b.vcset)))) AND (a.date_id = b.date_id))"
"              ->  Sort  (cost=169041.20..169352.99 rows=124717 width=1350) (actual time=15181.446..25806.356 rows=147480 loops=1)"
"                    Output: a.date_id, a.page_group, a.page, a.int_alloc_type, a.componentset, a.adc_visit, (upper((a.adc_visit)::text))"
"                    Sort Key: (upper((a.adc_visit)::text)), a.date_id"
"                    Sort Method: external merge  Disk: 205824kB"
"                    ->  Index Scan using alps_agg_date_id on bi2003.alps_agg a  (cost=0.00..85163.47 rows=124717 width=1350) (actual time=28.843..11369.048 rows=147480 loops=1)"
"                          Output: a.date_id, a.page_group, a.page, a.int_alloc_type, a.componentset, a.adc_visit, upper((a.adc_visit)::text)"
"                          Index Cond: ((a.date_id >= 20120228) AND (a.date_id <= 20120302))"
"                          Filter: ((a.page)::text = 'ddi_671'::text)"
"              ->  Materialize  (cost=211222.61..212076.23 rows=170723 width=1694) (actual time=66254.779..80862.998 rows=187870 loops=1)"
"                    Output: b.vcset, b.eventvalueset, b.eventcountset, b.date_id, (upper(bicommon.get_value('adc_visit'::text, b.vcset)))"
"                    ->  Sort  (cost=211222.61..211649.42 rows=170723 width=1694) (actual time=66254.773..80680.870 rows=187870 loops=1)"
"                          Output: b.vcset, b.eventvalueset, b.eventcountset, b.date_id, (upper(bicommon.get_value('adc_visit'::text, b.vcset)))"
"                          Sort Key: (upper(bicommon.get_value('adc_visit'::text, b.vcset))), b.date_id"
"                          Sort Method: external merge  Disk: 312864kB"
"                          ->  Index Scan using event_agg_date_id on bi2003.event_agg b  (cost=0.00..70924.68 rows=170723 width=1694) (actual time=54.028..61463.318 rows=187870 loops=1)"
"                                Output: b.vcset, b.eventvalueset, b.eventcountset, b.date_id, upper(bicommon.get_value('adc_visit'::text, b.vcset))"
"                                Index Cond: ((b.date_id >= 20120228) AND (b.date_id <= 20120302))"
"Total runtime: 312461.516 ms"

Second Run with EQUALS clause of only one day (Execution time is 37 secs):

  select a.date_id Date,
    a.page_group page_group,
    a.page page,
    ... --{snipped for brevity}
 from bi2003.alps_agg a left outer join bi2003.event_agg b
    on (a.date_id = b.date_id and upper(a.adc_visit) = upper(bicommon.get_value('adc_visit', b.vcset)))
 WHERE a.date_id = 20120228      ----------------- CHANGE TO STATEMENT IS HERE ------------------
   and a.page = 'ddi_671'
 group by 1,2,3,4,5,6,7,8,9;

"GroupAggregate  (cost=18528595.55..29763160.89 rows=4341861 width=3040) (actual time=33762.882..37750.500 rows=208 loops=1)"
"  Output: a.date_id, a.page_group, a.page, a.int_alloc_type, (bicommon.get_value('browserfamily'::text, b.vcset)), (bicommon.get_value('trafficsource'::text, b.vcset)), (bicommon.get_value('e671_hl_p1'::text, a.componentset)), (bicommon.get_value('e671_img_p1'::text, a.componentset)), (bicommon.get_value('e671_formLabels'::text, a.componentset)), count(a.adc_visit), sum((bicommon.get_value('revenue'::text, b.eventvalueset))::numeric(9,0)), sum((bicommon.get_value('Impression'::text, b.eventcountset))::numeric(9,0)), sum((bicommon.get_value('page1submit'::text, b.eventcountset))::numeric(9,0)), sum((bicommon.get_value('conversion'::text, b.eventcountset))::numeric(9,0)), sum((bicommon.get_value('lead'::text, b.eventcountset))::numeric(9,0))"
"  ->  Sort  (cost=18528595.55..18539450.20 rows=4341861 width=3040) (actual time=33681.585..34796.148 rows=36132 loops=1)"
"        Output: a.date_id, a.page_group, a.page, a.int_alloc_type, b.vcset, a.componentset, a.adc_visit, b.eventvalueset, b.eventcountset, (bicommon.get_value('browserfamily'::text, b.vcset)), (bicommon.get_value('trafficsource'::text, b.vcset)), (bicommon.get_value('e671_hl_p1'::text, a.componentset)), (bicommon.get_value('e671_img_p1'::text, a.componentset)), (bicommon.get_value('e671_formLabels'::text, a.componentset))"
"        Sort Key: a.date_id, a.page_group, a.page, a.int_alloc_type, (bicommon.get_value('browserfamily'::text, b.vcset)), (bicommon.get_value('trafficsource'::text, b.vcset)), (bicommon.get_value('e671_hl_p1'::text, a.componentset)), (bicommon.get_value('e671_img_p1'::text, a.componentset)), (bicommon.get_value('e671_formLabels'::text, a.componentset))"
"        Sort Method: external merge  Disk: 108232kB"
"        ->  Merge Join  (cost=75654.72..6682201.93 rows=4341861 width=3040) (actual time=5486.181..30565.791 rows=36132 loops=1)"
"              Output: a.date_id, a.page_group, a.page, a.int_alloc_type, b.vcset, a.componentset, a.adc_visit, b.eventvalueset, b.eventcountset, bicommon.get_value('browserfamily'::text, b.vcset), bicommon.get_value('trafficsource'::text, b.vcset), bicommon.get_value('e671_hl_p1'::text, a.componentset), bicommon.get_value('e671_img_p1'::text, a.componentset), bicommon.get_value('e671_formLabels'::text, a.componentset)"
"              Merge Cond: ((upper((a.adc_visit)::text)) = (upper(bicommon.get_value('adc_visit'::text, b.vcset))))"
"              ->  Sort  (cost=35259.64..35325.37 rows=26292 width=1350) (actual time=878.497..985.331 rows=36132 loops=1)"
"                    Output: a.date_id, a.page_group, a.page, a.int_alloc_type, a.componentset, a.adc_visit, (upper((a.adc_visit)::text))"
"                    Sort Key: (upper((a.adc_visit)::text))"
"                    Sort Method: external merge  Disk: 49480kB"
"                    ->  Index Scan using alps_agg_date_id on bi2003.alps_agg a  (cost=0.00..17870.00 rows=26292 width=1350) (actual time=0.047..142.383 rows=36132 loops=1)"
"                          Output: a.date_id, a.page_group, a.page, a.int_alloc_type, a.componentset, a.adc_visit, upper((a.adc_visit)::text)"
"                          Index Cond: (a.date_id = 20120228)"
"                          Filter: ((a.page)::text = 'ddi_671'::text)"
"              ->  Materialize  (cost=40395.08..40560.22 rows=33028 width=1694) (actual time=4606.461..4809.163 rows=45428 loops=1)"
"                    Output: b.vcset, b.eventvalueset, b.eventcountset, b.date_id, (upper(bicommon.get_value('adc_visit'::text, b.vcset)))"
"                    ->  Sort  (cost=40395.08..40477.65 rows=33028 width=1694) (actual time=4606.455..4778.491 rows=45428 loops=1)"
"                          Output: b.vcset, b.eventvalueset, b.eventcountset, b.date_id, (upper(bicommon.get_value('adc_visit'::text, b.vcset)))"
"                          Sort Key: (upper(bicommon.get_value('adc_visit'::text, b.vcset)))"
"                          Sort Method: external merge  Disk: 75440kB"
"                          ->  Index Scan using event_agg_date_id on bi2003.event_agg b  (cost=0.00..13643.60 rows=33028 width=1694) (actual time=0.201..3865.567 rows=45428 loops=1)"
"                                Output: b.vcset, b.eventvalueset, b.eventcountset, b.date_id, upper(bicommon.get_value('adc_visit'::text, b.vcset))"
"                                Index Cond: (b.date_id = 20120228)"
"Total runtime: 37797.536 ms"

Now, The problem run:
If I change the above query to use one day in a BETWEEN clause where start and end are the same, the query never completes even after 6+ hours.

  select a.date_id Date,
    a.page_group page_group,
    a.page page,
    a.int_alloc_type int_alloc_type,
    ... --{snipped for brevity}
 from bi2003.alps_agg a left outer join bi2003.event_agg b
    on (a.date_id = b.date_id and upper(a.adc_visit) = upper(bicommon.get_value('adc_visit', b.vcset)))
 WHERE a.date_id BETWEEN 20120228 AND 20120228     ----------------- CHANGE TO STATEMENT IS HERE ------------------
   and a.page = 'ddi_671'
 group by 1,2,3,4,5,6,7,8,9;

"HashAggregate  (cost=23.23..24.49 rows=1 width=3040)"
"  Output: a.date_id, a.page_group, a.page, a.int_alloc_type, (bicommon.get_value('browserfamily'::text, b.vcset)), (bicommon.get_value('trafficsource'::text, b.vcset)), (bicommon.get_value('e671_hl_p1'::text, a.componentset)), (bicommon.get_value('e671_img_p1'::text, a.componentset)), (bicommon.get_value('e671_formLabels'::text, a.componentset)), count(a.adc_visit), sum((bicommon.get_value('revenue'::text, b.eventvalueset))::numeric(9,0)), sum((bicommon.get_value('Impression'::text, b.eventcountset))::numeric(9,0)), sum((bicommon.get_value('page1submit'::text, b.eventcountset))::numeric(9,0)), sum((bicommon.get_value('conversion'::text, b.eventcountset))::numeric(9,0)), sum((bicommon.get_value('lead'::text, b.eventcountset))::numeric(9,0))"
"  ->  Nested Loop  (cost=0.00..21.91 rows=1 width=3040)"
"        Output: a.date_id, a.page_group, a.page, a.int_alloc_type, b.vcset, a.componentset, a.adc_visit, b.eventvalueset, b.eventcountset, bicommon.get_value('browserfamily'::text, b.vcset), bicommon.get_value('trafficsource'::text, b.vcset), bicommon.get_value('e671_hl_p1'::text, a.componentset), bicommon.get_value('e671_img_p1'::text, a.componentset), bicommon.get_value('e671_formLabels'::text, a.componentset)"
"        Join Filter: ((a.date_id = b.date_id) AND (upper((a.adc_visit)::text) = upper(bicommon.get_value('adc_visit'::text, b.vcset))))"
"        ->  Index Scan using alps_agg_date_id on bi2003.alps_agg a  (cost=0.00..10.12 rows=1 width=1350)"
"              Output: a.date_id, a.adc_visit, a.page_group, a.page, a.int_alloc_type, a.componentset, a.variation_tagset, a.page_instance"
"              Index Cond: ((a.date_id >= 20120228) AND (a.date_id <= 20120228))"
"              Filter: ((a.page)::text = 'ddi_671'::text)"
"        ->  Index Scan using event_agg_date_id on bi2003.event_agg b  (cost=0.00..10.27 rows=1 width=1694)"
"              Output: b.date_id, b.vcset, b.eventcountset, b.eventvalueset"
"              Index Cond: ((b.date_id >= 20120228) AND (b.date_id <= 20120228))"

Has anyone experienced a similar issue? We have checked the tables and indexes for consistency, VACUUM'ed, ANALYZEd and REINDEX'ed them many times. But the between clause with start and end the same runs for hours and never completes. But if we use a range where start and end is not the same, it comes back in a few minutes. If we use a single day in an '=' clause, it also comes back in 30+ secs.

What could be the problem here? Does anyone have any insights?

________________________________________

Venky Kandaswamy

Principal Engineer, Adchemy Inc.

925-200-7124

Venky Kandaswamy <venky@adchemy.com> writes:
>    On 9.1, I am running into a curious issue.

It's not very curious at all, or at least people on pgsql-performance
(the right list for this sort of question) would have figured it out
quickly.  You're getting a crummy plan because of a crummy row estimate.
When you do this:

>  WHERE a.date_id = 20120228

you get this:

> "                    ->  Index Scan using alps_agg_date_id on bi2003.alps_agg a  (cost=0.00..17870.00 rows=26292
width=1350)(actual time=0.047..142.383 rows=36132 loops=1)" 
> "                          Output: a.date_id, a.page_group, a.page, a.int_alloc_type, a.componentset, a.adc_visit,
upper((a.adc_visit)::text)"
> "                          Index Cond: (a.date_id = 20120228)"
> "                          Filter: ((a.page)::text = 'ddi_671'::text)"

26K estimated rows versus 36K actual isn't the greatest estimate in the
world, but it's plenty good enough.  But when you do this:

>  WHERE a.date_id BETWEEN 20120228 AND 20120228

you get this:

> "        ->  Index Scan using alps_agg_date_id on bi2003.alps_agg a  (cost=0.00..10.12 rows=1 width=1350)"
> "              Output: a.date_id, a.adc_visit, a.page_group, a.page, a.int_alloc_type, a.componentset,
a.variation_tagset,a.page_instance" 
> "              Index Cond: ((a.date_id >= 20120228) AND (a.date_id <= 20120228))"
> "              Filter: ((a.page)::text = 'ddi_671'::text)"

so the bogus estimate of only one row causes the planner to pick an
entirely different plan, which would probably be a great choice if there
were indeed only one such row, but with 36000 of them it's horrid.

The reason the row estimate is so crummy is that a zero-width interval
is an edge case for range estimates.  We've seen this before, although
usually it's not quite this bad.

There's been some talk of making the estimate for "x >= a AND x <= b"
always be at least as much as the estimate for "x = a", but this would
increase the cost of making the estimate by quite a bit, and make things
actually worse in some cases (in particular, if a > b then a nil
estimate is indeed the right thing).

You might look into whether queries formed like "date_id >= 20120228 AND
date_id < 20120229" give you more robust estimates at the edge cases.

BTW, I notice in your EXPLAIN results that the same range restriction
has been propagated to b.date_id:

> "        ->  Index Scan using event_agg_date_id on bi2003.event_agg b  (cost=0.00..10.27 rows=1 width=1694)"
> "              Output: b.date_id, b.vcset, b.eventcountset, b.eventvalueset"
> "              Index Cond: ((b.date_id >= 20120228) AND (b.date_id <= 20120228))"

I'd expect that to happen automatically for a simple equality
constraint, but not for a range constraint.  Did you do that manually
and not tell us about it?

            regards, tom lane


Thanks for the quick and detailed response, Tom.

Yes, I did add a redundant where clause with a restriction on b.date_id on the range queries. This appears to speed
thingsup since it does an index scan on the b table before the merge join.  

We will get more intelligent on query generation (our system generates queries on the fly) to work around this problem.

________________________________________

Venky Kandaswamy

Principal Engineer, Adchemy Inc.

925-200-7124

________________________________________
From: Tom Lane [tgl@sss.pgh.pa.us]
Sent: Tuesday, January 15, 2013 2:30 PM
To: Venky Kandaswamy
Cc: pgsql-general@postgresql.org; pgsql-sql@postgresql.org
Subject: Re: [SQL] Curious problem of using BETWEEN with start and end being the same versus EQUALS '='

Venky Kandaswamy <venky@adchemy.com> writes:
>    On 9.1, I am running into a curious issue.

It's not very curious at all, or at least people on pgsql-performance
(the right list for this sort of question) would have figured it out
quickly.  You're getting a crummy plan because of a crummy row estimate.
When you do this:

>  WHERE a.date_id = 20120228

you get this:

> "                    ->  Index Scan using alps_agg_date_id on bi2003.alps_agg a  (cost=0.00..17870.00 rows=26292
width=1350)(actual time=0.047..142.383 rows=36132 loops=1)" 
> "                          Output: a.date_id, a.page_group, a.page, a.int_alloc_type, a.componentset, a.adc_visit,
upper((a.adc_visit)::text)"
> "                          Index Cond: (a.date_id = 20120228)"
> "                          Filter: ((a.page)::text = 'ddi_671'::text)"

26K estimated rows versus 36K actual isn't the greatest estimate in the
world, but it's plenty good enough.  But when you do this:

>  WHERE a.date_id BETWEEN 20120228 AND 20120228

you get this:

> "        ->  Index Scan using alps_agg_date_id on bi2003.alps_agg a  (cost=0.00..10.12 rows=1 width=1350)"
> "              Output: a.date_id, a.adc_visit, a.page_group, a.page, a.int_alloc_type, a.componentset,
a.variation_tagset,a.page_instance" 
> "              Index Cond: ((a.date_id >= 20120228) AND (a.date_id <= 20120228))"
> "              Filter: ((a.page)::text = 'ddi_671'::text)"

so the bogus estimate of only one row causes the planner to pick an
entirely different plan, which would probably be a great choice if there
were indeed only one such row, but with 36000 of them it's horrid.

The reason the row estimate is so crummy is that a zero-width interval
is an edge case for range estimates.  We've seen this before, although
usually it's not quite this bad.

There's been some talk of making the estimate for "x >= a AND x <= b"
always be at least as much as the estimate for "x = a", but this would
increase the cost of making the estimate by quite a bit, and make things
actually worse in some cases (in particular, if a > b then a nil
estimate is indeed the right thing).

You might look into whether queries formed like "date_id >= 20120228 AND
date_id < 20120229" give you more robust estimates at the edge cases.

BTW, I notice in your EXPLAIN results that the same range restriction
has been propagated to b.date_id:

> "        ->  Index Scan using event_agg_date_id on bi2003.event_agg b  (cost=0.00..10.27 rows=1 width=1694)"
> "              Output: b.date_id, b.vcset, b.eventcountset, b.eventvalueset"
> "              Index Cond: ((b.date_id >= 20120228) AND (b.date_id <= 20120228))"

I'd expect that to happen automatically for a simple equality
constraint, but not for a range constraint.  Did you do that manually
and not tell us about it?

                        regards, tom lane