[BUGS] BUG #14780: PostgreSQL 9.6 selects a wrong plan during aggregationagainst timestamp columns - Mailing list pgsql-bugs

From sogawa@yandex.ru
Subject [BUGS] BUG #14780: PostgreSQL 9.6 selects a wrong plan during aggregationagainst timestamp columns
Date
Msg-id 20170816014704.27360.35941@wrigleys.postgresql.org
Whole thread Raw
Responses [BUGS] Re: BUG #14780: PostgreSQL 9.6 selects a wrong plan duringaggregation against timestamp columns  (Greg Stark <stark@mit.edu>)
List pgsql-bugs
The following bug has been logged on the website:

Bug reference:      14780
Logged by:          sogawa-sps
Email address:      sogawa@yandex.ru
PostgreSQL version: 9.6.2
Operating system:   Windows 10
Description:

PostgreSQL 9.6 selects a wrong plan during aggregation against indexed
timestamp columns while it's ok for other types.

Given: table “log” that has three columns: user_id, day, hours.
   user_id character varying(36) COLLATE pg_catalog."default" NOT NULL,   day timestamp without time zone,   hours
doubleprecision
 

All columns have indexes.

The issue is that aggregation against the 'day' field works extremely slow
because makes a full scan filtering the entries that doesn’t relate to
user_id = 'ab056f5a-390b-41d7-ba56-897c14b679bf'
   select min(day) from log where user_id =
'ab056f5a-390b-41d7-ba56-897c14b679bf'
   [     {       "Execution Time": 146502.05,       "Planning Time": 0.893,       "Plan": {         "Startup Cost":
789.02,        "Actual Rows": 1,         "Plans": [           {             "Startup Cost": 0.44,             "Actual
Rows":1,             "Plans": [               {                 "Index Cond": "(log.day IS NOT NULL)",
"StartupCost": 0.44,                 "Scan Direction": "Forward",                 "Plan Width": 8,
"RowsRemoved by Index Recheck": 0,                 "Actual Rows": 1,                 "Node Type": "Index Scan",
       "Total Cost": 1395792.54,                 "Plan Rows": 1770,                 "Relation Name": "log",
   "Alias": "log",                 "Parallel Aware": false,                 "Actual Total Time": 146502.015,
    "Output": [                   "log.day"                 ],                 "Parent Relationship": "Outer",
      "Actual Startup Time": 146502.015,                 "Schema": "public",                 "Filter":
"((log.user_id)::text= 
'ab056f5a-390b-41d7-ba56-897c14b679bf'::text)",                 "Actual Loops": 1,                 "Rows Removed by
Filter":12665610,                 "Index Name": "index_log_day"               }             ],             "Node Type":
"Limit",            "Plan Rows": 1,             "Parallel Aware": false,             "Actual Total Time": 146502.016,
         "Output": [               "log.day"             ],             "Parent Relationship": "InitPlan",
"ActualStartup Time": 146502.016,             "Plan Width": 8,             "Subplan Name": "InitPlan 1 (returns $0)",
         "Actual Loops": 1,             "Total Cost": 789.02           }         ],         "Node Type": "Result",
  "Plan Rows": 1,         "Parallel Aware": false,         "Actual Total Time": 146502.019,         "Output": [
 "$0"         ],         "Actual Startup Time": 146502.019,         "Plan Width": 8,         "Actual Loops": 1,
"TotalCost": 789.03       },       "Triggers": []     }   ]
 

However the almost similar query but for the double type has a correct.
   select min(hours) from log where user_id =
'ab056f5a-390b-41d7-ba56-897c14b679bf'

Server selects entries for user_id = 'ab056f5a-390b-41d7-ba56-897c14b679bf'
first and then aggregates among them what is correct
   [     {       "Execution Time": 5.989,       "Planning Time": 1.186,       "Plan": {         "Partial Mode":
"Simple",        "Startup Cost": 6842.66,         "Actual Rows": 1,         "Plans": [           {             "Startup
Cost":66.28,             "Plan Width": 8,             "Rows Removed by Index Recheck": 0,             "Actual Rows":
745,            "Plans": [               {                 "Startup Cost": 0,                 "Plan Width": 0,
      "Actual Rows": 745,                 "Node Type": "Bitmap Index Scan",                 "Index Cond":
"((log.user_id)::text= 
'ab056f5a-390b-41d7-ba56-897c14b679bf'::text)",                 "Plan Rows": 1770,                 "Parallel Aware":
false,                "Actual Total Time": 0.25,                 "Parent Relationship": "Outer",
"ActualStartup Time": 0.25,                 "Total Cost": 65.84,                 "Actual Loops": 1,
"IndexName": "index_log_user_id"               }             ],             "Recheck Cond": "((log.user_id)::text = 
'ab056f5a-390b-41d7-ba56-897c14b679bf'::text)",             "Exact Heap Blocks": 742,             "Node Type": "Bitmap
HeapScan",             "Plan Rows": 1770,             "Relation Name": "log",             "Alias": "log",
"ParallelAware": false,             "Actual Total Time": 5.793,             "Output": [               "day",
  "hours",               "user_id"             ],             "Lossy Heap Blocks": 0,             "Parent
Relationship":"Outer",             "Actual Startup Time": 0.357,             "Total Cost": 6838.23,             "Actual
Loops":1,             "Schema": "public"           }         ],         "Node Type": "Aggregate",         "Strategy":
"Plain",        "Plan Rows": 1,         "Parallel Aware": false,         "Actual Total Time": 5.946,         "Output":
[          "min(hours)"         ],         "Actual Startup Time": 5.946,         "Plan Width": 8,         "Actual
Loops":1,         "Total Cost": 6842.67       },       "Triggers": []     }   ]
 

Optimizer have to select correct plan for the timestamp fields like it does
for double.

WA: Rewrite query into:
    select user_id, min(day) from log where user_id =
'ac43a155-4fbb-49eb-a670-02c307eb3d4f' group by user_id



--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

pgsql-bugs by date:

Previous
From: "Thomas S. Chin"
Date:
Subject: Re: [BUGS] BUG #14776: ecpg 4.12.0 issues with macros containing linecontinued blocks
Next
From: "Asif Shaikh"
Date:
Subject: [BUGS] Postgre driver(9.06.04.00) connected to Amazon redshift, is returning wrongcount in SQLRowCount