Thread: FW: was [PERFORM] partitioned table and ORDER BY indexed_field DESC LIMIT 1

FW: was [PERFORM] partitioned table and ORDER BY indexed_field DESC LIMIT 1

From
"Luke Lonergan"
Date:
Below is a patch against Greenplum Database that fixes the problem.

- Luke

------ Forwarded Message
From: Luke Lonergan <llonergan@greenplum.com>
Date: Fri, 24 Aug 2007 09:25:53 -0700
To: Heikki Linnakangas <heikki@enterprisedb.com>, Anton <anton200@gmail.com>
Cc: <pgsql-performance@postgresql.org>
Conversation: [PERFORM] partitioned table and ORDER BY indexed_field DESC
LIMIT 1
Subject: Re: [PERFORM] partitioned table and ORDER BY indexed_field DESC
LIMIT 1

Below is a patch against 8.2.4 (more or less), Heikki can you take a look at
it?

This enables the use of index scan of a child table by recognizing sort
order of the append node.  Kurt Harriman did the work.

- Luke

Index: cdb-pg/src/backend/optimizer/path/indxpath.c
===================================================================
RCS file: 
/data/FISHEYE_REPOSITORIES/greenplum/cvsroot/cdb2/cdb-pg/src/backend/optimiz
er/path/indxpath.c,v
diff -u -N -r1.22 -r1.22.2.1
--- cdb-pg/src/backend/optimizer/path/indxpath.c    25 Apr 2007 22:07:21
-0000    1.22
+++ cdb-pg/src/backend/optimizer/path/indxpath.c    10 Aug 2007 03:41:15
-0000    1.22.2.1
@@ -379,8 +379,51 @@            index_pathkeys = build_index_pathkeys(root, index,
           ForwardScanDirection,                                                  true);
 
-            useful_pathkeys = truncate_useless_pathkeys(root, rel,
-                                                        index_pathkeys);
+            /*
+             * CDB: For appendrel child, pathkeys contain Var nodes in
terms 
+             * of the child's baserel.  Transform the pathkey list to refer
to 
+             * columns of the appendrel.
+             */
+            if (rel->reloptkind == RELOPT_OTHER_MEMBER_REL)
+            {
+                AppendRelInfo  *appinfo = NULL;
+                RelOptInfo     *appendrel = NULL;
+                ListCell       *appcell;
+                CdbPathLocus    notalocus;
+
+                /* Find the appendrel of which this baserel is a child. */
+                foreach(appcell, root->append_rel_list)
+                {
+                    appinfo = (AppendRelInfo *)lfirst(appcell);
+                    if (appinfo->child_relid == rel->relid)
+                        break;
+                }
+                Assert(appinfo);
+                appendrel = find_base_rel(root, appinfo->parent_relid);
+
+                /*
+                 * The pathkey list happens to have the same format as the
+                 * partitioning key of a Hashed locus, so by disguising it
+                 * we can use cdbpathlocus_pull_above_projection() to do
the 
+                 * transformation.
+                 */
+                CdbPathLocus_MakeHashed(¬alocus, index_pathkeys);
+                notalocus =
+                    cdbpathlocus_pull_above_projection(root,
+                                                       notalocus,
+                                                       rel->relids,
+                                                       rel->reltargetlist,
+                  
appendrel->reltargetlist,
+                                                       appendrel->relid);
+                if (CdbPathLocus_IsHashed(notalocus))
+                    index_pathkeys = truncate_useless_pathkeys(root,
appendrel,
+                  
notalocus.partkey);
+                else
+                    index_pathkeys = NULL;
+            }
+
+            useful_pathkeys = truncate_useless_pathkeys(root, rel,
+                                                        index_pathkeys);        }        else
useful_pathkeys= NIL;
 
Index: cdb-pg/src/backend/optimizer/path/pathkeys.c
===================================================================
RCS file: 
/data/FISHEYE_REPOSITORIES/greenplum/cvsroot/cdb2/cdb-pg/src/backend/optimiz
er/path/pathkeys.c,v
diff -u -N -r1.18 -r1.18.2.1
--- cdb-pg/src/backend/optimizer/path/pathkeys.c    30 Apr 2007 05:44:07
-0000    1.18
+++ cdb-pg/src/backend/optimizer/path/pathkeys.c    10 Aug 2007 03:41:15
-0000    1.18.2.1
@@ -1403,55 +1403,53 @@{    PathKeyItem    *item;    Expr           *newexpr;
+    AttrNumber  targetindex;    Assert(pathkey);
-    /* Use constant expr if available.  Will be at head of list. */
-    if (CdbPathkeyEqualsConstant(pathkey))
+    /* Find an expr that we can rewrite to use the projected columns. */
+    item = cdbpullup_findPathKeyItemInTargetList(pathkey,
+                                                 relids,
+                                                 targetlist,
+                                                 &targetindex); // OUT
+    
+    /* If not found, see if the equiv class contains a constant expr. */
+    if (!item &&
+        CdbPathkeyEqualsConstant(pathkey))    {        item = (PathKeyItem *)linitial(pathkey);        newexpr = (Expr
*)copyObject(item->key);   }
 
-    /* New vars for old! */
-    else
-    {
-        AttrNumber  targetindex;
+    /* Fail if no usable expr. */
+    else if (!item)
+        return NULL;
-        /* Find an expr that we can rewrite to use the projected columns.
*/
-        item = cdbpullup_findPathKeyItemInTargetList(pathkey,
-                                                     relids,
-                                                     targetlist,
-                                                     &targetindex); // OUT
-        if (!item)
-            return NULL;
+    /* If found matching targetlist item, make a Var that references it. */
+    else if (targetindex > 0)
+        newexpr = (Expr *)cdbpullup_makeVar(newrelid,
+                                            targetindex,
+                                            newvarlist,
+                                            (Expr *)item->key);
-        /* If found matching targetlist item, make a Var that references
it. */
-        if (targetindex > 0)
-            newexpr = (Expr *)cdbpullup_makeVar(newrelid,
-                                                targetindex,
-                                                newvarlist,
-                                                (Expr *)item->key);
+    /* Replace expr's Var nodes with new ones referencing the targetlist.
*/
+    else
+        newexpr = cdbpullup_expr((Expr *)item->key,
+                                 targetlist,
+                                 newvarlist,
+                                 newrelid);
-        /* Replace expr's Var nodes with new ones referencing the
targetlist. */
-        else
-            newexpr = cdbpullup_expr((Expr *)item->key,
-                                     targetlist,
-                                     newvarlist,
-                                     newrelid);
+    /* Pull up RelabelType node too, unless tlist expr has right type. */
+    if (IsA(item->key, RelabelType))
+    {
+        RelabelType    *oldrelabel = (RelabelType *)item->key;
-        /* Pull up RelabelType node too, unless tlist expr has right type.
*/
-        if (IsA(item->key, RelabelType))
-        {
-            RelabelType    *oldrelabel = (RelabelType *)item->key;
-
-            if (oldrelabel->resulttype != exprType((Node *)newexpr) ||
-                oldrelabel->resulttypmod != exprTypmod((Node *)newexpr))
-                newexpr = (Expr *)makeRelabelType(newexpr,
-                                                  oldrelabel->resulttype,
-                                                  oldrelabel->resulttypmod,
-                  
oldrelabel->relabelformat);
-        }
+        if (oldrelabel->resulttype != exprType((Node *)newexpr) ||
+            oldrelabel->resulttypmod != exprTypmod((Node *)newexpr))
+            newexpr = (Expr *)makeRelabelType(newexpr,
+                                              oldrelabel->resulttype,
+                                              oldrelabel->resulttypmod,
+                                              oldrelabel->relabelformat);    }    Insist(newexpr);
Index: cdb-pg/src/backend/optimizer/util/pathnode.c
===================================================================
RCS file: 
/data/FISHEYE_REPOSITORIES/greenplum/cvsroot/cdb2/cdb-pg/src/backend/optimiz
er/util/pathnode.c,v
diff -u -N -r1.52.2.4 -r1.52.2.5
--- cdb-pg/src/backend/optimizer/util/pathnode.c    5 Aug 2007 23:06:44
-0000    1.52.2.4
+++ cdb-pg/src/backend/optimizer/util/pathnode.c    10 Aug 2007 03:41:15
-0000    1.52.2.5
@@ -1563,7 +1563,15 @@            pathnode->path.rescannable = false;    }
-    return pathnode;
+    /* 
+     * CDB: If there is exactly one subpath, its ordering is preserved.
+     * Child rel's pathkey exprs are already expressed in terms of the
+     * columns of the parent appendrel.  See find_usable_indexes().
+     */
+    if (list_length(subpaths) == 1)
+        pathnode->path.pathkeys = ((Path *)linitial(subpaths))->pathkeys;
+    
+    return pathnode;}/*


On 8/24/07 3:38 AM, "Heikki Linnakangas" <heikki@enterprisedb.com> wrote:

> Anton wrote:
>>>> =# explain SELECT * FROM n_traf ORDER BY date_time DESC LIMIT 1;
>>>>                                                QUERY PLAN
>>> ----------------------------------------------------------------------------
>>> -----------------------------
>>>> Limit  (cost=824637.69..824637.69 rows=1 width=32)
>>>>    ->  Sort  (cost=824637.69..838746.44 rows=5643499 width=32)
>>>>          Sort Key: public.n_traf.date_time
>>>>          ->  Result  (cost=0.00..100877.99 rows=5643499 width=32)
>>>>                ->  Append  (cost= 0.00..100877.99 rows=5643499 width=32)
>>>>                      ->  Seq Scan on n_traf  (cost=0.00..22.30
>>>> rows=1230 width=32)
>>>>                      ->  Seq Scan on n_traf_y2007m01 n_traf
>>>> (cost=0.00..22.30 rows=1230 width=32)
>> ...
>>>>                      ->  Seq Scan on n_traf_y2007m12 n_traf
>>>> (cost=0.00..22.30 rows=1230 width=32)
>>>> (18 rows)
>>>> 
>>>> Why it no uses indexes at all?
>>>> -------------------------------------------
>>> I'm no expert but I'd guess that the the planner doesn't know which
>>> partition holds the latest time so it has to read them all.
>> 
>> Agree. But why it not uses indexes when it reading them?
> 
> The planner isn't smart enough to push the "ORDER BY ... LIMIT ..."
> below the append node. Therefore it needs to fetch all rows from all the
> tables, and the fastest way to do that is a seq scan.



---------------------------(end of broadcast)---------------------------
TIP 7: You can help support the PostgreSQL project by donating at
               http://www.postgresql.org/about/donate

------ End of Forwarded Message




Re: FW: was [PERFORM] partitioned table and ORDER BY indexed_field DESC LIMIT 1

From
Bruce Momjian
Date:
This has been saved for the 8.4 release:
http://momjian.postgresql.org/cgi-bin/pgpatches_hold

---------------------------------------------------------------------------

Luke Lonergan wrote:
> Below is a patch against Greenplum Database that fixes the problem.
> 
> - Luke
> 
> ------ Forwarded Message
> From: Luke Lonergan <llonergan@greenplum.com>
> Date: Fri, 24 Aug 2007 09:25:53 -0700
> To: Heikki Linnakangas <heikki@enterprisedb.com>, Anton <anton200@gmail.com>
> Cc: <pgsql-performance@postgresql.org>
> Conversation: [PERFORM] partitioned table and ORDER BY indexed_field DESC
> LIMIT 1
> Subject: Re: [PERFORM] partitioned table and ORDER BY indexed_field DESC
> LIMIT 1
> 
> Below is a patch against 8.2.4 (more or less), Heikki can you take a look at
> it?
> 
> This enables the use of index scan of a child table by recognizing sort
> order of the append node.  Kurt Harriman did the work.
> 
> - Luke
> 
> Index: cdb-pg/src/backend/optimizer/path/indxpath.c
> ===================================================================
> RCS file: 
> /data/FISHEYE_REPOSITORIES/greenplum/cvsroot/cdb2/cdb-pg/src/backend/optimiz
> er/path/indxpath.c,v
> diff -u -N -r1.22 -r1.22.2.1
> --- cdb-pg/src/backend/optimizer/path/indxpath.c    25 Apr 2007 22:07:21
> -0000    1.22
> +++ cdb-pg/src/backend/optimizer/path/indxpath.c    10 Aug 2007 03:41:15
> -0000    1.22.2.1
> @@ -379,8 +379,51 @@
>              index_pathkeys = build_index_pathkeys(root, index,
>                                                    ForwardScanDirection,
>                                                    true);
> -            useful_pathkeys = truncate_useless_pathkeys(root, rel,
> -                                                        index_pathkeys);
> +            /*
> +             * CDB: For appendrel child, pathkeys contain Var nodes in
> terms 
> +             * of the child's baserel.  Transform the pathkey list to refer
> to 
> +             * columns of the appendrel.
> +             */
> +            if (rel->reloptkind == RELOPT_OTHER_MEMBER_REL)
> +            {
> +                AppendRelInfo  *appinfo = NULL;
> +                RelOptInfo     *appendrel = NULL;
> +                ListCell       *appcell;
> +                CdbPathLocus    notalocus;
> +
> +                /* Find the appendrel of which this baserel is a child. */
> +                foreach(appcell, root->append_rel_list)
> +                {
> +                    appinfo = (AppendRelInfo *)lfirst(appcell);
> +                    if (appinfo->child_relid == rel->relid)
> +                        break;
> +                }
> +                Assert(appinfo);
> +                appendrel = find_base_rel(root, appinfo->parent_relid);
> +
> +                /*
> +                 * The pathkey list happens to have the same format as the
> +                 * partitioning key of a Hashed locus, so by disguising it
> +                 * we can use cdbpathlocus_pull_above_projection() to do
> the 
> +                 * transformation.
> +                 */
> +                CdbPathLocus_MakeHashed(¬alocus, index_pathkeys);
> +                notalocus =
> +                    cdbpathlocus_pull_above_projection(root,
> +                                                       notalocus,
> +                                                       rel->relids,
> +                                                       rel->reltargetlist,
> +                  
> appendrel->reltargetlist,
> +                                                       appendrel->relid);
> +                if (CdbPathLocus_IsHashed(notalocus))
> +                    index_pathkeys = truncate_useless_pathkeys(root,
> appendrel,
> +                  
> notalocus.partkey);
> +                else
> +                    index_pathkeys = NULL;
> +            }
> +
> +            useful_pathkeys = truncate_useless_pathkeys(root, rel,
> +                                                        index_pathkeys);
>          }
>          else
>              useful_pathkeys = NIL;
> Index: cdb-pg/src/backend/optimizer/path/pathkeys.c
> ===================================================================
> RCS file: 
> /data/FISHEYE_REPOSITORIES/greenplum/cvsroot/cdb2/cdb-pg/src/backend/optimiz
> er/path/pathkeys.c,v
> diff -u -N -r1.18 -r1.18.2.1
> --- cdb-pg/src/backend/optimizer/path/pathkeys.c    30 Apr 2007 05:44:07
> -0000    1.18
> +++ cdb-pg/src/backend/optimizer/path/pathkeys.c    10 Aug 2007 03:41:15
> -0000    1.18.2.1
> @@ -1403,55 +1403,53 @@
>  {
>      PathKeyItem    *item;
>      Expr           *newexpr;
> +    AttrNumber  targetindex;
>  
>      Assert(pathkey);
>  
> -    /* Use constant expr if available.  Will be at head of list. */
> -    if (CdbPathkeyEqualsConstant(pathkey))
> +    /* Find an expr that we can rewrite to use the projected columns. */
> +    item = cdbpullup_findPathKeyItemInTargetList(pathkey,
> +                                                 relids,
> +                                                 targetlist,
> +                                                 &targetindex); // OUT
> +    
> +    /* If not found, see if the equiv class contains a constant expr. */
> +    if (!item &&
> +        CdbPathkeyEqualsConstant(pathkey))
>      {
>          item = (PathKeyItem *)linitial(pathkey);
>          newexpr = (Expr *)copyObject(item->key);
>      }
>  
> -    /* New vars for old! */
> -    else
> -    {
> -        AttrNumber  targetindex;
> +    /* Fail if no usable expr. */
> +    else if (!item)
> +        return NULL;
>  
> -        /* Find an expr that we can rewrite to use the projected columns.
> */
> -        item = cdbpullup_findPathKeyItemInTargetList(pathkey,
> -                                                     relids,
> -                                                     targetlist,
> -                                                     &targetindex); // OUT
> -        if (!item)
> -            return NULL;
> +    /* If found matching targetlist item, make a Var that references it. */
> +    else if (targetindex > 0)
> +        newexpr = (Expr *)cdbpullup_makeVar(newrelid,
> +                                            targetindex,
> +                                            newvarlist,
> +                                            (Expr *)item->key);
>  
> -        /* If found matching targetlist item, make a Var that references
> it. */
> -        if (targetindex > 0)
> -            newexpr = (Expr *)cdbpullup_makeVar(newrelid,
> -                                                targetindex,
> -                                                newvarlist,
> -                                                (Expr *)item->key);
> +    /* Replace expr's Var nodes with new ones referencing the targetlist.
> */
> +    else
> +        newexpr = cdbpullup_expr((Expr *)item->key,
> +                                 targetlist,
> +                                 newvarlist,
> +                                 newrelid);
>  
> -        /* Replace expr's Var nodes with new ones referencing the
> targetlist. */
> -        else
> -            newexpr = cdbpullup_expr((Expr *)item->key,
> -                                     targetlist,
> -                                     newvarlist,
> -                                     newrelid);
> +    /* Pull up RelabelType node too, unless tlist expr has right type. */
> +    if (IsA(item->key, RelabelType))
> +    {
> +        RelabelType    *oldrelabel = (RelabelType *)item->key;
>  
> -        /* Pull up RelabelType node too, unless tlist expr has right type.
> */
> -        if (IsA(item->key, RelabelType))
> -        {
> -            RelabelType    *oldrelabel = (RelabelType *)item->key;
> -
> -            if (oldrelabel->resulttype != exprType((Node *)newexpr) ||
> -                oldrelabel->resulttypmod != exprTypmod((Node *)newexpr))
> -                newexpr = (Expr *)makeRelabelType(newexpr,
> -                                                  oldrelabel->resulttype,
> -                                                  oldrelabel->resulttypmod,
> -                  
> oldrelabel->relabelformat);
> -        }
> +        if (oldrelabel->resulttype != exprType((Node *)newexpr) ||
> +            oldrelabel->resulttypmod != exprTypmod((Node *)newexpr))
> +            newexpr = (Expr *)makeRelabelType(newexpr,
> +                                              oldrelabel->resulttype,
> +                                              oldrelabel->resulttypmod,
> +                                              oldrelabel->relabelformat);
>      }
>      Insist(newexpr);
>  
> Index: cdb-pg/src/backend/optimizer/util/pathnode.c
> ===================================================================
> RCS file: 
> /data/FISHEYE_REPOSITORIES/greenplum/cvsroot/cdb2/cdb-pg/src/backend/optimiz
> er/util/pathnode.c,v
> diff -u -N -r1.52.2.4 -r1.52.2.5
> --- cdb-pg/src/backend/optimizer/util/pathnode.c    5 Aug 2007 23:06:44
> -0000    1.52.2.4
> +++ cdb-pg/src/backend/optimizer/util/pathnode.c    10 Aug 2007 03:41:15
> -0000    1.52.2.5
> @@ -1563,7 +1563,15 @@
>              pathnode->path.rescannable = false;
>      }
>  
> -    return pathnode;
> +    /* 
> +     * CDB: If there is exactly one subpath, its ordering is preserved.
> +     * Child rel's pathkey exprs are already expressed in terms of the
> +     * columns of the parent appendrel.  See find_usable_indexes().
> +     */
> +    if (list_length(subpaths) == 1)
> +        pathnode->path.pathkeys = ((Path *)linitial(subpaths))->pathkeys;
> +    
> +    return pathnode;
>  }
>  
>  /*
> 
> 
> On 8/24/07 3:38 AM, "Heikki Linnakangas" <heikki@enterprisedb.com> wrote:
> 
> > Anton wrote:
> >>>> =# explain SELECT * FROM n_traf ORDER BY date_time DESC LIMIT 1;
> >>>>                                                QUERY PLAN
> >>> ----------------------------------------------------------------------------
> >>> -----------------------------
> >>>> Limit  (cost=824637.69..824637.69 rows=1 width=32)
> >>>>    ->  Sort  (cost=824637.69..838746.44 rows=5643499 width=32)
> >>>>          Sort Key: public.n_traf.date_time
> >>>>          ->  Result  (cost=0.00..100877.99 rows=5643499 width=32)
> >>>>                ->  Append  (cost= 0.00..100877.99 rows=5643499 width=32)
> >>>>                      ->  Seq Scan on n_traf  (cost=0.00..22.30
> >>>> rows=1230 width=32)
> >>>>                      ->  Seq Scan on n_traf_y2007m01 n_traf
> >>>> (cost=0.00..22.30 rows=1230 width=32)
> >> ...
> >>>>                      ->  Seq Scan on n_traf_y2007m12 n_traf
> >>>> (cost=0.00..22.30 rows=1230 width=32)
> >>>> (18 rows)
> >>>> 
> >>>> Why it no uses indexes at all?
> >>>> -------------------------------------------
> >>> I'm no expert but I'd guess that the the planner doesn't know which
> >>> partition holds the latest time so it has to read them all.
> >> 
> >> Agree. But why it not uses indexes when it reading them?
> > 
> > The planner isn't smart enough to push the "ORDER BY ... LIMIT ..."
> > below the append node. Therefore it needs to fetch all rows from all the
> > tables, and the fastest way to do that is a seq scan.
> 
> 
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 7: You can help support the PostgreSQL project by donating at
> 
>                 http://www.postgresql.org/about/donate
> 
> ------ End of Forwarded Message
> 
> 
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 1: if posting/reading through Usenet, please send an appropriate
>        subscribe-nomail command to majordomo@postgresql.org so that your
>        message can get through to the mailing list cleanly

--  Bruce Momjian  <bruce@momjian.us>          http://momjian.us EnterpriseDB
http://www.enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Patch rejected because of Tom Lane's comments:

> This is useless since it does not represent a complete patch; the
> provided code calls a lot of Greenplum-private routines that weren't
> provided. It's not even reviewable let alone a candidate to apply.


---------------------------------------------------------------------------

Luke Lonergan wrote:
> Below is a patch against Greenplum Database that fixes the problem.
> 
> - Luke
> 
> ------ Forwarded Message
> From: Luke Lonergan <llonergan@greenplum.com>
> Date: Fri, 24 Aug 2007 09:25:53 -0700
> To: Heikki Linnakangas <heikki@enterprisedb.com>, Anton <anton200@gmail.com>
> Cc: <pgsql-performance@postgresql.org>
> Conversation: [PERFORM] partitioned table and ORDER BY indexed_field DESC
> LIMIT 1
> Subject: Re: [PERFORM] partitioned table and ORDER BY indexed_field DESC
> LIMIT 1
> 
> Below is a patch against 8.2.4 (more or less), Heikki can you take a look at
> it?
> 
> This enables the use of index scan of a child table by recognizing sort
> order of the append node.  Kurt Harriman did the work.
> 
> - Luke
> 
> Index: cdb-pg/src/backend/optimizer/path/indxpath.c
> ===================================================================
> RCS file: 
> /data/FISHEYE_REPOSITORIES/greenplum/cvsroot/cdb2/cdb-pg/src/backend/optimiz
> er/path/indxpath.c,v
> diff -u -N -r1.22 -r1.22.2.1
> --- cdb-pg/src/backend/optimizer/path/indxpath.c    25 Apr 2007 22:07:21
> -0000    1.22
> +++ cdb-pg/src/backend/optimizer/path/indxpath.c    10 Aug 2007 03:41:15
> -0000    1.22.2.1
> @@ -379,8 +379,51 @@
>              index_pathkeys = build_index_pathkeys(root, index,
>                                                    ForwardScanDirection,
>                                                    true);
> -            useful_pathkeys = truncate_useless_pathkeys(root, rel,
> -                                                        index_pathkeys);
> +            /*
> +             * CDB: For appendrel child, pathkeys contain Var nodes in
> terms 
> +             * of the child's baserel.  Transform the pathkey list to refer
> to 
> +             * columns of the appendrel.
> +             */
> +            if (rel->reloptkind == RELOPT_OTHER_MEMBER_REL)
> +            {
> +                AppendRelInfo  *appinfo = NULL;
> +                RelOptInfo     *appendrel = NULL;
> +                ListCell       *appcell;
> +                CdbPathLocus    notalocus;
> +
> +                /* Find the appendrel of which this baserel is a child. */
> +                foreach(appcell, root->append_rel_list)
> +                {
> +                    appinfo = (AppendRelInfo *)lfirst(appcell);
> +                    if (appinfo->child_relid == rel->relid)
> +                        break;
> +                }
> +                Assert(appinfo);
> +                appendrel = find_base_rel(root, appinfo->parent_relid);
> +
> +                /*
> +                 * The pathkey list happens to have the same format as the
> +                 * partitioning key of a Hashed locus, so by disguising it
> +                 * we can use cdbpathlocus_pull_above_projection() to do
> the 
> +                 * transformation.
> +                 */
> +                CdbPathLocus_MakeHashed(¬alocus, index_pathkeys);
> +                notalocus =
> +                    cdbpathlocus_pull_above_projection(root,
> +                                                       notalocus,
> +                                                       rel->relids,
> +                                                       rel->reltargetlist,
> +                  
> appendrel->reltargetlist,
> +                                                       appendrel->relid);
> +                if (CdbPathLocus_IsHashed(notalocus))
> +                    index_pathkeys = truncate_useless_pathkeys(root,
> appendrel,
> +                  
> notalocus.partkey);
> +                else
> +                    index_pathkeys = NULL;
> +            }
> +
> +            useful_pathkeys = truncate_useless_pathkeys(root, rel,
> +                                                        index_pathkeys);
>          }
>          else
>              useful_pathkeys = NIL;
> Index: cdb-pg/src/backend/optimizer/path/pathkeys.c
> ===================================================================
> RCS file: 
> /data/FISHEYE_REPOSITORIES/greenplum/cvsroot/cdb2/cdb-pg/src/backend/optimiz
> er/path/pathkeys.c,v
> diff -u -N -r1.18 -r1.18.2.1
> --- cdb-pg/src/backend/optimizer/path/pathkeys.c    30 Apr 2007 05:44:07
> -0000    1.18
> +++ cdb-pg/src/backend/optimizer/path/pathkeys.c    10 Aug 2007 03:41:15
> -0000    1.18.2.1
> @@ -1403,55 +1403,53 @@
>  {
>      PathKeyItem    *item;
>      Expr           *newexpr;
> +    AttrNumber  targetindex;
>  
>      Assert(pathkey);
>  
> -    /* Use constant expr if available.  Will be at head of list. */
> -    if (CdbPathkeyEqualsConstant(pathkey))
> +    /* Find an expr that we can rewrite to use the projected columns. */
> +    item = cdbpullup_findPathKeyItemInTargetList(pathkey,
> +                                                 relids,
> +                                                 targetlist,
> +                                                 &targetindex); // OUT
> +    
> +    /* If not found, see if the equiv class contains a constant expr. */
> +    if (!item &&
> +        CdbPathkeyEqualsConstant(pathkey))
>      {
>          item = (PathKeyItem *)linitial(pathkey);
>          newexpr = (Expr *)copyObject(item->key);
>      }
>  
> -    /* New vars for old! */
> -    else
> -    {
> -        AttrNumber  targetindex;
> +    /* Fail if no usable expr. */
> +    else if (!item)
> +        return NULL;
>  
> -        /* Find an expr that we can rewrite to use the projected columns.
> */
> -        item = cdbpullup_findPathKeyItemInTargetList(pathkey,
> -                                                     relids,
> -                                                     targetlist,
> -                                                     &targetindex); // OUT
> -        if (!item)
> -            return NULL;
> +    /* If found matching targetlist item, make a Var that references it. */
> +    else if (targetindex > 0)
> +        newexpr = (Expr *)cdbpullup_makeVar(newrelid,
> +                                            targetindex,
> +                                            newvarlist,
> +                                            (Expr *)item->key);
>  
> -        /* If found matching targetlist item, make a Var that references
> it. */
> -        if (targetindex > 0)
> -            newexpr = (Expr *)cdbpullup_makeVar(newrelid,
> -                                                targetindex,
> -                                                newvarlist,
> -                                                (Expr *)item->key);
> +    /* Replace expr's Var nodes with new ones referencing the targetlist.
> */
> +    else
> +        newexpr = cdbpullup_expr((Expr *)item->key,
> +                                 targetlist,
> +                                 newvarlist,
> +                                 newrelid);
>  
> -        /* Replace expr's Var nodes with new ones referencing the
> targetlist. */
> -        else
> -            newexpr = cdbpullup_expr((Expr *)item->key,
> -                                     targetlist,
> -                                     newvarlist,
> -                                     newrelid);
> +    /* Pull up RelabelType node too, unless tlist expr has right type. */
> +    if (IsA(item->key, RelabelType))
> +    {
> +        RelabelType    *oldrelabel = (RelabelType *)item->key;
>  
> -        /* Pull up RelabelType node too, unless tlist expr has right type.
> */
> -        if (IsA(item->key, RelabelType))
> -        {
> -            RelabelType    *oldrelabel = (RelabelType *)item->key;
> -
> -            if (oldrelabel->resulttype != exprType((Node *)newexpr) ||
> -                oldrelabel->resulttypmod != exprTypmod((Node *)newexpr))
> -                newexpr = (Expr *)makeRelabelType(newexpr,
> -                                                  oldrelabel->resulttype,
> -                                                  oldrelabel->resulttypmod,
> -                  
> oldrelabel->relabelformat);
> -        }
> +        if (oldrelabel->resulttype != exprType((Node *)newexpr) ||
> +            oldrelabel->resulttypmod != exprTypmod((Node *)newexpr))
> +            newexpr = (Expr *)makeRelabelType(newexpr,
> +                                              oldrelabel->resulttype,
> +                                              oldrelabel->resulttypmod,
> +                                              oldrelabel->relabelformat);
>      }
>      Insist(newexpr);
>  
> Index: cdb-pg/src/backend/optimizer/util/pathnode.c
> ===================================================================
> RCS file: 
> /data/FISHEYE_REPOSITORIES/greenplum/cvsroot/cdb2/cdb-pg/src/backend/optimiz
> er/util/pathnode.c,v
> diff -u -N -r1.52.2.4 -r1.52.2.5
> --- cdb-pg/src/backend/optimizer/util/pathnode.c    5 Aug 2007 23:06:44
> -0000    1.52.2.4
> +++ cdb-pg/src/backend/optimizer/util/pathnode.c    10 Aug 2007 03:41:15
> -0000    1.52.2.5
> @@ -1563,7 +1563,15 @@
>              pathnode->path.rescannable = false;
>      }
>  
> -    return pathnode;
> +    /* 
> +     * CDB: If there is exactly one subpath, its ordering is preserved.
> +     * Child rel's pathkey exprs are already expressed in terms of the
> +     * columns of the parent appendrel.  See find_usable_indexes().
> +     */
> +    if (list_length(subpaths) == 1)
> +        pathnode->path.pathkeys = ((Path *)linitial(subpaths))->pathkeys;
> +    
> +    return pathnode;
>  }
>  
>  /*
> 
> 
> On 8/24/07 3:38 AM, "Heikki Linnakangas" <heikki@enterprisedb.com> wrote:
> 
> > Anton wrote:
> >>>> =# explain SELECT * FROM n_traf ORDER BY date_time DESC LIMIT 1;
> >>>>                                                QUERY PLAN
> >>> ----------------------------------------------------------------------------
> >>> -----------------------------
> >>>> Limit  (cost=824637.69..824637.69 rows=1 width=32)
> >>>>    ->  Sort  (cost=824637.69..838746.44 rows=5643499 width=32)
> >>>>          Sort Key: public.n_traf.date_time
> >>>>          ->  Result  (cost=0.00..100877.99 rows=5643499 width=32)
> >>>>                ->  Append  (cost= 0.00..100877.99 rows=5643499 width=32)
> >>>>                      ->  Seq Scan on n_traf  (cost=0.00..22.30
> >>>> rows=1230 width=32)
> >>>>                      ->  Seq Scan on n_traf_y2007m01 n_traf
> >>>> (cost=0.00..22.30 rows=1230 width=32)
> >> ...
> >>>>                      ->  Seq Scan on n_traf_y2007m12 n_traf
> >>>> (cost=0.00..22.30 rows=1230 width=32)
> >>>> (18 rows)
> >>>> 
> >>>> Why it no uses indexes at all?
> >>>> -------------------------------------------
> >>> I'm no expert but I'd guess that the the planner doesn't know which
> >>> partition holds the latest time so it has to read them all.
> >> 
> >> Agree. But why it not uses indexes when it reading them?
> > 
> > The planner isn't smart enough to push the "ORDER BY ... LIMIT ..."
> > below the append node. Therefore it needs to fetch all rows from all the
> > tables, and the fastest way to do that is a seq scan.
> 
> 
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 7: You can help support the PostgreSQL project by donating at
> 
>                 http://www.postgresql.org/about/donate
> 
> ------ End of Forwarded Message
> 
> 
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 1: if posting/reading through Usenet, please send an appropriate
>        subscribe-nomail command to majordomo@postgresql.org so that your
>        message can get through to the mailing list cleanly

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://postgres.enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +