Thread: Another try at reducing repeated detoast work for PostGIS

Another try at reducing repeated detoast work for PostGIS

From
Tom Lane
Date:
There was recently another go-round on the postgis-devel list about
the same problem Mark Cave-Ayland complained about last year:
http://archives.postgresql.org/pgsql-hackers/2008-06/msg00384.php
Basically, what is happening is a nestloop join where the inner
indexscan gets a comparison argument from the outer table, and that
argument is toasted.  Every single call of an index support function
detoasts the argument again :-(, leading to upwards of 90% of the
runtime being spent in detoasting.

I made a proposal to fix it
http://archives.postgresql.org/pgsql-hackers/2008-06/msg00709.php
but that failed due to excessive ambition --- the cost/benefit/risk
tradeoffs just weren't good enough.

Thinking about it again, it seems to me that a much narrower patch
could solve the specific forms of the problem that the PostGIS folk
are seeing.  Instead of trying to have a general-purpose method of
preventing repeat de-toasting, we could just prevent it for inner
indexscans by having ExecIndexEvalRuntimeKeys() detoast anything it's
passing to the index AM.  The attached patch accomplishes this with
a net addition of about three statements.  (It looks bigger, because
I had to move a hunk of code to have the datatype info available when
needed.)  Paul Ramsey reports that this fixes the problem for him:
http://postgis.refractions.net/pipermail/postgis-devel/2009-August/006659.html

The only downside I can see offhand is that it will detoast short-header
values that might not actually need to be detoasted.  But we don't have
any very good way to know whether a datatype's index support functions
use PG_DETOAST_DATUM or PG_DETOAST_DATUM_PACKED.  In the former case we
do need to detoast short-header values.  The extra overhead in this case
amounts to only one palloc and a fairly short memcpy, which should be
pretty negligible in comparison to the other setup costs of an
indexscan, so I'm not very concerned about it.

Comments?  Better ideas?

            regards, tom lane

Index: src/backend/executor/nodeIndexscan.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/executor/nodeIndexscan.c,v
retrieving revision 1.133
diff -c -r1.133 nodeIndexscan.c
*** src/backend/executor/nodeIndexscan.c    18 Jul 2009 19:15:41 -0000    1.133
--- src/backend/executor/nodeIndexscan.c    15 Aug 2009 04:28:55 -0000
***************
*** 237,244 ****
--- 237,247 ----
  ExecIndexEvalRuntimeKeys(ExprContext *econtext,
                           IndexRuntimeKeyInfo *runtimeKeys, int numRuntimeKeys)
  {
+     MemoryContext oldContext;
      int            j;

+     oldContext = MemoryContextSwitchTo(econtext->ecxt_per_tuple_memory);
+
      for (j = 0; j < numRuntimeKeys; j++)
      {
          ScanKey        scan_key = runtimeKeys[j].scan_key;
***************
*** 256,273 ****
           * econtext->ecxt_per_tuple_memory.  We assume that the outer tuple
           * will stay put throughout our scan.  If this is wrong, we could copy
           * the result into our context explicitly, but I think that's not
!          * necessary...
           */
!         scanvalue = ExecEvalExprSwitchContext(key_expr,
!                                               econtext,
!                                               &isNull,
!                                               NULL);
!         scan_key->sk_argument = scanvalue;
          if (isNull)
              scan_key->sk_flags |= SK_ISNULL;
          else
              scan_key->sk_flags &= ~SK_ISNULL;
      }
  }

  /*
--- 259,290 ----
           * econtext->ecxt_per_tuple_memory.  We assume that the outer tuple
           * will stay put throughout our scan.  If this is wrong, we could copy
           * the result into our context explicitly, but I think that's not
!          * necessary.
!          *
!          * It's also entirely possible that the result of the eval is a
!          * toasted value.  In this case we should forcibly detoast it,
!          * to avoid repeat detoastings each time the value is examined
!          * by an index support function.
           */
!         scanvalue = ExecEvalExpr(key_expr,
!                                  econtext,
!                                  &isNull,
!                                  NULL);
          if (isNull)
+         {
+             scan_key->sk_argument = scanvalue;
              scan_key->sk_flags |= SK_ISNULL;
+         }
          else
+         {
+             if (runtimeKeys[j].key_toastable)
+                 scanvalue = PointerGetDatum(PG_DETOAST_DATUM(scanvalue));
+             scan_key->sk_argument = scanvalue;
              scan_key->sk_flags &= ~SK_ISNULL;
+         }
      }
+
+     MemoryContextSwitchTo(oldContext);
  }

  /*
***************
*** 795,800 ****
--- 812,819 ----
                  runtime_keys[n_runtime_keys].scan_key = this_scan_key;
                  runtime_keys[n_runtime_keys].key_expr =
                      ExecInitExpr(rightop, planstate);
+                 runtime_keys[n_runtime_keys].key_toastable =
+                     TypeIsToastable(op_righttype);
                  n_runtime_keys++;
                  scanvalue = (Datum) 0;
              }
***************
*** 844,849 ****
--- 863,893 ----
                  varattno = ((Var *) leftop)->varattno;

                  /*
+                  * We have to look up the operator's associated btree support
+                  * function
+                  */
+                 opno = lfirst_oid(opnos_cell);
+                 opnos_cell = lnext(opnos_cell);
+
+                 if (index->rd_rel->relam != BTREE_AM_OID ||
+                     varattno < 1 || varattno > index->rd_index->indnatts)
+                     elog(ERROR, "bogus RowCompare index qualification");
+                 opfamily = index->rd_opfamily[varattno - 1];
+
+                 get_op_opfamily_properties(opno, opfamily,
+                                            &op_strategy,
+                                            &op_lefttype,
+                                            &op_righttype);
+
+                 if (op_strategy != rc->rctype)
+                     elog(ERROR, "RowCompare index qualification contains wrong operator");
+
+                 opfuncid = get_opfamily_proc(opfamily,
+                                              op_lefttype,
+                                              op_righttype,
+                                              BTORDER_PROC);
+
+                 /*
                   * rightop is the constant or variable comparison value
                   */
                  rightop = (Expr *) lfirst(rargs_cell);
***************
*** 867,902 ****
                      runtime_keys[n_runtime_keys].scan_key = this_sub_key;
                      runtime_keys[n_runtime_keys].key_expr =
                          ExecInitExpr(rightop, planstate);
                      n_runtime_keys++;
                      scanvalue = (Datum) 0;
                  }

                  /*
-                  * We have to look up the operator's associated btree support
-                  * function
-                  */
-                 opno = lfirst_oid(opnos_cell);
-                 opnos_cell = lnext(opnos_cell);
-
-                 if (index->rd_rel->relam != BTREE_AM_OID ||
-                     varattno < 1 || varattno > index->rd_index->indnatts)
-                     elog(ERROR, "bogus RowCompare index qualification");
-                 opfamily = index->rd_opfamily[varattno - 1];
-
-                 get_op_opfamily_properties(opno, opfamily,
-                                            &op_strategy,
-                                            &op_lefttype,
-                                            &op_righttype);
-
-                 if (op_strategy != rc->rctype)
-                     elog(ERROR, "RowCompare index qualification contains wrong operator");
-
-                 opfuncid = get_opfamily_proc(opfamily,
-                                              op_lefttype,
-                                              op_righttype,
-                                              BTORDER_PROC);
-
-                 /*
                   * initialize the subsidiary scan key's fields appropriately
                   */
                  ScanKeyEntryInitialize(this_sub_key,
--- 911,923 ----
                      runtime_keys[n_runtime_keys].scan_key = this_sub_key;
                      runtime_keys[n_runtime_keys].key_expr =
                          ExecInitExpr(rightop, planstate);
+                     runtime_keys[n_runtime_keys].key_toastable =
+                         TypeIsToastable(op_righttype);
                      n_runtime_keys++;
                      scanvalue = (Datum) 0;
                  }

                  /*
                   * initialize the subsidiary scan key's fields appropriately
                   */
                  ScanKeyEntryInitialize(this_sub_key,
Index: src/include/nodes/execnodes.h
===================================================================
RCS file: /cvsroot/pgsql/src/include/nodes/execnodes.h,v
retrieving revision 1.206
diff -c -r1.206 execnodes.h
*** src/include/nodes/execnodes.h    6 Aug 2009 20:44:31 -0000    1.206
--- src/include/nodes/execnodes.h    15 Aug 2009 04:28:55 -0000
***************
*** 1084,1089 ****
--- 1084,1090 ----
  {
      ScanKey        scan_key;        /* scankey to put value into */
      ExprState  *key_expr;        /* expr to evaluate to get value */
+     bool        key_toastable;    /* is expr's result a toastable datatype? */
  } IndexRuntimeKeyInfo;

  typedef struct

Re: Another try at reducing repeated detoast work for PostGIS

From
Jeff Davis
Date:
On Mon, 2009-08-17 at 13:37 -0400, Tom Lane wrote:
> Thinking about it again, it seems to me that a much narrower patch
> could solve the specific forms of the problem that the PostGIS folk
> are seeing.  Instead of trying to have a general-purpose method of
> preventing repeat de-toasting, we could just prevent it for inner
> indexscans by having ExecIndexEvalRuntimeKeys() detoast anything it's
> passing to the index AM. 

With this patch, are there still situations where we should be concerned
about repeated de-toasting, or does this solve the biggest part of the
problem?

If so, is it possible that two similar plans for the same query might
perform differently due to repeated de-toasting?

Regards,Jeff Davis



Re: Another try at reducing repeated detoast work for PostGIS

From
Tom Lane
Date:
Jeff Davis <pgsql@j-davis.com> writes:
> On Mon, 2009-08-17 at 13:37 -0400, Tom Lane wrote:
>> Thinking about it again, it seems to me that a much narrower patch
>> could solve the specific forms of the problem that the PostGIS folk
>> are seeing.  Instead of trying to have a general-purpose method of
>> preventing repeat de-toasting, we could just prevent it for inner
>> indexscans by having ExecIndexEvalRuntimeKeys() detoast anything it's
>> passing to the index AM. 

> With this patch, are there still situations where we should be concerned
> about repeated de-toasting, or does this solve the biggest part of the
> problem?

Well, it solves the case people have actually complained about (twice
now).  I originally attempted to solve a larger set of cases, but it's
not clear there's enough value in that.

> If so, is it possible that two similar plans for the same query might
> perform differently due to repeated de-toasting?

Hard to answer that one.  What's "similar"?
        regards, tom lane


Re: Another try at reducing repeated detoast work for PostGIS

From
Jeff Davis
Date:
On Mon, 2009-08-17 at 14:54 -0400, Tom Lane wrote:
> > If so, is it possible that two similar plans for the same query might
> > perform differently due to repeated de-toasting?
> 
> Hard to answer that one.  What's "similar"?

My only concern is that it's a somewhat hidden optimization (not seen in
explain output), and thus a potential performance problem might catch
someone by surprise. I don't see how that could happen with this
particular patch, but I thought I would bring it up in case someone else
does.

Regards,Jeff Davis



Re: Another try at reducing repeated detoast work for PostGIS

From
Robert Haas
Date:
On Mon, Aug 17, 2009 at 2:54 PM, Tom Lane<tgl@sss.pgh.pa.us> wrote:
> Jeff Davis <pgsql@j-davis.com> writes:
>> On Mon, 2009-08-17 at 13:37 -0400, Tom Lane wrote:
>>> Thinking about it again, it seems to me that a much narrower patch
>>> could solve the specific forms of the problem that the PostGIS folk
>>> are seeing.  Instead of trying to have a general-purpose method of
>>> preventing repeat de-toasting, we could just prevent it for inner
>>> indexscans by having ExecIndexEvalRuntimeKeys() detoast anything it's
>>> passing to the index AM.
>
>> With this patch, are there still situations where we should be concerned
>> about repeated de-toasting, or does this solve the biggest part of the
>> problem?
>
> Well, it solves the case people have actually complained about (twice
> now).  I originally attempted to solve a larger set of cases, but it's
> not clear there's enough value in that.

How related is this issue?

http://archives.postgresql.org/pgsql-hackers/2008-12/msg00369.php

...Robert


Re: Another try at reducing repeated detoast work for PostGIS

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Mon, Aug 17, 2009 at 2:54 PM, Tom Lane<tgl@sss.pgh.pa.us> wrote:
>> Well, it solves the case people have actually complained about (twice
>> now). �I originally attempted to solve a larger set of cases, but it's
>> not clear there's enough value in that.

> How related is this issue?

> http://archives.postgresql.org/pgsql-hackers/2008-12/msg00369.php

It's not the same thing --- the detoast calls here seem to be associated
with examining pg_statistic entries in the planner.  It's hard to tell
from this whether the detoastings are repetitive, or how much we might
stand to gain if they are (the test case doesn't seem to have run long
enough to make the timings trustworthy).  I'll just note that my
previous proposed patch
http://archives.postgresql.org/message-id/5184.1214773030@sss.pgh.pa.us
wouldn't have helped this case either, since that was purely an executor
patch.  The generic backend-wide cache I suggested originally might have
helped, but implementing that seems daunting.
        regards, tom lane


Re: Another try at reducing repeated detoast work for PostGIS

From
Mark Cave-Ayland
Date:
Tom Lane wrote:

> There was recently another go-round on the postgis-devel list about
> the same problem Mark Cave-Ayland complained about last year:
> http://archives.postgresql.org/pgsql-hackers/2008-06/msg00384.php
> Basically, what is happening is a nestloop join where the inner
> indexscan gets a comparison argument from the outer table, and that
> argument is toasted.  Every single call of an index support function
> detoasts the argument again :-(, leading to upwards of 90% of the
> runtime being spent in detoasting.
> 
> I made a proposal to fix it
> http://archives.postgresql.org/pgsql-hackers/2008-06/msg00709.php
> but that failed due to excessive ambition --- the cost/benefit/risk
> tradeoffs just weren't good enough.
> 
> Thinking about it again, it seems to me that a much narrower patch
> could solve the specific forms of the problem that the PostGIS folk
> are seeing.  Instead of trying to have a general-purpose method of
> preventing repeat de-toasting, we could just prevent it for inner
> indexscans by having ExecIndexEvalRuntimeKeys() detoast anything it's
> passing to the index AM.  The attached patch accomplishes this with
> a net addition of about three statements.  (It looks bigger, because
> I had to move a hunk of code to have the datatype info available when
> needed.)  Paul Ramsey reports that this fixes the problem for him:
> http://postgis.refractions.net/pipermail/postgis-devel/2009-August/006659.html
> 
> The only downside I can see offhand is that it will detoast short-header
> values that might not actually need to be detoasted.  But we don't have
> any very good way to know whether a datatype's index support functions
> use PG_DETOAST_DATUM or PG_DETOAST_DATUM_PACKED.  In the former case we
> do need to detoast short-header values.  The extra overhead in this case
> amounts to only one palloc and a fairly short memcpy, which should be
> pretty negligible in comparison to the other setup costs of an
> indexscan, so I'm not very concerned about it.
> 
> Comments?  Better ideas?
> 
>             regards, tom lane


Hi Tom,

Thanks for the patch. Fortunately enough I was able to find the dataset 
from the original report above, and so I've tested your patch against 
PostgreSQL 8.4. Unfortunately in the original test case, it doesn't seem 
to give the same performance improvement for me that Paul was seeing :(


postgis13=# \d geography                                   Table "public.geography"   Column   |          Type
|                       Modifiers
 
------------+------------------------+--------------------------------------------------------- gid        | integer
           | not null default 
 
nextval('geography_gid_seq'::regclass) type       | character varying(1)   | id         | numeric(20,0)          | name
     | character varying(32)  | population | numeric(20,0)          | abbreviati | character varying(2)   | po_name
|character varying(100) | id_geo_sta | numeric(20,0)          | the_geom   | geometry               | centroid   |
geometry              |
 
Indexes:    "geography_pkey" PRIMARY KEY, btree (gid)    "idx_geography_centroid_z" gist (centroid)
Check constraints:    "enforce_dims_the_geom" CHECK (ndims(the_geom) = 2)    "enforce_geotype_the_geom" CHECK
(geometrytype(the_geom)= 
 
'MULTIPOLYGON'::text OR the_geom IS NULL)    "enforce_srid_the_geom" CHECK (srid(the_geom) = (-1))


PostgreSQL 8.4 vanilla:

postgis13=# explain analyze select count(*) from geography where 
type='Z' and centroid && (select the_geom from geography where id=69495);
QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=6892.28..6892.29 rows=1 width=0) (actual 
 
time=1770.333..1770.334 rows=1 loops=1)   InitPlan 1 (returns $0)     ->  Seq Scan on geography  (cost=0.00..6884.00
rows=1width=4432) 
 
(actual time=19.529..43.893 rows=1 loops=1)           Filter: (id = 69495::numeric)   ->  Index Scan using
idx_geography_centroid_zon geography 
 
(cost=0.00..8.28 rows=1 width=0) (actual time=48.897..1730.004 
rows=29687 loops=1)         Index Cond: (centroid && $0)         Filter: ((type)::text = 'Z'::text) Total runtime:
1770.417ms
 
(8 rows)


PostgreSQL 8.4 with your patch applied:

postgis13=# explain analyze select count(*) from geography where 
type='Z' and centroid && (select the_geom from geography where id=69495);
QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=6892.28..6892.29 rows=1 width=0) (actual 
 
time=1499.414..1499.416 rows=1 loops=1)   InitPlan 1 (returns $0)     ->  Seq Scan on geography  (cost=0.00..6884.00
rows=1width=4432) 
 
(actual time=18.901..42.648 rows=1 loops=1)           Filter: (id = 69495::numeric)   ->  Index Scan using
idx_geography_centroid_zon geography 
 
(cost=0.00..8.28 rows=1 width=0) (actual time=43.133..1456.944 
rows=29687 loops=1)         Index Cond: (centroid && $0)         Filter: ((type)::text = 'Z'::text) Total runtime:
1499.498ms
 
(8 rows)


PostgreSQL 8.4 with force_2d() wrapper function:

postgis13=# explain analyze select count(*) from geography where 
type='Z' and centroid && (select force_2d(the_geom) from geography where 
id=69495);
QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=6892.28..6892.29 rows=1 width=0) (actual 
 
time=285.974..285.975 rows=1 loops=1)   InitPlan 1 (returns $0)     ->  Seq Scan on geography  (cost=0.00..6884.00
rows=1width=4432) 
 
(actual time=19.278..48.056 rows=1 loops=1)           Filter: (id = 69495::numeric)   ->  Index Scan using
idx_geography_centroid_zon geography 
 
(cost=0.00..8.28 rows=1 width=0) (actual time=48.375..196.404 rows=29687 
loops=1)         Index Cond: (centroid && $0)         Filter: ((type)::text = 'Z'::text) Total runtime: 286.057 ms


So while there is some improvement with your patch, it still comes 
nowhere close to the speed obtained with wrapping the column in function 
that deTOASTs the input. Could it be that your patch misses out this 
particular case?


ATB,

Mark.

-- 
Mark Cave-Ayland - Senior Technical Architect
PostgreSQL - PostGIS
Sirius Corporation plc - control through freedom
http://www.siriusit.co.uk
t: +44 870 608 0063

Sirius Labs: http://www.siriusit.co.uk/labs


Re: Another try at reducing repeated detoast work for PostGIS

From
Tom Lane
Date:
Mark Cave-Ayland <mark.cave-ayland@siriusit.co.uk> writes:
> Thanks for the patch. Fortunately enough I was able to find the dataset 
> from the original report above, and so I've tested your patch against 
> PostgreSQL 8.4. Unfortunately in the original test case, it doesn't seem 
> to give the same performance improvement for me that Paul was seeing :(

Huh.  As far as I can see this example should traverse the same code
path.  I was about to ask for the dataset, but I think you might have
already sent it to me once --- does this look familiar?

$ tar tvfj geography.tar.bz2
-rw-r--r-- shade/shade 6444737 2008-06-06 13:33 geography.dbf
-rw-r--r-- shade/shade 37179008 2008-06-06 13:33 geography.shp
-rw-r--r-- shade/shade   263140 2008-06-06 13:33 geography.shx

If so, what do I do with it exactly --- the file extensions convey
nothing to my mind at the moment ...
        regards, tom lane


Re: Another try at reducing repeated detoast work for PostGIS

From
Andy Colson
Date:
Tom Lane wrote:
> Mark Cave-Ayland <mark.cave-ayland@siriusit.co.uk> writes:
>> Thanks for the patch. Fortunately enough I was able to find the dataset 
>> from the original report above, and so I've tested your patch against 
>> PostgreSQL 8.4. Unfortunately in the original test case, it doesn't seem 
>> to give the same performance improvement for me that Paul was seeing :(
> 
> Huh.  As far as I can see this example should traverse the same code
> path.  I was about to ask for the dataset, but I think you might have
> already sent it to me once --- does this look familiar?
> 
> $ tar tvfj geography.tar.bz2
> -rw-r--r-- shade/shade 6444737 2008-06-06 13:33 geography.dbf
> -rw-r--r-- shade/shade 37179008 2008-06-06 13:33 geography.shp
> -rw-r--r-- shade/shade   263140 2008-06-06 13:33 geography.shx
> 
> If so, what do I do with it exactly --- the file extensions convey
> nothing to my mind at the moment ...
> 
>             regards, tom lane
> 

You'll need the postgis stuff I think.

use the shp2pgsql tool, like this:

shp2pgsql -D -S geography geography > geography.sql

-D write dump format (ie COPY)
-S creates simple geom's, if you get an error, remove the -S.

USAGE: shp2pgsql [<options>] <shapefile> [<schema>.]<table>


If you wanna see the data right from the shapefiles, you can use a tool 
like qgis.

.dbf is regular dbase file
.shp is a shapefile (esri shapefile)
.shx is an index


-Andy


Re: Another try at reducing repeated detoast work for PostGIS

From
Mark Cave-Ayland
Date:
Tom Lane wrote:

> Huh.  As far as I can see this example should traverse the same code
> path.  I was about to ask for the dataset, but I think you might have
> already sent it to me once --- does this look familiar?
> 
> $ tar tvfj geography.tar.bz2
> -rw-r--r-- shade/shade 6444737 2008-06-06 13:33 geography.dbf
> -rw-r--r-- shade/shade 37179008 2008-06-06 13:33 geography.shp
> -rw-r--r-- shade/shade   263140 2008-06-06 13:33 geography.shx
> 
> If so, what do I do with it exactly --- the file extensions convey
> nothing to my mind at the moment ...

Okay. I've gone back and had a look at the original queries, and it 
seems the reason for the inflated times is that my setup for the 
original database was based on PostGIS 1.3, which has a RECHECK applied 
to the && operator.

If I temporarily remove the RECHECK from PostGIS 1.3 then the times drop 
substantially:


postgis13=# explain analyze select count(*) from geography where 
type='Z' and centroid && (select the_geom from geography where id=69495);
QUERY PLAN


----------------------------------------------------------------------------------------------------------------------------------------
--------- Aggregate  (cost=6892.28..6892.29 rows=1 width=0) (actual 
time=394.183..394.184 rows=1 loops=1)   InitPlan 1 (returns $0)     ->  Seq Scan on geography  (cost=0.00..6884.00
rows=1width=4432) 
 
(actual time=18.192..41.855 rows=1 loops=1)           Filter: (id = 69495::numeric)   ->  Index Scan using
idx_geography_centroid_zon geography 
 
(cost=0.00..8.28 rows=1 width=0) (actual time=46.711..345.940 rows=29687
loops=1)         Index Cond: (centroid && $0)         Filter: ((type)::text = 'Z'::text) Total runtime: 394.265 ms
(8 rows)


Incidentally, the recently-released PostGIS 1.4 has RECHECK disabled by 
default, and so it can be seen that the times are reasonably similar:


postgis14=# explain analyze select count(*) from geography where 
type='Z' and centroid && (select the_geom from geography where id=69495);
QUERY PLAN


----------------------------------------------------------------------------------------------------------------------------------------
--------- Aggregate  (cost=6892.28..6892.29 rows=1 width=0) (actual 
time=396.314..396.315 rows=1 loops=1)   InitPlan 1 (returns $0)     ->  Seq Scan on geography  (cost=0.00..6884.00
rows=1width=4439) 
 
(actual time=14.198..37.340 rows=1 loops=1)           Filter: (id = 69495::numeric)   ->  Index Scan using
idx_geography_centroid_zon geography 
 
(cost=0.00..8.28 rows=1 width=0) (actual time=42.169..344.337 rows=29687
loops=1)         Index Cond: (centroid && $0)         Filter: ((type)::text = 'Z'::text) Total runtime: 396.375 ms
(8 rows)


If I re-apply your patch to PostgreSQL 8.4 using PostGIS 1.4 (ignoring 
RECHECK) then the results now look like this:


postgis14=# explain analyze select count(*) from geography where 
type='Z' and centroid && (select the_geom from geography where id=69495);
QUERY PLAN


----------------------------------------------------------------------------------------------------------------------------------------
--------- Aggregate  (cost=6892.28..6892.29 rows=1 width=0) (actual 
time=271.360..271.362 rows=1 loops=1)   InitPlan 1 (returns $0)     ->  Seq Scan on geography  (cost=0.00..6884.00
rows=1width=4439) 
 
(actual time=17.534..32.009 rows=1 loops=1)           Filter: (id = 69495::numeric)   ->  Index Scan using
idx_geography_centroid_zon geography 
 
(cost=0.00..8.28 rows=1 width=0) (actual time=32.393..165.057 rows=29687
loops=1)         Index Cond: (centroid && $0)         Filter: ((type)::text = 'Z'::text) Total runtime: 271.428 ms
(8 rows)

postgis14=# explain analyze select count(*) from geography where 
type='Z' and centroid && (select force_2d(the_geom) from geography where 
id=69495);
QUERY PLAN


----------------------------------------------------------------------------------------------------------------------------------------
--------- Aggregate  (cost=6892.28..6892.29 rows=1 width=0) (actual 
time=272.091..272.092 rows=1 loops=1)   InitPlan 1 (returns $0)     ->  Seq Scan on geography  (cost=0.00..6884.00
rows=1width=4439) 
 
(actual time=18.644..42.680 rows=1 loops=1)           Filter: (id = 69495::numeric)   ->  Index Scan using
idx_geography_centroid_zon geography 
 
(cost=0.00..8.28 rows=1 width=0) (actual time=43.079..172.788 rows=29687
loops=1)         Index Cond: (centroid && $0)         Filter: ((type)::text = 'Z'::text) Total runtime: 272.185 ms
(8 rows)


So in conclusion, I think that patch looks good and that the extra time 
I was seeing was due to RECHECK being applied to the && operator, and 
not the time being spent within the index scan itself.


ATB,

Mark.

-- 
Mark Cave-Ayland - Senior Technical Architect
PostgreSQL - PostGIS
Sirius Corporation plc - control through freedom
http://www.siriusit.co.uk
t: +44 870 608 0063

Sirius Labs: http://www.siriusit.co.uk/labs


Re: Another try at reducing repeated detoast work for PostGIS

From
Tom Lane
Date:
Mark Cave-Ayland <mark.cave-ayland@siriusit.co.uk> writes:
> So in conclusion, I think that patch looks good and that the extra time 
> I was seeing was due to RECHECK being applied to the && operator, and 
> not the time being spent within the index scan itself.

Thanks, I appreciate the followup.

I plan to go ahead and apply the patch to HEAD --- it doesn't conflict
with Heikki's pending patch AFAICS, and no one has suggested an
alternative that seems likely to get implemented soon.

I am a bit tempted to apply it to 8.4 as well; otherwise the PostGIS
people are likely to start cluttering their code with this
add-a-dummy-function workaround, which would be unproductive in the long
run.  Comments?
        regards, tom lane


Re: Another try at reducing repeated detoast work for PostGIS

From
Kenneth Marshall
Date:
On Sat, Aug 22, 2009 at 12:39:41PM -0400, Tom Lane wrote:
> Mark Cave-Ayland <mark.cave-ayland@siriusit.co.uk> writes:
> > So in conclusion, I think that patch looks good and that the extra time 
> > I was seeing was due to RECHECK being applied to the && operator, and 
> > not the time being spent within the index scan itself.
> 
> Thanks, I appreciate the followup.
> 
> I plan to go ahead and apply the patch to HEAD --- it doesn't conflict
> with Heikki's pending patch AFAICS, and no one has suggested an
> alternative that seems likely to get implemented soon.
> 
> I am a bit tempted to apply it to 8.4 as well; otherwise the PostGIS
> people are likely to start cluttering their code with this
> add-a-dummy-function workaround, which would be unproductive in the long
> run.  Comments?
> 
>             regards, tom lane
> 
+1 for applying it to 8.4 as well.

Cheers,
Ken