Re: [GENERAL] Performance issue with Pointcloud extension - Mailing list pgsql-general

From Eric Lemoine
Subject Re: [GENERAL] Performance issue with Pointcloud extension
Date
Msg-id 386038d7-24ce-8b01-73ff-31947afef011@oslandia.com
Whole thread Raw
In response to Re: [GENERAL] Performance issue with Pointcloud extension  (Bill Moran <wmoran@potentialtech.com>)
Responses Re: [GENERAL] Performance issue with Pointcloud extension
List pgsql-general
> Lots of missing information here ...
>
> Is there an index on public.sthelens.points?

Yes, there are.

lopocs=# \d sthelens;
                           Table "public.sthelens"
 Column |    Type    |                       Modifiers
--------+------------+-------------------------------------------------------
 id     | integer    | not null default nextval('sthelens_id_seq'::regclass)
 points | pcpatch(2) |
 morton | bigint     |
Indexes:
    "sthelens_pkey" PRIMARY KEY, btree (id)
    "sthelens_pc_envelopegeometry_idx" gist (pc_envelopegeometry(points))

So two indices, one for the primary key, and a Postgis index on the
Postgis geometry returned by the Pointcloud pc_envelopegeometry function.


> How many rows are in that table?

30971



> What are your shared_buffers settings?

128 MB (Debian unstable)


> How much RAM does the server have?

16 GB

> What does EXPLAIN look like for that query? How large (in bytes) are the
> tables in question?


                                                             QUERY PLAN



----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on sthelens  (cost=383.26..2496.67 rows=2065 width=32)
(actual time=3.213..46.674 rows=2506 loops=1)
   Recheck Cond:

('01030000207213000001000000050000002E47BC352BFC41C164910500DF8D4CC12E47BC75CCF941C164910500DF8D4CC12E47BC75CCF941C1649105002F8B4CC12E47BC352BFC41C1649105002F8B4CC12E47BC352BFC41C164910500DF8D4CC1'::geometry
&& st_geomfromewkb(pc_envelopeasbinary(points)))
   Filter:

_st_intersects('01030000207213000001000000050000002E47BC352BFC41C164910500DF8D4CC12E47BC75CCF941C164910500DF8D4CC12E47BC75CCF941C1649105002F8B4CC12E47BC352BFC41C1649105002F8B4CC12E47BC352BFC41C164910500DF8D4CC1'::geometry,
st_geomfromewkb(pc_envelopeasbinary(points)))
   Heap Blocks: exact=36
   ->  Bitmap Index Scan on sthelens_pc_envelopegeometry_idx
(cost=0.00..382.75 rows=6196 width=0) (actual time=1.626..1.626
rows=2506 loops=1)
         Index Cond:

('01030000207213000001000000050000002E47BC352BFC41C164910500DF8D4CC12E47BC75CCF941C164910500DF8D4CC12E47BC75CCF941C1649105002F8B4CC12E47BC352BFC41C1649105002F8B4CC12E47BC352BFC41C164910500DF8D4CC1'::geometry
&& st_geomfromewkb(pc_envelopeasbinary(points)))
 Planning time: 0.525 ms
 Execution time: 46.999 ms
(8 rows)


Note that the execution time is 46 ms when the query is wrapped in an
explain analyze (while it's 3 s when it's not!)


> What does pc_typmod_pcid() actually do?

It is one of the simplest functions of Pointcloud.


See
<https://github.com/pgpointcloud/pointcloud/blob/master/pgsql/pointcloud.sql.in#L44-L47>
and
<https://github.com/pgpointcloud/pointcloud/blob/eb4fe1e923179ad1ca718d5620b2f41ee1a94886/pgsql/pc_pgsql.c#L132-L139>.


>
> There are probably lots of other questions I could ask, but those questions
> are based on the fact that this _looks_ like a classic cache blowout. I.e.,
> the query runs quickly when all the related d> ata is in RAM, but is
> significantly slower when the data has to be pulled from disk. Answering
> the quesitons above will likely help to determine if my guess is correct.


I don't know. The query is fast if I run it first on the database
connection. And it is *always* very slow after the "select
pc_typmod_pcid(1)" query has run.


>
> If my guess is correct, there are any number of potential ways to improve
> things: Add RAM to the machine, enlarge shared_buffers, put a geo index on
> public.sthelens.points so it doesn't have to scan the entire table; as a
> few examples.


It sounds like a performance issue to you, while it sounds like a bug to
me :)

>
> Understanding what pc_typmod_pcid() actually does would help, but even
> without that you can test things in a few ways. One would be to substitute
> a different query in your testing for select pc_typmod_pcid(1) that is
> known to push the contents of public.sthelens out of memory and see if
> the behavior is similar. Any count(*) query on some other large table
> would probably suffice. A better way would probalby be to install the
> pg_buffercache module and see what's actually in the cache at each step
> of the testing process.


I'll try to use pg_buffercache.



Thanks a lot for your response. That at least gives me courage in
debugging this :)


--
Éric Lemoine
Oslandia
+33 1 86 95 95 55

Attachment

pgsql-general by date:

Previous
From: Adrian Klaver
Date:
Subject: Re: [GENERAL] Performance issue with Pointcloud extension
Next
From: Eric Lemoine
Date:
Subject: Re: [GENERAL] Performance issue with Pointcloud extension