Re: Geometric types row estimation - Mailing list pgsql-performance

From Igor ALBUQUERQUE SILVA
Subject Re: Geometric types row estimation
Date
Msg-id CADArKzWtKs8ax0wUUa1RUC0APfXbF6WZREhOtir-tBYE8b1Grw@mail.gmail.com
Whole thread Raw
In response to Re: Geometric types row estimation  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Geometric types row estimation  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-performance
Hi Tom,

Thanks a lot for the explanation, I thought the built-in types were more standard, so I didn't mention that I was having the same thing using postgis. Here's the example (I changed the values a little bit to avoid rounding errors):

create table test(p geometry(point));
insert into test(p) values (st_makepoint(0,0));
insert into test(p) values (st_makepoint(0,1));
insert into test(p) values (st_makepoint(1,0));
insert into test(p) values (st_makepoint(1,1));
insert into test(p) values (st_makepoint(50,0));
analyze test;
explain analyze select * from test where ST_Contains(ST_GeomFromText('POLYGON((-1 -1,2 -1,2 2,-1 2,-1 -1))'), p);
explain analyze select * from test where ST_Contains(ST_GeomFromText('POLYGON((49 -1,51 -1,51 1,49 1,49 -1))'), p);

EXPLAIN ANALYZE:

 Seq Scan on test  (cost=0.00..126.05 rows=1 width=32) (actual time=0.015..0.022 rows=4 loops=1)
   Filter: st_contains('01030000000100000005000000000000000000F0BF000000000000F0BF0000000000000040000000000000F0BF00000000000000400000000000000040000000000000F0BF0000000000000040000000000000F0BF000000000000F0BF'::geometry, p)
   Rows Removed by Filter: 1
 Planning Time: 0.072 ms
 Execution Time: 0.035 ms
(5 rows)

Do you know if the functions in Postgis are also stubbed? Or maybe I'm doing something wrong with the syntax?

This time I'm using the postgis docker image, PostgreSQL 15.1 (Debian 15.1-1.pgdg110+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 10.2.1-6) 10.2.1 20210110, 64-bit

Best regards,
Igor

On Wed, 30 Nov 2022 at 18:18, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Igor ALBUQUERQUE SILVA <i.albuquerque-silva@kayrros.com> writes:
> I'm having a problem regarding the point type/gist indexes. Here's a
> minimal reproduction of it:
> ...
> What I was expecting is the first query to estimate 4 rows and the second
> to estimate 1, like what I get If I try the same thing using integers.

Unfortunately, the selectivity estimation functions for PG's geometric
types are mostly just stubs.  The estimation function for point <@ box
in particular is contsel [1]:

/*
 *      contsel -- How likely is a box to contain (be contained by) a given box?
 *
 * This is a tighter constraint than "overlap", so produce a smaller
 * estimate than areasel does.
 */
Datum
contsel(PG_FUNCTION_ARGS)
{
        PG_RETURN_FLOAT8(0.001);
}

It's been like that (excepting notational changes) since Berkeley days,
because nobody has bothered to make it better.

In general, PG's built-in geometric types have never gotten much
beyond their origins as an academic proof-of-concept.  I think people
who are doing serious work that requires such operations mostly use
PostGIS, and I'd suggest looking into that.

Or, if you feel like doing a lot of work to make these estimators
better, have at it.

                        regards, tom lane

[1] https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob_plain;f=src/backend/utils/adt/geo_selfuncs.c;hb=HEAD

pgsql-performance by date:

Previous
From: Tom Lane
Date:
Subject: Re: Geometric types row estimation
Next
From: Tom Lane
Date:
Subject: Re: Geometric types row estimation