Thread: Index Skip Scan - attempting to evalutate patch

Index Skip Scan - attempting to evalutate patch

From
pguser
Date:
Hello

This is my first posting to hackers so sorry if I'm taking up valuable time.

I'm currently migrating a packaged application which supported oracle and sql server to PostgreSQL.

Something that I've identified as hurting the performance a lot is loose index scanning. I don't have access to the application SQL , so all I can try and do is mitigate through indexes. There are ~4000 tables in the application schema, and ~6000 indices.

Some plans are clearly worse than I would expect - because there are lots of index(a,b,c) and select where a= and c=. 

In an attempt to see if the putative skip scan changes will be beneficial on our real world data sets, I've been attempting to build and run pgsql from github with the v20- patch applied.

If I build without the patch, I get a running server, and can execute whatever queries I want.

If I apply the latest patch (which says 1 of 2? - maybe I'm missing a part of the patch?), I apply with

$ patch -p1 <../v20-0001-Index-skip-scan.patch
patching file contrib/bloom/blutils.c
patching file doc/src/sgml/config.sgml
patching file doc/src/sgml/indexam.sgml
patching file doc/src/sgml/indices.sgml
patching file src/backend/access/brin/brin.c
patching file src/backend/access/gin/ginutil.c
patching file src/backend/access/gist/gist.c
patching file src/backend/access/hash/hash.c
patching file src/backend/access/index/indexam.c
patching file src/backend/access/nbtree/nbtree.c
patching file src/backend/access/nbtree/nbtsearch.c
patching file src/backend/access/spgist/spgutils.c
patching file src/backend/commands/explain.c
patching file src/backend/executor/nodeIndexonlyscan.c
patching file src/backend/executor/nodeIndexscan.c
patching file src/backend/nodes/copyfuncs.c
patching file src/backend/nodes/outfuncs.c
patching file src/backend/nodes/readfuncs.c
patching file src/backend/optimizer/path/costsize.c
patching file src/backend/optimizer/path/pathkeys.c
patching file src/backend/optimizer/plan/createplan.c
patching file src/backend/optimizer/plan/planagg.c
patching file src/backend/optimizer/plan/planner.c
patching file src/backend/optimizer/util/pathnode.c
patching file src/backend/optimizer/util/plancat.c
patching file src/backend/utils/misc/guc.c
patching file src/backend/utils/misc/postgresql.conf.sample
patching file src/include/access/amapi.h
patching file src/include/access/genam.h
patching file src/include/access/nbtree.h
patching file src/include/nodes/execnodes.h
patching file src/include/nodes/pathnodes.h
patching file src/include/nodes/plannodes.h
patching file src/include/optimizer/cost.h
patching file src/include/optimizer/pathnode.h
patching file src/include/optimizer/paths.h
patching file src/test/regress/expected/create_index.out
patching file src/test/regress/expected/select_distinct.out
patching file src/test/regress/expected/sysviews.out
patching file src/test/regress/sql/create_index.sql
patching file src/test/regress/sql/select_distinct.sql

This will 'make' and 'make install' cleanly.

When I run the server, I can log in but the postgres processes associated with my psql session crashes SIGSEGV in many cases, for example when using \d:

psql (12beta2)
Type "help" for help.

db1=> show enable_indexskipscan;
enable_indexskipscan
----------------------
on
(1 row)

db1=> \d
psql: server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
!> \q


I got a backtrace out of the process:

(gdb) backtrace
#0  MemoryContextAllocZeroAligned (context=0x0, size=size@entry=80) at mcxt.c:864
#1  0x000000000067d2d4 in get_eclass_for_sort_expr (root=root@entry=0x22ecb10, expr=expr@entry=0x22ee280, nullable_relids=nullable_relids@entry=0x0, opfamilies=0x22ff530,
    opcintype=opcintype@entry=19, collation=collation@entry=950, sortref=<optimized out>, rel=0x0, create_it=true) at equivclass.c:704
#2  0x0000000000686d9e in make_pathkey_from_sortinfo (root=root@entry=0x22ecb10, expr=expr@entry=0x22ee280, nullable_relids=nullable_relids@entry=0x0, opfamily=1994, opcintype=19,
    collation=950, reverse_sort=false, nulls_first=false, sortref=1, rel=0x0, create_it=true) at pathkeys.c:228
#3  0x0000000000686eb7 in make_pathkey_from_sortop (root=root@entry=0x22ecb10, expr=0x22ee280, nullable_relids=0x0, ordering_op=660, nulls_first=<optimized out>, sortref=1,
    create_it=true) at pathkeys.c:271
#4  0x0000000000687a4a in make_pathkeys_for_sortclauses (root=root@entry=0x22ecb10, sortclauses=<optimized out>, tlist=tlist@entry=0x22ee2f0) at pathkeys.c:1099
#5  0x0000000000694588 in standard_qp_callback (root=0x22ecb10, extra=<optimized out>) at planner.c:3635
#6  0x0000000000693024 in query_planner (root=root@entry=0x22ecb10, qp_callback=qp_callback@entry=0x6944e0 <standard_qp_callback>, qp_extra=qp_extra@entry=0x7ffe6fe2b8e0)
    at planmain.c:207
#7  0x00000000006970e0 in grouping_planner (root=root@entry=0x22ecb10, inheritance_update=inheritance_update@entry=false, tuple_fraction=<optimized out>, tuple_fraction@entry=0)
    at planner.c:2048
#8  0x000000000069978d in subquery_planner (glob=glob@entry=0x22e43c0, parse=parse@entry=0x22e3f30, parent_root=parent_root@entry=0x0, hasRecursion=hasRecursion@entry=false,
    tuple_fraction=tuple_fraction@entry=0) at planner.c:1012
#9  0x000000000069a7b6 in standard_planner (parse=0x22e3f30, cursorOptions=256, boundParams=<optimized out>) at planner.c:406
#10 0x000000000073ceac in pg_plan_query (querytree=querytree@entry=0x22e3f30, cursorOptions=cursorOptions@entry=256, boundParams=boundParams@entry=0x0) at postgres.c:878
#11 0x000000000073cf86 in pg_plan_queries (querytrees=<optimized out>, cursorOptions=cursorOptions@entry=256, boundParams=boundParams@entry=0x0) at postgres.c:968
#12 0x000000000073d399 in exec_simple_query (
    query_string=0x222a9a0 "SELECT n.nspname as \"Schema\",\n  c.relname as \"Name\",\n  CASE c.relkind WHEN 'r' THEN 'table' WHEN 'v' THEN 'view' WHEN 'm' THEN 'materialized view' WHEN 'i' THEN 'index' WHEN 'S' THEN 'sequence' WHEN '"...) at postgres.c:1143
#13 0x000000000073ef5a in PostgresMain (argc=<optimized out>, argv=argv@entry=0x2255440, dbname=<optimized out>, username=<optimized out>) at postgres.c:4249
#14 0x00000000006cfaf6 in BackendRun (port=0x224e220, port=0x224e220) at postmaster.c:4431
#15 BackendStartup (port=0x224e220) at postmaster.c:4122
#16 ServerLoop () at postmaster.c:1704
#17 0x00000000006d09d0 in PostmasterMain (argc=argc@entry=3, argv=argv@entry=0x2224c50) at postmaster.c:1377
#18 0x00000000004820c4 in main (argc=3, argv=0x2224c50) at main.c:228

With the skip scan v20 patch installed, I tried to set enable_indexskipscan=off, but this did not resolve the crash.

Also, if I try to explain a select:

db1=> explain analyze select cmpy,bal,fncl_val from t1 where cmpy='04' and bal='CO';
psql: ERROR:  variable not found in subplan target list

I'm quite prepared to be told I've miscompiled or missed something really obvious and for that I apologize in advance.  I'm really keen to get involved with testing this patch/feature ahead of release.

Re: Index Skip Scan - attempting to evalutate patch

From
Dmitry Dolgov
Date:
> On Wed, Jun 26, 2019 at 1:53 PM pguser <pguser@diorite.uk> wrote:
>
> If I apply the latest patch (which says 1 of 2? - maybe I'm missing a part of the patch?), I apply with

Hi,

First of all, thanks for evaluation!

> psql (12beta2)
> Type "help" for help.
>
> db1=> show enable_indexskipscan;
> enable_indexskipscan
> ----------------------
> on
> (1 row)
>
> db1=> \d
> psql: server closed the connection unexpectedly
>         This probably means the server terminated abnormally
>         before or while processing the request.
> The connection to the server was lost. Attempting reset: Failed.
> !> \q
>
>
> I got a backtrace out of the process:
>
> (gdb) backtrace
> #0  MemoryContextAllocZeroAligned (context=0x0, size=size@entry=80) at mcxt.c:864
> #1  0x000000000067d2d4 in get_eclass_for_sort_expr (root=root@entry=0x22ecb10, expr=expr@entry=0x22ee280,
nullable_relids=nullable_relids@entry=0x0,opfamilies=0x22ff530, 
>     opcintype=opcintype@entry=19, collation=collation@entry=950, sortref=<optimized out>, rel=0x0, create_it=true) at
equivclass.c:704
> #2  0x0000000000686d9e in make_pathkey_from_sortinfo (root=root@entry=0x22ecb10, expr=expr@entry=0x22ee280,
nullable_relids=nullable_relids@entry=0x0,opfamily=1994, opcintype=19, 
>     collation=950, reverse_sort=false, nulls_first=false, sortref=1, rel=0x0, create_it=true) at pathkeys.c:228
> #3  0x0000000000686eb7 in make_pathkey_from_sortop (root=root@entry=0x22ecb10, expr=0x22ee280, nullable_relids=0x0,
ordering_op=660,nulls_first=<optimized out>, sortref=1, 
>     create_it=true) at pathkeys.c:271
> #4  0x0000000000687a4a in make_pathkeys_for_sortclauses (root=root@entry=0x22ecb10, sortclauses=<optimized out>,
tlist=tlist@entry=0x22ee2f0)at pathkeys.c:1099 
> #5  0x0000000000694588 in standard_qp_callback (root=0x22ecb10, extra=<optimized out>) at planner.c:3635
> #6  0x0000000000693024 in query_planner (root=root@entry=0x22ecb10, qp_callback=qp_callback@entry=0x6944e0
<standard_qp_callback>,qp_extra=qp_extra@entry=0x7ffe6fe2b8e0) 
>     at planmain.c:207
> #7  0x00000000006970e0 in grouping_planner (root=root@entry=0x22ecb10,
inheritance_update=inheritance_update@entry=false,tuple_fraction=<optimized out>, tuple_fraction@entry=0) 
>     at planner.c:2048
> #8  0x000000000069978d in subquery_planner (glob=glob@entry=0x22e43c0, parse=parse@entry=0x22e3f30,
parent_root=parent_root@entry=0x0,hasRecursion=hasRecursion@entry=false, 
>     tuple_fraction=tuple_fraction@entry=0) at planner.c:1012
> #9  0x000000000069a7b6 in standard_planner (parse=0x22e3f30, cursorOptions=256, boundParams=<optimized out>) at
planner.c:406
> #10 0x000000000073ceac in pg_plan_query (querytree=querytree@entry=0x22e3f30, cursorOptions=cursorOptions@entry=256,
boundParams=boundParams@entry=0x0)at postgres.c:878 
> #11 0x000000000073cf86 in pg_plan_queries (querytrees=<optimized out>, cursorOptions=cursorOptions@entry=256,
boundParams=boundParams@entry=0x0)at postgres.c:968 
> #12 0x000000000073d399 in exec_simple_query (
>     query_string=0x222a9a0 "SELECT n.nspname as \"Schema\",\n  c.relname as \"Name\",\n  CASE c.relkind WHEN 'r' THEN
'table'WHEN 'v' THEN 'view' WHEN 'm' THEN 'materialized view' WHEN 'i' THEN 'index' WHEN 'S' THEN 'sequence' WHEN
'"...)at postgres.c:1143 
> #13 0x000000000073ef5a in PostgresMain (argc=<optimized out>, argv=argv@entry=0x2255440, dbname=<optimized out>,
username=<optimizedout>) at postgres.c:4249 
> #14 0x00000000006cfaf6 in BackendRun (port=0x224e220, port=0x224e220) at postmaster.c:4431
> #15 BackendStartup (port=0x224e220) at postmaster.c:4122
> #16 ServerLoop () at postmaster.c:1704
> #17 0x00000000006d09d0 in PostmasterMain (argc=argc@entry=3, argv=argv@entry=0x2224c50) at postmaster.c:1377
> #18 0x00000000004820c4 in main (argc=3, argv=0x2224c50) at main.c:228

Could you by any change provide also relations schema that were supposed to be
described by this command?



Re: Index Skip Scan - attempting to evalutate patch

From
pguser
Date:
‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Wednesday, June 26, 2019 1:07 PM, Dmitry Dolgov <9erthalion6@gmail.com> wrote:

> > On Wed, Jun 26, 2019 at 1:53 PM pguser pguser@diorite.uk wrote:
> > If I apply the latest patch (which says 1 of 2? - maybe I'm missing a part of the patch?), I apply with
>
> Hi,
>
> First of all, thanks for evaluation!
>

No problem. I'd like to get involved in helping this patch mature as I think that we're suffering in a few areas of
performancedue to this. 

> Could you by any change provide also relations schema that were supposed to be
> described by this command?

Okay for now, it's not much. I get the issue of the SIGSEGV on a brand new database with only one relation:

This is with the 12beta2 as compiled from git sources by me:

psql (12beta2)
Type "help" for help.


db2=> \d
       List of relations
 Schema | Name | Type  | Owner
--------+------+-------+-------
 e5     | t1   | table | e5
(1 row)

db2=> \d t1
                        Table "e5.t1"
 Column |       Type        | Collation | Nullable | Default
--------+-------------------+-----------+----------+---------
 n1     | smallint          |           |          |
 n2     | smallint          |           |          |
 c1     | character varying |           |          |
 c2     | character varying |           |          |
Indexes:
    "i1" btree (n1, n2, c1)


And with patch 20 applied:

psql (12beta2)
Type "help" for help.

db2=> \d
psql: server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
!> \q
[postgres@ip-172-31-33-89 ~]$ . sql2
psql (12beta2)
Type "help" for help.

db2=> \d t1
psql: server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
!> \q


In fact, if I do:

createdb db3
psql -d db3
db3=# \d
psql: server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.

I get this on empty database with no relations yet defined.

I feel I have done something silly or missed something when applying patch



Re: Index Skip Scan - attempting to evalutate patch

From
pguser
Date:
‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Wednesday, June 26, 2019 2:55 PM, pguser <pguser@diorite.uk> wrote:

> ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
> On Wednesday, June 26, 2019 1:07 PM, Dmitry Dolgov 9erthalion6@gmail.com wrote:
>
> > > On Wed, Jun 26, 2019 at 1:53 PM pguser pguser@diorite.uk wrote:
> > > If I apply the latest patch (which says 1 of 2? - maybe I'm missing a part of the patch?), I apply with
> >
> > Hi,
> > First of all, thanks for evaluation!
>
> No problem. I'd like to get involved in helping this patch mature as I think that we're suffering in a few areas of
performancedue to this. 
>
> > Could you by any change provide also relations schema that were supposed to be
> > described by this command?
>
> Okay for now, it's not much. I get the issue of the SIGSEGV on a brand new database with only one relation:
>
> This is with the 12beta2 as compiled from git sources by me:
>
> psql (12beta2)
> Type "help" for help.
>
> db2=> \d
>
>        List of relations
>
>
> Schema | Name | Type | Owner
> --------+------+-------+-------
> e5 | t1 | table | e5
> (1 row)
>
> db2=> \d t1
>
>                         Table "e5.t1"
>
>
> Column | Type | Collation | Nullable | Default
> --------+-------------------+-----------+----------+---------
> n1 | smallint | | |
> n2 | smallint | | |
> c1 | character varying | | |
> c2 | character varying | | |
> Indexes:
> "i1" btree (n1, n2, c1)
>
> And with patch 20 applied:
>
> psql (12beta2)
> Type "help" for help.
>
> db2=> \d
> psql: server closed the connection unexpectedly
> This probably means the server terminated abnormally
> before or while processing the request.
> The connection to the server was lost. Attempting reset: Failed.
> !> \q
> [postgres@ip-172-31-33-89 ~]$ . sql2
> psql (12beta2)
> Type "help" for help.
>
> db2=> \d t1
> psql: server closed the connection unexpectedly
> This probably means the server terminated abnormally
> before or while processing the request.
> The connection to the server was lost. Attempting reset: Failed.
> !> \q
>
> In fact, if I do:
>
> createdb db3
> psql -d db3
> db3=# \d
> psql: server closed the connection unexpectedly
> This probably means the server terminated abnormally
> before or while processing the request.
> The connection to the server was lost. Attempting reset: Failed.
>
> I get this on empty database with no relations yet defined.
>
> I feel I have done something silly or missed something when applying patch


I find that my patched installation can't create its own initdb either:

initdb -D /pgd2
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.

The database cluster will be initialized with locale "en_US.UTF-8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".

Data page checksums are disabled.

fixing permissions on existing directory /pgd2 ... ok
creating subdirectories ... ok
selecting dynamic shared memory implementation ... posix
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting default timezone ... UTC
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... 2019-06-26 14:05:47.807 UTC [8120] FATAL:  could not open file
"base/1/2663.1"(target block 17353008): previous segment is only 4 blocks at character 65 
2019-06-26 14:05:47.807 UTC [8120] STATEMENT:  INSERT INTO pg_shdepend SELECT 0,0,0,0, tableoid,oid, 'p'  FROM
pg_authid;

child process exited with exit code 1
initdb: removing contents of data directory "/pgd2"


I was hoping to share the pgdata between 12beta2 without patch, and 12beta2 with patch, for ease of side by side
comparison.

Even more I feel that I'm missing something more than just this 20 patch from the Index Skip Scan thread.



Re: Index Skip Scan - attempting to evalutate patch

From
Tomas Vondra
Date:
On Wed, Jun 26, 2019 at 02:12:55PM +0000, pguser wrote:
>
>‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
>On Wednesday, June 26, 2019 2:55 PM, pguser <pguser@diorite.uk> wrote:
>
>> ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
>> On Wednesday, June 26, 2019 1:07 PM, Dmitry Dolgov 9erthalion6@gmail.com wrote:
>>
>> > > On Wed, Jun 26, 2019 at 1:53 PM pguser pguser@diorite.uk wrote:
>> > > If I apply the latest patch (which says 1 of 2? - maybe I'm missing a part of the patch?), I apply with
>> >
>> > Hi,
>> > First of all, thanks for evaluation!
>>
>> No problem. I'd like to get involved in helping this patch mature as I think that we're suffering in a few areas of
performancedue to this.
 
>>
>> > Could you by any change provide also relations schema that were supposed to be
>> > described by this command?
>>
>> Okay for now, it's not much. I get the issue of the SIGSEGV on a brand new database with only one relation:
>>
>> This is with the 12beta2 as compiled from git sources by me:
>>
>> psql (12beta2)
>> Type "help" for help.
>>
>> db2=> \d
>>
>>        List of relations
>>
>>
>> Schema | Name | Type | Owner
>> --------+------+-------+-------
>> e5 | t1 | table | e5
>> (1 row)
>>
>> db2=> \d t1
>>
>>                         Table "e5.t1"
>>
>>
>> Column | Type | Collation | Nullable | Default
>> --------+-------------------+-----------+----------+---------
>> n1 | smallint | | |
>> n2 | smallint | | |
>> c1 | character varying | | |
>> c2 | character varying | | |
>> Indexes:
>> "i1" btree (n1, n2, c1)
>>
>> And with patch 20 applied:
>>
>> psql (12beta2)
>> Type "help" for help.
>>
>> db2=> \d
>> psql: server closed the connection unexpectedly
>> This probably means the server terminated abnormally
>> before or while processing the request.
>> The connection to the server was lost. Attempting reset: Failed.
>> !> \q
>> [postgres@ip-172-31-33-89 ~]$ . sql2
>> psql (12beta2)
>> Type "help" for help.
>>
>> db2=> \d t1
>> psql: server closed the connection unexpectedly
>> This probably means the server terminated abnormally
>> before or while processing the request.
>> The connection to the server was lost. Attempting reset: Failed.
>> !> \q
>>
>> In fact, if I do:
>>
>> createdb db3
>> psql -d db3
>> db3=# \d
>> psql: server closed the connection unexpectedly
>> This probably means the server terminated abnormally
>> before or while processing the request.
>> The connection to the server was lost. Attempting reset: Failed.
>>
>> I get this on empty database with no relations yet defined.
>>
>> I feel I have done something silly or missed something when applying patch
>
>
>I find that my patched installation can't create its own initdb either:
>
>initdb -D /pgd2
>The files belonging to this database system will be owned by user "postgres".
>This user must also own the server process.
>
>The database cluster will be initialized with locale "en_US.UTF-8".
>The default database encoding has accordingly been set to "UTF8".
>The default text search configuration will be set to "english".
>
>Data page checksums are disabled.
>
>fixing permissions on existing directory /pgd2 ... ok
>creating subdirectories ... ok
>selecting dynamic shared memory implementation ... posix
>selecting default max_connections ... 100
>selecting default shared_buffers ... 128MB
>selecting default timezone ... UTC
>creating configuration files ... ok
>running bootstrap script ... ok
>performing post-bootstrap initialization ... 2019-06-26 14:05:47.807 UTC [8120] FATAL:  could not open file
"base/1/2663.1"(target block 17353008): previous segment is only 4 blocks at character 65
 
>2019-06-26 14:05:47.807 UTC [8120] STATEMENT:  INSERT INTO pg_shdepend SELECT 0,0,0,0, tableoid,oid, 'p'  FROM
pg_authid;
>
>child process exited with exit code 1
>initdb: removing contents of data directory "/pgd2"
>

Well, there's something seriously wrong with your build or environment,
then. I've tried reproducing the issue, but it works just fine for me
(initdb, psql, ...).

>
>I was hoping to share the pgdata between 12beta2 without patch, and
>12beta2 with patch, for ease of side by side comparison.
>

That might be dangerous, if there may be differences in contents of
catalogs. I don't think the patch does that though, and for me it works
just fine. I can initdb database using current master, create table +
indexes, do \d. And I can do that with the patch applied too.

>Even more I feel that I'm missing something more than just this 20 patch
>from the Index Skip Scan thread.
>

Are you sure this is not some sort of OOM issue? That might also
demonstrate as a segfault, in various cases.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Index Skip Scan - attempting to evalutate patch

From
pguser
Date:
‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Wednesday, June 26, 2019 4:07 PM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:

> That might be dangerous, if there may be differences in contents of
> catalogs. I don't think the patch does that though, and for me it works
> just fine. I can initdb database using current master, create table +
> indexes, do \d. And I can do that with the patch applied too.
>


Well, this is embarrassing.

I repeated all my steps again on my development laptop (Fedora 30, GCC 9.1.1, glibc 2.29.15) and it all works (doesn't
segfault,can initdb). 

On my Amazon Linux EC2 , (gcc 7.3.1, glibc 2.6.32) it exhibits fault on patched version.

Same steps, same sources.

Got to be build tools/version related on my EC2 instance.

Darn it. Sorry for wasting your time, I will continue to evaluate patch, and be mindful that something, somewhere is
sensitiveto build tools versions or lib versions. 

Many regards