Performance problems with postgres and null Values? - Mailing list pgsql-performance

From Sven Kerkling
Subject Performance problems with postgres and null Values?
Date
Msg-id 003501d19bb3$2b1458a0$813d09e0$@bds-online.com
Whole thread Raw
Responses Re: Performance problems with postgres and null Values?  (Merlin Moncure <mmoncure@gmail.com>)
List pgsql-performance

After remodeling a table we have some performance problems.

 

The Original tables have much more fields and we thought it came from these many fields. After some testing I tried these test layout and the performance problem is not solved.

 

Postgresql 9.3.12

 

Former DB-Layout was table _masterOld_ with 2 tables inherits from _masterOld_: _part1Old_ and _part2Old_. In _masterOld_ were 7 million rows (_part1Old_: 5 millions, _part2Old_: 2 millions).

 

Now we have only one new table _masterNew_ with 7 million rows.

 

DDL: 

export:

 

    CREATE TABLE public.export (

      id_firma BIGINT,

      status VARCHAR(32)

    )

    WITH (oids = false);

 

    CREATE INDEX export_idx ON public.export

      USING btree (id_firma);

 

masterNew:

  

    CREATE TABLE public."masterNew" (

    id_firma BIGINT,

    id_bestand BIGINT NOT NULL,

    status VARCHAR(32),

    sperre VARCHAR(32),

    CONSTRAINT "masterNew_2016_pkey" PRIMARY KEY(id_bestand)

    )

    WITH (oids = false);

 

    CREATE INDEX "masterNew_2016_pi_idx" ON public."masterNew"

    USING btree (id_firma)

    WHERE ((status IS NULL) AND (sperre IS NULL));

 

    CREATE INDEX "masterNew_sperre_2016" ON public."masterNew"

    USING btree (sperre COLLATE pg_catalog."default");

 

    CREATE INDEX "masterNew_status_2016" ON public."masterNew"

    USING btree (status COLLATE pg_catalog."default");

 

masterOld:

 

    CREATE TABLE public."masterOld" (

      id_firma BIGINT,

      id_bestand BIGINT NOT NULL,

      status VARCHAR(32),

      sperre VARCHAR(32),

      CONSTRAINT "masterOld_pkey" PRIMARY KEY(id_bestand)

    )

    WITH (oids = false);

 

    CREATE INDEX "masterOld_idx" ON public."masterOld"

      USING btree (id_firma);

 

    CREATE INDEX "masterOld_sperre" ON public."masterOld"

      USING btree (sperre COLLATE pg_catalog."default");

 

    CREATE INDEX "masterOld_status" ON public."masterOld"

      USING btree (status COLLATE pg_catalog."default");

 

part1Old:

 

    CREATE TABLE public."part1Old" (

      CONSTRAINT "part1Old_idx" PRIMARY KEY(id_bestand)

    ) INHERITS (public."masterOld")

 

    WITH (oids = false);

 

    CREATE INDEX "part1Old_idx1" ON public."part1Old"

      USING btree (id_firma);

 

    CREATE INDEX "part1Old_idx2" ON public."part1Old"

      USING btree (status COLLATE pg_catalog."default");

 

    CREATE INDEX "part1Old_idx3" ON public."part1Old"

      USING btree (sperre COLLATE pg_catalog."default");

 

part2Old:

 

    CREATE TABLE public."part2Old" (

      CONSTRAINT "part2Old_idx" PRIMARY KEY(id_bestand)

    ) INHERITS (public."masterOld")

 

    WITH (oids = false);

 

    CREATE INDEX "part2Old_idx1" ON public."part2Old"

      USING btree (id_firma);

 

    CREATE INDEX "part2Old_idx2" ON public."part2Old"

      USING btree (status COLLATE pg_catalog."default");

 

    CREATE INDEX "part2Old_idx3" ON public."part2Old"

      USING btree (sperre COLLATE pg_catalog."default");

 

In the _export_ table are 1.2 million rows.

 

Old:

 

    EXPLAIN

    SELECT b.id, b.status

    FROM export b, masterOld mb

    WHERE mb.sperre IS NULL

      AND mb.status IS NULL

      AND b.id_firma = mb.id_firma

    LIMIT 100; 

 

<a href="http://explain.depesz.com/s/SCBo">Plan on explain.depesz.com</a>

 

    - Plan:

    Node Type: "Limit"

    Startup Cost: 0.00

    Total Cost: 0.09

    Plan Rows: 100

    Plan Width: 90

    Plans:

      - Node Type: "Nested Loop"

        Parent Relationship: "Outer"

        Join Type: "Inner"

        Startup Cost: 0.00

        Total Cost: 118535034.59

        Plan Rows: 126126068850

        Plan Width: 90

        Plans:

          - Node Type: "Seq Scan"

            Parent Relationship: "Outer"

            Relation Name: "export"

            Alias: "b"

            Startup Cost: 0.00

            Total Cost: 79129.80

            Plan Rows: 5485680

            Plan Width: 90

          - Node Type: "Append"

            Parent Relationship: "Inner"

            Startup Cost: 0.00

            Total Cost: 21.56

            Plan Rows: 3

            Plan Width: 8

            Plans:

              - Node Type: "Seq Scan"

                Parent Relationship: "Member"

                Relation Name: "masterOld"

                Alias: "mb"

                Startup Cost: 0.00

                Total Cost: 1.10

                Plan Rows: 1

                Plan Width: 8

                Filter: "((sperre IS NULL) AND (status IS NULL) AND (b.id = id))"

              - Node Type: "Index Scan"

                Parent Relationship: "Member"

                Scan Direction: "Forward"

                Index Name: "part1Old_idx9"

                Relation Name: "part1Old"

                Alias: "mb_1"

                Startup Cost: 0.43

                Total Cost: 12.20

                Plan Rows: 1

                Plan Width: 8

                Index Cond: "(id = b.id)"

                Filter: "((sperre IS NULL) AND (status IS NULL))"

              - Node Type: "Index Scan"

                Parent Relationship: "Member"

                Scan Direction: "Forward"

                Index Name: "part2Old_idx"

                Relation Name: "part2Old"

                Alias: "mb_2"

                Startup Cost: 0.43

                Total Cost: 8.26

                Plan Rows: 1

                Plan Width: 8

                Index Cond: "(id = b.id)"

                Filter: "((sperre IS NULL) AND (status IS NULL))"

 

There were no speed problems.

 

New:

 

    EXPLAIN

    SELECT b.id, b.status

    FROM export b, masterNew mb

    WHERE mb.sperre IS NULL

      AND mb.status IS NULL

      AND b.id = mb.id

    LIMIT 100;

 

<a href="http://explain.depesz.com/s/eAqG">Plan on explain.depesz.com</a>   

 

    - Plan:

    Node Type: "Limit"

    Startup Cost: 5.38

    Total Cost: 306.99

    Plan Rows: 100

    Plan Width: 90

    Plans:

      - Node Type: "Nested Loop"

        Parent Relationship: "Outer"

        Join Type: "Inner"

        Startup Cost: 5.38

        Total Cost: 14973468.06

        Plan Rows: 4964540

        Plan Width: 90

        Join Filter: "(b.id = mb.id)"

        Plans:

          - Node Type: "Seq Scan"

            Parent Relationship: "Outer"

            Relation Name: "export"

            Alias: "b"

            Startup Cost: 0.00

            Total Cost: 79129.80

            Plan Rows: 5485680

            Plan Width: 90

          - Node Type: "Materialize"

            Parent Relationship: "Inner"

            Startup Cost: 5.38

            Total Cost: 717.51

            Plan Rows: 181

            Plan Width: 8

            Plans:

              - Node Type: "Bitmap Heap Scan"

                Parent Relationship: "Outer"

                Relation Name: "masterNew"

                Alias: "mb"

                Startup Cost: 5.38

                Total Cost: 716.61

                Plan Rows: 181

                Plan Width: 8

                Recheck Cond: "((status IS NULL) AND (sperre IS NULL))"

                Plans:

                  - Node Type: "Bitmap Index Scan"

                    Parent Relationship: "Outer"

                    Index Name: "masterNew_2016_pi_idx"

                    Startup Cost: 0.00

                    Total Cost: 5.34

                    Plan Rows: 181

                    Plan Width: 0

 

There we have our problem.

We have tried to fix it using a partial Index on _id_ with `status is null and sperre is null` .

If we don't use `sperre is null` in this query it is quick. I think we have these problems because _sperre_ and _status_ have much null values. _status_: 67% null and _sperre_: 97% null .

On each table there are btree indexes on _id_, _sperre_ and _status_.

On _masterNew_ there is a partial Index on _id_ with `sperre is null and status is null`.

 

Can somebody help me with these performance Problem.

What can I try to solve this?

 

Best regards,

Sven Kerkling

pgsql-performance by date:

Previous
From: "David G. Johnston"
Date:
Subject: Re: Performant queries on table with many boolean columns
Next
From: Teodor Sigaev
Date:
Subject: Re: Performant queries on table with many boolean columns