Thread: BUG #18766: not exists sometimes gives too few records

BUG #18766: not exists sometimes gives too few records

From
PG Bug reporting form
Date:
The following bug has been logged on the website:

Bug reference:      18766
Logged by:          Jan Kort
Email address:      jan.kort@genetics.nl
PostgreSQL version: 16.2
Operating system:   Windows
Description:

Hi,

Unfortunately I can't reproduce this problem in a script yet, it has
occurred in 2 different databases out of 100 so far and only in specific
queries. When it does occur on a specific database it keeps occurring every
time the query is executed.

All databases are running PostgreSQL 16.2 on Windows, to be specific:

PostgreSQL 16.2, compiled by Visual C++ build 1937, 64-bit

On 2 of the databases I always get records out of this query:

select * -- 15
from ADVIES_TYPE
where owners_id IN (1,32,25)
and conditional = 'N'
and id not in (
select ADVIES_TYPE from ADVIES_HOOFD
where ADVIES_HOOFD.ADVIES_TYPE = ADVIES_TYPE.ID
and ADVIES_HOOFD.OWNERS_ID IN (1,32,25))
except
select * -- 14
from ADVIES_TYPE
where owners_id IN (1,32,25)
--and id = 1000001002
and conditional = 'N'
and not exists (
select 1 from ADVIES_HOOFD
--where ADVIES_HOOFD.ADVIES_TYPE::text = ADVIES_TYPE.ID::text
where ADVIES_HOOFD.ADVIES_TYPE = ADVIES_TYPE.ID
and ADVIES_HOOFD.OWNERS_ID IN (1,32,25))

Will return a record with ID = 1000001002 on 1 database (15-14=1 row)

On another database it will return 18 rows (18-0=18 rows)

On my personal database, it works as I expect and return 0 rows (15-15)

The first part, before the except gives 15 records and the part after the
except gives 14 records

When I compare the 15 and 14 records manually I indeed see that 1000001002
is in part 1 and missing from part 2

As far as I understand both parts should always return the same number of
records, so part1 except part2 shouldn't return a record.

When I check manually, there isn't a 1000001002  in
ADVIES_HOOFD.ADVIES_TYPE:

select * from ADVIES_HOOFD where ADVIES_TYPE = 1000001002

Gives no records

The create for advies_type is:

CREATE TABLE IF NOT EXISTS public.advies_type
(
    id integer NOT NULL DEFAULT nextval('advies_type_id_seq'::regclass),
    brondocdate_afv timestamp with time zone,
    brondocdate_opv timestamp with time zone,
    brondocnr_afv character varying(20) COLLATE pg_catalog."default",
    brondocnr_opv character varying(20) COLLATE pg_catalog."default",
    mutator_id numeric(22,0) NOT NULL,
    creator_id numeric(22,0) NOT NULL,
    createdate timestamp with time zone NOT NULL,
    mut_dat timestamp with time zone NOT NULL,
    conditional character varying(10) COLLATE pg_catalog."default" NOT
NULL,
    owners_id integer NOT NULL,
    omschrijving character varying(50) COLLATE pg_catalog."default" NOT
NULL,
    instelling integer,
    CONSTRAINT advies_typep PRIMARY KEY (id),
    CONSTRAINT instellingc1 FOREIGN KEY (instelling)
        REFERENCES public.advies_instelling (id) MATCH SIMPLE
        ON UPDATE NO ACTION
        ON DELETE NO ACTION
        DEFERRABLE INITIALLY DEFERRED,
    CONSTRAINT owners_idc1 FOREIGN KEY (owners_id)
        REFERENCES public.owners (id) MATCH SIMPLE
        ON UPDATE NO ACTION
        ON DELETE NO ACTION
        DEFERRABLE INITIALLY DEFERRED,
    CONSTRAINT conditionalc1 CHECK ((conditional::text = ANY
(ARRAY['N'::text, 'J'::text, 'H'::text])) OR conditional IS NULL)
)

And ADVIES_HOOFD:

CREATE TABLE IF NOT EXISTS public.advies_hoofd
(
    id integer NOT NULL DEFAULT nextval('advies_hoofd_id_seq'::regclass),
    brondocdate_afv timestamp with time zone,
    brondocdate_opv timestamp with time zone,
    brondocnr_afv character varying(20) COLLATE pg_catalog."default",
    brondocnr_opv character varying(20) COLLATE pg_catalog."default",
    mutator_id numeric(22,0) NOT NULL,
    creator_id numeric(22,0) NOT NULL,
    createdate timestamp with time zone NOT NULL,
    mut_dat timestamp with time zone NOT NULL,
    conditional character varying(10) COLLATE pg_catalog."default" NOT
NULL,
    owners_id integer NOT NULL,
    advies_type integer,
    advies_kenmerk character varying(40) COLLATE pg_catalog."default",
    advies_gebied_id integer,
    advies_niveau_id integer,
    advies_besluit_id integer,
    datum_advies timestamp with time zone,
    advies_termijn numeric(20,0),
    door integer,
    advies_toelichten character varying(1) COLLATE pg_catalog."default",
    omschrijving character varying(4000) COLLATE pg_catalog."default",
    datum_toets timestamp with time zone,
    getoetst_door integer,
    advies text COLLATE pg_catalog."default",
    namens integer,
    advies_status integer,
    advies_toepaswel character varying(10) COLLATE pg_catalog."default",
    advies_afgerond character varying(1) COLLATE pg_catalog."default",
    advies_meenemen character varying(1) COLLATE pg_catalog."default",
    advies_commissie_id integer,
    advies_datum_id integer,
    advies_eerder character varying(1) COLLATE pg_catalog."default",
    laatste_agenda timestamp with time zone,
    omschrijving_kort character varying(150) COLLATE pg_catalog."default",
    zaak_identificatie character varying(40) COLLATE pg_catalog."default",
    zaak_status_id integer,
    zaak_reeks_id integer,
    xcoordinaat numeric(22,6),
    ycoordinaat numeric(22,6),
    CONSTRAINT advies_hoofdp PRIMARY KEY (id),
    CONSTRAINT advies_besluit_idc1 FOREIGN KEY (advies_besluit_id)
        REFERENCES public.advies_besluit (id) MATCH SIMPLE
        ON UPDATE NO ACTION
        ON DELETE NO ACTION
        DEFERRABLE INITIALLY DEFERRED,
    CONSTRAINT advies_commissie_idc1 FOREIGN KEY (advies_commissie_id)
        REFERENCES public.adv_commissie (id) MATCH SIMPLE
        ON UPDATE NO ACTION
        ON DELETE NO ACTION
        DEFERRABLE INITIALLY DEFERRED,
    CONSTRAINT advies_datum_idc1 FOREIGN KEY (advies_datum_id)
        REFERENCES public.adv_commissie_datums (id) MATCH SIMPLE
        ON UPDATE NO ACTION
        ON DELETE NO ACTION
        DEFERRABLE INITIALLY DEFERRED,
    CONSTRAINT advies_gebied_idc1 FOREIGN KEY (advies_gebied_id)
        REFERENCES public.advies_gebied (id) MATCH SIMPLE
        ON UPDATE NO ACTION
        ON DELETE NO ACTION
        DEFERRABLE INITIALLY DEFERRED,
    CONSTRAINT advies_niveau_idc1 FOREIGN KEY (advies_niveau_id)
        REFERENCES public.advies_niveau (id) MATCH SIMPLE
        ON UPDATE NO ACTION
        ON DELETE NO ACTION
        DEFERRABLE INITIALLY DEFERRED,
    CONSTRAINT advies_statusc1 FOREIGN KEY (advies_status)
        REFERENCES public.advies_status (id) MATCH SIMPLE
        ON UPDATE NO ACTION
        ON DELETE NO ACTION
        DEFERRABLE INITIALLY DEFERRED,
    CONSTRAINT advies_typec1 FOREIGN KEY (advies_type)
        REFERENCES public.advies_type (id) MATCH SIMPLE
        ON UPDATE NO ACTION
        ON DELETE NO ACTION
        DEFERRABLE INITIALLY DEFERRED,
    CONSTRAINT doorc1 FOREIGN KEY (door)
        REFERENCES public.login (id) MATCH SIMPLE
        ON UPDATE NO ACTION
        ON DELETE NO ACTION
        DEFERRABLE INITIALLY DEFERRED,
    CONSTRAINT getoetst_doorc1 FOREIGN KEY (getoetst_door)
        REFERENCES public.login (id) MATCH SIMPLE
        ON UPDATE NO ACTION
        ON DELETE NO ACTION
        DEFERRABLE INITIALLY DEFERRED,
    CONSTRAINT namensc1 FOREIGN KEY (namens)
        REFERENCES public.advies_namens (id) MATCH SIMPLE
        ON UPDATE NO ACTION
        ON DELETE NO ACTION
        DEFERRABLE INITIALLY DEFERRED,
    CONSTRAINT owners_idc1 FOREIGN KEY (owners_id)
        REFERENCES public.owners (id) MATCH SIMPLE
        ON UPDATE NO ACTION
        ON DELETE NO ACTION
        DEFERRABLE INITIALLY DEFERRED,
    CONSTRAINT zaak_reeks_idc1 FOREIGN KEY (zaak_reeks_id)
        REFERENCES public.gfo_zaken_statusreeks (id) MATCH SIMPLE
        ON UPDATE NO ACTION
        ON DELETE NO ACTION
        DEFERRABLE INITIALLY DEFERRED,
    CONSTRAINT zaak_status_idc1 FOREIGN KEY (zaak_status_id)
        REFERENCES public.gfo_zaken_statuscode (id) MATCH SIMPLE
        ON UPDATE NO ACTION
        ON DELETE NO ACTION
        DEFERRABLE INITIALLY DEFERRED,
    CONSTRAINT advies_afgerondc1 CHECK (advies_afgerond::text = ANY
(ARRAY['T'::text, 'F'::text])),
    CONSTRAINT advies_eerderc1 CHECK (advies_eerder::text = ANY
(ARRAY['T'::text, 'F'::text])),
    CONSTRAINT advies_meenemenc1 CHECK (advies_meenemen::text = ANY
(ARRAY['T'::text, 'F'::text])),
    CONSTRAINT advies_toelichtenc1 CHECK (advies_toelichten::text = ANY
(ARRAY['T'::text, 'F'::text])),
    CONSTRAINT advies_toepaswelc1 CHECK ((advies_toepaswel::text = ANY
(ARRAY['J'::text, 'N'::text, 'V'::text])) OR advies_toepaswel IS NULL),
    CONSTRAINT conditionalc1 CHECK ((conditional::text = ANY
(ARRAY['N'::text, 'J'::text, 'H'::text])) OR conditional IS NULL)
)

So both sides of the comparison:

ADVIES_HOOFD.ADVIES_TYPE = ADVIES_TYPE.ID

Are INTEGER

When I run:

select * -- 14
from ADVIES_TYPE
where owners_id IN (1,32,25)
and id = 1000001002
and conditional = 'N'
and not exists (
select 1 from ADVIES_HOOFD
--where ADVIES_HOOFD.ADVIES_TYPE::text = ADVIES_TYPE.ID::text
where ADVIES_HOOFD.ADVIES_TYPE = ADVIES_TYPE.ID
and ADVIES_HOOFD.OWNERS_ID IN (1,32,25))

So, with the "and id = 1000001002", I get exactly that record 1000001002 as
a result. So it seems that this somehow fixes 1000001002 not appearing in
the result.

When I run the first query, but I cast the comparison to text:

select * -- 15
from ADVIES_TYPE
where owners_id IN (1,32,25)
--and id = 1000001002
and conditional = 'N'
and not exists (
select 1 from ADVIES_HOOFD
where ADVIES_HOOFD.ADVIES_TYPE::text = ADVIES_TYPE.ID::text
--where ADVIES_HOOFD.ADVIES_TYPE = ADVIES_TYPE.ID
and ADVIES_HOOFD.OWNERS_ID IN (1,32,25))

It gives 15 rows

When I change it back:

select * -- 14
from ADVIES_TYPE
where owners_id IN (1,32,25)
--and id = 1000001002
and conditional = 'N'
and not exists (
select 1 from ADVIES_HOOFD
--where ADVIES_HOOFD.ADVIES_TYPE::text = ADVIES_TYPE.ID::text
where ADVIES_HOOFD.ADVIES_TYPE = ADVIES_TYPE.ID
and ADVIES_HOOFD.OWNERS_ID IN (1,32,25))

It gives 14 rows

Like I said, I can't reproduce it in a script and attempts to make the data
in my local database as close to one of the problem databases have been
unsuccessful unfortunately.

I am hoping this is a known issue, but I can't find any mention of it. I
looked through the releasenotes 16.3 - 17.2, but couldn't find any mention
of this. Is it a known issue in 16.2 that "not exists" in combination with
INTEGER indexes can malfunction under certain circumstances?

Regards,

Jan


Re: BUG #18766: not exists sometimes gives too few records

From
Tomas Vondra
Date:

On 1/3/25 18:04, PG Bug reporting form wrote:
> The following bug has been logged on the website:
> 
> Bug reference:      18766
> Logged by:          Jan Kort
> Email address:      jan.kort@genetics.nl
> PostgreSQL version: 16.2
> Operating system:   Windows
> Description:        
> 
> Hi,
> 
> Unfortunately I can't reproduce this problem in a script yet, it has
> occurred in 2 different databases out of 100 so far and only in specific
> queries. When it does occur on a specific database it keeps occurring every
> time the query is executed.
> 
> All databases are running PostgreSQL 16.2 on Windows, to be specific:
> 
> PostgreSQL 16.2, compiled by Visual C++ build 1937, 64-bit
> 
> On 2 of the databases I always get records out of this query:
> 
> select * -- 15
> from ADVIES_TYPE
> where owners_id IN (1,32,25)
> and conditional = 'N'
> and id not in (
> select ADVIES_TYPE from ADVIES_HOOFD
> where ADVIES_HOOFD.ADVIES_TYPE = ADVIES_TYPE.ID
> and ADVIES_HOOFD.OWNERS_ID IN (1,32,25))
> except
> select * -- 14
> from ADVIES_TYPE
> where owners_id IN (1,32,25)
> --and id = 1000001002
> and conditional = 'N'
> and not exists (
> select 1 from ADVIES_HOOFD
> --where ADVIES_HOOFD.ADVIES_TYPE::text = ADVIES_TYPE.ID::text
> where ADVIES_HOOFD.ADVIES_TYPE = ADVIES_TYPE.ID
> and ADVIES_HOOFD.OWNERS_ID IN (1,32,25))
> 
> Will return a record with ID = 1000001002 on 1 database (15-14=1 row)
> 
> On another database it will return 18 rows (18-0=18 rows)
> 
> On my personal database, it works as I expect and return 0 rows (15-15)
> 
> The first part, before the except gives 15 records and the part after the
> except gives 14 records
> 
> When I compare the 15 and 14 records manually I indeed see that 1000001002
> is in part 1 and missing from part 2
> 
> As far as I understand both parts should always return the same number of
> records, so part1 except part2 shouldn't return a record.
> 
> When I check manually, there isn't a 1000001002  in
> ADVIES_HOOFD.ADVIES_TYPE:
> 
> select * from ADVIES_HOOFD where ADVIES_TYPE = 1000001002
> 
> Gives no records
> 
> The create for advies_type is:
> 
> CREATE TABLE IF NOT EXISTS public.advies_type
> (
>     id integer NOT NULL DEFAULT nextval('advies_type_id_seq'::regclass),
>     brondocdate_afv timestamp with time zone,
>     brondocdate_opv timestamp with time zone,
>     brondocnr_afv character varying(20) COLLATE pg_catalog."default",
>     brondocnr_opv character varying(20) COLLATE pg_catalog."default",
>     mutator_id numeric(22,0) NOT NULL,
>     creator_id numeric(22,0) NOT NULL,
>     createdate timestamp with time zone NOT NULL,
>     mut_dat timestamp with time zone NOT NULL,
>     conditional character varying(10) COLLATE pg_catalog."default" NOT
> NULL,
>     owners_id integer NOT NULL,
>     omschrijving character varying(50) COLLATE pg_catalog."default" NOT
> NULL,
>     instelling integer,
>     CONSTRAINT advies_typep PRIMARY KEY (id),
>     CONSTRAINT instellingc1 FOREIGN KEY (instelling)
>         REFERENCES public.advies_instelling (id) MATCH SIMPLE
>         ON UPDATE NO ACTION
>         ON DELETE NO ACTION
>         DEFERRABLE INITIALLY DEFERRED,
>     CONSTRAINT owners_idc1 FOREIGN KEY (owners_id)
>         REFERENCES public.owners (id) MATCH SIMPLE
>         ON UPDATE NO ACTION
>         ON DELETE NO ACTION
>         DEFERRABLE INITIALLY DEFERRED,
>     CONSTRAINT conditionalc1 CHECK ((conditional::text = ANY
> (ARRAY['N'::text, 'J'::text, 'H'::text])) OR conditional IS NULL)
> )
> 
> And ADVIES_HOOFD:
> 
> CREATE TABLE IF NOT EXISTS public.advies_hoofd
> (
>     id integer NOT NULL DEFAULT nextval('advies_hoofd_id_seq'::regclass),
>     brondocdate_afv timestamp with time zone,
>     brondocdate_opv timestamp with time zone,
>     brondocnr_afv character varying(20) COLLATE pg_catalog."default",
>     brondocnr_opv character varying(20) COLLATE pg_catalog."default",
>     mutator_id numeric(22,0) NOT NULL,
>     creator_id numeric(22,0) NOT NULL,
>     createdate timestamp with time zone NOT NULL,
>     mut_dat timestamp with time zone NOT NULL,
>     conditional character varying(10) COLLATE pg_catalog."default" NOT
> NULL,
>     owners_id integer NOT NULL,
>     advies_type integer,
>     advies_kenmerk character varying(40) COLLATE pg_catalog."default",
>     advies_gebied_id integer,
>     advies_niveau_id integer,
>     advies_besluit_id integer,
>     datum_advies timestamp with time zone,
>     advies_termijn numeric(20,0),
>     door integer,
>     advies_toelichten character varying(1) COLLATE pg_catalog."default",
>     omschrijving character varying(4000) COLLATE pg_catalog."default",
>     datum_toets timestamp with time zone,
>     getoetst_door integer,
>     advies text COLLATE pg_catalog."default",
>     namens integer,
>     advies_status integer,
>     advies_toepaswel character varying(10) COLLATE pg_catalog."default",
>     advies_afgerond character varying(1) COLLATE pg_catalog."default",
>     advies_meenemen character varying(1) COLLATE pg_catalog."default",
>     advies_commissie_id integer,
>     advies_datum_id integer,
>     advies_eerder character varying(1) COLLATE pg_catalog."default",
>     laatste_agenda timestamp with time zone,
>     omschrijving_kort character varying(150) COLLATE pg_catalog."default",
>     zaak_identificatie character varying(40) COLLATE pg_catalog."default",
>     zaak_status_id integer,
>     zaak_reeks_id integer,
>     xcoordinaat numeric(22,6),
>     ycoordinaat numeric(22,6),
>     CONSTRAINT advies_hoofdp PRIMARY KEY (id),
>     CONSTRAINT advies_besluit_idc1 FOREIGN KEY (advies_besluit_id)
>         REFERENCES public.advies_besluit (id) MATCH SIMPLE
>         ON UPDATE NO ACTION
>         ON DELETE NO ACTION
>         DEFERRABLE INITIALLY DEFERRED,
>     CONSTRAINT advies_commissie_idc1 FOREIGN KEY (advies_commissie_id)
>         REFERENCES public.adv_commissie (id) MATCH SIMPLE
>         ON UPDATE NO ACTION
>         ON DELETE NO ACTION
>         DEFERRABLE INITIALLY DEFERRED,
>     CONSTRAINT advies_datum_idc1 FOREIGN KEY (advies_datum_id)
>         REFERENCES public.adv_commissie_datums (id) MATCH SIMPLE
>         ON UPDATE NO ACTION
>         ON DELETE NO ACTION
>         DEFERRABLE INITIALLY DEFERRED,
>     CONSTRAINT advies_gebied_idc1 FOREIGN KEY (advies_gebied_id)
>         REFERENCES public.advies_gebied (id) MATCH SIMPLE
>         ON UPDATE NO ACTION
>         ON DELETE NO ACTION
>         DEFERRABLE INITIALLY DEFERRED,
>     CONSTRAINT advies_niveau_idc1 FOREIGN KEY (advies_niveau_id)
>         REFERENCES public.advies_niveau (id) MATCH SIMPLE
>         ON UPDATE NO ACTION
>         ON DELETE NO ACTION
>         DEFERRABLE INITIALLY DEFERRED,
>     CONSTRAINT advies_statusc1 FOREIGN KEY (advies_status)
>         REFERENCES public.advies_status (id) MATCH SIMPLE
>         ON UPDATE NO ACTION
>         ON DELETE NO ACTION
>         DEFERRABLE INITIALLY DEFERRED,
>     CONSTRAINT advies_typec1 FOREIGN KEY (advies_type)
>         REFERENCES public.advies_type (id) MATCH SIMPLE
>         ON UPDATE NO ACTION
>         ON DELETE NO ACTION
>         DEFERRABLE INITIALLY DEFERRED,
>     CONSTRAINT doorc1 FOREIGN KEY (door)
>         REFERENCES public.login (id) MATCH SIMPLE
>         ON UPDATE NO ACTION
>         ON DELETE NO ACTION
>         DEFERRABLE INITIALLY DEFERRED,
>     CONSTRAINT getoetst_doorc1 FOREIGN KEY (getoetst_door)
>         REFERENCES public.login (id) MATCH SIMPLE
>         ON UPDATE NO ACTION
>         ON DELETE NO ACTION
>         DEFERRABLE INITIALLY DEFERRED,
>     CONSTRAINT namensc1 FOREIGN KEY (namens)
>         REFERENCES public.advies_namens (id) MATCH SIMPLE
>         ON UPDATE NO ACTION
>         ON DELETE NO ACTION
>         DEFERRABLE INITIALLY DEFERRED,
>     CONSTRAINT owners_idc1 FOREIGN KEY (owners_id)
>         REFERENCES public.owners (id) MATCH SIMPLE
>         ON UPDATE NO ACTION
>         ON DELETE NO ACTION
>         DEFERRABLE INITIALLY DEFERRED,
>     CONSTRAINT zaak_reeks_idc1 FOREIGN KEY (zaak_reeks_id)
>         REFERENCES public.gfo_zaken_statusreeks (id) MATCH SIMPLE
>         ON UPDATE NO ACTION
>         ON DELETE NO ACTION
>         DEFERRABLE INITIALLY DEFERRED,
>     CONSTRAINT zaak_status_idc1 FOREIGN KEY (zaak_status_id)
>         REFERENCES public.gfo_zaken_statuscode (id) MATCH SIMPLE
>         ON UPDATE NO ACTION
>         ON DELETE NO ACTION
>         DEFERRABLE INITIALLY DEFERRED,
>     CONSTRAINT advies_afgerondc1 CHECK (advies_afgerond::text = ANY
> (ARRAY['T'::text, 'F'::text])),
>     CONSTRAINT advies_eerderc1 CHECK (advies_eerder::text = ANY
> (ARRAY['T'::text, 'F'::text])),
>     CONSTRAINT advies_meenemenc1 CHECK (advies_meenemen::text = ANY
> (ARRAY['T'::text, 'F'::text])),
>     CONSTRAINT advies_toelichtenc1 CHECK (advies_toelichten::text = ANY
> (ARRAY['T'::text, 'F'::text])),
>     CONSTRAINT advies_toepaswelc1 CHECK ((advies_toepaswel::text = ANY
> (ARRAY['J'::text, 'N'::text, 'V'::text])) OR advies_toepaswel IS NULL),
>     CONSTRAINT conditionalc1 CHECK ((conditional::text = ANY
> (ARRAY['N'::text, 'J'::text, 'H'::text])) OR conditional IS NULL)
> )
> 
> So both sides of the comparison:
> 
> ADVIES_HOOFD.ADVIES_TYPE = ADVIES_TYPE.ID
> 
> Are INTEGER
> 
> When I run:
> 
> select * -- 14
> from ADVIES_TYPE
> where owners_id IN (1,32,25)
> and id = 1000001002
> and conditional = 'N'
> and not exists (
> select 1 from ADVIES_HOOFD
> --where ADVIES_HOOFD.ADVIES_TYPE::text = ADVIES_TYPE.ID::text
> where ADVIES_HOOFD.ADVIES_TYPE = ADVIES_TYPE.ID
> and ADVIES_HOOFD.OWNERS_ID IN (1,32,25))
> 
> So, with the "and id = 1000001002", I get exactly that record 1000001002 as
> a result. So it seems that this somehow fixes 1000001002 not appearing in
> the result.
> 
> When I run the first query, but I cast the comparison to text:
> 
> select * -- 15
> from ADVIES_TYPE
> where owners_id IN (1,32,25)
> --and id = 1000001002
> and conditional = 'N'
> and not exists (
> select 1 from ADVIES_HOOFD
> where ADVIES_HOOFD.ADVIES_TYPE::text = ADVIES_TYPE.ID::text
> --where ADVIES_HOOFD.ADVIES_TYPE = ADVIES_TYPE.ID
> and ADVIES_HOOFD.OWNERS_ID IN (1,32,25))
> 
> It gives 15 rows
> 
> When I change it back:
> 
> select * -- 14
> from ADVIES_TYPE
> where owners_id IN (1,32,25)
> --and id = 1000001002
> and conditional = 'N'
> and not exists (
> select 1 from ADVIES_HOOFD
> --where ADVIES_HOOFD.ADVIES_TYPE::text = ADVIES_TYPE.ID::text
> where ADVIES_HOOFD.ADVIES_TYPE = ADVIES_TYPE.ID
> and ADVIES_HOOFD.OWNERS_ID IN (1,32,25))
> 
> It gives 14 rows
> 
> Like I said, I can't reproduce it in a script and attempts to make the data
> in my local database as close to one of the problem databases have been
> unsuccessful unfortunately.
> 
> I am hoping this is a known issue, but I can't find any mention of it. I
> looked through the releasenotes 16.3 - 17.2, but couldn't find any mention
> of this. Is it a known issue in 16.2 that "not exists" in combination with
> INTEGER indexes can malfunction under certain circumstances?
> 

I don't think so. It's a bit hard to grok what exactly is expected to be
happening - we don't know the data or the explain plans, so we'd have to
speculate about what is expected to happen etc.

If I had to guess, this seems a bit like index corruption, in which case
the queries can easily return confusing results, depending on what query
plan happens to be selected (e.g. it might work fine without index).

That'd also explain why you may have difficulties trying to reproduce
this on a different machine, a copy of the data, etc.

Use EXPLAIN to check the query plan, and then try to force it to not use
indexes (SET enable_indescans=off). If the result changes to the correct
one, that's a hint it's about indexes. You may try rebuilding the index,
or perhaps even all indexes (because if this one has issues, why not the
others?).

If this doesn't help, I suggest you try to narrow down the example. For
example, instead of the whole query with EXISTS, determine for which
values of (ADVIES_TYPE, OWNERS_ID) the subquery returns the wrong
results, and then look at only that (much smaller) query. Check if it
still uses the same plan (as in the whole query), if it returns the
right results, etc.


regards

-- 
Tomas Vondra




Re: BUG #18766: not exists sometimes gives too few records

From
Tomas Vondra
Date:
Hi Jan,

Please always respond to all, so that it goes to the mailing list too,
not just to me directly. I'll respond to the list this time, but it
makes the thread harder to follow.


On 1/4/25 12:11, Jan Kort wrote:
> Hi Tomas,
>
> You are right that without the index it does produce the correct result
> (0 records), but how do I uncorrupt the index? I tried the following:
>
>   * Dropping and recreating the 2 indexes in the explain
>   * Dropping and recreating all indexes on the the 2 tables
>   * Vacuum analyze advies_type
>   * Vacuum analyze advies_hoofd
>
> But it still produces the wrong result (1 record)
>

To "uncorrupt" the index is to rebuild/reindex it. But you already did
that and it didn't help, so it seems more like a bug in Postgres than a
data corruption.

> The explain of the query on the malfunctioning database is:
>
> HashSetOp Except  (cost=0.14..29.91 rows=9 width=364)
>   ->  Append  (cost=0.14..29.13 rows=24 width=364)
>         ->  Subquery Scan on "*SELECT* 1"  (cost=0.14..26.62 rows=9
> width=149)
>               ->  Index Scan using advies_typeo12 on advies_type
>  (cost=0.14..26.53 rows=9 width=145)
>                     Index Cond: ((conditional)::text = 'N'::text)
>                     Filter: ((owners_id = ANY ('{1,32,25}'::integer[]))
> AND (NOT (SubPlan 1)))
>                     SubPlan 1
>                       ->  Index Scan using advies_hoofdo16 on
> advies_hoofd advies_hoofd_1  (cost=0.15..2.37 rows=1 width=4)
>                             Index Cond: (advies_type = advies_type.id
> <http://advies_type.id>)
>                             Filter: (owners_id = ANY
> ('{1,32,25}'::integer[]))
>         ->  Subquery Scan on "*SELECT* 2"  (cost=1.86..2.39 rows=15
> width=149)
>               ->  Merge Right Anti Join  (cost=1.86..2.24 rows=15
width=145)
>                     Merge Cond: (advies_hoofd.advies_type =
> advies_type_1.id <http://advies_type_1.id>)
>                     ->  Index Scan using advies_hoofdo16 on advies_hoofd
>  (cost=0.15..18.01 rows=293 width=4)
>                           Filter: (owners_id = ANY
('{1,32,25}'::integer[]))
>                     ->  Sort  (cost=1.72..1.76 rows=18 width=145)
>                           Sort Key: advies_type_1.id <http://
> advies_type_1.id>
>                           ->  Seq Scan on advies_type advies_type_1
>  (cost=0.00..1.34 rows=18 width=145)
>                                 Filter: (((conditional)::text =
> 'N'::text) AND (owners_id = ANY ('{1,32,25}'::integer[])))
>
> On the working database it says:
>
> HashSetOp Except  (cost=0.00..17.03 rows=9 width=364)
>   ->  Append  (cost=0.00..16.19 rows=26 width=364)
>         ->  Subquery Scan on "*SELECT* 1"  (cost=0.00..13.20 rows=9
> width=149)
>               ->  Seq Scan on advies_type  (cost=0.00..13.11 rows=9
> width=145)
>                     Filter: (((conditional)::text = 'N'::text) AND
> (owners_id = ANY ('{1,32,25}'::integer[])) AND (NOT (SubPlan 1)))
>                     SubPlan 1
>                       ->  Seq Scan on advies_hoofd advies_hoofd_1
>  (cost=0.00..1.11 rows=1 width=4)
>                             Filter: ((advies_type = advies_type.id
> <http://advies_type.id>) AND (owners_id = ANY ('{1,32,25}'::integer[])))
>         ->  Subquery Scan on "*SELECT* 2"  (cost=1.57..2.85 rows=17
> width=149)
>               ->  Hash Right Anti Join  (cost=1.57..2.68 rows=17
width=145)
>                     Hash Cond: (advies_hoofd.advies_type =
> advies_type_1.id <http://advies_type_1.id>)
>                     ->  Seq Scan on advies_hoofd  (cost=0.00..1.10
> rows=7 width=4)
>                           Filter: (owners_id = ANY
('{1,32,25}'::integer[]))
>                     ->  Hash  (cost=1.34..1.34 rows=18 width=145)
>                           ->  Seq Scan on advies_type advies_type_1
>  (cost=0.00..1.34 rows=18 width=145)
>                                 Filter: (((conditional)::text =
> 'N'::text) AND (owners_id = ANY ('{1,32,25}'::integer[])))
>
> So they are quite different, that gave me an idea why my local database
> wasn't malfunctioning. I inserted 500 dummy records into advies_hoofd
> (without a reference to advies_type).
>
Right. And the "broken" query used an index scan, while the working one
does not. So that's another hint it's some thinko in using the index.

Another thing you might try is disabling the merge join by

  SET enable_mergejoin = off;

or even the index scans (enable_indexscan=off), in which case the query
should probably flip to the hashjoin plan. Then you can compare the
EXPLAIN ANALYZE for the two plans (mergejoin returning the wrong number
of results, and hashjoin returning the correct number). The plans do
align pretty well, except for the join type, so comparing the "rows"
should tell you which of the branches is wrong ...

From the two plans it seems the first Append subplan keeps returning 9
rows, so it's probably the hashjoin - I wonder if it might be about the
advies_hoofdo16 index used with the ANY() clause. Hard to say, I don't
know if there's some difference in the data. But you might try dropping
that index, and see if that fixes it.

If yes, I wonder if the result of

    SELECT * FROM ADVIES_HOOFD WHERE OWNERS_ID IN (1,32,25)

changes depending on the plan type (if you force it to use / not use the
index). That'd be a much simpler query to investigate.


> After that it was still giving 0 records, but then I ran:
>
> vacuum analyze advies_hoofd
> vacuum analyze advies_type
>
> When I run the query now I get 1 record and it's the 1000001002 record,
> so now my local database is malfunctioning too and apparently in the
> same way as the other database. I don't see anything special about this
> record, it's somewhere in the middle.
>
> At least I know how to break it now, I will work this into a standalone
> testcase monday.
>
> Regards,
>
> Jan

Thanks!

Tomas



Re: BUG #18766: not exists sometimes gives too few records

From
David Rowley
Date:
On Sat, 4 Jan 2025 at 06:28, PG Bug reporting form
<noreply@postgresql.org> wrote:
> All databases are running PostgreSQL 16.2 on Windows, to be specific:

You should really be running 16.6.   There was a bug fixed in 16.4
that sounds very likely to solve your problem, namely:

"Avoid incorrect results from Merge Right Anti Join plans (Richard
Guo)".  See [1]

If you still get wrong results with 16.6, please let us know.

David

[1] https://www.postgresql.org/docs/release/16.4/



Re: BUG #18766: not exists sometimes gives too few records

From
Jan Kort
Date:
Thank you, upgrading to 16.6 worked.

Regards,

Jan

--

Met vriendelijke groet,

Jan Kort | Ontwikkelaar
R&D

T. (+31) 36 54 00 850
M. (+31) 6 0000 00 00
A. Postbus 1268, 1300 BG Almere
B. Bouwmeesterweg 8, 1333 LC Almere

Aanwezig: ma | di | woe | do | vrij


On Sat, 4 Jan 2025 at 13:15, David Rowley <dgrowleyml@gmail.com> wrote:
On Sat, 4 Jan 2025 at 06:28, PG Bug reporting form
<noreply@postgresql.org> wrote:
> All databases are running PostgreSQL 16.2 on Windows, to be specific:

You should really be running 16.6.   There was a bug fixed in 16.4
that sounds very likely to solve your problem, namely:

"Avoid incorrect results from Merge Right Anti Join plans (Richard
Guo)".  See [1]

If you still get wrong results with 16.6, please let us know.

David

[1] https://www.postgresql.org/docs/release/16.4/