Re: BUG #18690: A count function returns wrong value when using FDW - Mailing list pgsql-bugs

From Aidar Uldanov
Subject Re: BUG #18690: A count function returns wrong value when using FDW
Date
Msg-id CAOCMfz=GKA+90BC0LXyuGYPeTT5u1mF_aVxXHHBpG+zBfL_Y=g@mail.gmail.com
Whole thread Raw
In response to Re: BUG #18690: A count function returns wrong value when using FDW  (Aidar Uldanov <aidar@donorbox.org>)
Responses Re: BUG #18690: A count function returns wrong value when using FDW
List pgsql-bugs
David, after some troubleshooting I found out that the issue with FWD and bigint values, so when column type int then it works well but after changing to a bigint type a count() function returns the wrong value.


Here is how to reproduce that


-- ===> remote db schema and data
create table orgs(id int, name varchar, provider_id int);

 create table recurring_payments(
  id int primary key,
  org_id int
 );

 create table transactions(
  id serial primary key,
  amount int,
  org_id int,
  recurring_payment_id int REFERENCES recurring_payments(id)
 );


-- populate remote db
insert into orgs(id, name, provider_id) values (1, 'one', 11), (2, 'two', 22);
insert into recurring_payments(id, org_id) values (1, 1), (2, 1), (3, 2);
insert into transactions(amount, org_id, recurring_payment_id) values (1, 1, 1), (2, 1, null), (3, 1, null), (4, 1, 1);


-- <=== remote db


-- ===> host db

-- connect back to base database and create server and fdw
CREATE EXTENSION postgres_fdw;

CREATE SERVER foreign_server
        FOREIGN DATA WRAPPER postgres_fdw
        OPTIONS (host 'localhost', port '5432', dbname 'test3');

CREATE USER MAPPING FOR postgres
        SERVER foreign_server
        OPTIONS (user 'postgres', password '12345');

CREATE FOREIGN TABLE orgs_f (
  id int, name varchar, provider_id int
)
SERVER foreign_server
OPTIONS (schema_name 'public', table_name 'orgs');

CREATE FOREIGN TABLE recurring_payments_f (
  id int, org_id int
)
SERVER foreign_server
OPTIONS (schema_name 'public', table_name 'recurring_payments');


CREATE FOREIGN TABLE transactions_f (
    id serial,
  amount bigint,
  org_id int,
  recurring_payment_id int
)
SERVER foreign_server
OPTIONS (schema_name 'public', table_name 'transactions');


create view transactions_v as
select t.*, o.provider_id as real_org_id
from transactions_f t
join orgs_f o on o.id = t.org_id;

create view recurring_payments_v as
select rp.*, o.provider_id as real_org_id
from recurring_payments_f rp
join orgs_f o on o.id = rp.org_id;

-- This one returns (1 + 4) / 2 = 2.5 which is correct
select avg(t.amount)
from transactions_v t
join recurring_payments_v rp on rp.id = t.recurring_payment_id
where t.real_org_id = 11 and rp.real_org_id = 11;

-- wrapping this to count gives 2 which is not correct
select count(*) from (
select avg(t.amount)
from transactions_v t
join recurring_payments_v rp on rp.id = t.recurring_payment_id
where t.real_org_id = 11 and rp.real_org_id = 11
);

On Wed, Nov 6, 2024 at 3:47 PM Aidar Uldanov <aidar@donorbox.org> wrote:
Thanks David

I am trying to reproduce it on a new db but I couldn't so far, though I was able to simplify the queries

```select count(*)
FROM (
SELECT AVG(t.usd_amount_cents / 100) AS average
FROM transactions t
JOIN recurring_payments r ON r.id = t.recurring_payment_id
WHERE t.org_id = 1 AND r.org_id = 1 and  t.status = 'paid'                                                        
);
```
 count
-------
 22429
(1 row)

```
select count(*)
FROM (
SELECT AVG(t.usd_amount_cents / 100) AS average
FROM transactions t
JOIN recurring_payments r ON r.id = t.recurring_payment_id
WHERE t.status = 'paid'                                                        
);
```
 count
-------
     1
(1 row)

Those queries are the same and both should return 1 because the inner query returns just one AVG value. The difference between those two queries is `t.org_id = 1 AND r.org_id = 1`


Here are SQL explains

explain verbose
select count(*)
FROM (
SELECT AVG(t.usd_amount_cents / 100) AS average
FROM transactions t
JOIN recurring_payments r ON r.id = t.recurring_payment_id
WHERE t.org_id = 1 AND r.org_id = 1 and  t.status = 'paid'
);
                                                                                                                                                                       QUERY PLAN                                                            
                                                                                                         
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------
 Aggregate  (cost=213.22..213.22 rows=1 width=8)
   Output: count(*)
   ->  Foreign Scan  (cost=102.30..213.21 rows=1 width=32)
         Output: NULL::numeric
         Relations: Aggregate on ((((transactions t) INNER JOIN (orgs o)) INNER JOIN (recurring_payments rp)) INNER JOIN (orgs o_1))
         Remote SQL: SELECT NULL::numeric FROM (((public.transactions r4 INNER JOIN public.orgs r5 ON (((r4.org_id = r5.id)) AND ((r5.provider_id = 1)) AND ((r4.status = 'paid')))) INNER JOIN public.recurring_payments r7 ON (((r4.recurrin
g_payment_id = r7.id)))) INNER JOIN public.orgs r8 ON (((r7.org_id = r8.id)) AND ((r8.provider_id = 1))))
(7 rows)


explain verbose
select count(*)
FROM (
SELECT AVG(t.usd_amount_cents / 100) AS average
FROM transactions t
JOIN recurring_payments r ON r.id = t.recurring_payment_id
WHERE t.status = 'paid'                                                    
);
                                                                                                                                          QUERY PLAN                                                                                          
                                                 
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-------------------------------------------------
 Aggregate  (cost=15478.69..15478.69 rows=1 width=8)
   Output: count(*)
   ->  Aggregate  (cost=15478.69..15478.69 rows=1 width=32)
         Output: NULL::numeric
         ->  Foreign Scan  (cost=396.66..15478.69 rows=820775 width=32)
               Relations: (((transactions t) INNER JOIN (orgs o)) INNER JOIN (recurring_payments rp)) INNER JOIN (orgs o_1)
               Remote SQL: SELECT NULL FROM (((public.transactions r4 INNER JOIN public.orgs r5 ON (((r4.org_id = r5.id)) AND ((r4.status = 'paid')))) INNER JOIN public.recurring_payments r7 ON (((r4.recurring_payment_id = r7.id)))) INNER
 JOIN public.orgs r8 ON (((r7.org_id = r8.id))))
(8 rows)

Thanks and regards
Aidar

On Wed, Nov 6, 2024 at 12:03 PM David Rowley <dgrowleyml@gmail.com> wrote:
On Wed, 6 Nov 2024 at 22:58, PG Bug reporting form
<noreply@postgresql.org> wrote:
> ```
> WITH
>
>         transactions AS ( SELECT * FROM transactions_fwd WHERE org_id = 1
> ),
>         recurring_payments AS ( SELECT * FROM recurring_payments_fwd WHERE
> org_id = 1 )
> SELECT AVG(t.usd_amount_cents / 100) AS average_donation_usd
> FROM transactions t
> JOIN recurring_payments r ON r.id = t.recurring_payment_id
> WHERE t.status = 'paid' AND t.deleted_at IS NULL AND t.amount_refunded_cents
> = 0
> ```
>
> But when I wrap it to `select count(*) from (my_query)` i get 288 as a
> result, but I expect it to be 1.
>
> ```
> select count(*)
> FROM (
> WITH
>
>         transactions AS ( SELECT * FROM transactions_fwd WHERE org_id = 1
> ),
>         recurring_payments AS ( SELECT * FROM recurring_payments_fwd WHERE
> org_id = 1 )
> SELECT AVG(t.usd_amount_cents / 100) AS average_donation_usd
> FROM transactions t
> JOIN recurring_payments r ON r.id = t.recurring_payment_id
> WHERE t.status = 'paid' AND t.deleted_at IS NULL AND t.amount_refunded_cents
> = 0
> );
> ```

Can you show the EXPLAIN ANALYZE of both of these queries?

Also, can you work on getting a self-contained reproducer that we can
run to recreate the issue.

David

pgsql-bugs by date:

Previous
From: Aidar Uldanov
Date:
Subject: Re: BUG #18690: A count function returns wrong value when using FDW
Next
From: Tom Lane
Date:
Subject: Re: BUG #18690: A count function returns wrong value when using FDW