Re: BUG #18690: A count function returns wrong value when using FDW - Mailing list pgsql-bugs
From | Aidar Uldanov |
---|---|
Subject | Re: BUG #18690: A count function returns wrong value when using FDW |
Date | |
Msg-id | CAOCMfz=GKA+90BC0LXyuGYPeTT5u1mF_aVxXHHBpG+zBfL_Y=g@mail.gmail.com Whole thread Raw |
In response to | Re: BUG #18690: A count function returns wrong value when using FDW (Aidar Uldanov <aidar@donorbox.org>) |
Responses |
Re: BUG #18690: A count function returns wrong value when using FDW
|
List | pgsql-bugs |
David, after some troubleshooting I found out that the issue with FWD and bigint values, so when column type int then it works well but after changing to a bigint type a count() function returns the wrong value.
Here is how to reproduce that
-- ===> remote db schema and data
create table orgs(id int, name varchar, provider_id int);
create table recurring_payments(
id int primary key,
org_id int
);
create table transactions(
id serial primary key,
amount int,
org_id int,
recurring_payment_id int REFERENCES recurring_payments(id)
);
-- populate remote db
insert into orgs(id, name, provider_id) values (1, 'one', 11), (2, 'two', 22);
insert into recurring_payments(id, org_id) values (1, 1), (2, 1), (3, 2);
insert into transactions(amount, org_id, recurring_payment_id) values (1, 1, 1), (2, 1, null), (3, 1, null), (4, 1, 1);
-- <=== remote db
-- ===> host db
-- connect back to base database and create server and fdw
CREATE EXTENSION postgres_fdw;
CREATE SERVER foreign_server
FOREIGN DATA WRAPPER postgres_fdw
OPTIONS (host 'localhost', port '5432', dbname 'test3');
CREATE USER MAPPING FOR postgres
SERVER foreign_server
OPTIONS (user 'postgres', password '12345');
CREATE FOREIGN TABLE orgs_f (
id int, name varchar, provider_id int
)
SERVER foreign_server
OPTIONS (schema_name 'public', table_name 'orgs');
CREATE FOREIGN TABLE recurring_payments_f (
id int, org_id int
)
SERVER foreign_server
OPTIONS (schema_name 'public', table_name 'recurring_payments');
CREATE FOREIGN TABLE transactions_f (
id serial,
amount bigint,
org_id int,
recurring_payment_id int
)
SERVER foreign_server
OPTIONS (schema_name 'public', table_name 'transactions');
create view transactions_v as
select t.*, o.provider_id as real_org_id
from transactions_f t
join orgs_f o on o.id = t.org_id;
create view recurring_payments_v as
select rp.*, o.provider_id as real_org_id
from recurring_payments_f rp
join orgs_f o on o.id = rp.org_id;
-- This one returns (1 + 4) / 2 = 2.5 which is correct
select avg(t.amount)
from transactions_v t
join recurring_payments_v rp on rp.id = t.recurring_payment_id
where t.real_org_id = 11 and rp.real_org_id = 11;
-- wrapping this to count gives 2 which is not correct
select count(*) from (
select avg(t.amount)
from transactions_v t
join recurring_payments_v rp on rp.id = t.recurring_payment_id
where t.real_org_id = 11 and rp.real_org_id = 11
);
-- ===> remote db schema and data
create table orgs(id int, name varchar, provider_id int);
create table recurring_payments(
id int primary key,
org_id int
);
create table transactions(
id serial primary key,
amount int,
org_id int,
recurring_payment_id int REFERENCES recurring_payments(id)
);
-- populate remote db
insert into orgs(id, name, provider_id) values (1, 'one', 11), (2, 'two', 22);
insert into recurring_payments(id, org_id) values (1, 1), (2, 1), (3, 2);
insert into transactions(amount, org_id, recurring_payment_id) values (1, 1, 1), (2, 1, null), (3, 1, null), (4, 1, 1);
-- <=== remote db
-- ===> host db
-- connect back to base database and create server and fdw
CREATE EXTENSION postgres_fdw;
CREATE SERVER foreign_server
FOREIGN DATA WRAPPER postgres_fdw
OPTIONS (host 'localhost', port '5432', dbname 'test3');
CREATE USER MAPPING FOR postgres
SERVER foreign_server
OPTIONS (user 'postgres', password '12345');
CREATE FOREIGN TABLE orgs_f (
id int, name varchar, provider_id int
)
SERVER foreign_server
OPTIONS (schema_name 'public', table_name 'orgs');
CREATE FOREIGN TABLE recurring_payments_f (
id int, org_id int
)
SERVER foreign_server
OPTIONS (schema_name 'public', table_name 'recurring_payments');
CREATE FOREIGN TABLE transactions_f (
id serial,
amount bigint,
org_id int,
recurring_payment_id int
)
SERVER foreign_server
OPTIONS (schema_name 'public', table_name 'transactions');
create view transactions_v as
select t.*, o.provider_id as real_org_id
from transactions_f t
join orgs_f o on o.id = t.org_id;
create view recurring_payments_v as
select rp.*, o.provider_id as real_org_id
from recurring_payments_f rp
join orgs_f o on o.id = rp.org_id;
-- This one returns (1 + 4) / 2 = 2.5 which is correct
select avg(t.amount)
from transactions_v t
join recurring_payments_v rp on rp.id = t.recurring_payment_id
where t.real_org_id = 11 and rp.real_org_id = 11;
-- wrapping this to count gives 2 which is not correct
select count(*) from (
select avg(t.amount)
from transactions_v t
join recurring_payments_v rp on rp.id = t.recurring_payment_id
where t.real_org_id = 11 and rp.real_org_id = 11
);
On Wed, Nov 6, 2024 at 3:47 PM Aidar Uldanov <aidar@donorbox.org> wrote:
Thanks David
I am trying to reproduce it on a new db but I couldn't so far, though I was able to simplify the queries
```select count(*)
FROM (
SELECT AVG(t.usd_amount_cents / 100) AS average
FROM transactions t
JOIN recurring_payments r ON r.id = t.recurring_payment_id
WHERE t.org_id = 1 AND r.org_id = 1 and t.status = 'paid'
);
```
count
-------
22429
(1 row)
```
select count(*)
FROM (
SELECT AVG(t.usd_amount_cents / 100) AS average
FROM transactions t
JOIN recurring_payments r ON r.id = t.recurring_payment_id
WHERE t.status = 'paid'
);
```
count
-------
1
(1 row)
Those queries are the same and both should return 1 because the inner query returns just one AVG value. The difference between those two queries is `t.org_id = 1 AND r.org_id = 1`Here are SQL explains
explain verbose
select count(*)
FROM (
SELECT AVG(t.usd_amount_cents / 100) AS average
FROM transactions t
JOIN recurring_payments r ON r.id = t.recurring_payment_id
WHERE t.org_id = 1 AND r.org_id = 1 and t.status = 'paid'
);
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------
Aggregate (cost=213.22..213.22 rows=1 width=8)
Output: count(*)
-> Foreign Scan (cost=102.30..213.21 rows=1 width=32)
Output: NULL::numeric
Relations: Aggregate on ((((transactions t) INNER JOIN (orgs o)) INNER JOIN (recurring_payments rp)) INNER JOIN (orgs o_1))
Remote SQL: SELECT NULL::numeric FROM (((public.transactions r4 INNER JOIN public.orgs r5 ON (((r4.org_id = r5.id)) AND ((r5.provider_id = 1)) AND ((r4.status = 'paid')))) INNER JOIN public.recurring_payments r7 ON (((r4.recurrin
g_payment_id = r7.id)))) INNER JOIN public.orgs r8 ON (((r7.org_id = r8.id)) AND ((r8.provider_id = 1))))
(7 rows)explain verbose
select count(*)
FROM (
SELECT AVG(t.usd_amount_cents / 100) AS average
FROM transactions t
JOIN recurring_payments r ON r.id = t.recurring_payment_id
WHERE t.status = 'paid'
);
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-------------------------------------------------
Aggregate (cost=15478.69..15478.69 rows=1 width=8)
Output: count(*)
-> Aggregate (cost=15478.69..15478.69 rows=1 width=32)
Output: NULL::numeric
-> Foreign Scan (cost=396.66..15478.69 rows=820775 width=32)
Relations: (((transactions t) INNER JOIN (orgs o)) INNER JOIN (recurring_payments rp)) INNER JOIN (orgs o_1)
Remote SQL: SELECT NULL FROM (((public.transactions r4 INNER JOIN public.orgs r5 ON (((r4.org_id = r5.id)) AND ((r4.status = 'paid')))) INNER JOIN public.recurring_payments r7 ON (((r4.recurring_payment_id = r7.id)))) INNER
JOIN public.orgs r8 ON (((r7.org_id = r8.id))))(8 rows)
Thanks and regards
AidarOn Wed, Nov 6, 2024 at 12:03 PM David Rowley <dgrowleyml@gmail.com> wrote:On Wed, 6 Nov 2024 at 22:58, PG Bug reporting form
<noreply@postgresql.org> wrote:
> ```
> WITH
>
> transactions AS ( SELECT * FROM transactions_fwd WHERE org_id = 1
> ),
> recurring_payments AS ( SELECT * FROM recurring_payments_fwd WHERE
> org_id = 1 )
> SELECT AVG(t.usd_amount_cents / 100) AS average_donation_usd
> FROM transactions t
> JOIN recurring_payments r ON r.id = t.recurring_payment_id
> WHERE t.status = 'paid' AND t.deleted_at IS NULL AND t.amount_refunded_cents
> = 0
> ```
>
> But when I wrap it to `select count(*) from (my_query)` i get 288 as a
> result, but I expect it to be 1.
>
> ```
> select count(*)
> FROM (
> WITH
>
> transactions AS ( SELECT * FROM transactions_fwd WHERE org_id = 1
> ),
> recurring_payments AS ( SELECT * FROM recurring_payments_fwd WHERE
> org_id = 1 )
> SELECT AVG(t.usd_amount_cents / 100) AS average_donation_usd
> FROM transactions t
> JOIN recurring_payments r ON r.id = t.recurring_payment_id
> WHERE t.status = 'paid' AND t.deleted_at IS NULL AND t.amount_refunded_cents
> = 0
> );
> ```
Can you show the EXPLAIN ANALYZE of both of these queries?
Also, can you work on getting a self-contained reproducer that we can
run to recreate the issue.
David
pgsql-bugs by date: