Thread: BUG #18690: A count function returns wrong value when using FDW
The following bug has been logged on the website: Bug reference: 18690 Logged by: Aidar Uldanov Email address: aidar@donorbox.org PostgreSQL version: 16.4 Operating system: Ubuntu Description: My following query returns a single average_donation_usd value ``` WITH transactions AS ( SELECT * FROM transactions_fwd WHERE org_id = 1 ), recurring_payments AS ( SELECT * FROM recurring_payments_fwd WHERE org_id = 1 ) SELECT AVG(t.usd_amount_cents / 100) AS average_donation_usd FROM transactions t JOIN recurring_payments r ON r.id = t.recurring_payment_id WHERE t.status = 'paid' AND t.deleted_at IS NULL AND t.amount_refunded_cents = 0 ``` But when I wrap it to `select count(*) from (my_query)` i get 288 as a result, but I expect it to be 1. ``` select count(*) FROM ( WITH transactions AS ( SELECT * FROM transactions_fwd WHERE org_id = 1 ), recurring_payments AS ( SELECT * FROM recurring_payments_fwd WHERE org_id = 1 ) SELECT AVG(t.usd_amount_cents / 100) AS average_donation_usd FROM transactions t JOIN recurring_payments r ON r.id = t.recurring_payment_id WHERE t.status = 'paid' AND t.deleted_at IS NULL AND t.amount_refunded_cents = 0 ); ``` I tried to create materialized views to see it this bug relates to using FWD tables. ``` CREATE MATERIALIZED VIEW transactions AS SELECT * FROM transactions_fwd WHERE org_id = 1; CREATE MATERIALIZED VIEW recurring_payments AS SELECT * FROM recurring_payments_fwd WHERE org_id = 1; ``` And it worked well returning me 1 as a result. ``` select count(*) FROM ( WITH transactions AS ( SELECT * FROM transactions WHERE org_id = 1 ), recurring_payments AS ( SELECT * FROM recurring_payments WHERE org_id = 1 ) SELECT AVG(t.usd_amount_cents / 100) AS average_donation_usd FROM transactions t JOIN recurring_payments r ON r.id = t.recurring_payment_id WHERE t.status = 'paid' AND t.deleted_at IS NULL AND t.amount_refunded_cents = 0 ); ``` My OS and Postgres version (hosted on heroku) ``` User => select version(); version ----------------------------------------------------------------------------------------------------------------------------------- PostgreSQL 16.4 (Ubuntu 16.4-1.pgdg20.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0, 64-bit (1 row) ```
On Wed, 6 Nov 2024 at 22:58, PG Bug reporting form <noreply@postgresql.org> wrote: > ``` > WITH > > transactions AS ( SELECT * FROM transactions_fwd WHERE org_id = 1 > ), > recurring_payments AS ( SELECT * FROM recurring_payments_fwd WHERE > org_id = 1 ) > SELECT AVG(t.usd_amount_cents / 100) AS average_donation_usd > FROM transactions t > JOIN recurring_payments r ON r.id = t.recurring_payment_id > WHERE t.status = 'paid' AND t.deleted_at IS NULL AND t.amount_refunded_cents > = 0 > ``` > > But when I wrap it to `select count(*) from (my_query)` i get 288 as a > result, but I expect it to be 1. > > ``` > select count(*) > FROM ( > WITH > > transactions AS ( SELECT * FROM transactions_fwd WHERE org_id = 1 > ), > recurring_payments AS ( SELECT * FROM recurring_payments_fwd WHERE > org_id = 1 ) > SELECT AVG(t.usd_amount_cents / 100) AS average_donation_usd > FROM transactions t > JOIN recurring_payments r ON r.id = t.recurring_payment_id > WHERE t.status = 'paid' AND t.deleted_at IS NULL AND t.amount_refunded_cents > = 0 > ); > ``` Can you show the EXPLAIN ANALYZE of both of these queries? Also, can you work on getting a self-contained reproducer that we can run to recreate the issue. David
Thanks David
I am trying to reproduce it on a new db but I couldn't so far, though I was able to simplify the queries
```select count(*)
FROM (
SELECT AVG(t.usd_amount_cents / 100) AS average
FROM transactions t
JOIN recurring_payments r ON r.id = t.recurring_payment_id
WHERE t.org_id = 1 AND r.org_id = 1 and t.status = 'paid'
);
```
count
-------
22429
(1 row)
```
select count(*)
FROM (
SELECT AVG(t.usd_amount_cents / 100) AS average
FROM transactions t
JOIN recurring_payments r ON r.id = t.recurring_payment_id
WHERE t.status = 'paid'
);
```
count
-------
1
(1 row)
Those queries are the same and both should return 1 because the inner query returns just one AVG value. The difference between those two queries is `t.org_id = 1 AND r.org_id = 1`
I am trying to reproduce it on a new db but I couldn't so far, though I was able to simplify the queries
```select count(*)
FROM (
SELECT AVG(t.usd_amount_cents / 100) AS average
FROM transactions t
JOIN recurring_payments r ON r.id = t.recurring_payment_id
WHERE t.org_id = 1 AND r.org_id = 1 and t.status = 'paid'
);
```
count
-------
22429
(1 row)
```
select count(*)
FROM (
SELECT AVG(t.usd_amount_cents / 100) AS average
FROM transactions t
JOIN recurring_payments r ON r.id = t.recurring_payment_id
WHERE t.status = 'paid'
);
```
count
-------
1
(1 row)
Those queries are the same and both should return 1 because the inner query returns just one AVG value. The difference between those two queries is `t.org_id = 1 AND r.org_id = 1`
Here are SQL explains
explain verbose
select count(*)
FROM (
SELECT AVG(t.usd_amount_cents / 100) AS average
FROM transactions t
JOIN recurring_payments r ON r.id = t.recurring_payment_id
WHERE t.org_id = 1 AND r.org_id = 1 and t.status = 'paid'
);
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------
Aggregate (cost=213.22..213.22 rows=1 width=8)
Output: count(*)
-> Foreign Scan (cost=102.30..213.21 rows=1 width=32)
Output: NULL::numeric
Relations: Aggregate on ((((transactions t) INNER JOIN (orgs o)) INNER JOIN (recurring_payments rp)) INNER JOIN (orgs o_1))
Remote SQL: SELECT NULL::numeric FROM (((public.transactions r4 INNER JOIN public.orgs r5 ON (((r4.org_id = r5.id)) AND ((r5.provider_id = 1)) AND ((r4.status = 'paid')))) INNER JOIN public.recurring_payments r7 ON (((r4.recurrin
g_payment_id = r7.id)))) INNER JOIN public.orgs r8 ON (((r7.org_id = r8.id)) AND ((r8.provider_id = 1))))
(7 rows)
explain verbose
select count(*)
FROM (
SELECT AVG(t.usd_amount_cents / 100) AS average
FROM transactions t
JOIN recurring_payments r ON r.id = t.recurring_payment_id
WHERE t.org_id = 1 AND r.org_id = 1 and t.status = 'paid'
);
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------
Aggregate (cost=213.22..213.22 rows=1 width=8)
Output: count(*)
-> Foreign Scan (cost=102.30..213.21 rows=1 width=32)
Output: NULL::numeric
Relations: Aggregate on ((((transactions t) INNER JOIN (orgs o)) INNER JOIN (recurring_payments rp)) INNER JOIN (orgs o_1))
Remote SQL: SELECT NULL::numeric FROM (((public.transactions r4 INNER JOIN public.orgs r5 ON (((r4.org_id = r5.id)) AND ((r5.provider_id = 1)) AND ((r4.status = 'paid')))) INNER JOIN public.recurring_payments r7 ON (((r4.recurrin
g_payment_id = r7.id)))) INNER JOIN public.orgs r8 ON (((r7.org_id = r8.id)) AND ((r8.provider_id = 1))))
(7 rows)
explain verbose
select count(*)
FROM (
SELECT AVG(t.usd_amount_cents / 100) AS average
FROM transactions t
JOIN recurring_payments r ON r.id = t.recurring_payment_id
WHERE t.status = 'paid'
);
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-------------------------------------------------
Aggregate (cost=15478.69..15478.69 rows=1 width=8)
Output: count(*)
-> Aggregate (cost=15478.69..15478.69 rows=1 width=32)
Output: NULL::numeric
-> Foreign Scan (cost=396.66..15478.69 rows=820775 width=32)
Relations: (((transactions t) INNER JOIN (orgs o)) INNER JOIN (recurring_payments rp)) INNER JOIN (orgs o_1)
Remote SQL: SELECT NULL FROM (((public.transactions r4 INNER JOIN public.orgs r5 ON (((r4.org_id = r5.id)) AND ((r4.status = 'paid')))) INNER JOIN public.recurring_payments r7 ON (((r4.recurring_payment_id = r7.id)))) INNER
JOIN public.orgs r8 ON (((r7.org_id = r8.id))))
select count(*)
FROM (
SELECT AVG(t.usd_amount_cents / 100) AS average
FROM transactions t
JOIN recurring_payments r ON r.id = t.recurring_payment_id
WHERE t.status = 'paid'
);
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-------------------------------------------------
Aggregate (cost=15478.69..15478.69 rows=1 width=8)
Output: count(*)
-> Aggregate (cost=15478.69..15478.69 rows=1 width=32)
Output: NULL::numeric
-> Foreign Scan (cost=396.66..15478.69 rows=820775 width=32)
Relations: (((transactions t) INNER JOIN (orgs o)) INNER JOIN (recurring_payments rp)) INNER JOIN (orgs o_1)
Remote SQL: SELECT NULL FROM (((public.transactions r4 INNER JOIN public.orgs r5 ON (((r4.org_id = r5.id)) AND ((r4.status = 'paid')))) INNER JOIN public.recurring_payments r7 ON (((r4.recurring_payment_id = r7.id)))) INNER
JOIN public.orgs r8 ON (((r7.org_id = r8.id))))
(8 rows)
Thanks and regards
Aidar
Thanks and regards
Aidar
On Wed, Nov 6, 2024 at 12:03 PM David Rowley <dgrowleyml@gmail.com> wrote:
On Wed, 6 Nov 2024 at 22:58, PG Bug reporting form
<noreply@postgresql.org> wrote:
> ```
> WITH
>
> transactions AS ( SELECT * FROM transactions_fwd WHERE org_id = 1
> ),
> recurring_payments AS ( SELECT * FROM recurring_payments_fwd WHERE
> org_id = 1 )
> SELECT AVG(t.usd_amount_cents / 100) AS average_donation_usd
> FROM transactions t
> JOIN recurring_payments r ON r.id = t.recurring_payment_id
> WHERE t.status = 'paid' AND t.deleted_at IS NULL AND t.amount_refunded_cents
> = 0
> ```
>
> But when I wrap it to `select count(*) from (my_query)` i get 288 as a
> result, but I expect it to be 1.
>
> ```
> select count(*)
> FROM (
> WITH
>
> transactions AS ( SELECT * FROM transactions_fwd WHERE org_id = 1
> ),
> recurring_payments AS ( SELECT * FROM recurring_payments_fwd WHERE
> org_id = 1 )
> SELECT AVG(t.usd_amount_cents / 100) AS average_donation_usd
> FROM transactions t
> JOIN recurring_payments r ON r.id = t.recurring_payment_id
> WHERE t.status = 'paid' AND t.deleted_at IS NULL AND t.amount_refunded_cents
> = 0
> );
> ```
Can you show the EXPLAIN ANALYZE of both of these queries?
Also, can you work on getting a self-contained reproducer that we can
run to recreate the issue.
David
David, after some troubleshooting I found out that the issue with FWD and bigint values, so when column type int then it works well but after changing to a bigint type a count() function returns the wrong value.
Here is how to reproduce that
-- ===> remote db schema and data
create table orgs(id int, name varchar, provider_id int);
create table recurring_payments(
id int primary key,
org_id int
);
create table transactions(
id serial primary key,
amount int,
org_id int,
recurring_payment_id int REFERENCES recurring_payments(id)
);
-- populate remote db
insert into orgs(id, name, provider_id) values (1, 'one', 11), (2, 'two', 22);
insert into recurring_payments(id, org_id) values (1, 1), (2, 1), (3, 2);
insert into transactions(amount, org_id, recurring_payment_id) values (1, 1, 1), (2, 1, null), (3, 1, null), (4, 1, 1);
-- <=== remote db
-- ===> host db
-- connect back to base database and create server and fdw
CREATE EXTENSION postgres_fdw;
CREATE SERVER foreign_server
FOREIGN DATA WRAPPER postgres_fdw
OPTIONS (host 'localhost', port '5432', dbname 'test3');
CREATE USER MAPPING FOR postgres
SERVER foreign_server
OPTIONS (user 'postgres', password '12345');
CREATE FOREIGN TABLE orgs_f (
id int, name varchar, provider_id int
)
SERVER foreign_server
OPTIONS (schema_name 'public', table_name 'orgs');
CREATE FOREIGN TABLE recurring_payments_f (
id int, org_id int
)
SERVER foreign_server
OPTIONS (schema_name 'public', table_name 'recurring_payments');
CREATE FOREIGN TABLE transactions_f (
id serial,
amount bigint,
org_id int,
recurring_payment_id int
)
SERVER foreign_server
OPTIONS (schema_name 'public', table_name 'transactions');
create view transactions_v as
select t.*, o.provider_id as real_org_id
from transactions_f t
join orgs_f o on o.id = t.org_id;
create view recurring_payments_v as
select rp.*, o.provider_id as real_org_id
from recurring_payments_f rp
join orgs_f o on o.id = rp.org_id;
-- This one returns (1 + 4) / 2 = 2.5 which is correct
select avg(t.amount)
from transactions_v t
join recurring_payments_v rp on rp.id = t.recurring_payment_id
where t.real_org_id = 11 and rp.real_org_id = 11;
-- wrapping this to count gives 2 which is not correct
select count(*) from (
select avg(t.amount)
from transactions_v t
join recurring_payments_v rp on rp.id = t.recurring_payment_id
where t.real_org_id = 11 and rp.real_org_id = 11
);
-- ===> remote db schema and data
create table orgs(id int, name varchar, provider_id int);
create table recurring_payments(
id int primary key,
org_id int
);
create table transactions(
id serial primary key,
amount int,
org_id int,
recurring_payment_id int REFERENCES recurring_payments(id)
);
-- populate remote db
insert into orgs(id, name, provider_id) values (1, 'one', 11), (2, 'two', 22);
insert into recurring_payments(id, org_id) values (1, 1), (2, 1), (3, 2);
insert into transactions(amount, org_id, recurring_payment_id) values (1, 1, 1), (2, 1, null), (3, 1, null), (4, 1, 1);
-- <=== remote db
-- ===> host db
-- connect back to base database and create server and fdw
CREATE EXTENSION postgres_fdw;
CREATE SERVER foreign_server
FOREIGN DATA WRAPPER postgres_fdw
OPTIONS (host 'localhost', port '5432', dbname 'test3');
CREATE USER MAPPING FOR postgres
SERVER foreign_server
OPTIONS (user 'postgres', password '12345');
CREATE FOREIGN TABLE orgs_f (
id int, name varchar, provider_id int
)
SERVER foreign_server
OPTIONS (schema_name 'public', table_name 'orgs');
CREATE FOREIGN TABLE recurring_payments_f (
id int, org_id int
)
SERVER foreign_server
OPTIONS (schema_name 'public', table_name 'recurring_payments');
CREATE FOREIGN TABLE transactions_f (
id serial,
amount bigint,
org_id int,
recurring_payment_id int
)
SERVER foreign_server
OPTIONS (schema_name 'public', table_name 'transactions');
create view transactions_v as
select t.*, o.provider_id as real_org_id
from transactions_f t
join orgs_f o on o.id = t.org_id;
create view recurring_payments_v as
select rp.*, o.provider_id as real_org_id
from recurring_payments_f rp
join orgs_f o on o.id = rp.org_id;
-- This one returns (1 + 4) / 2 = 2.5 which is correct
select avg(t.amount)
from transactions_v t
join recurring_payments_v rp on rp.id = t.recurring_payment_id
where t.real_org_id = 11 and rp.real_org_id = 11;
-- wrapping this to count gives 2 which is not correct
select count(*) from (
select avg(t.amount)
from transactions_v t
join recurring_payments_v rp on rp.id = t.recurring_payment_id
where t.real_org_id = 11 and rp.real_org_id = 11
);
On Wed, Nov 6, 2024 at 3:47 PM Aidar Uldanov <aidar@donorbox.org> wrote:
Thanks David
I am trying to reproduce it on a new db but I couldn't so far, though I was able to simplify the queries
```select count(*)
FROM (
SELECT AVG(t.usd_amount_cents / 100) AS average
FROM transactions t
JOIN recurring_payments r ON r.id = t.recurring_payment_id
WHERE t.org_id = 1 AND r.org_id = 1 and t.status = 'paid'
);
```
count
-------
22429
(1 row)
```
select count(*)
FROM (
SELECT AVG(t.usd_amount_cents / 100) AS average
FROM transactions t
JOIN recurring_payments r ON r.id = t.recurring_payment_id
WHERE t.status = 'paid'
);
```
count
-------
1
(1 row)
Those queries are the same and both should return 1 because the inner query returns just one AVG value. The difference between those two queries is `t.org_id = 1 AND r.org_id = 1`Here are SQL explains
explain verbose
select count(*)
FROM (
SELECT AVG(t.usd_amount_cents / 100) AS average
FROM transactions t
JOIN recurring_payments r ON r.id = t.recurring_payment_id
WHERE t.org_id = 1 AND r.org_id = 1 and t.status = 'paid'
);
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------
Aggregate (cost=213.22..213.22 rows=1 width=8)
Output: count(*)
-> Foreign Scan (cost=102.30..213.21 rows=1 width=32)
Output: NULL::numeric
Relations: Aggregate on ((((transactions t) INNER JOIN (orgs o)) INNER JOIN (recurring_payments rp)) INNER JOIN (orgs o_1))
Remote SQL: SELECT NULL::numeric FROM (((public.transactions r4 INNER JOIN public.orgs r5 ON (((r4.org_id = r5.id)) AND ((r5.provider_id = 1)) AND ((r4.status = 'paid')))) INNER JOIN public.recurring_payments r7 ON (((r4.recurrin
g_payment_id = r7.id)))) INNER JOIN public.orgs r8 ON (((r7.org_id = r8.id)) AND ((r8.provider_id = 1))))
(7 rows)explain verbose
select count(*)
FROM (
SELECT AVG(t.usd_amount_cents / 100) AS average
FROM transactions t
JOIN recurring_payments r ON r.id = t.recurring_payment_id
WHERE t.status = 'paid'
);
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-------------------------------------------------
Aggregate (cost=15478.69..15478.69 rows=1 width=8)
Output: count(*)
-> Aggregate (cost=15478.69..15478.69 rows=1 width=32)
Output: NULL::numeric
-> Foreign Scan (cost=396.66..15478.69 rows=820775 width=32)
Relations: (((transactions t) INNER JOIN (orgs o)) INNER JOIN (recurring_payments rp)) INNER JOIN (orgs o_1)
Remote SQL: SELECT NULL FROM (((public.transactions r4 INNER JOIN public.orgs r5 ON (((r4.org_id = r5.id)) AND ((r4.status = 'paid')))) INNER JOIN public.recurring_payments r7 ON (((r4.recurring_payment_id = r7.id)))) INNER
JOIN public.orgs r8 ON (((r7.org_id = r8.id))))(8 rows)
Thanks and regards
AidarOn Wed, Nov 6, 2024 at 12:03 PM David Rowley <dgrowleyml@gmail.com> wrote:On Wed, 6 Nov 2024 at 22:58, PG Bug reporting form
<noreply@postgresql.org> wrote:
> ```
> WITH
>
> transactions AS ( SELECT * FROM transactions_fwd WHERE org_id = 1
> ),
> recurring_payments AS ( SELECT * FROM recurring_payments_fwd WHERE
> org_id = 1 )
> SELECT AVG(t.usd_amount_cents / 100) AS average_donation_usd
> FROM transactions t
> JOIN recurring_payments r ON r.id = t.recurring_payment_id
> WHERE t.status = 'paid' AND t.deleted_at IS NULL AND t.amount_refunded_cents
> = 0
> ```
>
> But when I wrap it to `select count(*) from (my_query)` i get 288 as a
> result, but I expect it to be 1.
>
> ```
> select count(*)
> FROM (
> WITH
>
> transactions AS ( SELECT * FROM transactions_fwd WHERE org_id = 1
> ),
> recurring_payments AS ( SELECT * FROM recurring_payments_fwd WHERE
> org_id = 1 )
> SELECT AVG(t.usd_amount_cents / 100) AS average_donation_usd
> FROM transactions t
> JOIN recurring_payments r ON r.id = t.recurring_payment_id
> WHERE t.status = 'paid' AND t.deleted_at IS NULL AND t.amount_refunded_cents
> = 0
> );
> ```
Can you show the EXPLAIN ANALYZE of both of these queries?
Also, can you work on getting a self-contained reproducer that we can
run to recreate the issue.
David
Aidar Uldanov <aidar@donorbox.org> writes: > David, after some troubleshooting I found out that the issue with FWD and > bigint values, so when column type int then it works well but after > changing to a bigint type a count() function returns the wrong value. > create table transactions( > id serial primary key, > amount int, > org_id int, > recurring_payment_id int REFERENCES recurring_payments(id) > ); > ... > CREATE FOREIGN TABLE transactions_f ( > id serial, > amount bigint, > org_id int, > recurring_payment_id int > ) > SERVER foreign_server > OPTIONS (schema_name 'public', table_name 'transactions'); While it's sort of interesting that that misbehaves, I do not believe that postgres_fdw promises to work correctly with foreign-table definitions that don't match the remote table's actual column datatypes. regards, tom lane
oh sorry, I might forget to update int to bigint there. Actually both should be bigint.
First I tried with int type and it works well and after changing to bigint I got the wrong count value.
Thanks and regards
Aidar
First I tried with int type and it works well and after changing to bigint I got the wrong count value.
Thanks and regards
Aidar
On Wed, Nov 6, 2024 at 4:29 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Aidar Uldanov <aidar@donorbox.org> writes:
> David, after some troubleshooting I found out that the issue with FWD and
> bigint values, so when column type int then it works well but after
> changing to a bigint type a count() function returns the wrong value.
> create table transactions(
> id serial primary key,
> amount int,
> org_id int,
> recurring_payment_id int REFERENCES recurring_payments(id)
> );
> ...
> CREATE FOREIGN TABLE transactions_f (
> id serial,
> amount bigint,
> org_id int,
> recurring_payment_id int
> )
> SERVER foreign_server
> OPTIONS (schema_name 'public', table_name 'transactions');
While it's sort of interesting that that misbehaves, I do not believe
that postgres_fdw promises to work correctly with foreign-table
definitions that don't match the remote table's actual column
datatypes.
regards, tom lane