Thread: array_agg and/or =ANY doesn't appear to be functioning as I expect
Greetings All, I'm having an issue which is very perplexing. The having clause in a query doesn't appear to be working as I expect it. Either that or my understanding of array_agg() is flawed. I'm using PostgreSQL 10.1, compiled by Visual C++ build 1800, 64-bit. The table in question looks like: CREATE TABLE confounded.dataset ( seq integer, path_seq integer, start_vid bigint, end_vid bigint, node bigint, edge bigint, cost double precision, agg_cost double precision, st_length double precision, truth boolean ); The following query returns 3 rows: with listing as ( select start_vid, end_vid, array_agg(node order by path_seq) node_array, array_agg(edge order by path_seq) edge_array from confounded.dataset group by start_vid,end_vid having true =ALL (array_agg(truth)) ) select * from listing; The issue is, if I unnest the node_array column from the listing cte and do a subselect on confounded.dataset I get back rows where truth = false. with listing as ( select start_vid, end_vid, array_agg(node order by path_seq) node_array, array_agg(edge order by path_seq) edge_array from confounded.dataset group by start_vid,end_vid having true =ALL (array_agg(truth)) ) select count(*) from confounded.dataset where node in (select distinct unnest(node_array) from listing) and truth = false; I would expect the above query to return 0 rows. the dataset can be found at https://gist.github.com/rhysallister/59239c76e8ec265b81777038bf272879 Rhys Peace & Love|Live Long & Prosper
Hi Rhys, Am 2018-01-21 um 02:42 schrieb Rhys A.D. Stewart: > Greetings All, > I'm having an issue which is very perplexing. The having clause in a > query doesn't appear to be working as I expect it. Either that or my > understanding of array_agg() is flawed. > > [...] > > with listing as ( > select start_vid, end_vid, array_agg(node order by path_seq) > node_array, array_agg(edge order by path_seq) edge_array > from confounded.dataset > group by start_vid,end_vid > having true =ALL (array_agg(truth)) > ) > select count(*) from confounded.dataset > where node in (select distinct unnest(node_array) from listing) and > truth = false; > > I would expect the above query to return 0 rows. the answer is in your data: "node" is not a UNIQUE field, and there are node values with multiple rows. e.g. node=977 has one row with truth=true and one with truth=false. So what your second query really does is "select all node values from listing for which another entry with truth=false exists in the dataset". Presuming that "seq" is a primary key [although not declared], you probably meant to restrict your query on that. Best regards, -hannes
Re: array_agg and/or =ANY doesn't appear to be functioning as I expect
From
"Rhys A.D. Stewart"
Date:
Hannes,
Regards,
Thanks for your observations...... Will take a look at the data.
Rhys
On Jan 20, 2018 11:00 PM, "Hannes Erven" <hannes@erven.at> wrote:
Hi Rhys,> [...]
Am 2018-01-21 um 02:42 schrieb Rhys A.D. Stewart:Greetings All,>
I'm having an issue which is very perplexing. The having clause in a
query doesn't appear to be working as I expect it. Either that or my
understanding of array_agg() is flawed.the answer is in your data: "node" is not a UNIQUE field, and there are node values with multiple rows.
with listing as (
select start_vid, end_vid, array_agg(node order by path_seq)
node_array, array_agg(edge order by path_seq) edge_array
from confounded.dataset
group by start_vid,end_vid
having true =ALL (array_agg(truth))
)
select count(*) from confounded.dataset
where node in (select distinct unnest(node_array) from listing) and
truth = false;
I would expect the above query to return 0 rows.
e.g. node=977 has one row with truth=true and one with truth=false.
So what your second query really does is "select all node values from listing for which another entry with truth=false exists in the dataset".
Presuming that "seq" is a primary key [although not declared], you probably meant to restrict your query on that.
Best regards,
-hannes