Thread: json datatype and table bloat?
I have a table that is triggering my nagios database bloat alert regularly. Usually I have to run VACUUM FULL more than once to get it under the threshold. Today I tried repeatedly and cannot get the alert to resolve.
I had a discussion with one of the primary developers about how the table is utilized, and it turns out they are basically only ever inserting into it. This sort of flies in the face of conventional wisdom about bloat being caused by frequent updates and deletes.
We were looking at it, and one of the things that struck me is that this table has a column with a json datatype. I looked through information_schema.columns and there is only one other table with a json datatype, and I recall having bloat issues with that table in the past as well.
I'm wondering if the json datatype is just naturally more bloated than other types, or if the query in the check_postgres.pl nagios script is not accurate, or if maybe my thresholds are simply too low?
The table design itself is pretty simple:
   Column   |            Type             |                              Modifiers
------------+-----------------------------+-----------------------------------------------------------------------
 id         | integer                     | not null default nextval('table_schema.table_name_id_seq'::regclass)
 type       | character varying(255)      |
 criteria   | json                        |
 created_at | timestamp without time zone | not null
 updated_at | timestamp without time zone | not null
Indexes:
"table_name_pkey" PRIMARY KEY, btree (id)
The nagios output looks like this (sanitized):
POSTGRES_BLOAT WARNING: DB "db_name" (host:host.domain.com) (db db_name) table table_schema.table_name rows:29305 pages:1733 shouldbe:330 (5.3X) wasted size:11493376 (10 MB)
Thoughts?
Greg Haase
On Mon, Oct 28, 2013 at 4:17 PM, Gregory Haase <haaseg@onefreevoice.com> wrote:
> I had a discussion with one of the primary developers about how the table is utilized, and it turns out they are basically only ever inserting into it. This sort of flies in the face of conventional wisdom about bloat being caused by frequent updates and deletes.
As I understand it, vacuuming only removes the tuples removed by delete and update operations.
The question is how the data is being inserted and whether there is anything that ever updates the rows in any way. Maybe this is a manual process? But for 5x bloat, it has to be a repeated process.
Maybe there was bad data that had to be corrected?
> I'm wondering if the json datatype is just naturally more bloated than other types, or if the query in the check_postgres.pl nagios script is not accurate, or if maybe my thresholds are simply too low?
Next time this happens it would be worth seeing what VACUUM FULL VERBOSE output is for that table.
Best Wishes,
Chris Travers
Efficito: Hosted Accounting and ERP. Robust and Flexible. No vendor lock-in.
On Tue, Oct 29, 2013 at 5:38 AM, Chris Travers <chris.travers@gmail.com> wrote:
> As I understand it, vacuuming only removes the tuples removed by delete and update operations.

Well, or by rolled back transactions. We have to wonder if the OP has a lot of queries trying to insert and failing. Maybe check the log?

merlin
Following up.
I don't see any rolled back transactions in the logs.
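For what it's worth, the cumulative rollback counter in pg_stat_database is another way to cross-check this (illustrative query only, using the sanitized database name from the alert above):
select datname, xact_commit, xact_rollback
from pg_stat_database
where datname = 'db_name';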
What really troubles me is that VACUUM FULL doesn't actually fix the problem. If there were bad data that had been corrected via mass updates, I'd expect the bloat issue to be resolved by a VACUUM FULL.
When I run the vacuum back to back, this is what I get:
db_name=# VACUUM FULL VERBOSE table_schema.table_name;
INFO: vacuuming "table_schema.table_name"
INFO: "table_name": found 2 removable, 29663 nonremovable row versions in 1754 pages
DETAIL: 0 dead row versions cannot be removed yet.
CPU 0.07s/0.10u sec elapsed 0.30 sec.
VACUUM
db_name=# VACUUM FULL VERBOSE table_schema.table_name;
INFO: vacuuming "table_schema.table_name"
INFO: "table_name": found 0 removable, 29663 nonremovable row versions in 1754 pages
DETAIL: 0 dead row versions cannot be removed yet.
CPU 0.09s/0.09u sec elapsed 0.32 sec.
VACUUM
I think the question to address may be: "Why does the check_postgres query think there should only be 334 pages instead of 1754?"
The tbloat and wastedbytes calculations in the query from this page: http://wiki.postgresql.org/wiki/Show_database_bloat do seem to correlate with the wasted bytes reported by nagios, though.
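For context, the "should be" figure in those checks is derived from pg_stats. Very roughly (a simplified sketch that assumes the default 8 kB block size and an approximate 24-byte tuple header, not the exact check_postgres arithmetic), the estimate looks like this:
select c.relname,
       c.relpages as actual_pages,
       ceil(c.reltuples * (24 + coalesce(sum((1 - s.null_frac) * s.avg_width), 0)) / 8192) as estimated_pages
from pg_class c
left join pg_stats s on s.schemaname = 'table_schema' and s.tablename = c.relname
where c.relname = 'table_name'
group by c.relname, c.relpages, c.reltuples;
A column that never shows up in pg_stats contributes zero width to that sum, which would drag the estimated page count down.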
Greg Haase
On 10/29/2013 12:41 PM, Gregory Haase wrote:
> db_name=# VACUUM FULL VERBOSE table_schema.table_name;
> INFO: "table_name": found 2 removable, 29663 nonremovable row versions in 1754 pages
> DETAIL: 0 dead row versions cannot be removed yet.
is there an old transaction pending? that 'masks' vacuum from touching any tuples newer than the start of that transaction.
-- john r pierce 37N 122W somewhere on the middle of the left coast
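For anyone checking the same thing, an illustrative query (assuming 9.2-era pg_stat_activity column names) to spot long-open transactions:
select pid, state, xact_start, now() - xact_start as xact_age, query
from pg_stat_activity
where xact_start is not null
order by xact_start
limit 5;
The oldest open transactions are listed first.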
So, between yesterday and today we actually failed over to our hot-standby instance and the issue hasn't changed. I don't think you can have a pending transaction across streaming replication.
One more thing I just tried:
create table table_schema.table_name_new (like table_schema.table_name);
insert into table_schema.table_name_new select * from table_schema.table_name;
The new table shows the same amount of wasted bytes and pages as the old one.
So based on that, I think I'm going to throw out any notion of updates or deletes as the cause of bloat on this particular table.
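Another way to sanity-check the physical contents (a sketch, assuming the contrib pgstattuple extension is available) would be:
create extension if not exists pgstattuple;
select * from pgstattuple('table_schema.table_name');
If dead_tuple_len and free_space are small while tuple_len accounts for most of table_len, the table isn't actually bloated and the estimate itself is suspect.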
-G
John R Pierce <pierce@hogranch.com> writes:
> is there an old transaction pending? that 'masks' vacuum from touching any tuples newer than the start of that transaction.

If old transactions were the problem, vacuum would be reporting that some-large-number of dead row versions couldn't be removed yet. There doesn't seem to be anything obviously wrong here.

regards, tom lane
I've isolated the problem to the json field not showing up in pg_stats, which affects the calculation of the avg row size in the bloat query.
I'm not sure if this is a json issue or some other kind of issue.
db_name=# select c.column_name, c.data_type from information_schema.columns c where table_name = 'table_name' and not exists (select 1 from pg_stats s where c.table_name = s.tablename and c.column_name = s.attname);
column_name | data_type
-------------+-----------
criteria | json
(1 row)
-G
Gregory Haase <haaseg@onefreevoice.com> writes:
> I've isolated the problem to the json field not showing up in pg_stats, which affects the calculation of the avg row size in the bloat query.
> I'm not sure if this is a json issue or some other kind of issue.

Possibly your "bloat query" is failing to consider the toast table associated with this table? If the json values are large they'd mostly be in the toast table, not the main table.

(It's unfortunate that VACUUM FULL doesn't tell you about what's in the toast table. I'd try just VACUUM VERBOSE here, without the FULL, to get more info.)

regards, tom lane
Tom is correct: VACUUM VERBOSE shows that there is an associated toast table. Neither the check_postgres.pl script nor the query at http://wiki.postgresql.org/wiki/Show_database_bloat appears to take this into consideration. Both rely on null_frac and avg_width from pg_stats to estimate how big the table should be. I'm not sure how you would factor the toast table into that estimate.
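If it helps anyone reproduce this, the main heap can be compared to the total on-disk footprint directly (illustrative query only):
select pg_relation_size('table_schema.table_name') as heap_bytes,
       pg_total_relation_size('table_schema.table_name') as total_bytes_with_toast_and_indexes;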
-G
I spent some more time on this today, and I realized that the issue isn't that there are records in the toast table. The issue is that there are NO records in the toast table. Apparently, all the json values we are inserting are too small to get toasted.
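Values are only moved out to the toast table once a row exceeds the toast threshold (roughly 2 kB with default settings), so small json documents stay inline in the heap. A quick way to confirm the sizes (illustrative query against the original table):
select avg(pg_column_size(criteria)) as avg_json_bytes,
       max(pg_column_size(criteria)) as max_json_bytes
from table_schema.table_name;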
I setup a separate benchmark locally:
create table test_json_stats
(
    test_json_stats_id serial,
    json_data json,
    insert_timestamp timestamp default now() not null
);
I created a file called "json_bench" with the following:
BEGIN;
insert into test_json_stats (json_data) values ('{"counters": [ { "first":"1","second":"2"}, { "first":"3","second":"4"}, { "first":"5","second":"6"}, { "first":"7","second":"8"}, { "first":"9","second":"10"}, { "first":"11","second":"12"}, { "first":"13","second":"14"}, { "first":"15","second":"16"}, { "first":"17","second":"18"}, { "first":"19","second":"20"}, { "first":"21","second":"22"}, { "first":"23","second":"24"}, { "first":"25","second":"26"}, { "first":"27","second":"28"}, { "first":"29","second":"30"}, { "first":"31","second":"32"}, { "first":"33","second":"34"}, { "first":"35","second":"36"}, { "first":"37","second":"38"}, { "first":"39","second":"40"}, { "first":"41","second":"42"}, { "first":"43","second":"44"}, { "first":"45","second":"46"} ] }');
END;
Then ran pgbench:
pgbench -c 5 -t 2000000 -f json_bench greg
VACUUM VERBOSE shows that the test_json_stats table has over a million pages and that the toast table exists with zero pages:
INFO: vacuuming "public.test_json_stats"
INFO: "test_json_stats": found 0 removable, 0 nonremovable row versions in 0 out of 1010011 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.01u sec elapsed 0.01 sec.
INFO: vacuuming "pg_toast.pg_toast_51822"
INFO: index "pg_toast_51822_index" now contains 0 row versions in 1 pages
DETAIL: 0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: "pg_toast_51822": found 0 removable, 0 nonremovable row versions in 0 out of 0 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
VACUUM
The json_data column is not accounted for in pg_stats:
select attname, null_frac, avg_width from pg_stats where tablename = 'test_json_stats';
attname | null_frac | avg_width
--------------------+-----------+-----------
test_json_stats_id | 0 | 4
insert_timestamp | 0 | 8
(2 rows)
So I'm not sure if I'd actually qualify this as a "bug", but it appears that there is currently no way to get stats on a json data type.
I subsequently inserted a very large json value into the table that consumed 2 pages in pg_toast_51822, but there still doesn't appear to be any way to get stats on the column.
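Since ANALYZE skips the column, the missing numbers can at least be computed by hand and plugged into a bloat estimate (illustrative query; pg_column_size reports the stored size of each value, so no json equality operator is needed):
select avg(pg_column_size(json_data)) as manual_avg_width,
       sum((json_data is null)::int)::float / count(*) as manual_null_frac
from test_json_stats;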
Greg Haase
Gregory Haase <haaseg@onefreevoice.com> writes:
> The json_data column is not accounted for in pg_stats:

Ah! I hadn't twigged to the fact that your bloat measurement approach assumed you had pg_stats entries for all the columns.

> So I'm not sure if I'd actually qualify this as a "bug", but it appears that there is currently no way to get stats on a json data type.

ANALYZE currently punts on columns that don't have an equality operator, which json does not. There isn't that much in the way of stats that we could collect, though I suppose we could still compute average datum width and null fraction.

I'm not sure whether there are plans to invent an equality operator for json.

regards, tom lane
On Fri, Nov 1, 2013 at 1:50 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> ANALYZE currently punts on columns that don't have an equality operator, which json does not. There isn't that much in the way of stats that we could collect, though I suppose we could still compute average datum width and null fraction.
>
> I'm not sure whether there are plans to invent an equality operator for json.

IMNSHO, this may not be a bug, but it's pretty close. All base types should have an equality operator as well as the other supporting infrastructure that the database itself depends on (in/out and send/receive, for example). This is a pretty good example of why.

merlin
Along the lines of the equality operator: I have run into issues trying to pivot a table/result set with a json type due to what seemed to be no equality operator.
ajelinek@gmail.com wrote:
> Along the lines of the equality operator: I have run into issues trying to pivot a table/result set with a json type due to what seemed to be no equality operator.

For the curious, and also for use-case considerations for development, would you be able to share what it is you are doing (and how) that combines full json documents with pivoting?

Compound types holding source data for a pivot seem problematic, since generally all the pivot components are single-valued and, for data, often numerical.

David J.
On Mon, Nov 4, 2013 at 12:14 PM, David Johnston <polobo@yahoo.com> wrote:
> For the curious, and also for use-case considerations for development, would you be able to share what it is you are doing (and how) that combines full json documents with pivoting?

Would also like to see this. The json type has completely displaced crosstab in my usage. I don't typically pivot json though: I pivot the raw data and then transform it to json. With limited exceptions, I consider storing json in actual table rows to be an anti-pattern (but it should still work if you do it).

merlin
> Would also like to see this. The json type has completely displaced crosstab in my usage. I don't typically pivot json though: I pivot the raw data and then transform it to json.

I could not figure out what I was doing last month to reproduce this, so I did a small pivot POC, and it is erroring on the MAX function, so it is probably not the same issue. My guess is that I tried using the GREATEST function as a Hail Mary (which would not have worked), got the message ERROR: could not identify a comparison function for type json, and then thought/hoped it was the same thing when reading these emails.

CREATE TABLE bad_table_json(id int, detail_type text, details json);
INSERT INTO bad_table_json values(1, 'a', '{"a":1}'::json);
INSERT INTO bad_table_json values(1, 'b', '{"a":1}'::json);
INSERT INTO bad_table_json values(1, 'c', '{"a":1}'::json);

SELECT id
      ,MAX(CASE WHEN detail_type = 'a' THEN details END) AS a
      ,MAX(CASE WHEN detail_type = 'b' THEN details END) AS b
      ,MAX(CASE WHEN detail_type = 'c' THEN details END) AS c
  FROM bad_table_json
 GROUP BY id;
While I agree that an equality operator doesn't really make sense for json operationally, there are certain maintenance reasons why it may come in handy. Removing duplicate records comes to mind.
Beyond getting basic stats on the necessary columns, I would say that the equality operator is really one of the most basic tenets of a relational database and should probably exist for any data type, regardless of its perceived usefulness.
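For example (a sketch against the bad_table_json table from earlier in the thread), duplicates can still be removed by comparing the text representation, with the caveat that semantically equal but differently formatted documents won't match:
delete from bad_table_json
where ctid in (
    select ctid
    from (select ctid,
                 row_number() over (partition by id, detail_type, details::text order by id) as rn
          from bad_table_json) d
    where d.rn > 1);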
Greg Haase
On Mon, Nov 4, 2013 at 8:31 PM, ajelinek@gmail.com <ajelinek@gmail.com> wrote:
> I could not figure out what I was doing last month to reproduce this, so I did a small pivot POC, and it is erroring on the MAX function, so it is probably not the same issue.

Aside: here's a way to do those types of things. It's not necessarily faster, but it seems cleaner to me and it bypasses the need for json comparison. IMMUTABLE plpgsql is generally the fastest way to implement highly iterated trivial functions.

CREATE OR REPLACE FUNCTION PickInternal(State anyelement, WantValue TEXT, PickValue TEXT, Value anyelement)
RETURNS anyelement AS
$$
BEGIN
  RETURN CASE WHEN WantValue = PickValue THEN Value ELSE State END;
END;
$$ LANGUAGE PLPGSQL IMMUTABLE;

CREATE AGGREGATE Pick(TEXT, TEXT, anyelement) (
  SFUNC=PickInternal,
  SType=anyelement);

SELECT id
      ,Pick('a', detail_type, details) AS a
      ,Pick('b', detail_type, details) AS b
      ,Pick('c', detail_type, details) AS c
  FROM bad_table_json
 GROUP BY id;

merlin