Re: json datatype and table bloat? - Mailing list pgsql-general
From | Gregory Haase |
---|---|
Subject | Re: json datatype and table bloat? |
Date | |
Msg-id | CAHA6QFRADbCMhyJKnrQxmyzVNa1kDmy8E2F0qTUJ2PsGJeXnMg@mail.gmail.com Whole thread Raw |
In response to | Re: json datatype and table bloat? (Gregory Haase <haaseg@onefreevoice.com>) |
Responses |
Re: json datatype and table bloat?
|
List | pgsql-general |
I spent some more time on this today, and I realized that the issue isn't that there are records in the toast table. The issue is that there are NO records in the toast table. Apparently, all the json we are inserting are too small to get toasted.
I setup a separate benchmark locally:
create table test_json_stats
(
(
test_json_stats_id serial,
json_data json,
insert_timestamp timestamp default now() not null
);
created a file called "json bench" with the following:
BEGIN;
insert into test_json_stats (json_data) values ('{"counters": [ { "first":"1","second":"2"}, { "first":"3","second":"4"}, { "first":"5","second":"6"}, { "first":"7","second":"8"}, { "first":"9","second":"10"}, { "first":"11","second":"12"}, { "first":"13","second":"14"}, { "first":"15","second":"16"}, { "first":"17","second":"18"}, { "first":"19","second":"20"}, { "first":"21","second":"22"}, { "first":"23","second":"24"}, { "first":"25","second":"26"}, { "first":"27","second":"28"}, { "first":"29","second":"30"}, { "first":"31","second":"32"}, { "first":"33","second":"34"}, { "first":"35","second":"36"}, { "first":"37","second":"38"}, { "first":"39","second":"40"}, { "first":"41","second":"42"}, { "first":"43","second":"44"}, { "first":"45","second":"46"} ] }');
END;
Then ran pgbench:
pgbench -c 5 -t 2000000 -f json_bench greg
vacuum vebose shows the test_json_stats table has over a million pages and the toast table exists with zero pages:
INFO: vacuuming "public.test_json_stats"
INFO: "test_json_stats": found 0 removable, 0 nonremovable row versions in 0 out of 1010011 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.01u sec elapsed 0.01 sec.
INFO: vacuuming "pg_toast.pg_toast_51822"
INFO: index "pg_toast_51822_index" now contains 0 row versions in 1 pages
DETAIL: 0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: "pg_toast_51822": found 0 removable, 0 nonremovable row versions in 0 out of 0 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
VACUUM
The json_data column is not accounted for in pg_stats:
select attname, null_frac, avg_width from pg_stats where tablename = 'test_json_stats';
attname | null_frac | avg_width
--------------------+-----------+-----------
test_json_stats_id | 0 | 4
insert_timestamp | 0 | 8
(2 rows)
So I'm not sure if I'd actually qualify this as a "bug", but it appears that there is no way to currently get stats on a json data type.
I subsequently inserted a very large json into the table that consumed 2 pages in pg_toast_51822, but there still doesn't appear to me any way to get stats on the column.
Greg Haase
On Tue, Oct 29, 2013 at 2:55 PM, Gregory Haase <haaseg@onefreevoice.com> wrote:
Tom is correct: Vacuum verbose shows that their is an associated toast table. Neither the check_postgres.pl script or the query on http://wiki.postgresql.org/wiki/Show_database_bloat appear to take this into consideration. Both rely on null_frac and avg_width from pg_stats to estimate how big the table should be. I'm not sure how you would factor the toast table into that estimate.-GOn Tue, Oct 29, 2013 at 2:05 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:Gregory Haase <haaseg@onefreevoice.com> writes:Possibly your "bloat query" is failing to consider the toast table
> I've isolated the problem to the json field not showing up in pg_stats,
> which affects the calculation of the avg row size in the bloat query.
> I'm not sure if this is a json issue or some other kind of issue.
associated with this table? If the json values are large they'd
mostly be in the toast table not the main table.
(It's unfortunate that VACUUM FULL doesn't tell you about what's
in the toast table. I'd try just VACUUM VERBOSE here, without the
FULL, to get more info.)
regards, tom lane
pgsql-general by date: