Why is my table continuousely written? - Mailing list pgsql-general

From Peter
Subject Why is my table continuousely written?
Date
Msg-id Yupdof22ov5WkhSb@gate.intra.daemon.contact
Whole thread Raw
Responses Why is my table continuousely written? -> XID issue?
List pgsql-general
Hello,

 this is postgres version 12.11_1 on Freebsd 13.1

I have a table "mess", filename "6309215", that behaves strange.
Data is currently only inserted/appended, there is no update and no
(successful) delete done, autovacuum is NOT enabled.

This is the definition:

CREATE TABLE IF NOT EXISTS dnstap.mess
(
    id bigint NOT NULL DEFAULT nextval('dnstap.mess_id_seq'::regclass),
    mess text COLLATE pg_catalog."default",
    hash text COLLATE pg_catalog."default",
    CONSTRAINT mess_pkey PRIMARY KEY (id),
    CONSTRAINT mess_hash_key UNIQUE (hash)
)
TABLESPACE l1only;


And this is the table:

-rw-------  1 770  770  1073741824 Aug  2 04:38 6309212
-rw-------  1 770  770  1073741824 Aug  2 04:38 6309212.1
-rw-------  1 770  770  1073741824 Aug  2 04:38 6309212.2
-rw-------  1 770  770  1073741824 Aug  2 04:39 6309212.3
-rw-------  1 770  770  1073741824 Aug  2 04:39 6309212.4
-rw-------  1 770  770   405872640 Aug  3 04:28 6309212.5
-rw-------  1 770  770     1441792 Aug  3 03:25 6309212_fsm
-rw-------  1 770  770      180224 Aug  2 05:13 6309212_vm
-rw-------  1 770  770           0 Aug  2 00:21 6309213
-rw-------  1 770  770        8192 Aug  2 04:40 6309214
-rw-------  1 770  770  1073741824 Aug  3 04:29 6309215
-rw-------  1 770  770  1073741824 Aug  3 04:30 6309215.1
-rw-------  1 770  770  1073741824 Aug  3 04:13 6309215.10
-rw-------  1 770  770   995811328 Aug  3 04:16 6309215.11
-rw-------  1 770  770  1073741824 Aug  3 04:32 6309215.2
-rw-------  1 770  770  1073741824 Aug  3 04:34 6309215.3
-rw-------  1 770  770  1073741824 Aug  3 04:35 6309215.4
-rw-------  1 770  770  1073741824 Aug  3 04:37 6309215.5
-rw-------  1 770  770  1073741824 Aug  3 04:39 6309215.6
-rw-------  1 770  770  1073741824 Aug  3 04:41 6309215.7
-rw-------  1 770  770  1073741824 Aug  3 04:42 6309215.8
-rw-------  1 770  770  1073741824 Aug  3 04:12 6309215.9
-rw-------  1 770  770     3170304 Aug  3 04:38 6309215_fsm
-rw-------  1 770  770      393216 Aug  3 04:38 6309215_vm
-rw-------  1 770  770  1073741824 Aug  3 03:17 6309216
-rw-------  1 770  770  1073741824 Aug  2 04:33 6309216.1
-rw-------  1 770  770   241360896 Aug  3 03:18 6309216.2
-rw-------  1 770  770      606208 Aug  3 03:18 6309216_fsm
-rw-------  1 770  770       73728 Aug  3 03:17 6309216_vm
-rw-------  1 770  770    43139072 Aug  3 03:18 6309217


There is another table, "6309212", that is also only appended data,
and that behaves as expected: only the final segment is written, the
other segments are a day old.
But in "6309215" all the segments were recently written. ("6309216" is
the TOAST for "6309215", and "6309217" seems to be an index on that.)


This is what happens:

2022-08-03 03:09:03 CEST
-- ------- Check for hash value clash with old entries
SELECT i1.id
  FROM dnstap.incoming AS i1,
       dnstap.mess AS mess
 WHERE mess.hash = md5(i1.mess)
   AND mess.mess <> i1.mess

I don't think a SELECT like this has reasons to write the
referenced table - but it does:

-rw-------  1 770  770  1073741824 Aug  3 03:09:25 2022 6309215
-rw-------  1 770  770  1073741824 Aug  3 03:09:25 2022 6309215.1
-rw-------  1 770  770  1073741824 Aug  3 03:09:47 2022 6309215.10
-rw-------  1 770  770   965394432 Aug  3 03:10:17 2022 6309215.11
-rw-------  1 770  770  1073741824 Aug  3 03:09:25 2022 6309215.2
-rw-------  1 770  770  1073741824 Aug  3 03:09:20 2022 6309215.3
-rw-------  1 770  770  1073741824 Aug  3 03:09:25 2022 6309215.4
-rw-------  1 770  770  1073741824 Aug  3 03:09:25 2022 6309215.5
-rw-------  1 770  770  1073741824 Aug  3 03:10:18 2022 6309215.6
-rw-------  1 770  770  1073741824 Aug  3 03:10:21 2022 6309215.7
-rw-------  1 770  770  1073741824 Aug  3 03:09:25 2022 6309215.8
-rw-------  1 770  770  1073741824 Aug  3 03:09:21 2022 6309215.9

And this goes on in that way with all the queries.
And when finally the vacuum analyze is run, it is the same, only
that one writes the segments in sequence (only inserts were appended
to the end, there are no updates or deletes):

2022-08-03 04:28:16 CEST
VACUUM (ANALYZE) dnstap.mess;

-rw-------  1 770  770  1073741824 Aug  3 04:29 6309215
-rw-------  1 770  770  1073741824 Aug  3 04:30 6309215.1
-rw-------  1 770  770  1073741824 Aug  3 04:13 6309215.10
-rw-------  1 770  770   995811328 Aug  3 04:16 6309215.11
-rw-------  1 770  770  1073741824 Aug  3 04:32 6309215.2
-rw-------  1 770  770  1073741824 Aug  3 04:33 6309215.3
-rw-------  1 770  770  1073741824 Aug  3 04:07 6309215.4
-rw-------  1 770  770  1073741824 Aug  3 04:08 6309215.5
-rw-------  1 770  770  1073741824 Aug  3 04:09 6309215.6
-rw-------  1 770  770  1073741824 Aug  3 04:10 6309215.7
-rw-------  1 770  770  1073741824 Aug  3 04:11 6309215.8
-rw-------  1 770  770  1073741824 Aug  3 04:12 6309215.9
-rw-------  1 770  770     3170304 Aug  3 03:18 6309215_fsm
-rw-------  1 770  770      393216 Aug  3 04:28 6309215_vm

What is worse is this:

-rw-------  1 770   770  16777216 Aug  3 04:28 000000010000007800000067.ok
-rw-------  1 770   770  16777216 Aug  3 04:28 000000010000007800000068.ok
-rw-------  1 770   770  16777216 Aug  3 04:29 000000010000007800000069.ok
-rw-------  1 770   770  16777216 Aug  3 04:29 00000001000000780000006A.ok
-rw-------  1 770   770  16777216 Aug  3 04:29 00000001000000780000006B.ok
-rw-------  1 770   770  16777216 Aug  3 04:30 00000001000000780000006C.ok
-rw-------  1 770   770  16777216 Aug  3 04:30 00000001000000780000006D.ok
-rw-------  1 770   770  16777216 Aug  3 04:30 00000001000000780000006E.ok
-rw-------  1 770   770  16777216 Aug  3 04:30 00000001000000780000006F.ok
-rw-------  1 770   770  16777216 Aug  3 04:31 000000010000007800000070.ok
-rw-------  1 770   770  16777216 Aug  3 04:31 000000010000007800000071.ok

The table is not only touched, it also creates lots of WAL - and these
here are redo-logs, i.e. I need storage for them. I am seriousely
drowning in the mass of continuousely created data - and the table is
now at 15% of the intended size, so this will apparently get more and
more and more!

History:

I created the table with an 'EXCLUDE =' constraint with hash index on
the "mess" field, as a kind of unique constraint. And with a timestamp
field to be occasionally updated - so at that time did updates happen.

The EXCLUDE constraint suffered from performance issues at 5 mio. rows,
so I had to add my own hash and check for collisions myself (the
"mess" field can be between 100 and maybe 100'000 characters). 

Then, already because of excessive write activity, I moved the
timestamp to a separate table. But that obviousely didn't help the
write activity.

I am running nightly vacuum, and did a vacuum FULL two days ago.
also I moved the table to another tablespace yesterday - it should
be cleanly rewritten. But the problem stays.

The only activity currently is a handful of inserts per day, of some
500'000 rows, run in a single transaction with repeatable read.


How can this be stopped and the table made to behave like the others?

-- PMc



pgsql-general by date:

Previous
From: jian he
Date:
Subject: Re: ICU is not supported in this build. install from source code.
Next
From: Kevin Martin
Date:
Subject: "Missing" column in Postgres logical replication update message