Why is my table continuousely written? - Mailing list pgsql-general
From | Peter |
---|---|
Subject | Why is my table continuousely written? |
Date | |
Msg-id | Yupdof22ov5WkhSb@gate.intra.daemon.contact Whole thread Raw |
Responses |
Why is my table continuousely written? -> XID issue?
|
List | pgsql-general |
Hello, this is postgres version 12.11_1 on Freebsd 13.1 I have a table "mess", filename "6309215", that behaves strange. Data is currently only inserted/appended, there is no update and no (successful) delete done, autovacuum is NOT enabled. This is the definition: CREATE TABLE IF NOT EXISTS dnstap.mess ( id bigint NOT NULL DEFAULT nextval('dnstap.mess_id_seq'::regclass), mess text COLLATE pg_catalog."default", hash text COLLATE pg_catalog."default", CONSTRAINT mess_pkey PRIMARY KEY (id), CONSTRAINT mess_hash_key UNIQUE (hash) ) TABLESPACE l1only; And this is the table: -rw------- 1 770 770 1073741824 Aug 2 04:38 6309212 -rw------- 1 770 770 1073741824 Aug 2 04:38 6309212.1 -rw------- 1 770 770 1073741824 Aug 2 04:38 6309212.2 -rw------- 1 770 770 1073741824 Aug 2 04:39 6309212.3 -rw------- 1 770 770 1073741824 Aug 2 04:39 6309212.4 -rw------- 1 770 770 405872640 Aug 3 04:28 6309212.5 -rw------- 1 770 770 1441792 Aug 3 03:25 6309212_fsm -rw------- 1 770 770 180224 Aug 2 05:13 6309212_vm -rw------- 1 770 770 0 Aug 2 00:21 6309213 -rw------- 1 770 770 8192 Aug 2 04:40 6309214 -rw------- 1 770 770 1073741824 Aug 3 04:29 6309215 -rw------- 1 770 770 1073741824 Aug 3 04:30 6309215.1 -rw------- 1 770 770 1073741824 Aug 3 04:13 6309215.10 -rw------- 1 770 770 995811328 Aug 3 04:16 6309215.11 -rw------- 1 770 770 1073741824 Aug 3 04:32 6309215.2 -rw------- 1 770 770 1073741824 Aug 3 04:34 6309215.3 -rw------- 1 770 770 1073741824 Aug 3 04:35 6309215.4 -rw------- 1 770 770 1073741824 Aug 3 04:37 6309215.5 -rw------- 1 770 770 1073741824 Aug 3 04:39 6309215.6 -rw------- 1 770 770 1073741824 Aug 3 04:41 6309215.7 -rw------- 1 770 770 1073741824 Aug 3 04:42 6309215.8 -rw------- 1 770 770 1073741824 Aug 3 04:12 6309215.9 -rw------- 1 770 770 3170304 Aug 3 04:38 6309215_fsm -rw------- 1 770 770 393216 Aug 3 04:38 6309215_vm -rw------- 1 770 770 1073741824 Aug 3 03:17 6309216 -rw------- 1 770 770 1073741824 Aug 2 04:33 6309216.1 -rw------- 1 770 770 241360896 Aug 3 03:18 6309216.2 -rw------- 1 770 770 606208 Aug 3 03:18 6309216_fsm -rw------- 1 770 770 73728 Aug 3 03:17 6309216_vm -rw------- 1 770 770 43139072 Aug 3 03:18 6309217 There is another table, "6309212", that is also only appended data, and that behaves as expected: only the final segment is written, the other segments are a day old. But in "6309215" all the segments were recently written. ("6309216" is the TOAST for "6309215", and "6309217" seems to be an index on that.) This is what happens: 2022-08-03 03:09:03 CEST -- ------- Check for hash value clash with old entries SELECT i1.id FROM dnstap.incoming AS i1, dnstap.mess AS mess WHERE mess.hash = md5(i1.mess) AND mess.mess <> i1.mess I don't think a SELECT like this has reasons to write the referenced table - but it does: -rw------- 1 770 770 1073741824 Aug 3 03:09:25 2022 6309215 -rw------- 1 770 770 1073741824 Aug 3 03:09:25 2022 6309215.1 -rw------- 1 770 770 1073741824 Aug 3 03:09:47 2022 6309215.10 -rw------- 1 770 770 965394432 Aug 3 03:10:17 2022 6309215.11 -rw------- 1 770 770 1073741824 Aug 3 03:09:25 2022 6309215.2 -rw------- 1 770 770 1073741824 Aug 3 03:09:20 2022 6309215.3 -rw------- 1 770 770 1073741824 Aug 3 03:09:25 2022 6309215.4 -rw------- 1 770 770 1073741824 Aug 3 03:09:25 2022 6309215.5 -rw------- 1 770 770 1073741824 Aug 3 03:10:18 2022 6309215.6 -rw------- 1 770 770 1073741824 Aug 3 03:10:21 2022 6309215.7 -rw------- 1 770 770 1073741824 Aug 3 03:09:25 2022 6309215.8 -rw------- 1 770 770 1073741824 Aug 3 03:09:21 2022 6309215.9 And this goes on in that way with all the queries. And when finally the vacuum analyze is run, it is the same, only that one writes the segments in sequence (only inserts were appended to the end, there are no updates or deletes): 2022-08-03 04:28:16 CEST VACUUM (ANALYZE) dnstap.mess; -rw------- 1 770 770 1073741824 Aug 3 04:29 6309215 -rw------- 1 770 770 1073741824 Aug 3 04:30 6309215.1 -rw------- 1 770 770 1073741824 Aug 3 04:13 6309215.10 -rw------- 1 770 770 995811328 Aug 3 04:16 6309215.11 -rw------- 1 770 770 1073741824 Aug 3 04:32 6309215.2 -rw------- 1 770 770 1073741824 Aug 3 04:33 6309215.3 -rw------- 1 770 770 1073741824 Aug 3 04:07 6309215.4 -rw------- 1 770 770 1073741824 Aug 3 04:08 6309215.5 -rw------- 1 770 770 1073741824 Aug 3 04:09 6309215.6 -rw------- 1 770 770 1073741824 Aug 3 04:10 6309215.7 -rw------- 1 770 770 1073741824 Aug 3 04:11 6309215.8 -rw------- 1 770 770 1073741824 Aug 3 04:12 6309215.9 -rw------- 1 770 770 3170304 Aug 3 03:18 6309215_fsm -rw------- 1 770 770 393216 Aug 3 04:28 6309215_vm What is worse is this: -rw------- 1 770 770 16777216 Aug 3 04:28 000000010000007800000067.ok -rw------- 1 770 770 16777216 Aug 3 04:28 000000010000007800000068.ok -rw------- 1 770 770 16777216 Aug 3 04:29 000000010000007800000069.ok -rw------- 1 770 770 16777216 Aug 3 04:29 00000001000000780000006A.ok -rw------- 1 770 770 16777216 Aug 3 04:29 00000001000000780000006B.ok -rw------- 1 770 770 16777216 Aug 3 04:30 00000001000000780000006C.ok -rw------- 1 770 770 16777216 Aug 3 04:30 00000001000000780000006D.ok -rw------- 1 770 770 16777216 Aug 3 04:30 00000001000000780000006E.ok -rw------- 1 770 770 16777216 Aug 3 04:30 00000001000000780000006F.ok -rw------- 1 770 770 16777216 Aug 3 04:31 000000010000007800000070.ok -rw------- 1 770 770 16777216 Aug 3 04:31 000000010000007800000071.ok The table is not only touched, it also creates lots of WAL - and these here are redo-logs, i.e. I need storage for them. I am seriousely drowning in the mass of continuousely created data - and the table is now at 15% of the intended size, so this will apparently get more and more and more! History: I created the table with an 'EXCLUDE =' constraint with hash index on the "mess" field, as a kind of unique constraint. And with a timestamp field to be occasionally updated - so at that time did updates happen. The EXCLUDE constraint suffered from performance issues at 5 mio. rows, so I had to add my own hash and check for collisions myself (the "mess" field can be between 100 and maybe 100'000 characters). Then, already because of excessive write activity, I moved the timestamp to a separate table. But that obviousely didn't help the write activity. I am running nightly vacuum, and did a vacuum FULL two days ago. also I moved the table to another tablespace yesterday - it should be cleanly rewritten. But the problem stays. The only activity currently is a handful of inserts per day, of some 500'000 rows, run in a single transaction with repeatable read. How can this be stopped and the table made to behave like the others? -- PMc
pgsql-general by date: