Re: Disabling Heap-Only Tuples - Mailing list pgsql-hackers
From | Thom Brown |
---|---|
Subject | Re: Disabling Heap-Only Tuples |
Date | |
Msg-id | CAA-aLv5yos961mAjfFRrmy6ZKr=eWMRN4FeSBeya9XcettrHcw@mail.gmail.com Whole thread Raw |
In response to | Re: Disabling Heap-Only Tuples (Matthias van de Meent <boekewurm+postgres@gmail.com>) |
List | pgsql-hackers |
On Thu, 6 Jul 2023 at 21:18, Matthias van de Meent <boekewurm+postgres@gmail.com> wrote: > > On Wed, 5 Jul 2023 at 19:55, Thom Brown <thom@linux.com> wrote: > > > > On Wed, 5 Jul 2023 at 18:05, Matthias van de Meent > > <boekewurm+postgres@gmail.com> wrote: > > > So what were you thinking of? A session GUC? A table option? > > > > Both. > > Here's a small patch implementing a new table option max_local_update > (name very much bikesheddable). Value is -1 (default, disabled) or the > size of the table in MiB that you still want to allow to update on the > same page. I didn't yet go for a GUC as I think that has too little > control on the impact on the system. > > I decided that max_local_update would be in MB because there is no > reloption value that can contain MaxBlockNumber and -1/disabled; and 1 > MiB seems like enough granularity for essentially all use cases. > > The added regression tests show how this feature works, that the new > feature works, and validate that lock levels are acceptable > (ShareUpdateExclusiveLock, same as for updating fillfactor). Wow, thanks for working on this. I've given it a test, and it does what I would expect it to do. I'm aware of the concerns about the potential for the relocation to land in an undesirable location, so perhaps that needs addressing. But this is already considerably better than the current need to update a row until it gets pushed off its current page. Ideally there would be tooling built around this where the user wouldn't need to figure out how much of the table to UPDATE, or deal with VACUUMing concerns. But here's my quick test: CREATE OR REPLACE FUNCTION compact_table(table_name IN TEXT) RETURNS VOID AS $$ DECLARE current_row RECORD; old_ctid TID; new_ctid TID; keys TEXT; update_query TEXT; row_counter INTEGER := 0; BEGIN SELECT string_agg(a.attname || ' = ' || a.attname, ', ') INTO keys FROM pg_index i JOIN pg_attribute a ON a.attnum = ANY(i.indkey) WHERE i.indrelid = table_name::regclass AND a.attrelid = table_name::regclass AND i.indisprimary; IF keys IS NULL THEN RAISE EXCEPTION 'Table % does not have a primary key.', table_name; END IF; FOR current_row IN EXECUTE FORMAT('SELECT ctid, * FROM %I ORDER BY ctid DESC', table_name) LOOP old_ctid := current_row.ctid; update_query := FORMAT('UPDATE %I SET %s WHERE ctid = $1 RETURNING ctid', table_name, keys); EXECUTE update_query USING old_ctid INTO new_ctid; row_counter := row_counter + 1; IF row_counter % 1000 = 0 THEN RAISE NOTICE '% rows relocated.', row_counter; END IF; IF new_ctid <= old_ctid THEN CONTINUE; ELSE RAISE NOTICE 'All non-contiguous rows relocated.'; EXIT; END IF; END LOOP; END; $$ LANGUAGE plpgsql; postgres=# CREATE TABLE bigtable (id int, content text); CREATE TABLE postgres=# INSERT INTO bigtable SELECT x, 'This is just a way to fill up space.' FROM generate_series(1,10000000) a(x); INSERT 0 10000000 postgres=# DELETE FROM bigtable WHERE id % 7 = 0; DELETE 1428571 postgres=# VACUUM bigtable; VACUUM postgres=# ALTER TABLE bigtable SET (max_local_update = 0); ALTER TABLE postgres=# ALTER TABLE bigtable ADD PRIMARY KEY (id); ALTER TABLE postgres=# \dt+ bigtable List of relations Schema | Name | Type | Owner | Persistence | Access method | Size | Description --------+----------+-------+-------+-------------+---------------+--------+------------- public | bigtable | table | thom | permanent | heap | 730 MB | (1 row) postgres=# SELECT * FROM pgstattuple('bigtable'); table_len | tuple_count | tuple_len | tuple_percent | dead_tuple_count | dead_tuple_len | dead_tuple_percent | free_space | free_percent -----------+-------------+-----------+---------------+------------------+----------------+--------------------+------------+-------------- 765607936 | 8571429 | 557142885 | 72.77 | 0 | 0 | 0 | 105901628 | 13.83 (1 row) postgres=# SELECT compact_table('bigtable'); NOTICE: 1000 rows relocated. NOTICE: 2000 rows relocated. NOTICE: 3000 rows relocated. NOTICE: 4000 rows relocated. ... NOTICE: 1221000 rows relocated. NOTICE: 1222000 rows relocated. NOTICE: 1223000 rows relocated. NOTICE: 1224000 rows relocated. NOTICE: All non-contiguous rows relocated. compact_table --------------- (1 row) postgres=# VACUUM bigtable; VACUUM postgres=# \dt+ bigtable; List of relations Schema | Name | Type | Owner | Persistence | Access method | Size | Description --------+----------+-------+-------+-------------+---------------+--------+------------- public | bigtable | table | thom | permanent | heap | 626 MB | (1 row) postgres=# SELECT * FROM pgstattuple('bigtable'); table_len | tuple_count | tuple_len | tuple_percent | dead_tuple_count | dead_tuple_len | dead_tuple_percent | free_space | free_percent -----------+-------------+-----------+---------------+------------------+----------------+--------------------+------------+-------------- 656236544 | 8571429 | 557142885 | 84.9 | 0 | 0 | 0 | 2564888 | 0.39 (1 row) Works for me. Thom
pgsql-hackers by date: