Re: new heapcheck contrib module - Mailing list pgsql-hackers

From Mark Dilger
Subject Re: new heapcheck contrib module
Date
Msg-id 50F6468F-5A5E-412A-86EC-DF24B549D733@enterprisedb.com
Whole thread Raw
In response to Re: new heapcheck contrib module  (Amul Sul <sulamul@gmail.com>)
Responses Re: new heapcheck contrib module  (Amul Sul <sulamul@gmail.com>)
List pgsql-hackers

> On Aug 16, 2020, at 9:37 PM, Amul Sul <sulamul@gmail.com> wrote:
>
> In addition to this, I found a few more things while reading v13 patch are as
> below:
>
> Patch v13-0001:
>
> -
> +#include "amcheck.h"
>
> Not in correct order.

Fixed.

> +typedef struct BtreeCheckContext
> +{
> + TupleDesc tupdesc;
> + Tuplestorestate *tupstore;
> + bool is_corrupt;
> + bool on_error_stop;
> +} BtreeCheckContext;
>
> Unnecessary spaces/tabs between } and BtreeCheckContext.

This refers to a change in verify_nbtree.c that has been removed.  Per discussions with Peter and Robert, I have simply
withdrawnthat portion of the patch. 

> static void bt_index_check_internal(Oid indrelid, bool parentcheck,
> - bool heapallindexed, bool rootdescend);
> + bool heapallindexed, bool rootdescend,
> + BtreeCheckContext * ctx);
>
> Unnecessary space between * and ctx. The same changes needed for other places as
> well.

Same as above.  The changes to verify_nbtree.c have been withdrawn.

> ---
>
> Patch v13-0002:
>
> +-- partitioned tables (the parent ones) don't have visibility maps
> +create table test_partitioned (a int, b text default repeat('x', 5000))
> + partition by list (a);
> +-- these should all fail
> +select * from verify_heapam('test_partitioned',
> + on_error_stop := false,
> + skip := NULL,
> + startblock := NULL,
> + endblock := NULL);
> +ERROR:  "test_partitioned" is not a table, materialized view, or TOAST table
> +create table test_partition partition of test_partitioned for values in (1);
> +create index test_index on test_partition (a);
>
> Can't we make it work? If the input is partitioned, I think we could
> collect all its leaf partitions and process them one by one. Thoughts?

I was following the example from pg_visibility.  I haven't thought about your proposal enough to have much opinion as
yet,except that if we do this for pg_amcheck we should do likewise to pg_visibility, for consistency of the user
interface.

> + ctx->chunkno++;
>
> Instead of incrementing  in check_toast_tuple(), I think incrementing should
> happen at the caller  -- just after check_toast_tuple() call.

I agree.

> ---
>
> Patch v13-0003:
>
> + resetPQExpBuffer(query);
> + destroyPQExpBuffer(query);
>
> resetPQExpBuffer() will be unnecessary if the next call is destroyPQExpBuffer().

Thanks.  I removed it in cases where destroyPQExpBuffer is obviously the very next call.

> + appendPQExpBuffer(query,
> +   "SELECT c.relname, v.blkno, v.offnum, v.lp_off, "
> +   "v.lp_flags, v.lp_len, v.attnum, v.chunk, v.msg"
> +   "\nFROM verify_heapam(rel := %u, on_error_stop := %s, "
> +   "skip := %s, startblock := %s, endblock := %s) v, "
> +   "pg_class c"
> +   "\nWHERE c.oid = %u",
> +   tbloid, stop, skip, settings.startblock,
> +   settings.endblock, tbloid
>
> pg_class should be schema-qualified like elsewhere.

Agreed, and changed.

> IIUC, pg_class is meant to
> get relname only, instead, we could use '%u'::pg_catalog.regclass in the target
> list for the relname. Thoughts?

get_table_check_list() creates the list of all tables to be checked, which check_tables() then iterates over, calling
check_table()for each one.  I think some verification that the table still exists is in order.  Using
'%u'::pg_catalog.regclassfor a table that has since been dropped would pass in the old table Oid and draw an error of
the'ERROR:  could not open relation with OID 36311' variety, whereas the current coding will just skip the dropped
table.

> Also I think we should skip '\n' from the query string (see appendPQExpBuffer()
> in pg_dump.c)

I'm not sure I understand.  pg_dump.c uses "\n" in query strings it passes to appendPQExpBuffer(), in a manner very
similarto what this patch does. 

> + appendPQExpBuffer(query,
> +   "SELECT i.indexrelid"
> +   "\nFROM pg_catalog.pg_index i, pg_catalog.pg_class c"
> +   "\nWHERE i.indexrelid = c.oid"
> +   "\n    AND c.relam = %u"
> +   "\n    AND i.indrelid = %u",
> +   BTREE_AM_OID, tbloid);
> +
> + ExecuteSqlStatement("RESET search_path");
> + res = ExecuteSqlQuery(query->data, PGRES_TUPLES_OK);
> + PQclear(ExecuteSqlQueryForSingleRow(ALWAYS_SECURE_SEARCH_PATH_SQL));
>
> I don't think we need the search_path query. The main query doesn't have any
> dependencies on it.  Same is in check_indexes(), check_index (),
> expand_table_name_patterns() & get_table_check_list().
> Correct me if I am missing something.

Right.

> + output = PageOutput(lines + 2, NULL);
> + for (lineno = 0; usage_text[lineno]; lineno++)
> + fprintf(output, "%s\n", usage_text[lineno]);
> + fprintf(output, "Report bugs to <%s>.\n", PACKAGE_BUGREPORT);
> + fprintf(output, "%s home page: <%s>\n", PACKAGE_NAME, PACKAGE_URL);
>
> I am not sure why we want PageOutput() if the second argument is always going to
> be NULL? Can't we directly use printf() instead of PageOutput() + fprintf() ?
> e.g. usage() function in pg_basebackup.c.

Done.


Please find attached the next version of the patch.  In addition to your review comments (above), I have made changes
inresponse to Peter and Robert's review comments upthread. 




—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Attachment

pgsql-hackers by date:

Previous
From: Kyotaro Horiguchi
Date:
Subject: Re: SyncRepLock acquired exclusively in default configuration
Next
From: 胡常齐
Date:
Subject: The number of logical replication slot or wal sender recommend