Re: new heapcheck contrib module - Mailing list pgsql-hackers

From Mark Dilger
Subject Re: new heapcheck contrib module
Date
Msg-id 411C72E5-23A6-485F-A101-FBF7CFA6D2D5@enterprisedb.com
Whole thread Raw
In response to Re: new heapcheck contrib module  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: new heapcheck contrib module  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers

> On Jan 14, 2021, at 1:13 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Mon, Jan 11, 2021 at 1:16 PM Mark Dilger
> <mark.dilger@enterprisedb.com> wrote:
>> Added in v32, along with adding pg_amcheck to @contrib_uselibpq, @contrib_uselibpgport, and @contrib_uselibpgcommon
>
> exit_utils.c fails to achieve the goal of making this code independent
> of pg_dump, because of:
>
> #ifdef WIN32
>        if (parallel_init_done && GetCurrentThreadId() != mainThreadId)
>                _endthreadex(code);
> #endif
>
> parallel_init_done is a pg_dump-ism. Perhaps this chunk of code could
> be a handler that gets registered using exit_nicely() rather than
> hard-coded like this. Note that the function comments for
> exit_nicely() are heavily implicated in this problem, since they also
> apply to stuff that only happens in pg_dump and not other utilities.

The 0001 patch has been restructured to not have this problem.

> I'm skeptical about the idea of putting functions into string_utils.c
> with names as generic as include_filter() and exclude_filter().
> Existing cases like fmtId() and fmtQualifiedId() are not great either,
> but I think this is worse and that we should do some renaming. On a
> related note, it's not clear to me why these should be classified as
> string_utils while stuff like expand_schema_name_patterns() gets
> classified as option_utils. These are neither generic
> string-processing functions nor are they generic options-parsing
> functions. They are functions for expanding shell-glob style patterns
> for database object names. And they seem like they ought to be
> together, because they seem to do closely-related things. I'm open to
> an argument that this is wrongheaded on my part, but it looks weird to
> me the way it is.

The logic to filter which relations are checked is completely restructured and is kept in pg_amcheck.c

> I'm pretty unimpressed by query_utils.c. The CurrentResultHandler
> stuff looks grotty, and you don't seem to really use it anywhere. And
> it seems woefully overambitious to me anyway: this doesn't apply to
> every kind of "result" we've got hanging around, absolutely nothing
> even close to that, even though a name like CurrentResultHandler
> sounds very broad. It also means more global variables, which is a
> thing of which the PostgreSQL codebase already has a deplorable
> oversupply. quiet_handler() and noop_handler() aren't used anywhere
> either, AFAICS.
>
> I wonder if it would be better to pass in callbacks rather than
> relying on global variables. e.g.:
>
> typedef void (*fatal_error_callback)(const char *fmt,...)
> pg_attribute_printf(1, 2) pg_attribute_noreturn();
>
> Then you could have a few helper functions that take an argument of
> type fatal_error_callback and throw the right fatal error for (a)
> wrong PQresultStatus() and (b) result is not one row. Do you need any
> other cases? exiting_handler() seems to think that the caller might
> want to allow any number of tuples, or any positive number, or any
> particular cout, but I'm not sure if all of those cases are really
> needed.

The error callback stuff has been refactored in this next patch set, and also now includes handlers for parallel slots,
asthe src/bin/scripts/scripts_parallel.c stuff has been moved to fe_utils and made more general.  As it was, there were
hardcodedassumptions that are valid for reindexdb and vacuumdb, but not general enough for pg_amcheck to use.  The
refactoringin patches 0002 through 0005 make it more generally usable.  Patch 0008 uses it in pg_amcheck. 

> This stuff is finnicky and hard to get right. You don't really want to
> create a situation where the same code keeps getting duplicated, or
> the behavior's just a little bit inconsistent everywhere, but it also
> isn't great to build layers upon layers of abstraction around
> something like ExecuteSqlQuery which is, in the end, a four-line
> function. I don't think there's any problem with something like
> pg_dump having it's own function to execute-a-query-or-die. Maybe that
> function ends up doing something like
> TheGenericFunctionToExecuteOrDie(my_die_fn, the_query), or maybe
> pg_dump can just open-code it but have a my_die_fn to pass down to the
> glob-expansion stuff, or, well, I don't know.

There are some real improvements in this next patch set.

The number of queries issued to the database to determine the databases to use is much reduced.  I had been following
thepattern in pg_dump, but abandoned that for something new. 

The parallel slots stuff is now used for parallelism, much like what is done in vacuumdb and reindexdb.

The pg_amcheck application can now be run over one database, multiple specified databases, or all databases.

Relations, schemas, and databases can be included and excluded by pattern, like
"(db1|db2|db3).myschema.(mytable|myindex)". The real-world use-cases for this that I have in mind are things like: 

    pg_amcheck --jobs=12 --all \
        --exclude-relation="db7.schema.known_corrupt_table" \
        --exclude-relation="db*.schema.known_big_table"

and

    pg_amcheck --jobs=20 \
        --include-relation="*.compliance.audited"

I might be missing something, but I think the interface is a superset of the interface from reindexdb and vacuumdb.
Noneof the new interface stuff (patterns, allowing multiple databases to be given on the command line, etc) is
required.


—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Attachment

pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: On login trigger: take three
Next
From: Amit Kapila
Date:
Subject: Re: logical replication worker accesses catalogs in error context callback