Hi,
This discussion started at https://postgr.es/m/20230627212423.p56zxuqeh5kyatqg@awork3.anarazel.de
but isn't really related to the bug.
On 2023-06-27 17:44:57 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > That's not going to help you / the reporter, but to make issues like this
> > easier to debug in the future, I think we should
> > a) install an error context in load_libraries() printing the GUC name and
> > b) install an error context in internal_load_library() printing the name of
> > the shared library name
>
> +1. I'm not sure this specific issue justifies it, but it seems like that
> might make it easier to diagnose other shared-library-load-time issues
> as well.
Yea, this bug alone wouldn't have made me suggest it, but I've looked at
enough issues where I regretted not knowing what library caused and error
and/or what caused a library to be loaded that I think it's worth it.
I first added an error context to both the places mentioned above, but that
leads to annoyingly redundant messages. And I realized that it'd also be
useful to print a message when loading a library due to
load_external_function() - I've definitely wondered about errors triggered
below that in the past.
I ended up adding a reason enum and a detail const char* to
internal_load_library(). I don't like that approach a whole lot, but couldn't
really come up with something better.
Example errors that now have a context:
Error in _PG_init() of library called via shared_preload_libraries (error added by me):
FATAL: not today
CONTEXT: in "_PG_init()" callback of library "/tmp/meson-install/lib/x86_64-linux-gnu/postgresql/auto_explain.so"
library load for "shared_preload_libraries" parameter
or
FATAL: could not access file "dont_exist": No such file or directory
CONTEXT: library load for "shared_preload_libraries" parameter
Creating a C function referencing a library that needs to be loaded with
shared_preload_libraries:
=# CREATE FUNCTION frak()
RETURNS text IMMUTABLE STRICT
AS '/srv/dev/build/m/src/test/modules/test_slru/test_slru.so' LANGUAGE C;
ERROR: XX000: cannot load "test_slru" after startup
DETAIL: "test_slru" must be loaded with shared_preload_libraries.
CONTEXT: in "_PG_init()" callback of library "/srv/dev/build/m/src/test/modules/test_slru/test_slru.so"
library load for C function "frak"
LOAD of a non-postgres library:
=# LOAD '/usr/lib/libarmadillo.so.11';
ERROR: XX000: incompatible library "/usr/lib/libarmadillo.so.11": missing magic block
HINT: Extension libraries are required to use the PG_MODULE_MAGIC macro.
CONTEXT: library load for LOAD statement
Note that here the errcontext callback prints the reason for the library being
loaded, but not the library name. I made it so that the library name is only
printed during _PG_init(), otherwise it's always duplicating the primary error
message. Which looks messy - but perhaps it's more important to be
"predictable"?
I don't love "library load for ..." and played around with a few other
variants, but I didn't come up with anything particularly satisfying.
I was tempted to invent a separate "library load reason" for
fmgr_info_C_lang() and other uses of load_external_function(), but concluded
that that reaches diminishing-returns territory.
Is it worth adding tests for:
1) an error during shared_preload_libraries, local_preload_libraries, session_preload_libraries
2) loading a non-postgres library and hitting "missing magic block"
3) local_preload_libraries not being allowed to load libraries outside of plugins/?
4) session_preload_libraries being allowed to load libraries outside of plugins/?
?
Greetings,
Andres Freund