Thread: Adding the extension name to EData / log_line_prefix
Hi, It can be very useful to look at the log messages emitted by a larger number of postgres instances to see if anything unusual is happening. E.g. checking whether there are an increased number of internal, IO, corruption errors (and LOGs too, because we emit plenty bad things as LOG) . One difficulty is that extensions tend to not categorize their errors. But unfortunately errors in extensions are hard to distinguish from errors emitted by postgres. A related issue is that it'd be useful to be able to group log messages by extension, to e.g. see which extensions are emitting disproportionally many log messages. Therefore I'd like to collect the extension name in elog/ereport and add a matching log_line_prefix escape code. It's not entirely trivial to provide errfinish() with a parameter indicating the extension, but it's doable: 1) Have PG_MODULE_MAGIC also define a new variable for the extension name, empty at that point 2) In internal_load_library(), look up that new variable, and fill it with a, mangled, libname. 4) in elog.h, define a new macro depending on BUILDING_DLL (if it is set, we're in the server, otherwise an extension). In the backend itself, define it to NULL, otherwise to the variable created by PG_MODULE_MAGIC. 5) In elog/ereport/errsave/... pass this new variable to errfinish/errsave_finish. I've attached a *very rough* prototype of this idea. My goal at this stage was just to show that it's possible, not for the code to be in a reviewable state. Here's e.g. what this produces with log_line_prefix='%m [%E] ' 2024-05-13 13:50:17.518 PDT [postgres] LOG: database system is ready to accept connections 2024-05-13 13:50:19.138 PDT [cube] ERROR: invalid input syntax for cube at character 13 2024-05-13 13:50:19.138 PDT [cube] DETAIL: syntax error at or near "f" 2024-05-13 13:50:19.138 PDT [cube] STATEMENT: SELECT cube('foo'); 2024-05-13 13:43:07.484 PDT [postgres] LOG: database system is ready to accept connections 2024-05-13 13:43:11.699 PDT [hstore] ERROR: syntax error in hstore: unexpected end of string at character 15 2024-05-13 13:43:11.699 PDT [hstore] STATEMENT: SELECT hstore('foo'); It's worth pointing out that this, quite fundamentally, can only work when the log message is triggered directly by the extension. If the extension code calls some postgres function and that function then errors out, it'll be seens as being part of postgres. But I think that's ok - they're going to be properly errcode-ified etc. Thoughts? Greetings, Andres Freund
Attachment
On Mon, May 13, 2024 at 5:51 PM Andres Freund <andres@anarazel.de> wrote:
>
> Hi,
>
> It can be very useful to look at the log messages emitted by a larger number
> of postgres instances to see if anything unusual is happening. E.g. checking
> whether there are an increased number of internal, IO, corruption errors (and
> LOGs too, because we emit plenty bad things as LOG) . One difficulty is that
> extensions tend to not categorize their errors. But unfortunately errors in
> extensions are hard to distinguish from errors emitted by postgres.
>
> A related issue is that it'd be useful to be able to group log messages by
> extension, to e.g. see which extensions are emitting disproportionally many
> log messages.
>
> Therefore I'd like to collect the extension name in elog/ereport and add a
> matching log_line_prefix escape code.
>
I liked the idea ... It is very helpful for troubleshooting problems in production.
> It's not entirely trivial to provide errfinish() with a parameter indicating
> the extension, but it's doable:
>
> 1) Have PG_MODULE_MAGIC also define a new variable for the extension name,
> empty at that point
>
> 2) In internal_load_library(), look up that new variable, and fill it with a,
> mangled, libname.
>
> 4) in elog.h, define a new macro depending on BUILDING_DLL (if it is set,
> we're in the server, otherwise an extension). In the backend itself, define
> it to NULL, otherwise to the variable created by PG_MODULE_MAGIC.
>
> 5) In elog/ereport/errsave/... pass this new variable to
> errfinish/errsave_finish.
>
> the extension, but it's doable:
>
> 1) Have PG_MODULE_MAGIC also define a new variable for the extension name,
> empty at that point
>
> 2) In internal_load_library(), look up that new variable, and fill it with a,
> mangled, libname.
>
> 4) in elog.h, define a new macro depending on BUILDING_DLL (if it is set,
> we're in the server, otherwise an extension). In the backend itself, define
> it to NULL, otherwise to the variable created by PG_MODULE_MAGIC.
>
> 5) In elog/ereport/errsave/... pass this new variable to
> errfinish/errsave_finish.
>
Then every extension should define their own Pg_extension_filename, right?
> I've attached a *very rough* prototype of this idea. My goal at this stage was
> just to show that it's possible, not for the code to be in a reviewable state.
>
>
> Here's e.g. what this produces with log_line_prefix='%m [%E] '
>
> 2024-05-13 13:50:17.518 PDT [postgres] LOG: database system is ready to accept connections
> 2024-05-13 13:50:19.138 PDT [cube] ERROR: invalid input syntax for cube at character 13
> 2024-05-13 13:50:19.138 PDT [cube] DETAIL: syntax error at or near "f"
> 2024-05-13 13:50:19.138 PDT [cube] STATEMENT: SELECT cube('foo');
>
> 2024-05-13 13:43:07.484 PDT [postgres] LOG: database system is ready to accept connections
> 2024-05-13 13:43:11.699 PDT [hstore] ERROR: syntax error in hstore: unexpected end of string at character 15
> 2024-05-13 13:43:11.699 PDT [hstore] STATEMENT: SELECT hstore('foo');
>
>
Was not able to build your patch by simply:
./configure --prefix=/tmp/pg
...
make -j
...
/usr/bin/ld: ../../src/port/libpgport_srv.a(path_srv.o): warning: relocation against `Pg_extension_filename' in read-only section `.text'
/usr/bin/ld: access/brin/brin.o: in function `brininsert':
/data/src/pg/main/src/backend/access/brin/brin.c:403: undefined reference to `Pg_extension_filename'
/usr/bin/ld: access/brin/brin.o: in function `brinbuild':
/data/src/pg/main/src/backend/access/brin/brin.c:1107: undefined reference to `Pg_extension_filename'
/usr/bin/ld: access/brin/brin.o: in function `brin_summarize_range':
/data/src/pg/main/src/backend/access/brin/brin.c:1383: undefined reference to `Pg_extension_filename'
/usr/bin/ld: /data/src/pg/main/src/backend/access/brin/brin.c:1389: undefined reference to `Pg_extension_filename'
/usr/bin/ld: /data/src/pg/main/src/backend/access/brin/brin.c:1434: undefined reference to `Pg_extension_filename'
/usr/bin/ld: access/brin/brin.o:/data/src/pg/main/src/backend/access/brin/brin.c:1450: more undefined references to `Pg_extension_filename' follow
/usr/bin/ld: warning: creating DT_TEXTREL in a PIE
collect2: error: ld returned 1 exit status
make[2]: *** [Makefile:67: postgres] Error 1
make[2]: Leaving directory '/data/src/pg/main/src/backend'
make[1]: *** [Makefile:42: all-backend-recurse] Error 2
make[1]: Leaving directory '/data/src/pg/main/src'
make: *** [GNUmakefile:11: all-src-recurse] Error 2
/usr/bin/ld: access/brin/brin.o: in function `brininsert':
/data/src/pg/main/src/backend/access/brin/brin.c:403: undefined reference to `Pg_extension_filename'
/usr/bin/ld: access/brin/brin.o: in function `brinbuild':
/data/src/pg/main/src/backend/access/brin/brin.c:1107: undefined reference to `Pg_extension_filename'
/usr/bin/ld: access/brin/brin.o: in function `brin_summarize_range':
/data/src/pg/main/src/backend/access/brin/brin.c:1383: undefined reference to `Pg_extension_filename'
/usr/bin/ld: /data/src/pg/main/src/backend/access/brin/brin.c:1389: undefined reference to `Pg_extension_filename'
/usr/bin/ld: /data/src/pg/main/src/backend/access/brin/brin.c:1434: undefined reference to `Pg_extension_filename'
/usr/bin/ld: access/brin/brin.o:/data/src/pg/main/src/backend/access/brin/brin.c:1450: more undefined references to `Pg_extension_filename' follow
/usr/bin/ld: warning: creating DT_TEXTREL in a PIE
collect2: error: ld returned 1 exit status
make[2]: *** [Makefile:67: postgres] Error 1
make[2]: Leaving directory '/data/src/pg/main/src/backend'
make[1]: *** [Makefile:42: all-backend-recurse] Error 2
make[1]: Leaving directory '/data/src/pg/main/src'
make: *** [GNUmakefile:11: all-src-recurse] Error 2
> It's worth pointing out that this, quite fundamentally, can only work when the
> log message is triggered directly by the extension. If the extension code
> calls some postgres function and that function then errors out, it'll be seens
> as being part of postgres.
>
> But I think that's ok - they're going to be properly errcode-ified etc.
>
Hmmm, depending on the extension it can extensively call/use postgres code so would be nice if we can differentiate if the code is called from Postgres itself or from an extension.
Regards,
--
Fabrízio de Royes Mello
> log message is triggered directly by the extension. If the extension code
> calls some postgres function and that function then errors out, it'll be seens
> as being part of postgres.
>
> But I think that's ok - they're going to be properly errcode-ified etc.
>
Hmmm, depending on the extension it can extensively call/use postgres code so would be nice if we can differentiate if the code is called from Postgres itself or from an extension.
Regards,
--
Fabrízio de Royes Mello
Hi, On 2024-05-13 19:25:11 -0300, Fabrízio de Royes Mello wrote: > On Mon, May 13, 2024 at 5:51 PM Andres Freund <andres@anarazel.de> wrote: > > It's not entirely trivial to provide errfinish() with a parameter > indicating > > the extension, but it's doable: > > > > 1) Have PG_MODULE_MAGIC also define a new variable for the extension name, > > empty at that point > > > > 2) In internal_load_library(), look up that new variable, and fill it > with a, > > mangled, libname. > > > > 4) in elog.h, define a new macro depending on BUILDING_DLL (if it is set, > > we're in the server, otherwise an extension). In the backend itself, > define > > it to NULL, otherwise to the variable created by PG_MODULE_MAGIC. > > > > 5) In elog/ereport/errsave/... pass this new variable to > > errfinish/errsave_finish. > > > > Then every extension should define their own Pg_extension_filename, right? It'd be automatically set by postgres when loading libraries. > > I've attached a *very rough* prototype of this idea. My goal at this > stage was > > just to show that it's possible, not for the code to be in a reviewable > state. > > > > > > Here's e.g. what this produces with log_line_prefix='%m [%E] ' > > > > 2024-05-13 13:50:17.518 PDT [postgres] LOG: database system is ready to > accept connections > > 2024-05-13 13:50:19.138 PDT [cube] ERROR: invalid input syntax for cube > at character 13 > > 2024-05-13 13:50:19.138 PDT [cube] DETAIL: syntax error at or near "f" > > 2024-05-13 13:50:19.138 PDT [cube] STATEMENT: SELECT cube('foo'); > > > > 2024-05-13 13:43:07.484 PDT [postgres] LOG: database system is ready to > accept connections > > 2024-05-13 13:43:11.699 PDT [hstore] ERROR: syntax error in hstore: > unexpected end of string at character 15 > > 2024-05-13 13:43:11.699 PDT [hstore] STATEMENT: SELECT hstore('foo'); > > > > > > Was not able to build your patch by simply: Oh, turns out it only builds with meson right now. I forgot that, with autoconf, for some unknown reason, we only set BUILDING_DLL on some OSs. I attached a crude patch changing that. > > It's worth pointing out that this, quite fundamentally, can only work > when the > > log message is triggered directly by the extension. If the extension code > > calls some postgres function and that function then errors out, it'll be > seens > > as being part of postgres. > > > > But I think that's ok - they're going to be properly errcode-ified etc. > > > > Hmmm, depending on the extension it can extensively call/use postgres code > so would be nice if we can differentiate if the code is called from > Postgres itself or from an extension. I think that's not realistically possible. It's also very fuzzy what that'd mean. If there's a planner hook and then the query executes normally, what do you report for an execution time error? And even the simpler case - should use of pg_stat_statements cause everything within to be logged as a pg_stat_statement message? I think the best we can do is to actually say where the error is directly triggered from. Greetings, Andres Freund
Attachment
Andres Freund <andres@anarazel.de> writes: > On 2024-05-13 19:25:11 -0300, Fabrízio de Royes Mello wrote: >> Hmmm, depending on the extension it can extensively call/use postgres code >> so would be nice if we can differentiate if the code is called from >> Postgres itself or from an extension. > I think that's not realistically possible. It's also very fuzzy what that'd > mean. If there's a planner hook and then the query executes normally, what do > you report for an execution time error? And even the simpler case - should use > of pg_stat_statements cause everything within to be logged as a > pg_stat_statement message? Not to mention that there could be more than one extension on the call stack. I think tying this statically to the ereport call site is fine. The mechanism that Andres describes for sourcing the name seems a bit overcomplex though. Why not just allow/require each extension to specify its name as a constant string? We could force the matter by redefining PG_MODULE_MAGIC as taking an argument: PG_MODULE_MAGIC("hstore"); regards, tom lane
Hi, On 2024-05-13 19:11:53 -0400, Tom Lane wrote: > The mechanism that Andres describes for sourcing the name seems a bit > overcomplex though. Why not just allow/require each extension to > specify its name as a constant string? We could force the matter by > redefining PG_MODULE_MAGIC as taking an argument: > PG_MODULE_MAGIC("hstore"); Mostly because it seemed somewhat sad to require every extension to have version-specific ifdefs around that, particularly because it's not hard for us to infer. I think there might be other use cases for the backend to provide "extension scoped" information, FWIW. Even just providing the full path to the extension library could be useful. Greetings, Andres Freund
On Mon, May 13, 2024 at 07:11:53PM GMT, Tom Lane wrote: > Andres Freund <andres@anarazel.de> writes: > > On 2024-05-13 19:25:11 -0300, Fabrízio de Royes Mello wrote: > >> Hmmm, depending on the extension it can extensively call/use postgres code > >> so would be nice if we can differentiate if the code is called from > >> Postgres itself or from an extension. > > > I think that's not realistically possible. It's also very fuzzy what that'd > > mean. If there's a planner hook and then the query executes normally, what do > > you report for an execution time error? And even the simpler case - should use > > of pg_stat_statements cause everything within to be logged as a > > pg_stat_statement message? > > Not to mention that there could be more than one extension on the call > stack. I think tying this statically to the ereport call site is > fine. > > The mechanism that Andres describes for sourcing the name seems a bit > overcomplex though. Why not just allow/require each extension to > specify its name as a constant string? We could force the matter by > redefining PG_MODULE_MAGIC as taking an argument: > > PG_MODULE_MAGIC("hstore"); FTR there was a proposal at [1] some time ago that could be used for that need (and others), I thought it could be good to mention it just in case. That would obviously only work if all extensions uses that framework. [1] https://www.postgresql.org/message-id/flat/3207907.AWbSqkKDnR%40aivenronan
On 14.05.24 01:11, Tom Lane wrote: > The mechanism that Andres describes for sourcing the name seems a bit > overcomplex though. Why not just allow/require each extension to > specify its name as a constant string? We could force the matter by > redefining PG_MODULE_MAGIC as taking an argument: > > PG_MODULE_MAGIC("hstore"); We kind of already have something like this, for NLS. If you look for pg_bindtextdomain(TEXTDOMAIN) and ereport_domain(), this information already trickles into the vicinity of the error data. Maybe the same thing could just be used for this, by wiring up the macros a bit differently.
Peter Eisentraut <peter@eisentraut.org> writes: > On 14.05.24 01:11, Tom Lane wrote: >> The mechanism that Andres describes for sourcing the name seems a bit >> overcomplex though. Why not just allow/require each extension to >> specify its name as a constant string? We could force the matter by >> redefining PG_MODULE_MAGIC as taking an argument: >> PG_MODULE_MAGIC("hstore"); > We kind of already have something like this, for NLS. If you look for > pg_bindtextdomain(TEXTDOMAIN) and ereport_domain(), this information > already trickles into the vicinity of the error data. Maybe the same > thing could just be used for this, by wiring up the macros a bit > differently. Hmm, cute idea, but it'd only help for extensions that are NLS-enabled. Which I bet is a tiny fraction of the population. So far as I can find, we don't even document how to set up TEXTDOMAIN for an extension --- you have to cargo-cult the macro definition from some in-core extension. regards, tom lane
On 05/15/24 11:50, Tom Lane wrote: > Hmm, cute idea, but it'd only help for extensions that are > NLS-enabled. Which I bet is a tiny fraction of the population. > So far as I can find, we don't even document how to set up > TEXTDOMAIN for an extension --- you have to cargo-cult the But I'd bet, within the fraction of the population that does use it, it is already a short string that looks a whole lot like the name of the extension. So maybe enhancing the documentation and making it easy to set up would achieve much of the objective here. Could PGXS be made to supply the extension name as TEXTDOMAIN when building code that does not otherwise define it, and would that have any ill effect on the otherwise not-NLS-enabled code? Would the worst- case effect be a failed search for a nonexistent .mo file, followed by output of the untranslated message as before? At first glance, it appears elog will apply PG_TEXTDOMAIN("postgres") in an extension that does not otherwise define TEXTDOMAIN. But I assume the usual effect of that is already a failed lookup followed by output of the untranslated message, except in the case of the out-of-core extension using a message matching a PG_TEXTDOMAIN("postgres") translation. If that case is considered unexpected, or actively discouraged, perhaps defining TEXTDOMAIN in an otherwise not-NLS-enabled extension could be relatively painless. Regards, -Chap
Hi, On 2024-05-15 12:54:45 -0400, Chapman Flack wrote: > On 05/15/24 11:50, Tom Lane wrote: > > Hmm, cute idea, but it'd only help for extensions that are > > NLS-enabled. Which I bet is a tiny fraction of the population. > > So far as I can find, we don't even document how to set up > > TEXTDOMAIN for an extension --- you have to cargo-cult the > > But I'd bet, within the fraction of the population that does use it, > it is already a short string that looks a whole lot like the name > of the extension. So maybe enhancing the documentation and making it > easy to set up would achieve much of the objective here. The likely outcome would IMO be that some extensions will have the data, others not. Whereas inferring the information from our side will give you something reliable. But I also just don't think it's something that architecturally fits together that well. If we either had TEXTDOMAIN reliably set across extensions or it'd architecturally be pretty, I'd go for it, but imo it's neither. Greetings, Andres Freund
Andres Freund <andres@anarazel.de> writes: > On 2024-05-15 12:54:45 -0400, Chapman Flack wrote: >> But I'd bet, within the fraction of the population that does use it, >> it is already a short string that looks a whole lot like the name >> of the extension. So maybe enhancing the documentation and making it >> easy to set up would achieve much of the objective here. > The likely outcome would IMO be that some extensions will have the data, > others not. Whereas inferring the information from our side will give you > something reliable. > But I also just don't think it's something that architecturally fits together > that well. If we either had TEXTDOMAIN reliably set across extensions or it'd > architecturally be pretty, I'd go for it, but imo it's neither. There is one advantage over my suggestion of changing PG_MODULE_MAGIC: if we tell people to write PG_MODULE_MAGIC; #undef TEXTDOMAIN #define TEXTDOMAIN PG_TEXTDOMAIN("hstore") then that's 100% backwards compatible and they don't need any version-testing ifdef's. I still think that the kind of infrastructure Andres proposes is way overkill compared to the value, plus it's almost certainly going to have a bunch of platform-specific problems to solve. So I think Peter's thought is worth pursuing. regards, tom lane
On 05/15/24 13:45, Tom Lane wrote: > if we tell people to write > > PG_MODULE_MAGIC; > #undef TEXTDOMAIN > #define TEXTDOMAIN PG_TEXTDOMAIN("hstore") > > then that's 100% backwards compatible and they don't need any > version-testing ifdef's. OT for this thread, but related: supposing out-of-core extensions participate increasingly in NLS, would they really want to use the PG_TEXTDOMAIN macro? That munges the supplied domain name with PG's major version and .so version numbers. Were such versioning wanted for an out-of-core extension's message catalogs, wouldn't the extension's own versioning be better suited? Regards, -Chap
Hi, On 2024-05-15 13:45:30 -0400, Tom Lane wrote: > There is one advantage over my suggestion of changing PG_MODULE_MAGIC: > if we tell people to write > > PG_MODULE_MAGIC; > #undef TEXTDOMAIN > #define TEXTDOMAIN PG_TEXTDOMAIN("hstore") > > then that's 100% backwards compatible and they don't need any > version-testing ifdef's. > > I still think that the kind of infrastructure Andres proposes > is way overkill compared to the value, plus it's almost certainly > going to have a bunch of platform-specific problems to solve. Maybe I missing something here. Even adding those two lines to the extensions in core and contrib is going to end up being more lines than what I proposed? What portability issues do you forsee? We already look up the same symbol in all the shared libraries ("Pg_magic_func"), so we know that we can deal with duplicate function names. Are you thinking that somehow we'd end up with symbol interposition or something? Greetings, Andres Freund
Andres Freund <andres@anarazel.de> writes: > What portability issues do you forsee? We already look up the same symbol in > all the shared libraries ("Pg_magic_func"), so we know that we can deal with > duplicate function names. Are you thinking that somehow we'd end up with > symbol interposition or something? No, it's the dependence on the physical library file name that's bothering me. Maybe that won't be an issue, but I foresee requests like "would you please case-fold it" or "the extension-trimming rule isn't quite right", etc. regards, tom lane
On 15.05.24 17:50, Tom Lane wrote: >> We kind of already have something like this, for NLS. If you look for >> pg_bindtextdomain(TEXTDOMAIN) and ereport_domain(), this information >> already trickles into the vicinity of the error data. Maybe the same >> thing could just be used for this, by wiring up the macros a bit >> differently. > Hmm, cute idea, but it'd only help for extensions that are > NLS-enabled. Which I bet is a tiny fraction of the population. > So far as I can find, we don't even document how to set up > TEXTDOMAIN for an extension --- you have to cargo-cult the > macro definition from some in-core extension. Yeah, the whole thing is a bit mysterious, and we don't need to use the exact mechanism we have now. But abstractly, we should only have to specify the, uh, domain of the log messages once. Whether that is used for building a message catalog or tagging the server log, those are just two different downstream uses of the same piece of information.