Re: Printing backtrace of postgres processes - Mailing list pgsql-hackers
From | vignesh C |
---|---|
Subject | Re: Printing backtrace of postgres processes |
Date | |
Msg-id | CALDaNm3aBt1hw+Q=achUxfQgyu3b6DmdxewvMVx5DEikNayAFA@mail.gmail.com Whole thread Raw |
In response to | Re: Printing backtrace of postgres processes (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>) |
List | pgsql-hackers |
Thanks Bharath for your comments. On Mon, Feb 1, 2021 at 6:14 AM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Fri, Jan 29, 2021 at 7:10 PM vignesh C <vignesh21@gmail.com> wrote: > > > 4) How about following > > > + errmsg("must be a superuser to print backtrace > > > of backend process"))); > > > instead of > > > + errmsg("must be a superuser to print backtrace > > > of superuser query process"))); > > > > > > > Here the message should include superuser, we cannot remove it. Non > > super user can log non super user provided if user has permissions for > > it. > > > > > 5) How about following > > > errmsg("must be a member of the role whose backed > > > process's backtrace is being printed or member of > > > pg_signal_backend"))); > > > instead of > > > + errmsg("must be a member of the role whose > > > backtrace is being logged or member of pg_signal_backend"))); > > > > > > > Modified it. > > Maybe I'm confused here to understand the difference between > SIGNAL_BACKEND_NOSUPERUSER and SIGNAL_BACKEND_NOPERMISSION macros and > corresponding error messages. Some clarification/use case to know in > which scenarios we hit those error messages might help me. Did we try > to add test cases that show up these error messages for > pg_print_backtrace? If not, can we add? I have tested this manually: I have tested it manually, Here is the test I did: Create 2 users: create user test password 'test@123'; create user test1 password 'test@123'; Test1: Test print backtrace of a different user's session: ./psql -d postgres -U test psql (14devel) Type "help" for help. postgres=> select pg_backend_pid(); pg_backend_pid ---------------- 30142 (1 row) ------------------------------------------ ./psql -d postgres -U test1 psql (14devel) Type "help" for help. postgres=> select pg_print_backtrace(30142); ERROR: must be a member of the role whose backtrace is being logged or member of pg_signal_backend The above will be successful after: grant pg_signal_backend to test1; Test1: Test for non super user trying to print backtrace of a super user's session: ./psql -d postgres psql (14devel) Type "help" for help. postgres=# select pg_backend_pid(); pg_backend_pid ---------------- 30211 (1 row) ./psql -d postgres -U test1 psql (14devel) Type "help" for help. postgres=> select pg_print_backtrace(30211); ERROR: must be a superuser to print backtrace of superuser process I have not added any tests for this as we required 2 active sessions and I did not see any existing framework for this. This test should help in relating the behavior. > > Attached v5 patch has the fixes for the same. > > Thoughts? > > Thanks. Here are some comments on v5 patch: > > 1) typo - it's "process" not "porcess" + a backend porcess. For example: > Modified. > 2) select * from pg_print_backtrace(NULL); > postgres=# select proname, proisstrict from pg_proc where proname = > 'pg_print_backtrace'; > proname | proisstrict > --------------------+------------- > pg_print_backtrace | t > > See the documentation: > "proisstrict bool > > Function returns null if any call argument is null. In that case the > function won't actually be called at all. Functions that are not > “strict” must be prepared to handle null inputs." > So below PG_ARGISNUL check is not needed, you can remove that, because > pg_print_backtrace will not get called with null input. > int bt_pid = PG_ARGISNULL(0) ? -1 : PG_GETARG_INT32(0); > Modified. > 3) Can we just set results = true instead of PG_RETURN_BOOL(true); so > that it will be returned from PG_RETURN_BOOL(result); just for > consistency? > if (!SendProcSignal(bt_pid, PROCSIG_BACKTRACE_PRINT, > InvalidBackendId)) > PG_RETURN_BOOL(true); > else > ereport(WARNING, > (errmsg("failed to send signal to postmaster: %m"))); > } > > PG_RETURN_BOOL(result); > I felt existing is better as it will reduce one instruction of setting first and then returning. There are only 2 places from where we return. I felt we could directly return true or false. > 4) Below is what happens if I request for a backtrace of the > postmaster process? 1388210 is pid of postmaster. > postgres=# select * from pg_print_backtrace(1388210); > WARNING: PID 1388210 is not a PostgreSQL server process > pg_print_backtrace > -------------------- > f > > Does it make sense to have a postmaster's backtrace? If yes, can we > have something like below in sigusr1_handler()? > if (CheckPostmasterSignal(PMSIGNAL_BACKTRACE_EMIT)) > { > LogBackTrace(); > } > We had a discussion about this in [1] earlier and felt including this is not very useful. > 5) Can we have PROCSIG_PRINT_BACKTRACE instead of > PROCSIG_BACKTRACE_PRINT, just for readability and consistency with > function name? > PROCSIG_PRINT_BACKTRACE is better, I have modified it. > 6) I think it's not the postmaster that prints backtrace of the > requested backend and we don't send SIGUSR1 with > PMSIGNAL_BACKTRACE_EMIT to postmaster, I think it's the backends, upon > receiving SIGUSR1 with PMSIGNAL_BACKTRACE_EMIT signal, log their own > backtrace. Am I missing anything here? If I'm correct, we need to > change the below description and other places wherever we refer to > this description. > > The idea here is to implement & expose pg_print_backtrace function, internally > what this function does is, the connected backend will send SIGUSR1 signal by > setting PMSIGNAL_BACKTRACE_EMIT to the postmaster process. Once the process > receives this signal it will log the backtrace of the process. > Modified. > 7) Can we get the parallel worker's backtrace? IIUC it's possible. > Yes we can get the parallel worker's backtrace. As this design is driven based on CHECK_FOR_INTERRUPTS, it will be printed when CHECK_FOR_INTERRUPTS is called. > 8) What happens if a user runs pg_print_backtrace() on Windows or > MacOS or some other platform? Do you want to say something about other > platforms where gdb/addr2line doesn't exist? > +</programlisting> > + You can get the file name and line number by using gdb/addr2line in > + linux platforms, as a prerequisite users must ensure gdb/addr2line is > + already installed: > +<programlisting> > I have added the following: This feature will be available, if the installer was generated on a platform which had backtrace capturing capability. If not available it will return false by throwing the following warning "WARNING: backtrace generation is not supported by this installation" > 9) Can't we reuse set_backtrace with just adding a flag to > set_backtrace(ErrorData *edata, int num_skip, bool > is_print_backtrace_function), making it a non-static function and call > set_backtrace(NULL, 0, true)? > > void > set_backtrace(ErrorData *edata, int num_skip, bool is_print_backtrace_function) > { > StringInfoData errtrace; > ------- > ------- > if (is_print_backtrace_function) > elog(LOG_SERVER_ONLY, "current backtrace:%s", errtrace.data); > else > edata->backtrace = errtrace.data; > } > > I think it will be good if we do this, because we can avoid duplicate > code in set_backtrace and LogBackTrace. > Yes it is better, I have modified it to use the same function. > 10) I think it's "pg_signal_backend" instead of "pg_print_backtrace"? > + backtrace is being printed or the calling role has been granted > + <literal>pg_print_backtrace</literal>, however only superusers can > Modified. > 11) In the documentation added, isn't it good if we talk a bit about > in which scenarios users can use this function? For instance, > something like "Use pg_print_backtrace to know exactly where it's > currently waiting when a backend process hangs."? > Modified to include more details. [1] - https:://www.postgresql.org/message-id/1778088.1606026308%40sss.pgh.pa.us Attached v6 patch with the fixes. Thoughts? Regards, Vignesh
Attachment
pgsql-hackers by date: