Re: [PATCH] More docs on what to do and not do in extension code - Mailing list pgsql-hackers
From | Laurenz Albe |
---|---|
Subject | Re: [PATCH] More docs on what to do and not do in extension code |
Date | |
Msg-id | 07de8fed255f31aa2dfda8efbe839df463f143c6.camel@cybertec.at Whole thread Raw |
In response to | [PATCH] More docs on what to do and not do in extension code (Craig Ringer <craig.ringer@enterprisedb.com>) |
Responses |
Re: [PATCH] More docs on what to do and not do in extension code
|
List | pgsql-hackers |
On Mon, 2021-01-18 at 15:56 +0800, Craig Ringer wrote: > The attached patch expands the xfunc docs and bgworker docs a little, providing a starting point for developers > to learn how to do some common tasks the Postgres Way. I like these changes! Here is a review: + <para> + See <xref linkend="xfunc-shared-addin"/> for information on how to + request extension shared memory allocations, <literal>LWLock</literal>s, + etc. + </para> This doesn't sound very English to me, and I don't see the point in repeating parts of the enumeration. How about See ... for detailed information how to allocate these resources. + <para> + If a background worker needs to sleep or wait for activity it should Missing comma after "activity". + always use <link linkend="xfunc-sleeping-interrupts-blocking">PostgreSQL's + interupt-aware APIs</link> for the purpose. Do not <function>usleep()</function>, + <function>system()</function>, make blocking system calls, etc. + </para> "system" is not a verb. Suggestion: Do not use <function>usleep()</function>, <function>system()</function> or other blocking system calls. + <para> + The <filename>src/test/modules/worker_spi</filename> and + <filename>src/test/modules/test_shm_mq</filename> contain working examples + that demonstrates some useful techniques. </para> That is missing a noun in my opinion, I would prefer: The modules ... contain working examples ... + <sect1 id="bgworker-signals" xreflabel="Signal Handling in Background Workers"> + <title>Signal Handling in Background Workers</title> It is not a good idea to start a section in the middle of a documentation page. That will lead to a weird TOC entry at the top of the page. The better way to do this is to write a short introductory header and convert most of the first half of the page into another section, so that we end up with a page the has the introductory material and two TOC entries for the details. + The default signal handlers installed for background workers <emphasis>do + not interrupt queries or blocking calls into other postgres code</emphasis> <productname>PostgreSQL</productname>, not "postgres". Also, there is a comma missing at the end of the line. + so they are only suitable for simple background workers that frequently and + predictably return control to their main loop. If your worker uses the + default background worker signal handling it should call Another missing comma after "handling". + <function>HandleMainLoopInterrupts()</function> on each pass through its + main loop. + </para> + + <para> + Background workers that may make blocking calls into core PostgreSQL code + and/or run user-supplied queries should generally replace the default bgworker Please stick with "background worker", "bgworker" is too sloppy IMO. + signal handlers with the handlers used for normal user backends. This will + ensure that the worker will respond in a timely manner to a termination + request, query cancel request, recovery conflict interrupt, deadlock detector + request, etc. To install these handlers, before unblocking interrupts + run: The following would be more grammatical: To install these handlers, run the following before unblocking interrupts: + Then ensure that your main loop and any other code that could run for some + time contains <function>CHECK_FOR_INTERRUPTS();</function> calls. A + background worker using these signal handlers must use <link + linkend="xfunc-resource-management">PostgreSQL's resource management APIs + and callbacks</link> to handle cleanup rather than relying on control + returning to the main loop because the signal handlers may call There should be a comma before "because". + <function>proc_exit()</function> directly. This is recommended practice + for all types of extension code anyway. + </para> + + <para> + See the comments in <filename>src/include/miscadmin.h</filename> in the + postgres headers for more details on signal handling. + </para> "in the postgres headers" is redundant - at any rate, it should be "PostgreSQL". + Do not attempt to use C++ exceptions or Windows Structured Exception + Handling, and never call <function>exit()</function> directly. I am alright with this addition, but I think it would be good to link to <xref linkend="extend-cpp"/> from it. + <listitem id="xfunc-threading"> + <para> + Individual PostgreSQL backends are single-threaded. + Never call any PostgreSQL function or access any PostgreSQL-managed data + structure from a thread other than the main "PostgreSQL" should always have the <productname> tag. This applies to a lot of places in this patch. + thread. If at all possible your extension should not start any threads Comma after "possible". + and should only use the main thread. PostgreSQL generally uses subprocesses Hm. If the extension does not start threads, it automatically uses the main thread. I think that should be removed for clarity. + that coordinate over shared memory instead of threads - see + <xref linkend="bgworker"/>. It also uses signals and light-weight locks - but I think that you don't need to describe the coordination mechanisms here, which are explained in the link you added. + primitives like <function>WaitEventSetWait</function> where necessary. Any + potentially long-running loop should periodically call <function> + CHECK_FOR_INTERRUPTS()</function> to give PostgreSQL a chance to interrupt + the function in case of a shutdown request, query cancel, etc. This means Are there other causes than shutdown or query cancellation? At any rate, I am not fond of enumerations ending with "etc". + you should <function>sleep()</function> or <function>usleep()</function> You mean: "you should *not* use sleep()" + for any nontrivial amount of time - use <function>WaitLatch()</function> "—" would be better than "-". + or its variants instead. For details on interrupt handling see Comma after "handling". [...] + based cleanup. Your extension function could be terminated mid-execution ... could be terminated *in* mid-execution ... + by PostgreSQL if any function that it calls makes a + <function>CHECK_FOR_INTERRUPTS()</function> check. It may not "makes" sound kind of clumsy in my ears. + Spinlocks, latches, condition variables, and more. Details on all of these + is far outside the scope of this document, and the best reference is + the relevant source code. I don't think we have to add that last sentence. That holds for pretty much everything in this documentation. + <para> + Sometimes relation-based state management for extensions is not + sufficient to meet their needs. In that case the extension may need to Better: Sometimes, relation-based state management is not sufficient to meet the needs of an extension. + use PostgreSQL's shared-memory based inter-process communication + features, and should certainly do so instead of inventing its own or + trying to use platform level features. An extension may use + <link linkend="xfunc-shared-addin">"raw" shared memory requested from + the postmaster at startup</link> or higher level features like dynamic + shared memory segments (<acronym>DSM</acronym>), + dynamic shared areas (<acronym>DSA</acronym>), + or shared memory message queues (<acronym>shm_mq</acronym>). Examples + of the use of some these features can be found in the + <filename>src/test/modules/test_shm_mq/</filename> example extension. Others + can be found in various main PostgreSQL backend code. + </para> Instead of the last sentence, I'd prefer ... or other parts of the source code. + <listitem id="xfunc-relcache-syscache"> + <para> + Look up system catalogs and table information using the relcache and syscache How about "table metadata" rather than "table information"? + APIs (<function>SearchSysCache...</function>, + <function>relation_open()</function>, etc) rather than attempting to run + SQL queries to fetch the information. Ensure that your function holds + any necessary locks on the target objects. Take care not to make any calls ... holds *the* necessary locks ... + that could trigger cache invalidations while still accessing any + syscache cache entries, as this can cause subtle bugs. See Subtle? Perhaps "bugs that are hard to find". + <filename>src/backend/utils/cache/syscache.c</filename>, + <filename>src/backend/utils/cache/relcache.c</filename>, + <filename>src/backend/access/common/relation.c</filename> and their + headers for details. + </para> + </listitem> Attached is a new version that has my suggested changes, plus a few from Bharath Rupireddy (I do not agree with many of his suggestions). Yours, Laurenz Albe
Attachment
pgsql-hackers by date: