Re: pg_plan_advice - Mailing list pgsql-hackers
| From | Robert Haas |
|---|---|
| Subject | Re: pg_plan_advice |
| Date | |
| Msg-id | CA+Tgmob87qsWa-VugofU6epuV0H5XjWZGMbQas4Q-ADKmvSyBg@mail.gmail.com |
| In response to | Re: pg_plan_advice (Lukas Fittl <lukas@fittl.com>) |
| Responses | Re: pg_plan_advice |
| List | pgsql-hackers |
On Tue, Mar 31, 2026 at 10:25 PM Lukas Fittl <lukas@fittl.com> wrote:
> To be clear, I think it's okay to go ahead with merging pg_stash_advice
> without that and make it a best effort to get the file saving in too -
> but I think the current design in this patch represents a reasonable
> solution to what we can do in terms of persistence across restarts in
> either 19 or 20.

Somewhat against my better judgement, I have attempted to put your patch into committable shape. In the process, I rewrote pretty much the whole thing. So here's v24, also dropping pg_collect_advice.

0001 is the pg_stash_advice patch from v23, but with a number of changes motivated by your desire to add persistence. I split the code into two files, because I felt that file was starting to get a little bit large, and I didn't want to just add a whole bunch more stuff to it. For the most part, this is just code movement, but I did make a few substantive changes.

First, I restricted stash names to alphanumeric characters and underscores, so that they basically look like identifiers. This is partly because I got to thinking about the escaping requirements for the persistence file, but it's also because I realized that letting somebody name their stashes with spaces, non-printable characters, or a bunch of random punctuation was probably more confusing than useful.

Second, pgsa_set_advice_string() was previously taking a lock itself, but most of its sister functions require the caller to do that; I changed it to match.

Third, the lock is also now held when calling pgsa_clear_advice_string(), which may not be entirely necessary, but it seems safer and shouldn't cost anything meaningful.

0002 adds persistence. Here's a list of changes from your version:

- I changed the GUC name pg_stash_advice.save to pg_stash_advice.persist, since it controls both whether advice is saved automatically and also whether it's loaded automatically at startup time.
- I added a GUC pg_stash_advice.persist_interval, so that the interval between writes can be configured.

- Instead of a dump_requested flag, I added a pg_atomic_uint64 change_count. This avoids needing to take &pgsa_state->lock in LW_EXCLUSIVE mode. Even leaving that aside, I don't think a Boolean is adequate here. Your patch cleared the flag before dumping, but that means that if the act of dumping fails, you forget that it needed to be done. If you instead clear the flag after dumping, then you don't realize you need to do it again if any concurrent changes happen in the meantime.

- I set the restart time to BGW_DEFAULT_RESTART_INTERVAL rather than 0. Restarting the worker in a tight loop is a bad plan.

- I added a function pg_start_stash_advice_worker(). You could do something like add pg_stash_advice to session_preload_libraries, start using it, and use this to kick off the worker. Then eventually you can restart the server with pg_stash_advice moved to shared_preload_libraries.

- As you had coded it, any interrupt that jostled the worker would trigger an immediate write-out if one was pending. That behavior seems hard to explain and document, so I made it work more like autoprewarm, which always respects the configured interval even in case of interrupts.

- I added a mechanism to prevent the user from manually manipulating stashes or stash entries when persistence is enabled but before the dump file has been reloaded. Without this, reloading the dump file could error out if, for example, the user had already managed to recreate a stash with the same name as one that exists in the dump file.

- As you had coded it, data was inserted into the dynamic shared memory structures as it was read from the disk file. I felt that could produce rather odd behavior, especially in view of the lack of the lockout mechanism mentioned in the previous point. We might partially process the dump file and then die with some data loaded and other data not loaded, and other backends could see the partial results.
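To spell out the change_count idea a bit: the point is that a writer only bumps a counter, and the worker compares a snapshot of that counter against the value recorded at the last successful dump, so a failed dump is naturally retried and changes made mid-dump aren't lost. A minimal standalone sketch using C11 atomics in place of pg_atomic_uint64 (function names here are mine, not the patch's):

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Bumped by any backend whenever it modifies a stash. */
static atomic_uint_fast64_t change_count;

/* Worker-local: value of change_count as of the last successful dump. */
static uint64_t last_dumped_count;

void
note_stash_changed(void)
{
    atomic_fetch_add(&change_count, 1);
}

/*
 * Worker side: snapshot the counter and report whether a dump is needed.
 * The caller records the snapshot only after the dump succeeds, so a
 * failure leaves last_dumped_count alone and the dump is retried; a
 * concurrent change during the dump keeps the counts unequal, so the
 * worker knows to dump again.  A simple Boolean flag can't give you
 * both of those properties at once.
 */
bool
dump_needed(uint64_t *snapshot)
{
    *snapshot = atomic_load(&change_count);
    return *snapshot != last_dumped_count;
}

void
dump_succeeded(uint64_t snapshot)
{
    last_dumped_count = snapshot;
}
```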
  While the lockout mechanism by itself is sufficient to prevent other backends from seeing partially-loaded results, I felt uncomfortable about relying on it completely. It means we start consuming shared memory even before we know whether there's an error, and continue to consume it after we've died from an error; it also means we have a very hard dependency on the lockout mechanism, which really only works if there's only ever one load-and-save file and loading only happens at startup time. I felt it was better to slurp all the data into memory first, parse it completely, and only start making changes to shared memory if we don't find any problems, so I made it work that way. We replace the tabs and newlines that end fields and lines with \0 on the fly, so that we can just point into that buffer instead of having to pstrdup() anything. (Note that even if we stuck with your approach of something based on pg_get_line(), it would probably be better to use one of the other variants, e.g. pg_get_line_buf(), to avoid allocating new buffers constantly.)

- I completely reworked the string escaping. Your pgsa_escape_string() had a bug where the strpbrk() call didn't check for \r. Also, I didn't like the behavior of just ignoring a backslash when it was followed by end-of-string or something unexpected; I felt those should be errors. Given the decision to slurp the whole file, as mentioned in the previous point, it also made sense to do the unescaping in place, so that we didn't need to allocate additional memory. I particularly didn't like the decision to sometimes allocate memory and sometimes not: while it was economical in a sense, it meant that the memory consumption could be very different depending on how many entries needed escaping.

- I completely reworked the error reporting. Now, if we hit an error parsing the file (or doing anything else), we just signal an ERROR, and we rely on the fact that the postmaster will restart us.
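For clarity, the in-place unescaping I have in mind works roughly like this. This is only a sketch of the general shape (the helper name and the exact escape set are illustrative, not lifted from the patch); the key point is that a stray backslash is a hard error rather than being silently ignored, and the decoded string is written over the input so nothing extra is allocated:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Decode backslash escapes in "s" in place.  Returns false if a
 * backslash is followed by end-of-string or an unrecognized character;
 * the caller should treat that as a syntax error, not ignore it.
 * Because every escape sequence decodes to one byte, the output never
 * outgrows the input, so overwriting in place is safe.
 */
bool
unescape_in_place(char *s)
{
    char   *dst = s;

    for (char *src = s; *src != '\0'; src++)
    {
        if (*src != '\\')
        {
            *dst++ = *src;
            continue;
        }
        switch (*++src)
        {
            case 'n': *dst++ = '\n'; break;
            case 'r': *dst++ = '\r'; break;
            case 't': *dst++ = '\t'; break;
            case '\\': *dst++ = '\\'; break;
            default:
                return false;   /* covers backslash at end of string too */
        }
    }
    *dst = '\0';
    return true;
}
```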
  It's an explicit goal not to apply incremental changes when the file as a whole is not valid, which also means that we're only concerned with reporting the first error we detect; that also seems good for avoiding log spam. On the other hand, the error reports are more specific and detailed, and now all follow the same general pattern: "syntax error in file \"%s\" line %u: problem description here". (I was also not entirely happy with the fact that you could potentially call fprintf() lots of times before finally reporting an error. While you did save and restore errno around the calls to FreeFile() and unlink(), I think it makes the code hard to reason about; e.g. what if a later fprintf() call hit a different error than an earlier one?)

- I felt it was a bit odd to install a zero-length file, so I made the persistence mechanism remove the existing file if there's nothing to persist when it goes to write out. I am not totally sure this was the right call.

- I added a message when saving entries, symmetrical to the one you had for loading entries, and also some DEBUG1/2 messages in case someone needs more details.

- I added a TAP test. This isn't as comprehensive as it could be -- in particular, it doesn't cover all the possible error cases that can occur when trying to reload data from disk. I could add that, but it would mean stopping and restarting the server a bunch of times, and I wasn't sure it was a good idea to add that much overhead for a few lines of code coverage.

- Your documentation changes still said that the stash data would be "restored when the first session attaches after a server restart", but that doesn't sound like something a user will understand, and it also isn't what actually happens now that you've added the background worker. I rewrote this.

There's almost none of your code remaining at this point, but I listed you as a co-author for 0002. I think a case could be made for Author or for no listing at all. Let me know if you have an opinion.
And, please review!

--
Robert Haas
EDB: http://www.enterprisedb.com