Re: Misleading/inaccurate error message from pg_basebackup - Mailing list pgsql-bugs
From | Casey |
---|---|
Subject | Re: Misleading/inaccurate error message from pg_basebackup |
Date | |
Msg-id | C0560AD8-B681-46AD-8694-A210F6041A1B@osss.net |
In response to | Re: Misleading/inaccurate error message from pg_basebackup (Christoph Berg <myon@debian.org>) |
List | pgsql-bugs |
I didn't believe I had to mkdir; those were just test cases to illustrate the problem in isolation. I had been trying to reinitialize a node after replacing disks used for the data volume, using Patroni. When that failed due to a pg_basebackup error, it removed the data directory.

To be clear, I'm mounting separate volumes at:

/var/lib/postgresql
/var/lib/postgresql/wal

The data and wal directories are a couple levels under those:

/var/lib/postgresql/14/main
/var/lib/postgresql/wal/14/main

So when Patroni ran into a pg_basebackup error, it removed /var/lib/postgresql/14/main and /var/lib/postgresql/14. It also does not log the specific generated pg_basebackup command. As I couldn't tell why it was erroring, I tried to recreate that command myself based on the configuration and defaults. When I did that, I didn't think about specifying the path for it as /usr/lib/postgresql/14/bin as I ought to have, but just relied on what was in my path, which turned out to be the wrapper script.

The actual problem turned out to be that I thought I'd cleared out all the contents of the wal directory, but I'd inadvertently left a hidden file sitting in there.

Anyway, during the process of debugging this, I didn't have the database running, and the data directory did not exist. I wanted to look at pg_basebackup --help, and that would not work, throwing the error about the data directory not existing. I should have focused on the first part of the error message, that /var/lib/postgresql/14/main was not accessible, but instead I got distracted by the second part, telling me to fix the directory permissions on /var/lib/postgresql/14 by making it world-readable. Well, it didn't actually need to be world-readable, and we don't want it to be world-readable. Regardless, I tried making it world-readable, and was confused as to why pg_basebackup threw the same error message.
Once I created the /main subdirectory, ignoring the complaint about world-readability, I was able to get a different error that pointed me to the actual problem:

pg_basebackup: error: directory "/var/lib/postgresql/wal/14/main" exists but is not empty

The point is that the error I ran into when the data directory (/main) did not exist under /var/lib/postgresql/14 is incorrect, and it left me confused and wasting time wondering what was wrong rather than getting to the actual problem.

"please fix the directory permissions (/var/lib/postgresql/14/ should be world readable)" is misleading, as there was no need to follow that instruction and it distracted from the more relevant and correct message printed just before it ("/var/lib/postgresql/14/main is not accessible").

Furthermore, pg_basebackup --help should ideally work regardless of that, as does the upstream binary.

Hope this helps,

--
Casey

> On Jan 29, 2024, at 11:40 AM, Christoph Berg <myon@debian.org> wrote:
>
> Re: Casey
>> I thought that I addressed your inquiries as best as I was able. Can you please clarify any remaining questions?
>
> What did you do to make you believe that you had to "mkdir" in the
> first place?
>
> Also, please keep it on the list.
>
>>> On Jan 24, 2024, at 6:48 AM, Christoph Berg <myon@debian.org> wrote:
>>>
>>> Re: Casey Shobe
>>>> Below is pasted my initial message, which gives more context and detail. Let me know if anything is still inclear after this. The context is that I use Patroni to run a multi-node cluster, and WAL-G creates a hidden directory within the wal directory which I did not initially notice when I otherwise emptied it before reinitializing a node after replacing disk for the data volume. This led to a fair bit of time wasted looking for the wrong problem:
>>>
>>> I did reply to your initially message and all the questions are still
>>> open.
>>>
>>> Christoph
>
> Christoph
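
For the hidden-entry problem described in the message above, a minimal sketch of an emptiness check that also counts dotfiles may be useful before re-running a base backup. This is only an illustration in Python, not anything Patroni or pg_basebackup ships; the function names `split_entries` and `is_truly_empty` are hypothetical, and the directory passed on the command line is whatever WAL directory you want to inspect.

```python
import os
import sys


def split_entries(path):
    """Return (visible, hidden) entry names found directly under path.

    os.listdir() already includes dotfiles (it only omits "." and ".."),
    so nothing is skipped the way a plain `ls` would skip it.
    """
    entries = sorted(os.listdir(path))
    hidden = [name for name in entries if name.startswith(".")]
    visible = [name for name in entries if not name.startswith(".")]
    return visible, hidden


def is_truly_empty(path):
    """True only if path contains no entries at all, hidden or not."""
    visible, hidden = split_entries(path)
    return not visible and not hidden


if __name__ == "__main__" and len(sys.argv) > 1:
    # Example invocation (path taken from the message above):
    #   python check_empty.py /var/lib/postgresql/wal/14/main
    target = sys.argv[1]
    visible, hidden = split_entries(target)
    if is_truly_empty(target):
        print(f"{target} is empty")
    else:
        print(f"{target} is not empty; visible={visible} hidden={hidden}")
```

Listing the hidden names separately is a deliberate choice here: it makes a leftover dot-directory stand out even when the directory looks empty in a normal listing.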