Thread: [PATCH] initdb: Treat empty -U argument as unset username

[PATCH] initdb: Treat empty -U argument as unset username

From
Jianghua Yang
Date:
Hi hackers,

While working with `initdb`, I noticed that passing an empty string to the `-U` option (e.g., `initdb -U ''`) causes it to fail with a misleading error:
    

performing post-bootstrap initialization ... 2025-07-01 19:48:42.006 PDT [14888] FATAL:  role """ does not exist at character 72

2025-07-01 19:48:42.006 PDT [14888] STATEMENT:  

UPDATE pg_class   SET relacl = (SELECT array_agg(a.acl) FROM  (SELECT E'=r/""' as acl   UNION SELECT unnest(pg_catalog.acldefault(    CASE WHEN relkind = 'S' THEN 's'          ELSE 'r' END::"char",10::oid)) ) as a)   WHERE relkind IN ('r', 'v', 'm', 'S')  AND relacl IS NULL;
This happens because `initdb` accepts the empty string as a valid role name and attempts to use it as the database superuser, which is not intended and fails during bootstrap SQL.

I propose a small patch that treats an empty string passed to `-U` as if the option was not provided at all — falling back to the current system user, which is the documented and expected behavior when `-U` is omitted.

This change improves robustness and avoids confusing failure messages due to user input that is technically invalid but easy to produce (e.g., via scripting or argument quoting issues).

### Patch summary:

- Checks if the passed `username` is non-null but empty (`'\0'`)
- Replaces it with the effective system user in that case
- Keeps the logic consistent with the existing behavior when `-U` is omitted

Let me know if this approach seems reasonable or if you’d prefer we explicitly reject empty usernames with an error instead.

Patch attached.

Best regards,  
Jianghua Yang  

Attachment

Re: [PATCH] initdb: Treat empty -U argument as unset username

From
Jianghua Yang
Date:

git show 8e673801262c66af4a54837f63ff596407835c20


        effective_user = get_id();

-       if (strlen(username) == 0)

+       if (!username)

                username = effective_user;


The previous code already intended to treat a missing username as falling back to the system user. 
The check was changed from strlen(username) == 0 to !username, but this inadvertently stopped handling the empty-string case. This patch restores the original intent and makes the behavior consistent.

David G. Johnston <david.g.johnston@gmail.com> 于2025年7月1日周二 20:12写道:
On Tue, Jul 1, 2025 at 7:56 PM Jianghua Yang <yjhjstz@gmail.com> wrote:
Let me know if this approach seems reasonable or if you’d prefer we explicitly reject empty usernames with an error instead.


I'd rather we reject the ambiguous input.

David J.

Re: [PATCH] initdb: Treat empty -U argument as unset username

From
"David G. Johnston"
Date:
We try to stick with plain text and inline/bottom replies here.

On Tue, Jul 1, 2025 at 8:31 PM Jianghua Yang <yjhjstz@gmail.com> wrote:

git show 8e673801262c66af4a54837f63ff596407835c20


        effective_user = get_id();

-       if (strlen(username) == 0)

+       if (!username)

                username = effective_user;


The previous code already intended to treat a missing username as falling back to the system user. 
The check was changed from strlen(username) == 0 to !username, but this inadvertently stopped handling the empty-string case. This patch restores the original intent and makes the behavior consistent.


At this point I'd rather take advantage of this behaveing in the "doesn't work" category for the past 8 years, and thus all supported releases, and not change existing behavior (just improve the error message) rather than accept original intent.  Also, the amount of things it has to be consistent with is quite small and I'm pytr sure that some of those are also broken - encoding/pgdata/textsearch all exhibit the same pattern (xlog is the reverse so maybe ok...)

David J.








Re: [PATCH] initdb: Treat empty -U argument as unset username

From
Robert Treat
Date:
On Wed, Jul 2, 2025 at 12:01 AM David G. Johnston
<david.g.johnston@gmail.com> wrote:
> On Tue, Jul 1, 2025 at 8:31 PM Jianghua Yang <yjhjstz@gmail.com> wrote:
>>
>> git show 8e673801262c66af4a54837f63ff596407835c20
>>
>>
>>         effective_user = get_id();
>>
>> -       if (strlen(username) == 0)
>>
>> +       if (!username)
>>
>>                 username = effective_user;
>>
>>
>> The previous code already intended to treat a missing username as falling back to the system user.
>> The check was changed from strlen(username) == 0 to !username, but this inadvertently stopped handling the
empty-stringcase. This patch restores the original intent and makes the behavior consistent. 
>>>
>>>
>
> At this point I'd rather take advantage of this behaveing in the "doesn't work" category for the past 8 years, and
thusall supported releases, and not change existing behavior (just improve the error message) rather than accept
originalintent.  Also, the amount of things it has to be consistent with is quite small and I'm pytr sure that some of
thoseare also broken - encoding/pgdata/textsearch all exhibit the same pattern (xlog is the reverse so maybe ok...) 
>

FWIW, I tend to agree with David; I feel like if a user passes in -U,
there was probably a reason, and a good error message would be more
useful in clarifying things rather than blindly pushing forward with
potentially the wrong thing.


Robert Treat
https://xzilla.net



Re: [PATCH] initdb: Treat empty -U argument as unset username

From
Daniel Gustafsson
Date:
> On 2 Jul 2025, at 15:52, Jianghua Yang <yjhjstz@gmail.com> wrote:
>
> Hi hackers,
>
> Based on the suggestion that we should explicitly reject empty usernames instead of silently falling back, I’ve
updatedthe patch accordingly. 
>
> ### Changes in v2:
>
> - `initdb` now errors out immediately if the `-U` or `--username` argument is an empty string.
> - The error message is:
>
>       superuser name must not be empty
>
> - A regression test is added to `src/bin/initdb/t/001_initdb.pl` to verify that the case `initdb -U ''` fails as
expected.
>
> This approach avoids any ambiguity about whether an empty username is valid, and fails early with a clear message. It
alsobrings consistency with existing checks, such as the one disallowing superuser names starting with `pg_`. 
>
> Let me know if this looks acceptable or if further refinement is needed.

+    pg_log_error("superuser name must not be empty");
+    exit(1);

I would prefer pg_fatal() for this.

--
Daniel Gustafsson




Re: [PATCH] initdb: Treat empty -U argument as unset username

From
Dagfinn Ilmari Mannsåker
Date:
Jianghua Yang <yjhjstz@gmail.com> writes:

> - A regression test is added to `src/bin/initdb/t/001_initdb.pl` to verify
> that the case `initdb -U ''` fails as expected.
[ ... ]
> diff --git a/src/bin/initdb/t/001_initdb.pl b/src/bin/initdb/t/001_initdb.pl
> index 15dd10ce40a..67eb53064f6 100644
> --- a/src/bin/initdb/t/001_initdb.pl
> +++ b/src/bin/initdb/t/001_initdb.pl
> @@ -37,6 +37,10 @@ command_fails(
>  command_fails([ 'initdb', '--username' => 'pg_test', $datadir ],
>      'role names cannot begin with "pg_"');
>  
> +command_fails(
> +    [ 'initdb', '-U', '', $datadir ],
> +    'empty username not allowed');
> +

This only tests that it fails, not that it fails as expected.  It should
use command_fails_like() to check that stderr contains the expected
error.  Also, it shoud use => between the -U option and its argument, as
seen in the above test with --username.

- ilmari



Re: [PATCH] initdb: Treat empty -U argument as unset username

From
Jianghua Yang
Date:
Thanks for the feedback!

I've updated the test to use `command_fails_like()` instead of `command_fails()`, so it now asserts that the error message matches the expected stderr output. 
I also changed the test invocation to use the `-U => ''` syntax for consistency, as seen in the adjacent `--username` test.
   

Let me know if any further adjustments are needed.

Best regards,  
Jianghua Yang

Dagfinn Ilmari Mannsåker <ilmari@ilmari.org> 于2025年7月2日周三 07:09写道:
Jianghua Yang <yjhjstz@gmail.com> writes:

> - A regression test is added to `src/bin/initdb/t/001_initdb.pl` to verify
> that the case `initdb -U ''` fails as expected.
[ ... ]
> diff --git a/src/bin/initdb/t/001_initdb.pl b/src/bin/initdb/t/001_initdb.pl
> index 15dd10ce40a..67eb53064f6 100644
> --- a/src/bin/initdb/t/001_initdb.pl
> +++ b/src/bin/initdb/t/001_initdb.pl
> @@ -37,6 +37,10 @@ command_fails(
>  command_fails([ 'initdb', '--username' => 'pg_test', $datadir ],
>       'role names cannot begin with "pg_"');

> +command_fails(
> +     [ 'initdb', '-U', '', $datadir ],
> +     'empty username not allowed');
> +

This only tests that it fails, not that it fails as expected.  It should
use command_fails_like() to check that stderr contains the expected
error.  Also, it shoud use => between the -U option and its argument, as
seen in the above test with --username.

- ilmari
Attachment

Re: [PATCH] initdb: Treat empty -U argument as unset username

From
Peter Eisentraut
Date:
On 02.07.25 04:55, Jianghua Yang wrote:
> While working with `initdb`, I noticed that passing an empty string to 
> the `-U` option (e.g., `initdb -U ''`) causes it to fail with a 
> misleading error:
> 
> performing post-bootstrap initialization ... 2025-07-01 19:48:42.006 PDT 
> [14888] FATAL:role """ does not exist at character 72
> 
> 2025-07-01 19:48:42.006 PDT [14888] STATEMENT:
> 
> UPDATE pg_class SET relacl = (SELECT array_agg(a.acl) FROM(SELECT 
> E'=r/""' as acl UNION SELECT unnest(pg_catalog.acldefault(CASE WHEN 
> relkind = 'S' THEN 's'ELSE 'r' END::"char",10::oid)) ) as a) WHERE 
> relkind IN ('r', 'v', 'm', 'S')AND relacl IS NULL;
> 
> This happens because `initdb` accepts the empty string as a valid role 
> name and attempts to use it as the database superuser, which is not 
> intended and fails during bootstrap SQL.

I'll start by saying, of course an empty user name isn't going to work, 
so we should reject it.

But let's dig a little deeper into why it fails.  Observe the error:

     FATAL:role """ does not exist at character 72

It thinks that the role name is `"` (a sole double-quote, not empty!). 
Why is that?

This error comes from the literal

    E'=r/""'

interpreted as an aclitem value.  The aclitem parsing ends up in getid() 
in src/backend/utils/adt/acl.c, which thinks that an input string 
consisting entirely of "" is an escaped double quote.

Maybe it's worth fixing that, and making putid() also print empty user 
names correspondingly.

Alternatively, it's the fault of initdb that it constructs aclitem 
values that don't follow the aclitem-specific quoting rules.

Another thought is, if we don't allow zero-length names, shouldn't 
namein() reject empty input strings?  Then this whole thing would fail 
as postgres.bki is being loaded.  (This is more hypothetical, since this 
appears to break a number of other things.)

All of this is to say, it's worth looking at the actual cause and think 
about if there are related problems, maybe other name patterns that we 
don't handle well, instead of just papering over it at the top level.




Re: [PATCH] initdb: Treat empty -U argument as unset username

From
Jianghua Yang
Date:
Hi Peter,

Thanks for your detailed analysis. I appreciate you digging deeper into the root cause.

For this patch, I'd like to keep the changes to `initdb` minimal and focused on rejecting empty usernames, as that seems to be the consensus from the previous discussion.

I'll be happy to discuss the `getid()` and `aclitem` parsing behavior in a separate thread.

Best regards,
Jianghua Yang


Peter Eisentraut <peter@eisentraut.org> 于2025年7月2日周三 07:39写道:
On 02.07.25 04:55, Jianghua Yang wrote:
> While working with `initdb`, I noticed that passing an empty string to
> the `-U` option (e.g., `initdb -U ''`) causes it to fail with a
> misleading error:
>
> performing post-bootstrap initialization ... 2025-07-01 19:48:42.006 PDT
> [14888] FATAL:role """ does not exist at character 72
>
> 2025-07-01 19:48:42.006 PDT [14888] STATEMENT:
>
> UPDATE pg_class SET relacl = (SELECT array_agg(a.acl) FROM(SELECT
> E'=r/""' as acl UNION SELECT unnest(pg_catalog.acldefault(CASE WHEN
> relkind = 'S' THEN 's'ELSE 'r' END::"char",10::oid)) ) as a) WHERE
> relkind IN ('r', 'v', 'm', 'S')AND relacl IS NULL;
>
> This happens because `initdb` accepts the empty string as a valid role
> name and attempts to use it as the database superuser, which is not
> intended and fails during bootstrap SQL.

I'll start by saying, of course an empty user name isn't going to work,
so we should reject it.

But let's dig a little deeper into why it fails.  Observe the error:

     FATAL:role """ does not exist at character 72

It thinks that the role name is `"` (a sole double-quote, not empty!).
Why is that?

This error comes from the literal

    E'=r/""'

interpreted as an aclitem value.  The aclitem parsing ends up in getid()
in src/backend/utils/adt/acl.c, which thinks that an input string
consisting entirely of "" is an escaped double quote.

Maybe it's worth fixing that, and making putid() also print empty user
names correspondingly.

Alternatively, it's the fault of initdb that it constructs aclitem
values that don't follow the aclitem-specific quoting rules.

Another thought is, if we don't allow zero-length names, shouldn't
namein() reject empty input strings?  Then this whole thing would fail
as postgres.bki is being loaded.  (This is more hypothetical, since this
appears to break a number of other things.)

All of this is to say, it's worth looking at the actual cause and think
about if there are related problems, maybe other name patterns that we
don't handle well, instead of just papering over it at the top level.

Peter Eisentraut <peter@eisentraut.org> writes:
> ... The aclitem parsing ends up in getid() 
> in src/backend/utils/adt/acl.c, which thinks that an input string 
> consisting entirely of "" is an escaped double quote.

Yeah, that is definitely broken, and also it occurs to me that this
coding is not robust in the face of non-ASCII identifiers (since
the behavior of isalnum is unportable in that case).  I've posted
a draft patch for that in a separate thread [1].

> Maybe it's worth fixing that, and making putid() also print empty user 
> names correspondingly.

putid() prints an empty name as an empty string, which seems fine
at least at this level.

> Alternatively, it's the fault of initdb that it constructs aclitem 
> values that don't follow the aclitem-specific quoting rules.

With the patch at [1], initdb -U '' still fails at the same place,
but now with

FATAL:  a name must follow the "/" sign at character 72

which is a consequence of getid()'s output not distinguishing
"I found nothing" from "I found an empty string".  Now if we
cared to support empty identifiers, we could make some more
revisions here so that those were consistently represented as
"" rather than nothing.  But I fail to see that that's a useful
expenditure of time: we're never going to allow empty names.

> Another thought is, if we don't allow zero-length names, shouldn't 
> namein() reject empty input strings?

I'm inclined to think not.  It's introducing too much of a gotcha
for too little return.  As a perhaps-not-quite-exact analogy,
NULL::name also doesn't correspond to any identifier you can spell
in SQL; but we're not going to try to forbid that value.  So there
is at least one value of type "name" that isn't valid in SQL, and
I don't see why ''::name can't be another.

            regards, tom lane

[1] https://www.postgresql.org/message-id/3792884.1751492172%40sss.pgh.pa.us