Thread: Binding Postgres to port 0 for testing

Binding Postgres to port 0 for testing

From
Markus Pilman
Date:
Hi all,

I am building a simple integration test framework for an application that I am building. For this project I am planning to use PostgreSQL.

For testing purposes I'd like to always start with an empty database, populate data, and, if the test was successful, delete everything. These tests are single process tests.

I'd like to run many processes in parallel and have one postgres server process running for each. I realize that I could use one and use multiple databases but I don't want to do this for a variety of reasons (one being that I also want to test the control plane logic, the application is multi-tenant and uses a database per tenant, having separate databases simplifies debugging etc).

Now the problem is that I need to find a TCP port for each running postgres instance. There's multiple ways to do this, but by far the easiest one I know is to bind to port 0. So my plan was to start postgres with "-p 0" and then parse stdout to figure out which port it actually uses. But that doesn't seem to work:

 >postgres -D data/ -p 0
 2023-03-25 16:39:54.271 GMT [13924] FATAL:  0 is outside the valid range for parameter "port" (1 .. 65535)

Is there a reason this is not allowed? What would be the recommended way of addressing my issue?

Best Markus

Re: Binding Postgres to port 0 for testing

From
Thorsten Glaser
Date:
Hi Markus,

>I am building a simple integration test framework for an application that I
>am building. For this project I am planning to use PostgreSQL.

check pg_virtualenv(1), which apparently is a Debian thing.
It auto-creates and auto-deletes at the end, if desired, a
cluster and runs your program with suitable environment
variables to access that cluster, which includes choosing
an available port. This was in fact made for integration
testing, though I use it for quickly spinning up throwaway
DBs for one-shot tools as well.

bye,
//mirabilos
--
Infrastrukturexperte • tarent solutions GmbH
Am Dickobskreuz 10, D-53121 Bonn • http://www.tarent.de/
Telephon +49 228 54881-393 • Fax: +49 228 54881-235
HRB AG Bonn 5168 • USt-ID (VAT): DE122264941
Geschäftsführer: Dr. Stefan Barth, Kai Ebenrett, Boris Esser, Alexander Steeg

                        ****************************************************
/⁀\ The UTF-8 Ribbon
╲ ╱ Campaign against      Mit dem tarent-Newsletter nichts mehr verpassen:
 ╳  HTML eMail! Also,     https://www.tarent.de/newsletter
╱ ╲ header encryption!
                        ****************************************************



Re: Binding Postgres to port 0 for testing

From
Erik Wienhold
Date:
> On 25/03/2023 18:01 CET Markus Pilman <markus@pilman.ch> wrote:
>
> I am building a simple integration test framework for an application that I
> am building. For this project I am planning to use PostgreSQL.
>
> For testing purposes I'd like to always start with an empty database,
> populate data, and, if the test was successful, delete everything. These
> tests are single process tests.
>
> I'd like to run many processes in parallel and have one postgres server
> process running for each. I realize that I could use one and use multiple
> databases but I don't want to do this for a variety of reasons (one being
> that I also want to test the control plane logic, the application is
> multi-tenant and uses a database per tenant, having separate databases
> simplifies debugging etc).
>
> Now the problem is that I need to find a TCP port for each running postgres
> instance. There's multiple ways to do this, but by far the easiest one I
> know is to bind to port 0. So my plan was to start postgres with "-p 0" and
> then parse stdout to figure out which port it actually uses. But that
> doesn't seem to work:
>
> >postgres -D data/ -p 0
>  2023-03-25 16:39:54.271 GMT [13924] FATAL: 0 is outside the valid range for parameter "port" (1 .. 65535)
>
> What would be the recommended way of addressing my issue?

I would try to start Postgres with every port number in a for loop starting
with port number 1024.  The first one that works is your port number.  And you
may not even have to parse stdout if you can pass that port number to your tests.

Maybe you can also use pg_virtualenv[0] from Debian's postgresql-common.  It
tries every port number starting from 5432.

[0] https://manpages.debian.org/testing/postgresql-common/pg_virtualenv.1.en.html

--
Erik



Re: Binding Postgres to port 0 for testing

From
Markus Pilman
Date:
Thanks for the suggestions. I didn't know about pg_virtualenv, that's interesting. Though it seems to achieve something similar as to containerize the test (potentially in a more platform independent way). Though it seems pg_virtualenv is mostly doing what my test driver is currently doing. Trying out the ports is obviously possible, but it seems a bit hacky to me (though if there's no better way I don't think that's a good show-stopper).

But I am still wondering: Is there a reason PostgreSQL doesn't allow me to bind against port 0? I understand that in a production environment this is almost never the thing you want to do, but I wouldn't consider this option very dangerous.

Best Markus


On Sat, Mar 25, 2023 at 12:10 PM Erik Wienhold <ewie@ewie.name> wrote:
> On 25/03/2023 18:01 CET Markus Pilman <markus@pilman.ch> wrote:
>
> I am building a simple integration test framework for an application that I
> am building. For this project I am planning to use PostgreSQL.
>
> For testing purposes I'd like to always start with an empty database,
> populate data, and, if the test was successful, delete everything. These
> tests are single process tests.
>
> I'd like to run many processes in parallel and have one postgres server
> process running for each. I realize that I could use one and use multiple
> databases but I don't want to do this for a variety of reasons (one being
> that I also want to test the control plane logic, the application is
> multi-tenant and uses a database per tenant, having separate databases
> simplifies debugging etc).
>
> Now the problem is that I need to find a TCP port for each running postgres
> instance. There's multiple ways to do this, but by far the easiest one I
> know is to bind to port 0. So my plan was to start postgres with "-p 0" and
> then parse stdout to figure out which port it actually uses. But that
> doesn't seem to work:
>
> >postgres -D data/ -p 0
>  2023-03-25 16:39:54.271 GMT [13924] FATAL: 0 is outside the valid range for parameter "port" (1 .. 65535)
>
> What would be the recommended way of addressing my issue?

I would try to start Postgres with every port number in a for loop starting
with port number 1024.  The first one that works is your port number.  And you
may not even have to parse stdout if you can pass that port number to your tests.

Maybe you can also use pg_virtualenv[0] from Debian's postgresql-common.  It
tries every port number starting from 5432.

[0] https://manpages.debian.org/testing/postgresql-common/pg_virtualenv.1.en.html

--
Erik

Re: Binding Postgres to port 0 for testing

From
Erik Wienhold
Date:
> On 25/03/2023 20:10 CET Markus Pilman <markus@pilman.ch> wrote:
>
> Thanks for the suggestions. I didn't know about pg_virtualenv, that's
> interesting. Though it seems to achieve something similar as to containerize
> the test (potentially in a more platform independent way). Though it seems
> pg_virtualenv is mostly doing what my test driver is currently doing. Trying
> out the ports is obviously possible, but it seems a bit hacky to me (though
> if there's no better way I don't think that's a good show-stopper).

You can of course also use Docker and have it map port 5432 to a random host
port.  Use docker-port to find the mapped host port:

    docker port CONTAINER 5432/tcp

Testcontainers may also be an option if you want to use Docker:

* https://www.testcontainers.org/modules/databases/postgres/
* https://testcontainers-python.readthedocs.io/en/latest/postgres/README.html

> But I am still wondering: Is there a reason PostgreSQL doesn't allow me to
> bind against port 0? I understand that in a production environment this is
> almost never the thing you want to do, but I wouldn't consider this option
> very dangerous.

One reason for not allowing port zero is Postgres' naming convention of Unix
domain sockets.  The port number is included in the socket filename.

https://www.postgresql.org/docs/current/runtime-config-connection.html#GUC-UNIX-SOCKET-DIRECTORIES

Accepting port zero for a Unix domain socket would not behave the same as
binding a TCP socket to port zero.

Another benefit is that the bound port number is available through the config.
Postgres does not have to keep track of any "random" port number picked by the
operating system.

--
Erik



Re: Binding Postgres to port 0 for testing

From
Michael Paquier
Date:
On Sat, Mar 25, 2023 at 11:01:33AM -0600, Markus Pilman wrote:
> Now the problem is that I need to find a TCP port for each running postgres
> instance. There's multiple ways to do this, but by far the easiest one I
> know is to bind to port 0. So my plan was to start postgres with "-p 0" and
> then parse stdout to figure out which port it actually uses. But that
> doesn't seem to work:

Note that you can find some inspiration about that in the code tree
within src/test/perl/PostgreSQL/Test/Cluster.pm, particularly
get_free_port(), where we have now accumulated a couple of years of
experience in designing something that's rather safe, even if it comes
with its own limits.  It is in perl so perhaps you could just reuse it
rather than reinvent the wheel?  Of course, still it should not be
complicated to translate that in a different language, but there may
be no need to reinvent the wheel.  And seeing your first message with
the requirements you list, this does what you are looking for:
- Create an empty cluster.
- Freely create databases, tablespaces, queries, etc.
- Wipe out the whole.

The test cases around src/test/recovery/t/ could be a good starting
point, as well.
--
Michael

Attachment

Re: Binding Postgres to port 0 for testing

From
Markus Pilman
Date:
Thanks Michael and Erik for the help, I really appreciate it!

Thanks for explaining the context why PostgreSQL doesn't allow binding against port 0.

I somehow didn't consider looking at the postgres tests, though it makes sense that they need to solve this problem. If I read the perl code correctly though it seems that this could, in theory, cause a race? The script checks first whether the port has been assigned to a test, then binds a socket to check whether it is used by someone else, closes this test socker, and then starts a server process. I guess it's unlikely enough, but isn't there a risk that some other process (that isn't controlled by this perl script) binds to the found port right after this test bind but right before postgres calls bind? I guess it should be rare enough so that it wouldn't cause flaky tests.

I decided to implement the following (this strategy works, though it might be a bit brittle if PostgreSQL changes the error output format in the future):
1. Loop, starting from port 5432, incrementing each iteration
2. Start postgres with the given port
3. Parse the output to check whether postgres either writes a line that ends with "could not create any TCP/IP sockets" (in which case I continue) or with "database system is ready to accept connections" (in which case I break).

This is definitely not the most elegant solution, but it should do for now. At the moment I want to be able to set up everything in one process. In my experience this makes debugging problems a bit easier but comes at the cost of a more complex test driver (I recognize that it is a bit weird that the application layer initializes the runtime environment in this case).

Also, this is a hobby-project and I am more interested in fun learning than reducing work :) Generally I would agree that reusing existing and testing code to run this would be better unless there's a really good reason not to do that.

On Sun, Mar 26, 2023 at 7:27 PM Michael Paquier <michael@paquier.xyz> wrote:
On Sat, Mar 25, 2023 at 11:01:33AM -0600, Markus Pilman wrote:
> Now the problem is that I need to find a TCP port for each running postgres
> instance. There's multiple ways to do this, but by far the easiest one I
> know is to bind to port 0. So my plan was to start postgres with "-p 0" and
> then parse stdout to figure out which port it actually uses. But that
> doesn't seem to work:

Note that you can find some inspiration about that in the code tree
within src/test/perl/PostgreSQL/Test/Cluster.pm, particularly
get_free_port(), where we have now accumulated a couple of years of
experience in designing something that's rather safe, even if it comes
with its own limits.  It is in perl so perhaps you could just reuse it
rather than reinvent the wheel?  Of course, still it should not be
complicated to translate that in a different language, but there may
be no need to reinvent the wheel.  And seeing your first message with
the requirements you list, this does what you are looking for:
- Create an empty cluster.
- Freely create databases, tablespaces, queries, etc.
- Wipe out the whole.

The test cases around src/test/recovery/t/ could be a good starting
point, as well.
--
Michael

Re: Binding Postgres to port 0 for testing

From
Michael Paquier
Date:
On Sun, Mar 26, 2023 at 10:49:33PM -0600, Markus Pilman wrote:
> I somehow didn't consider looking at the postgres tests, though it makes
> sense that they need to solve this problem. If I read the perl code
> correctly though it seems that this could, in theory, cause a race? The
> script checks first whether the port has been assigned to a test, then
> binds a socket to check whether it is used by someone else, closes this
> test socker, and then starts a server process. I guess it's unlikely
> enough, but isn't there a risk that some other process (that isn't
> controlled by this perl script) binds to the found port right after this
> test bind but right before postgres calls bind? I guess it should be rare
> enough so that it wouldn't cause flaky tests.

In theory, yes, I recall that's possible in the scripts.  But only on
Windows where we cannot use socket directories and rely on SSPI
authentication while binding all the nodes to listen to 127.0.0.1.  On
all the other platforms, each test creates its own directory that will
be used as path for unix_socket_directories.  umask is 0700 and owned
by the OS user running the perl tests, so it is isolated as much as it
can be.  Using the same port should be actually fine as far as I
recall, as long as the unix socket paths are different.
--
Michael

Attachment