Thread: People obsessed with docker - how can I help?
Dear friends,

Recently I started helping some friends with their PostgreSQL systems. They run production on XLD, some dev environments at the OS level, and Docker in others. They seem dead sure that Docker is the way to go for both the DB and the app, in dev/staging as well as in production.

I am trying to convince them that Docker may be fine for distributing their app (Java, Tomcat) or for a dev setup, although I have seen cases where someone needs to mess around with Java threads from within the OS, or make persistent changes to the Java app config from the app or the admin interface. For PostgreSQL, however, it will be a headache if they later want to do any serious tasks with their DB: new extensions, custom C functions, pg_upgrade, etc.

They use the PostgreSQL Docker image the way I would use e.g. an SQLite or even a MySQL package: use once, then throw away. Of course this is just me, and today MySQL is far from the toy it used to be in 2000; I am just trying to give you the idea. They don't want to deal with the DB; they want to see it as a black box doing some boring DB stuff and then forget about it.

I explained to them that multi-billion companies might deploy Kubernetes for PostgreSQL and achieve many 9's of high availability, but those are very serious installations running a well-supported Kubernetes operator that they can rely and depend upon; they need to respect its architecture in their future designs and shape the app accordingly.

So we see the oxymoron already: the people fanatical about running PostgreSQL on Docker are those who don't know how to do a proper installation by hand, while at the other end of the spectrum the companies running PostgreSQL on Docker/Kubernetes are those who already have vast experience running vanilla, self-compiled or third-party PostgreSQL on the OS and want to take HA a step further by providing a fully automated environment (perhaps with Patroni and the rest of the ecosystem).

So effectively, one can gather that those who should run PostgreSQL on Docker are the exact opposite of the kind of people who actually attempt to run it without much thought.

Plus another problem spotted by some people in the field:

https://pigsty.io/blog/db/pg-in-docker/

"a team’s intellect relies on the few seasoned members and their communication overhead. Database issues require database experts; container issues, container experts. However, when databases are deployed on kubernetes & dockers, merging the expertise of database and K8S specialists is challenging — you need a dual-expert to resolve issues, and such individuals are rarer than specialists in one domain. Moreover, one man’s meat is another man’s poison. Certain Docker features might turn into bugs under specific conditions."

While the DB is still only some GBs of data, I find it strange not to be able to install whatever extension I find suitable, and also strange to wait half an hour for an upgrade which should take seconds. If someone wanted a persistent extension system, he/she would also have to persist /usr/local/pgsql/share besides $PGDATA, and would have to play around with https://github.com/tianon/docker-postgres-upgrade to get a proper binary upgrade.

What are your thoughts? I am puzzled because, while I used to hear many skeptical opinions until some years ago, now the trend seems to be more on the "acceptance" or neutral side.
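[Editor's illustration] To make the point about persisting more than $PGDATA concrete, here is a minimal Python sketch that only builds and prints a `docker run` command line with two named volumes; it does not start anything. The image tag, container name, volume names and in-container paths are all assumptions. In particular the share directory differs between images, so treat /usr/local/pgsql/share as a placeholder taken from the post above.

```python
import shlex


def postgres_run_command(
    image="postgres:17",                 # assumed image tag
    name="pg-main",                      # hypothetical container name
    pgdata="/var/lib/postgresql/data",   # assumed PGDATA inside the container
    share_dir="/usr/local/pgsql/share",  # path quoted in the post; differs per image
):
    """Build a `docker run` argv that keeps PGDATA and the share dir on named volumes."""
    return [
        "docker", "run", "-d",
        "--name", name,
        # Passing -e without a value forwards the variable from the host environment.
        "-e", "POSTGRES_PASSWORD",
        "-e", f"PGDATA={pgdata}",
        # Named volumes outlive the container, so data survives image upgrades.
        "-v", f"{name}-data:{pgdata}",
        "-v", f"{name}-share:{share_dir}",
        image,
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch works without Docker.
    print(shlex.join(postgres_run_command()))
```

Inside a running container, `pg_config --sharedir` and `pg_config --pkglibdir` report where the image actually keeps extension scripts and shared libraries. Extensions with C code need the library directory persisted as well, which is part of why this approach gets awkward quickly.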
On Mon, 2025-03-10 at 09:28 +0200, Achilleas Mantzios - cloud wrote:
> [doesn't think running PostgreSQL in containers in production
> is such a hot idea, but sees the concept going mainstream]
>
> What are your thoughts ? I am puzzled because while I used to hear many
> skeptical opinions until some years ago, now the trend seems to more on
> the "acceptance" or neutral side.

Well, lots of people think it is a great idea to host their important database in a public cloud. Fashions are not necessarily based on wisdom.

Using Kubernetes for test and play databases that you create and destroy regularly is a great thing.

Using Kubernetes to squish many small databases on a single machine while managing the resource usage can be useful.

If you use Kubernetes for everything else and it makes monitoring easy for you, it may make sense to run a production database that way.

Running your database on Kubernetes will make database administration and troubleshooting more cumbersome and will require you to create special containers for the purpose of upgrading. If these disadvantages are outweighed by the above advantages, it may make sense.

If you plan to run serious databases on Kubernetes, you had better have dedicated nodes for that purpose, so that you can tune the kernel parameters.

Yours,
Laurenz Albe
On 3/10/25 12:11, Laurenz Albe wrote:
> [snip]

Thank you Laurenz,

Those friends of mine are PgSQL noobs (hence the choice to use Docker) and have no plans AFAIK to deploy Kubernetes in the near (or distant) future. So, to give my opinion on the advantages one by one:

- They don't create and drop DBs regularly; e.g. I upgraded from 14.* to 17 yesterday, so this DB has been live for some years now.
- They have a few small DBs for the moment, with one being the main one, so there is no need for squishing either.
- They have no Kubernetes running, nor any k8s plans for the future that I know of.

For all those reasons, and while they are still learning the basics of PgSQL, I don't think Docker is a good idea. Plus they don't have a DBA (apart from me, and I kind of work on a volunteer basis), and when I eventually leave them, I would like their system to be in good shape for the next one.
> So effectively, one can gather that those who should run postgresql on docker are the exact opposite of the kind of people who actually attempt to run it without much thought.

LOL, yes, I think you cut to the core of the issue right there. (For context, I run HA PG on K8s across continents...)

I personally see little to no advantage in running PG in plain Docker.

--
Scott Ribe
scott_ribe@elevated-dev.com
https://www.linkedin.com/in/scottribe/
On Mon, Mar 10, 2025 at 6:56 AM Achilleas Mantzios - cloud <a.mantzios@cloud.gatewaynet.com> wrote:
[snip]
> Thank you Laurenz,
> Those friends of mine are PgSQL noobs (hence the choice to use docker),
> and have no plans AFAIK to deploy kubernetes in the near (or distant)
> future. So to say my opinion on the advantages one by one :
> - they don't create and drop DBs regularly, e.g. I upgraded from 14.* ->
> 17 yesterday so this DB was live for some years now.
> - they have a few small DBs for the moment, with one being the main, so
> no need for squishing either

Sounds like SQLite might be what they're looking for.
Death to <Redacted>, and butter sauce.
Don't boil me, I'm still alive.
<Redacted> lobster!
Hi All,

But there is more to it. In our DB (main job) we define users individually, so we can do things like this: when a user does not get the results he/she expects, or claims to have some other problem, we can run ps aux | grep user, or select pid from pg_stat_activity where , then e.g. ALTER ROLE SET log_statement = 'all', or just enable it for the whole system for a while, then kill -HUP his/her pid, grab our logs, reset the settings to default, and investigate.

Another scenario: we just run top and see which users currently have the highest CPU/disk load (apart from all monitoring).

Are there alternatives to this when someone runs a slim / stripped version of the OS in a Docker image? Or does he/she need to sacrifice the above?

I am not talking about a Kubernetes scenario (with which I have no experience), just plain Docker.

On 3/10/25 12:56, Achilleas Mantzios - cloud wrote:
> [snip]
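[Editor's illustration] One possible answer to the question above, as a minimal Python sketch that only assembles host-side command lines; nothing is executed. It assumes a Linux Docker host and an image that ships `psql`. The container name, the database superuser and the application user `alice` are hypothetical.

```python
import shlex


def troubleshooting_commands(container="pg-main", db_user="postgres", app_user="alice"):
    """Return host-side commands that approximate ps/top/psql against a container."""
    # app_user is interpolated for illustration only; never do this with untrusted input.
    sql = (
        "SELECT pid, usename, state, query "
        "FROM pg_stat_activity "
        f"WHERE usename = '{app_user}'"
    )
    return {
        # `docker top` lists the container's processes using the host's ps,
        # so the image itself does not need ps installed.
        "processes": ["docker", "top", container],
        # One-shot CPU/memory/IO figures for the whole container (not per user).
        "resource usage": ["docker", "stats", "--no-stream", container],
        # Per-session detail comes from PostgreSQL itself.
        "sessions": ["docker", "exec", container, "psql", "-U", db_user, "-c", sql],
        # Reload the configuration without sending signals by hand.
        "reload config": [
            "docker", "exec", container,
            "psql", "-U", db_user, "-c", "SELECT pg_reload_conf()",
        ],
    }


if __name__ == "__main__":
    for label, argv in troubleshooting_commands().items():
        print(f"{label}: {shlex.join(argv)}")
```

On a Linux host the backend processes also show up in the host's own `ps` and `top`, since containers share the host kernel. What is lost with a slim image is mostly the convenience of having those tools inside the container.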
Hi Achilleas,
On Wed, Mar 12, 2025, 8:05 AM, Achilleas Mantzios - cloud <a.mantzios@cloud.gatewaynet.com> wrote:
> Hi All, But there is more to it.
> In our db, (main job) we define users individually, so we can do things
> like : a user does not get the results he/she expects or claims has
> otherwise some problem.

I came to be a big fan of Docker in specific situations:
- For developers, with restricted CPU and/or RAM (to prevent server collapse because of a process going mad)
- For QA, with incremental snapshots, to quickly test updates. This use case saved me days of database recoveries.
Other than that, give the developer an empty container and ask him to install the database by hand :-) The experience is worth it.
For Production, I heard some horror stories due to uncontrolled bad practices. You'd better think twice about the specific use case and whether you have compelling reasons to do it.
Hope it helps
Olivier
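[Editor's illustration] The developer use case above (capping CPU and RAM so a runaway process cannot take the server down) could look roughly like this. It is a minimal Python sketch that only builds and prints the command line; the container name, image tag and limit values are assumptions, not recommendations.

```python
import shlex


def limited_dev_container(name="pg-dev", image="postgres:17", cpus=1.5, memory="1g"):
    """Build a `docker run` argv for a throwaway dev database with resource caps."""
    return [
        "docker", "run", "-d",
        "--name", name,
        "--cpus", str(cpus),   # at most this many CPUs' worth of time
        "--memory", memory,    # hard memory limit for the container
        # Passing -e without a value forwards the variable from the host environment.
        "-e", "POSTGRES_PASSWORD",
        image,
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch works without Docker.
    print(shlex.join(limited_dev_container()))
```

A memory cap below what PostgreSQL is configured to use can get backends killed, so settings such as shared_buffers would need to fit within the limit.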
> runs a slim / stripped version of the OS

This obsession with absolute minimization of images leads to people having to add this tool to one image, and that tool to another image, and so on. You wind up with all these different layers for different applications, eating up the alleged space savings.
On 3/12/25 15:53, Scott Ribe wrote:
>> runs a slim / stripped version of the OS
> This obsession with absolutely minimization of images leads to people having to add this tool to one image, and that tool to another image, and so on. You wind up with all these different layers for different applications, eating up the alleged space savings.

Exactly