Thread: Running on Docker, AWS with Data Stored on EBS

Running on Docker, AWS with Data Stored on EBS

From: Ryan Mahoney

Hi All,


TL;DR: Can a new PostgreSQL process, running on a different server instance, effectively resume operations by reading the same data directory as another PostgreSQL process that is no longer running?


- - -


I have an application that is deployed to AWS as a Docker container.  Although it is a single application, the container is running multiple services via supervisor.  One of those services is PostgreSQL.


Currently I dump and reload the database between deployments, as each deployment completely destroys the server instance and creates a new one.


It is a small application without much data or usage load.


I'd like to store my data on a separate EBS volume.  Then, I was thinking that each time I perform a new deployment, I would attach the new server instance to the original EBS volume.


At any given moment, there would only ever be one PostgreSQL process attached to the data directory... but from one deployment to the next, that would be a different process running on a different server.
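
A minimal sketch of the handoff order this plan implies: stop the old server, move the volume, then start the new server. It is not a tested deployment script. The function name, the volume and instance IDs, the device name, and the paths are all hypothetical, and it assumes PostgreSQL is controlled with pg_ctl and the volume is moved with the AWS CLI; a supervisor-managed container would stop and start the service differently.

```python
from typing import List, Tuple

# Each step is (where the command runs, the command as an argument list).
Step = Tuple[str, List[str]]


def handoff_steps(volume_id: str, new_instance_id: str, device: str,
                  mount_point: str, data_dir: str) -> List[Step]:
    """Return the ordered commands for moving a data directory between hosts.

    All arguments are placeholders supplied by the caller. The device name
    seen inside the new instance may differ from the one passed to
    attach-volume; this sketch assumes they are the same.
    """
    return [
        # Shut the old server down cleanly before touching the volume.
        ("old instance", ["pg_ctl", "stop", "-D", data_dir, "-m", "fast", "-w"]),
        ("old instance", ["umount", mount_point]),
        # Move the EBS volume from the old instance to the new one.
        ("control host", ["aws", "ec2", "detach-volume",
                          "--volume-id", volume_id]),
        ("control host", ["aws", "ec2", "attach-volume",
                          "--volume-id", volume_id,
                          "--instance-id", new_instance_id,
                          "--device", device]),
        # Only now may the new server open the data directory.
        ("new instance", ["mount", device, mount_point]),
        ("new instance", ["pg_ctl", "start", "-D", data_dir, "-w"]),
    ]


if __name__ == "__main__":
    for where, cmd in handoff_steps("vol-EXAMPLE", "i-EXAMPLE", "/dev/xvdf",
                                    "/mnt/pg", "/mnt/pg/data"):
        print(f"[{where}] {' '.join(cmd)}")
```

The function only builds the command list and runs nothing, so the ordering can be reviewed before any of it is wired into a deployment.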


I know that I can just test this out and see what happens, but I am concerned that even if it appears to work at first, it might still lead to data corruption down the line if the strategy is incorrect.


I have also considered using Amazon's hosted PostgreSQL service. While that would certainly work, I don't want to pay for an extra service, I'd like to use the most recent PostgreSQL version and I think my application will be faster if the data is served from the same instance.


If my application was larger, all of this would be moot because I'd run a dedicated PostgreSQL instance or just use RDS... but it isn't so I'd rather save money :)


Thanks in advance for your help,

Ryan

Re: Running on Docker, AWS with Data Stored on EBS

From: "David G. Johnston"
On Tue, Nov 8, 2016 at 12:48 PM, Ryan Mahoney <ryan.mahoney@outlook.com> wrote:

Hi All,

TL;DR: Can a new PostgreSQL process, running on a different server instance, effectively resume operations by reading the same data directory as another PostgreSQL process that is no longer running?

In short - yes.

Avoiding concurrent access and ensuring that the various PostgreSQL binaries involved are all running at minimum the same major (9.6.x) version of PostgreSQL, and ideally the same configuration, is what matters.
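
A minimal sketch of the version half of that condition, assuming the check runs on the new instance before the server is started. A PostgreSQL data directory contains a PG_VERSION file that records the major version that created it; the function names and the way the binary's version string is obtained are assumptions for illustration.

```python
from pathlib import Path


def major_version(version: str) -> str:
    """Reduce a full release version such as '9.6.1' or '10.4' to its major part.

    Before PostgreSQL 10 the major version is the first two components
    ('9.6'); from 10 onward it is the first component alone ('10').
    Assumes a plain numeric release version, not a beta or rc string.
    """
    parts = version.strip().split(".")
    if int(parts[0]) >= 10:
        return parts[0]
    return ".".join(parts[:2])


def data_dir_matches(data_dir: str, server_version: str) -> bool:
    """Compare PG_VERSION in data_dir with the server binary's major version."""
    on_disk = (Path(data_dir) / "PG_VERSION").read_text().strip()
    return on_disk == major_version(server_version)
```

For example, a data directory created by 9.6.0 would pass this check against a 9.6.1 binary and fail it against a 9.5.3 binary. The server version string could be taken from the output of `postgres --version`.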

David J.

Re: Running on Docker, AWS with Data Stored on EBS

From: "David G. Johnston"
On Tue, Nov 8, 2016 at 1:41 PM, Ryan Mahoney <ryan.mahoney@outlook.com> wrote:

I'm so glad the use-case will work -- and sounds somewhat normative.

The program and the data are distinct things - which is why you can upgrade from say 9.5.1 to 9.5.3 by simply updating the program.  Heck, a simple reboot of a typical server causes a new program instance to launch that is distinct from the one that was previously running.

The main concern is avoiding concurrency.  The program is designed to be able to do that in a single-machine setup, but if you go introducing other "clone" machines there is a greater chance of breaking things.  The software isn't really set up to do what you are thinking - it's designed to be a persistent server that would exist independent of any particular instance of your application and to which your application would connect over jdbc/libpq.  So it's up to you to ensure that you set things up to conform to its runtime expectations.
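
A minimal sketch of one conservative guard a deployment script could apply before starting the server on a new machine. A running server keeps a postmaster.pid lock file in its data directory, with the server's process ID on the first line. On a single machine the server can tell whether that process is still alive; from another machine the ID means nothing, so this sketch treats any leftover lock file as a reason to stop and investigate. The function names are hypothetical, and this supplements ensuring the old server is really stopped rather than replacing it.

```python
from pathlib import Path
from typing import Optional


def recorded_postmaster_pid(data_dir: str) -> Optional[int]:
    """Return the PID on the first line of postmaster.pid, or None.

    None means the lock file is absent or has no readable first line.
    """
    pid_file = Path(data_dir) / "postmaster.pid"
    if not pid_file.exists():
        return None
    lines = pid_file.read_text().splitlines()
    if not lines or not lines[0].strip().isdigit():
        return None
    return int(lines[0].strip())


def looks_cleanly_stopped(data_dir: str) -> bool:
    """True only if no postmaster.pid lock file is present in data_dir."""
    return not (Path(data_dir) / "postmaster.pid").exists()
```

A lock file left behind by a crash on the old instance would also fail this check, which is intended: that case deserves a manual look before a new server opens the directory.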

David J.
 

Re: Running on Docker, AWS with Data Stored on EBS

From: Ryan Mahoney

Thanks for your prompt response.


I'm so glad the use-case will work -- and sounds somewhat normative.


It also looks like the PostgreSQL memory footprint is quite small... so even using the smallest type of EC2 instance is viable (assuming the utilization and data size remain small).


With Appreciation,

Ryan



From: David G. Johnston <david.g.johnston@gmail.com>
Sent: Tuesday, November 8, 2016 3:19:02 PM
To: Ryan Mahoney
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Running on Docker, AWS with Data Stored on EBS
 