Thread: Embedding postgresql in my application
I'd like my application (Glom) to contain its own PostgreSQL instance, either by linking directly to a copy of the PostgreSQL code or, maybe more ideally, by starting its own instance of PostgreSQL as a separate process. Does anyone know of an existing example of this that I can look at? -- Murray Cumming murrayc@murrayc.com www.murrayc.com www.openismus.com
On Sat, December 16, 2006 06:25, Murray Cumming wrote: > I'd like my application (Glom) to contain its own PostgreSQL instance, > either by linking directly to a copy of the PostgreSQL code or, maybe > more ideally, by starting its own instance of PostgreSQL as a separate > process. Then you'll have to go with "starting its own instance." There is a big difference between databases that are designed to be embedded (like sqlite) and ones that run stand-alone (like postgres). It's not very hard. The only real complication is in being a good neighbour if there already is a working postgres installation on the system. What you do is ensure that the server is installed on the system normally, then create your own instance that runs independently of any other instances on the system. It's probably easiest if you do this under its own user identity. You'll probably want to create a dedicated user for the application that runs a single, shared server instance for all users (and put the data in someplace like /var/lib/myapp/data). You don't really need a dedicated user, but it's safer and it means you can't accidentally mess up the rest of the system while you experiment. Apart from installing postgres, you don't need superuser rights at all. Let's say you're in a shell running under the user identity that will own the database, and you have a directory DBDIR that that user owns. You'll want to set up the database with a local, Unix-domain socket. The basics of getting things running are: 1. When installing your application, set up its database instance. mkdir -p "$DBDIR/data" initdb -D "$DBDIR/data" See the initdb manpage for details; you may want to specify the database's encoding for example. Bear in mind that "ASCII" really means "ASCII"--if you choose that as an encoding, you can only safely use the basic 7-bit ASCII character set. You're more likely to want unicode. Another useful option to initdb is how local users will be able to connect to the database's socket. The most useful choices for you are probably "ident sameuser" (always use system user name as database user name as well) or "trust" (allow any user to log into the database). You may also want to restrict access rights to the socket itself. 2. Start the database server. You need to do this both for the next installation step and when your application is run. Run the server under your dedicated user identity, which also owns the database files. Do this in a separate process, e.g. by running it in the background. In practice you'll want to wrap this in some pidfile logic to keep track of whether the database is running. That way you can keep the database running after your application exits, and you won't need to start it up again if your application is run anew. postmaster -k "$DBDIR" -h '' The -k option says to create a socket in "$DBDIR". The socket's file name will start with a dot, so most directory viewer programs will hide it from view by default. The -h option (followed by an empty string!) sets an empty hostname. This tells the postmaster not to listen on any network interfaces. Not only is this more secure; it also prevents conflicts if there is already a postgres instance running on its default TCP port. 3. Create a database, to be managed by your server instance. You can have a single database for your application, or separate per-user databases, or a combination where some data is in a shared database and other data is kept in per-user databases. This connects to the server instance to do its work, so make sure your postmaster is running. createdb myappdb -h "$DBDIR/data" The -h option specifies the host name to connect to. Of course $DBDIR/data is not a hostname, but it will begin with a slash and this is recognized as a special case, meaning "connect to the local socket in this directory." 4. Create database user identities if needed. If your application always runs under its own dedicated user identity, choose "ident sameuser" authentication in initdb above and you won't need to create any users here. If your application can run under many user identities but will always log into the database under the same user identity (probably not a good idea though), use "trust" authentication. But if your application can run under many different user identities and each should have its own database user, use "ident sameuser" and create a database user for every system user when that user first runs the application: createuser -h "$DBDIR/data" \ --no-superuser \ --no-createrole \ --no-createdb \ $USERNAME If you want to give each of these user identities its own database, create that at the same time: createdb -h "$DBDIR/data" \ -O $USERNAME $USERNAME You'll still have to run both of these commands under your dedicated user identity, by the way. 5. Connect to your database! Once your application is running, it can connect to your database by specifying "$DBDIR/data" as a host name for the connection. You won't need to specify a user name unless you want to log into the database under a different name than the user identity the application is running under. If you set up authentication the way I've described, there's no need to provide a password. HTH, Jeroen
On Sat, 2006-12-16 at 13:16 +0700, Jeroen T. Vermeulen wrote: > On Sat, December 16, 2006 06:25, Murray Cumming wrote: > > I'd like my application (Glom) to contain its own PostgreSQL instance, > > either by linking directly to a copy of the PostgreSQL code or, maybe > > more ideally, by starting its own instance of PostgreSQL as a separate > > process. > > Then you'll have to go with "starting its own instance." There is a big > difference between databases that are designed to be embedded (like > sqlite) and ones that run stand-alone (like postgres). > > It's not very hard. The only real complication is in being a good > neighbour if there already is a working postgres installation on the > system. What you do is ensure that the server is installed on the system > normally, then create your own instance that runs independently of any > other instances on the system. > > It's probably easiest if you do this under its own user identity. You'll > probably want to create a dedicated user for the application that runs a > single, shared server instance for all users (and put the data in > someplace like /var/lib/myapp/data). You don't really need a dedicated > user, but it's safer and it means you can't accidentally mess up the rest > of the system while you experiment. Apart from installing postgres, you > don't need superuser rights at all. What would be the problem with running it as the current user? That would get a bit difficult if multiple users are running instances of the application - at the least, the port would have to be chosen dynamically. But it would keep things simple if the postgres instances were separate. > Let's say you're in a shell running under the user identity that will own > the database, and you have a directory DBDIR that that user owns. You'll > want to set up the database with a local, Unix-domain socket. The basics > of getting things running are: > > 1. When installing your application, set up its database instance. > > mkdir -p "$DBDIR/data" > initdb -D "$DBDIR/data" > > See the initdb manpage for details; you may want to specify the database's > encoding for example. Bear in mind that "ASCII" really means "ASCII"--if > you choose that as an encoding, you can only safely use the basic 7-bit > ASCII character set. You're more likely to want unicode. > > Another useful option to initdb is how local users will be able to connect > to the database's socket. The most useful choices for you are probably > "ident sameuser" (always use system user name as database user name as > well) or "trust" (allow any user to log into the database). You may also > want to restrict access rights to the socket itself. In this case, I'd actually like the location of the database data to be different every time. The idea would be that it would all be contained in a directory (or even a tarball archive, but that's another issue), along with the .glom file. And opening the .glom file would start a postgres instance with the database that's in that directory. > 2. Start the database server. You need to do this both for the next > installation step and when your application is run. Run the server under > your dedicated user identity, which also owns the database files. Do this > in a separate process, e.g. by running it in the background. > > In practice you'll want to wrap this in some pidfile logic to keep track > of whether the database is running. That way you can keep the database > running after your application exits, and you won't need to start it up > again if your application is run anew. > > postmaster -k "$DBDIR" -h '' > > The -k option says to create a socket in "$DBDIR". The socket's file name > will start with a dot, so most directory viewer programs will hide it from > view by default. The -h option (followed by an empty string!) sets an > empty hostname. This tells the postmaster not to listen on any network > interfaces. Not only is this more secure; it also prevents conflicts if > there is already a postgres instance running on its default TCP port. > > 3. Create a database, to be managed by your server instance. You can have > a single database for your application, or separate per-user databases, or > a combination where some data is in a shared database and other data is > kept in per-user databases. This connects to the server instance to do > its work, so make sure your postmaster is running. > > createdb myappdb -h "$DBDIR/data" In my each-database-in-its-directory idea, I guess that I'd do both an initdb (to create the cluster) and a createdb (to create the database, in the cluster, after a createuser) once. Then, each time the file/database is opened, I'd use postmaster to start a postgres instance that uses the cluster in that directory. Or do initdb and createdb change anything outside of the specified directory? I mean, can I really move that directory around and just start a postgres instance on it after it's moved, without changes. Are there any absolute filepaths I should watch out for, for instance? > The -h option specifies the host name to connect to. Of course > $DBDIR/data is not a hostname, but it will begin with a slash and this is > recognized as a special case, meaning "connect to the local socket in this > directory." > > 4. Create database user identities if needed. If your application always > runs under its own dedicated user identity, choose "ident sameuser" > authentication in initdb above and you won't need to create any users > here. If your application can run under many user identities but will > always log into the database under the same user identity (probably not a > good idea though), use "trust" authentication. But if your application > can run under many different user identities and each should have its own > database user, use "ident sameuser" and create a database user for every > system user when that user first runs the application: > > createuser -h "$DBDIR/data" \ > --no-superuser \ > --no-createrole \ > --no-createdb \ > $USERNAME > > If you want to give each of these user identities its own database, create > that at the same time: > > createdb -h "$DBDIR/data" \ > -O $USERNAME > $USERNAME > > You'll still have to run both of these commands under your dedicated user > identity, by the way. > > 5. Connect to your database! Once your application is running, it can > connect to your database by specifying "$DBDIR/data" as a host name for > the connection. You won't need to specify a user name unless you want to > log into the database under a different name than the user identity the > application is running under. If you set up authentication the way I've > described, there's no need to provide a password. Actually, I want passwords. This is not just for the internal use of the application. I want Glom systems to be usable by multiple users simultaneously. That's a big reason why I'm not using sqllite for this. Many thanks for the clues. -- Murray Cumming murrayc@murrayc.com www.murrayc.com www.openismus.com
On Sat, December 16, 2006 18:13, Murray Cumming wrote: >> You'll >> probably want to create a dedicated user for the application that runs a >> single, shared server instance for all users (and put the data in >> someplace like /var/lib/myapp/data). You don't really need a dedicated >> user, but it's safer and it means you can't accidentally mess up the >> rest >> of the system while you experiment. Apart from installing postgres, you >> don't need superuser rights at all. > > What would be the problem with running it as the current user? Resources and speed. If this is a multi-user application, then starting a database instance for every user who runs the application is probably unnecessarily wasteful. > That would get a bit difficult if multiple users are running instances > of the application - at the least, the port would have to be chosen > dynamically. But it would keep things simple if the postgres instances > were separate. You can always just give every user his own database in a single, shared postgres instance. There is no issue of ports, really. The way I've described it, you do *not* need a TCP port. Every instance of postgres is running on its own socket, which appears in the filesystem in a place of your choosing. > In this case, I'd actually like the location of the database data to be > different every time. The idea would be that it would all be contained > in a directory (or even a tarball archive, but that's another issue), > along with the .glom file. And opening the .glom file would start a > postgres instance with the database that's in that directory. That should work fine, but yes, then you do need a separate postgres instance running for each of those directories. > In my each-database-in-its-directory idea, I guess that I'd do both an > initdb (to create the cluster) and a createdb (to create the database, > in the cluster, after a createuser) once. Yes. Except for the createuser: if user X owns these directories and postgres is started under user identity X, then X is automatically (well, by default) the superuser for that database. No need to create it. > Or do initdb and createdb change anything outside of the specified > directory? I mean, can I really move that directory around and just > start a postgres instance on it after it's moved, without changes. Are > there any absolute filepaths I should watch out for, for instance? There's the configuration files, of course, in /etc/postgresql*. You'll have to look up the necessary options to replace those with local versions. > Actually, I want passwords. This is not just for the internal use of the > application. I want Glom systems to be usable by multiple users > simultaneously. That's a big reason why I'm not using sqllite for this. That doesn't mean that you have to have passwords, though. With "ident sameuser" you simply rely on the system's authentication--if the system says you're user X, then the database accepts you as X, but if the system says you're not X, then the database will not accept you as X either. Overall that's probably safer, not less safe, than requiring the user to enter a password to log into the database. And it's lots safer than keeping a password stored in a file somewhere. Jeroen
On Sat, 2006-12-16 at 18:30 +0700, Jeroen T. Vermeulen wrote: > There is no issue of ports, really. The way I've described it, you do > *not* need a TCP port. Every instance of postgres is running on its own > socket, which appears in the filesystem in a place of your choosing. Well, I want multiple users, from multiple networked computers to be able to use the database simultaneously. TCP/IP seems the logical way to do this. > > In this case, I'd actually like the location of the database data to be > > different every time. The idea would be that it would all be contained > > in a directory (or even a tarball archive, but that's another issue), > > along with the .glom file. And opening the .glom file would start a > > postgres instance with the database that's in that directory. > > That should work fine, but yes, then you do need a separate postgres > instance running for each of those directories. > > > > In my each-database-in-its-directory idea, I guess that I'd do both an > > initdb (to create the cluster) and a createdb (to create the database, > > in the cluster, after a createuser) once. > > Yes. Except for the createuser: if user X owns these directories and > postgres is started under user identity X, then X is automatically (well, > by default) the superuser for that database. No need to create it. Are we talking about a Linux user or Postgres user as "user X"? I thought they were separate lists of users. > > Or do initdb and createdb change anything outside of the specified > > directory? I mean, can I really move that directory around and just > > start a postgres instance on it after it's moved, without changes. Are > > there any absolute filepaths I should watch out for, for instance? > > There's the configuration files, of course, in /etc/postgresql*. You'll > have to look up the necessary options to replace those with local > versions. Is there no way to put the configuration file in the directory too? > > Actually, I want passwords. This is not just for the internal use of the > > application. I want Glom systems to be usable by multiple users > > simultaneously. That's a big reason why I'm not using sqllite for this. > > That doesn't mean that you have to have passwords, though. With "ident > sameuser" you simply rely on the system's authentication--if the system > says you're user X, then the database accepts you as X, but if the system > says you're not X, then the database will not accept you as X either. > Overall that's probably safer, not less safe, than requiring the user to > enter a password to log into the database. And it's lots safer than > keeping a password stored in a file somewhere. I don't want to require all users to be using the same Linux user system, and I even want to allow Windows users to connect when I port Glom to Windows. -- Murray Cumming murrayc@murrayc.com www.murrayc.com www.openismus.com
On Sat, December 16, 2006 18:47, Murray Cumming wrote: > Well, I want multiple users, from multiple networked computers to be > able to use the database simultaneously. TCP/IP seems the logical way to > do this. Ah--I had not understood that you wanted to access the database remotely. But in that case, in what sense would the database be "embedded" in your application!? >> Yes. Except for the createuser: if user X owns these directories and >> postgres is started under user identity X, then X is automatically >> (well, >> by default) the superuser for that database. No need to create it. > > Are we talking about a Linux user or Postgres user as "user X"? I > thought they were separate lists of users. They are. But if you're on a local connection, as I had been assuming, you can couple them which is often both safe and convenient. >> There's the configuration files, of course, in /etc/postgresql*. You'll >> have to look up the necessary options to replace those with local >> versions. > > Is there no way to put the configuration file in the directory too? Certainly. You'll just have to look up the necessary options to replace the default configuration files with your local versions. Jeroen
On Sat, 2006-12-16 at 18:55 +0700, Jeroen T. Vermeulen wrote: > Ah--I had not understood that you wanted to access the database > remotely. > But in that case, in what sense would the database be "embedded" in > your > application!? It's not the best word, I guess. I just wanted to express that I'd be using Postgres only for my application, without using the system's regular Postgres installation/configuration. My aims are: - The user doesn't need to be clever or patient enough to install and configure postgres to support TCP/IP connections. This is unnecessarily difficult on most distros: http://www.glom.org/wiki/index.php?title=Initial_Postgres_Configuration The file-opening experience would then be the same as when opening a FileMaker or Access file. - I can make choices about the default connection policy without interfering with the connection policy wanted by other applications that use postgres, or with the default policy wanted (understandably) by the disto packager. - The Glom "system" (the database plus the UI description, etc) can be copied and started on a different computer easily. - The first user to open a Glom file becomes the server for that .glom file, and other users connect to that first client (the server). This is how FileMaker works. A central server, for better performance, is then an extra choice that can be used when wanted. -- Murray Cumming murrayc@murrayc.com www.murrayc.com www.openismus.com
On Sat, 2006-12-16 at 13:16 +0700, Jeroen T. Vermeulen wrote: > f your application > can run under many different user identities and each should have its > own > database user, use "ident sameuser" I believe that ident authentication requires an additional service such as gidentd. Well, I think that's been my experience in the past. > and create a database user for every > system user when that user first runs the application: > > createuser -h "$DBDIR/data" \ > --no-superuser \ > --no-createrole \ > --no-createdb \ > $USERNAME createuser asks for a password on stdin. Do you know of any way to specifify an initial superuser and password for a new database cluster without doing it via stdin? That doesn't seem like a stable API that I should rely on. -- Murray Cumming murrayc@murrayc.com www.murrayc.com www.openismus.com
Murray Cumming wrote: > On Sat, 2006-12-16 at 13:16 +0700, Jeroen T. Vermeulen wrote: > > f your application > > can run under many different user identities and each should have its > > own > > database user, use "ident sameuser" > > I believe that ident authentication requires an additional service such > as gidentd. Well, I think that's been my experience in the past. Sure. > > and create a database user for every > > system user when that user first runs the application: > > > > createuser -h "$DBDIR/data" \ > > --no-superuser \ > > --no-createrole \ > > --no-createdb \ > > $USERNAME > > createuser asks for a password on stdin. Do you know of any way to > specifify an initial superuser and password for a new database cluster > without doing it via stdin? That doesn't seem like a stable API that I > should rely on. Sure, initdb accepts the -U parameter to specify the user and you can put the password in a temp file and give it to initdb with --pwfile. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Tue, 2006-12-19 at 12:46 -0300, Alvaro Herrera wrote: > Murray Cumming wrote: > > createuser asks for a password on stdin. Do you know of any way to > > specifify an initial superuser and password for a new database cluster > > without doing it via stdin? That doesn't seem like a stable API that I > > should rely on. > > Sure, initdb accepts the -U parameter to specify the user and you can > put the password in a temp file and give it to initdb with --pwfile. Thanks. That works well. -- Murray Cumming murrayc@murrayc.com www.murrayc.com www.openismus.com
Murray Cumming wrote: > I'd like my application (Glom) to contain its own PostgreSQL instance, > either by linking directly to a copy of the PostgreSQL code or, maybe > more ideally, by starting its own instance of PostgreSQL as a separate > process. > > Does anyone know of an existing example of this that I can look at? I wanted to do just this - extract the postgres client code for embedding in my own application. It's actually rather easy to do; only "interesting" part is configuring the pg_config.h for each system that you will build on. I simply ran configure on the ones I deal with to get a system specific version of the header that I can then rename as needed in the future. Bit of a brutish hack there but it was done in the name of expediency (aren't they all). Anyway... here's my notes on how to extract your own client code from the postgres distribution. At the end of it, you'll have a static lib that you can compile into your app. How to Create a Standalone PostgreSQL Client Library Jonathan Earle, 29Sep06 ==================================================== Prologue -------- Obtain postgres tarball from postgres.org and my modded Makefile and place both in your home directory (or elsewhere - just adjust path references accordingly). Disclaimer ---------- This is constructed only for Unix systems as that is the only need I have for creating this. Win32 could probably be supported by not deleting the win32 files and by re-including portions from the original libpq makefile. This was also built on a Linux system, but the resulting client tree can then be packed and moved to AIX, HPUX, etc. Procedure --------- Unpack and create a new client dir for the standalone code: tar zxf postgresql-8.1.4.tar.gz mkdir pgclient Copy required source files: export pq=~/postgresql-8.1.4/src cp -a $pq/interfaces/libpq/* pgclient cp $pq/backend/utils/mb/encnames.c \ $pq/backend/libpq/ip.c \ $pq/backend/libpq/md5.c \ $pq/port/noblock.c \ $pq/port/pgstrcasecmp.c\ $pq/port/thread.c \ $pq/backend/utils/mb/wchar.c \ pgclient Copy required header files: export pq=~/postgresql-8.1.4/src/include/ cp $pq/getaddrinfo.h \ $pq/postgres_fe.h \ $pq/postgres_ext.h \ $pq/c.h \ $pq/pg_config.h \ $pq/pg_config_manual.h\ $pq/pg_config_os.h \ $pq/port.h \ pgclient cd $pq; cp --parents \libpq/pqcomm.h \libpq/crypt.h \libpq/libpq-be.h \libpq/hba.h \libpq/ip.h \libpq/libpq-fs.h \mb/pg_wchar.h\nodes/pg_list.h \nodes/nodes.h \~/pgclient; cd - cp ~/postgresql-8.1.4/src/port/pg_config_paths.h pgclient Replace Makefile with modified version: cp -f Makefile pgclient Clean up and Build: cd pgclient rm -f *.mak *.def exports.txt libpqdll.c libpq.rc* pthread-win32* win32* Adjust pg_config.h and pg_config_paths.has required. make lib Initial Validation: # ls -l *a ~/postgresql-8.1.4/src/interfaces/libpq/*a -rw-rw-rw- 1 jon user 131652 Sep 29 10:26 libpq.a -rw-rw-rw- 1 jonuser 131652 Sep 28 15:20 ~/postgresql-8.1.4/src/interfaces/libpq/libpq.a Testing and usage is left to the reader. Enjoy! :) Makefile -------- #------------------------------------------------------------------------- # # Makefile for standalone libpq library # # Portions Copyright (c) 1996-2005, PostgreSQL Global Development Group # Portions Copyright (c) 1994, Regents of the University of California # # $Id$ # #------------------------------------------------------------------------- CC = gcc CPPFLAGS = -DFRONTEND -D_GNU_SOURCE -I. CFLAGS = -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Winline -fno-strict-aliasing LIBS = -lcrypt -lresolv -lnsl -ldl -lm -lbsd LIBOBJS = copydir.o dirmod.o exec.o noblock.o path.o pipe.o pgsleep.o pgstrcasecmp.o sprompt.o thread.o AROPT = crs LINK.static = $(AR) $(AROPT) RANLIB = ranlib # library parameters NAME= pq OBJS= fe-auth.o fe-connect.o fe-exec.o fe-misc.o fe-print.o fe-lobj.o \ fe-protocol2.o fe-protocol3.o pqexpbuffer.o pqsignal.ofe-secure.o \ md5.o ip.o wchar.o encnames.o noblock.o pgstrcasecmp.o thread.o \ $(filter crypt.o getaddrinfo.oinet_aton.o open.o snprintf.o strerror.o, $(LIBOBJS)) all: lib: all lib$(NAME).a lib$(NAME).a: $(OBJS) $(LINK.static) $@ $^ $(RANLIB) $@ clean distclean: clean-lib rm -f $(OBJS) clean-lib: rm -f lib$(NAME).a
On Tue, Dec 19, 2006 at 16:36:53 +0100, Murray Cumming <murrayc@murrayc.com> wrote: > On Sat, 2006-12-16 at 13:16 +0700, Jeroen T. Vermeulen wrote: > > f your application > > can run under many different user identities and each should have its > > own > > database user, use "ident sameuser" > > I believe that ident authentication requires an additional service such > as gidentd. Well, I think that's been my experience in the past. If your running on a unix like machine that supports something like getpeername and using domain sockets then you wouldn't need to set up a separate process to do this. In the Windows world I think you are stuck.