Thread: HEADS UP: Win32/OS2/BeOS native ports
Morning all ... Just a heads up that over the next little while, I'm planning on making a bunch of commits in order to work on making the code able to work natively in the above environments ... my work will mostly focus on Win32 (since I have no OS2/BeOS installs), but a lot of the changes will be such that they will benefit the others as well ... The initial changes will be to just wrap all our shared memory code, so that I can make use of Apache's libapr libraries *if* they are installed ... if not, it will just fall back to "the current code" ...
"Marc G. Fournier" wrote: > > Morning all ... > > Just a heads up that over the next little while, I'm planning on > making a bunch of commits in order to work on making the code able to work > natively in the above environments ... my work will mostly focus on Win32 > (since I have no OS2/BeOS installs), but alot of the changes will be such > that it will benefit the others as well ... > > The initial changes will be to just wrapper all our shared memory > code, so that I can make use of Apache's libapr libraries *if* they are > installed ... if not, it will just fall back to "the current code" ... If you want any assistance, drop me an email. I spent a long time (> decade) doing Windows applications and drivers and know a good number of the cool tricks.
On Fri, 3 May 2002, mlw wrote: > "Marc G. Fournier" wrote: > > > > Morning all ... > > > > Just a heads up that over the next little while, I'm planning on > > making a bunch of commits in order to work on making the code able to work > > natively in the above environments ... my work will mostly focus on Win32 > > (since I have no OS2/BeOS installs), but alot of the changes will be such > > that it will benefit the others as well ... > > > > The initial changes will be to just wrapper all our shared memory > > code, so that I can make use of Apache's libapr libraries *if* they are > > installed ... if not, it will just fall back to "the current code" ... > > If you want any assistance, drop me an email. I spent a long time (> decade) > doing Windows applications and drivers and know a good number of the cool > tricks. hrmmmm ... do you have a working Windows development environment? I'm running WinXP at home, but don't have any of the compilers or anything yet, so all my work for the first part is going to be done under Unix ... but someone that knows something about building makefiles for Windows, and compiling under it, will definitely be a major asset ;)
Will there really be a need for a BeOS development with the sale of Be to Palm? Is BeOS even still available? It might not be worth the time to develop for BeOS until you see what Palm decides to do with the software. -----Original Message----- From: pgsql-hackers-owner@postgresql.org [mailto:pgsql-hackers-owner@postgresql.org]On Behalf Of Marc G. Fournier Sent: Friday, May 03, 2002 9:48 AM To: mlw Cc: pgsql-hackers@postgresql.org Subject: Re: [HACKERS] HEADS UP: Win32/OS2/BeOS native ports On Fri, 3 May 2002, mlw wrote: > "Marc G. Fournier" wrote: > > > > Morning all ... > > > > Just a heads up that over the next little while, I'm planning on > > making a bunch of commits in order to work on making the code able to work > > natively in the above environments ... my work will mostly focus on Win32 > > (since I have no OS2/BeOS installs), but alot of the changes will be such > > that it will benefit the others as well ... > > > > The initial changes will be to just wrapper all our shared memory > > code, so that I can make use of Apache's libapr libraries *if* they are > > installed ... if not, it will just fall back to "the current code" ... > > If you want any assistance, drop me an email. I spent a long time (> decade) > doing Windows applications and drivers and know a good number of the cool > tricks. hrmmmm ... do you have a working Windows development environment? I'm running WinXP at home, but don't have any of the compilers or anything yet, so all my work for the first part is going to be done under Unix ... but someone that knows something about building makefiles for Windows, and compiling under it, will definitely be a major asset ;) ---------------------------(end of broadcast)--------------------------- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to majordomo@postgresql.org so that your message can get through to the mailing list cleanly
On Fri, 3 May 2002, Travis Hoyt wrote: > Will there really be a need for a BeOS development with the sale of Be to > Palm? Is BeOS even still available? It might not be worth the time to > develop for BeOS until you see what Palm decides to do with the software. Note that the changes I'm making are to make use of what is available through the libapr API that the Apache group has developed ... so, as long as they have the hooks in for BeOS, we will ... doesn't mean PgSQL will actually have makefiles for, and will compile under it, unless someone *with* BeOS steps forward, but alot of the core functionality that has held back native ports should work ...
"Marc G. Fournier" <scrappy@hub.org> writes: > The initial changes will be to just wrapper all our shared memory > code, so that I can make use of Apache's libapr libraries *if* they are > installed ... if not, it will just fall back to "the current code" ... I think we should redesign the shared memory API (and even more so the semaphore API), not just put a wrapper layer on it. A lot of the internal API is unnecessarily dependent on SysV shmem/sem behavior. Note however that there are some things you will break if you are not very careful. We are depending on shmem/sem behavior to catch a number of multiple-postmaster conflict situations. If there's not a more or less SysV-ish kernel underneath us, those situations will have to be rethought and some other interlock invented. In short, I want to see a design review first, not a bunch of off-the-cuff commits. regards, tom lane
Hi Marc, How about using Dev-C++? It's a Windows IDE with a GCC backend, and has a nice rep (and a Linux port): http://sourceforge.net/projects/dev-cpp/ It's always in SF.net's "Top 10" most worked on projects too, with about roughly 7,000 downloads per day. It can generate mingwin code too. :-) Regards and best wishes, Justin Clift -- "My grandfather once told me that there are two kinds of people: those who work and those who take the credit. He told me to try to be in the first group; there was less competition there." - Indira Gandhi
On Fri, 3 May 2002, Tom Lane wrote: > "Marc G. Fournier" <scrappy@hub.org> writes: > > The initial changes will be to just wrapper all our shared memory > > code, so that I can make use of Apache's libapr libraries *if* they are > > installed ... if not, it will just fall back to "the current code" ... > > I think we should redesign the shared memory API (and even more so the > semaphore API), not just put a wrapper layer on it. A lot of the > internal API is unnecessarily dependent on SysV shmem/sem behavior. > > Note however that there are some things you will break if you are not > very careful. We are depending on shmem/sem behavior to catch a number > of multiple-postmaster conflict situations. If there's not a more or > less SysV-ish kernel underneath us, those situations will have to be > rethought and some other interlock invented. > > In short, I want to see a design review first, not a bunch of > off-the-cuff commits. All I'm planning on doing is replacing the appropriate shm_* functions with pg_shm_* functions ... if !(libapr), all those pg_shm_* functions will have in them is the original call we've always used ... there will even be a --disable-libapr configure option so that if someone already has Apache2 installed, but doesn't wanna use libapr for PgSQL, they don't have to ... Basically, all I'm looking at is allowing PgSQL to use a different library for its shared memory calls than the standard one, nothing else ...
"Marc G. Fournier" <scrappy@hub.org> writes: > All I'm planning on doing is changing the appropriate shm_* functions iwth > pg_shm_* functions ... if !(libapr), all those pg_shm_* functions will > have in them is the original call we've always used ... there will even be > a --disable-libapr configure option so that if someone already has Apache2 > installed, but doesn't wanna use libapr for PgSQL, they don't have to ... > Basically, all I'm looking at is allowing PgSQL to use a different library > for its shared memory calls then the standard one, nothing else ... Oh. I guess my next question is how closely that Apache library emulates the SysV shmem semantics. In particular, can you reliably tell how many processes are attached to a shmem block? (Cf SharedMemoryIsInUse() in storage/ipc/ipc.c) Without that feature we have an interlock problem. regards, tom lane
On Fri, 3 May 2002, Tom Lane wrote: > "Marc G. Fournier" <scrappy@hub.org> writes: > > All I'm planning on doing is changing the appropriate shm_* functions iwth > > pg_shm_* functions ... if !(libapr), all those pg_shm_* functions will > > have in them is the original call we've always used ... there will even be > > a --disable-libapr configure option so that if someone already has Apache2 > > installed, but doesn't wanna use libapr for PgSQL, they don't have to ... > > > Basically, all I'm looking at is allowing PgSQL to use a different library > > for its shared memory calls then the standard one, nothing else ... > > Oh. I guess my next question is how closely that Apache library > emulates the SysV shmem semantics. In particular, can you reliably > tell how many processes are attached to a shmem block? (Cf > SharedMemoryIsInUse() in storage/ipc/ipc.c) Without that feature we > have an interlock problem. Will investigate this ... my immediate goal is to just get it so that an alternate library can be used ... default behaviour will be to stick with our current function calls ... to use libapr, you will/would have to use a configure option for it (sorry, meant --enable above, not --disable) ... The only '#ifdef's I'm planning on for this will be in a central shmem.* file(s), so there isn't going to be a string of those all over the place or anything stupid like that ...
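The wrapper idea described above (pg_shm_* functions living in one central file, with the #ifdef confined to that file) could look roughly like the sketch below. This is only an illustration under assumptions: USE_LIBAPR is a hypothetical macro that a configure switch might define, the exact pg_shm* names and signatures are guesses, and the libapr branch is deliberately left unimplemented because the thread does not say which APR calls would be used.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/*
 * Hypothetical central shared-memory wrapper file.
 * USE_LIBAPR is an assumed macro name; nothing in the thread fixes it.
 */
#ifdef USE_LIBAPR
#error "libapr-backed variants are not part of this sketch"
#else

/* Default build: each wrapper is just the original SysV call. */
int
pg_shmget(key_t key, size_t size, int flags)
{
    return shmget(key, size, flags);
}

void *
pg_shmat(int shmid, const void *addr, int flags)
{
    return shmat(shmid, addr, flags);
}

int
pg_shmdt(const void *addr)
{
    return shmdt(addr);
}

int
pg_shmctl(int shmid, int cmd, struct shmid_ds *buf)
{
    return shmctl(shmid, cmd, buf);
}

#endif   /* USE_LIBAPR */
```

With this layout, callers elsewhere in the tree would only ever see the pg_shm* names, so switching libraries would not require #ifdefs outside this one file.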
Tom Lane wrote: > > "Marc G. Fournier" <scrappy@hub.org> writes: > > All I'm planning on doing is changing the appropriate shm_* functions iwth > > pg_shm_* functions ... if !(libapr), all those pg_shm_* functions will > > have in them is the original call we've always used ... there will even be > > a --disable-libapr configure option so that if someone already has Apache2 > > installed, but doesn't wanna use libapr for PgSQL, they don't have to ... > > > Basically, all I'm looking at is allowing PgSQL to use a different library > > for its shared memory calls then the standard one, nothing else ... > > Oh. I guess my next question is how closely that Apache library > emulates the SysV shmem semantics. In particular, can you reliably > tell how many processes are attached to a shmem block? (Cf > SharedMemoryIsInUse() in storage/ipc/ipc.c) Without that feature we > have an interlock problem. I am not familiar with the Apache code, but I see no reason why all the features in SysV SHM should not be implementable in a Windows module. IMHO that's what should be done.
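For readers unfamiliar with the feature being asked about: on SysV systems the attach count of a segment is available through shmctl() with IPC_STAT, in the shm_nattch field. The sketch below shows only that query; it is not the actual SharedMemoryIsInUse() code, and the helper name shm_attach_count is made up for illustration. Any replacement library or Win32 emulation would need to offer something equivalent.

```c
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/*
 * Hypothetical helper: return the number of processes currently attached
 * to the SysV segment shmid, or -1 if the segment cannot be examined
 * (for example because it no longer exists).
 */
long
shm_attach_count(int shmid)
{
    struct shmid_ds info;

    if (shmctl(shmid, IPC_STAT, &info) < 0)
        return -1;
    return (long) info.shm_nattch;
}
```

A starting postmaster could treat a nonzero count on a leftover segment as evidence that old backends are still alive.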
"Marc G. Fournier" wrote: > > On Fri, 3 May 2002, Tom Lane wrote: > > > "Marc G. Fournier" <scrappy@hub.org> writes: > > > All I'm planning on doing is changing the appropriate shm_* functions iwth > > > pg_shm_* functions ... if !(libapr), all those pg_shm_* functions will > > > have in them is the original call we've always used ... there will even be > > > a --disable-libapr configure option so that if someone already has Apache2 > > > installed, but doesn't wanna use libapr for PgSQL, they don't have to ... > > > > > Basically, all I'm looking at is allowing PgSQL to use a different library > > > for its shared memory calls then the standard one, nothing else ... > > > > Oh. I guess my next question is how closely that Apache library > > emulates the SysV shmem semantics. In particular, can you reliably > > tell how many processes are attached to a shmem block? (Cf > > SharedMemoryIsInUse() in storage/ipc/ipc.c) Without that feature we > > have an interlock problem. > > Will investigate this ... my immediate goal is to just get it so that an > alternate library can be used ... default behaviour will be to stick with > our current function calls ... to use libapr, you will/would have to use a > configure option for it (sorry, meant --enable above, not --disable) ... > > The only '#ifdef's I'm planning on for this will be in a central shmem.* > file(s), so there isn't going to be a string of those all over the place > or anything stupid like that ... I think that you should create a verbatim implementation of the SysV shared memory API in native Win32. It may have to be a pgsysvshm.dll or something like it, but I think it is the best possible approach. Let me look at it, I may be able to have something pretty quick.
mlw <markw@mohawksoft.com> writes: > I think that you should create a verbatim implementation of the SysV > shared memory API in native Win32. It may have to be a pgsysvshm.dll > or something like it, but I think it is the best possible approach. > Let me look at it, I may be able to have something pretty quick. The notion of redesigning the internal API shouldn't be forgotten, though. I'm not so dissatisfied with the shmem API (mainly because it's only relevant at startup; once we've created and attached the shmem segment, we're done worrying about it). But the SysV semaphore API is really kind of ugly, and the ugliness doesn't buy anything except porting difficulty. Moreover, putting a cleaner API layer there would make it easier to experiment with cheaper semaphore primitives, such as POSIX mutexes. There was a thread last fall concerning redesigning that code --- I've forgotten the guy's name, but IIRC he wanted to make a port to QNX6, and the sema code was getting in the way. We put the work on hold because we were getting close to 7.2 release (or thought we were, anyway) but the project ought to be taken up again. regards, tom lane
Tom Lane wrote: > > mlw <markw@mohawksoft.com> writes: > > I think that you should create a verbatim implementation of the SysV > > shared memory API in native Win32. It may have to be a pgsysvshm.dll > > or something like it, but I think it is the best possible approach. > > > Let me look at it, I may be able to have something pretty quick. > > The notion of redesigning the internal API shouldn't be forgotten, > though. I'm not so dissatisfied with the shmem API (mainly because > it's only relevant at startup; once we've created and attached the > shmem segment, we're done worrying about it). But the SysV semaphore > API is really kind of ugly, and the ugliness doesn't buy anything except > porting difficulty. Moreover, putting a cleaner API layer there would > make it easier to experiment with cheaper semaphore primitives, such > as POSIX mutexes. > > There was a thread last fall concerning redesigning that code --- I've > forgotten the guy's name, but IIRC he wanted to make a port to QNX6, > and the sema code was getting in the way. We put the work on hold > because we were getting close to 7.2 release (or thought we were, > anyway) but the project ought to be taken up again. I will commit to writing a Windows version of whatever shm/semaphore/mutex code you guys specify. > > regards, tom lane
sysv shm/sem
I am writing a Win32 DLL implementation of:

int semget(key_t key, int nsems, int semflg);
int semctl(int semid, int semnum, int cmd, union semun arg);
int semop(int semid, struct sembuf * sops, unsigned nsops);
int shmctl(int shmid, int cmd, struct shmid_ds *buf);
int shmget(key_t key, int size, int shmflg);
void * shmat(int shmid, const void *shmaddr, int shmflg);
int shmdt(const void *shmaddr);

I will donate it to PostgreSQL. UNIX permissions will be ignored, i.e. uid/gid will be 0.

Do you see any need for the msgxxx calls? Is the function ipc() ever used?
mlw <markw@mohawksoft.com> writes: > UNIX permissions will be ignored, i.e. uig/gid will be 0 Win32 has no security anyway, right? ;-) > Do you see any need for the msgxxx calls? > Is the function ipc() ever used? Nope, and nope. regards, tom lane
> mlw <markw@mohawksoft.com> writes: > > I think that you should create a verbatim implementation of the SysV > > shared memory API in native Win32. It may have to be a pgsysvshm.dll > > or something like it, but I think it is the best possible approach. > > > Let me look at it, I may be able to have something pretty quick. > > The notion of redesigning the internal API shouldn't be forgotten, > though. I'm not so dissatisfied with the shmem API (mainly because > it's only relevant at startup; once we've created and attached the > shmem segment, we're done worrying about it). But the SysV semaphore > API is really kind of ugly, and the ugliness doesn't buy anything except > porting difficulty. Moreover, putting a cleaner API layer there would > make it easier to experiment with cheaper semaphore primitives, such > as POSIX mutexes. > > There was a thread last fall concerning redesigning that code --- I've > forgotten the guy's name, but IIRC he wanted to make a port to QNX6, That would be me. > and the sema code was getting in the way. We put the work on hold > because we were getting close to 7.2 release (or thought we were, > anyway) but the project ought to be taken up again. > Yes, I am intended to give it another spin soon. I think it is bad idea to impose SysV ugliness on systems which have better solutions. Main problem with SysV primitives is that they are 'sticky' (i.e., not cleaned up if process dies/exits by the system). So Postgres has to deal with issues like discovering leftovers, finding unused IPC keys, etc. It is inelegant and takes up lot of code. POSIX primitives are anonymous and cleaned up automatically. So you just say 'give me a semaphore' and you get it, nothing gets into your way. Performance of POSIX mutexes and semaphores (on platforms where they are implemented properly) is also better than SysV semaphores. 
Unfortunately some systems have rather lame POSIX support; for example, semaphores and mutexes can't be shared across processes on Linux. That's basically the reason why people keep sticking to SysV. What really needs to be done is a new abstraction layer that would cover the SysV API, POSIX, and whatever native APIs are better for BeOS/OS2/Win32. I almost did it last time... -- igor
mlw <markw@mohawksoft.com> writes:
> I am writing a Win32 DLL implementation of :
> int semget(key_t key, int nsems, int semflg);
> int semctl(int semid, int semnum, int cmd, union semun arg);
> int semop(int semid, struct sembuf * sops, unsigned nsops);

Rather than propagating the SysV semaphore API still further, why don't we kill it now? (I'm willing to keep the shmem API, however.)

After looking over the uses of these functions, I believe that we could easily develop a non-SysV-centric internal API. Here's a first cut:

1. Define a struct type PGSemaphore that has implementation-specific contents (the generic code will never look inside it). Operations on semaphores will take "PGSemaphore *" arguments. When implementing atop SysV semaphores, PGSemaphore will contain two fields, the semaphore id and semaphore number. In other cases the contents could be different.

2. All PGSemaphore structs will be physically stored in shared memory. This doesn't matter for SysV support, where the id/number are constants anyway; but it will allow implementations based on mutexes.

3. The operations needed are

* Reserve semaphores. This will be told the number of semaphores needed. On SysV it will do the necessary semget()s, but on some implementations it might be a no-op. This should also be prepared to clean up after a failed postmaster, if it is possible for sema resources to outlive the creating postmaster.

* Create semaphore. Given a pointer to an uninitialized PGSemaphore struct, initialize it to a new semaphore with count 1. (On SysV this would hand out the individual semas previously allocated by Reserve.) Note that this is not responsible for allocating the memory occupied by the PGSemaphore struct --- I envision the structs being part of larger objects such as PROC structures.

* Release semaphores. Release all resources allocated by previous Reserve and Create operations. This is called when shutting down or when resetting shared memory after a backend crash.

* Reset semaphore. Reset an existing PGSemaphore to count zero.

* Lock semaphore. Identical to current IpcSemaphoreLock(), except parameter is a PGSemaphore *. See code of that routine for detailed semantics.

* Unlock semaphore. Identical to current IpcSemaphoreUnlock(), except parameter is a PGSemaphore *.

* Conditional lock semaphore. Identical to current IpcSemaphoreTryLock(), except parameter is a PGSemaphore *.

Reserve/create/release would all be called in the postmaster process, so they could communicate via malloc'd private memory (eg, an array of semaphore IDs would be needed in the SysV case). The remaining operations would be invokable by any backend.

Comments?

I'd be willing to work on refactoring the existing SysV-based code to meet this spec.

regards, tom lane
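The operations listed in the proposal above can be sketched on top of SysV semaphores roughly as follows. Everything here is illustrative: the function names, the bool return convention, and the file-level bookkeeping variables are assumptions, not the eventual PostgreSQL code. For brevity the sketch reserves a single semaphore set (the proposal allows several semget() calls) and omits the cleanup after a failed postmaster.

```c
#include <errno.h>
#include <stdbool.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* Argument type for semctl(); renamed so it cannot clash with platforms
 * whose headers already define union semun. */
union pg_semun { int val; struct semid_ds *buf; unsigned short *array; };

/* SysV flavour of the opaque struct: set id plus index within the set. */
typedef struct PGSemaphore { int semId; int semNum; } PGSemaphore;

/* Postmaster-private bookkeeping (hypothetical, one set only). */
static int reservedSetId = -1;
static int reservedCount = 0;
static int nextFree = 0;

bool PGReserveSemaphores(int count)
{
    reservedSetId = semget(IPC_PRIVATE, count, IPC_CREAT | 0600);
    if (reservedSetId < 0)
        return false;
    reservedCount = count;
    nextFree = 0;
    return true;
}

static bool set_count(PGSemaphore *sema, int value)
{
    union pg_semun arg;

    arg.val = value;
    return semctl(sema->semId, sema->semNum, SETVAL, arg) == 0;
}

/* Hand out the next reserved semaphore, initialized to count 1. */
bool PGSemaphoreCreate(PGSemaphore *sema)
{
    if (reservedSetId < 0 || nextFree >= reservedCount)
        return false;
    sema->semId = reservedSetId;
    sema->semNum = nextFree;
    if (!set_count(sema, 1))
        return false;
    nextFree++;
    return true;
}

void PGReleaseSemaphores(void)
{
    if (reservedSetId >= 0)
        semctl(reservedSetId, 0, IPC_RMID);
    reservedSetId = -1;
    reservedCount = 0;
    nextFree = 0;
}

bool PGSemaphoreReset(PGSemaphore *sema) { return set_count(sema, 0); }

/* Apply one semop(), retrying if interrupted by a signal. */
static bool adjust(PGSemaphore *sema, short delta, short flags)
{
    struct sembuf op;

    op.sem_num = (unsigned short) sema->semNum;
    op.sem_op = delta;
    op.sem_flg = flags;
    while (semop(sema->semId, &op, 1) < 0)
        if (errno != EINTR)
            return false;   /* includes EAGAIN for the conditional case */
    return true;
}

bool PGSemaphoreLock(PGSemaphore *sema)    { return adjust(sema, -1, 0); }
bool PGSemaphoreUnlock(PGSemaphore *sema)  { return adjust(sema, 1, 0); }
bool PGSemaphoreTryLock(PGSemaphore *sema) { return adjust(sema, -1, IPC_NOWAIT); }
```

Because callers only ever hold a PGSemaphore pointer, a Win32 or POSIX implementation could change the struct contents and the function bodies without touching the generic code.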
"Igor Kovalenko" <Igor.Kovalenko@motorola.com> writes: > What really need to be done is new abstraction layer which would cover SysV > API, POSIX and whatever native APIs are better for BeOS/OS2/Win32. I almost > did it last time... Yes. I just sent off a proposal for a cleaner semaphore API --- please comment on it. My inclination is to stick with the SysV API for shared memory, however. The "stickiness" is actually not a bad thing for us in the shared memory case, because it allows a new postmaster to detect the situation where old backends are still running: it can see that there is an old shmem segment still present with attached processes. Without that, we have no good defense against the scenario where an old postmaster dumped core leaving backends still running. The backends are fine as long as they are left to finish out their operations, or even killed with whatever degree of prejudice the admin wants. But what we must *not* do is allow a new postmaster to start while the old backends are still running; that would mean two sets of backends running without contact with each other, which would be fatal for data integrity. The SysV API lets us detect that case, but I don't see any equally good way to do it if we are using anonymous shared memory. regards, tom lane
Like I told Marc, I don't care. You spec out what you want and I'll write it for Windows. That being said, a SysV IPC interface for native Windows would be kind of cool to have. Tom Lane wrote: > > mlw <markw@mohawksoft.com> writes: > > I am writing a Win32 DLL implementation of : > > > int semget(key_t key, int nsems, int semflg); > > int semctl(int semid, int semnum, int cmd, union semun arg); > > int semop(int semid, struct sembuf * sops, unsigned nsops); > > Rather than propagating the SysV semaphore API still further, why don't > we kill it now? (I'm willing to keep the shmem API, however.) > > After looking over the uses of these functions, I believe that we could > easily develop a non-SysV-centric internal API. Here's a first cut: > > 1. Define a struct type PGSemaphore that has implementation-specific > contents (the generic code will never look inside it). Operations on > semaphores will take "PGSemaphore *" arguments. When implementing atop > SysV semaphores, PGSemaphore will contain two fields, the semaphore id > and semaphore number. In other cases the contents could be different. > > 2. All PGSemaphore structs will be physically stored in shared memory. > This doesn't matter for SysV support, where the id/number are constants > anyway; but it will allow implementations based on mutexes. > > 3. The operations needed are > > * Reserve semaphores. This will be told the number of semaphores > needed. On SysV it will do the necessary semget()s, but on some > implementations it might be a no-op. This should also be prepared > to clean up after a failed postmaster, if it is possible for sema > resources to outlive the creating postmaster. > > * Create semaphore. Given a pointer to an uninitialized PGSemaphore > struct, initialize it to a new semaphore with count 1. (On SysV this > would hand out the individual semas previously allocated by Reserve.) 
> Note that this is not responsible for allocating the memory occupied > by the PGSemaphore struct --- I envision the structs being part of > larger objects such as PROC structures. > > * Release semaphores. Release all resources allocated by previous > Reserve and Create operations. This is called when shutting down > or when resetting shared memory after a backend crash. > > * Reset semaphore. Reset an existing PGSemaphore to count zero. > > * Lock semaphore. Identical to current IpcSemaphoreLock(), except > parameter is a PGSemaphore *. See code of that routine for detailed > semantics. > > * Unlock semaphore. Identical to current IpcSemaphoreUnlock(), except > parameter is a PGSemaphore *. > > * Conditional lock semaphore. Identical to current > IpcSemaphoreTryLock(), except parameter is a PGSemaphore *. > > Reserve/create/release would all be called in the postmaster process, > so they could communicate via malloc'd private memory (eg, an array > of semaphore IDs would be needed in the SysV case). The remaining > operations would be invokable by any backend. > > Comments? > > I'd be willing to work on refactoring the existing SysV-based code > to meet this spec. > > regards, tom lane
> "Igor Kovalenko" <Igor.Kovalenko@motorola.com> writes: > > What really need to be done is new abstraction layer which would cover SysV > > API, POSIX and whatever native APIs are better for BeOS/OS2/Win32. I almost > > did it last time... > > Yes. I just sent off a proposal for a cleaner semaphore API --- please > comment on it. > I will look. I remember from my last attempt that it actually did not involve a lot of changes in your existing abstraction layer (which already exists, just being SysV-centric). I believe only one function prototype had to be changed... Your proposal sounds like more changes will be needed... > My inclination is to stick with the SysV API for shared memory, however. > The "stickiness" is actually not a bad thing for us in the shared memory > case, because it allows a new postmaster to detect the situation where > old backends are still running: it can see that there is an old shmem > segment still present with attached processes. Without that, we have no > good defense against the scenario where an old postmaster dumped core > leaving backends still running. The backends are fine as long as they > are left to finish out their operations, or even killed with whatever > degree of prejudice the admin wants. But what we must *not* do is allow > a new postmaster to start while the old backends are still running; > that would mean two sets of backends running without contact with each > other, which would be fatal for data integrity. The SysV API lets us > detect that case, but I don't see any equally good way to do it if we > are using anonymous shared memory. It does not have to be anonymous. POSIX also defines shm_open(same arguments as open) API which will create named object in whatever location corresponds to shared memory storage on that platform (object is then grown to needed size by ftruncate() and the fd is then passed to mmap). 
The object will exist in name space and can be detected by subsequent calls to shm_open() with same name. It is not really different from doing open(), but more portable (mmap() on regular files may not be supported). I suggest we do IPC abstraction which would cover shared memory as well as semaphores, otherwise it will be only half of solution - platforms without SysV API would still have to emulate SysV shared memory. -- igor
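The named-object sequence described above (shm_open, then ftruncate, then mmap) can be sketched as below. The helper name create_named_segment and the permission bits are assumptions made for the example. Note that this only shows how a later caller can detect that the name already exists; it does not show how many processes have the object mapped. On some older systems the program must be linked with -lrt.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: create a new named POSIX shared memory object of
 * the given size and map it. Returns NULL on failure; errno == EEXIST
 * means an object with this name is already present.
 */
void *
create_named_segment(const char *name, size_t size)
{
    void *addr;
    int   saved;
    int   fd = shm_open(name, O_CREAT | O_EXCL | O_RDWR, 0600);

    if (fd < 0)
        return NULL;

    /* a freshly created object has length zero; grow it first */
    if (ftruncate(fd, (off_t) size) < 0)
    {
        saved = errno;
        close(fd);
        shm_unlink(name);
        errno = saved;
        return NULL;
    }

    addr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    saved = errno;
    close(fd);              /* the mapping stays valid after close */
    if (addr == MAP_FAILED)
    {
        shm_unlink(name);
        errno = saved;
        return NULL;
    }
    return addr;
}
```

The name persists until shm_unlink() is called, so a second attempt to create the same name fails with EEXIST even after the creating process has exited.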
"Igor Kovalenko" <Igor.Kovalenko@motorola.com> writes: > It does not have to be anonymous. POSIX also defines shm_open(same arguments > as open) API which will create named object in whatever location corresponds > to shared memory storage on that platform (object is then grown to needed > size by ftruncate() and the fd is then passed to mmap). The object will > exist in name space and can be detected by subsequent calls to shm_open() > with same name. It is not really different from doing open(), but more > portable (mmap() on regular files may not be supported). Yes, but can you detect whether other processes have the same file open? regards, tom lane
On Fri, 3 May 2002, Tom Lane wrote:

> But what we must *not* do is allow a new postmaster to start while the
> old backends are still running; that would mean two sets of backends
> running without contact with each other, which would be fatal for data
> integrity. The SysV API lets us detect that case, but I don't see any
> equally good way to do it if we are using anonymous shared memory.

It's a hack (and has slight security implications), but you could just allow the postgres backends to keep the listening socket(s) open.

Matthew.
> -----Original Message-----
> From: pgsql-hackers-owner@postgresql.org
> [mailto:pgsql-hackers-owner@postgresql.org]On Behalf Of Tom Lane
> Sent: Friday, May 03, 2002 6:07 PM
> To: mlw
> Cc: Marc G. Fournier; pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] HEADS UP: Win32/OS2/BeOS native ports
>
> Rather than propagating the SysV semaphore API still further, why don't
> we kill it now? (I'm willing to keep the shmem API, however.)

Would this have the benefit of allowing PostgreSQL to work properly in BSD jails, since the lack of really working SysV IPC was the problem there?

- J.
"Joel Burton" <joel@joelburton.com> writes: >> Rather than propagating the SysV semaphore API still further, why don't >> we kill it now? (I'm willing to keep the shmem API, however.) > Would this have the benefit of allow PostgreSQL to work properly in BSD > jails, since lack of really working SysV IPC was the problem there? Was the problem just with semas, or was shmem an issue too? In any case, unless someone actually writes an alternative sema implementation that will work on BSD, nothing will happen... regards, tom lane
Matthew Kirkwood <matthew@hairy.beasts.org> writes:
> On Fri, 3 May 2002, Tom Lane wrote:
>> The SysV API lets us detect that case, but I don't see any
>> equally good way to do it if we are using anonymous shared memory.

> It's a hack (and has slight security implications), but you
> could just allow the postgres backends to keep the listening
> socket(s) open.

Hmm. That might be workable, but it feels shaky to me. The problem is that you are using a lock based on port number to interlock a data directory --- and port number and data directory are independently variable parameters. Consider

    $ postmaster -D /my/dir &
    -- dba thinks "oops, forgot to specify port"
    $ kill -9 pm-pid        # bad idea
    $ postmaster -D /my/dir -p myport &

Any backends started by the first postmaster will not be noticed by the second one, if the interlock is based on port number.

We could get around this, of course: record the port number in the data directory lockfile, and test for existence of the old socket independently of trying to create a new one. But it seems ugly.

regards, tom lane
I have just committed changes to create a platform-independent internal API for semaphores, along the lines discussed yesterday.

At this point, the Darwin (Mac OS X), BeOS, and QNX4 ports are probably broken. I will fix the Darwin port (probably not till tomorrow though); volunteers to clean up the BeOS and QNX4 ports are needed.

BTW, there is a quick hack attempt at a POSIX-semaphore-based implementation in src/backend/port/posix_sema.c. I have not tested this yet, but expect to do so as part of fixing the Darwin port.

regards, tom lane
> "Joel Burton" <joel@joelburton.com> writes: > >> Rather than propagating the SysV semaphore API still further, why don't > >> we kill it now? (I'm willing to keep the shmem API, however.) > > > Would this have the benefit of allow PostgreSQL to work properly in BSD > > jails, since lack of really working SysV IPC was the problem there? > > Was the problem just with semas, or was shmem an issue too? Not sure -- doesn't get far enough for me to tell. initdb dies with: creating template1 database in /usr/local/pgsql/data/base/1... IpcSemaphoreCreate: semget(key=1, num=17, 03600) failed: Function not implemented > In any case, unless someone actually writes an alternative sema > implementation that will work on BSD, nothing will happen... Was hoping that the discussions about the APR might let this work under BSD jails, assuming I can get the APR to compile. (For others: apparently PG will work under BSD jails if you recompile the BSD kernel w/some new settings, but my ISP for this project was unwilling to do that. Search the mailing list for messages on how to do this.) J.
> -----Original Message-----
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> Sent: Friday, May 03, 2002 3:07 PM
> To: mlw
> Cc: Marc G. Fournier; pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] HEADS UP: Win32/OS2/BeOS native ports
>
> mlw <markw@mohawksoft.com> writes:
> > I am writing a Win32 DLL implementation of :
>
> > int semget(key_t key, int nsems, int semflg);
> > int semctl(int semid, int semnum, int cmd, union semun arg);
> > int semop(int semid, struct sembuf * sops, unsigned nsops);
>
> Rather than propagating the SysV semaphore API still further, why don't
> we kill it now? (I'm willing to keep the shmem API, however.)
>
> After looking over the uses of these functions, I believe that we could
> easily develop a non-SysV-centric internal API. Here's a first cut:
>
> 1. Define a struct type PGSemaphore that has implementation-specific
> contents (the generic code will never look inside it). Operations on
> semaphores will take "PGSemaphore *" arguments. When implementing atop
> SysV semaphores, PGSemaphore will contain two fields, the semaphore id
> and semaphore number. In other cases the contents could be different.
>
> 2. All PGSemaphore structs will be physically stored in shared memory.
> This doesn't matter for SysV support, where the id/number are constants
> anyway; but it will allow implementations based on mutexes.
>
> 3. The operations needed are
>
> * Reserve semaphores. This will be told the number of semaphores
> needed. On SysV it will do the necessary semget()s, but on some
> implementations it might be a no-op. This should also be prepared
> to clean up after a failed postmaster, if it is possible for sema
> resources to outlive the creating postmaster.
>
> * Create semaphore. Given a pointer to an uninitialized PGSemaphore
> struct, initialize it to a new semaphore with count 1. (On SysV this
> would hand out the individual semas previously allocated by Reserve.)
> Note that this is not responsible for allocating the memory occupied
> by the PGSemaphore struct --- I envision the structs being part of
> larger objects such as PROC structures.
>
> * Release semaphores. Release all resources allocated by previous
> Reserve and Create operations. This is called when shutting down
> or when resetting shared memory after a backend crash.
>
> * Reset semaphore. Reset an existing PGSemaphore to count zero.
>
> * Lock semaphore. Identical to current IpcSemaphoreLock(), except
> parameter is a PGSemaphore *. See code of that routine for detailed
> semantics.
>
> * Unlock semaphore. Identical to current IpcSemaphoreUnlock(), except
> parameter is a PGSemaphore *.
>
> * Conditional lock semaphore. Identical to current
> IpcSemaphoreTryLock(), except parameter is a PGSemaphore *.
>
> Reserve/create/release would all be called in the postmaster process,
> so they could communicate via malloc'd private memory (eg, an array
> of semaphore IDs would be needed in the SysV case). The remaining
> operations would be invokable by any backend.
>
> Comments?
>
> I'd be willing to work on refactoring the existing SysV-based code
> to meet this spec.

It's already been done. Here is a freely available C++ implementation (licensing similar to PostgreSQL):

http://www.cs.wustl.edu/~schmidt/ACE.html
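To make the quoted proposal concrete, here is a rough sketch of the per-semaphore operations (Create, Reset, Lock, Unlock, Conditional lock) layered on POSIX unnamed semaphores. The pgsem_* names are hypothetical and chosen only for this illustration; this is not the code that was committed. Reserve and Release are omitted because they would be close to no-ops for this backing.

```c
#include <assert.h>
#include <errno.h>
#include <semaphore.h>
#include <stdbool.h>
#include <stddef.h>

/* Implementation-specific contents; generic code never looks inside. */
typedef struct PGSemaphore
{
    sem_t sem;      /* assumption: backed by a POSIX unnamed semaphore */
} PGSemaphore;

/* Create: initialize caller-provided storage to a semaphore with count 1.
 * pshared = 1 because the struct is meant to live in shared memory. */
int pgsem_create(PGSemaphore *s)
{
    assert(s != NULL);
    return sem_init(&s->sem, 1, 1);
}

/* Reset: drain the count down to zero without blocking. */
void pgsem_reset(PGSemaphore *s)
{
    for (;;)
    {
        if (sem_trywait(&s->sem) == 0)
            continue;               /* took one unit, keep draining */
        if (errno == EINTR)
            continue;               /* interrupted, retry */
        break;                      /* EAGAIN: count is zero */
    }
}

/* Lock: decrement, blocking while the count is zero; retry on EINTR. */
int pgsem_lock(PGSemaphore *s)
{
    int rc;

    do
        rc = sem_wait(&s->sem);
    while (rc < 0 && errno == EINTR);
    return rc;
}

/* Unlock: increment the count. */
int pgsem_unlock(PGSemaphore *s)
{
    return sem_post(&s->sem);
}

/* Conditional lock: true if the semaphore was acquired without waiting. */
bool pgsem_trylock(PGSemaphore *s)
{
    return sem_trywait(&s->sem) == 0;
}
```

The caller owns the memory, as in the proposal: a PGSemaphore would normally be embedded in a larger shared-memory structure.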
"Joel Burton" <joel@joelburton.com> writes: > Would this have the benefit of allow PostgreSQL to work properly in BSD > jails, since lack of really working SysV IPC was the problem there? >> >> Was the problem just with semas, or was shmem an issue too? > Not sure -- doesn't get far enough for me to tell. initdb dies with: > creating template1 database in /usr/local/pgsql/data/base/1... > IpcSemaphoreCreate: semget(key=1, num=17, 03600) failed: > Function not implemented We create shared memory before semaphores, so if you got this far then the shmem code is probably working (at least minimally). Do you have working sem_open or sem_init (ie, POSIX semaphores)? regards, tom lane
> > Rather than propagating the SysV semaphore API still further, why don't
> > we kill it now? (I'm willing to keep the shmem API, however.)
>
> Would this have the benefit of allow PostgreSQL to work properly in BSD
> jails, since lack of really working SysV IPC was the problem there?

I have postgresql working quite happily in FreeBSD jails! (Just make sure you run "sysctl jail.sysvipc_allowed=1".)

Chris
> (For others: apparently PG will work under BSD jails if you recompile the
> BSD kernel w/some new settings, but my ISP for this project was
> unwilling to do that. Search the mailing list for messages on how to
> do this.)

Works fine. You don't need to recompile - just use the sysctl.

Chris
Marc G. Fournier wrote:

> hrmmmm ... do you have a working Windows development environment? I'm
> running WinXP at home, but don't have any of the compilers or anything
> yet, so all my work for the first part is going to be done under Unix ...
>
> but someone that knows something about building makefiles for Windows, and
> compiling under it, will definitely be a major asset ;)

I think if you are familiar with make and gcc (and perhaps autoconf), MinGW and MSys are the development environment of choice on Windows. You even get /bin/sh. But the generated program does not depend on any custom library (as it does with cygwin). It's even possible to cross-compile from a Linux box (actually powerpc in my case).

Look at http://mingw.sourceforge.net (and there for msys).

Christof
> > > Rather than propagating the SysV semaphore API still further, why don't
> > > we kill it now? (I'm willing to keep the shmem API, however.)
> >
> > Would this have the benefit of allow PostgreSQL to work properly in BSD
> > jails, since lack of really working SysV IPC was the problem there?
>
> I have postgresql working quite happily in FreeBSD jails! (Just make sure
> you go "sysctl jail.sysvipc_allowed=1").

Yep, Alastair D'Silva helpfully pointed this out a month or two ago, and for many people, this would be a workable solution. Unfortunately, it appears that you have to run this command outside the jail, which I don't have access to.

I forwarded the suggestion to my ISP (imeme, a Zope provider), who said that:

"This will allow you to run a single postgres in a single jail only one user would have access to it. If you try to run more then one it will try to use the same shared memory and crash."

And therefore they refused to make the change. (More annoyingly, they kept trying to convince me that I should quit my whining and use MySQL since it's "ACID compliant").

So, I'm holding out hope that since this ISP seems unenlightened, one day PostgreSQL will simply run in BSD jails without a cooperating jailmaster, and it sounded like using the APR _might_ make this possible. (All of my other projects use PG; I'd sure love to get this one switched over!)

Joel BURTON | joel@joelburton.com | joelburton.com | aim: wjoelburton
Knowledge Management & Technology Consultant
> I forwarded the suggestion to my ISP (imeme, a Zope provider), who said
> that:
>
> "This will allow you to run a single postgres in a single jail only one
> user would have access to it. If you try to run more then one it will
> try to use the same shared memory and crash."

Not true. But I'll avoid digging up any more on that old issue...

Chris
On Sat, 4 May 2002, Joel Burton wrote:

> > -----Original Message-----
> > From: pgsql-hackers-owner@postgresql.org
> > [mailto:pgsql-hackers-owner@postgresql.org]On Behalf Of Tom Lane
> > Sent: Friday, May 03, 2002 6:07 PM
> > To: mlw
> > Cc: Marc G. Fournier; pgsql-hackers@postgresql.org
> > Subject: Re: [HACKERS] HEADS UP: Win32/OS2/BeOS native ports
> >
> > Rather than propagating the SysV semaphore API still further, why don't
> > we kill it now? (I'm willing to keep the shmem API, however.)
>
> Would this have the benefit of allow PostgreSQL to work properly in BSD
> jails, since lack of really working SysV IPC was the problem there?

There is no problem with SysV IPC in the jail, per se ... jails were just not coded to delimit/segregate such IPC from other jails ... it's one of those "caveat emptor" situations ... you can do it, but at your own risk, as someone in another jail has the ability to 'attach' to your segments ...
> -----Original Message-----
> From: Christopher Kings-Lynne [mailto:chriskl@familyhealth.com.au]
> Sent: Monday, May 06, 2002 7:36 AM
> To: Joel Burton; Tom Lane; mlw
> Cc: Marc G. Fournier; pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] HEADS UP: Win32/OS2/BeOS native ports
>
> > I forwarded the suggestion to my ISP (imeme, a Zope provider), who said
> > that:
> >
> > "This will allow you to run a single postgres in a single jail only one
> > user would have access to it. If you try to run more then one it will
> > try to use the same shared memory and crash."
>
> Not true. But I'll avoid digging up any more on that old issue...

Oh, I'm sure it's not true. But sometimes things end up on the "nyah, nyah, it's my server and I say so" level. Sigh.

So, I guess that's where it leaves me: waiting for some solution other than ISP cluefulness. :-)

- J.

Joel BURTON | joel@joelburton.com | joelburton.com | aim: wjoelburton
Knowledge Management & Technology Consultant
On Sat, 4 May 2002, Tom Lane wrote:

> Matthew Kirkwood <matthew@hairy.beasts.org> writes:
> > On Fri, 3 May 2002, Tom Lane wrote:
> >> The SysV API lets us detect that case, but I don't see any
> >> equally good way to do it if we are using anonymous shared memory.
>
> > It's a hack (and has slight security implications), but you
> > could just allow the postgres backends to keep the listening
> > socket(s) open.
>
> Hmm. That might be workable, but it feels shaky to me. The problem
> is that you are using a lock based on port number to interlock a data
> directory --- and port number and data directory are independently
> variable parameters. Consider
>     $ postmaster -D /my/dir &
>     -- dba thinks "oops, forgot to specify port"
>     $ kill -9 pm-pid        # bad idea
>     $ postmaster -D /my/dir -p myport &
> Any backends started by the first postmaster will not be noticed by
> the second one, if the interlock is based on port number.
>
> We could get around this, of course: record the port number in the data
> directory lockfile, and test for existence of the old socket
> independently of trying to create a new one. But it seems ugly.

How about a second, data directory based socket simply named something like '.inuse', that is not port dependent?
On Sun, 5 May 2002, Joel Burton wrote:

> > "Joel Burton" <joel@joelburton.com> writes:
> > >> Rather than propagating the SysV semaphore API still further, why don't
> > >> we kill it now? (I'm willing to keep the shmem API, however.)
> >
> > > Would this have the benefit of allow PostgreSQL to work properly in BSD
> > > jails, since lack of really working SysV IPC was the problem there?
> >
> > Was the problem just with semas, or was shmem an issue too?
>
> Not sure -- doesn't get far enough for me to tell. initdb dies with:
>
> creating template1 database in /usr/local/pgsql/data/base/1...
> IpcSemaphoreCreate: semget(key=1, num=17, 03600) failed:
> Function not implemented

Read the jail manpage:

    jail.sysvipc_allowed
        This MIB entry determines whether or not processes within a jail
        have access to System V IPC primitives. In the current jail
        implementation, System V primitives share a single namespace across
        the host and jail environments, meaning that processes within a
        jail would be able to communicate with (and potentially interfere
        with) processes outside of the jail, and in other jails. As such,
        this functionality is disabled by default, but can be enabled by
        setting this MIB entry to 1.
Or changing ISPs to a place more enlightened ...

On Mon, 6 May 2002, Joel Burton wrote:

> So, I guess that's where it leaves me: waiting for some solution other than
> ISP cluefulness. :-)
"Marc G. Fournier" <scrappy@hub.org> writes: >> We could get around this, of course: record the port number in the data >> directory lockfile, and test for existence of the old socket >> independently of trying to create a new one. But it seems ugly. > How about a second, data directory based socket simply named something > like '.inuse', that is not port dependent? Hmm ... but how do you use that to tell if there are still backends around? regards, tom lane
On Mon, 6 May 2002, Tom Lane wrote:

> "Marc G. Fournier" <scrappy@hub.org> writes:
> >> We could get around this, of course: record the port number in the data
> >> directory lockfile, and test for existence of the old socket
> >> independently of trying to create a new one. But it seems ugly.
>
> > How about a second, data directory based socket simply named something
> > like '.inuse', that is not port dependent?
>
> Hmm ... but how do you use that to tell if there are still backends
> around?

As a backend is started up, connect to that socket ... if socket is open when trying to start a new frontend, fail as there are currently other connections attached to it?
"Marc G. Fournier" <scrappy@hub.org> writes: >> Hmm ... but how do you use that to tell if there are still backends >> around? > As a backend is started up, connect to that socket ... if socket is open > when trying to start a new frontend, fail as there are currently other > connections attached to it? But the backends would only have the socket open, they'd not be actively listening to it. So how could you tell whether anyone had the socket open or not? ISTM we gave up on exactly that technique for the main postmaster's socket; we now create a separate lockfile to protect the socket, and don't rely on the socket itself to give us any interlocking help at all. But the lockfile just contains the postmaster's PID, so it's no help in detecting the case where the old postmaster has gone away but there are still orphaned backends laying about. I'm not entirely thrilled with the lockfile technique; it'd be nice to find something better. (In particular, we've seen a couple cases now where people had trouble with PG refusing to start after a system reboot, because some other daemon process had been assigned the PID that the postmaster had in its previous incarnation; so the lockfile check code mistakenly thinks there's still an old postmaster.) But so far, the only thing worse than lockfiles is everything else :-( regards, tom lane
I said:
> But the backends would only have the socket open, they'd not be actively
> listening to it. So how could you tell whether anyone had the socket
> open or not?

Oh, I take that back, I see how you could do it: the postmaster opens the socket *for writing*, but never actually writes. All its child processes inherit that same open file descriptor and just keep it around. Then, to tell if anyone's home, you open the socket *for reading* and try to read in O_NONBLOCK mode. You get an EOF indication if and only if no one has the socket open for writing; otherwise you get an EAGAIN error.

That would work ... but is it more portable than depending on SysV shmem connection counts? ISTR that some of the platforms we support don't have Unix-style sockets at all.

regards, tom lane
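The open-for-writing / nonblocking-read test can be sketched as below. The sketch uses a FIFO created with mkfifo(), which is one kind of filesystem object that has these EOF-versus-EAGAIN semantics; the function names and the detail of briefly opening a read end first (so that the write-side open does not block or fail) are assumptions of this illustration.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>   /* mkfifo */
#include <unistd.h>

/*
 * Hypothetical "postmaster side": create the FIFO if needed and return a
 * descriptor open for writing. Nothing is ever written; the descriptor is
 * simply kept open and inherited by child processes.
 */
int fifo_hold_open(const char *path)
{
    assert(path != NULL);
    if (mkfifo(path, 0600) < 0 && errno != EEXIST)
        return -1;

    /* A nonblocking write-only open fails unless a reader exists, so
     * open a temporary read end first and drop it afterwards. */
    int rfd = open(path, O_RDONLY | O_NONBLOCK);
    if (rfd < 0)
        return -1;
    int wfd = open(path, O_WRONLY | O_NONBLOCK);
    close(rfd);
    return wfd;
}

/*
 * Hypothetical "is anyone home" probe: 1 if some process holds the FIFO
 * open for writing, 0 if nobody does, -1 on unexpected errors.
 */
int fifo_has_writers(const char *path)
{
    char    c;
    int     fd = open(path, O_RDONLY | O_NONBLOCK);

    if (fd < 0)
        return -1;
    ssize_t n = read(fd, &c, 1);
    int     saved = errno;
    close(fd);

    if (n == 0)
        return 0;                       /* EOF: no writers remain */
    if (n > 0)
        return 1;                       /* data implies a writer existed */
    return (saved == EAGAIN || saved == EWOULDBLOCK) ? 1 : -1;
}
```

Because the kernel tracks the open descriptors, the answer stays correct when the original holder dies but its children live on, which a PID recorded in a file cannot provide.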
On Mon, 6 May 2002, Tom Lane wrote:

> I said:
> > But the backends would only have the socket open, they'd not be actively
> > listening to it. So how could you tell whether anyone had the socket
> > open or not?
>
> Oh, I take that back, I see how you could do it: the postmaster opens
> the socket *for writing*, but never actually writes. All its child
> processes inherit that same open file descriptor and just keep it
> around. Then, to tell if anyone's home, you open the socket *for
> reading* and try to read in O_NONBLOCK mode. You get an EOF indication
> if and only if no one has the socket open for writing; otherwise you
> get an EAGAIN error.
>
> That would work ... but is it more portable than depending on SysV
> shmem connection counts? ISTR that some of the platforms we support
> don't have Unix-style sockets at all.

Wouldn't the same thing work with a simple file? Does it have to be a UnixDomainSocket?
"Marc G. Fournier" <scrappy@hub.org> writes: >> That would work ... but is it more portable than depending on SysV >> shmem connection counts? ISTR that some of the platforms we support >> don't have Unix-style sockets at all. > Wouldn't the same thing work with a simple file? Does it have to be a > UnixDomainSocket? No, and yes. If it's not a pipe/fifo then you don't get the EOF-only-when-no-possible-writers-remain behavior. TCP and UDP sockets don't show this sort of behavior either. So AFAICS we really need a named pipe, ie, socket. We could maybe do something approximately similar with TCP connection attempts (per the prior suggestion of letting backends hold the postmaster's listen socket open; then see if you get "connection refused" or a timeout from trying to connect) but I don't think it'd be as trustworthy. Simple mistakes like overly aggressive ipchains filters would confuse this kind of test. regards, tom lane
Since our default behavior (at startup) is to have TCP sockets disabled, how many OSs are there that don't support UD sockets? Enough to really be worried about?

On Mon, 6 May 2002, Tom Lane wrote:

> No, and yes. If it's not a pipe/fifo then you don't get the
> EOF-only-when-no-possible-writers-remain behavior. TCP and UDP
> sockets don't show this sort of behavior either. So AFAICS we
> really need a named pipe, ie, socket.
"Marc G. Fournier" <scrappy@hub.org> writes: > Since our default behavior (at startup) is to have TCP sockets disabled, > how many OSs are there that don't support UD sockets? A quick look in the sources shows that we #undef HAVE_UNIX_SOCKETS for QNX, BeOS, and old cygwin versions ... which are exactly the platforms that don't have SysV shmem support, so those are exactly the guys who we're trying to fix the problem for. I do like the idea of using a Unix socket this way where available, though. It'd let us switch over the shmem code to using IPC_PRIVATE shmem key, which'd simplify that code tremendously; and we could make some progress against the dead-PID-in-lockfile problem. Could we get away with saying that the Unix-socket-less platforms have weaker protection against mistakenly restarting the postmaster? We could have a plain-vanilla lockfile instead of a socket lockfile on those platforms, which would not catch the dead-postmaster-live-backends case, but it'd be better than nothing. And I am not convinced that the shmem-connection-count check should be trusted on QNX or BeOS, anyway, so I'm not sure that they actually have a functioning check now. regards, tom lane
Tom Lane wrote:

> I said:
> > But the backends would only have the socket open, they'd not be actively
> > listening to it. So how could you tell whether anyone had the socket
> > open or not?
>
> Oh, I take that back, I see how you could do it: the postmaster opens
> the socket *for writing*, but never actually writes. All its child
> processes inherit that same open file descriptor and just keep it
> around. Then, to tell if anyone's home, you open the socket *for
> reading* and try to read in O_NONBLOCK mode. You get an EOF indication
> if and only if no one has the socket open for writing; otherwise you
> get an EAGAIN error.
>
> That would work ... but is it more portable than depending on SysV
> shmem connection counts? ISTR that some of the platforms we support
> don't have Unix-style sockets at all.

I think what you describe is a named pipe, not a socket. The underlying implementation might be a socketpair, but the behaviour of named pipes is exactly that since Version 7 at least. This worked under Minix already.

>
> regards, tom lane

Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #
> "Marc G. Fournier" <scrappy@hub.org> writes: > > Since our default behavior (at startup) is to have TCP sockets disabled, > > how many OSs are there that don't support UD sockets? > > A quick look in the sources shows that we #undef HAVE_UNIX_SOCKETS for > QNX, BeOS, and old cygwin versions ... which are exactly the platforms > that don't have SysV shmem support, so those are exactly the guys who > we're trying to fix the problem for. Next release of QNX (6.2) will add support for UDS, but they are still not quite portable. > > I do like the idea of using a Unix socket this way where available, > though. It'd let us switch over the shmem code to using IPC_PRIVATE > shmem key, which'd simplify that code tremendously; and we could make > some progress against the dead-PID-in-lockfile problem. > > Could we get away with saying that the Unix-socket-less platforms have > weaker protection against mistakenly restarting the postmaster? We > could have a plain-vanilla lockfile instead of a socket lockfile on > those platforms, which would not catch the dead-postmaster-live-backends > case, but it'd be better than nothing. And I am not convinced that the > shmem-connection-count check should be trusted on QNX or BeOS, anyway, > so I'm not sure that they actually have a functioning check now. Why can't we use named pipe (aka FIFO file) instead of UDS? I think that is more portable... The socketpair() function also tends to be more portable than whole UDS in general... It works on QNX4 even, but not sure about BeOS. Another thought is, why can't we use bind() to the postmaster port to detect other postmasters? I might be missing something, so pardon by ignorance. But should not bind() to same port fail with EADDRINUSE unless SO_REUSEADDR is set? I don't really know if it is set in postgres or not ... -- igor
"Igor Kovalenko" <Igor.Kovalenko@motorola.com> writes: >> Could we get away with saying that the Unix-socket-less platforms have >> weaker protection against mistakenly restarting the postmaster? > Why can't we use named pipe (aka FIFO file) instead of UDS? That's exactly what I'm talking about. > Another thought is, why can't we use bind() to the postmaster port to detect > other postmasters? Because port number and data directory are independent parameters. The interlock on port number is not related to the interlock on data directory. regards, tom lane
On Mon, 6 May 2002, Tom Lane wrote:

> > As a backend is started up, connect to that socket ... if socket is open
> > when trying to start a new frontend, fail as there are currently other
> > connections attached to it?
>
> But the backends would only have the socket open, they'd not be
> actively listening to it. So how could you tell whether anyone
> had the socket open or not?

It's easy. At startup, the postmaster (or standalone backend) creates a Unix socket, binds it to the filename and calls listen on it. If another backend is running, it'll get EADDRINUSE from the bind or listen.

Nobody actually needs to connect to the socket. Simple, race-free, 10 lines of code.

Matthew.
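The bind-and-listen interlock described here looks roughly like the following. The function name interlock_listen is hypothetical, and the sketch is an illustration of the idea, not the postmaster's socket code.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/*
 * Hypothetical helper: bind a Unix-domain stream socket to 'path' and
 * listen on it. Returns the listening descriptor, or -1 with errno set
 * (EADDRINUSE if the path already exists).
 */
int interlock_listen(const char *path)
{
    struct sockaddr_un addr;

    assert(path != NULL);
    if (strlen(path) >= sizeof(addr.sun_path))
    {
        errno = ENAMETOOLONG;
        return -1;
    }

    int fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strcpy(addr.sun_path, path);    /* length checked above */

    if (bind(fd, (struct sockaddr *) &addr, sizeof(addr)) < 0 ||
        listen(fd, 1) < 0)
    {
        int saved = errno;

        close(fd);
        errno = saved;
        return -1;
    }
    return fd;
}
```

One caveat with this sketch: bind() reports EADDRINUSE whenever the path exists, including a socket file left behind by a crashed process, so the error by itself does not prove that a live process still holds the socket.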
Matthew Kirkwood <matthew@hairy.beasts.org> writes:
> Nobody actually needs to connect to the socket. Simple,
> race-free, 10 lines of code.

... and we already do it. But it protects the port number, not the data directory.

regards, tom lane
On Tue, 7 May 2002, Tom Lane wrote:

> > Nobody actually needs to connect to the socket. Simple,
> > race-free, 10 lines of code.
>
> ... and we already do it. But it protects the port number, not
> the data directory.

If I understood him correctly, Marc was suggesting a further domain socket inside the data directory.

Matthew.
Matthew Kirkwood <matthew@hairy.beasts.org> writes:
>> ... and we already do it. But it protects the port number, not
>> the data directory.

> If I understood him correctly, Marc was suggesting a further
> domain socket inside the data directory.

Right, and that would work because we would reference it as $PGDATA/.socket --- exact, one-to-one correspondence between data directory and interlock file. A TCP socket isn't going to have any such direct connection to the data directory.

We could try to make such a connection (eg, pick a free port number at random, and record the number in a lockfile in $PGDATA). But that will suffer from a bunch of failure modes, starting with the same one that's been biting us for PID interlocking: after a system restart, someone else may hold the port number that we chose at random last time.

Basically, the reason that we want this interlock is because we are going after five-nines kind of reliability. An interlock technology that's not itself five-nines reliable isn't going to make things better.

regards, tom lane
Just a friendly reminder that it should be named pipe rather than UDS ;)

-- igor

> Right, and that would work because we would reference it as
> $PGDATA/.socket --- exact, one-to-one correspondence between data
> directory and interlock file. A TCP socket isn't going to have any
> such direct connection to the data directory.
On Tue, 7 May 2002, Igor Kovalenko wrote:

> Just a friendly reminder that it should be named pipe rather than UDS
> ;)

Named pipes don't have the required syntax. Perhaps for platforms which have neither SysV shm, something like POSIX named semaphores are the way forward.

Matthew.
Can you be more specific? What required syntax? I was talking about named pipe vs UDS socket... > On Tue, 7 May 2002, Igor Kovalenko wrote: > > > Just a friendly reminder that it should be named pipe rather than UDS > > ;) > > Named pipes don't have the required syntax. Perhaps for > platforms which have neither SysV shm, something like > POSIX named semaphores are the way forward. > > Matthew. >
"Igor Kovalenko" <Igor.Kovalenko@motorola.com> writes: > I was talking about named pipe vs UDS socket... Aren't those the same thing? You get a socket file either way. regards, tom lane
> "Igor Kovalenko" <Igor.Kovalenko@motorola.com> writes: > > I was talking about named pipe vs UDS socket... > > Aren't those the same thing? You get a socket file either way. > On QNX named pipe will have type 'FIFO file', which has similar features to a socket indeed but implemented differently but that is not the point. On SysV derivatives they all will be implemented as 2 connected STREAMS heads. On BSD they both will be same thing. Not sure about other systems. The UDS API however was originally limited to BSD4.3 and only later started to spread, whereas named pipes have been around longer and probably exist in any Unix variant and probably other types of systems. -- igor
On Wed, 8 May 2002, Igor Kovalenko wrote: > Can you be more specific? What required syntax? I was talking about > named pipe vs UDS socket... Sorry, I meant semantics. A pipe can have multiple readers and multiple writers. This is no use for us. A listening SOCK_STREAM Unix domain socket can have no readers or writers, but only one listener (well, except that other processes can inherit or be passed the socket). You have to connect() (and the server must accept()) before read and write do anything. But we have no use for that here. It's just an exclusive-only mutex whose namespace is the filesystem. It really is like a TCP socket, except that the address namespace is the filesystem, and thus it's not available remotely. Think of it as a TCP socket without the "which address and port do I use, and how do I keep it secure" issues. Matthew.
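The interlock Matthew describes can be sketched roughly as follows. This is a minimal illustration, assuming a POSIX system with AF_UNIX support; the function names `interlock_addr` and `acquire_dir_interlock` are hypothetical, and the `.socket` file name is borrowed from the $PGDATA/.socket example earlier in the thread. It is not PostgreSQL code. Because bind() fails with EADDRINUSE whenever the socket file exists, even if its creator has died, the sketch probes with connect() to tell a live holder from a stale file.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Build the address <datadir>/.socket; returns -1 if the path is too long. */
static int interlock_addr(const char *datadir, struct sockaddr_un *addr)
{
    memset(addr, 0, sizeof(*addr));
    addr->sun_family = AF_UNIX;
    int n = snprintf(addr->sun_path, sizeof(addr->sun_path),
                     "%s/.socket", datadir);
    return (n < 0 || (size_t) n >= sizeof(addr->sun_path)) ? -1 : 0;
}

/*
 * Hypothetical helper: returns a listening fd if this process now owns the
 * data directory, or -1 if a live process already holds it (or on error).
 */
int acquire_dir_interlock(const char *datadir)
{
    struct sockaddr_un addr;

    if (interlock_addr(datadir, &addr) < 0)
        return -1;

    for (int attempt = 0; attempt < 2; attempt++)
    {
        int fd = socket(AF_UNIX, SOCK_STREAM, 0);
        if (fd < 0)
            return -1;

        if (bind(fd, (struct sockaddr *) &addr, sizeof(addr)) == 0)
        {
            if (listen(fd, 1) == 0)
                return fd;          /* we are the only listener */
            close(fd);
            unlink(addr.sun_path);
            return -1;
        }

        int bind_errno = errno;
        close(fd);
        if (bind_errno != EADDRINUSE)
            return -1;

        /* The socket file exists: check whether anyone is listening on it. */
        int probe = socket(AF_UNIX, SOCK_STREAM, 0);
        if (probe < 0)
            return -1;
        int rc = connect(probe, (struct sockaddr *) &addr, sizeof(addr));
        int conn_errno = errno;
        close(probe);

        if (rc == 0)
            return -1;              /* a live process holds the interlock */
        if (conn_errno != ECONNREFUSED)
            return -1;

        /* Stale file left by a dead process: remove it and retry once. */
        unlink(addr.sun_path);
    }
    return -1;
}
```

Note that the unlink-and-retry step leaves a window in which two starting processes could both decide the file is stale, so a real implementation would need something extra to close that race.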
Ahh... you want a named semaphore... There is such a thing in POSIX, but the names are only portable if they begin with "/" (which tells the OS to put the semaphore wherever is appropriate). I believe that without a leading slash they end up in the current directory, but we can't rely on that... too bad. Glad UDS is getting supported on my platform, lol ;) This will however leave QNX4 in the dust, if anyone cares. And most likely BeOS, MP/X and half a dozen other platforms. Which makes me wonder whether it would be better to come up with a platform-independent 'namespace sync' mechanism. Can't we use an fcntl()-based lock for that purpose? That's apparently what Apache does (as one of its variants). -- igor > On Wed, 8 May 2002, Igor Kovalenko wrote: > > > Can you be more specific? What required syntax? I was talking about > > named pipe vs UDS socket... > > Sorry, I meant semantics. > > A pipe can have multiple readers and multiple writers. This is > no use for us. > > A listening SOCK_STREAM Unix domain socket can have no readers or > writers, but only one listener (well, except that other processes > can inherit or be passed the socket). You have to connect() (and > the server must accept()) before read and write do anything. But > we have no use for that here. It's just an exclusive-only mutex > whose namespace is the filesystem. > > It really is like a TCP socket, except that the address namespace > is the filesystem, and thus it's not available remotely. > > Think of it as a TCP socket without the "which address and port > do I use, and how do I keep it secure" issues. > > Matthew. >
"Igor Kovalenko" <Igor.Kovalenko@motorola.com> writes: > Can't we use fcntl()-based lock for that purpose? I'm pretty sure that fcntl locking has an evil reputation as well. (Didn't we use that up till a couple years ago, and give up on it?) regards, tom lane
Tom Lane wrote: > "Igor Kovalenko" <Igor.Kovalenko@motorola.com> writes: > > I was talking about named pipe vs UDS socket... > > Aren't those the same thing? You get a socket file either way. No, they are not. The former is a FIFO file, the latter a socket. FIFOs are used via open(2), sockets via connect(2). And as said before, FIFOs have been there since UNIX Version 7 (at least; I haven't been around before that). So there is a good chance that these are available on every UNIX. Jan -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #================================================== JanWieck@Yahoo.com #
Igor Kovalenko wrote: > It does not have to be anonymous. POSIX also defines shm_open(same arguments > as open) API which will create named object in whatever location corresponds > to shared memory storage on that platform (object is then grown to needed > size by ftruncate() and the fd is then passed to mmap). The object will > exist in name space and can be detected by subsequent calls to shm_open() > with same name. It is not really different from doing open(), but more > portable (mmap() on regular files may not be supported). Actually, I think the best shared memory implementation would be MAP_ANON | MAP_SHARED mmap(), which could be called from the postmaster and passed to child processes. While all our platforms have mmap(), many don't have MAP_ANON, but those that do could use it. You need MAP_ANON to prevent the shared memory from being written to a disk file. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania 19026
mlw wrote: > Like I told Marc, I don't care. You spec out what you want and I'll write it > for Windows. > > That being said, a SysV IPC interface for native Windows would be kind of cool > to have. I am wondering why we don't just use the Cygwin shm/sem code in our project, or maybe the Apache stuff; why bother reinventing the wheel. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
I think it's already been determined that the Cygwin option performs too poorly. However, the Apache stuff could be quite useful - but if that effort were to be undertaken, it would make more sense to move all versions of the code to the Apache runtime, for all platforms. Are there any other runtime libraries out there that are cross-platform, open/free and high-performance? I know the Mozilla XPCOM libraries work quite nicely, but they are geared more towards multithreaded apps - and the COM-alike infrastructure is something we wouldn't need. ~Jon ----- Original Message ----- From: Bruce Momjian <pgman@candle.pha.pa.us> To: mlw <markw@mohawksoft.com> Cc: Tom Lane <tgl@sss.pgh.pa.us>; Marc G. Fournier <scrappy@hub.org>; <pgsql-hackers@postgresql.org> Sent: Sunday, June 02, 2002 8:49 PM Subject: Re: [HACKERS] HEADS UP: Win32/OS2/BeOS native ports > mlw wrote: > > Like I told Marc, I don't care. You spec out what you want and I'll write it > > for Windows. > > > > That being said, a SysV IPC interface for native Windows would be kind of cool > > to have. > > I am wondering why we don't just use the Cygwin shm/sem code in our > project, or maybe the Apache stuff; why bother reinventing the wheel. > > -- > Bruce Momjian | http://candle.pha.pa.us > pgman@candle.pha.pa.us | (610) 853-3000 > + If your life is a hard drive, | 830 Blythe Avenue > + Christ can be your backup. | Drexel Hill, Pennsylvania 19026
Yes, I am having trouble figuring out if I have seen the whole thread yet. --------------------------------------------------------------------------- Marc G. Fournier wrote: > > You might want to go to the archives and catch up on the whole thread and > its digressions :) > > On Sun, 2 Jun 2002, Bruce Momjian wrote: > > > mlw wrote: > > > Like I told Marc, I don't care. You spec out what you want and I'll write it > > > for Windows. > > > > > > That being said, a SysV IPC interface for native Windows would be kind of cool > > > to have. > > > > I am wondering why we don't just use the Cygwin shm/sem code in our > > project, or maybe the Apache stuff; why bother reinventing the wheel. > > > > -- > > Bruce Momjian | http://candle.pha.pa.us > > pgman@candle.pha.pa.us | (610) 853-3000 > > + If your life is a hard drive, | 830 Blythe Avenue > > + Christ can be your backup. | Drexel Hill, Pennsylvania 19026 > > > > -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
You might want to go to the archives and catch up on the whole thread and its digressions :) On Sun, 2 Jun 2002, Bruce Momjian wrote: > mlw wrote: > > Like I told Marc, I don't care. You spec out what you want and I'll write it > > for Windows. > > > > That being said, a SysV IPC interface for native Windows would be kind of cool > > to have. > > I am wondering why we don't just use the Cygwin shm/sem code in our > project, or maybe the Apache stuff; why bother reinventing the wheel. > > -- > Bruce Momjian | http://candle.pha.pa.us > pgman@candle.pha.pa.us | (610) 853-3000 > + If your life is a hard drive, | 830 Blythe Avenue > + Christ can be your backup. | Drexel Hill, Pennsylvania 19026 >
Bruce Momjian wrote: > > mlw wrote: > > Like I told Marc, I don't care. You spec out what you want and I'll write it > > for Windows. > > > > That being said, a SysV IPC interface for native Windows would be kind of cool > > to have. > > I am wondering why we don't just use the Cygwin shm/sem code in our > project, or maybe the Apache stuff; why bother reinventing the wheel. I have not been participating on the list, and I don't know why I'm still receiving mail. But in the course of testing some code, I managed to gain some experience with Cygwin. I have seen fork() problems with a large number of processes. For PostgreSQL to be as good on Windows as it is on UNIX, it has to be a native program without Cygwin. The shared memory and semaphore management should be done with the postmaster process. The Apache stuff is OK; it is just as good as anything else. You may be able to use critical sections in shared memory to implement a fast semaphore, but that would take a bit of experimentation. I think what Tom had in mind is to take out the SysV and various OS-specific APIs and replace them with a more generic one, behind which you guys can tune the implementation.
Bruce, On Sun, Jun 02, 2002 at 08:49:21PM -0400, Bruce Momjian wrote: > mlw wrote: > > Like I told Marc, I don't care. You spec out what you want and I'll write it > > for Windows. > > > > That being said, a SysV IPC interface for native Windows would be kind of > > cool to have. > > I am wondering why we don't just use the Cygwin shm/sem code in our > project, or maybe the Apache stuff; why bother reinventing the wheel. Are you referring to cygipc above? If so, then even one of the original cygipc authors would discourage this: http://sources.redhat.com/ml/cygwin-apps/2001-09/msg00017.html Specifically, Ludovic Lange states the following: > I really think the solution would be to start again from scratch > another implementation, as was suggested. The way we did it was > quick and dirty, the goals weren't to have production systems > running on it but only to run prototypes. So the internal design > (if there is any) may not be adequate for the cygwin project. However, Rob Collins has contributed a MinGW daemon to Cygwin to support switching users, System V IPC, etc. So, this code base may be a more suitable starting point to satisfy PostgreSQL's native Win32 System V IPC needs. Jason
On Sun, Jun 02, 2002 at 09:33:57PM -0400, mlw wrote: > Bruce Momjian wrote: > > mlw wrote: > > > Like I told Marc, I don't care. You spec out what you want and I'll write > > > it for Windows. > > > > > > That being said, a SysV IPC interface for native Windows would be kind of > > > cool to have. > > > > I am wondering why we don't just use the Cygwin shm/sem code in our > > project, or maybe the Apache stuff; why bother reinventing the wheel. > > but! in the course of testing some code, I managed to gain some experience > with cygwin. I have seen fork() problems with a large number of processes. Since Cygwin's fork() is implemented with WaitForMultipleObjects(), it has a limitation of only 63 children per parent. Also, there can be DLL base address conflicts (causing Cygwin fork() to fail) that are avoidable by rebasing the appropriate DLLs. AFAICT, Cygwin PostgreSQL is currently *not* affected by the DLL base address issue, whereas other Cygwin applications such as Python and Apache are. Jason
Jason Tishler wrote: > On Sun, Jun 02, 2002 at 09:33:57PM -0400, mlw wrote: > > Bruce Momjian wrote: > > > mlw wrote: > > > > Like I told Marc, I don't care. You spec out what you want and I'll write > > > > it for Windows. > > > > > > > > That being said, a SysV IPC interface for native Windows would be kind of > > > > cool to have. > > > > > > I am wondering why we don't just use the Cygwin shm/sem code in our > > > project, or maybe the Apache stuff; why bother reinventing the wheel. > > > > but! in the course of testing some code, I managed to gain some experience > > with cygwin. I have seen fork() problems with a large number of processes. > > Since Cygwin's fork() is implemented with WaitForMultipleObjects(), > it has a limitation of only 63 children per parent. Also, there can > be DLL base address conflicts (causing Cygwin fork() to fail) that are > avoidable by rebasing the appropriate DLLs. AFAICT, Cygwin PostgreSQL is > currently *not* affected by this issue where as other Cygwin applications > such as Python and Apache are. Whatever technical problems there are, we can debate on and on if it's worth working around them in PostgreSQL or fixing them in CygWIN or whatever. The main problem will remain. That using PostgreSQL under CygWIN requires some UNIX know how. So a pure Windows user/shop needs UNIX knowledge to run our "Windows port" of PostgreSQL? Interesting definition of "port". Jan -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #================================================== JanWieck@Yahoo.com #
Re: HEADS UP: Win32/OS2/BeOS native ports - the 'BEST OPEN SOURCE database backend'
From
Robert Schrem
Date:
Hi, You may want to have a look at: http://www.garret.ru/~knizhnik/ You will find code there for "Fast synchronized access to shared memory for Windows and for i86 Unix-es". kind regards, Robert > Bruce, > > On Sun, Jun 02, 2002 at 08:49:21PM -0400, Bruce Momjian wrote: > > mlw wrote: > > > Like I told Marc, I don't care. You spec out what you want and I'll > > > write it for Windows. > > > > > > That being said, a SysV IPC interface for native Windows would be kind > > > of cool to have. > > > > I am wondering why we don't just use the Cygwin shm/sem code in our > > project, or maybe the Apache stuff; why bother reinventing the wheel. > > Are you referring to cygipc above? If so, they even one of the original > cygipc authors would discourage this: > > http://sources.redhat.com/ml/cygwin-apps/2001-09/msg00017.html > > Specifically, Ludovic Lange states the following: > > I really think the solution would be to start again from scratch > > another implementation, as was suggested. The way we did it was > > quick and dirty, the goals weren't to have production systems > > running on it but only to run prototypes. So the internal design > > (if there is any) may not be adequate for the cygwin project. > > However, Rob Collins has contributed a MinGW daemon to Cygwin to support > switching users, System V IPC, etc. So, this code base may be a more > suitable starting point to satisfy PostgreSQL's native Win32 System V > IPC needs. > > Jason
GOODS - a sensational public domain database backend that deserves a SQL frontend
From
Robert Schrem
Date:
Hi, Some of you might already know GOODS, programmed almost entirely by Konstantin Knizhnik - if not, you should really have a look at it right now (be warned: consuming this extraordinary work might change your idea of the quality required of a 'good programmer' forever. At least this happened to me... ;): http://www.garret.ru/~knizhnik/goods.html

Some core features of this backend (as they come to my mind):
-> full ACID transaction support
-> distributed storage management (->distributed transactions)
-> multiple reader/single writer (is this called MVCC within PostgreSQL?)
-> dual client side object cache
-> online backup (snapshot backup AND permanent backup)
-> nested transactions on object level
-> transaction isolation levels on object level
-> object level shared and exclusive locks
-> excellent C++ programming interface
-> WAL
-> garbage collection for no longer referenced database objects
-> fully thread safe client interface
-> JAVA client API
-> very high performance as a result of a lot of fine tuning
-> asynchronous event notification on object instance modification
-> extremely high code quality
-> a one person effort, hence a very clean design
-> the most relevant platforms are supported out of the box
-> complete build is done in less than a minute on my machine
-> it's documented
...

The licensing of this coding wonder: >>> PUBLIC DOMAIN <<<

I have been using GOODS for quite a while now in the context of my development activities for a native XML database, and have had very promising experiences concerning the performance and stability of GOODS. E.g.: the performance seems to be better than Sleepycat's Berkeley DB library - especially with multiple simultaneous transactions...

Maybe the only restriction on using this backend in postgres from now on: it's completely C++ ...

I'm wondering why there is no SQL frontend yet for this excellent backend...

You may also want to look at a comparison chart covering backends other than GOODS (some of them from the same author!!! I'm wondering how he was able to code all this...): http://www.garret.ru/~knizhnik/compare.html

kind regards, Robert
On Mon, Jun 03, 2002 at 09:36:51AM -0400, mlw wrote: > Jason Tishler wrote: > > > > On Sun, Jun 02, 2002 at 09:33:57PM -0400, mlw wrote: > > > Bruce Momjian wrote: > > > > mlw wrote: > > > > > Like I told Marc, I don't care. You spec out what you want and I'll > > > > > write it for Windows. > > > > > > > > > > That being said, a SysV IPC interface for native Windows would be > > > > > kind of cool to have. > > > > > > > > I am wondering why we don't just use the Cygwin shm/sem code in our > > > > project, or maybe the Apache stuff; why bother reinventing the wheel. > > > > > > but! in the course of testing some code, I managed to gain some experience > > > with cygwin. I have seen fork() problems with a large number of processes. > > > > Since Cygwin's fork() is implemented with WaitForMultipleObjects(), > > it has a limitation of only 63 children per parent. Also, there can > > be DLL base address conflicts (causing Cygwin fork() to fail) that are > > avoidable by rebasing the appropriate DLLs. AFAICT, Cygwin PostgreSQL is > > currently *not* affected by this issue where as other Cygwin applications > > such as Python and Apache are. > > Why would not PostgreSQL be affected by this? Sorry, if I was unclear -- I should have used two paragraphs above and maybe a few more words... :,) Cygwin PostgreSQL *is* affected by the Cygwin 63 children per parent fork limitation. PostgreSQL *can* be affected by the Cygwin DLL base address conflict fork issue, but in my experience (both personal and by monitoring the Cygwin and pgsql-cygwin lists), no one has been affected yet. The DLL base address conflict is a "probability" thing. The more DLLs loaded the greater the chance of a conflict (and fork() failing). Since, Cygwin PostgreSQL loads only a few DLLs, this has not become an issue (yet). Jason
Kostya is a well-qualified programmer. I know him, and he is always open to challenges. Some time ago, Teodor and I asked him about GiST support in another of his databases (Gigabase). It was a sort of challenge (we wanted to port our contrib/tsearch module), and he did it (using libgist). We work with a Gigabase database embedded into our application under Windows (we had a lot of trouble with the performance of PostgreSQL under Cygwin :-) and are quite happy. On Mon, 3 Jun 2002, Robert Schrem wrote: > Hi, > > Some of you might already know GOODS, programmed > almost entirely by Konstantin Knizhnik - if not you should > really have a look at it right now (be warned: consuming this > extraordinary work might change your levels about the > required quality of a 'good programmer' forever. At least > this happend to me... ;): > http://www.garret.ru/~knizhnik/goods.html > > Some core features of this backend (as they come to my mind): > -> full ACID transaction support > -> distributed stoarge management (->distributed transactions) > -> multible reader/single writer (is this called MVCC within PostgreSQL?) > -> dual client side object cache > -> online backup (snapshot backup AND permanent backup) > -> nested transactions on object level > -> transaction isolation levels on object level > -> object level shared and exclusive locks > -> excellent C++ programming interface > -> WAL > -> garbage collection for no longer reference database objects > -> fully thread safe client interface > -> JAVA client API > -> very high performance as a result of a lot of fine tuning > -> asyncrous event notification on object instance modification > -> extremly high code quality > -> a one person effort, hence a very clean design > -> the most relevant platforms are supported out of the box > -> complete build is done in less than a minute on my machine > -> it's documented > ... > > The licensing of this coding wonder: >>> PUBLIC DOMAIN <<< > > I'm using GOODS quiet a while now in the context of my > development activities for a native XML database and have > very promissing experiences concerning performance and > stability of GOODS. E.g.: The performance seems to be > better than sleepycat's berkeley db library - especially > with mutliple simultanous transactions... > > Maybe the only restriction to use this backend in postgres > from now on: it's completely C++ ... > > I'm wondering why there is no SQL frontend yet for this > execellent backend... > > You may want to look also at a comparision chart of some > other backends than GOODS (some of them from the same > author!!! I'm wondering how he was able to code all this...): > http://www.garret.ru/~knizhnik/compare.html > > kind regards, > > Robert > Regards, Oleg _____________________________________________________________ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83
That's what Apache does. Note that on most platforms MAP_ANON is equivalent to mmap-ing /dev/zero. Solaris, for example, does not provide MAP_ANON, but the sequence fd = open("/dev/zero"); mmap(..., fd, ...); close(fd); works just fine. ----- Original Message ----- From: "Bruce Momjian" <pgman@candle.pha.pa.us> To: "Igor Kovalenko" <Igor.Kovalenko@motorola.com> Cc: "Tom Lane" <tgl@sss.pgh.pa.us>; "mlw" <markw@mohawksoft.com>; "Marc G. Fournier" <scrappy@hub.org>; <pgsql-hackers@postgresql.org> Sent: Sunday, June 02, 2002 7:47 PM Subject: Re: [HACKERS] HEADS UP: Win32/OS2/BeOS native ports > Igor Kovalenko wrote: > > It does not have to be anonymous. POSIX also defines shm_open(same arguments > > as open) API which will create named object in whatever location corresponds > > to shared memory storage on that platform (object is then grown to needed > > size by ftruncate() and the fd is then passed to mmap). The object will > > exist in name space and can be detected by subsequent calls to shm_open() > > with same name. It is not really different from doing open(), but more > > portable (mmap() on regular files may not be supported). > > Actually, I think the best shared memory implemention would be > MAP_ANON | MAP_SHARED mmap(), which could be called from the postmaster > and passed to child processes. > > While all our platforms have mmap(), many don't have MAP_ANON, but those > that do could use it. You need MAP_ANON to prevent the shared memory > from being written to a disk file. > > -- > Bruce Momjian | http://candle.pha.pa.us > pgman@candle.pha.pa.us | (610) 853-3000 > + If your life is a hard drive, | 830 Blythe Avenue > + Christ can be your backup. | Drexel Hill, Pennsylvania 19026 >
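The /dev/zero variant Igor mentions can be sketched as below, assuming a system where /dev/zero supports shared mappings (Solaris according to the message above, and Linux as well). The function name `map_shared_dev_zero` is hypothetical.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Map 'size' bytes of zero-filled shared memory via /dev/zero, for platforms
 * that lack MAP_ANON. Returns NULL on failure.
 */
void *map_shared_dev_zero(size_t size)
{
    int fd = open("/dev/zero", O_RDWR);
    if (fd < 0)
        return NULL;

    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

    /* The mapping stays valid after the descriptor is closed. */
    close(fd);

    return p == MAP_FAILED ? NULL : p;
}
```

As with MAP_ANON, the region is only shared with processes forked after the mapping is made; each separate call to the function yields an independent region.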
Jason Tishler wrote: > > On Sun, Jun 02, 2002 at 09:33:57PM -0400, mlw wrote: > > Bruce Momjian wrote: > > > mlw wrote: > > > > Like I told Marc, I don't care. You spec out what you want and I'll write > > > > it for Windows. > > > > > > > > That being said, a SysV IPC interface for native Windows would be kind of > > > > cool to have. > > > > > > I am wondering why we don't just use the Cygwin shm/sem code in our > > > project, or maybe the Apache stuff; why bother reinventing the wheel. > > > > but! in the course of testing some code, I managed to gain some experience > > with cygwin. I have seen fork() problems with a large number of processes. > > Since Cygwin's fork() is implemented with WaitForMultipleObjects(), > it has a limitation of only 63 children per parent. Also, there can > be DLL base address conflicts (causing Cygwin fork() to fail) that are > avoidable by rebasing the appropriate DLLs. AFAICT, Cygwin PostgreSQL is > currently *not* affected by this issue where as other Cygwin applications > such as Python and Apache are. Why would not PostgreSQL be affected by this?
Jason Tishler wrote: > > On Mon, Jun 03, 2002 at 09:36:51AM -0400, mlw wrote: > > Jason Tishler wrote: > > > > > > On Sun, Jun 02, 2002 at 09:33:57PM -0400, mlw wrote: > > > > Bruce Momjian wrote: > > > > > mlw wrote: > > > > > > Like I told Marc, I don't care. You spec out what you want and I'll > > > > > > write it for Windows. > > > > > > > > > > > > That being said, a SysV IPC interface for native Windows would be > > > > > > kind of cool to have. > > > > > > > > > > I am wondering why we don't just use the Cygwin shm/sem code in our > > > > > project, or maybe the Apache stuff; why bother reinventing the wheel. > > > > > > > > but! in the course of testing some code, I managed to gain some experience > > > > with cygwin. I have seen fork() problems with a large number of processes. > > > > > > Since Cygwin's fork() is implemented with WaitForMultipleObjects(), > > > it has a limitation of only 63 children per parent. Also, there can > > > be DLL base address conflicts (causing Cygwin fork() to fail) that are > > > avoidable by rebasing the appropriate DLLs. AFAICT, Cygwin PostgreSQL is > > > currently *not* affected by this issue where as other Cygwin applications > > > such as Python and Apache are. > > > > Why would not PostgreSQL be affected by this? > > Sorry, if I was unclear -- I should have used two paragraphs above and > maybe a few more words... :,) > > Cygwin PostgreSQL *is* affected by the Cygwin 63 children per parent > fork limitation. > > PostgreSQL *can* be affected by the Cygwin DLL base address conflict > fork issue, but in my experience (both personal and by monitoring the > Cygwin and pgsql-cygwin lists), no one has been affected yet. The DLL > base address conflict is a "probability" thing. The more DLLs loaded > the greater the chance of a conflict (and fork() failing). Since, Cygwin > PostgreSQL loads only a few DLLs, this has not become an issue (yet). 
I'm not sure the DLL load address is a big issue for PostgreSQL; AFAIK no optional DLLs will be loaded by the postmaster. So, with fork() it will be a simple process. A PostgreSQL child will die upon completion, and never execute fork(). My concern would be the limit on the number of child processes allowed. 63 is far below what would be considered a usable number in production, and as long as that is an issue, I don't think anyone would take PostgreSQL seriously. A Windows version of PostgreSQL must run within the confines of the Windows OS. The reason, IMHO, that no one has found any serious bugs in the Cygwin version is that no one is seriously using it. Anyone who *would* seriously use it knows better.