Thread: signed short fd
We have the following definition in fd.c: typedef struct vfd { signed short fd; /* current FD, or VFD_CLOSED if none */ ... } Vfd; but it seems we use Vfd.fd as an integer. For example, in fileNameOpenFile() we have: vfdP->fd = BasicOpenFile(fileName, fileFlags, fileMode); So is there any special reason we don't worry that converting an integer to a short might lose data? Maybe we assume that every OS implements "fd" as an array index that is at most 2^16, but why not use an integer? Regards, Qingqing
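The concern raised above can be sketched in a few lines of C. This is only an illustration under stated assumptions: DemoVfd and demo_store_fd() are hypothetical names, not the real fd.c code, and the example assumes a platform where short is 16 bits wide.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the Vfd struct; only the fd field is modeled. */
typedef struct DemoVfd
{
    signed short fd;            /* same narrow type as in the thread */
} DemoVfd;

/*
 * Store an int descriptor into the short field and report whether the
 * stored value still equals the original int.
 */
static bool
demo_store_fd(DemoVfd *v, int fd)
{
    assert(v != NULL);
    /* Narrowing conversion: implementation-defined if fd is out of range. */
    v->fd = (signed short) fd;
    return (int) v->fd == fd;
}
```

Small descriptors, and the -1 error return, survive the round trip. A value such as 32768 cannot, because no 16-bit signed short holds it.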
"Qingqing Zhou" <zhouqq@cs.toronto.edu> writes: > So is there any special reason we don't worry that convert an integer to > short will not lose data? It's not possible for that to happen unless the user has set max_files_per_process to more than 32K, so I'm not particularly worried. Do you know of any platforms that would be unlikely to go belly-up with dozens or hundreds of PG backends each trying to use tens of thousands of open files? While I agree that storing this as int16 is micro-optimization, I don't see it as likely to be a problem in the foreseeable future. If it makes you feel better, we can constrain max_files_per_process to 32K in guc.c. > Maybe we make the assumption that all OS will > implement "fd" as an array index The POSIX spec requires open() to assign fd's consecutively from zero. http://www.opengroup.org/onlinepubs/007908799/xsh/open.html regards, tom lane
> >> Maybe we make the assumption that all OS will >> implement "fd" as an array index > > The POSIX spec requires open() to assign fd's consecutively from zero. > http://www.opengroup.org/onlinepubs/007908799/xsh/open.html With all due respect, PostgreSQL now runs natively on Win32. Having a POSIX-only mentality, especially with something so trivial, is a mistake. I would say "int" is the best way to handle it. You just *never* know.
pgsql@mohawksoft.com writes: >> The POSIX spec requires open() to assign fd's consecutively from zero. >> http://www.opengroup.org/onlinepubs/007908799/xsh/open.html > With all due respect, PostgreSQL now runs natively on Win32. ... using the POSIX APIs that Microsoft so kindly provides. fd.c will certainly not work at all on a platform that doesn't provide a POSIX-like file access API, and in the absence of any evidence to the contrary, I don't see why we shouldn't assume that the platform adheres to that part of the spec too. regards, tom lane
> pgsql@mohawksoft.com writes: >>> The POSIX spec requires open() to assign fd's consecutively from zero. >>> http://www.opengroup.org/onlinepubs/007908799/xsh/open.html > >> With all due respect, PostgreSQL now runs natively on Win32. > > ... using the POSIX APIs that Microsoft so kindly provides. > fd.c will certainly not work at all on a platform that doesn't > provide a POSIX-like file access API, and in the absence of any > evidence to the contrary, I don't see why we shouldn't assume > that the platform adheres to that part of the spec too. > I'm a "better safe than sorry" sort of guy. I would rather code defensively against a poorly implemented API. However: "Upon successful completion, the function will open the file and return a non-negative integer representing the lowest numbered unused file descriptor. Otherwise, -1 is returned and errno is set to indicate the error. No files will be created or modified if the function returns -1." That is hardly anything that I would feel comfortable with. Lets break this down into all the areas that are ambiguous: "unused" file descriptor, define "unused." Is it unused ever, or currently unused? Could an API developer simply just increment file opens? What about just allocating a structure on each open, and returning its pointer cast to an int? Also notice that no mention of process separation exists, it could very well be that a file descriptor may be usable system wide, with the exceptions of stdin, stdout, and stderr. Nowhere does it say how the file descriptors are numbered. 1,2,3,4 sure, that's what you expect, but it isn't an explicitly documented behavior. What is documented, however, that it is a machine "int" and that the number will be positive and be the lowest "unused" descriptor (depending on the definition of "unused") This is the sort of thing that makes software brittle and likely to crash. Sure, you may be "right" in saying a "short int" is enough. 
Some developer creating a POSIX clib may think he is right doing something his way. What happens then is a potentially serious bug that will only show up at seemingly random times. The fact is that it is PostgreSQL that would be wrong: the API is documented as returning an "int," and PostgreSQL casts it to a "short." Whatever you read into the implementation of the API is wrong. The API is an abstraction and you should assume you don't know anything about it.
pgsql@mohawksoft.com writes: > That is hardly anything that I would feel comfortable with. Lets break > this down into all the areas that are ambiguous: There isn't anything ambiguous about this, nor is it credible that there are implementations that don't follow the intent of the spec. Consider the standard paradigm for replacing stdout: you close(1) and then open() the target file. If the open() doesn't pick 1 as the fd, you're screwed. Every shell in the world would break atop such an implementation. It may well be the case that saving 4 bytes per VFD is useless micro-optimization. But the code isn't broken as it stands. regards, tom lane
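The lowest-numbered-descriptor behavior that the close(1)/open() idiom relies on can be checked with a small sketch. To avoid disturbing stdout, it closes and reopens an ordinary descriptor. The function name reopen_reuses_fd() is hypothetical, and the sketch assumes a single-threaded process on a POSIX-like platform, since another thread could otherwise take the freed descriptor between the two calls.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Open 'path' read-only, close it, and open it again.
 * Returns 1 if both opens yielded the same descriptor, 0 if they
 * differed, and -1 if any call failed.
 */
static int
reopen_reuses_fd(const char *path)
{
    int first;
    int second;

    assert(path != NULL);

    first = open(path, O_RDONLY);
    if (first < 0)
        return -1;
    if (close(first) != 0)
        return -1;

    /* The slot just freed is now the lowest descriptor not open. */
    second = open(path, O_RDONLY);
    if (second < 0)
        return -1;
    close(second);

    return (first == second) ? 1 : 0;
}
```

Calling reopen_reuses_fd("/dev/null") on a conforming system should return 1, which is the same property a shell depends on when it closes descriptor 1 and opens the redirect target.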
> pgsql@mohawksoft.com writes: >> That is hardly anything that I would feel comfortable with. Lets break >> this down into all the areas that are ambiguous: > > There isn't anything ambiguous about this, nor is it credible that there > are implementations that don't follow the intent of the spec. How do you know the intent of the spec? I have seen no meta discussion about the behavior of the file descriptor integer returned from open. The Stevens book makes no such assumptions, and the Stevens book (Advanced Programming in the UNIX Environment) is what people reference. > Consider > the standard paradigm for replacing stdout: you close(1) and then open() > the target file. If the open() doesn't pick 1 as the fd, you're screwed. > Every shell in the world would break atop such an implementation. I said that stdin, stdout, and stderr would be treated differently as they are on all platforms. > > It may well be the case that saving 4 bytes per VFD is useless > micro-optimization. But the code isn't broken as it stands. It most likely is not broken as it is, but it would be interesting to put an assert(fd < 32768) in the code and see if it ever breaks. Nevertheless, the spec DOES call for file fds to be a machine "int." All acceptable coding practices would demand that since the API spec calls for an int, the application should use an int. This is the sort of thing that is caught and fixed in any standard code review. Why is this an argument? What am I missing that you are defending?
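The assert suggested above might look roughly like the following. This is a minimal sketch, not the actual fd.c code: checked_open() is a hypothetical wrapper around open(), and the bound is written as SHRT_MAX instead of a literal 32768 so it follows the platform's short.

```c
#include <assert.h>
#include <fcntl.h>
#include <limits.h>

/*
 * Hypothetical wrapper: open a file and assert that the returned
 * descriptor would fit in a signed short. The -1 failure value is
 * within range, so only the upper bound needs checking.
 * 'flags' must not include O_CREAT, since no mode argument is passed.
 */
static int
checked_open(const char *path, int flags)
{
    int fd = open(path, flags);

    assert(fd <= SHRT_MAX);
    return fd;
}
```

In a build with assertions enabled, any descriptor too large for the short field would abort immediately instead of being silently truncated.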
pgsql@mohawksoft.com wrote: >>pgsql@mohawksoft.com writes: >> >> >>>That is hardly anything that I would feel comfortable with. Lets break >>>this down into all the areas that are ambiguous: >>> >>> >>There isn't anything ambiguous about this, nor is it credible that there >>are implementations that don't follow the intent of the spec. >> >> > >How do you know the intent of the spec? I have seen no meta discussion >about the behavior of the file descriptor integer returned from open. The >Steven's book makes no such assumptions, and the steven's book (Advanced >Programming in the UNIX Environment) is what people reference. > > > > > My copy of APUE says on page 49: "The file descriptor returned by open is the lowest numbered unused descriptor. This is used by some applications to open a new file on standard input, standard output, or standard error." Unless someone can show there's an actual problem this discussion seems quite pointless. cheers andrew
> My copy of APUE says on page 49: "The file descriptor returned by open > is the lowest numbered unused descriptor. This is used by some > applications to open a new file on standard input, standard output, or > standard error." Yes, I'll restate my questions: What is meant by "unused?" Is it read to mean that a higher number file is *never* returned if there is a lower number that has been used and is now available? Is that something we can 100% absolutely depend on, on all current and future platforms? It is a stupid idea to truncate the upper bytes of an integer without good reason. I can see LOTS of reasons why this will break something in the future. The upper bits may be used to identify storage media or characteristics. My point is that the spec calls for an "int," so PostgreSQL should use an int. > > Unless someone can show there's an actual problem this discussion seems > quite pointless. > The point is that this *is* silly, but I am at a loss to understand why it isn't a no-brainer to change. Why is there a fight over a trivial change which will ensure that PostgreSQL aligns to the documented behavior of "open()"
pgsql@mohawksoft.com wrote: > The point is that this *is* silly, but I am at a loss to understand why it > isn't a no-brainer to change. Why is there a fight over a trivial change > which will ensure that PostgreSQL aligns to the documented behavior of > "open()" (Why characterise this as a "fight", rather than a discussion? Perhaps it is because of the same combative, adversarial attitude you seem to bring to every discussion you're involved in on -hackers...) Anyway, I agree, there's no point keeping it a short; I highly doubt this would actually be a problem, but we may as well change it to an int. -Neil
> pgsql@mohawksoft.com wrote: >> The point is that this *is* silly, but I am at a loss to understand why >> it >> isn't a no-brainer to change. Why is there a fight over a trivial change >> which will ensure that PostgreSQL aligns to the documented behavior of >> "open()" > > (Why characterise this as a "fight", rather than a discussion? Perhaps > it is because of the same combative, adversarial attitude you seem to > bring to every discussion you're involved in on -hackers...) I really don't intend to do that, and it does seem to happen a lot. I am the first to admit I lack tact, but often times I view the decisions made as rather arbitrary and lacking a larger perspective, but that is a rant I don't want to get right now. > > Anyway, I agree, there's no point keeping it a short; I highly doubt > this would actually be a problem, but we may as well change it to an int. And this is my point. There are things that are "no brainers," and a few times I have been completely dumbfounded as to the source of resistance. Silently truncating the upper 2 bytes of a data type declared as an "int" is a bug. I can't believe anyone would defend it, but here it happens. Maybe it is me. I know I'm stubborn and confrontational, personally I've wished I could be different, but I'm 42 so I guess I'm not going to change any time soon. Regardless of the source, if you want code to be portable, you have to take APIs at face value. Any assumptions you think you can make are by definition wrong. Allow the API authors the space to handle what they need to handle. Assuming a specific behavior is dangerous. Is it currently a problem? Most likely not, but since there is no downside, why leave it lurking to bite us?
> I really don't intend to do that, and it does seem to happen a lot. I am > the first to admit I lack tact, but often times I view the decisions made > as rather arbitrary and lacking a larger perspective, but that is a rant I > don't want to get right now. Perhaps it's your lack of a real name and complete anonyminity (hence invulnerablility) that gets to people...
At 2005-03-14 16:25:22 -0500, pgsql@mohawksoft.com wrote: > > > "The file descriptor returned by open is the lowest numbered unused > > descriptor. [...] > > What is meant by "unused?" Perhaps you should actually look at the standard. "The open( ) function shall return a file descriptor for the named file that is the lowest file descriptor not currently open for that process." "The close( ) function shall deallocate the file descriptor indicated by fildes. To deallocate means to make the file descriptor available for return by subsequent calls to open( ) or other functions that allocate file descriptors." > Is it read to mean that a higher number file is *never* returned if > there is a lower number that has been used and is now available? Yes. -- ams
Christopher Kings-Lynne wrote: > > I really don't intend to do that, and it does seem to happen a lot. I am > > the first to admit I lack tact, but often times I view the decisions made > > as rather arbitrary and lacking a larger perspective, but that is a rant I > > don't want to get right now. > > Perhaps it's your lack of a real name and complete anonyminity (hence > invulnerablility) that gets to people... I actually met him _briefly_ at Linuxworld in Boston. He just said "hi", then disappeared. :-) -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
> Christopher Kings-Lynne wrote: >> > I really don't intend to do that, and it does seem to happen a lot. I >> am >> > the first to admit I lack tact, but often times I view the decisions >> made >> > as rather arbitrary and lacking a larger perspective, but that is a >> rant I >> > don't want to get right now. >> >> Perhaps it's your lack of a real name and complete anonyminity (hence >> invulnerablility) that gets to people... Is it fixed? > > I actually met him _briefly_ at Linuxworld in Boston. He just said > "hi", then disappeared. :-) Bruce, I did want to meet you to a greater extent, but you were surrounded by people and looked quite busy.
Mark Woodward wrote: > > Christopher Kings-Lynne wrote: > >> > I really don't intend to do that, and it does seem to happen a lot. I > >> am > >> > the first to admit I lack tact, but often times I view the decisions > >> made > >> > as rather arbitrary and lacking a larger perspective, but that is a > >> rant I > >> > don't want to get right now. > >> > >> Perhaps it's your lack of a real name and complete anonyminity (hence > >> invulnerablility) that gets to people... > > Is it fixed? Wow, he comes out of the shadows. :-) > > > > I actually met him _briefly_ at Linuxworld in Boston. He just said > > "hi", then disappeared. :-) > > Bruce, I did want to meet you to a greater extent, but you we surrounded > by people and looked quite busy. Yea, I was just teasing. It was a very busy conference. I remember at night just wanting to turn myself off. -- Bruce Momjian
>>>>Perhaps it's your lack of a real name and complete anonyminity (hence >>>>invulnerablility) that gets to people... >> >>Is it fixed? Yeah, hi Mark :) Chris
On Mon, Mar 14, 2005 at 10:45:51PM -0500, Bruce Momjian wrote: > Mark Woodward wrote: > > Bruce, I did want to meet you to a greater extent, but you we surrounded > > by people and looked quite busy. > > Yea, I was just teasing. It was a very busy conference. I remember at > night just wanting to turn myself off. Were you able to? That'd make a very cool trick. -- Alvaro Herrera (<alvherre[@]dcc.uchile.cl>) "That sort of implies that there are Emacs keystrokes which aren't obscure. I've been using it daily for 2 years now and have yet to discover any key sequence which makes any sense." (Paul Thomas)
Alvaro Herrera wrote: > On Mon, Mar 14, 2005 at 10:45:51PM -0500, Bruce Momjian wrote: > > Mark Woodward wrote: > > > > Bruce, I did want to meet you to a greater extent, but you we surrounded > > > by people and looked quite busy. > > > > Yea, I was just teasing. It was a very busy conference. I remember at > > night just wanting to turn myself off. > > Were you able to? That'd make a very cool trick. No, but I have wished to have that switch on my children sometimes. :-) -- Bruce Momjian
> Mark Woodward wrote: >> > Christopher Kings-Lynne wrote: >> >> > I really don't intend to do that, and it does seem to happen a lot. >> I >> >> am >> >> > the first to admit I lack tact, but often times I view the >> decisions >> >> made >> >> > as rather arbitrary and lacking a larger perspective, but that is a >> >> rant I >> >> > don't want to get right now. >> >> >> >> Perhaps it's your lack of a real name and complete anonyminity (hence >> >> invulnerablility) that gets to people... >> >> Is it fixed? > > Wow, he comes out of the shadows. :-) I hope you guys don't think I was doing it intentionally. I use my own mail server and I just create separate email accounts for different projects. I just simply logged in (way back when) and never bothered going to "Options" to set my name. It was neglect, not secrecy. > >> > >> > I actually met him _briefly_ at Linuxworld in Boston. He just said >> > "hi", then disappeared. :-) >> >> Bruce, I did want to meet you to a greater extent, but you we surrounded >> by people and looked quite busy. > > Yea, I was just teasing. It was a very busy conference. I remember at > night just wanting to turn myself off. Yes it was. What do you think: My impression was that the corporate side (with a few exceptions) was, while very professional and flashy, lacking in technical merit. SGI was one of the few exceptions, of course. The ".org" side was really fun, lots of guys with interesting projects, messy booths, and grundgy cloths. Has the soul left Linux? Has it been consumed by the big corporations?
Mark Woodward wrote: > >> > I actually met him _briefly_ at Linuxworld in Boston. He just said > >> > "hi", then disappeared. :-) > >> > >> Bruce, I did want to meet you to a greater extent, but you we surrounded > >> by people and looked quite busy. > > > > Yea, I was just teasing. It was a very busy conference. I remember at > > night just wanting to turn myself off. > > Yes it was. What do you think: My impression was that the corporate side > (with a few exceptions) was, while very professional and flashy, lacking > in technical merit. SGI was one of the few exceptions, of course. The > ".org" side was really fun, lots of guys with interesting projects, messy > booths, and grundgy cloths. Has the soul left Linux? Has it been consumed > by the big corporations? I know our booth was very busy, while many corporate booths were empty. -- Bruce Momjian
On a fine day (Monday, 14 March 2005, 22:13-0500), Bruce Momjian wrote: > Christopher Kings-Lynne wrote: > > > I really don't intend to do that, and it does seem to happen a lot. I am > > > the first to admit I lack tact, but often times I view the decisions made > > > as rather arbitrary and lacking a larger perspective, but that is a rant I > > > don't want to get right now. > > > > Perhaps it's your lack of a real name and complete anonyminity (hence > > invulnerablility) that gets to people... > > I actually met him _briefly_ at Linuxworld in Boston. He just said > "hi", then disappeared. :-) Was his real name 'pgsql' ? ;) -- Hannu Krosing <hannu@tm.ee>