Thread: md.c should not call files "relations"
There's an interesting thread over here
http://archives.postgresql.org/pgsql-sql/2009-08/msg00013.php
in which someone mistook a filesystem-level permissions problem for a database permissions problem. It wasn't exactly his fault, I think, since the message he was presented with was

    ERROR: could not create relation "test": Permission denied

which is not all that obviously different from what you would get for a SQL-permissions violation.

I am thinking that this message would be more correct and less confusing if it looked something like

    ERROR: could not create file "12345/67890": Permission denied

ie, when reflecting an OS-level error we should call a file a file and provide its filesystem name, not the name of the table that we were hoping to map to it. This would be more likely to lead the user's mind in the right direction, and he'd need the filesystem pathname for any detailed investigation anyway.

This would have the further advantage that we could make all the errors in md.c consistent --- some of them provide filesystem names rather than table names because that's all they have available.

Lastly, I'm wondering why someone seems to have removed the double quotes around the filesystem name in some of these messages. Surely that's not per style guide.

Comments?

regards, tom lane
On Aug 4, 2009, at 6:30 PM, Tom Lane wrote:
> Comments?

+1

Seems like a no-brainer.

David
Tom Lane wrote:
> There's an interesting thread over here
> http://archives.postgresql.org/pgsql-sql/2009-08/msg00013.php
> in which someone mistook a filesystem-level permissions problem
> for a database permissions problem. It wasn't exactly his fault,
> I think, since the message he was presented with was
>
> ERROR: could not create relation "test": Permission denied
>
> which is not all that obviously different from what you would
> get for a SQL-permissions violation.
>
> I am thinking that this message would be more correct and less
> confusing if it looked something like
>
> ERROR: could not create file "12345/67890": Permission denied
>
> ie, when reflecting an OS-level error we should call a file a file and
> provide its filesystem name, not the name of the table that we were
> hoping to map to it. This would be more likely to lead the user's
> mind in the right direction, and he'd need the filesystem pathname
> for any detailed investigation anyway.

We have printed the file name, not the table name, since version 8.0. The OP who saw the message was on 7.4. I agree we should call a file a file.

> This would have the further advantage that we could make all the
> errors in md.c consistent --- some of them provide filesystem names
> rather than table names because that's all they have available.
>
> Lastly, I'm wondering why someone seems to have removed the double
> quotes around the filesystem name in some of these messages.
> Surely that's not per style guide.

When I replaced %u/%u/%u with %s containing the relpath() of the file, it didn't occur to me to add quotes. Agreed, they should be quoted.

Want me to change those or are you on it already?

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
> Want me to change those or are you on it already?

I'm going to bed --- if you wanna do it, have at it ...

regards, tom lane
Tom Lane wrote:
> Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
>> Want me to change those or are you on it already?
>
> I'm going to bed --- if you wanna do it, have at it ...

Ok.

I note that many of the messages currently print the relpath() of the relation, and don't include the affected segment suffix. For example:

    could not read block 140000 of relation base/11566/24614: read only 1 of 8192 bytes

If we change them to point to exactly the right filename, including the segment suffix, then the block number becomes confusing, since it would still refer to the block number within the relation, not within the segment. Right now, the "relation xxx" refers to the segmented virtual file as a whole, not to any specific segment.

One option is to revert those messages to 8.3 style:

    could not read block 140000 of relation 1663/11566/24614: read only 1 of 8192 bytes

We'd need to include the fork there, so at least for forks other than the main one it would become something like

    could not read block 140000 of relation 1663/11566/24614/fsm: read only 1 of 8192 bytes

Another option is to print the byte offset within the segment file instead of the block number:

    could not read 8192 bytes at offset 73138176 of file "base/11566/24614.1": read only 1 bytes

That feels more concise and describes accurately what the failing OS call was. However, it doesn't fit these two messages:

    cannot extend relation %s beyond %u blocks
    could not truncate relation %s to %u blocks: it's only %u blocks now

since those genuinely don't refer to any particular segment. Also, if we want to support RELSEG_SIZE > 4GB, we'd have to use INT64_FORMAT in the format strings, and I don't think that works nicely with translations.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
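The translation concern can be illustrated with a minimal sketch. It assumes the difficulty is that a platform-dependent conversion macro such as INT64_FORMAT, spliced into the message literal, makes the translatable string differ from one platform to another. One possible workaround, shown here, is to render the 64-bit value into a buffer first and pass it to the message with %s. The helper names format_offset and build_read_error are hypothetical and not part of md.c, and PRIu64 from <inttypes.h> stands in for INT64_FORMAT.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: render a 64-bit offset as decimal text, so that the
 * platform-specific conversion never appears in the translatable message.
 */
static const char *
format_offset(char *buf, size_t buflen, uint64_t offset)
{
    int n = snprintf(buf, buflen, "%" PRIu64, offset);

    /* Sketch only: treat an encoding error or truncation as a bug. */
    assert(n >= 0 && (size_t) n < buflen);
    return buf;
}

/*
 * Hypothetical message builder.  The wording follows the byte-offset
 * variant proposed in the thread and is illustrative only.  The message
 * literal is identical on every platform because the offset goes in as %s.
 */
static int
build_read_error(char *msg, size_t msglen, const char *path,
                 uint64_t offset, int nread, int wanted)
{
    char offbuf[32];    /* large enough for any 64-bit decimal value */

    assert(path != NULL && strlen(path) > 0);
    return snprintf(msg, msglen,
                    "could not read %d bytes at offset %s of file \"%s\": "
                    "read only %d bytes",
                    wanted, format_offset(offbuf, sizeof(offbuf), offset),
                    path, nread);
}
```

The cost of this approach is an extra buffer and an extra formatting step per message, which is one reason it may not be attractive for every message in md.c.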
On Aug 4, 2009, at 11:10 PM, Tom Lane wrote:
>> Want me to change those or are you on it already?
>
> I'm going to bed --- if you wanna do it, have at it ...

Oh please. Everyone knows that you don't sleep, Tom. You just sit back in your chair and power nap for five minutes once in a while, perhaps between reading the -hackers and -general mail lists. ;-P

David
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
> I note that many of the messages currently print the relpath() of the
> relation, and don't include the affected segment suffix. For example:

> could not read block 140000 of relation base/11566/24614: read only 1
> of 8192 bytes

> If we change them to point to exactly the right filename, including the
> segment suffix, then the block number becomes confusing, since it
> would still refer to the block number within the relation, not within
> the segment.

Hmm, good point. I don't think the byte-offset solution is usable, because of the INT64_FORMAT problem. What I would vote for is just continuing to show the block number relative to the whole relation, while (as much as possible) showing the actual filesystem pathname of the file being mentioned. This would mean that anyone trying to interpret the block number would have to be aware of what it meant and do the appropriate modulo calculation, but frankly I doubt that all that many people will care about exactly what offset is implied.

BTW, I wonder whether it would be worth adding an entry point to fd.c to return the path name associated with a logical fd, rather than sprinkling extra relpath() calls throughout these messages.

regards, tom lane
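The modulo calculation mentioned above can be sketched as follows. This is a minimal illustration, assuming the default build settings of 8192-byte blocks and 1 GB segments (131072 blocks per segment). The struct, the function locate_block and the SKETCH_ constants are hypothetical names for this example, not md.c code.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed defaults; a given build may be configured differently. */
#define SKETCH_BLCKSZ       8192u    /* bytes per block */
#define SKETCH_RELSEG_SIZE  131072u  /* blocks per segment (1 GB / 8 kB) */

/* Where a relation-wide block number lands on disk. */
typedef struct
{
    uint32_t segno;       /* segment file number (0 = first segment) */
    uint32_t seg_block;   /* block number within that segment */
    uint64_t seg_offset;  /* byte offset within that segment file */
} SegmentLocation;

/* Hypothetical helper: map a relation-wide block number to its segment. */
static SegmentLocation
locate_block(uint32_t blkno, uint32_t blocks_per_segment, uint32_t block_size)
{
    SegmentLocation loc;

    assert(blocks_per_segment > 0);

    loc.segno = blkno / blocks_per_segment;
    loc.seg_block = blkno % blocks_per_segment;
    /* Widen before multiplying so the offset cannot overflow 32 bits. */
    loc.seg_offset = (uint64_t) loc.seg_block * block_size;
    return loc;
}
```

Under these assumptions, block 140000 falls in segment 1 at block 8928 of that segment, which is byte offset 73138176. That agrees with the offset in the byte-offset example message quoted earlier in the thread.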
Tom Lane wrote:
> Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
>> I note that many of the messages currently print the relpath() of the
>> relation, and don't include the affected segment suffix. For example:
>
>> could not read block 140000 of relation base/11566/24614: read only 1
>> of 8192 bytes
>
>> If we change them to point to exactly the right filename, including the
>> segment suffix, then the block number becomes confusing, since it
>> would still refer to the block number within the relation, not within
>> the segment.
>
> Hmm, good point. I don't think the byte-offset solution is usable,
> because of the INT64_FORMAT problem. What I would vote for is just
> continuing to show the block number relative to the whole relation,
> while (as much as possible) showing the actual filesystem pathname of
> the file being mentioned. This would mean that anyone trying to
> interpret the block number would have to be aware of what it meant
> and do the appropriate modulo calculation, but frankly I doubt that all
> that many people will care about exactly what offset is implied.

Ok. The most likely scenario where it would be confusing is an error along the lines of "read error on block 200000 in file XXX.1": you look at file XXX.1 and conclude that it must have been truncated, because the file is much shorter than 200000 blocks. Some low-level knowledge is indeed needed to interpret that correctly, but then again, knowing to multiply by 8192 to get the offset is low-level knowledge to begin with.

> BTW, I wonder whether it would be worth adding an entry point to fd.c
> to return the path name associated with a logical fd, rather than
> sprinkling extra relpath() calls throughout these messages.

Yes. I was going to add a function to md.c to construct the filename from (SmgrRelation, ForkNumber, segment number), but that's an even better idea.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
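As a rough sketch of the filename construction discussed above: assuming the naming convention seen in the thread's examples, where the first segment uses the bare relation path and later segments append ".N" (as in "base/11566/24614.1"), a helper could look like the following. The name segment_path is hypothetical, the relation path is taken as an already-built string rather than derived from (SmgrRelation, ForkNumber), and fork handling is left out.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: build the filesystem name of one segment of a
 * relation from the relation's path and a segment number.
 *
 * Returns what snprintf returns, so the caller can detect truncation by
 * checking for a result >= buflen.
 */
static int
segment_path(char *buf, size_t buflen, const char *relpath, uint32_t segno)
{
    assert(relpath != NULL && strlen(relpath) > 0);

    /* Assumption: segment 0 carries no suffix, later segments get ".N". */
    if (segno == 0)
        return snprintf(buf, buflen, "%s", relpath);
    return snprintf(buf, buflen, "%s.%u", relpath, (unsigned int) segno);
}
```

A name built this way could then be placed, in double quotes, into the error message in place of the relation name. The fd.c entry point suggested in the thread would make such a helper unnecessary wherever a logical fd is already at hand.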