md.c should not call files "relations"
There's an interesting thread over here
http://archives.postgresql.org/pgsql-sql/2009-08/msg00013.php
in which someone mistook a filesystem-level permissions problem
for a database permissions problem. It wasn't exactly his fault,
I think, since the message he was presented with was
ERROR: could not create relation "test": Permission denied
which is not all that obviously different from what you would
get for a SQL-permissions violation.
I am thinking that this message would be more correct and less
confusing if it looked something like
ERROR: could not create file "12345/67890": Permission denied
ie, when reflecting an OS-level error we should call a file a file and
provide its filesystem name, not the name of the table that we were
hoping to map to it. This would be more likely to lead the user's
mind in the right direction, and he'd need the filesystem pathname
for any detailed investigation anyway.
This would have the further advantage that we could make all the
errors in md.c consistent --- some of them provide filesystem names
rather than table names because that's all they have available.
Lastly, I'm wondering why someone seems to have removed the double
quotes around the filesystem name in some of these messages.
Surely that's not per style guide.
Comments?
regards, tom lane
On Aug 4, 2009, at 6:30 PM, Tom Lane wrote:
Comments?
+1 Seems like a no-brainer.
David
Tom Lane wrote:
There's an interesting thread over here
http://archives.postgresql.org/pgsql-sql/2009-08/msg00013.php
in which someone mistook a filesystem-level permissions problem
for a database permissions problem. It wasn't exactly his fault,
I think, since the message he was presented with wasERROR: could not create relation "test": Permission denied
which is not all that obviously different from what you would
get for a SQL-permissions violation.I am thinking that this message would be more correct and less
confusing if it looked something likeERROR: could not create file "12345/67890": Permission denied
ie, when reflecting an OS-level error we should call a file a file and
provide its filesystem name, not the name of the table that we were
hoping to map to it. This would be more likely to lead the user's
mind in the right direction, and he'd need the filesystem pathname
for any detailed investigation anyway.
We already print the file name, not table name - since version 8.0. The
OP that saw the message was on 7.4.
I agree we should call file a file.
This would have the further advantage that we could make all the
errors in md.c consistent --- some of them provide filesystem names
rather than table names because that's all they have available.Lastly, I'm wondering why someone seems to have removed the double
quotes around the filesystem name in some of these messages.
Surely that's not per style guide.
When I replaced %u/%u/%u with %s containing the relpath() of the file,
it didn't occur to add quotes. Agreed, they should be quoted.
Want me to change those or are you on it already?
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
Want me to change those or are you on it already?
I'm going to bed --- if you wanna do it, have at it ...
regards, tom lane
Tom Lane wrote:
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
Want me to change those or are you on it already?
I'm going to bed --- if you wanna do it, have at it ...
Ok.
I note that many of the messages currently print the relpath() of the
relation, and don't include the affected segment suffix. For example:
could not read block 140000 of relation base/11566/24614: read only 1
of 8192 bytes
If we change them to point to the exactly right filename including
segment suffix, then the block number becomes confusing, since that
would still refer block number within the relation, not the segment.
Right now, the "relation xxx" is referring to the segmented virtual file
as whole, not to any specific segment. One option is to revert those
messages to 8.3 style:
could not read block 140000 of relation 1663/11566/24614: read only 1
of 8192 bytes
We'd need to include the fork there, so at least for forks other than
the main one it would become something like
could not read block 140000 of relation 1663/11566/24614/fsm: read
only 1 of 8192 bytes
Another option is to print the byte offset within segment file instead
of block number:
could not read 8129 bytes at offset 73138176 of file
"base/11566/24614.1": read only 1 bytes
That feels more concise and describes accurately what the failing OS
call was. However, it doesn't fit these two messages:
cannot extend relation %s beyond %u blocks
could not truncate relation %s to %u blocks: it's only %u blocks now
since those genuinely don't refer to any particular segment. Also, if we
want to support RELSEG_SIZE > 4GB, we'd have to use INT64_FORMAT in the
format strings, and I don't think that works nicely with translations.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
On Aug 4, 2009, at 11:10 PM, Tom Lane wrote:
Want me to change those or are you on it already?
I'm going to bed --- if you wanna do it, have at it ...
Oh please. Everyone knows that you don't sleep, Tom. You just sit back
in your chair and power nap for five minutes once in a while, perhaps
between reading the -hackers and -general mail lists. ;-P
David
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
I note that many of the messages currently print the relpath() of the
relation, and don't include the affected segment suffix. For example:
could not read block 140000 of relation base/11566/24614: read only 1
of 8192 bytes
If we change them to point to the exactly right filename including
segment suffix, then the block number becomes confusing, since that
would still refer block number within the relation, not the segment.
Hmm, good point. I don't think the byte-offset solution is usable,
because of the INT64_FORMAT problem. What I would vote for is just
continuing to show the block number relative to the whole relation,
while (as much as possible) showing the actual filesystem pathname of
the file being mentioned. This would mean that anyone trying to
interpret the block number would have to be aware of what it meant
and do the appropriate modulo calculation, but frankly I doubt that all
that many people will care about exactly what offset is implied.
BTW, I wonder whether it would be worth adding an entry point to fd.c
to return the path name associated with a logical fd, rather than
sprinkling extra relpath() calls throughout these messages.
regards, tom lane
Tom Lane wrote:
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
I note that many of the messages currently print the relpath() of the
relation, and don't include the affected segment suffix. For example:could not read block 140000 of relation base/11566/24614: read only 1
of 8192 bytesIf we change them to point to the exactly right filename including
segment suffix, then the block number becomes confusing, since that
would still refer block number within the relation, not the segment.Hmm, good point. I don't think the byte-offset solution is usable,
because of the INT64_FORMAT problem. What I would vote for is just
continuing to show the block number relative to the whole relation,
while (as much as possible) showing the actual filesystem pathname of
the file being mentioned. This would mean that anyone trying to
interpret the block number would have to be aware of what it meant
and do the appropriate modulo calculation, but frankly I doubt that all
that many people will care about exactly what offset is implied.
Ok. The most likely scenario where it would be confusing would be if you
get an error along the lines of "read error on block 200000 in file
XXX.1": you look at file XXX.1 and conclude that the file must be
truncated because the file is much shorter than 200000 blocks. Some
low-level knowledge is indeed needed to interpret that correctly, but
then again knowing to multiply by 8192 to get the offset is low-level
knowledge to begin with.
BTW, I wonder whether it would be worth adding an entry point to fd.c
to return the path name associated with a logical fd, rather than
sprinkling extra relpath() calls throughout these messages.
Yes. I was going to add a function to md.c to construct the filename
from (SmgrRelation, ForkNumber, segment number), but that's an even
better idea.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com