Re: sgml cleanup: unescaped '>' characters
Peter Eisentraut wrote:
as well as seemingly-invalid SGML, such as using '>' unescaped inside
normal SGML entries.
Unescaped > is valid, AFAIK.
Oh, that's interesting. I took a quick look at "The SGML FAQ book",
page 73 [1], which supports this claim.
But I notice we've been fixing such issues in the recent past (e.g.
commit d420ba2a2d4ea4831f89a3fd7ce86b05eff932ff). Don't we want to
continue doing so? Not to mention the fact that we have
./src/tools/find_gt_lt, which while somewhat broken, has the
ostensible goal of finding such problems in the SGML. Or do we want to
stop worrying about '>' entirely, and rename find_gt_lt to find_lt,
instead?
I don't know what the rationale for this tool is. I have never used it.
Clearly, the reference shows, and the tools we use confirm, that it is
not necessary to use it.
I have updated the scripts and instructions accordingly.
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ It's impossible for everything to be true. +
On Thu, 2011-09-01 at 10:17 -0400, Bruce Momjian wrote:
I have updated the scripts and instructions accordingly.
That still leaves open why we bother about escaping <.
Peter Eisentraut wrote:
That still leaves open why we bother about escaping <.
The problem is that I often add SGML that has:
if (1 < 0) ...
I need something to warn me about those, especially in the release
notes.
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ It's impossible for everything to be true. +
On Thu, 2011-09-01 at 14:17 -0400, Bruce Momjian wrote:
That still leaves open why we bother about escaping <.
The problem is that I often add SGML that has:
if (1 < 0) ...
I need something to warn me about those, especially in the release
notes.
Why do you need to be warned about that?
Peter Eisentraut wrote:
Why do you need to be warned about that?
If I have:
if (1 < fred)
it will think "fred" is a SGML tag, no?
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ It's impossible for everything to be true. +
On Thu, 2011-09-01 at 17:31 -0400, Bruce Momjian wrote:
If I have:
if (1 < fred)
it will think "fred" is a SGML tag, no?
No, a < followed by a space is not a tag, it's character data. If it
thought it were a tag, it would complain.
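To make the rule above concrete, here is a minimal sketch of when a "<" would be read as markup rather than character data. It is only an approximation written for illustration: the function name and the regular expression are hypothetical, and the rule it encodes (a "<" counts as markup only when followed by a letter, by "/" plus a letter, or by "!" or "?") is a simplification of SGML's delimiter recognition. The SGML parser itself remains the authority.

```python
import re
from typing import List

# Simplified assumption, not the full SGML delimiter rules: treat "<" as
# opening markup only when followed by a letter (start tag), "/" plus a
# letter (end tag), or "!" / "?" (declaration or processing instruction).
MARKUP_OPEN = re.compile(r"<(?=[A-Za-z]|/[A-Za-z]|[!?])")


def markup_like_lt_offsets(text: str) -> List[int]:
    """Return offsets of "<" characters that this approximation reads as markup."""
    return [m.start() for m in MARKUP_OPEN.finditer(text)]


if __name__ == "__main__":
    # "<" followed by a space or a quote is plain character data here;
    # "<" directly followed by a letter looks like the start of a tag.
    for sample in ("if (1 < fred)", "if (1 <fred)", "a '<' b"):
        print(repr(sample), markup_like_lt_offsets(sample))
```

Under this approximation, "if (1 < fred)" contains no markup-like "<" at all, while "if (1 <fred)" does, which is the case a parser would be expected to complain about.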
Peter Eisentraut wrote:
No, a < followed by a space is not a tag, it's character data. If it
thought it were a tag, it would complain.
Sometimes it is '<' (in single quotes), which I thought would be a
problem.
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ It's impossible for everything to be true. +
On Sat, 2011-09-03 at 16:47 -0400, Bruce Momjian wrote:
Sometimes it is '<' (in single quotes), which I thought would be a
problem.
The bottom line is, the SGML parser can figure that out itself, and if
it has a problem, it will complain. We don't need to second guess it
with regular expressions that are handcrafted out of thin air.
I was hoping you would remember whether you initially put this in
because of some tool problem. But if we are not finding any supporting
evidence, I would suggest that we just scrap this thing entirely.
Peter Eisentraut wrote:
The bottom line is, the SGML parser can figure that out itself, and if
it has a problem, it will complain. We don't need to second guess it
with regular expressions that are handcrafted out of thin air.
I was hoping you would remember whether you initially put this in
because of some tool problem. But if we are not finding any supporting
evidence, I would suggest that we just scrap this thing entirely.
I put it in to warn about release.sgml markup problems, so I properly
escaped all non-tag '>' and '<' characters.
I have removed the tool. We can always re-add it if we find it is
needed.
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ It's impossible for everything to be true. +