Getting our tables to render better in PDF output

Started by Tom Lane, about 6 years ago (29 messages, docs)
#1 Tom Lane
tgl@sss.pgh.pa.us

The crummy formatting of our tables of functions and operators has
been an issue for a long time. To my mind, there are several things
that need to be addressed:

* The layout is completely unfriendly to function descriptions that
run to more than a few words.

* It's not very practical to have more than one example per function
(or at least, we seldom do so).

* The results look completely awful in PDF format, because of the
narrow effectively-available space, plus the fact that the toolchain
will prefer to overprint following columns instead of breaking text
where there's no whitespace.

In [1], Alvaro suggested that we might be able to improve matters by
taking advantage of DocBook's features for column and row spanning.
I did some concrete experimentation in that line, and attached are
two alternative patches that show a couple of things we might do.
Both patches change tables 9.31 (Date/Time Functions) and 9.33
(Enum Support Functions), which I chose somewhat at random, but of
course there would be a lot more to be done if we choose to go this way.

The first patch uses only one row for each function example, while
the second patch uses two rows (i.e., example and result in separate
table rows). Otherwise they're the same.

I initially did the enum-support table, and what I tried there included
getting rid of the separate table column for function result type by
writing the functions in the form "func(argtypes) returns resulttype".
(Note that this table failed to specify the result types at all before,
which doesn't seem great.) The layout idea is

    function name, args, result    description
                                   example        example result

where we can repeat the "example / example result" row if we want more
examples per function. Alternatively, in the second patch, it's

    function name, args, result    description
                                   example
                                   example result

To my eyes, the first alternative is preferable in HTML, unless maybe you
want to read the manual in a *very* narrow browser window. But some of
the examples/results still overrun the available space when looking
at it in PDF A4 format. The second patch fixes that problem, but seems
not very pretty in a normal-width browser window.

When I tried to apply the same idea to the date/time functions table,
it didn't really work well at all, mainly because of a few beasts like
make_interval() --- that caused the left column to be so wide that the
right-hand columns were horrid. (At least with the toolchain version
I'm using, it seems like the colwidth specifications are respected
rigidly in PDF output but just plain ignored in HTML output. What
seems to happen in HTML is that earlier columns get their preferred
width and later ones get squeezed.)

So the layout idea that the patches show for that table is

    function name    arg types    result type
                     description
                     example      example result

or

    function name    arg types    result type
                     description
                     example
                     example result

(Even with that, I had to savage make_interval's arg-types list a bit
to keep that column from eating too much space...)

I'm not especially wedded to any of these ideas, but I hope to provoke
some discussion about what we might do in this area. DocBook tables
aren't the greatest layout tool in the world, but they do have abilities
we're not exploiting.

Even with these changes, the amount of space available for examples
and results in PDF format is pretty tiny. With examples and results
in the same row, it seems that you can only have a couple of dozen
consecutive non-whitespace characters without running into overwrite
issues, whereas in HTML format the trouble threshold is a good deal
higher. I wonder if we could improve matters by switching to some
narrower font for <literal> text in PDF?

regards, tom lane

[1]: /messages/by-id/20200116184444.GA25792@alvherre.pgsql

Attachments:

tables-one-row-per-example.patch (text/x-diff, +422 -258)
tables-two-rows-per-example.patch (text/x-diff, +499 -258)

#2 Alexander Lakhin
exclusion@gmail.com
In reply to: Tom Lane (#1)
Re: Getting our tables to render better in PDF output

Hello Tom,
12.02.2020 00:51, Tom Lane wrote:

The crummy formatting of our tables of functions and operators has
been an issue for a long time. To my mind, there are several things
that need to be addressed:

* The layout is completely unfriendly to function descriptions that
run to more than a few words.

* It's not very practical to have more than one example per function
(or at least, we seldom do so).

* The results look completely awful in PDF format, because of the
narrow effectively-available space, plus the fact that the toolchain
will prefer to overprint following columns instead of breaking text
where there's no whitespace.

Please look at a less invasive approach that we have used at Postgres Pro
for some time (mainly for improving the translated documentation, but it
works for the original one too). The idea is to add zero-width spaces
after/before some chars ('(', ',', '[', etc) to let fop split lines
where desired. It has one disadvantage - it's not search-friendly
(though maybe that is application-dependent).
But if it's feasible, I think this approach can at least complement a
manual tables reformatting. Decreasing a font size in the tables seems
appropriate to me too.
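
As a rough illustration of the zero-width-space idea, here is a minimal Python sketch. It is not the XSLT from fix-pdf-overflow.patch; the character sets BREAK_AFTER and BREAK_BEFORE and the function name are assumptions picked for this example. It also shows why searching can suffer: the inserted U+200B characters become part of the text.

```python
ZWSP = "\u200b"  # zero-width space, written &#x200B; in XML

# Assumed rule set for this sketch: allow a break after these characters...
BREAK_AFTER = "(,["
# ...and before these.
BREAK_BEFORE = ")]"


def add_break_opportunities(text: str) -> str:
    """Insert ZWSP around selected punctuation so a formatter may wrap there."""
    out = []
    for ch in text:
        if ch in BREAK_BEFORE:
            out.append(ZWSP)
        out.append(ch)
        if ch in BREAK_AFTER:
            out.append(ZWSP)
    return "".join(out)


example = "make_interval(years,months,weeks)"
broken = add_break_opportunities(example)

# Make the invisible break opportunities visible for inspection.
print(broken.replace(ZWSP, "|"))

# A plain substring search no longer matches across an inserted ZWSP.
print("make_interval(years" in broken)
```

Restricting the rule set, or applying it only to selected cells, would reduce the number of odd-looking breaks at the cost of more overflows.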

Best regards,
Alexander

Attachments:

fix-pdf-overflow.patch (text/x-patch, +122 -0)

#3 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alexander Lakhin (#2)
Re: Getting our tables to render better in PDF output

Alexander Lakhin <exclusion@gmail.com> writes:

Please look at a less invasive approach that we have used at Postgres Pro
for some time (mainly for improving the translated documentation, but it
works for the original one too). The idea is to add zero-width spaces
after/before some chars ('(', ',', '[', etc) to let fop split lines
where desired. It has one disadvantage - it's not search-friendly
(though maybe that is application-dependent).
But if it's feasible, I think this approach can at least complement a
manual tables reformatting. Decreasing a font size in the tables seems
appropriate to me too.

Hmm, interesting proposal. I experimented and verified that injecting
zero-width space (&#x200B;) does allow line breaking to occur in both
HTML and PDF output, so this could be a route to improving the situation
for overlength example texts. I do not think I like the idea of
automatically injecting tons of them, though. As you say, it might
hinder searching; and it would allow some silly breaks; and there are
cases where it still wouldn't find a break, such as the examples for
sha256() et al. I'd be happier about manually inserting breaks just
in the places we really need them. To keep the source readable, I'd
want to write something like "&zwsp;" not a numeric entity code,
but it looks like we can define custom entities if we want.

For amusement's sake, attached is a screenshot of what Table 9-33
looks like in A4 format, with my one-row-per-example patch of
yesterday plus a few manually-added zero-width spaces to break up
the examples. This is the first PDF rendering of that table that
I've seen that I actually like.

I also attached a screenshot of a segment of Table 9-31, to show
what that layout proposal looks like. It's a little busier, but
it does have the advantage that it's clearer how to apply that
format to operator tables. The "returns <type>" notation isn't used
anywhere in SQL for operators, so I am not in love with the idea of
writing the operator tables that way.

Also worth noting is that in most function tables, and certainly
in the operator tables, we could make the first column narrower.
The same table with the first column half as wide as the others
is depicted in the last screenshot. (For this particular table,
doing that would require breaking some of the longer function
names such as transaction_timestamp. Not sure whether that's
a net win, but we do have the option.)

One issue that I've found is that the toolchain has no idea that
the table rows are in groups, so it's happy to split a table
across pages with a function's description and/or examples on
a new page. No idea if there's any way around that. Fortunately
it's not an issue in HTML, so maybe we don't have to fix it.

Thoughts?

regards, tom lane

Attachments:

table9-33.png (image/png)
table9-31.png (image/png)
table9-31-narrow.png (image/png)

#4 Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Tom Lane (#3)
Re: Getting our tables to render better in PDF output

On 2020-Feb-12, Tom Lane wrote:

For amusement's sake, attached is a screenshot of what Table 9-33
looks like in A4 format, with my one-row-per-example patch of
yesterday plus a few manually-added zero-width spaces to break up
the examples. This is the first PDF rendering of that table that
I've seen that I actually like.

I like this. The trick of making the first cell take up two or three
rows makes this much clearer and more sensible than what I had obtained.

I also attached a screenshot of a segment of Table 9-31, to show
what that layout proposal looks like. It's a little busier, but
it does have the advantage that it's clearer how to apply that
format to operator tables. The "returns <type>" notation isn't used
anywhere in SQL for operators, so I am not in love with the idea of
writing the operator tables that way.

Yeah, that's a little less obvious. I just noticed that the operators
tables show the operator names but not the input datatypes except in the
examples. Perhaps we could use a layout with a cell labelled
"signature" (namest=col2 nameend=col3) instead of input types + return
types and separate them using &rightarrow; which would look like this:
date + integer → date

Also worth noting is that in most function tables, and certainly
in the operator tables, we could make the first column narrower.
The same table with the first column half as wide as the others
is depicted in the last screenshot. (For this particular table,
doing that would require breaking some of the longer function
names such as transaction_timestamp. Not sure whether that's
a net win, but we do have the option.)

I like making that column narrower.

One issue that I've found is that the toolchain has no idea that
the table rows are in groups, so it's happy to split a table
across pages with a function's description and/or examples on
a new page. No idea if there's any way around that. Fortunately
it's not an issue in HTML, so maybe we don't have to fix it.

My vote goes to postponing a solution to this problem :-)

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#5 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#4)
Re: Getting our tables to render better in PDF output

Alvaro Herrera <alvherre@2ndquadrant.com> writes:

On 2020-Feb-12, Tom Lane wrote:

I also attached a screenshot of a segment of Table 9-31, to show
what that layout proposal looks like. It's a little busier, but
it does have the advantage that it's clearer how to apply that
format to operator tables. The "returns <type>" notation isn't used
anywhere in SQL for operators, so I am not in love with the idea of
writing the operator tables that way.

Yeah, that's a little less obvious. I just noticed that the operators
tables show the operator names but not the input datatypes except in the
examples. Perhaps we could use a layout with a cell labelled
"signature" (namest=col2 nameend=col3) instead of input types + return
types and separate them using &rightarrow; which would look like this:
date + integer → date

Oh, that's a thought. We could do the same for functions:

    function name    type1, type2, type3 → rettype
                     description ...
                     example                example result

which'd relieve the column-width pressure for functions with several
arguments. On the other hand, that would look a little funny
for functions with no arguments ... not but what they're going to
look funny no matter what. I used "none" in my conversion of
table 9.31, but wasn't satisfied with that, because it relies
completely on font choice to be distinguishable from a data type
named "none". With a separate argument-types cell it'd likely be
better to just leave the cell empty, but do we want to write
just "→ rettype" in a signature cell?

The other thing I was struggling with was how to distinguish
normal zero-argument functions (written with parens) from those
SQL abominations that are function calls with no parens. I think
we need to show that somehow, so that it's clear that the examples
are correct and not typos. It doesn't have to be *totally* obvious,
perhaps, if we have an example to back it up ... but the example
can't be the only thing.

Maybe don't take out the parens? So it'd work like

    Function             Signature

    age                  (timestamp) → interval

    now                  () → timestamp with time zone

    current_timestamp    → timestamp with time zone

Also, I think we're both imagining that we'd use the operator name
in operator signatures:

    Operator    Signature

    +           integer + integer → integer

    +           + integer → integer

so being consistent with that might suggest including the function name
in function signatures:

    Function             Signature

    age                  age(timestamp) → interval

    now                  now() → timestamp with time zone

    current_timestamp    current_timestamp → timestamp with time zone

I'm a bit suspicious of how much horizontal space that would eat, but
if we're able to get rid of the separate cell for result type, it
might work out OK.
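
To make the proposed signature convention concrete, here is a small Python sketch that builds the strings shown in the table above. The function name and the parens flag are hypothetical and exist only for this illustration; nothing like this is part of the documentation toolchain.

```python
ARROW = "\u2192"  # the right arrow used between signature and result type


def function_signature(name, arg_types, result_type, parens=True):
    """Format 'name(args) -> result' in the style proposed in this thread.

    parens=False covers the SQL-standard functions that are called
    without parentheses, such as current_timestamp.
    """
    call = f"{name}({', '.join(arg_types)})" if parens else name
    return f"{call} {ARROW} {result_type}"


print(function_signature("age", ["timestamp"], "interval"))
print(function_signature("now", [], "timestamp with time zone"))
print(function_signature("current_timestamp", [],
                         "timestamp with time zone", parens=False))
```

Keeping the parentheses for ordinary zero-argument functions is what distinguishes now() from current_timestamp in the rendered signature.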

regards, tom lane

#6 Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Tom Lane (#5)
Re: Getting our tables to render better in PDF output

On 2020-Feb-12, Tom Lane wrote:

With a separate argument-types cell it'd likely be
better to just leave the cell empty, but do we want to write
just "→ rettype" in a signature cell?

Yeah, it'd look very odd, and certainly the no-parens case makes it
worse. I like this end result:

so being consistent with that might suggest including the function name
in function signatures:

    Function             Signature

    age                  age(timestamp) → interval

    now                  now() → timestamp with time zone

    current_timestamp    current_timestamp → timestamp with time zone

I'm a bit suspicious of how much horizontal space that would eat, but
if we're able to get rid of the separate cell for result type, it
might work out OK.

Regarding no-parens function signatures, perhaps we can add a footnote
indicating that such functions have this strange shape because of the
SQL committee, such as "&dagger; This function signature uses no
parentheses because the SQL standard defines it in that way."

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#7 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#6)
Re: Getting our tables to render better in PDF output

Alvaro Herrera <alvherre@2ndquadrant.com> writes:

Yeah, it'd look very odd, and certainly the no-parens case makes it
worse. I like this end result:

    Function             Signature
    age                  age(timestamp) → interval
    now                  now() → timestamp with time zone
    current_timestamp    current_timestamp → timestamp with time zone

I gave that a try, and it seems to work really well. It can even handle
the ridiculously long signature for make_interval() in reasonable style,
as shown in the screenshot attached.

One problem with the rightarrow idea is that it's not rendering quite
right for me: it looks great in HTML, but in PDF it comes out flush
with the baseline, as you can see in the screenshot. Hopefully
there's a way to fix that that we can hide in the custom entity ...
but I have no idea how.

I decided to try converting the date/time operators table too, to
see how well this works for that. It's bulkier than before, but
also (I think) more precise. I realized that this table actually
had three examples already for float8 * interval, but it wasn't
at all obvious that they were the same operator. So that aspect
is a lot nicer here. On the other hand, it seems like the text
descriptions are only marginally useful here. I can imagine that
they would be useful in some other operator tables, such as
geometric operators, but I'm a bit tempted to leave them out
in this particular table. The format would adapt to that easily.

Another thing worth considering is removing duplicate left-hand-
column entries, that is, considering all the instances of
similarly-named functions/operators to be "the same". In the
attached patch, I did that for isfinite() but not anywhere else.
I'm not quite sure if it's a good idea or not. It seems like it
makes sense for isfinite(), but perhaps less so for operators.

Again, comments welcome. This is starting to feel like a real
proposal now, but I'm still not at all wedded to it.

regards, tom lane

Attachments:

better-function-tables-wip.patch (text/x-diff, +645 -334)
table-9-31-take2.png (image/png)

#8 Alexander Lakhin
exclusion@gmail.com
In reply to: Tom Lane (#3)
Re: Getting our tables to render better in PDF output

12.02.2020 23:58, Tom Lane wrote:

Alexander Lakhin <exclusion@gmail.com> writes:

Please look at a less invasive approach that we have used at Postgres Pro
for some time (mainly for improving the translated documentation, but it
works for the original one too). The idea is to add zero-width spaces
after/before some chars ('(', ',', '[', etc) to let fop split lines
where desired. It has one disadvantage - it's not search-friendly
(though maybe that is application-dependent).
But if it's feasible, I think this approach can at least complement a
manual tables reformatting. Decreasing a font size in the tables seems
appropriate to me too.

Hmm, interesting proposal. I experimented and verified that injecting
zero-width space (&#x200B;) does allow line breaking to occur in both
HTML and PDF output, so this could be a route to improving the situation
for overlength example texts. I do not think I like the idea of
automatically injecting tons of them, though. As you say, it might
hinder searching; and it would allow some silly breaks; and there are
cases where it still wouldn't find a break, such as the examples for
sha256() et al. I'd be happier about manually inserting breaks just
in the places we really need them. To keep the source readable, I'd
want to write something like "&zwsp;" not a numeric entity code,
but it looks like we can define custom entities if we want.

Yes, I started with manual &zwsp; insertions into the translation,
but later I reduced such insertions to just several dozen. (For
example, we still have "3.1415926535&zwsp;8979323846" in the translation.)
The main issue with the manual approach was that I needed to recheck the
zwsp placement on updates, and I can't see where it's needed until I
generate the PDF. Fortunately, fop prints warnings like this:
[WARN] FOUserAgent - The contents of fo:block line 2 exceed the
available area in the inline-progression direction by 22725 millipoints.
(See position 127769:983)
It's not very user-friendly, but still useful when we have a pair or two
of them. (For now, I see 559 such warnings in REL_12_STABLE.)
The second issue is that the placement can depend on the page size, and in
fact most of those zwsps are not needed for HTML or other formats
(moreover, some formats can require different placements, unless we're
just implementing some common rules).
The third (minor) issue is with translation: when I see a break in
the English source, e.g. "split_part('abc~@~def&zwsp;~@~ghi', '~@~',
2)", should I leave the break in the same place, or is it better to move
it because the adjacent text has a different length and the table columns
have different widths?
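
Since the number of warnings is what matters, here is a minimal Python sketch for tallying them from a fop log. The regular expression is modelled only on the two warning samples quoted in this thread (one reporting millipoints, one reporting "more than 50 points"); other fop versions may word the message differently, and the function and key names are assumptions made for this example.

```python
import re

# Pattern derived from the two sample warnings in this thread.
WARNING_RE = re.compile(
    r"exceed the available area in the inline-progression direction by "
    r"(?:more than )?(\d+) (millipoints|points)\. "
    r"\(See position (\d+):(\d+)\)"
)


def parse_fop_warnings(log: str):
    """Return one dict per overflow warning found in the log text."""
    # Messages may be wrapped across lines, so normalize whitespace first.
    flat = " ".join(log.split())
    warnings = []
    for match in WARNING_RE.finditer(flat):
        amount, unit, line, col = match.groups()
        # 1 point = 1000 millipoints.
        points = int(amount) / 1000 if unit == "millipoints" else float(amount)
        warnings.append(
            {"overflow_pt": points, "fo_line": int(line), "fo_col": int(col)}
        )
    return warnings


sample_log = """\
[WARN] FOUserAgent - The contents of fo:block line 2 exceed the
available area in the inline-progression direction by 22725 millipoints.
(See position 127769:983)
[WARN] FOUserAgent - The contents of fo:block line 1 exceed the
available area in the inline-progression direction by more than 50
points. (See position 28808:374)
"""

found = parse_fop_warnings(sample_log)
print(len(found), "overflow warnings")
for w in sorted(found, key=lambda w: w["overflow_pt"], reverse=True):
    print(w)
```

Sorting by overflow size puts the worst offenders first, which may help decide where a manual break is worth adding.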

For me this approach expresses a belief that the line breaking rules
should be slightly different in our context. For example, having line
break after an opening bracket is feasible and common in function calls
and declarations. Maybe the rules in the proposed xslt could be
improved/restricted, but I think that if fop would allow us to enable an
imaginary 'programming language line breaking rules' mode, we would use
it for our tables (some or all).
Maybe some of the rules can be implemented explicitly in the DocBook
source, just to reduce tons of zwsp in the generated output, or the
"fo:table-cell/fo:block//text()" condition can be improved to filter
some (text-only?) tables out, but I think that the idea of our specific
line breaking rules could work.

Best regards,
Alexander

#9 Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Alexander Lakhin (#8)
Re: Getting our tables to render better in PDF output

On 2020-Feb-13, Alexander Lakhin wrote:

Yes, I started with manual &zwsp; insertions into the translation,
but later I reduced such insertions to just several dozen. (For
example, we still have "3.1415926535&zwsp;8979323846" in the translation.)
The main issue with the manual approach was that I needed to recheck the
zwsp placement on updates, and I can't see where it's needed until I
generate the PDF. Fortunately, fop prints warnings like this:
[WARN] FOUserAgent - The contents of fo:block line 2 exceed the
available area in the inline-progression direction by 22725 millipoints.
(See position 127769:983)
It's not very user-friendly, but still useful when we have a pair or two
of them.

It seems to me that a productive way forward would be to fix the layout
to make these warnings disappear. Then it will be relatively easy to find
what to fix if new ones appear.

Now I suppose you're complaining about the "position 127769:983" part of
the error message which tells you with zero clarity where the problem
is. Maybe what we need is to figure out what the numbers mean, and how
to use them; for example if they are byte offsets into the file, then it
should be possible to tell your editor to go to that byte in the
complete XML file.

The second issue is that the placement can depend on the page size, and in
fact most of those zwsps are not needed for HTML or other formats
(moreover, some formats can require different placements, unless we're
just implementing some common rules).

I suppose A4 page size is going to show slightly different warnings than
Letter page size in the PDF output. Perhaps we can say that we only
care about warnings in one of them, for these purposes.

Having to touch 500+ places does not sound very appetizing, for sure.

The third (minor) issue is with translation: when I see a break in
the English source, e.g. "split_part('abc~@~def&zwsp;~@~ghi', '~@~',
2)", should I leave the break in the same place, or is it better to move
it because the adjacent text has a different length and the table columns
have different widths?

If the English version is warning-clean, then it should be possible to
keep the zwsps in the same location in the translation, and then tweak
the translation according to any new warnings that appear there.
My guess is that the majority of zwsps are going to want to stay in the
same place.

Maybe some of the rules can be implemented explicitly in the DocBook
source, just to reduce tons of zwsp in the generated output, or the
"fo:table-cell/fo:block//text()" condition can be improved to filter
some (text-only?) tables out, but I think that the idea of our specific
line breaking rules could work.

Maybe we can mark-up specific table cells/columns as being subject to
the special line breaking rules.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#10 Alexander Lakhin
exclusion@gmail.com
In reply to: Alvaro Herrera (#9)
Re: Getting our tables to render better in PDF output

Hello Alvaro,
14.02.2020 23:16, Alvaro Herrera wrote:

On 2020-Feb-13, Alexander Lakhin wrote:

Yes, I started with manual &zwsp; insertions into the translation,
but later I reduced such insertions to just several dozen. (For
example, we still have "3.1415926535&zwsp;8979323846" in the translation.)
The main issue with the manual approach was that I needed to recheck the
zwsp placement on updates, and I can't see where it's needed until I
generate the PDF. Fortunately, fop prints warnings like this:
[WARN] FOUserAgent - The contents of fo:block line 2 exceed the
available area in the inline-progression direction by 22725 millipoints.
(See position 127769:983)
It's not very user-friendly, but still useful when we have a pair or two
of them.

It seems to me that a productive way forward would be to fix the layout
to make these warnings disappear. Then it will be relatively easy to find
what to fix if new ones appear.

Now I suppose you're complaining about the "position 127769:983" part of
the error message which tells you with zero clarity where the problem
is. Maybe what we need is to figure out what the numbers mean, and how
to use them; for example if they are byte offsets into the file, then it
should be possible to tell your editor to go to that byte in the
complete XML file.

I'm not complaining about the cryptic position of the problems, I'm
concerned with their number.
The position is specified as {line_number}:{character_position} in
postgres-*.fo (not in the DocBook source).
For example, when performing `make postgres-A4.pdf` on REL_12_STABLE I get:
[WARN] FOUserAgent - The contents of fo:block line 1 exceed the
available area in the inline-progression direction by more than 50
points. (See position 28808:374)

To find the exact problematic text you can look at the specified line(s)
of postgres-A4.fo:
$ sed -n '28808,28811p' postgres-A4.fo
<fo:block id="id-1.5.13.4.7.12.1" wrap-option="wrap" text-align="start"
space-before.minimum="0.8em" space-before.optimum="1em"
space-before.maximum="1.2em" space-after.minimum="0.8em"
space-after.optimum="1em" space-after.maximum="1.2em" hyphenate="false"
white-space-collapse="false" white-space-treatment="preserve"
linefeed-treatment="preserve" font-family="monospace">
EXPLAIN SELECT * FROM tenk1 WHERE unique1 &lt; 100;

Searching for this text in the PDF gets you to page 467, where you can see
a long line of '---' going off the page...
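
The same lookup can be sketched in Python, under the assumption stated above that the first number in the position is a 1-based line number in the generated .fo file. The helper name and the tiny in-memory stand-in for postgres-A4.fo are made up for this example.

```python
def fo_context(fo_lines, position, extra_lines=3):
    """Return the .fo lines starting at a fop 'line:char' position.

    With extra_lines=3 this mirrors `sed -n 'N,N+3p'`: the reported line
    plus the three lines that follow it.
    """
    line_no, _char = (int(part) for part in position.split(":"))
    start = line_no - 1  # convert 1-based line number to a list index
    return fo_lines[start:start + 1 + extra_lines]


# Tiny stand-in for the real file; for a real run one would instead read
# the generated file, e.g. open("postgres-A4.fo").read().splitlines().
fake_fo = [
    "<fo:root>",
    '<fo:block font-family="monospace">',
    "EXPLAIN SELECT * FROM tenk1 WHERE unique1 &lt; 100;",
    "</fo:block>",
    "</fo:root>",
]

for line in fo_context(fake_fo, "2:10", extra_lines=1):
    print(line)
```

The character offset is ignored here because the block contents usually start on the following lines, as in the sed output above.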

The third (minor) issue is with translation: when I see a break in
the English source, e.g. "split_part('abc~@~def&zwsp;~@~ghi', '~@~',
2)", should I leave the break in the same place, or is it better to move
it because the adjacent text has a different length and the table columns
have different widths?

If the English version is warning-clean, then it should be possible to
keep the zwsps in the same location in the translation, and then tweak
the translation according to any new warnings that appear there.
My guess is that the majority of zwsps are going to want to stay in the
same place.

Yes, that's why I consider this a minor issue, but some kind of
automatic solution could eliminate it altogether.

Maybe some of the rules can be implemented explicitly in the DocBook
source, just to reduce tons of zwsp in the generated output, or the
"fo:table-cell/fo:block//text()" condition can be improved to filter
some (text-only?) tables out, but I think that the idea of our specific
line breaking rules could work.

Maybe we can mark-up specific table cells/columns as being subject to
the special line breaking rules.

Things are complicated by the XSLT preprocessor, because you can't see
DocBook tags and attributes at the FOP level, but I can explore possible
solutions if we choose to go this way.

Best regards,
Alexander

#11 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#9)
Re: Getting our tables to render better in PDF output

Alvaro Herrera <alvherre@2ndquadrant.com> writes:

On 2020-Feb-13, Alexander Lakhin wrote:

The third (minor) issue is with translation: when I see a break in
the English source, e.g. "split_part('abc~@~def&zwsp;~@~ghi', '~@~',
2)", should I leave the break in the same place, or is it better to move
it because the adjacent text has a different length and the table columns
have different widths?

If the English version is warning-clean, then it should be possible to
keep the zwsps in the same location in the translation, and then tweak
the translation according to any new warnings that appear there.
My guess is that the majority of zwsps are going to want to stay in the
same place.

So far as I've seen, the majority of places where we'll still need to
insert break opportunities are in examples and example results, which
don't seem like they'd be subject to translation. I'm really not eager to
turn loose an automatic-zwsp-inserter for a problem that might be mostly
hypothetical once we have a more forgiving table layout in place.

regards, tom lane

#12 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#7)
Re: Getting our tables to render better in PDF output

I wrote:

One problem with the rightarrow idea is that it's not rendering quite
right for me: it looks great in HTML, but in PDF it comes out flush
with the baseline, as you can see in the screenshot. Hopefully
there's a way to fix that that we can hide in the custom entity ...
but I have no idea how.

I poked at this a little bit, and found that I could get a pretty
decent-looking result if I hacked the .fo file to contain
"<fo:inline baseline-shift="10%">→</fo:inline>" rather than a bare
right arrow. (See attached screenshot, wherein the last rightarrow
was fixed this way but the others weren't.) However, I do not
have much of a clue as to how such a fix might be injected into
our stylesheets --- anybody have a suggestion?
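
For reference, the manual hack described above amounts to the following text substitution, shown here as a minimal Python sketch that post-processes .fo text. This is only an illustration of the edit and not a proposal for how the stylesheets should do it. It blindly rewrites every bare arrow in the text it is given, and the function name is an assumption.

```python
import re

ARROW = "\u2192"
OPEN_TAG = '<fo:inline baseline-shift="10%">'
WRAPPED = f"{OPEN_TAG}{ARROW}</fo:inline>"

# Match arrows that are not already directly preceded by our wrapper tag,
# so that running the substitution twice does not wrap twice.
BARE_ARROW_RE = re.compile("(?<!" + re.escape(OPEN_TAG) + ")" + ARROW)


def raise_arrows(fo_text: str) -> str:
    """Wrap each bare right arrow in an fo:inline with a baseline shift."""
    return BARE_ARROW_RE.sub(WRAPPED, fo_text)


before = "age(timestamp) \u2192 interval"
after = raise_arrows(before)
print(after)

# Applying the substitution again leaves the text unchanged.
print(raise_arrows(after) == after)
```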

regards, tom lane

Attachments:

baseline-shift.png (image/png)

#13 Alexander Lakhin
exclusion@gmail.com
In reply to: Tom Lane (#12)
Re: Getting our tables to render better in PDF output

Hello Tom,

16.02.2020 23:07, Tom Lane wrote:

I poked at this a little bit, and found that I could get a pretty
decent-looking result if I hacked the .fo file to contain
"<fo:inline baseline-shift="10%">→</fo:inline>" rather than a bare
right arrow. (See attached screenshot, wherein the last rightarrow
was fixed this way but the others weren't.) However, I do not
have much of a clue as to how such a fix might be injected into
our stylesheets --- anybody have a suggestion?

Please look at the XSLT template for processing .fo before calling fop.
Maybe this can be done with just the existing stylesheet-fo.xsl, I'll
try to research this later.

Best regards,
Alexander

Attachments:

shift_arrow_up.patch (text/x-patch, +46 -0)

#14 Alexander Lakhin
exclusion@gmail.com
In reply to: Alexander Lakhin (#13)
Re: Getting our tables to render better in PDF output

17.02.2020 00:21, Alexander Lakhin wrote:

Hello Tom,

16.02.2020 23:07, Tom Lane wrote:

I poked at this a little bit, and found that I could get a pretty
decent-looking result if I hacked the .fo file to contain
"<fo:inline baseline-shift="10%">→</fo:inline>" rather than a bare
right arrow. (See attached screenshot, wherein the last rightarrow
was fixed this way but the others weren't.) However, I do not
have much of a clue as to how such a fix might be injected into
our stylesheets --- anybody have a suggestion?

Please look at the XSLT template for processing .fo before calling fop.
Maybe this can be done with just the existing stylesheet-fo.xsl, I'll
try to research this later.

I've managed to simplify the patch a little by incorporating those
templates in stylesheet-fo.xsl.

Maybe it's better to use the same formatting as in the docbook xsl
template (see docbook/stylesheet/docbook-xsl/xhtml-1_1/inline.xsl).
There "$menuchoice.menu.separator" is enclosed in <fo:inline
font-size=".75em" font-family="{$symbol.font.family}">...</fo:inline>
and you can see the effect on page 536 (IPC parameters can be set in the
System Administration Manager (SAM) under Kernel Configuration →
Configurable Parameters.)

Yet another possibility is to use the docbook tags:
<funcdef><function>func()</function>
<returnvalue>int</returnvalue></funcdef>.
Then we can define the desired formatting for such markup (similar to
<menuchoice><guimenu>...</guimenu><guimenuitem>...</guimenuitem></menuchoice>).

Best regards,
Alexander

Attachments:

shift_arrow_up2.patch (text/x-patch; charset=UTF-8) +31-0

#15 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alexander Lakhin (#14)
Re: Getting our tables to render better in PDF output

I set this idea aside during the final v13 commitfest, but I figure that
it's fine to work on documentation improvements during feature freeze,
so I'm going to try to push it forward over the next few weeks.

Barring objections, I want to commit more or less what I posted at [1],
verify that it looks decent on the website, and then incrementally
convert the rest of our function/operator tables to the new style.
It's too big a job to get done in one commit, but a table or two at
a time seems like a reasonable approach. After the table format
conversion is finished we can take a look at how much of a
bad-line-breaks issue we still have, and decide what to do about that.

First though, we need to nail down exactly what markup to use.

Alexander Lakhin <exclusion@gmail.com> writes:

Maybe it's better to use the same formatting as in the docbook xsl
template (see docbook/stylesheet/docbook-xsl/xhtml-1_1/inline.xsl).
There "$menuchoice.menu.separator" is enclosed in <fo:inline
font-size=".75em" font-family="{$symbol.font.family}">...</fo:inline>
and you can see the effect on page 536 (IPC parameters can be set in the
System Administration Manager (SAM) under Kernel Configuration →
Configurable Parameters.)

Yeah, I see that that uses a right-arrow and it looks quite decent in
both HTML and PDF renderings. So we ought to borrow those markup details
rather than solving the problem from scratch.

Yet another possibility is to use the docbook tags:
<funcdef><function>func()</function>
<returnvalue>int</returnvalue></funcdef>.
Then we can define the desired formatting for such markup (similar to
<menuchoice><guimenu>...</guimenu><guimenuitem>...</guimenuitem></menuchoice>).

I looked into this. It appears that <funcdef> is fairly tightly tied
to C function declaration syntax, plus it sounds like it might get
deprecated in future docbook versions. So I don't want to use that.
But we could use <returnvalue> which seems to be defined independently
of <funcdef>, and isn't being used in our docs at present. I found
by experimentation that this doesn't work:

<returnvalue><type>date</type></returnvalue>

(it complains that these two tag types can't be nested); but this does:

<returnvalue>date</returnvalue>

So if we can get <returnvalue> to both insert a right arrow and switch the
font to match <type>'s choice, this would work more or less decently, and
it's probably cleaner than the bare-entity-reference approach I posted
before. I don't have the XSL skills to get that to work though.
Anyone want to help out?

regards, tom lane

[1]: /messages/by-id/23574.1581555393@sss.pgh.pa.us

#16 Corey Huinker
corey.huinker@gmail.com
In reply to: Tom Lane (#15)
Re: Getting our tables to render better in PDF output

On Sat, Apr 11, 2020 at 4:51 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

I set this idea aside during the final v13 commitfest, but I figure that
it's fine to work on documentation improvements during feature freeze,
so I'm going to try to push it forward over the next few weeks.

If it's ok to work on doc patches during the feature freeze, and if we're
already tweaking function documentation, would it be possible to add in
anchor ids to function definitions so that we could reference specific
functions (or rather the family of functions that share a name), like this:
https://www.postgresql.org/docs/devel/functions-datetime.html#FUNCTION-DATE-PART
or similar? I tried it out just now, and the anchoring works, but there's
no obvious place to acquire the anchored link, so presumably we'd
anchor-ize the function name itself.

#17 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Corey Huinker (#16)
Re: Getting our tables to render better in PDF output

Corey Huinker <corey.huinker@gmail.com> writes:

If it's ok to work on doc patches during the feature freeze, and if we're
already tweaking function documentation, would it be possible to add in
anchor ids to function definitions so that we could reference specific
functions (or rather the family of functions that share a name), like this:
https://www.postgresql.org/docs/devel/functions-datetime.html#FUNCTION-DATE-PART
or similar? I tried it out just now, and the anchoring works, but there's
no obvious place to acquire the anchored link, so presumably we'd
anchor-ize the function name itself.

Don't have a strong opinion about that, but it'd sure be a lot of new
anchors. Is that going to be a problem for the docs toolchain? If
the anchors are attached to individual function names rather than
sections or paragraphs, do they actually work well as link references?
(I'm particularly wondering how an <xref> would render.)

regards, tom lane

#18 Corey Huinker
corey.huinker@gmail.com
In reply to: Tom Lane (#17)
Re: Getting our tables to render better in PDF output

On Sat, Apr 11, 2020 at 6:41 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Corey Huinker <corey.huinker@gmail.com> writes:

If it's ok to work on doc patches during the feature freeze, and if we're
already tweaking function documentation, would it be possible to add in
anchor ids to function definitions so that we could reference specific
functions (or rather the family of functions that share a name), like this:

https://www.postgresql.org/docs/devel/functions-datetime.html#FUNCTION-DATE-PART

or similar? I tried it out just now, and the anchoring works, but there's
no obvious place to acquire the anchored link, so presumably we'd
anchor-ize the function name itself.

Don't have a strong opinion about that, but it'd sure be a lot of new
anchors.

True, but it would be a lot better than pointing a person to a page that
has 20+ functions defined on it.

Is that going to be a problem for the docs toolchain? If
the anchors are attached to individual function names rather than
sections or paragraphs, do they actually work well as link references?
(I'm particularly wondering how an <xref> would render.)

So I can't speak to any scalability issues for adding a bunch of refs, but
I did try this out for justify_days() (diff attached) and here's what I
found:
* <link linkend="function-justify-days">justify_days</link>
This made a link, in the same font as any other link ref.
* <xref linkend="function-justify-days"/>
This made a link that looks exactly like the previous one, with the text
"justify_days", so if we're fine with the font change, we could use that.
* <link linkend="function-justify-days"><function>justify_days</function></link>
This made the link we want in the function font.

The docbook spec doesn't allow an xref inside a function tag, and no tags
at all can be inside an xref.
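
For illustration, the pattern described above might look roughly like this
in the doc sources. This is only a sketch: placing the id on the table entry
is an assumption (the attached diff is not reproduced here), and only the
linkend value and the link form come from the experiment above.

```xml
<!-- Hypothetical anchor: an id on the table cell that documents the function. -->
<entry id="function-justify-days">
 <literal><function>justify_days(<type>interval</type>)</function></literal>
</entry>

<!-- A reference from running text that keeps the function font. -->
<para>
 See <link linkend="function-justify-days"><function>justify_days</function></link>.
</para>
```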

Attachments:

link-one-function.diff (text/x-patch; charset=US-ASCII) +4-2

#19 Jürgen Purtz
juergen@purtz.de
In reply to: Tom Lane (#15)
Re: Getting our tables to render better in PDF output

On 11.04.20 22:51, Tom Lane wrote:

Yet another possibility is to use the docbook tags:
<funcdef><function>func()</function>
<returnvalue>int</returnvalue></funcdef>.
Then we can define the desired formatting for such markup (similar to
<menuchoice><guimenu>...</guimenu><guimenuitem>...</guimenuitem></menuchoice>).

I looked into this. It appears that <funcdef> is fairly tightly tied
to C function declaration syntax, plus it sounds like it might get
deprecated in future docbook versions.

funcsynopsis, funcdef, function, ... remain valid in DocBook 5, see:
https://tdg.docbook.org/tdg/5.1/funcsynopsis.html . There is even an
option to distinguish between K&R and ANSI style during rendering:
<?dbhtml funcsynopsis-style='kr'?>

Kind regards, Jürgen Purtz

#20 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#15)
Re: Getting our tables to render better in PDF output

I wrote:

So if we can get <returnvalue> to both insert a right arrow and switch the
font to match <type>'s choice, this would work more or less decently, and
it's probably cleaner than the bare-entity-reference approach I posted
before. I don't have the XSL skills to get that to work though.
Anyone want to help out?

I educated myself a teensy bit about XSL, and unless I'm missing
something, this is really pretty darn trivial; the attached seems
to do the trick.

I experimented with the markup from <guimenuitem> and decided that
I didn't like their choice of a smaller font size in this context;
it looks better to me to leave the arrow full-size. The important
thing to learn from that precedent seems to be that we have to
specify the font correctly, as indeed is mentioned in the docbook
documentation. So it seems to work well to just use

<fo:inline font-family="{$symbol.font.family}">&#x2192; </fo:inline>

(The extra space seems to be necessary, else the arrow ends up
adjacent to the type name.)
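
As a rough sketch of what such a template could look like in an FO
customization layer (this is an illustration, not the attached patch): the
fo:inline line is the one quoted above, while the stylesheet wrapper and the
use of DocBook XSL's inline.monoseq named template to get a monospace font
for the type name are assumptions.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:fo="http://www.w3.org/1999/XSL/Format">

  <!-- Sketch only: render <returnvalue> as a right arrow followed by the
       type name. Assumes the stock DocBook XSL FO stylesheets are imported
       elsewhere, so that $symbol.font.family and the named template
       inline.monoseq are available. -->
  <xsl:template match="returnvalue">
    <!-- Arrow in the symbol font; the trailing space keeps the arrow
         from sitting right against the type name. -->
    <fo:inline font-family="{$symbol.font.family}">&#x2192; </fo:inline>
    <!-- Assumption: monospace output is close enough to <type>'s font. -->
    <xsl:call-template name="inline.monoseq"/>
  </xsl:template>

</xsl:stylesheet>
```

With something like that in place, markup such as
<returnvalue>date</returnvalue> would be expected to come out as an arrow
followed by "date" in the PDF build.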

So I'm pretty happy with this implementation and will push forward.

regards, tom lane

Attachments:

markup-for-returnvalue.patch (text/x-diff; charset=us-ascii) +11-0

#21 Alexander Lakhin
exclusion@gmail.com
In reply to: Tom Lane (#20)

#22 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alexander Lakhin (#21)

#23 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Corey Huinker (#18)

#24 Corey Huinker
corey.huinker@gmail.com
In reply to: Tom Lane (#23)

#25 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Corey Huinker (#18)

#26 Corey Huinker
corey.huinker@gmail.com
In reply to: Tom Lane (#25)

#27 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Corey Huinker (#26)

#28 Corey Huinker
corey.huinker@gmail.com
In reply to: Tom Lane (#27)

#29 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Corey Huinker (#28)