Implement TeX's fraction and script alignment#31046
Implement TeX's fraction and script alignment#31046QuLogic wants to merge 14 commits intomatplotlib:text-overhaulfrom
Conversation
e9de85f to
b8b8063
Compare
| num_shift_up = consts.num2 * x_height | ||
| den_shift_down = consts.denom2 * x_height | ||
| min_clr = rule | ||
| delta = rule / 2 |
There was a problem hiding this comment.
Compared to #22852, I added the intermediate delta variable to align a bit more with the TeX book algorithm.
There was a problem hiding this comment.
I I find it surprisingly difficult to map this code onto Knuth's description at nodes 743~748, perhaps it could be made clearer by trying to keep as many variable names as possible (even though I certainly understand that some of them are being inlined)?
There was a problem hiding this comment.
Do you mean the calculations of num_clr and den_clr just below these lines? The apparent difference in the logic arises from the fact that TeX and Mathtext are calculating different things: while TeX determines the amount to independently move the numerator and denominator by, Mathtext calculates the space above and below the fraction line to pack into a Vlist.
For instance,
Pascal presumably did not have min and max functions, which may explain the gymnastics used in the book. My opinion is that diverting from the current code might obfuscate the meaning: if I look at num_shift_up - cnum.depth - axis_height - delta, I instinctively read it as, 'Go up to baseline of the numerator, come down to its descender, and subtract the height of the axis and half its own vertical thickness,' which yields the space between the descender of the numerator and the fraction line.
There was a problem hiding this comment.
Thank you, the clarification is very helpful. Perhaps the following
diff --git i/lib/matplotlib/_mathtext.py w/lib/matplotlib/_mathtext.py
index 3ab5aaa8be..04e0f90ab4 100644
--- i/lib/matplotlib/_mathtext.py
+++ w/lib/matplotlib/_mathtext.py
@@ -2692,14 +2692,14 @@ class Parser:
if style is self._MathStyle.DISPLAYSTYLE:
num_shift_up = consts.num1 * x_height
den_shift_down = consts.denom1 * x_height
- min_clr = 3 * rule
+ clr = 3 * rule # The minimum clearance.
else:
num_shift_up = consts.num2 * x_height
den_shift_down = consts.denom2 * x_height
- min_clr = rule
+ clr = rule # The minimum clearance.
delta = rule / 2
- num_clr = max(num_shift_up - cnum.depth - axis_height - delta, min_clr)
- den_clr = max(axis_height - delta + den_shift_down - cden.height, min_clr)
+ num_clr = max((num_shift_up - cnum.depth) - (axis_height + delta), clr)
+ den_clr = max((axis_height - delta) - (cden.height - den_shift_down), clr)
vlist = Vlist([cnum, # numerator
Vbox(0, num_clr), # space
Hrule(state, rule), # rule(and ditto for the case without fraction rule) would be clearer, in that it follows more directly Knuth's variable names and groupings, while maintaining the higher degree of abstraction provided by python?
|
WRT the original #22852 (comment):
Also unsure about this one; it appears that one square root sign is sized a bit different?
Seems to be the case, but apostrophes are handled by the
This is no longer failing; I think this may be one of the rounding issues for the initial character found (and fixed) in #30059. |
|
TODO: Since the height of fractions is a little bigger, I think I may need to tweak the |
|
I've gone through the sub/super script changes and they seem fine as well, other than a couple known shortcuts. I'm only uncertain about the constants; they may have only been chosen to minimize image changes instead of matching TeX. For CM, it was also calculated with our current hinting settings and not the defaults that have been switched to in the |
tacaswell
left a comment
There was a problem hiding this comment.
Modulo re basing and shortening the constants.
b8b8063 to
b6c10a0
Compare
b6c10a0 to
75c11c0
Compare
|
While looking at the TeX algorithms, I found out about Comparing the values, we have:
Remember that we multiply everything by x-height, so here I've divided that metric out in the last column. Also, we use So, the numbers we have for For the fraction metrics ( Before we had this (with I haven't found any corresponding TFM files with the same metrics in them for the other fonts we have. I believe modern LaTeX can synthesize these directly from the font, and I do see some embedded MATH tables in there, but I haven't worked out the conversions for those yet. I've pushed the changes to the Computer Modern constants as two commits for ease of review. |
I reported this as a bug back then (#23474). Changing the script rendering logic is what exposed it. After merging a fix (#23482), the test passed even with the updated logic.
🎯💯
I remember being puzzled that using the same constants (from either Computer Modern or Latin Modern—I can't remember which one I had checked) didn't yield neatly aligned denominators. The multiplication by the x-height appears to be the key, which didn't strike me then. |
lib/matplotlib/_mathtext.py
Outdated
| sup1 = 0.79716796875 | ||
| sub1 = 0.354296875 | ||
| sub2 = 0.5314453125 | ||
| supdrop = 0.386108 / 0.430555 |
There was a problem hiding this comment.
You can get more accurate (effectively fixed-point representation) values from the tfm file, as tftopl prints values after scaling and drops some decimal points, e.g. for cmsy10 I get
slant=262144, space=0, space_stretch=0, space_shrink=0, x_height=451470, quad=1048579, extra_space=0 num1=709370, num2=412858, num3=465286, denom1=719272, denom2=361592, sup1=432949, sup2=380520, sup3=302922, sub1=157286, sub2=259226, supdrop=404864, subdrop=52429, delim1=2506096, delim2=1059062, axis_height=262144
(values need to be scaled by 2**20).
I extended the Tfm class to read these values at anntzer@a360989 if you want to try your hand at it (we don't need to decide now whether we actually want to integrate this functionality into the Tfm class).
Note that this also allows reading the actual definition of 1em and 1ex ("quad" and "x_height"), which should be better than the current approach of trying to guess the values (see @tfpf's comment
I reported this as a bug back then (#23474). Changing the script rendering logic is what exposed it. After merging a fix (#23482), the test passed even with the updated logic.
the linked threads (#23474 (comment)), and https://tex.stackexchange.com/a/98139).
Perhaps also worth fixing this properly?
There was a problem hiding this comment.
Ah, that's very useful, and might be something to implement along with stuff needed for #31048. For now, I've just put the numbers you have read directly into the file. They're very close and it only affects one image.
For quad / x_height, I can take a look, but we can also make that change independently from this one, I think?
There was a problem hiding this comment.
Sure (though it'll probably again break all images, so it has to be done in the text-overhaul branch too).
There was a problem hiding this comment.
BTW, I just noticed that fontTools has a TFM parser; I'm not sure how extensive it is.
There was a problem hiding this comment.
From a very quick look it looks complete; you can run python -mfontTools.tfmLib foo.tfm and in particular the params above will be printed (again they are scaled by 2**20 but all the decimal points are given so you can just remultiply back).
|
re: the +/-rule issue. Empirically it looks like the +/-rule can be removed by changing how Rules are being rendered; i.e. the following patch appears to work: diff --git i/lib/matplotlib/_mathtext.py w/lib/matplotlib/_mathtext.py
index f45e1044bc..bfe3f3ec29 100644
--- i/lib/matplotlib/_mathtext.py
+++ w/lib/matplotlib/_mathtext.py
@@ -1424,7 +1424,7 @@ class Rule(Box):
def render(self, output: Output, # type: ignore[override]
x: float, y: float, w: float, h: float) -> None:
- self.fontset.render_rect_filled(output, x, y, x + w, y + h)
+ self.fontset.render_rect_filled(output, x, y - h, x + w, y)
class Hrule(Rule):
@@ -2687,9 +2687,9 @@ class Parser:
den_clr = max(axis_height - delta + den_shift_down - cden.height, min_clr)
# Possible bug in fraction rendering. See GitHub PR 22852 comments.
vlist = Vlist([cnum, # numerator
- Vbox(0, num_clr - rule), # space
+ Vbox(0, num_clr), # space
Hrule(state, rule), # rule
- Vbox(0, den_clr + rule), # space
+ Vbox(0, den_clr), # space
cden # denominator
])
vlist.shift_amount = cden.height + den_clr + delta - axis_heightIt would seem credible to me that we got the definition of boxes mixed up at some point (especially due to the mix between upwards y's and downwards y's, which always confuses me), though I haven't actually tracked that down. Edit: I convinced myself that the patch is correct. At the call site matplotlib/lib/matplotlib/_mathtext.py Lines 1706 to 1709 in 51fbfc4 cur_v + off_v to cur_v + off_v - rule_height (this is why cur_v is shifted by + rule_height just before; also at that point some print debugging indicates that y's go downwards), so Rule.render should indeed call render_rect_filled from y - h to y.
|
|
Happy to see you're still around @tfpf and thanks for answering any questions we have.
If I make this change, it affects 65 test images. Most of them do not contain fractions, but |
Yes, I didn't mention this, but: @tfpf: I'm sorry your (nice!) work didn't go in last time, turns out we needed an even bigger overhaul of many parts to fix all the issues; thanks for coming back to help. I'll also take that opportunity to thank @QuLogic for keeping all the moving parts together and taking care of the more-or-less complete patches I keep throwing around 😅
I suspect so? Not that I looked. |
b2afc68 to
df479e9
Compare
There was a problem hiding this comment.
Various small points remain (reorder/rewrite the num/den clearance calculations to follow Knuth more closely, whether to get x-height before or after shrinking, x-height/axis-height detection) but I'll let @QuLogic decide whether he wants to do that in this PR (self-merge is fine I think?) or open separate issues to track them.
|
By the way, https://www.tug.org/TUGboat/tb27-1/tb86jackowski.pdf has some nice figures explaining fraction layout, and https://www.tug.org/~vieth/papers/bachotex2008/math-font-paper.pdf some more discussion on inferring font parameters (though neither the x-height nor the axis-height...). |
|
According to The LaTeX Font Catalogue, I don't think DejaVu Sans provides math support in LaTeX, so I'm a bit uncertain how to compare our math constants with it. Opening up the Instead, I've managed to find values in FontForge's TeX tables in a slightly-hidden settings page, then dividing by the xheight in that same setting page (1120), we get:
Before, we had: Note that the right side of the figure is DejaVu Serif has the same metrics, except the xheight is 1063 instead, so I won't post the table over again. Instead, just before and after images: |
|
There is a dejavu math font: https://www.gust.org.pl/projects/e-foundry/tex-gyre-dejavu-math / https://ctan.org/pkg/tex-gyre-math-dejavu?lang=en but it's only a serif font :/ not sure how well the constants would compare? |
|
Well, we do have DejaVu Serif as well with almost the same metrics; only the xheight is different at 1063 instead of 1120 (updated above with those images as well.) But it looks like the TeX Gyre fonts are for |
#31064 enables to use the TeX Gyre fonts from within usetex. Currently, it does not dynamically set the "constants" but uses the defaults, but one could in principle read the otf MATH table. |
As described in *TeX: the Program* by Don Knuth. New font constants are set to the nearest integral multiples of 0.1 for which numerators and denominators containing normal text do not have to be shifted beyond their default shift amounts at font size 30 in display and text styles. To better process superscripts and subscripts, the x-height is now always calculated instead of being retrieved from the font table (which was the case for Computer Modern); the affected font constants have been changed. A duplicate test was also fixed in the process.
These values are taken from `cmsy10.tfm`, and divided by the x-height in that output to match the scale used in Matplotlib. Also, split the `supdrop`/`subdrop` constants to match with TeX's algorithm.
These values are taken from `cmsy10.tfm`, and divided by the x-height in that output to match the scale used in Matplotlib.
39afc75 to
a77b250
Compare










PR summary
This is a rebase of #22852 by @tfpf. However, since we are planning to refresh the test images already, this reverts the change to move many mathtext images to SVG only (and I believe fixes some duplicated tests due to incorrect conflict resolution). Now, the only test change is to a nested-
\fractest that is a duplicate and now is a nested-\dfractest.This is based on #30059 plus all the current test image changes, so that you can look at the second-last commit for this change and the last commit to review image changes from only this PR. I believe it does a fairly good job of fixing the fraction bar alignment issue that came up in #30059.
I have reviewed the fraction and sub/super script implementation with reference to the TeX book, but have not yet finished reviewing the font constants.
Fixes #18086
Fixes #18389
Fixes #22172
PR checklist