Float compiler-synthesized function applications #3915

JordanMartinez merged 1 commit into purescript:master from
Conversation
7f0cf1e to 3d0d6d9
One potential benchmark might be the one in
Thanks! I will try that as soon as I fix the TCO regression in
3d0d6d9 to 65e90ff
I could not measure a significant time difference compiling. Running Bench.purs produced the following output:

0.13.8:

0.13.8 + patch:

Everything is significantly faster except for Run, which looks like it's about the same.
I wouldn't expect an improvement for. In particular, this brings mtl/identity into the same ballpark as Run, which indicates that its overhead is no longer in dictionary allocation. This is pretty much in line with comparing freer libraries with mtl in Haskell without inlining optimizations.
65e90ff to 4bf7699
4bf7699 to 33b66bc
I tried out this change with the purerl backend; together with some caching of typeclass dictionary creation there was a significant performance benefit.
33b66bc to 3898203
- To use memoization, define PURERL_MEMOIZE (e.g. in rebar.config, e.g. to true) and enable the memoization parse transform.
- The purs change to hoist compiler-synthesised function applications (purescript/purescript#3915) is required to make much use of this.
416b160 to 56d6837
The Windows CI build failed with this error:
Yeah, I'm trying to figure out why. My current guess is line endings?
6a2eb56 to a272d41
I've updated this PR to take advantage of the renamer improvements in #4096; the CSE pass now floats definitions all the way to the outermost scope, when possible, which improves sharing and decreases the number of IIFEs. I've also finally added a few tests that verify that the optimization happens by comparing compiler output to golden files. Other optimizations can also be tested in this suite in the future.
Nice!
@@ -0,0 +1 @@
*.out.js -text
This, for anyone who's curious, is my solution to the tests/purs/optimize tests failing on Windows. Git has some sort of heuristic for determining whether a file is text—I suspect it's based on the extension—and if it believes a file is text, by default (at least on the GitHub runners?) it will normalize the newlines in that file to the operating system's default when the file is checked out, and map them back on commit. (Obviously, this breaks tests if it happens to golden test files.) .gitattributes files can be used to tell Git, among other things, that certain files are not text and therefore shouldn't have this behavior.
There are other approaches—we could use git config to disable newline conversion globally on GitHub runners, for example, but that wouldn't fix things for developers on Windows. I'm refreshing some rather old memories about all this now so I don't know if this is the best possible solution, but it seems pretty good.
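For reference, the per-machine `git config` alternative mentioned above would look something like the following. This is only a sketch (demonstrated in a throwaway repository so no real configuration is touched); the `.gitattributes` approach taken in this PR is still preferable because it travels with the repository.

```shell
# Sketch of the git-config alternative to .gitattributes: disable Git's
# end-of-line conversion. Demonstrated in a throwaway repo so nothing
# persistent is modified; drop the mktemp/init lines to apply it for real.
tmp=$(mktemp -d)
cd "$tmp"
git init -q .
git config core.autocrlf false   # never convert LF <-> CRLF on checkout/commit
git config core.autocrlf         # prints the value back: false
```

Note that `core.autocrlf` is per-repository (or per-user with `--global`), so it fixes CI runners but not contributors' clones, which is exactly the drawback described above.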
We're doing the same thing for the golden layout tests here:
(Maybe consider also adding -merge here)
Oh thanks, that makes me feel better about this. Must not be extension-based then; probably it's sniffing out non-printable characters or something.
I don't think -merge is needed here; I think these files will merge pretty cleanly.
The -merge is to make sure Git doesn't automatically merge when pulling. Golden files always need to be accepted wholesale, trying to "merge" them by combining lines from multiple branches is basically always wrong.
Isn't the important question whether the changes made to the files have good locality? If branch A changes the way imports are encoded in JS, and branch B changes the way that TCO loops get compiled, both of those branches change the compiled JS output but in different parts of the file, and Git's merge algorithm will probably successfully produce the golden file which should result from the combination of the features, right?
a272d41 to b26f000
b26f000 to cdaa8bc
Just FYI: there are 3 merge conflicts here. The first two are obvious, but the third one touches code I'm not familiar with, so I didn't resolve the conflict here.
cdaa8bc to 0eab8ce
Thanks; sorted!
Work on this is paused pending a resolution to #4179.
61e2b7e to f52c17c
f52c17c to eb4ef51
Update on this thread: This PR has (passing) tests for both of the cases @nwolverson submitted. Having gotten #4283 in first means that I no longer need to change anything in this PR about how binding groups work; I'm pretty sure it was my previously-necessary tinkering with the binding groups desugaring that caused this class of issues. So I think this thread is addressed—although I don't know if @nwolverson will be able to confirm until the purerl backend is compliant with #4283.
Glob >=0.10.1 && <0.11,
haskeline >=0.8.2 && <0.9,
language-javascript ==0.7.0.0,
lens >=4.19.2 && <4.20,
This is tangential to the PR, but I thought I'd still bring it up. We already depend on microlens. Adding yet another lens library as a dependency makes the binary larger. I'm assuming it's being added here because microlens doesn't suffice; if that's the case, could we refactor the codebase to remove microlens so that we only depend on one lens library?
monoidal-containers (which I added in #4283 before realizing this fact) already depends on lens; if lens and its dependencies (most of which we were already pulling in even before monoidal-containers) result in too much binary bloat, I'll need to inline the parts of monoidal-containers we're using. I don't actually know how much of a difference in binary size we're talking about here, with all the optimizations on; do you?
We can absolutely replace the various microlens usages with lens—I've done so in a local branch—but I figured I'd submit that in a separate PR if we merge this and decide we want to keep lens.
My comment comes mainly from reading a blog post a couple of years ago pointing out an issue like this: a project depended on a small lens library but transitively pulled in all the other lens libraries, such that the author wondered why one didn't just depend on lens and call it a day (or something like that).
Could you clarify for me what the difference between microlens and lens is (if any) besides microlens just being smaller? For example, do they use the same underlying encoding for optics?
My understanding, based on little more than a skimming of microlens's README and some unattributable background impressions, is that the implementation of microlens is more or less directly lifted from lens; the main if not only difference is how much API and instances are provided and how many dependencies are taken.
instance Ord k => Semigroup (PluralityMap k) where
  PluralityMap l <> PluralityMap r =
    PluralityMap $ M.mapWithKey (\k -> (|| k `M.member` r)) l `M.union` r
Just to clarify, is this

PluralityMap $ (M.mapWithKey (\k -> (|| k `M.member` r)) l) `M.union` r

or this?

PluralityMap $ M.mapWithKey (\k -> (|| k `M.member` r)) (l `M.union` r)
I believe it's the second one.
I think it's the first, actually. Prefix function application has higher precedence than all operators, including backticked identifiers.
😄 Maybe we should add parentheses here just to ensure clarity? Though I don't think there's any difference between the two.
Yeah, the only difference would be performance, if any. I think I'd have to fight with hlint about those parentheses but I'll try and see.
(Oh, I shouldn't have tried to comment on this while going to bed. There is an important difference between the two: if l = PluralityMap M.empty and r = PluralityMap $ M.singleton 0 False, then in the first, l <> r == PluralityMap $ M.singleton 0 False, but in the second, l <> r == PluralityMap $ M.singleton 0 True.)
It'd be nice to have optimization tests for recursive computation in type classes, like so:
module Main where
import Prelude
import Prim.Row as R
import Prim.RowList as RL
import Type.Prelude (class IsSymbol, Proxy(..), reflectSymbol)
class FindKeysAux :: forall k. RL.RowList k -> Constraint
class FindKeysAux a where
findKeysAux :: Proxy a -> Array String
instance FindKeysAux RL.Nil where
findKeysAux _ = []
else instance (IsSymbol l, FindKeysAux r) => FindKeysAux (RL.Cons l t r) where
findKeysAux _ = [ reflectSymbol (Proxy :: Proxy l) ] <> findKeysAux (Proxy :: Proxy r)
findKeys :: forall r rl. RL.RowToList r rl => FindKeysAux rl => Proxy r -> Array String
findKeys _ = findKeysAux (Proxy :: Proxy rl)
findKeys1 = findKeys (Proxy :: Proxy (a :: Unit))
findKeys2 = findKeys (Proxy :: Proxy (a :: Unit, b :: Unit))
findKeys3 = findKeys (Proxy :: Proxy (a :: Unit, b :: Unit, c :: Unit))
findKeys4 = findKeys (Proxy :: Proxy (a :: Unit, b :: Unit, c :: Unit, d :: Unit))
findKeys5 = findKeys (Proxy :: Proxy (a :: Unit, b :: Unit, c :: Unit, d :: Unit, e :: Unit))
findKeys6 = findKeys (Proxy :: Proxy (a :: Unit))
findKeys7 = findKeys (Proxy :: Proxy (a :: Unit, b :: Unit))
findKeys8 = findKeys (Proxy :: Proxy (a :: Unit, b :: Unit, c :: Unit))
findKeys9 = findKeys (Proxy :: Proxy (a :: Unit, b :: Unit, c :: Unit, d :: Unit))
findKeys10 = findKeys (Proxy :: Proxy (a :: Unit, b :: Unit, c :: Unit, d :: Unit, e :: Unit))
7e39d09 to 283c057
@rhendric Should we pull the trigger and merge this? @purefunctor approved after your latest changes, so it's technically valid. Or are you still testing this or waiting for others to review it?
I'm on board with merging. I figure getting a prerelease of this out early in the release cycle is probably our best strategy for getting it tested by more people before it shows up in a release.
Agreed. We might want to make a Discourse announcement about this as well. That will get more people to try it out.
Rad, I'll write that up.
I've updated purerl to work with the latest, and I'm happy to say I see no problems in our projects - and it enables a significant performance improvement. Happy to see this merged!
…escript#3915)"" This reverts commit 7cba4ec.
This commit implements a common subexpression elimination pass during CoreFn optimization. The optimizer only targets `App`s that the compiler has created, meaning: applying functions to typeclass dictionaries, accessing superclasses from a typeclass dictionary, or creating new instances of `IsSymbol`. Furthermore, the pass only moves subexpressions if they can be moved to an outer function or if the subexpression appears more than once inside the scope to which it would be moved.
This is a start at what I'm told is the uncontroversial part of #687. Here's some before-and-after:
(before)
(after)
I don't know many public PureScript projects that have sufficiently long-running automated tests to be suitable for benchmarking, but purescript-bytestrings is one. On that project, this patch (backported to the 0.13.x branch) increased total compilation time by about 2% and decreased test execution time by about 20%. I would be happy to run the same benchmarks on any other project you suggest.
In order to verify that the optimization happens, a new (small) suite of tests in `tests/purs/optimize` has been added. In this folder, `*.purs` files are compiled and the result is compared to the contents of the corresponding `.out.js` file. These tests are potentially more expensive than usual in terms of maintenance burden, because they are sensitive to any change in the compiled output, but they can be used to prevent regressions in optimizations that are too subtle to be tested by other means.