Skip to content

Check roles of mutually recursive types#3860

Merged
hdgarrood merged 2 commits intopurescript:masterfrom
kl0tl:fix-issue-3857
Jul 11, 2020
Merged

Check roles of mutually recursive types#3860
hdgarrood merged 2 commits intopurescript:masterfrom
kl0tl:fix-issue-3857

Conversation

@kl0tl
Copy link
Copy Markdown
Member

@kl0tl kl0tl commented May 1, 2020

Here’s a tentative to fix #3857.

This turns role inference into a stateful process repeatedly updating roles in the environment until we infer the roles of all the data types belonging to a mutually recursive group. The idea is taken from GHC own implementation (see GHC.Tc.TyCl.Utils.inferRoles).

Unlike GHC we discover mutually recursive data types as we go so an inference step returns S.Set (Qualified (ProperName 'TypeName) instead of Bool and we infer our dependencies roles before taking another step, stopping when we don’t find any dependencies.

@kl0tl kl0tl force-pushed the fix-issue-3857 branch from cda2e89 to d60a158 Compare May 1, 2020 17:37
TypeConstructor _ t1Name ->
let t1Roles = inferRoles env t1Name
let roles = M.lookup t1Name roleEnv'
inferred = fromMaybe (repeat Phantom) roles
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In a couple of places you use Phantom as a default for a lookup. Can you clarify why this is desirable?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The phantom role is used as a default because it is the least restrictive, so later inference passes can only refine it to a more precise role. We always prefer the more restrictive role inferred for the same parameter in the Semigroup instance for RoleMap and updateRoles.

For instance, given these definitions:

data D1 a = D1 a
data D2 a = D2 (D1 a)

When inferring D2 roles we discover they depend on those of D1, but we don’t know them yet so we pick phantom for a and store that in our role environment. Then we infer the type parameter of D1 is representational, store that in our role environment and go back to D2. This time we know D1 roles so we can infer a is representational and update the role environment by taking the more restrictive of phantom and representational for a, which is representational.

@kl0tl kl0tl force-pushed the fix-issue-3857 branch 2 times, most recently from 3c9a777 to 3d55d09 Compare May 8, 2020 13:05
natefaubion
natefaubion previously approved these changes May 10, 2020
@hdgarrood
Copy link
Copy Markdown
Contributor

Thanks for picking this up! While reviewing, I realised that we have another bug in Coercible (#3867), and I think it might be preferable to address that one before this one, as that one might necessitate a more fundamental change in the mechanics of how we handle roles in the compiler.

@natefaubion natefaubion dismissed their stale review May 10, 2020 23:30

Removing the approval while we decide how to proceed.

@kl0tl kl0tl force-pushed the fix-issue-3857 branch from 3d55d09 to d5dde28 Compare May 16, 2020 14:58
@kl0tl
Copy link
Copy Markdown
Member Author

kl0tl commented May 16, 2020

I managed to rebase this on top of #3873!

I was able to follow GHC approach more closely now that we don’t have to discover dependencies between mutually recursive as we go anymore. I feared my previous approach wouldn’t propagate roles in some intricate cases but didn’t found any example of that. I find reassuring to do somewhat like GHC though.

@kl0tl kl0tl changed the title Don’t recurse forever when inferring roles of recursive types Check roles of mutually recursive types May 16, 2020
@hdgarrood
Copy link
Copy Markdown
Contributor

@kl0tl I'd like to try to get this PR merged next. Could you merge or rebase master so that it applies cleanly when you have a moment, please?

@kl0tl
Copy link
Copy Markdown
Member Author

kl0tl commented Jun 26, 2020

I rebased the PR.

Copy link
Copy Markdown
Contributor

@hdgarrood hdgarrood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great so far, thanks!

Nothing ->
pure $ fromMaybe (Phantom <$ tyArgs) inferredRoles

inferDataBindingGroupRoles
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than keeping this function "pure", I think it might be preferable (or at least better aligned with the way the rest of the type checker is written) for it to operate in MonadState CheckState m, and read and write directly from the actual type-checking environment, rather than doing its own thing at first and then later updating the environment after inferring/checking roles. I don't feel particularly strongly about this, though. @natefaubion do you have any thoughts on this?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really have much of an opinion. Is there some currying advantage to how the functions are arranged, and how it returns an additional lambda, that's exploited elsewhere?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The explicit floating of initialRoleEnv and inferredRoleEnv outside of the do block in checkDataBindingGroupRoles is supposed to allow the roles of the data declarations in the binding group to be inferred only once and then looked up repeatedly in the DataBindingGroupDeclaration case of Language.PureScript.TypeChecker.typeCheckAll. This isn’t something I’ve verified because I’m not familiar with Core inspection but I’m not sure GHC would be able to share the work otherwise.

I can try to store the inferred roles environment in the typechecker monad state and separate roles inference from roles checking in binding groups. This should be moree performant than computing a new roles map from scratch each time we need to check roles.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since neither @natefaubion nor I feel strongly I’m tempted to say let’s leave this as is for now. I’m happy to merge once there’s a comment about what the Any in inferDataDeclarationRoles means.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I guess it’s that we want to keep walking over all of the data declarations in the binding group until we reach a point where walking doesn’t change anything in the roleEnv, and the Any part of the result indicates whether something changed?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I lifted the role inference to the typechecker monad!

then inferDataBindingGroupRoles moduleName group roleEnv'
else roleEnv'

inferDataDeclarationRoles
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add some documentation about the meaning of the Any in the return type please?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a comment and aliased the Any to DidRolesChange to make things explicit.

@kl0tl kl0tl force-pushed the fix-issue-3857 branch 2 times, most recently from 522d6a1 to bf2c450 Compare June 28, 2020 12:56
@hdgarrood
Copy link
Copy Markdown
Contributor

@kl0tl Thanks for making these changes! On reflection, though, I'm less sure about this design. With this branch as it is now, we are storing roles in three separate places in the CheckState: in the CheckState's checkRoleEnv field, in the roleDeclarations field of the Environment, and in the types field of the Environment. To me, this feels like we have more mutable state than we ought to have - I was more hoping that we could use the types field of the Environment to store the intermediate roles during inference. However, I'd really like to avoid doing anything which would delay getting this fix merged, so at this point I would prefer to leave this as a possibility for changing later, and for now, we could just go back to 4ac6f09, if that sounds okay?

Speaking of minimising the amount of mutable state we have, it occurred to me that now we are dealing with data binding groups sensibly, we can probably make the roleDeclarations field obsolete by putting role annotations in the same binding group as the data types they refer to, so that the data we need is available at the time we actually need it, rather than having to store them in the environment separately first and pull them back out later. Again, that's something I'd like to address separately from this PR, though.

I'm not actually very keen on type DidRolesChange = Any, since to me, having a named type like that suggests a newtype or a separate data type which is different from Any; it suggests more type safety than there really is. As it happens we came quite close to merging a compiler warning which discouraged you from defining type synonyms like this: #2932. I'd prefer a proper data type or newtype like data DidRolesChange = RolesChanged | NoRolesChanged with the appropriate Semigroup instance, or newtype DidRolesChange = DidRolesChange Any, or alternatively just changing it back to Any (and bringing the comment across with it in any case).

@kl0tl kl0tl force-pushed the fix-issue-3857 branch from 2764c28 to 4b4f9c1 Compare July 2, 2020 22:34
@kl0tl
Copy link
Copy Markdown
Member Author

kl0tl commented Jul 2, 2020

I’ve rollback the lifting of the role inference state to the typechecker monad and the DidRolesChange type alias to unblock this pull request.

I don’t think we can get away with storing roles only in the types field of the environment because we need to infer the roles of a data type before inserting it in the environment and we need to store the role declarations separately in order to check they’re not less restrictive than the inferred roles but we can discuss this in another pull request.

@kl0tl kl0tl force-pushed the fix-issue-3857 branch from 4b4f9c1 to 7025e5a Compare July 2, 2020 22:36
Copy link
Copy Markdown
Contributor

@hdgarrood hdgarrood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, thank you!

_ ->
[]
lookupRoles env tyName =
fromMaybe [] $ M.lookup tyName (getRoleEnv env)
Copy link
Copy Markdown
Contributor

@natefaubion natefaubion Jul 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change seems a little excessive. Whenever you want to lookup a role, instead of looking it up directly, you are mapping over the entire type environment to build another environment that is a subset of the original. I understand that there may be some clever laziness involved (it isn't clear to me that you are getting that with Map and mapMaybe vs something like List), but this seems like an easy way to also accidentally introduce poor asymptotics. What was the motivation for this change? If it's only to share code, I think you could extract out the predicate for reuse rather than try to reuse getRoleEnv directly.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to share logic and thought the whole map wouldn’t actually be built thanks to laziness but I should have verified this assumption. I’ve reused a typeKindRoles function in both getRoleEnv and lookupRoles.

@hdgarrood hdgarrood merged commit 37ec737 into purescript:master Jul 11, 2020
@hdgarrood
Copy link
Copy Markdown
Contributor

🎉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Compiler hangs under some situations using Coercible

3 participants