Skip to content

Programmable type errors#2155

Merged
paf31 merged 3 commits into0.9from
programmable-type-errors
May 30, 2016
Merged

Programmable type errors#2155
paf31 merged 3 commits into0.9from
programmable-type-errors

Conversation

@paf31
Copy link
Copy Markdown
Contributor

@paf31 paf31 commented May 23, 2016

Fix #1561.

This will allow to provide better type errors for things like String not being a Semiring or Function not having decidable equality in Prelude.

You can now write

instance noStringSemiring
  :: Fail "String is not a Semiring. Perhaps you want ++ instead of +."
  => Semiring String where
    add  _ _ = ""
    mult _ _ = ""
    one      = ""
    zero     = ""

and "Hello" + "World" will give back

A custom type error occurred while solving type class constraints:

    String is not a Semiring. Perhaps you want ++ instead of +.

@jdegoes
Copy link
Copy Markdown

jdegoes commented May 23, 2016

@paf31 Is this just random syntax that happens to be the same as constraints, or pointing to a future extension of the type system?

@paf31
Copy link
Copy Markdown
Contributor Author

paf31 commented May 23, 2016

No, it's a new type level string. They will have kind Symbol. Fail is in Prim, and is parameterized by a Symbol. This means that the orphan instance restriction prevents us from defining Fail instances (which is a good thing, of course!)

We could reasonably add a library for working with type-level strings. A good starting point would be to have the compiler derive a class like

class KnownSymbol (sym :: Symbol) where
  knownSymbol :: Proxy sym -> String

and

toKnownSymbol :: forall r. String -> (forall sym. KnownSymbol sym => Proxy sym -> r) -> r

@jdegoes
Copy link
Copy Markdown

jdegoes commented May 23, 2016

type-level String

Please consider full-on singleton types for all literal values. 🙏

They will have kind Symbol.

👍 Awesome. Looks like a path toward first-classing field names in row types, yes? That'll buy us lenses which are polymorphic in the field name, among other miraculous things.

@paf31
Copy link
Copy Markdown
Contributor Author

paf31 commented May 23, 2016

Exactly. First-class labels are a much bigger change, but they'd definitely reuse this type machinery.

@coveralls
Copy link
Copy Markdown

Coverage Status

Coverage decreased (-0.07%) to 56.17% when pulling 6f703ac on programmable-type-errors into 469214f on 0.9.

@kritzcreek
Copy link
Copy Markdown
Member

I really like how little code is needed for this. Nifty!

@natefaubion
Copy link
Copy Markdown
Contributor

Are there cases where one would want to write a failing instance where one can't really come up with a default nonsense implementation? I guess you could just fill it in with coercions.

@hdgarrood
Copy link
Copy Markdown
Contributor

Yeah, I think I'd use unsafeCrashWith from Partial.Unsafe for member definitions.

@natefaubion
Copy link
Copy Markdown
Contributor

I think that only works if its behind a function guard, though.

@natefaubion
Copy link
Copy Markdown
Contributor

Oh I guess if you can't make an instance anyway, then it won't ever be created, and thus won't throw.

@paf31
Copy link
Copy Markdown
Contributor Author

paf31 commented May 23, 2016

@natefaubion Right, the creation is always guarded by a function which takes evidence for the Fail constraint. We might want to provide a function failed :: forall msg a. Fail msg => a somewhere too, or we could just pretend that function is defined by the class itself.

@texastoland
Copy link
Copy Markdown

texastoland commented May 23, 2016

Issues I see in priority order:

  1. Your example illustrates it only creates messages for type classes not resolved generally instead of specific to a method i.e. there's a considerable probability the user didn't call +.
  2. Is an implementation necessary at all? Can Fail ever not fail?
  3. Can messages be localized? To me it's less important for translation than grouping messages to establish consistent tone. I suppose you could create Fail aliases. Can they be formatted at all?
  4. Would something like presented at ICFP be too complicated to use in practice?

@paf31
Copy link
Copy Markdown
Contributor Author

paf31 commented May 23, 2016

Your example illustrates it only creates messages for type classes not resolved generally instead of specific to a function i.e. there's a high probability the user wasn't calling +.

True, so perhaps Semiring was not the best example. A better one might be something like Eq (a -> b) where we typically wouldn't be able to implement any of the type class members.

Why is an implementation necessary at all? Can Fail ever not fail?

All instances need implementations (this is just a global rule, which makes a good default, but maybe not here), but we can give a very succinct empty implementation using the failed function described above. Maybe we could look for a Fail instance in the context and skip the check, but I'm not sure it's worth it.

Messages can't be localized. To me it's less important for translation than grouping messages to establish consistent tone. I suppose you could create aliases?

Messages can be localized to the same extent than any other compile-time information can be. That is, via code generation.

Would something like presented at ICFP be too complicated to use in practice?

This is based on the ICFP presentation, it's just a bit of a simplified version since we don't support symbol concatenation or symbols for types. We could add those, but this gives a better bang for our buck in the short term, I think.

@texastoland
Copy link
Copy Markdown

texastoland commented May 23, 2016

perhaps Semiring was not the best example

I think that's the reason it's a perfect example that should be accommodated.

All instances need implementations

Not if they're derived so I suppose these could be derivable?

That is, via code generation.

But can they be formatted at all?

We could add those, but this gives a better bang for our buck in the short term, I think.

I figured just asking for clarification 👌

@natefaubion
Copy link
Copy Markdown
Contributor

natefaubion commented May 23, 2016

Maybe with constraint kinds you could add more information with a type annotation

  add _ _ = failWith (proxy :: Fail "Perhaps you meant append (<>)")

@texastoland
Copy link
Copy Markdown

Seems like it could address my first couple points!

@paf31
Copy link
Copy Markdown
Contributor Author

paf31 commented May 23, 2016

The sum of all constraints would have to be the same for both though. What would the instance look like?

@paf31
Copy link
Copy Markdown
Contributor Author

paf31 commented May 23, 2016

so I suppose these could be derivable?

I'd much rather just make it simple to write the empty instances.

@texastoland
Copy link
Copy Markdown

texastoland commented May 23, 2016

You'd rather have to write arbitrary code that never gets run even though PS is already deriving more complicated stuff?

What would the instance look like?

I don't know but messages for a method/constrained function seem like a more common use case than generic missing instance.

@paf31
Copy link
Copy Markdown
Contributor Author

paf31 commented May 23, 2016

I don't know but messages for a method/constrained function seem like a more common use case than generic missing instance.

Sorry, I don't understand. You can't put the Fail constraint on the class members themselves, they have to go on the instance itself. So either you break the class up (but this isn't going to happen in the case of Semiring, for example), or they have to share an error message.

@natefaubion
Copy link
Copy Markdown
Contributor

My thinking was that the instance always has a Fail constraint, and you could optionally provide a way to give more specific information by checking the bogus definition. My idea was a bit arbitrary, and may not make any sense, but the idea is that since you can't add additional constraints to the functions, maybe you could "build" the constraint via constraint kinds, and the type checker could pick up on that.

@paf31
Copy link
Copy Markdown
Contributor Author

paf31 commented May 23, 2016

I see, but you're still going to struggle to write the instance. If I understand, you want to say something like "this has a Fail subgoal, but the argument won't be solved until I know the member you want". That's going to be tricky :)

@paf31
Copy link
Copy Markdown
Contributor Author

paf31 commented May 24, 2016

Ok, this should be ready to review/merge now.

@texastoland
Copy link
Copy Markdown

I feel about this like your feedback about imports. It doesn't seem to address the user facing issue with a syntax that seems forced. Not only does it require an implementation that's never used but the closest thing we have to undefined isn't in Prelude and itself requires a string that wouldn't get used either. @jbrownson

@paf31
Copy link
Copy Markdown
Contributor Author

paf31 commented May 24, 2016

It doesn't seem to address the user facing issue with a syntax that seems forced.

What's the user-facing issue you're referring to?

@texastoland
Copy link
Copy Markdown

texastoland commented May 24, 2016

it only creates messages for type classes not resolved generally instead of specific to a method

messages for a method/constrained function seem like a more common use case than generic missing instance

^ The issue with missing instances is contextual to their function. Your original example exemplified that use case. I imagine Fail might work better at the method or constraint level.

@paf31
Copy link
Copy Markdown
Contributor Author

paf31 commented May 24, 2016

I imagine Fail might work better at the method or constraint level.

I'm not sure what constraint level means, but method level isn't possible with the current type system. Can you give a concrete proposal for what you're describing?

Would you be happier if it looked like

instance noStringSemiring
  :: Fail "String is not a Semiring. Perhaps you want (++) instead of (+)."
  => Semiring String where
    add  _ _ = failed
    mult _ _ = failed
    one      = failed
    zero     = failed

where failed :: forall msg a. Fail msg => a?

I agree that ideally it would just be

instance noStringSemiring
  :: Fail "String is not a Semiring. Perhaps you want (++) instead of (+)."
  => Semiring String

or something, but this won't be the final iteration of this feature. If it proves to be a success, we can go back and try to support that. Right now, it's overkill for what should be a relatively advanced feature, in my opinion.

@texastoland
Copy link
Copy Markdown

texastoland commented May 24, 2016

Would you be happier if it looked like

That's better but ostensibly the lesser issue if you plan to improve it later.

method level isn't possible with the current type system

That's why I'm apprehensive.

You suggested Eq as more exemplary implemented as:

class Eq a where
  eq :: a -> a -> Boolean

What's characteristically better about Eq is having only a single method like @tfausak's class hierarchy. Messages appear to be contextual to methods.

I'm not sure what constraint level means

abs was recently cleverly reimplemented as:

abs :: forall a. (Ord a, Ring a) => a -> a
abs x = if x >= zero then x else negate x

But if I tried to abs a type representing natural numbers I'd get a message about a missing Ring despite my intuition abs should be id for natural numbers. The motivation for its constraints get lost in its implementation. We should be able to fail an arbitrary function based on its constraints not just a method on its class constraints.

what should be a relatively advanced feature, in my opinion

A feature which aspires to provide library maintainers the capacity to contribute Elm-like compiler feedback based on monomorphic context though.

@texastoland
Copy link
Copy Markdown

Mulling over @natefaubion's suggestion but having no familiarity with dependent types it's strange the error message is parameterized on a single value versus a domain of values (e.g. nonempty strings)?

@hdgarrood
Copy link
Copy Markdown
Contributor

do instances which have a Fail constraint show up in docs right now? If so, I think we should modify Language.PureScript.Docs.Convert.Single so that it discards them.

@hdgarrood
Copy link
Copy Markdown
Contributor

@appshipit I get what you're saying about your intuition for abs, but I don't see how the "motivation for the constraints is getting lost in the implementation." This abs is the absolute value function for ordered rings (see the "Generalizations" subheading in https://en.m.wikipedia.org/wiki/Absolute_value). It's about as direct an encoding of that concept in a programming language you could possibly get, I think.

@texastoland
Copy link
Copy Markdown

texastoland commented May 24, 2016

Not criticizing abs it's quite novel! Just an illustration where I think instances may be the wrong target for messages.

@hdgarrood
Copy link
Copy Markdown
Contributor

Oh, I see what you're getting at now. Yeah, that's fair; I guess, for what you want, we'd need some kind of facility for some kind of pattern matching on of syntax trees, and supplying custom error messages for particular patterns, then (although I have no idea whether that is a good idea or even workable).

I think that instances probably are the right target for custom error messages for certain type-level constructions, though (take the example in that ICFP video you linked, for instance).

@paf31
Copy link
Copy Markdown
Contributor Author

paf31 commented May 24, 2016

I think the instance is the right place to put the message, given why the error occurs in the first place. We get the error because + is defined on Semiring and you tried to use String and + together. If you had used * instead, you'd get the same error, so we just have to write good error messages to deal with that. But it's a mistake to think of + and * and Semiring as somehow being separate things. In the type system, we can't talk about + or * in isolation.

@texastoland
Copy link
Copy Markdown

texastoland commented May 24, 2016

On the contrary it's clearly the coincidence of Semiring and +, Eq and ==, or abs and Ring that causes the error not the missing instance in isolation. Nate's idea looked closer to addressing the presented use cases (except the last one) to me. But locking down Fail's phantom type to a single literal hinders it.

@paf31
Copy link
Copy Markdown
Contributor Author

paf31 commented May 24, 2016

On the contrary it's clearly the coincidence of Semiring and +

There is no coincidence. Or rather, you can't have one without the other.

Nate's idea looked closer to addressing the presented use cases (except the last one) to me.

I've explained why that solution won't work with the current type system.

But locking down Fail's shadow type to a single literal hinders it.

I'm not sure what you're getting at here.

Can we nail down the issue here, or merge this? If there is a better solution which actually works with the type system, I've yet to see it.

@jbrownson
Copy link
Copy Markdown

I've explained why that solution won't work with the current type system.

Seems like if we can't represent a solid solution through overloading the type system, we shouldn't overload the type system?

@paf31
Copy link
Copy Markdown
Contributor Author

paf31 commented May 24, 2016

As I say, I don't yet understand why this isn't a solid solution.

With overlapping instances:

class StringHint a

instance failForString :: Fail "Use (<>) for string concatenation instead of (+)" => StringHint String

instance succeedOtherwise :: StringHint a

class Semiring a where
  add :: StringHint a => a -> a -> a
  ...

That's the first two points covered, with a slightly more verbose implementation and possibly other confusing errors.

Point is, it's a tool, and you can use it in different ways, with different trade-offs.

@jbrownson
Copy link
Copy Markdown

What about a new syntax (or some kind of pragma) for doing this rather than trying to fit it into the type system syntax (could leverage the type system under the hood of course). Just be explicit about what is being expressed in the code rather than encoding it in another existing syntax that I don't believe is really designed to express this. Makes it easier for someone who comes across this syntax to look it up/understand it, and we can express exactly the thing we want to express w/o having to fill in boilerplate.

@paf31
Copy link
Copy Markdown
Contributor Author

paf31 commented May 24, 2016

Potentially, if we can come up with a concrete proposal.

If the consensus is that this is not a good solution, I'm happy to leave it out of 0.9.1 or just close it entirely.

Edit: I'd rather not add new syntax though, personally. I see this as being intimately linked with the type class system, so I don't see it as trying to shoehorn a feature into the type system.

@rgrempel
Copy link
Copy Markdown

It would, of course, be nice to be able to tie programmable type errors to the use of individual functions, rather than the use of the instance as a whole, since it would permit more focused error messages (that is, error messages that can focus on the specific function which was used).

However, I'm not sure it will be essential in practice. I say that for a couple of reasons.

Most type classes do not have very many functions defined. I did a quick look at Prelude, and most of them have 1 function. A few have two or three, and the four in Semiring are the worst case. Of course, other type classes elsewhere may have a different distribution. However, I think it's a typical design goal to keep the number of functions in a type-class small. (The idea being to make each instance implement as little as possible, using generic functions to build on that).

Now, that does mean that one would want to word an instance-level failure message carefully ... something like "If you were using (+), you might want (<>) to append strings". That is, you'd want the "if", since the failure might not have been due to (+).

I suppose that the user might find it odd that the compiler doesn't know whether you were trying to use (+) or not -- and, of course, at some level the compiler does know that. However, I gather that it just doesn't know it in a way that is useable at this point -- that is, it's not exposed to the type system in a way that supports programmable type errors that can be defined in code.

The highlighted words above are meant to allude to a difference between this and what Elm does. The reason, I think, that Elm can more easily be more specific in tying errors to individual functions is that all of Elm's type-classes are built into the compiler itself, so the compiler has intimate, special-cased knowledge of exactly what they do. Here, the error messages are defined in Purescript code, and not special-cased, so we can use this mechanism for any old type-class. That's a huge win -- it is actually kind of miraculous.

The other thing that occurs to me is that even where there are, say, 4 functions in a type-class, it will often be the case that only one of them is "tempting". In the case of Semiring, for instance, you wouldn't really expect a programmer to multiply strings and expect something good to happen. It's really only (+) that is the tempting error. So, the fact that an error message focuses on (+) is not such a terrible thing (especially if the error message alludes to the other possibilities).

All of this to say that I think the glass is 95% full here.

@texastoland
Copy link
Copy Markdown

texastoland commented May 24, 2016

That's a huge win -- it is actually kind of miraculous.

I think empowering library maintainers to author error messages could leapfrog Elm in a way. I appreciate its simplicity. If there's any way to associate them with methods (or even non-class functions) I just believe it'd enable more relevant contextual feedback!

Can we nail down the issue here, or merge this?

I'd like another few days to cogitate. We'll meet up this weekend anyway?

@maddie927
Copy link
Copy Markdown

Any in person discussion should be shared with the people here though, so no one's excluded for not being able to make it to Catalina.

@paf31
Copy link
Copy Markdown
Contributor Author

paf31 commented May 24, 2016

We have other things to finish for 0.9.1, so we can leave this here for a few days, but let's try to wrap it up by the end of the week so that we can finish off the core libraries. Lots of things will be blocked by that.

@texastoland
Copy link
Copy Markdown

@jbrownson suggested a syntax inspired by foreign function imports. @paf31 raised concerns related to implementation, performance, and potential not to accommodate future type system features. I wanted to present it to nurture other ideas that may find a compromise with existing type class machinery:

-- would also work if add/append were defined monomorphically
error add :: forall a. String -> String -> a = "Use (++) to concatenate strings."
error append :: forall a. Number -> Number -> a = "Use (+) to add Numbers."

-- previous abs example
error abs :: forall a. Nat -> a = "Cannot abs a natural number because it implies negatability."
abs :: forall a. (Ord a, Ring a) => a -> a
abs x = if x >= zero then x else negate x

@hdgarrood
Copy link
Copy Markdown
Contributor

Agreed with @rgrempel. I'd like to see this go in for 0.9.1. I don't think we are making any promises about backwards compatibility yet, and it's not as if people might start depending on this feature in a way that causes serious problems if we were to later remove it anyway. So if we ever do come up with an alternative that fully obsoletes it somehow, I don't see that there would be any real harm done. We can see from this PR that it's super simple from the point of view of compiler internals, too.

I would love to see a more detailed proposal for an alternative solution; I don't think the ones offered so far are sufficiently fleshed-out to justify holding this back. In particular, it's not clear to me that any alternative solution couldn't complement what we have here instead of replacing it.

Just for context, a slightly more advanced version of this feature just landed in GHC 8.0.1: https://downloads.haskell.org/~ghc/8.0.1/docs/html/users_guide/glasgow_exts.html#custom-errors with a great example of a custom error we could define in our Prelude too. This feature would also work great for types which cannot be given good Generic instances, such as JSDate and Map (as @paf31 recently pointed out in IRC).

@paf31
Copy link
Copy Markdown
Contributor Author

paf31 commented May 25, 2016

@AppShipIt From our conversation from last night:

Generally, I like the idea, but I prefer the constraints approach for the following reasons:

  • (implementation) It seems obvious how the specialized types apply when you're applying a function with that name. But typechecking isn't only function applications, and how that fits generally into the typechecker is unclear to me. In general, it might be that the type checker inferred abs :: Nat -> Nat due to some other unrelated error. I think this might be as prone to bad error messages as my suggestion. The constraints approach at least has the desired property that there is a very clear place in the type checker when the rule fires - namely, when you try to solve a Fail constraint.
  • (performance) The typechecker will be slower if we have to check N alternatives every time we try to apply a function. On the other hand, the constraint approach is type-directed, and there is only ever one next subgoal to try. This is really important for performance.
  • (correctness) It's unclear to me that the approach you suggested is safe in the presence of separate compilation, especially if the semantics of error is to collect all failures into a single error message. This is related to no-backtracking in the constraint solver.
  • (expressiveness) Basing the solution on existing type system features is important, because it allows us to talk about failure modes using existing type level programming techniques. As these develop, we'll have new ways to talk about failure. It also allows a simple implementation, where a new language-level feature would need to be integrated with many different parts of the compiler, and tested thoroughly. I doubt we'd get it implemented, honestly.

I think your solution would be a better fit for languages without type classes and other type system features.

@paf31
Copy link
Copy Markdown
Contributor Author

paf31 commented May 28, 2016

I don't think we are making any promises about backwards compatibility yet

In particular, it's not clear to me that any alternative solution couldn't complement what we have here instead of replacing it.

I think this is right. We could mark this as experimental if we really wanted to, but I consider most things experimental while we're < 1.0 anyway.

So I'll try to get this merged later today.

@texastoland
Copy link
Copy Markdown

texastoland commented May 28, 2016

I still want to discuss it further tonight or tomorrow. We could even fire up Zoom. No one has presented a use case this resolves other than incidentally. Error messages are arguably the biggest encumbrance in PureScript and improving them should be done considerately not with a type feature just because it's easy or neat or even precedented. Throwing it in without a plan to make it contextual to function application looks like YAGNI. The justification about breaking changes likewise didn't favor #1901. On the contrary your feedback was to explorer the underlying issue. If a set of use cases could be provided that don't depend on guessing which method was applied I'd be excited.

@paf31
Copy link
Copy Markdown
Contributor Author

paf31 commented May 30, 2016

After talking with @AppShipIt and @jbrownson some more about this over the weekend, I'm going to get this merged in. We discussed an alternative function-level approach, which can be tracked in a separate issue, and use cases for this feature. The Semiring example is not a very compelling one, but I think the Generic use cases will be much more useful.

@paf31 paf31 merged commit c9f0839 into 0.9 May 30, 2016
@paf31 paf31 deleted the programmable-type-errors branch May 30, 2016 17:30
@texastoland
Copy link
Copy Markdown

To clarify for library authors instance-directed errors serve a different use case than function-directed errors. Fail instances explain why instances can never exist. Function-directed errors (like add or abs) provide hints to fixing types and constraints. A key metric is if the message mentions a function it's arguably misusing this feature.

archaeron pushed a commit to archaeron/purescript that referenced this pull request Apr 6, 2017
* Programmable type errors

* update unifiesWith and typeHeadsAreEqual
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

10 participants