Require rebuild if any codegen targets are outdated#4053

Merged
hdgarrood merged 3 commits into purescript:master from hdgarrood:fix-3911 on Apr 12, 2021
Conversation

@hdgarrood
Contributor

Description of the change

Fixes #3911, #3914: prevents an issue where a module would be considered
up-to-date if only some of the requested codegen targets were up-to-date
(as opposed to all of them).


Checklist:

  • Added the change to the changelog's "Unreleased" section with a reference to this PR (e.g. "- Made a change (#0000)")
  • Linked any existing issues or proposals that this pull request should close
  • (n/a) Updated or added relevant documentation
  • Added a test for the contribution (if applicable)

@hdgarrood
Contributor Author

GitHub didn't interpret my original message the way I intended, so this comment is just for GitHub; this fixes #3914 as well as #3911.

@f-f
Member


👏

@hdgarrood
Contributor Author

Thanks for the review!

Apparently the "fixes #xyz" doesn't count if it's not in the original PR description >:(

@hdgarrood
Contributor Author

I'm reasonably confident that this change does fix the issue, I think it's just that the tests are flaky.

@rhendric
Member

What OS(es) are you testing on? It's a little suspicious that the macOS and Linux builds are both failing precisely these two new tests and no others, and I can't see anything in the tests themselves which would make them less reliable than the others in the suite. (I wonder if there's something like an OS-specific issue with reading file timestamps.)

@hdgarrood
Contributor Author

I'm testing on Linux, and both new tests seem to pass about half the time. In CI, different OSes will be using different filesystems - e.g. I'd guess the Linux CI box is using ext4, and the Windows CI box is probably using NTFS - so I think we do expect some OS-specific behaviour here as a result, e.g. because different filesystems have different timestamp resolutions.

@hdgarrood
Contributor Author

Actually until I can figure out why the tests are failing I think I’m going to say that I’m only somewhat confident this works.

@rhendric
Member

Ah, no, I think your confidence is valid and you're absolutely right that it's a timestamp resolution thing. I added some code to the tests to log output file modification times and submitted that to Travis, and in the cases when it failed, the timestamps were exactly equal—either because, as on macOS, the filesystem was only recording with one-second resolution, or because, as on Linux, the timestamps appeared to be equal down to the nanosecond, which I can only assume is because the kernel has a limited time resolution. But regardless of the cause, based on its view of the world the changed code was correctly responding that nothing needed rebuilding.

In all of these tests, we're deterministically setting the modification times of the inputs, but not of the outputs, and these are the first tests to rely on the latter. Can we just fudge the modification times of docs.json/corefn.json in the tests? It's not clean but it's probably better than flaky tests.
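Fudging an output file's modification time could look something like the sketch below (the helper name and the one-second offset are made up for illustration; `setModificationTime` is from the `directory` package):

```haskell
import Data.Time.Clock (addUTCTime)
import System.Directory (getModificationTime, setModificationTime)

-- Hypothetical test helper: push a file's modification time one second
-- into the past, so that a subsequent compile's outputs are strictly
-- newer than this one even on coarse-resolution filesystems.
backdate :: FilePath -> IO ()
backdate path = do
  t <- getModificationTime path
  setModificationTime path (addUTCTime (-1) t)
```

A test could then call, say, `backdate` on the docs.json/corefn.json paths after the first compile.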

@hdgarrood
Contributor Author

I think we do have to be able to gracefully handle the cases where timestamps are exactly equal, unfortunately, as when I was trying to figure out what was going on here I found that it was relatively common that when running the compiler normally, the externs.json, index.js, and docs.json files all had the exact same timestamp. Experimentally, for me (ext4 on an SSD on linux), it appears that I do at least have microsecond resolution locally. In fact I am slightly surprised that the timestamps being equal is causing the tests to fail, because I deliberately used <= rather than < in getOutputTimestamp because I was expecting this to happen.
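A simplified model of that comparison (not the compiler's actual `getOutputTimestamp` code; timestamps are abstracted to any `Ord` type for illustration) might be:

```haskell
-- Simplified model: a module counts as up-to-date only if every requested
-- codegen target exists and the oldest output is at least as new as the
-- newest input. Using <= rather than < deliberately tolerates outputs
-- written in the same filesystem tick as the inputs.
isUpToDate :: Ord t => [Maybe t] -> t -> Bool
isUpToDate outputStamps newestInput =
  case sequence outputStamps of
    Just stamps@(_:_) -> newestInput <= minimum stamps
    _                 -> False  -- a target is missing, or none were produced
```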

I don't think these are the first tests to rely on output timestamps incidentally - we compare output timestamps in order to ensure that downstream modules are recompiled if upstream modules change, so I think the two recompiles downstream modules tests also rely on output timestamps in a similar-ish way to these new ones.

@rhendric
Member

The problem isn't that the timestamps are equal after the first compile; it's that the timestamps are equal after the second compile, between a file touched only by the first and a file recreated by the second. Your code does correctly handle the case when a single compile generates all its outputs with equal timestamps; in that case, everything appears to be up-to-date. This only causes an issue if a second compile then overwrites some of those outputs, making the others not up-to-date, but in a way that is invisible because the second compile happened within the same tick as the first. Outside of tests, I think that's an extremely unlikely scenario; and if you wanted to handle it correctly anyway, I think you'd have to abandon modification times altogether, instead embedding high-resolution timestamps or generation counters or input hashes in the files themselves, or in a cache database.

You might be right about recompiles downstream modules, but I had thought from an admittedly brief review that those tests were relying on the changed timestamp of one of the input files to trigger a rebuild of that module, and the rebuild of that module triggers a rebuild of any downstream modules, none of which depends on the timestamp of the output file, just on the previously cached timestamp of the input file. It's not a terribly important point either way, but I do think it explains why we haven't generally seen these tests flake before, whereas I'm pretty sure that without some modification, these new tests are likely to fail quite frequently.

@hdgarrood
Contributor Author

Ah! Thanks. Your analysis - both of why these new tests are flaky and of why the existing ones aren't - is persuasive to me. In that case, I think adding a sleep 1 between the compiler invocations in these new tests would solve the problem, and that feels preferable to messing with the output file timestamps.

@hdgarrood hdgarrood mentioned this pull request Apr 12, 2021
@hdgarrood
Contributor Author

Adding threadDelay (10^6) does seem to fix the flakiness for me locally - both tests have passed 10 times in a row :)
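In test code, the fix amounts to something like the sketch below (`withCompilesSeparated` is a made-up name; the two IO actions stand in for the real compiler invocations in the test):

```haskell
import Control.Concurrent (threadDelay)

-- A one-second pause between compiles guarantees the second compile's
-- outputs get strictly newer modification times, even on filesystems
-- with one-second timestamp resolution.
withCompilesSeparated :: IO a -> IO b -> IO b
withCompilesSeparated firstCompile secondCompile = do
  _ <- firstCompile
  threadDelay (10 ^ (6 :: Int))  -- 10^6 microseconds = 1 second
  secondCompile
```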

@hdgarrood hdgarrood merged commit 06e38cb into purescript:master Apr 12, 2021
@hdgarrood hdgarrood deleted the fix-3911 branch April 12, 2021 21:07
hdgarrood added a commit to hdgarrood/purescript that referenced this pull request Apr 13, 2021
Tasty and hspec are both testing frameworks: they are meant to satisfy
the same need, and they're not really intended to be used together.
See e.g. this comment from the tasty-hspec Haddocks:

> hspec and tasty serve similar purposes; consider using one or the
> other.
>
> However, in a pinch, this module allows you to run an hspec Spec as a
> tasty TestTree.

I'd like to switch back to using hspec throughout for two reasons:

1. With hspec, failures are summarised at the end of the test output.
   Tasty reports failures inline, which can be very annoying, especially
   in CI, as you then have to search through the logs to find the
   failure. Displaying failures at the end seems like a much more
   sensible default to me.

2. We get to take advantage of HasCallStack in hspec-expectations to
   help narrow down where a failure is coming from. This is particularly
   helpful in tests like the ones in TestMake.hs where we run something
   like ``someAction `shouldReturn` someResult`` more than once in a
   particular test, as without the source location from HasCallStack
   it's not immediately clear which assertion is failing. (This would
   have been useful in purescript#4053).

In purescript#2848, when tasty was first introduced, one of the main benefits we
were getting was that it would let us run individual tests via CLI
options. This is still possible with this PR: hspec has a `--match` CLI
option for this.

There's no `--accept` flag for golden tests, but instead you can set an
`HSPEC_ACCEPT` environment variable, or simply delete the files you want
to regenerate.
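Usage might look like the following (a sketch assuming a stack-based build; the match pattern is made up, and the flags shown are hspec's):

```shell
# Run only the tests whose description contains the given substring:
stack test --test-arguments '--match "codegen targets"'

# Regenerate golden files instead of failing on a mismatch:
HSPEC_ACCEPT=1 stack test
```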

Development

Successfully merging this pull request may close these issues.

Internal error due to old docs.json files

3 participants