Description
This is probably a GitHub issue, but https://www.githubstatus.com/ currently says "All Systems Operational".
Current Status
Mitigated: it seems to resolve itself, at least temporarily.
The error looks like this:
++ python3 .github/scripts/get_workflow_job_id.py 3761062705 i-04939a5bd44132575
Traceback (most recent call last):
  File ".github/scripts/get_workflow_job_id.py", line 48, in <module>
    jobs = response.json()["jobs"]
KeyError: 'jobs'
After #91145, the error message is now:
RuntimeError: ('Is github alright?', "Recieved status code '502' when attempting to retrieve runs:\n", '{\n "message": "Server Error"\n}\n')
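For illustration, here is a minimal sketch of the kind of check that turns the bare KeyError into a descriptive error. It is not the actual code from get_workflow_job_id.py or #91145; the function name `parse_jobs` and its error messages are hypothetical.

```python
import json


def parse_jobs(status_code: int, body: str) -> list:
    """Return the "jobs" list from a jobs API response, or raise a descriptive error.

    Hypothetical helper for illustration; not taken from the real script.
    """
    if status_code != 200:
        # Surface the status code and raw body instead of failing later with KeyError.
        raise RuntimeError(
            f"Received status code {status_code} when retrieving jobs: {body!r}"
        )
    payload = json.loads(body)
    if "jobs" not in payload:
        # A 200 response without the expected key is still reported explicitly.
        raise RuntimeError(f"Response has no 'jobs' key: {body!r}")
    return payload["jobs"]
```

With a check like this, a 502 response such as the one above would be reported with its status code and body rather than as a missing dictionary key.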
Found a simple repro: this URL currently returns a 502 Server Error:
https://api.github.com/repos/pytorch/pytorch/actions/runs/3761801620/jobs?per_page=100
For some other jobs, it's fine.
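Since the failure appears to be intermittent, one possible way to probe the endpoint is to retry on 5xx responses with exponential backoff. The sketch below assumes that approach; the names `http_get` and `get_with_retry` and the default parameters are hypothetical, and nothing in this report establishes that retrying would actually help for the failing run.

```python
import time
import urllib.error
import urllib.request


def http_get(url: str) -> tuple:
    """Return (status_code, body) for a GET request, including HTTP error responses."""
    try:
        with urllib.request.urlopen(url) as resp:
            return resp.status, resp.read().decode()
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; convert that into a normal return value.
        return err.code, err.read().decode()


def get_with_retry(url, attempts=3, base_delay=1.0, fetch=http_get, sleep=time.sleep):
    """Call fetch(url) up to `attempts` times, retrying only on 5xx statuses.

    Hypothetical helper: `fetch` and `sleep` are injectable so the logic can be
    exercised without network access or real delays.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        status, body = fetch(url)
        if status < 500:
            return status, body
        if attempt < attempts - 1:
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** attempt)
    # All attempts returned a 5xx status; hand back the last response.
    return status, body
```

Calling `get_with_retry` with the repro URL above would return the final status code and body, so the caller can still decide how to report a persistent 502.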
Incident timeline (all times Pacific)
Include when the incident began, when it was detected, mitigated, root caused, and finally closed.
- 2022-12-22 ~14:00 PT incident began
User impact
Almost every workflow is failing with the above error.
Root cause
Probably a GitHub issue.
Mitigation
How did we mitigate the issue?
Prevention/followups
How do we prevent issues like this in the future?
cc @seemethere @malfet @pytorch/pytorch-dev-infra