Add Surface Laptop 5 lid GPE + s2idle/hibernate fix (v5.4) #2011

Open
wowitsjack wants to merge 9 commits into linux-surface:master from wowitsjack:surface-laptop-5-s2idle-fix

@wowitsjack

@wowitsjack wowitsjack commented Feb 25, 2026

Closes #1782

Two patches, both across all kernel versions (6.6, 6.9-6.18).

1. Add SL5 to surface_gpe DMI table

Surface Laptop 5 uses GPE 0x52 for lid state notification, same as the Surface Pro 9, but was missing from the DMI match table.

2. surface_s2idle_fix module (v5.4)

Intel INTC1055 pinctrl power-gating during s2idle corrupts pin 213's PADCFG0 RXINV bit. The flipped bit fires a spurious SCI on GPE 0x52. In the kernel's s2idle wake path, acpi_ec_dispatch_gpe() calls acpi_any_gpe_status_set(), which promotes any non-EC GPE with status set to a full resume. The corrupted pin causes level-like re-assertion, so the status bit is always set when checked. The system either never wakes (wakeup framework poisoned by pm_system_cancel_wakeup()) or wakes spuriously on every s2idle cycle.

Fix: The ACPI wakeup handler runs before acpi_ec_dispatch_gpe() in the s2idle wake decision path (acpi_check_wakeup_handlers() at sleep.c:777, before acpi_ec_dispatch_gpe() at sleep.c:786). The handler:

  1. Repairs PADCFG corruption (stops the pin from re-asserting)
  2. Clears GPE 0x52 status via acpi_clear_gpe() (so acpi_any_gpe_status_set() doesn't see it)
  3. Returns false (no full wake from this handler)

The actual lid-open decision is deferred to an LPS0 check() callback which reads RXSTATE directly and only calls pm_system_wakeup() if the lid is genuinely open.

Result: spurious GPEs stay invisible inside the s2idle loop (sub-millisecond, no user-visible wake). Genuine lid-open wakes the system cleanly.
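The ordering argument above can be illustrated with a small userspace model. This is only a sketch, not kernel code: the `Pad` class, `wakeup_handler()` and `wake_decision()` are hypothetical names chosen for illustration, and the only hardware detail taken from this thread is RXINV at bit 23.

```python
RXINV = 1 << 23  # PADCFG0 RXINV bit position, as stated in this thread


class Pad:
    """Toy model of pin 213: a PADCFG0 value plus the GPE 0x52 status bit."""

    def __init__(self, saved_padcfg0):
        self.saved = saved_padcfg0    # known-good value captured before suspend
        self.padcfg0 = saved_padcfg0
        self.gpe_status = False

    def power_gate(self):
        # Model the corruption: RXINV flips and the pin raises the GPE.
        self.padcfg0 ^= RXINV
        self.gpe_status = True

    def reassert(self):
        # A corrupted pin behaves level-like: status returns after a clear.
        if self.padcfg0 != self.saved:
            self.gpe_status = True


def wakeup_handler(pad):
    """Steps 1-3 from the description: repair, clear, report 'no wake'."""
    pad.padcfg0 = pad.saved   # 1. repair PADCFG corruption
    pad.gpe_status = False    # 2. clear GPE status
    return False              # 3. this handler never requests a full wake


def wake_decision(pad, use_handler):
    """Model one pass of the wake check: handlers first, then GPE status."""
    if use_handler and wakeup_handler(pad):
        return "full resume"
    pad.reassert()
    return "full resume" if pad.gpe_status else "stay in s2idle"
```

In this model, a power-gated pad with `use_handler=False` always ends in a full resume, while the same pad with `use_handler=True` stays in s2idle because the repair removes the source of the re-assertion before the status is checked.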

Additional layers:

  • GPE 0x52 masked during entire s2idle cycle (VNN RXSTATE glitches can't fire as SCI)
  • 500ms RXSTATE poll timer for lid-open detection (bypasses GPE entirely)
  • resume_early PADCFG repair as safety net
  • GPE unmask delayed to PM_POST_SUSPEND (prevents stale GPE from causing false lid events)
  • Passive KEY_POWER input handler distinguishes real vs spurious wakes
  • Post-resume failsafe re-suspends on spurious wake (lid still closed)
  • Exponential backoff (2s/4s/8s/15s) prevents rapid sleep-wake storms
  • Background RXSTATE polling catches lid close events when GPIO edge path breaks after s2idle cycles
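The 2s/4s/8s/15s backoff sequence listed above matches doubling with a cap. The sketch below shows one way to produce it; whether the module computes the delay like this or uses a lookup table is not stated in this thread, and `backoff_delay_s` is a hypothetical name.

```python
def backoff_delay_s(attempt, base_s=2, cap_s=15):
    """Delay in seconds before re-suspend number `attempt` (0-based).

    Doubles from `base_s` on each attempt and never exceeds `cap_s`.
    """
    return min(base_s * (2 ** attempt), cap_s)
```

The cap keeps a long run of spurious wakes from pushing the delay out indefinitely, while the first few retries stay short.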

v5.4 changes

  • GPE refcount fix. Use acpi_enable_gpe instead of acpi_set_gpe to maintain proper ACPI reference counting. Fixes second lid close not being detected after first suspend cycle.
  • GPE masked during s2idle. VNN power-gating causes transient RXSTATE glitches that fire GPE 0x52 as SCI, promoting spurious full wakes. Lid detection now handled entirely by 500ms poll timer reading RXSTATE directly.
  • GPE unmask delayed to PM_POST_SUSPEND. Stale GPE status from lid state changes fires immediately on unmask during resume_early, causing false lid events to logind (which re-suspends via HandleLidSwitch). Moved unmask to after system is fully stable.
  • Full PADCFG0 corruption detection. Compare entire register (minus volatile GPIORXSTATE) instead of just RXINV. Catches all VNN corruption (pad mode, termination, GPIORXDIS).
  • 1ms GPIORXDIS toggle + 15s settling guard. Extended input buffer enable from 10us to 1ms for reliable re-latching. 15s settling guard suppresses false lid-open transitions after PADCFG restore.
  • Lock screen flash fix. lock_sessions() now blocks with UMH_WAIT_PROC, plus 500ms delay before pm_suspend in failsafe to let GNOME render lock screen into frame buffer before freeze.
  • lps0_prepare refresh. Replaced fix_padcfg_corruption with save_all_pins to capture ACPI's legitimate RXINV changes from _L52, preventing false corruption detection.
  • Lid poll seeding. On wake, seed last_poll_rxstate from lid_was_closed_at_suspend to prevent false open transitions during RXSTATE settling.
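The "full PADCFG0 corruption detection" item can be sketched as a masked register compare. This is an illustration rather than the module's code: the function names are hypothetical, RXINV at bit 23 comes from this thread, and GPIORXSTATE at bit 1 is an assumption taken from the mainline Intel pinctrl driver's register definitions.

```python
GPIORXSTATE = 1 << 1   # assumption: bit 1, as in the mainline Intel pinctrl driver
RXINV = 1 << 23        # bit position as given in this thread
MASK32 = 0xFFFFFFFF


def is_corrupted(saved, current):
    """Compare the whole register, ignoring only the volatile RX state bit."""
    stable = MASK32 & ~GPIORXSTATE
    return (saved & stable) != (current & stable)


def rxinv_only_check(saved, current):
    """The narrower check: only notices a flipped RXINV bit."""
    return (saved & RXINV) != (current & RXINV)
```

A change in any other field (for example the pad mode bits) is reported by `is_corrupted()` but missed by `rxinv_only_check()`, while a lid movement that only toggles GPIORXSTATE is reported by neither.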

v5.3c changes

  • Failsafe death sleep fix. Power button wake path now checks lid_close_reported before starting resync, preventing infinite re-suspend loop when lid was opened during sleep.
  • Failsafe lock fix. lock_sessions() helper called before all failsafe re-suspend paths, so screen is locked after rapid-wake VNN cycles.
  • Resync abort safety. lid_resync_fn checks poller state before and after backoff, aborting if lid was opened during the wait.

v5.3b changes

  • Suspend-to-lock mode. When suspend_to_lock=1 (default), GUI/CLI/keyboard-triggered suspend (lid open) is intercepted at the PM notifier level and converted to: lock screen, disable networking, blank display after 2s. A passive input handler restores display + networking on any keypress. Lid-close suspend passes through to real s2idle with full module protection.
  • PADCFG0 init correction expanded. VNN corruption can clear GPIORXDIS (bit 8) in addition to flipping RXINV (bit 23). The init path now checks and corrects both bits independently.

v5.3 changes

  • GPIORXDIS toggle after PADCFG restore to force pin input re-latch
  • Accurate sleep duration tracking via ktime_get_boottime()
  • Display/backlight recovery on s2idle resume when PADCFG corruption detected
  • Full hibernate PM ops (freeze/thaw/poweroff/restore)
  • RTC-based time sync after long hibernation
  • Post-resume wifi bounce and display reconnect

Tested on Surface Laptop 5 (i5-1245U), fresh Ubuntu 25.10, kernel 6.18.7-surface-1. s2idle cycles, hibernate cycles, extended sleep (11h+), repeated lid open/close, suspend-to-lock with lid open, rapid-wake VNN cycles with failsafe re-suspend + lock verification.

@wylfen

wylfen commented Feb 25, 2026

Decided to give this patch a try on my Surface Laptop 5 with 6.18.13, but it won't cleanly apply. It's unhappy about the following hunk:

@@ -14,6 +14,7 @@ obj-$(CONFIG_SURFACE_AGGREGATOR_TABLET_SWITCH) += surface_aggregator_tabletsw.o
 obj-$(CONFIG_SURFACE_DTX)		+= surface_dtx.o
 obj-$(CONFIG_SURFACE_GPE)		+= surface_gpe.o
 obj-$(CONFIG_SURFACE_HOTPLUG)		+= surface_hotplug.o
+obj-$(CONFIG_SURFACE_S2IDLE_FIX)	+= surface_s2idle_fix.o
 obj-$(CONFIG_SURFACE_PLATFORM_PROFILE)	+= surface_platform_profile.o
 obj-$(CONFIG_SURFACE_PRO3_BUTTON)	+= surfacepro3_button.o

The header gives -14,6 +14,7, but the hunk has only 6 lines in total (5 context lines plus the addition), so patch tries to interpret the next line, which just begins a new patch. Was this manually edited? In any case, changing it to -14,5 +14,6 fixes it for me. As for the patch itself, I'll see in a couple of days.
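The corrected counts can be checked mechanically: in a unified diff, the old-side count is context plus removed lines and the new-side count is context plus added lines. The helper below is a small sketch for verifying a hand-edited hunk; `hunk_counts` and `hunk_header` are hypothetical names, and `BODY` reproduces the hunk quoted above.

```python
def hunk_counts(body_lines):
    """Return (old_count, new_count) for the body lines of one unified-diff hunk."""
    old = new = 0
    for line in body_lines:
        if line.startswith("\\"):   # "\ No newline at end of file" marker
            continue
        if line.startswith("+"):
            new += 1                # added line: new side only
        elif line.startswith("-"):
            old += 1                # removed line: old side only
        else:
            old += 1                # context line: counts on both sides
            new += 1
    return old, new


def hunk_header(old_start, new_start, body_lines):
    """Build the @@ header that matches the given hunk body."""
    old, new = hunk_counts(body_lines)
    return f"@@ -{old_start},{old} +{new_start},{new} @@"


BODY = [
    " obj-$(CONFIG_SURFACE_DTX)\t\t+= surface_dtx.o",
    " obj-$(CONFIG_SURFACE_GPE)\t\t+= surface_gpe.o",
    " obj-$(CONFIG_SURFACE_HOTPLUG)\t\t+= surface_hotplug.o",
    "+obj-$(CONFIG_SURFACE_S2IDLE_FIX)\t+= surface_s2idle_fix.o",
    " obj-$(CONFIG_SURFACE_PLATFORM_PROFILE)\t+= surface_platform_profile.o",
    " obj-$(CONFIG_SURFACE_PRO3_BUTTON)\t+= surfacepro3_button.o",
]

print(hunk_header(14, 14, BODY))  # prints "@@ -14,5 +14,6 @@"
```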

@wylfen

wylfen commented Feb 25, 2026

Hm, it still seems a bit wonky for me. On my first try I somehow got my system into a state where it would suspend and then immediately unsuspend (journalctl output from when that happens). Then the system wouldn't even completely turn off anymore: I tried a reboot and it just hung after systemd's last "Shutting down." message. On the console I still saw output telling me that s2idle_fix recognized me pressing the power button, but I had to long-press it to finally shut down the system.

Now that I've rebooted, I can't reproduce this, however. Like I said, this is on 6.18.13, with an i7-1265U.

@wowitsjack

Author

@wylfen re: Makefile hunk header - Fixed in v2, good catch. The Makefile hunk had a wrong line count in the header.

@wowitsjack

Author

@wylfen re: suspend loop - Found and fixed in v2. Two things were going wrong:

  1. The failsafe was checking lid_was_closed_at_suspend (the historical state from when you closed the lid) but not the current physical lid state. So if anything caused a brief wake while the lid was closed, the failsafe would see "no power button, lid was closed" and force a re-suspend. If whatever caused the first wake kept firing, you'd get the loop you saw. v2 now reads PADCFG0 GPIORXSTATE directly in the failsafe; if the lid is physically open, it stays awake regardless of history.

  2. There was a deeper problem causing the brief wakes in the first place: the RTC keepalive alarm was being treated as a full wakeup event. The s2idle loop calls acpi_s2idle_wake() before check(), and the RTC alarm sets PM1_STS RTC_STS which acpi_any_fixed_event_status_set() picks up as a valid wake. So the loop would break before check() could re-arm the alarm. v2 suppresses the RTC as a wakeup source with acpi_disable_event(ACPI_EVENT_RTC) and disable_irq_wake(8), restored on resume.

If you want to re-test, pull the updated branch. The module is also significantly more capable now: EC keepalive prevents death sleep entirely, and lid-open wakes the system within ~5 seconds.

@wylfen

wylfen commented Feb 26, 2026

Nice, I'll retest in the coming days and report back.

@wowitsjack

wowitsjack commented Feb 26, 2026

Author

I am STUCK.

D:

@wowitsjack wowitsjack closed this Feb 26, 2026
@wylfen

wylfen commented Feb 26, 2026

Oh, no worries, I appreciate the effort. Could you say which firmware version you updated to?

@wowitsjack

Author

> Oh, no worries, I appreciate the effort. Could you say which firmware version you updated to?

SAM firmware: 15.204.139

@wylfen

wylfen commented Feb 26, 2026

Seems I have been on the same one for a while already. What's interesting is that Microsoft lists that particular SAM update for October 2024. There's a more recent one from November 2025, 15.305.139.0, that I don't even have installed yet (but I also basically don't boot Windows on the device anymore).

@wowitsjack wowitsjack reopened this Feb 27, 2026
@wowitsjack wowitsjack force-pushed the surface-laptop-5-s2idle-fix branch from 4c9694c to 026e497 on February 27, 2026 04:15
@wowitsjack wowitsjack changed the title from "Add Surface Laptop 5 lid GPE + s2idle death sleep fix" to "Add Surface Laptop 5 lid GPE + s2idle fix (v3)" Feb 27, 2026
@wowitsjack

wowitsjack commented Feb 27, 2026

Author

Hey, sorry for closing this prematurely. I thought a SAM firmware update (15.204.139) had changed the behaviour, but after a fresh OS install and more testing, I slogged away and aligned to the new one.

v3 is a pretty significant rewrite from v2. The big change: RTC keepalive is gone entirely. Turns out it was never needed. The real problem was in the s2idle wake decision path.

When INTC1055 power-gates during s2idle, pin 213 (lid, GPE 0x52) gets its PADCFG0 RXINV bit corrupted. That fires a spurious SCI. In the wake path, acpi_ec_dispatch_gpe() calls acpi_any_gpe_status_set(), which promotes any non-EC GPE with status set to a full resume. Since the corrupted pin re-asserts level-like, the status bit is always set when checked, and the system either never sleeps properly or wakes spuriously every cycle.

The fix is simpler than v2. ACPI wakeup handlers run before acpi_ec_dispatch_gpe() in the s2idle loop (acpi_check_wakeup_handlers() at sleep.c:777, acpi_ec_dispatch_gpe() at sleep.c:786). So the handler:

  1. Repairs RXINV corruption in PADCFG0
  2. Clears GPE 0x52 status via acpi_clear_gpe()
  3. Returns false (no wake from this handler)

The spurious GPE gets cleaned up before anything else sees it. The actual lid-open decision goes through an LPS0 check() callback that reads RXSTATE directly. Genuine lid-open calls pm_system_wakeup(), everything else stays asleep.

No more RTC alarm, no more EC keepalive, no more interrupt suppression. Just fix the corruption before the wake path can see it.

Still has the safety layers from v2: passive KEY_POWER handler for spurious wake detection, post-resume failsafe with exponential backoff, background RXSTATE polling for when the GPIO edge path breaks after multiple s2idle cycles.

Tested on SL5 (i5-1245U), fresh Ubuntu 25.10, kernel 6.18.7-surface-1, same SAM firmware 15.204.139 that I thought broke everything. Hundreds of cycles, no death sleep, no suspend loops.

@wylfen would be great if you could test v3 when you get a chance.

@wowitsjack wowitsjack force-pushed the surface-laptop-5-s2idle-fix branch from 026e497 to ec1de45 on February 27, 2026 10:46
@wowitsjack

Author

Pushed v4. Two additions on top of v3:

SW_LID input events - the module now registers an input device and emits EV_SW / SW_LID on lid open/close. This keeps logind and GNOME in sync with the kernel's lid state. Without this, userspace had no idea the lid was moving since the GPE path doesn't generate input events on SL5.

Session locking before suspend - when the module calls pm_suspend() directly (lid close detected via RXSTATE polling), it bypasses logind's PrepareForSleep signal entirely, so GNOME never gets the chance to lock the screen. You'd get a brief flash of unlocked desktop on resume before GNOME caught up. Now the module calls loginctl lock-sessions via call_usermodehelper() with a 300ms settle time before suspending. Screen is locked before the display freezes.

Core s2idle fix is unchanged from v3, just these two quality-of-life additions.
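For anyone who wants to confirm from userspace that the SW_LID events described above are arriving, the sketch below decodes one raw record read from an evdev node. `EV_SW` (0x05) and `SW_LID` (0x00) are the standard Linux input event codes; the `llHHi` layout assumes the usual 64-bit `struct input_event`, and `decode_lid_event` is a hypothetical helper name.

```python
import struct

EV_SW = 0x05    # event type for switches (linux/input-event-codes.h)
SW_LID = 0x00   # switch code for the lid; value 1 means lid shut

# struct input_event: timeval (two longs), u16 type, u16 code, s32 value
EVENT_FORMAT = "llHHi"
EVENT_SIZE = struct.calcsize(EVENT_FORMAT)


def decode_lid_event(raw):
    """Return 'closed' or 'open' for a SW_LID event, or None for anything else."""
    _sec, _usec, ev_type, code, value = struct.unpack(EVENT_FORMAT, raw)
    if ev_type != EV_SW or code != SW_LID:
        return None
    return "closed" if value else "open"
```

To use it, open the module's input device under /dev/input/ in binary mode (the event number varies per system), read `EVENT_SIZE` bytes at a time, and pass each chunk to `decode_lid_event()`.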

@wylfen if you're still testing, these shouldn't affect the suspend/resume behaviour at all, just nicer userspace integration.

@wylfen

wylfen commented Feb 27, 2026

Oh, nice, I was just about to report similar issues for KDE Plasma. I'll retest with those changes.

@wowitsjack

Author

> Oh, nice, I was just about to report similar issues for KDE Plasma. I'll retest with those changes.

Heck yeah, thank you so much! <3

@wylfen

wylfen commented Feb 27, 2026

Alright, so the normal suspend/resume behaviour seems fixed for me now. I've not had suspend loops or death sleep. With the recent changes from v4, userspace behaviour is definitely better, but I still have some feedback there:

  1. First is a general question about the design; I'm not sure how other desktop environments treat this, but with KDE Plasma I can choose what happens when I press the power button or close the lid. With this module applied, my preferences there are overridden. Whilst the module does the right thing now (suspend + ensure lock), one might want the power button to simply lock the screen and not automatically suspend. This was possible before this change for me but now isn't. I'd have to take a longer look at the code, but I wonder whether one could produce a version that only made sure the system wouldn't end up in death sleep.

  2. I think the failsafes are a bit too trigger-happy. On my KDE Plasma system it is now no longer possible to manually press the "Sleep" soft-button on the lockscreen to go to sleep, since the system wakes up again right away (presumably because the lid is still open.)

  3. Whilst I like the idea of making sure the system is locked by calling loginctl lock-sessions, I'm not sure this is a job for this module in particular. Now that the SW_LID input events are correctly propagated, I think that desktop environments can implement their own logic there.

Other than that I had to cajole the patch to apply again:

  1. same hunk issue as previously with the Makefile changes
  2. both the hunk offsets and the context for the Kconfig change are wrong, for 6.18 at least: for me the context starts at line 186, and the last context line should be depends on ACPI and not depends on PCI_QUIRKS.

But that's mostly a patch cleanup issue - it would probably be useful to base the patch on some set kernel version and then cherry-pick it onto the other branches to get correct patches.


This change as it is with v4 is definitely a step forward in usability for me on my Surface, so thanks a lot! I'll run the current version for at least a week so I can see how it handles when I'm actually actively using the device at work.

v5: Architectural cleanup per tester feedback.

- Removed lock_sessions (loginctl call from kernel): SW_LID events
  let the DE handle locking natively, no need for kernel-side session lock.

- lid_poll_fn no longer calls pm_suspend(): it only emits SW_LID events.
  The desktop environment decides suspend policy on lid close. This stops
  the module from overriding DE preferences (e.g. "lock only on lid close").

- Failsafe re-suspend logic is now purely physical: re-suspends when lid
  is closed and no power button was pressed (spurious wake), stays awake
  when lid is open. Works regardless of who initiated the suspend.

v4: Added SW_LID input events and loginctl lock-sessions.
v3: 5-layer architecture with spurious wake, lid resync, background polling.
v2: Fixed Kconfig/Makefile patch context.
v1: Initial release.
@wowitsjack wowitsjack force-pushed the surface-laptop-5-s2idle-fix branch from ec1de45 to aa3e525 on February 27, 2026 12:28
@wowitsjack

Author

Pushed v5, addressing all your feedback:

1) DE preference override - lid_poll_fn no longer calls pm_suspend(). It only emits SW_LID events now. The desktop environment decides what to do on lid close (suspend, lock, nothing, whatever you've configured). The module's job is purely death-sleep prevention and spurious wake recovery.

2) Trigger-happy failsafe - the failsafe now only cares about physical lid state. Lid closed + no power button = spurious wake, re-suspend. Lid open = stay awake, no questions asked. If you hit Sleep from the lock screen with lid open, the system sleeps and wakes normally without the module interfering.

3) lock_sessions removed - agreed, now that SW_LID events work, there's no reason for the kernel to be calling loginctl. DEs handle locking on their own via the lid switch events. Removed call_usermodehelper, lock_sessions(), and the kmod.h include entirely.
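The "purely physical" failsafe rule from point 2 reduces to a small decision table. The sketch below is an illustration with hypothetical names, not the module's code. The comment only spells out two cases (lid open, and lid closed without a power button press); treating lid closed plus a power button press as a deliberate wake is an assumption.

```python
def failsafe_action(lid_is_closed, power_button_pressed):
    """Decide what to do after a wake, using only the current physical state."""
    if not lid_is_closed:
        return "stay awake"      # lid open: never re-suspend
    if power_button_pressed:
        return "stay awake"      # assumption: treated as a deliberate wake
    return "re-suspend"          # lid closed, no button: spurious wake
```

Because the function takes no history as input, it gives the same answer regardless of who initiated the suspend, which is the property the v5 change is after.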

The core s2idle fix (PADCFG corruption repair, GPE 0x52 masking, LPS0 hooks, wakeup handler) is unchanged. Just the userspace interaction got cleaned up.

Re: patch context issues, I'll take a closer look at getting the Kconfig/Makefile hunks right per kernel version. The current patches are all generated against the same base, which is why the context drifts on some versions. Cherry-picking per branch is the right approach, I just haven't set that up yet.

Thanks again for the detailed feedback, especially the KDE perspective. Super valuable having someone test on a different DE.

@wylfen

wylfen commented Feb 27, 2026

This is great, thanks. I'll give the new version a look tomorrow.

@wylfen

wylfen commented Feb 27, 2026

So I've been looking at the patches and this conversation again and I got to ask: How much am I talking to an LLM and how much of the patches were generated by one? I'm not sure whether this project has any sort of policies for this, but I think it's only respectful to disclose any LLM usage.

I think there's a real problem here to be investigated and solved, but I can't personally continue looking at this in good faith until I know what exactly is going on.

@dmusican

@wowitsjack, I'm grateful for your help on this. It looks as though this has been quite frustrating, and I really appreciate your posts and the efforts you've been taking to work through them. I've been suffering from this bug for a while, and am looking forward to the fix.

@wowitsjack

Author

> So I've been looking at the patches and this conversation again and I got to ask: How much am I talking to an LLM and how much of the patches were generated by one? I'm not sure whether this project has any sort of policies for this, but I think it's only respectful to disclose any LLM usage.
>
> I think there's a real problem here to be investigated and solved, but I can't personally continue looking at this in good faith until I know what exactly is going on.

I have dyslexia, so all of my comments are passed through CoPilot, so I don't make a fool of myself.

Sorry if that's a bit impersonal x__x

@wowitsjack

wowitsjack commented Feb 27, 2026

Author

> @wowitsjack, I'm grateful for your help on this. It looks as though this has been quite frustrating, and I really appreciate your posts and the efforts you've been taking to work through them. I've been suffering from this bug for a while, and am looking forward to the fix.

Thank you, really.

I super appreciate that! It's been...quite a ride. A long time spent tracing sleep states in Windows and Linux. A LONG time.

This has been without a doubt the toughest software/hardware engineering piece I have ever worked on.

@wowitsjack

Author

> @wowitsjack, I'm grateful for your help on this. It looks as though this has been quite frustrating, and I really appreciate your posts and the efforts you've been taking to work through them. I've been suffering from this bug for a while, and am looking forward to the fix.

If you'd like to give it a test now, I also maintain a quick easy installer for my own use. - https://github.com/wowitsjack/Surface-Linux-Lid-Fix

@dmusican

Thanks, I tried to install it --- got this error at depmod -a:

modprobe: ERROR: could not insert 'surface_s2idle_fix': Key was rejected by service

I'm trying to install it on top of kernel 6.18.7-surface-1, if that's useful info.

@dmusican

Also found this in the build output, it's probably relevant:

At main.c:171:
- SSL error:FFFFFFFF80000002:system library::No such file or directory: ../crypto/bio/bss_file.c:67
- SSL error:10000080:BIO routines::no such file: ../crypto/bio/bss_file.c:75
sign-file: /usr/src/linux-headers-6.18.7-surface-1/certs/signing_key.pem
  DEPMOD  /lib/modules/6.18.7-surface-1

@wowitsjack

Author

@dmusican That's a Secure Boot thing, not a problem with the module itself. Your kernel is enforcing module signing and the build can't find a signing key, so modprobe rejects it.

Easiest fix: disable Secure Boot in your UEFI/BIOS settings. On the Surface, hold Volume Up while powering on to get into the UEFI menu, then turn off Secure Boot under Security.

If you want to keep Secure Boot on, you can self-sign the module with a MOK (Machine Owner Key):

# Generate a key pair
openssl req -new -x509 -newkey rsa:2048 -keyout MOK.priv -outform DER -out MOK.der -nodes -days 36500 -subj "/CN=My Module Signing Key/"

# Enroll it (will prompt for a password, you'll enter it again on next reboot)
sudo mokutil --import MOK.der

# Reboot, MOK Manager will appear, select "Enroll MOK" and enter your password

# Then sign the module
sudo /usr/src/linux-headers-$(uname -r)/scripts/sign-file sha256 MOK.priv MOK.der /lib/modules/$(uname -r)/updates/surface_s2idle_fix.ko

# Now modprobe should work
sudo modprobe surface_s2idle_fix

Disabling Secure Boot is way less hassle though, and fine for a personal laptop.

@dmusican

Thanks --- this isn't my machine, it's one for work where I'm already on thin ice for having dual-booted it to Linux... it also boots to Windows (which I use infrequently), and so I don't know what hassles this is going to cause there or for the admin software they're running on that end.

I REALLY appreciate that you're fixing this! And yeah, I could go for the self-signed version (thanks for the instructions on this), but I think I'm going to have to defer on testing and wait for the release.

@wowitsjack

wowitsjack commented Feb 28, 2026

Author

Hey all, just wanted to address this properly.

@wylfen I completely understand your concern, and I'm sorry if the way I've been communicating has made this feel off. I mentioned earlier that I have dyslexia and lean on CoPilot to clean up my writing so it's actually readable, but I totally get that it can come across as impersonal or raise questions. That's on me.

I don't want my involvement to be a blocker or a source of friction for anyone trying to get this bug fixed. The last thing I want is for trust issues around how I write comments to get in the way of people who've been dealing with this since 2023. So I think the right call is for me to step back and close this out.

@dmusican I'm really sorry it's ending up this way. I know you've been waiting on a fix for a while and I wish I could've gotten this over the line for you.

Thanks for testing and for the kind words, @dmusican . No hard feelings.

Just a small note: if I do end up pulling this contribution, please don't reuse my code or patches without my explicit permission. I'd like to retain control over what I've written.

@wowitsjack wowitsjack closed this Feb 28, 2026
@wylfen

wylfen commented Feb 28, 2026

> I mentioned earlier that I have dyslexia and lean on CoPilot to clean up my writing so it's actually readable, but I totally get that it can come across as impersonal or raise questions. That's on me.

I'm sympathetic to the decision to use LLMs to facilitate communication if you think it's otherwise impossible. I don't mean to shame you for such a decision. Due to the nature of the technology, however, I think it's important to disclose its usage: only once LLM use is documented properly can people take a look at the work in a meaningful way. This is both about the nature of the code produced and about respectful communication.

I'm still convinced that problems with sleep are real issues to be documented and fixed and that this patch does prod at the right bits. It's just difficult for me to engage with the code directly because I am very inexperienced in hardware engineering and because of the code's complexity and size. So the only thing I can fall back on is trusting your output, which became hard for me to do. It would need someone from this project with hardware experience to properly review this.

I do also appreciate the effort, it's unfortunate to have this end on a rather disappointing note.

@wowitsjack

Author

> > I mentioned earlier that I have dyslexia and lean on CoPilot to clean up my writing so it's actually readable, but I totally get that it can come across as impersonal or raise questions. That's on me.
>
> I'm sympathetic to the decision to use LLMs to facilitate communication if you think it's otherwise impossible. I don't mean to shame you for such a decision. Due to the nature of the technology, however, I think it's important to disclose its usage: only once LLM use is documented properly can people take a look at the work in a meaningful way. This is both about the nature of the code produced and about respectful communication.
>
> I'm still convinced that problems with sleep are real issues to be documented and fixed and that this patch does prod at the right bits. It's just difficult for me to engage with the code directly because I am very inexperienced in hardware engineering and because of the code's complexity and size. So the only thing I can fall back on is trusting your output, which became hard for me to do. It would need someone from this project with hardware experience to properly review this.
>
> I do also appreciate the effort, it's unfortunate to have this end on a rather disappointing note.

@wylfen

Wolfgang,

Which is it, mate? You're losing track of your own argument.

So is it you appreciate my work, or you "aren't sure [I'm] real" and I'm an LLM?
Please try and make up your mind.

I want to explain how this has actually landed.

"If you think it's otherwise impossible." I have dyslexia. I use assistive tools so I can write clearly. That's how I participate in projects like this. When you phrase it as "if you think," you're leaving room to question whether my need is real. I don't believe that was your aim, but I need you to understand how that reads from my side.

Asking me to formally disclose and document my use of accessibility tools before my work can be "meaningfully evaluated," and framing that as "respectful communication," I understand the broader concern about LLMs in open source. But applied to someone using assistive technology for a disability, that amounts to asking me to label myself before I'm taken seriously. Would you ask someone using a screen reader to disclose that before engaging with their review?

You've mentioned you're "very inexperienced in hardware engineering" and found the code too complex to engage with directly. I respect that honesty. But it means the right move was to flag this for someone with hardware experience. Instead, what happened was a public accusation about my writing style, and I stepped back because I didn't want my involvement to become a blocker for people who need this fix.

Since closing this PR, I've had people reaching out in DMs and in the project's Matrix chat asking if they can still get a copy. I have been updating the community on the unfortunate turn of events that your attitude and approach have cascaded into.

@dmusican, I'm sorry this has left you hanging. If you have a Discord or wherever works for you, I'll help you get the fix running on your machine directly. You shouldn't have to keep waiting because of this.

I'll also be emailing qzed to make it explicitly clear that my code and patches are not to be used or derived from in any capacity by this project. If there are questions about why, they can ask you, Mr. Müller.

It's a little off to me, someone who couldn't evaluate the code technically anyway trying to police the disability tools of others. What would formal disclosure have changed for a reviewer who self-admittedly lacks the hardware expertise to evaluate the substance?

I hear you that this is disappointing. It is. For everyone.

@wowitsjack wowitsjack reopened this Feb 28, 2026
@wowitsjack

Author

Hey all, quick update.

@qzed reached out to me directly after receiving my email, and we had a good conversation. His position is clear: LLMs are tools, and it's the end product that counts. That's all I needed to hear.

I'm reopening this and making the patches fully available under GPL to the linux-surface project and anyone else who wants to use them. No restrictions.

The fix works. People need it. That's what matters.

@dmusican, the offer still stands if you want help getting it running on your machine in the meantime.

@qzed

qzed commented Feb 28, 2026

Member

Honestly, I'm not sure what the problem with using or not using LLMs is. LLMs are, in my opinion, tools. In the end, it's the end product that needs to be evaluated; it doesn't matter if an LLM was used or not. At this point, I also guess that a lot of people already use some kind of LLM (e.g., Copilot) while coding, and I don't necessarily expect everyone to straight up disclose that all the time. Again, the end product matters.

Now of course it's your own choice of who (and what tools) to trust or not. I personally always found it a bit hard to just trust anyone on the internet that gives you something to run (yes, also before LLMs). But fwiw, I think at this point, any kind of exploit still very likely needs a human hand.

I'll try to review the code when I have some time. Please be civil and respectful.

@wowitsjack wowitsjack changed the title from "Add Surface Laptop 5 lid GPE + s2idle fix (v3)" to "Add Surface Laptop 5 lid GPE + s2idle fix (v5)" Feb 28, 2026
@dmusican

dmusican commented Mar 1, 2026

@wowitsjack, thanks for the offer! I'm going to give this a little time to see if the commit just makes it in, then I don't have to futz with signing it, etc. If I get inspired to try it before that, I will. I'm optimistic that hopefully this just works out, but I'll speak up if I can use some help. Again, thanks for all you've done.

Replace stripped s2idle-only patches (924 lines) with full v5.2a
source (1712 lines) across all 11 kernel versions.

v5.2a adds hibernate support with RTC-based time synchronization,
post-resume wifi bounce recovery, display reconnect fix, hardened
SNTP-free timekeeping for long hibernation, and freeze/thaw PM ops.
@wowitsjack wowitsjack changed the title Add Surface Laptop 5 lid GPE + s2idle fix (v5) Add Surface Laptop 5 lid GPE + s2idle/hibernate fix (v5.2a) Mar 6, 2026
wowitsjack commented Mar 6, 2026

Author

v5.2a update

Patches updated from the stripped s2idle-only source (924 lines) to the full v5.2a module (1712 lines). Core s2idle fix is unchanged, same layered PADCFG repair + GPE masking + LPS0 hooks.

The big changes:

  1. Hibernate support. Added freeze/thaw/poweroff/restore PM ops. PADCFG save/restore and GPE masking now cover suspend-to-disk, not just s2idle. Same corruption can happen during hibernate's ACPI transitions, so the same fix applies.

  2. RTC-based time sync. After a long hibernate (11h+ tested), the system clock drifts significantly. The module reads the hardware RTC on resume and corrects via do_settimeofday64(). There's a configurable max-correction cap so it won't set time backwards on short hibernates where the clock is still close enough.

  3. Post-resume wifi bounce. NetworkManager sometimes comes back with stale interfaces after hibernate. The module detects this and kicks a reconnect so you don't wake up to no wifi.

  4. Display reconnect. DRM reprobing after hibernate to recover from blank/frozen displays. Hibernate is rougher on the display pipeline than s2idle, so this catches the cases where the panel doesn't come back on its own.

  5. No more SNTP. Previously I was testing an external SNTP fallback for time sync. That is removed entirely; timekeeping is now fully self-contained using the hardware RTC. One fewer moving part.
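
The RTC correction in item 2 can be pictured as a small decision function. This is a minimal userspace sketch, not the module's code: `rtc_forward_step()`, its parameters, and the two bounds are hypothetical names and semantics chosen for illustration. The comment above only states that a cap is configurable and that the clock is not set backwards; in the module the accepted time would presumably be handed to `do_settimeofday64()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Decide whether to step the system clock forward to the RTC value.
 * Hypothetical helper: both bounds are assumptions, not module parameters.
 *
 *   sys_sec       system clock right after resume (seconds)
 *   rtc_sec       hardware RTC reading (seconds)
 *   min_step_sec  drift below this is treated as "close enough"
 *   max_step_sec  drift above this is treated as an implausible RTC value
 *
 * Returns true and stores the forward step in *step_sec when a correction
 * should be applied. The clock is never stepped backwards because
 * min_step_sec must be positive.
 */
static bool rtc_forward_step(int64_t sys_sec, int64_t rtc_sec,
                             int64_t min_step_sec, int64_t max_step_sec,
                             int64_t *step_sec)
{
    assert(step_sec != NULL);
    assert(min_step_sec > 0 && min_step_sec <= max_step_sec);

    int64_t diff = rtc_sec - sys_sec;

    if (diff < min_step_sec)    /* RTC behind, or drift too small to matter */
        return false;
    if (diff > max_step_sec)    /* larger than the cap: do not trust it */
        return false;

    *step_sec = diff;
    return true;
}
```

With a 5-second lower bound and a 7-day cap, an 11-hour hibernate yields a forward step of 39600 seconds, while a 2-second drift or an RTC reading behind the system clock is left alone.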

All 11 kernel versions (6.6, 6.9-6.18) updated. Tested on SL5 (i5-1245U), Ubuntu 25.10, kernel 6.18.7-surface-1. s2idle and hibernate cycles, including 11h+ hibernation. No death sleep, no PADCFG corruption, correct wall clock on resume.

wowitsjack added 2 commits March 6, 2026 12:55
v5.3 fixes:
- GPIORXDIS toggle after PADCFG restore for pin input re-latch
- Display/backlight recovery on s2idle resume with PADCFG corruption
- Accurate sleep duration tracking via ktime_get_boottime()
- Version string consistency (v5.1a -> v5.3)
@wowitsjack wowitsjack changed the title Add Surface Laptop 5 lid GPE + s2idle/hibernate fix (v5.2a) Add Surface Laptop 5 lid GPE + s2idle/hibernate fix (v5.3) Mar 6, 2026
@wowitsjack

Author

v5.3 update pushed

Hit a confirmed death sleep on lid close while running v5.1a (hadn't rebuilt from v5.2a source, my bad). Dug through the previous boot logs and found the kill chain:

After ~51 minutes of s2idle, VNN power-gating corrupted PADCFG0 as expected. The module corrected the register values, but the pin's input buffer was still latched to the corrupted state. RXSTATE read 0 (lid open) even though the lid was physically closed. The module thought it was a genuine wake, didn't re-suspend, but display recovery only ran on the hibernate path, not s2idle. Black screen, system running but no way to interact, eventually re-suspended into a state it couldn't come back from.

Also found that ktime_get() (CLOCK_MONOTONIC) doesn't advance during s2idle, because CLOCK_MONOTONIC excludes time spent suspended. A 51-minute sleep was showing as 759ms in the logs, which broke all the rapid-wake detection and backoff timing.
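
The same distinction is visible from userspace, where `CLOCK_MONOTONIC` and `CLOCK_BOOTTIME` are the counterparts of `ktime_get()` and `ktime_get_boottime()`. The following is a minimal Linux-only sketch; the helper names `ts_to_ns()` and `sample_clocks()` are made up for illustration and are not part of the module.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Convert a timespec to nanoseconds. */
static int64_t ts_to_ns(const struct timespec *ts)
{
    assert(ts != NULL);
    return (int64_t)ts->tv_sec * 1000000000LL + ts->tv_nsec;
}

/*
 * Sample CLOCK_MONOTONIC and CLOCK_BOOTTIME back to back.
 * Returns 0 on success, -1 if either clock cannot be read.
 */
static int sample_clocks(int64_t *mono_ns, int64_t *boot_ns)
{
    struct timespec mono, boot;

    assert(mono_ns != NULL && boot_ns != NULL);

    if (clock_gettime(CLOCK_MONOTONIC, &mono) != 0)
        return -1;
    if (clock_gettime(CLOCK_BOOTTIME, &boot) != 0)
        return -1;

    *mono_ns = ts_to_ns(&mono);
    *boot_ns = ts_to_ns(&boot);
    return 0;
}
```

On a machine that has been suspended since boot, `boot_ns - mono_ns` is roughly the total time spent suspended. Taking the difference of two `CLOCK_BOOTTIME` samples around a sleep cycle gives the real sleep duration, which is what rapid-wake and backoff logic needs.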

v5.3 fixes:

  1. GPIORXDIS toggle after every PADCFG restore. Toggling bit 8 forces the input buffer to re-latch from the actual electrical state of the pin. 10us assert, 100us settle before trusting RXSTATE.

  2. ktime_get_boottime() instead of ktime_get() for sleep duration tracking. CLOCK_BOOTTIME includes time spent in suspend, so the timing logic actually works now.

  3. Display/backlight recovery on s2idle path when PADCFG corruption was detected during the cycle. Previously this only ran after hibernate.
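
The re-latch sequence in item 1 can be sketched against a simulated pad. Everything here is an illustrative model rather than the module's code: `struct sim_pad`, `sim_write()`, `sim_udelay()` and `relatch_input()` are hypothetical, `SIM_RXSTATE` is an assumed bit position, and the receive-disable mask is passed in by the caller. The usage below uses bit 8 as quoted above; that position is worth checking against the pinctrl-intel register definitions before relying on it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SIM_RXSTATE (1u << 1)   /* assumed position of the RXSTATE readback bit */

struct sim_pad {
    uint32_t padcfg0;   /* simulated PADCFG0 contents */
    bool wire_high;     /* actual electrical level on the pin */
};

/*
 * Model of a register write. RXSTATE is read-only here and is resampled
 * from the wire only when the receive-disable bit goes from 1 to 0.
 */
static void sim_write(struct sim_pad *pad, uint32_t val, uint32_t rxdis_mask)
{
    bool reenabled = (pad->padcfg0 & rxdis_mask) && !(val & rxdis_mask);
    uint32_t rxstate = pad->padcfg0 & SIM_RXSTATE;

    if (reenabled)
        rxstate = pad->wire_high ? SIM_RXSTATE : 0;

    pad->padcfg0 = (val & ~SIM_RXSTATE) | rxstate;
}

/* Stand-in for udelay(); the simulation has no real timing. */
static void sim_udelay(unsigned int usec)
{
    (void)usec;
}

/* Toggle receive-disable, wait, then report the freshly latched RXSTATE. */
static bool relatch_input(struct sim_pad *pad, uint32_t rxdis_mask)
{
    assert(pad != NULL);

    uint32_t cfg = pad->padcfg0;

    sim_write(pad, cfg | rxdis_mask, rxdis_mask);   /* assert for 10us */
    sim_udelay(10);
    sim_write(pad, cfg & ~rxdis_mask, rxdis_mask);  /* release */
    sim_udelay(100);                                /* settle before reading */

    return (pad->padcfg0 & SIM_RXSTATE) != 0;
}
```

Starting from a pad whose `padcfg0` reads RXSTATE 0 while `wire_high` is true (the stale "lid open" reading described above), `relatch_input(&pad, 1u << 8)` returns true, matching the physical state of the pin.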

All patches updated across 6.6, 6.9-6.18.

Update surface_s2idle_fix patches to v5.3b across all kernel versions.

New features:
- suspend_to_lock: intercept GUI/manual suspend, convert to screen lock
  + display blank + network disable (lid-close suspend unaffected)
- Wake input handler restores display and networking on keypress
- SAM wakeup setup for power button wake from s2idle
- Power button detection in lps0_check

Fixes:
- Init-time PADCFG0 restoration now checks GPIORXDIS (bit 8) in
  addition to RXINV, fixing broken lid detection after module reload
  following VNN corruption
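
The widened init-time check amounts to comparing more bits against a known-good value before restoring them. A minimal sketch, assuming the module keeps a saved good copy of PADCFG0: the helper names are hypothetical, `RXINV_BIT` is an assumed position, and `RXDIS_BIT` uses the bit number quoted in the commit message, which should be verified against the pinctrl-intel register definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RXINV_BIT (1u << 23)    /* assumed RXINV position */
#define RXDIS_BIT (1u << 8)     /* bit number as quoted in the commit message */

/* True if any bit selected by mask differs from the known-good value. */
static bool padcfg0_is_corrupted(uint32_t cur, uint32_t good, uint32_t mask)
{
    assert(mask != 0);
    return ((cur ^ good) & mask) != 0;
}

/* Restore only the masked bits from good; leave all other bits of cur alone. */
static uint32_t padcfg0_repair(uint32_t cur, uint32_t good, uint32_t mask)
{
    assert(mask != 0);
    return (cur & ~mask) | (good & mask);
}
```

Checking with `RXINV_BIT` alone misses a register where only the receive-disable bit flipped; checking with `RXINV_BIT | RXDIS_BIT` catches both cases, which is the difference this fix describes.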
@wowitsjack wowitsjack changed the title Add Surface Laptop 5 lid GPE + s2idle/hibernate fix (v5.3) Add Surface Laptop 5 lid GPE + s2idle/hibernate fix (v5.3b) Mar 8, 2026
v5.3c changes:
- Fix failsafe death sleep: power button path checks lid_close_reported
  before starting resync, preventing infinite re-suspend when lid opened
  during sleep
- Add lock_sessions() before all failsafe re-suspend paths so screen is
  locked after rapid-wake VNN cycles
- Add lid_close_reported safety checks in lid_resync_fn to abort if
  poller detected lid open during backoff
- Version bump to 5.3c
@wowitsjack wowitsjack changed the title Add Surface Laptop 5 lid GPE + s2idle/hibernate fix (v5.3b) Add Surface Laptop 5 lid GPE + s2idle/hibernate fix (v5.3c) Mar 8, 2026
GPE refcount fix, full PADCFG0 corruption detection, lock screen
flash fix, RXSTATE settling guard, GPE masked during s2idle.
@wowitsjack wowitsjack changed the title Add Surface Laptop 5 lid GPE + s2idle/hibernate fix (v5.3c) Add Surface Laptop 5 lid GPE + s2idle/hibernate fix (v5.4) Mar 12, 2026
qzed commented Mar 13, 2026

Member

@wowitsjack Thanks for all your work! Would you mind opening a PR against the kernel repo? I think it's just a bit easier to review and I have the patch-from-kernel-repo process quite automated now.

@wowitsjack

Author

> @wowitsjack Thanks for all your work! Would you mind opening a PR against the kernel repo? I think it's just a bit easier to review and I have the patch-from-kernel-repo process quite automated now.

Done! :D

wowitsjack added 2 commits March 30, 2026 19:08
Fix display wake on lid open: call ULID directly to set ACPI button
driver state, enabling real SW_LID 1->0 transition on event0.
GPI_GPE_EN re-enable and RXINV write ensure _L52 fires on wake.

Successfully merging this pull request may close these issues.

Surface Laptop 5 DMI ID Missing From GPE Modules List Leading to Failed Lid Suspend Events
