fix(borders): prevent use-after-free in border destruction

This commit fixes a use-after-free bug in the border_manager that was
causing crashes with exception code 0xc000041d when borders were being
destroyed during workspace/monitor changes.

The root cause was a race condition between the main thread destroying a
border window and the border's window thread still processing queued
window messages (WM_PAINT, EVENT_OBJECT_LOCATIONCHANGE). The crash
occurred when these callbacks invoked render_target.EndDraw(), which
internally calls HwndPresenter::Present() to present the rendered frame
to the HWND. By this point, the HWND and its associated DirectX surfaces
had already been freed, resulting in Direct2D attempting to dereference
freed memory (0xFEEEFEEEFEEEFEEE - Windows heap poison value).

The issue stemmed from two problems in destroy_border():

1. Direct2D resources (render_target and brushes) were not being
   released before closing the window. These COM objects hold internal
   pointers to the HWND and its DirectX swap chain/surfaces. When
   close_window() was called, Windows began tearing down these resources
   while the Direct2D objects still held dangling pointers to them.

2. GWLP_USERDATA was not being cleared before closing the window. This
   meant that any messages already queued in the window's message queue
   could still retrieve the border_pointer and attempt to render using
   the now-invalid Direct2D resources.

This commit addresses both issues:

- In destroy_border() (mod.rs): Explicitly set render_target to None and
  clear the brushes HashMap before calling destroy(). This ensures that
  Direct2D COM objects are properly released while the HWND is still
  valid, preventing EndDraw() from accessing freed HWND resources.

- In destroy() (border.rs): Clear GWLP_USERDATA before calling
  close_window(). This ensures that any pending window messages will see
  a null pointer and exit early from the callbacks (which already have
  null pointer checks in place).

These changes create two layers of defense against the race condition:
the callbacks won't access the border_pointer (it's null), and even if
they somehow did, the render_target would be None so no rendering
operations would occur.

I think this is the root cause of a lot of crash tickets that people are
mistakenly attributing to specific applications which lack
reproducibility across different users/machines, i.e. #1626, #1624.
This commit is contained in:
LGUG2Z
2026-01-12 08:46:03 -08:00
parent 0b39551470
commit bdef1448e1
2 changed files with 12 additions and 0 deletions

View File

@@ -313,6 +313,11 @@ impl Border {
}
pub fn destroy(&self) -> color_eyre::Result<()> {
// clear user data **BEFORE** closing window
// pending messages will see a null pointer and exit early
unsafe {
SetWindowLongPtrW(self.hwnd(), GWLP_USERDATA, 0);
}
WindowsApi::close_window(self.hwnd)
}

View File

@@ -767,6 +767,13 @@ fn remove_border(
fn destroy_border(border: Box<Border>) -> color_eyre::Result<()> {
let raw_pointer = Box::into_raw(border);
unsafe {
// release d2d resources **BEFORE** destroying window
// this drops render_target and brushes while HWND is still valid
// prevents EndDraw() from accessing freed HWND resources
(*raw_pointer).render_target = None;
(*raw_pointer).brushes.clear();
// Now safe to destroy window
(*raw_pointer).destroy()?;
}
Ok(())