[lldb] Add WebAssembly Process Plugin #150143

Merged: 1 commit merged on Jul 29, 2025

JDevlieghere
Member

@JDevlieghere JDevlieghere commented Jul 22, 2025

Extend support in LLDB for WebAssembly. This PR adds a new Process plugin (ProcessWasm) that extends ProcessGDBRemote for WebAssembly targets. It adds support for WebAssembly's memory model with separate address spaces, and the ability to fetch the call stack from the WebAssembly runtime.
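
As a rough illustration of the extension point this relies on: the base process creates threads through a virtual factory method, and the Wasm process overrides it to hand back its own thread type. The classes below (`BaseProcess`, `BaseThread`, `WasmProcess`, `WasmThread`) are hypothetical, simplified stand-ins and not the LLDB classes; the sketch only shows the shape of the hook.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>

// Hypothetical stand-in for ThreadGDBRemote.
struct BaseThread {
  explicit BaseThread(uint64_t tid) : tid(tid) {}
  virtual ~BaseThread() = default;
  virtual std::string Kind() const { return "gdb-remote"; }
  uint64_t tid;
};

// Hypothetical stand-in for ThreadWasm.
struct WasmThread : BaseThread {
  using BaseThread::BaseThread;
  std::string Kind() const override { return "wasm"; }
};

// Hypothetical stand-in for ProcessGDBRemote.
class BaseProcess {
public:
  virtual ~BaseProcess() = default;

  // Thread-list code calls the hook instead of naming a concrete type.
  std::shared_ptr<BaseThread> GetOrCreateThread(uint64_t tid) {
    assert(tid != 0 && "this sketch assumes non-zero thread IDs");
    return CreateThread(tid);
  }

protected:
  virtual std::shared_ptr<BaseThread> CreateThread(uint64_t tid) {
    return std::make_shared<BaseThread>(tid);
  }
};

// Hypothetical stand-in for ProcessWasm: only the factory is overridden.
class WasmProcess : public BaseProcess {
protected:
  std::shared_ptr<BaseThread> CreateThread(uint64_t tid) override {
    return std::make_shared<WasmThread>(tid);
  }
};
```

The point of the hook is that the thread-list code in the base class stays untouched; only the factory differs per plugin.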

I have tested this change with the WebAssembly Micro Runtime (WAMR, https://github.com/bytecodealliance/wasm-micro-runtime) which implements a GDB debug stub and supports the qWasmCallStack packet.

(lldb) process connect --plugin wasm connect://localhost:4567
Process 1 stopped
* thread #1, name = 'nobody', stop reason = trace
    frame #0: 0x40000000000001ad
wasm32_args.wasm`main:
->  0x40000000000001ad <+3>:  global.get 0
    0x40000000000001b3 <+9>:  i32.const 16
    0x40000000000001b5 <+11>: i32.sub
    0x40000000000001b6 <+12>: local.set 0
(lldb) b add
Breakpoint 1: where = wasm32_args.wasm`add + 28 at test.c:4:12, address = 0x400000000000019c
(lldb) c
Process 1 resuming
Process 1 stopped
* thread #1, name = 'nobody', stop reason = breakpoint 1.1
    frame #0: 0x400000000000019c wasm32_args.wasm`add(a=<unavailable>, b=<unavailable>) at test.c:4:12
   1    int
   2    add(int a, int b)
   3    {
-> 4        return a + b;
   5    }
   6
   7    int
(lldb) bt
* thread #1, name = 'nobody', stop reason = breakpoint 1.1
  * frame #0: 0x400000000000019c wasm32_args.wasm`add(a=<unavailable>, b=<unavailable>) at test.c:4:12
    frame #1: 0x40000000000001e5 wasm32_args.wasm`main at test.c:12:12
    frame #2: 0x40000000000001fe wasm32_args.wasm
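
To read the addresses in the session above: going by the bitfield layout of `wasm_addr_t` in this patch (32-bit offset, 30-bit module ID, 2-bit type), `0x40000000000001ad` should decode to type `Object`, module 0, offset `0x1ad`. The sketch below does the same packing with explicit shifts. The names `AddrType`, `DecodedAddr`, `Decode` and `Encode` are made up for illustration, and it assumes the full 30 bits for the module ID (the constructor in the patch masks fewer bits).

```cpp
#include <cassert>
#include <cstdint>

// Mirrors the WasmAddressType values used in this patch.
enum class AddrType : uint8_t { Memory = 0, Object = 1, Invalid = 3 };

struct DecodedAddr {
  AddrType type;
  uint32_t module_id;
  uint32_t offset;
};

// Split a 64-bit virtual address into (type, module_id, offset).
// Assumed layout: bits 63..62 = type, 61..32 = module_id, 31..0 = offset.
DecodedAddr Decode(uint64_t addr) {
  DecodedAddr d;
  d.type = static_cast<AddrType>(addr >> 62);
  d.module_id = static_cast<uint32_t>((addr >> 32) & 0x3fffffff);
  d.offset = static_cast<uint32_t>(addr & 0xffffffff);
  return d;
}

// Pack (type, module_id, offset) back into a 64-bit virtual address.
uint64_t Encode(AddrType type, uint32_t module_id, uint32_t offset) {
  assert(module_id < (1u << 30) && "module_id must fit in 30 bits");
  return (static_cast<uint64_t>(type) << 62) |
         (static_cast<uint64_t>(module_id) << 32) | offset;
}
```

Using shifts rather than reinterpreting a bitfield struct avoids depending on how a particular compiler lays out bitfields.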

This PR is based on an unmerged patch from Paolo Severini: https://reviews.llvm.org/D78801. I intentionally stuck to the foundations to keep this PR small. I have more PRs in the pipeline to support the other features/packets.

My motivation for supporting Wasm is to support debugging Swift compiled to WebAssembly: https://www.swift.org/documentation/articles/wasm-getting-started.html

@llvmbot
Member

llvmbot commented Jul 22, 2025

@llvm/pr-subscribers-lldb

Author: Jonas Devlieghere (JDevlieghere)

Patch is 23.14 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/150143.diff

13 Files Affected:

  • (modified) lldb/packages/Python/lldbsuite/test/lldbgdbclient.py (+2-2)
  • (modified) lldb/source/Plugins/Process/CMakeLists.txt (+1)
  • (added) lldb/source/Plugins/Process/Wasm/CMakeLists.txt (+10)
  • (added) lldb/source/Plugins/Process/Wasm/ProcessWasm.cpp (+127)
  • (added) lldb/source/Plugins/Process/Wasm/ProcessWasm.h (+87)
  • (added) lldb/source/Plugins/Process/Wasm/ThreadWasm.cpp (+34)
  • (added) lldb/source/Plugins/Process/Wasm/ThreadWasm.h (+38)
  • (added) lldb/source/Plugins/Process/Wasm/UnwindWasm.cpp (+81)
  • (added) lldb/source/Plugins/Process/Wasm/UnwindWasm.h (+51)
  • (modified) lldb/source/Plugins/Process/gdb-remote/ProcessGDBRemote.cpp (+7-2)
  • (modified) lldb/source/Plugins/Process/gdb-remote/ProcessGDBRemote.h (+2)
  • (modified) lldb/source/Target/Platform.cpp (+6)
  • (modified) lldb/test/API/functionalities/gdb_remote_client/TestWasm.py (+21-5)
diff --git a/lldb/packages/Python/lldbsuite/test/lldbgdbclient.py b/lldb/packages/Python/lldbsuite/test/lldbgdbclient.py
index 459460b84fbae..599f7878e6edb 100644
--- a/lldb/packages/Python/lldbsuite/test/lldbgdbclient.py
+++ b/lldb/packages/Python/lldbsuite/test/lldbgdbclient.py
@@ -45,7 +45,7 @@ def createTarget(self, yaml_path):
         self.yaml2obj(yaml_path, obj_path)
         return self.dbg.CreateTarget(obj_path)
 
-    def connect(self, target):
+    def connect(self, target, plugin="gdb-remote"):
         """
         Create a process by connecting to the mock GDB server.
 
@@ -54,7 +54,7 @@ def connect(self, target):
         listener = self.dbg.GetListener()
         error = lldb.SBError()
         process = target.ConnectRemote(
-            listener, self.server.get_connect_url(), "gdb-remote", error
+            listener, self.server.get_connect_url(), plugin, error
         )
         self.assertTrue(error.Success(), error.description)
         self.assertTrue(process, PROCESS_IS_VALID)
diff --git a/lldb/source/Plugins/Process/CMakeLists.txt b/lldb/source/Plugins/Process/CMakeLists.txt
index bd9b1b86dbf13..730fc9cd4056c 100644
--- a/lldb/source/Plugins/Process/CMakeLists.txt
+++ b/lldb/source/Plugins/Process/CMakeLists.txt
@@ -29,3 +29,4 @@ add_subdirectory(elf-core)
 add_subdirectory(mach-core)
 add_subdirectory(minidump)
 add_subdirectory(FreeBSDKernel)
+add_subdirectory(Wasm)
diff --git a/lldb/source/Plugins/Process/Wasm/CMakeLists.txt b/lldb/source/Plugins/Process/Wasm/CMakeLists.txt
new file mode 100644
index 0000000000000..ff8a3c792ad53
--- /dev/null
+++ b/lldb/source/Plugins/Process/Wasm/CMakeLists.txt
@@ -0,0 +1,10 @@
+add_lldb_library(lldbPluginProcessWasm PLUGIN
+  ProcessWasm.cpp
+  ThreadWasm.cpp
+  UnwindWasm.cpp
+
+  LINK_LIBS
+    lldbCore
+  LINK_COMPONENTS
+    Support
+  )
diff --git a/lldb/source/Plugins/Process/Wasm/ProcessWasm.cpp b/lldb/source/Plugins/Process/Wasm/ProcessWasm.cpp
new file mode 100644
index 0000000000000..ee5377b33dc97
--- /dev/null
+++ b/lldb/source/Plugins/Process/Wasm/ProcessWasm.cpp
@@ -0,0 +1,127 @@
+//===----------------------------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "ProcessWasm.h"
+#include "ThreadWasm.h"
+#include "lldb/Core/Module.h"
+#include "lldb/Core/PluginManager.h"
+#include "lldb/Core/Value.h"
+#include "lldb/Utility/DataBufferHeap.h"
+
+#include "lldb/Target/UnixSignals.h"
+
+using namespace lldb;
+using namespace lldb_private;
+using namespace lldb_private::process_gdb_remote;
+using namespace lldb_private::wasm;
+
+LLDB_PLUGIN_DEFINE(ProcessWasm)
+
+ProcessWasm::ProcessWasm(lldb::TargetSP target_sp, ListenerSP listener_sp)
+    : ProcessGDBRemote(target_sp, listener_sp) {
+  /* always use linux signals for wasm process */
+  m_unix_signals_sp = UnixSignals::Create(ArchSpec{"wasm32-Ant-wasi-wasm"});
+}
+
+void ProcessWasm::Initialize() {
+  static llvm::once_flag g_once_flag;
+
+  llvm::call_once(g_once_flag, []() {
+    PluginManager::RegisterPlugin(GetPluginNameStatic(),
+                                  GetPluginDescriptionStatic(), CreateInstance,
+                                  DebuggerInitialize);
+  });
+}
+
+void ProcessWasm::DebuggerInitialize(Debugger &debugger) {
+  ProcessGDBRemote::DebuggerInitialize(debugger);
+}
+
+llvm::StringRef ProcessWasm::GetPluginName() { return GetPluginNameStatic(); }
+
+ConstString ProcessWasm::GetPluginNameStatic() {
+  static ConstString g_name("wasm");
+  return g_name;
+}
+
+const char *ProcessWasm::GetPluginDescriptionStatic() {
+  return "GDB Remote protocol based WebAssembly debugging plug-in.";
+}
+
+void ProcessWasm::Terminate() {
+  PluginManager::UnregisterPlugin(ProcessWasm::CreateInstance);
+}
+
+lldb::ProcessSP ProcessWasm::CreateInstance(lldb::TargetSP target_sp,
+                                            ListenerSP listener_sp,
+                                            const FileSpec *crash_file_path,
+                                            bool can_connect) {
+  if (crash_file_path == nullptr)
+    return std::make_shared<ProcessWasm>(target_sp, listener_sp);
+  return {};
+}
+
+bool ProcessWasm::CanDebug(lldb::TargetSP target_sp,
+                           bool plugin_specified_by_name) {
+  if (plugin_specified_by_name)
+    return true;
+
+  if (Module *exe_module = target_sp->GetExecutableModulePointer()) {
+    if (ObjectFile *exe_objfile = exe_module->GetObjectFile())
+      return exe_objfile->GetArchitecture().GetMachine() ==
+             llvm::Triple::wasm32;
+  }
+  // However, if there is no wasm module, we return false, otherwise,
+  // we might use ProcessWasm to attach gdb remote.
+  return false;
+}
+
+std::shared_ptr<ThreadGDBRemote> ProcessWasm::CreateThread(lldb::tid_t tid) {
+  return std::make_shared<ThreadWasm>(*this, tid);
+}
+
+size_t ProcessWasm::ReadMemory(lldb::addr_t vm_addr, void *buf, size_t size,
+                               Status &error) {
+  wasm_addr_t wasm_addr(vm_addr);
+
+  switch (wasm_addr.GetType()) {
+  case WasmAddressType::Memory:
+  case WasmAddressType::Object:
+    return ProcessGDBRemote::ReadMemory(vm_addr, buf, size, error);
+  case WasmAddressType::Invalid:
+    error = Status::FromErrorStringWithFormat(
+        "Wasm read failed for invalid address 0x%" PRIx64, vm_addr);
+    return 0;
+  }
+}
+
+llvm::Expected<std::vector<lldb::addr_t>>
+ProcessWasm::GetWasmCallStack(lldb::tid_t tid) {
+  StreamString packet;
+  packet.Printf("qWasmCallStack:");
+  packet.Printf("%llx", tid);
+  StringExtractorGDBRemote response;
+  if (m_gdb_comm.SendPacketAndWaitForResponse(packet.GetString(), response) !=
+      GDBRemoteCommunication::PacketResult::Success)
+    return llvm::createStringError("failed to send qWasmCallStack");
+
+  if (!response.IsNormalResponse())
+    return llvm::createStringError("failed to get response for qWasmCallStack");
+
+  addr_t buf[1024 / sizeof(addr_t)];
+  size_t bytes = response.GetHexBytes(
+      llvm::MutableArrayRef<uint8_t>((uint8_t *)buf, sizeof(buf)), '\xdd');
+  if (bytes == 0)
+    return llvm::createStringError("invalid response for qWasmCallStack");
+
+  std::vector<lldb::addr_t> call_stack_pcs;
+  for (size_t i = 0; i < bytes / sizeof(addr_t); i++)
+    call_stack_pcs.push_back(buf[i]);
+
+  return call_stack_pcs;
+}
diff --git a/lldb/source/Plugins/Process/Wasm/ProcessWasm.h b/lldb/source/Plugins/Process/Wasm/ProcessWasm.h
new file mode 100644
index 0000000000000..d75921d76de8d
--- /dev/null
+++ b/lldb/source/Plugins/Process/Wasm/ProcessWasm.h
@@ -0,0 +1,87 @@
+//===----------------------------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLDB_SOURCE_PLUGINS_PROCESS_WASM_PROCESSWASM_H
+#define LLDB_SOURCE_PLUGINS_PROCESS_WASM_PROCESSWASM_H
+
+#include "Plugins/Process/gdb-remote/ProcessGDBRemote.h"
+
+namespace lldb_private {
+namespace wasm {
+
+/// Each WebAssembly module has separated address spaces for Code and Memory.
+/// A WebAssembly module also has a Data section which, when the module is
+/// loaded, gets mapped into a region in the module Memory.
+/// For the purpose of debugging, we can represent all these separated 32-bit
+/// address spaces with a single virtual 64-bit address space.
+///
+/// Struct wasm_addr_t provides this encoding using bitfields
+enum WasmAddressType { Memory = 0x00, Object = 0x01, Invalid = 0x03 };
+struct wasm_addr_t {
+  uint64_t offset : 32;
+  uint64_t module_id : 30;
+  uint64_t type : 2;
+
+  wasm_addr_t(lldb::addr_t addr)
+      : offset(addr & 0x00000000ffffffff),
+        module_id((addr & 0x00ffffff00000000) >> 32), type(addr >> 62) {}
+
+  wasm_addr_t(WasmAddressType type, uint32_t module_id, uint32_t offset)
+      : offset(offset), module_id(module_id), type(type) {}
+
+  WasmAddressType GetType() { return static_cast<WasmAddressType>(type); }
+  operator lldb::addr_t() { return *(uint64_t *)this; }
+};
+
+/// ProcessWasm provides the access to the Wasm program state
+/// retrieved from the Wasm engine.
+class ProcessWasm : public process_gdb_remote::ProcessGDBRemote {
+public:
+  ProcessWasm(lldb::TargetSP target_sp, lldb::ListenerSP listener_sp);
+  ~ProcessWasm() override = default;
+
+  static lldb::ProcessSP CreateInstance(lldb::TargetSP target_sp,
+                                        lldb::ListenerSP listener_sp,
+                                        const FileSpec *crash_file_path,
+                                        bool can_connect);
+
+  static void Initialize();
+  static void DebuggerInitialize(Debugger &debugger);
+  static void Terminate();
+  static ConstString GetPluginNameStatic();
+  static const char *GetPluginDescriptionStatic();
+
+  llvm::StringRef GetPluginName() override;
+
+  size_t ReadMemory(lldb::addr_t vm_addr, void *buf, size_t size,
+                    Status &error) override;
+
+  bool CanDebug(lldb::TargetSP target_sp,
+                bool plugin_specified_by_name) override;
+
+  /// Retrieve the current call stack from the WebAssembly remote process.
+  llvm::Expected<std::vector<lldb::addr_t>> GetWasmCallStack(lldb::tid_t tid);
+
+protected:
+  std::shared_ptr<process_gdb_remote::ThreadGDBRemote>
+  CreateThread(lldb::tid_t tid) override;
+
+private:
+  friend class UnwindWasm;
+  process_gdb_remote::GDBRemoteDynamicRegisterInfoSP &GetRegisterInfo() {
+    return m_register_info_sp;
+  }
+
+  ProcessWasm(const ProcessWasm &);
+  const ProcessWasm &operator=(const ProcessWasm &) = delete;
+};
+
+} // namespace wasm
+} // namespace lldb_private
+
+#endif
diff --git a/lldb/source/Plugins/Process/Wasm/ThreadWasm.cpp b/lldb/source/Plugins/Process/Wasm/ThreadWasm.cpp
new file mode 100644
index 0000000000000..a6553ffffedaa
--- /dev/null
+++ b/lldb/source/Plugins/Process/Wasm/ThreadWasm.cpp
@@ -0,0 +1,34 @@
+//===----------------------------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "ThreadWasm.h"
+
+#include "ProcessWasm.h"
+#include "UnwindWasm.h"
+#include "lldb/Target/Target.h"
+
+using namespace lldb;
+using namespace lldb_private;
+using namespace lldb_private::wasm;
+
+Unwind &ThreadWasm::GetUnwinder() {
+  if (!m_unwinder_up) {
+    assert(CalculateTarget()->GetArchitecture().GetMachine() ==
+           llvm::Triple::wasm32);
+    m_unwinder_up.reset(new wasm::UnwindWasm(*this));
+  }
+  return *m_unwinder_up;
+}
+
+llvm::Expected<std::vector<lldb::addr_t>> ThreadWasm::GetWasmCallStack() {
+  if (ProcessSP process_sp = GetProcess()) {
+    ProcessWasm *wasm_process = static_cast<ProcessWasm *>(process_sp.get());
+    return wasm_process->GetWasmCallStack(GetID());
+  }
+  return llvm::createStringError("no process");
+}
diff --git a/lldb/source/Plugins/Process/Wasm/ThreadWasm.h b/lldb/source/Plugins/Process/Wasm/ThreadWasm.h
new file mode 100644
index 0000000000000..1c90f58767bc8
--- /dev/null
+++ b/lldb/source/Plugins/Process/Wasm/ThreadWasm.h
@@ -0,0 +1,38 @@
+//===----------------------------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLDB_SOURCE_PLUGINS_PROCESS_WASM_THREADWASM_H
+#define LLDB_SOURCE_PLUGINS_PROCESS_WASM_THREADWASM_H
+
+#include "Plugins/Process/gdb-remote/ThreadGDBRemote.h"
+
+namespace lldb_private {
+namespace wasm {
+
+/// ThreadWasm provides the access to the Wasm thread state
+/// retrieved from the Wasm engine.
+class ThreadWasm : public process_gdb_remote::ThreadGDBRemote {
+public:
+  ThreadWasm(Process &process, lldb::tid_t tid)
+      : process_gdb_remote::ThreadGDBRemote(process, tid) {}
+  ~ThreadWasm() override = default;
+
+  /// Retrieve the current call stack from the WebAssembly remote process.
+  llvm::Expected<std::vector<lldb::addr_t>> GetWasmCallStack();
+
+protected:
+  Unwind &GetUnwinder() override;
+
+  ThreadWasm(const ThreadWasm &);
+  const ThreadWasm &operator=(const ThreadWasm &) = delete;
+};
+
+} // namespace wasm
+} // namespace lldb_private
+
+#endif // LLDB_SOURCE_PLUGINS_PROCESS_WASM_THREADWASM_H
diff --git a/lldb/source/Plugins/Process/Wasm/UnwindWasm.cpp b/lldb/source/Plugins/Process/Wasm/UnwindWasm.cpp
new file mode 100644
index 0000000000000..0852160a8edfa
--- /dev/null
+++ b/lldb/source/Plugins/Process/Wasm/UnwindWasm.cpp
@@ -0,0 +1,81 @@
+//===----------------------------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "UnwindWasm.h"
+#include "Plugins/Process/gdb-remote/ThreadGDBRemote.h"
+#include "Plugins/Process/Wasm/ProcessWasm.h"
+#include "Plugins/Process/Wasm/ThreadWasm.h"
+#include "lldb/Utility/LLDBLog.h"
+#include "lldb/Utility/Log.h"
+
+using namespace lldb;
+using namespace lldb_private;
+using namespace process_gdb_remote;
+using namespace wasm;
+
+class WasmGDBRemoteRegisterContext : public GDBRemoteRegisterContext {
+public:
+  WasmGDBRemoteRegisterContext(ThreadGDBRemote &thread,
+                               uint32_t concrete_frame_idx,
+                               GDBRemoteDynamicRegisterInfoSP &reg_info_sp,
+                               uint64_t pc)
+      : GDBRemoteRegisterContext(thread, concrete_frame_idx, reg_info_sp, false,
+                                 false) {
+    PrivateSetRegisterValue(0, pc);
+  }
+};
+
+lldb::RegisterContextSP
+UnwindWasm::DoCreateRegisterContextForFrame(lldb_private::StackFrame *frame) {
+  if (m_frames.size() <= frame->GetFrameIndex())
+    return lldb::RegisterContextSP();
+
+  ThreadSP thread = frame->GetThread();
+  ThreadGDBRemote *gdb_thread = static_cast<ThreadGDBRemote *>(thread.get());
+  ProcessWasm *wasm_process =
+      static_cast<ProcessWasm *>(thread->GetProcess().get());
+
+  return std::make_shared<WasmGDBRemoteRegisterContext>(
+      *gdb_thread, frame->GetConcreteFrameIndex(),
+      wasm_process->GetRegisterInfo(), m_frames[frame->GetFrameIndex()]);
+}
+
+uint32_t UnwindWasm::DoGetFrameCount() {
+  if (!m_unwind_complete) {
+    m_unwind_complete = true;
+    m_frames.clear();
+
+    ThreadWasm &wasm_thread = static_cast<ThreadWasm &>(GetThread());
+    llvm::Expected<std::vector<lldb::addr_t>> call_stack_pcs =
+        wasm_thread.GetWasmCallStack();
+    if (!call_stack_pcs) {
+      LLDB_LOG_ERROR(GetLog(LLDBLog::Unwind), call_stack_pcs.takeError(),
+                     "Failed to get Wasm callstack: {0}");
+      m_frames.clear();
+      return 0;
+    }
+    m_frames = *call_stack_pcs;
+  }
+  return m_frames.size();
+}
+
+bool UnwindWasm::DoGetFrameInfoAtIndex(uint32_t frame_idx, lldb::addr_t &cfa,
+                                       lldb::addr_t &pc,
+                                       bool &behaves_like_zeroth_frame) {
+  if (m_frames.size() == 0)
+    DoGetFrameCount();
+
+  if (frame_idx < m_frames.size()) {
+    behaves_like_zeroth_frame = (frame_idx == 0);
+    cfa = 0;
+    pc = m_frames[frame_idx];
+    return true;
+  }
+
+  return false;
+}
diff --git a/lldb/source/Plugins/Process/Wasm/UnwindWasm.h b/lldb/source/Plugins/Process/Wasm/UnwindWasm.h
new file mode 100644
index 0000000000000..ff5e06d23d960
--- /dev/null
+++ b/lldb/source/Plugins/Process/Wasm/UnwindWasm.h
@@ -0,0 +1,51 @@
+//===----------------------------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLDB_SOURCE_PLUGINS_PROCESS_WASM_UNWINDWASM_H
+#define LLDB_SOURCE_PLUGINS_PROCESS_WASM_UNWINDWASM_H
+
+#include "lldb/Target/RegisterContext.h"
+#include "lldb/Target/Unwind.h"
+#include <vector>
+
+namespace lldb_private {
+namespace wasm {
+
+/// UnwindWasm manages stack unwinding for a WebAssembly process.
+class UnwindWasm : public lldb_private::Unwind {
+public:
+  UnwindWasm(lldb_private::Thread &thread) : Unwind(thread) {}
+  ~UnwindWasm() override = default;
+
+protected:
+  void DoClear() override {
+    m_frames.clear();
+    m_unwind_complete = false;
+  }
+
+  uint32_t DoGetFrameCount() override;
+
+  bool DoGetFrameInfoAtIndex(uint32_t frame_idx, lldb::addr_t &cfa,
+                             lldb::addr_t &pc,
+                             bool &behaves_like_zeroth_frame) override;
+
+  lldb::RegisterContextSP
+  DoCreateRegisterContextForFrame(lldb_private::StackFrame *frame) override;
+
+private:
+  std::vector<lldb::addr_t> m_frames;
+  bool m_unwind_complete = false;
+
+  UnwindWasm(const UnwindWasm &);
+  const UnwindWasm &operator=(const UnwindWasm &) = delete;
+};
+
+} // namespace wasm
+} // namespace lldb_private
+
+#endif
diff --git a/lldb/source/Plugins/Process/gdb-remote/ProcessGDBRemote.cpp b/lldb/source/Plugins/Process/gdb-remote/ProcessGDBRemote.cpp
index a2c34ddfc252e..14dfdec6a6f62 100644
--- a/lldb/source/Plugins/Process/gdb-remote/ProcessGDBRemote.cpp
+++ b/lldb/source/Plugins/Process/gdb-remote/ProcessGDBRemote.cpp
@@ -323,6 +323,11 @@ ProcessGDBRemote::~ProcessGDBRemote() {
   KillDebugserverProcess();
 }
 
+std::shared_ptr<ThreadGDBRemote>
+ProcessGDBRemote::CreateThread(lldb::tid_t tid) {
+  return std::make_shared<ThreadGDBRemote>(*this, tid);
+}
+
 bool ProcessGDBRemote::ParsePythonTargetDefinition(
     const FileSpec &target_definition_fspec) {
   ScriptInterpreter *interpreter =
@@ -1594,7 +1599,7 @@ bool ProcessGDBRemote::DoUpdateThreadList(ThreadList &old_thread_list,
       ThreadSP thread_sp(
           old_thread_list_copy.RemoveThreadByProtocolID(tid, false));
       if (!thread_sp) {
-        thread_sp = std::make_shared<ThreadGDBRemote>(*this, tid);
+        thread_sp = CreateThread(tid);
         LLDB_LOGV(log, "Making new thread: {0} for thread ID: {1:x}.",
                   thread_sp.get(), thread_sp->GetID());
       } else {
@@ -1726,7 +1731,7 @@ ThreadSP ProcessGDBRemote::SetThreadStopInfo(
 
     if (!thread_sp) {
       // Create the thread if we need to
-      thread_sp = std::make_shared<ThreadGDBRemote>(*this, tid);
+      thread_sp = CreateThread(tid);
       m_thread_list_real.AddThread(thread_sp);
     }
   }
diff --git a/lldb/source/Plugins/Process/gdb-remote/ProcessGDBRemote.h b/lldb/source/Plugins/Process/gdb-remote/ProcessGDBRemote.h
index 7ae33837fd067..7c3dfb179a4b3 100644
--- a/lldb/source/Plugins/Process/gdb-remote/ProcessGDBRemote.h
+++ b/lldb/source/Plugins/Process/gdb-remote/ProcessGDBRemote.h
@@ -246,6 +246,8 @@ class ProcessGDBRemote : public Process,
 
   ProcessGDBRemote(lldb::TargetSP target_sp, lldb::ListenerSP listener_sp);
 
+  virtual std::shared_ptr<ThreadGDBRemote> CreateThread(lldb::tid_t tid);
+
   bool SupportsMemoryTagging() override;
 
   /// Broadcaster event bits definitions.
diff --git a/lldb/source/Target/Platform.cpp b/lldb/source/Target/Platform.cpp
index 8000cd07565ae..f9bf5d1a9f160 100644
--- a/lldb/source/Target/Platform.cpp
+++ b/lldb/source/Target/Platform.cpp
@@ -2076,6 +2076,12 @@ size_t Platform::GetSoftwareBreakpointTrapOpcode(Target &target,
     trap_opcode_size = sizeof(g_loongarch_opcode);
   } break;
 
+  case llvm::Triple::wasm32: {
+    static const uint8_t g_wasm_opcode[] = {0x00}; // unreachable
+    tr...
[truncated]

@vogelsgesang changed the title from "[lld] Add WebAssembly Process Plugin" to "[lldb] Add WebAssembly Process Plugin" on Jul 22, 2025
@JDevlieghere force-pushed the qWasmCallStack branch 2 times, most recently from 46a7388 to 1d0ad45, on July 23, 2025 00:32
@DavidSpickett
Collaborator

How does this relate to / overlap with existing PRs #77949 and #78977? They are unlikely to get merged by the authors, so I assume you'll make your own equivalent.

qWasmCallStack

This is a bit of a break from the usual reading unwind info, and I wonder what other stuff is different in this ecosystem.

Can we get an RFC with an overview of that? Like are these runtimes generally implementing GDB stubs, is the debug model super different, that sort of thing.

At least testing can be done anywhere the runtime can run, that's a cool feature.

@JDevlieghere
Member Author

JDevlieghere commented Jul 23, 2025

How does this relate to / overlap with existing PRs #77949 and #78977? They are unlikely to get merged by the authors, so I assume you'll make your own equivalent.

I had totally forgotten about those PRs. Yes, they're iterations of the same patch by @paolosevMSFT (which I also based this patch on). The biggest difference is that I'm breaking it down into smaller pieces to make reviewing them easier.

qWasmCallStack

This is a bit of a break from the usual reading unwind info, and I wonder what other stuff is different in this ecosystem.

I've been told that the runtime can have a different stack (separate from the native stack) for the Wasm code and also that the format of call frames may not match native stack frame ABI.
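
To make that concrete, here is a minimal sketch of turning a qWasmCallStack reply into a list of frame PCs. It assumes the payload is a hex string carrying one 64-bit little-endian PC per frame, which matches how the patch copies the decoded bytes into an `addr_t` buffer on a little-endian host. `HexValue` and `ParseCallStack` are hypothetical helpers, not LLDB APIs, and unlike the patch this version rejects a trailing partial entry instead of dropping it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Returns the value of one hex digit, or -1 if `c` is not a hex digit.
static int HexValue(char c) {
  if (c >= '0' && c <= '9') return c - '0';
  if (c >= 'a' && c <= 'f') return c - 'a' + 10;
  if (c >= 'A' && c <= 'F') return c - 'A' + 10;
  return -1;
}

// Decode a hex payload into 64-bit PCs (assumed little-endian, 16 hex
// digits per frame). Returns false and leaves `pcs` empty on malformed input.
bool ParseCallStack(const std::string &hex, std::vector<uint64_t> &pcs) {
  constexpr size_t kHexPerPC = 16;
  pcs.clear();
  if (hex.empty() || hex.size() % kHexPerPC != 0)
    return false;

  for (size_t i = 0; i < hex.size(); i += kHexPerPC) {
    uint64_t pc = 0;
    for (size_t byte = 0; byte < 8; ++byte) {
      int hi = HexValue(hex[i + 2 * byte]);
      int lo = HexValue(hex[i + 2 * byte + 1]);
      if (hi < 0 || lo < 0) {
        pcs.clear();
        return false;
      }
      // Byte 0 is the least significant byte of the PC.
      pc |= static_cast<uint64_t>((hi << 4) | lo) << (8 * byte);
    }
    pcs.push_back(pc);
  }
  assert(pcs.size() == hex.size() / kHexPerPC);
  return true;
}
```

Under these assumptions, a two-frame reply of `9c01000000000040e501000000000040` would yield the PCs `0x400000000000019c` and `0x40000000000001e5`, i.e. frames #0 and #1 from the backtrace in the description.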

Can we get an RFC with an overview of that? Like are these runtimes generally implementing GDB stubs, is the debug model super different, that sort of thing.

I'm not sure I'm well versed enough in the world of WASM to create an RFC, but I did do a survey of how debugging works in the different runtimes.

  • Wasmtime is a WebAssembly runtime developed by the Bytecode Alliance. It's a standalone runtime that can execute WebAssembly modules outside of web browsers. Wasmtime uses Cranelift (a code generator) to compile Wasm to native machine code. Debugging works by debugging wasmtime itself, which performs JIT compilation of the WebAssembly bytecode to native code. It also supports translating the generated DWARF debug info, so that native debuggers can work with it to debug the JIT-compiled code.

  • Chrome has a DevTools Plugin for debugging C/C++ WebAssembly applications (using DWARF debug information). It uses LLDB to parse DWARF, but otherwise relies on Chrome’s integrated debugging support for Wasm.

  • WebAssembly Micro Runtime (WAMR) is a lightweight standalone WebAssembly runtime. When built with debugging support, it includes a GDB remote stub which a patched LLDB can talk to.

I'm not aware of any other runtimes besides WAMR that support debugging through GDB remote. Of all the approaches, I think it's the most "desirable" going forward. The fact that there is a Bytecode Alliance runtime that supports it makes it even more compelling.

Correction: Looks like Chrome V8 implements this as well, which is presumably the reason Paolo was supporting this in the first place: https://chromium-review.googlesource.com/c/v8/v8/+/2571341

At least testing can be done anywhere the runtime can run, that's a cool feature.

Yes, long term it would be cool if we can build inferiors and decide where to run them (e.g. QEMU, WAMR, etc). In the short term, we can get pretty good test coverage with the GDB remote test cases.
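
For context on why that is cheap to test: a mock stub only has to speak the standard GDB remote framing, `$<payload>#<checksum>`, where the checksum is the payload bytes summed modulo 256 and printed as two hex digits. The tests in this PR are Python; the C++ below is only an illustration of the framing, with hypothetical helper names (`FramePacket`, `BuildCallStackQuery`) and no escaping or run-length encoding.

```cpp
#include <cassert>
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Wrap a payload as "$<payload>#<two-hex-digit checksum>".
// Simplification: payloads containing '$' or '#' would need escaping,
// which this sketch does not implement.
std::string FramePacket(const std::string &payload) {
  assert(payload.find('$') == std::string::npos &&
         payload.find('#') == std::string::npos);
  uint8_t sum = 0;
  for (unsigned char c : payload)
    sum = static_cast<uint8_t>(sum + c); // modulo-256 sum of payload bytes
  char checksum[3];
  std::snprintf(checksum, sizeof(checksum), "%02x",
                static_cast<unsigned>(sum));
  return "$" + payload + "#" + checksum;
}

// Build the framed query for one thread: "qWasmCallStack:" followed by
// the thread ID in hex, as the patch formats it.
std::string BuildCallStackQuery(uint64_t tid) {
  char tid_hex[17]; // up to 16 hex digits plus the terminator
  std::snprintf(tid_hex, sizeof(tid_hex), "%" PRIx64, tid);
  return FramePacket(std::string("qWasmCallStack:") + tid_hex);
}
```

For example, `FramePacket("OK")` produces `$OK#9a`, the usual acknowledgement reply.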

To put things more pragmatically, LLDB has partial support for Wasm debugging using gdb-remote. There's an "official" runtime that implements a gdb stub with the necessary extensions. My hope is that going forward, support in LLDB would encourage other runtimes to adopt this as well.

@JDevlieghere
Member Author

Tagging @xujuntwt95329, @mh4ck-Thales and @xwang98 who took part in the discussion in the previous PRs.

@DavidSpickett
Collaborator

I've been told that the runtime can have a different stack (separate from the native stack) for the Wasm code and also that the format of call frames may not match native stack frame ABI.

Sure, so this is like qemu-user? You want to debug the wasm process being hosted within the runtime not the runtime itself.

The internet is 99% "use a web browser" for debugging, and https://docs.wasmtime.dev/examples-debugging-native-debugger.html seems to be about debugging the runtime with a small amount of the Wasm side, so I can't tell much from that page.

@DavidSpickett
Collaborator

This is an overly harsh way of phrasing it, but hopefully it gives you an idea of why I'm asking for an overview of this effort:

  • We already have at least one barely tested architecture, I would like to avoid picking up another one.

I think WASM is cool (all WASM userspace when?), I think not having bugs is cool too. I personally trust that you have done the leg work, but I have to verify that at least a bit, as I would with any other random contributor.

I also know that WASM is a hot topic so for this work to get the best reception, a statement of intent might draw interest, and set expectations appropriately.

But ok, as an RFC would be unlikely to generate actual objections, I'll say what I think here.

The biggest difference is that I'm breaking it down in smaller pieces to make reviewing them easier.

Great, they were a bit much.

I've been told that the runtime can have a different stack (separate from the native stack) for the Wasm code and also that the format of call frames may not match native stack frame ABI.

So they have designed their server specifically to allow debugging both? I wonder how they manage that. A multi-architecture target would be interesting (ala ARM64EC on Windows) but I presume it's different targets. Can lldb handle that? It doesn't have to but it sounds pretty cool if it can.

Wasmtime
It also supports translating the generated DWARF debug info, so that native debuggers can work with it to debug the JIT-compiled code.

So for this, you aren't stepping WASM instructions, you're debugging it like any other native JIT. The input to that JIT just happens to be WASM.

Chrome has a DevTools Plugin for debugging C/C++ WebAssembly applications (using DWARF debug information). It uses LLDB to parse DWARF, but otherwise relies on Chrome’s integrated debugging support for Wasm.

Yeah, 99% of the internet says "just use chrome" to debug. So we're not going to plug in to this part of it then.

I'm not aware of any other runtimes besides WAMR that support debugging through GDB remote. Of all the approaches, I think it's the most "desirable" going forward. The fact that there is a Bytecode Alliance runtime that supports it makes it even more compelling.

We will need to state which runtime we intend to support and one from the people writing the standard is the best choice I agree.

I suppose it running on small devices that cannot host a browser is why they went that way, do the debugger UI stuff elsewhere.

Yes, long term it would be cool if we can build inferiors and decide where to run them (e.g. QEMU, WAMR, etc). In the short term, we can get pretty good test coverage with the GDB remote test cases.

We can get a surprising amount of coverage that way, that is true.

What we can't do is answer "is random lldb command supposed to work with wasm?" That's what a test suite run would provide.

Ultimately I would prefer regular test suite runs using this WASM runtime, but I would settle for a manual run that gives us a snapshot of the state of WASM support.

Finally, could you open a meta-issue for WASM support? Like we have for RISC-V and LoongArch.

Ofc these issues get outdated pretty fast but at least it's something we can put on the website and folks can ask their "should X work" questions.

...now I've ranted enough about process let me review some actual code :)

@DavidSpickett
Collaborator

Also can we get a vague idea of how many more PRs are in the stack for this? Just to save me asking "why is X missing" over and over.

@DavidSpickett
Collaborator

process connect --plugin wasm connect://localhost:4567

On the original review, auto-detection of the plugin wasn't working which meant this plugin wasm was needed. Did you ever find out what was the cause? I presume lldb still has a local copy of the program file, so it is looking for something that can handle wasm.

@mh4ck-Thales
Contributor

Thanks for picking up the work on this!

I'm not aware of any other runtimes besides WAMR that support debugging through GDB remote. Of all the approaches, I think it's the most "desirable" going forward. The fact that there is a Bytecode Alliance runtime that supports it makes it even more compelling.

I agree that this is the best method for implementing Wasm debugging. We want to be able to debug the Wasm user program transparently, on any supported runtime, without having to deal with runtime-related specificities.

We will need to state which runtime we intend to support and one from the people writing the standard is the best choice I agree.

I suppose it running on small devices that cannot host a browser is why they went that way, do the debugger UI stuff elsewhere.

For now only WAMR supports remote debugging with lldb, but I think (hope?) that other runtimes will be supported in the future. In this situation maybe it would be nice to discuss with the Bytecode Alliance what is expected from such a feature and if we can somewhat standardize it, so other runtimes know what they have to implement in order to support it. I'm bringing @alexcrichton into the loop, he's involved in the Bytecode Alliance and LLVM, maybe he'll be able to help us on these topics.

@DavidSpickett
Collaborator

For now only WAMR supports remote debugging with lldb, but I think (hope?) that other runtimes will be supported in the future. In this situation maybe it would be nice to discuss with the Bytecode Alliance what is expected from such a feature and if we can somewhat standardize it, so other runtimes know what they have to implement in order to support it. I'm bringing @alexcrichton into the loop, he's involved in the Bytecode Alliance and LLVM, maybe he'll be able to help us on these topics.

Nice, yes once there is something working it will be interesting to see what we build with collaboration.

To be clear: I say that lldb should state which runtime(s) it supports to set expectations. If changes to work with others are not disruptive then the more the merrier (assuming my beloved test coverage is there :) ).

@mh4ck-Thales
Contributor

To be clear: I say that lldb should state which runtime(s) it supports to set expectations. If changes to work with others are not disruptive then the more the merrier (assuming my beloved test coverage is there :) ).

I'm not well-versed in lldb, but wouldn't it make more sense for lldb to propose a standard for debugging Wasm remotely and to support any runtime complying with that standard? It seems simpler to have the runtimes follow what lldb wants instead of adapting lldb to each runtime, especially as the list grows. For the test coverage we can have one or more reference runtimes (e.g. the ones of the Bytecode Alliance), but I would say it is up to the runtime developers to ensure they work with lldb. On our side we just want to ensure that we don't have any bugs / regressions in lldb, and for that testing on 2/3 runtimes should be enough.

@DavidSpickett
Collaborator

Long term, if the WASM standard makes recommendations for debugging then that's great. You are correct there.

Short term, users will need to know what to expect so we need a short, simple, answer to "I tried lldb with whatever runtime and it didn't work". Which ideally is "X is known to work to this specific extent, patches are welcome to support Y and Z".

@DavidSpickett
Collaborator

Certainly lldb being in this conversation early would be a good way to prevent the reliance on GDB specific behaviour that often happens in the native debugging world.

@JDevlieghere
Member Author

Sure, so this is like qemu-user? You want to debug the wasm process being hosted within the runtime not the runtime itself.

Because the internet is 99% "use a web browser" for debug, then https://docs.wasmtime.dev/examples-debugging-native-debugger.html seems to be about debugging the runtime with a small amount of the wasm side, can't tell much from that page.

Yep, it's the typical choice between "user" vs "system" debugging. Both are valuable and have their own strengths. I think the approach wasmtime took is a good choice for it, but it wouldn't make sense for something like V8 where you'd have to debug the whole JavaScript runtime to debug your Wasm code. Hence why I think the "user" approach is the most promising, generally speaking.

This is an overly harsh way of phrasing it, but hopefully it gives you an idea of why I'm asking for an overview of this effort:

  • We already have at least one barely tested architecture, I would like to avoid picking up another one.

I think WASM is cool (all WASM userspace when?), I think not having bugs is cool too. I personally trust that you have done the leg work, but I have to verify that at least a bit, as I would with any other random contributor.

I don't think that's harsh at all. I think I made almost entirely the same point in one of Paolo's original PRs.

I also know that WASM is a hot topic so for this work to get the best reception, a statement of intent might draw interest, and set expectations appropriately.

I was under the impression that we had reached consensus on this in the various previous PRs. My read of the situation was that there were some practical issues (like the PRs being too big, etc) but no fundamental objections to the approach. Admittedly, I should've included that assumption explicitly in this PR.

But ok, as an RFC would be unlikely to generate actual objections, I'll say what I think here.

The biggest difference is that I'm breaking it down in smaller pieces to make reviewing them easier.

Great, they were a bit much.

I've been told that the runtime can have a different stack (separate from the native stack) for the Wasm code and also that the format of call frames may not match native stack frame ABI.

So they have designed their server specifically to allow debugging both? I wonder how they manage that. A multi-architecture target would be interesting (ala ARM64EC on Windows) but I presume it's different targets. Can lldb handle that? It doesn't have to but it sounds pretty cool if it can.

Not really, this is back to the difference between debugging your runtime (i.e. WAMR or V8) vs debugging the client. To use an analogy, it's the difference between debugging the kernel and looking at its data structures to figure out the processes vs using debugserver to debug a userspace program.

Wasmtime
It also supports translating the generated DWARF debug info, so that native debuggers can work with it to debug the JIT-compiled code.

So for this, you aren't stepping WASM instructions, you're debugging it like any other native JIT. The input to that JIT just happens to be WASM.

I wouldn't even call it debugging WASM. You're debugging native code that happened to be JITed from Wasm.

Chrome has a DevTools Plugin for debugging C/C++ WebAssembly applications (using DWARF debug information). It uses LLDB to parse DWARF, but otherwise relies on Chrome’s integrated debugging support for Wasm.

Yeah, 99% of the internet says "just use chrome" to debug. So we're not going to plug in to this part of it then.

Correct, although since they use V8, if they exposed that debug stub, they'd get native debugging with LLDB out of the box.

I'm not aware of any other runtimes besides WAMR that support debugging through GDB remote. Of all the approaches, I think it's the most "desirable" going forward. The fact that there is a Bytecode Alliance runtime that supports it makes it even more compelling.

We will need to state which runtime we intend to support and one from the people writing the standard is the best choice I agree.

Sounds good, we can list WAMR and V8 and say that we support anyone that implements the GDB remote protocol plus the handful of WASM extensions (which we'll document).

I suppose it running on small devices that cannot host a browser is why they went that way, do the debugger UI stuff elsewhere.

Yes, long term it would be cool if we can build inferiors and decide where to run them (e.g. QEMU, WAMR, etc). In the short term, we can get pretty good test coverage with the GDB remote test cases.

We can get a surprising amount of coverage that way, that is true.

What we can't do is answer "is random lldb command supposed to work with wasm?" That's what a test suite run would provide.

Ultimately I would prefer regular test suite runs using this WASM runtime, but I would settle for a manual run that gives us a snapshot of the state of WASM support.

That seems fair and a good way to flush out issues once we've added support for everything we know is missing.

Finally, could you open a meta-issue for WASM support? Like we have for RISC-V and LoongArch.

Ofc these issues get outdated pretty fast but at least it's something we can put on the website and folks can ask their "should X work" questions.

#150449

...now I've ranted enough about process let me review some actual code :)

Haha, I appreciate the engagement. Thanks for all the input!

@JDevlieghere
Member Author

process connect --plugin wasm connect://localhost:4567

On the original review, auto-detection of the plugin wasn't working which meant this plugin wasm was needed. Did you ever find out what was the cause? I presume lldb still has a local copy of the program file, so it is looking for something that can handle wasm.

I didn't look into it yet, but I added it as a task to #150449

@alexcrichton
Contributor

I'm bringing @alexcrichton into the loop, he's involved in the Bytecode Alliance and LLVM, maybe he'll be able to help us on these topics.

👋

FWIW I'm very much an outsider here so I don't fully understand all the dynamics in play per se, but what I can speak more to is plans on the Wasmtime side of things. For Wasmtime we have an approved RFC about the methodology and priorities for implementing debugging. This plan primarily goes through the Debug Adapter Protocol and intentionally does not natively implement GDB/LLDB integration mostly to handle wasm programs that are an interpreter for their own language (e.g. Python-compiled-to-wasm).

We do not currently have anyone slated to implement this work as it hasn't been a priority for existing maintainers yet and we haven't had other volunteers. My assumption though would be that if LLDB had a native wasm plugin support then Wasmtime would implement a debug adapter component to bridge what Wasmtime would implement natively and what LLDB supported.

I'm not personally aware of other runtimes that want to mirror what WAMR is doing here. The current plan for Wasmtime won't include mirroring the support natively, but that's not to say Wasmtime's plan is set in stone and not possible to change either.

@fitzgen
Contributor

fitzgen commented Jul 24, 2025

Great to see more interest in WebAssembly from lldb, and thanks for sharing our debugging plans here, @alexcrichton.

We do not currently have anyone slated to implement this work as it hasn't been a priority for existing maintainers yet and we haven't had other volunteers.

Small clarification / additional context as the Wasmtime-focused team I am on just finished some planning: some folks (@cfallin and others) have actually started working on the implementation of parts of that debugging RFC already, and while they haven't yet started on the debug adapter protocol implementation that the lldb frontend (or others) could attach to, that is the next big milestone after the current one. The plan is to begin the debug adapter protocol implementation work roughly around the beginning of Q4.

@JDevlieghere
Member Author

Hey @alexcrichton and @fitzgen, thanks for chiming in here! Implementing a DAP server to support interpreted languages is an interesting idea. As someone who's been involved with lldb-dap I have some experience in that space.

My immediate question about how you're planning on supporting that was answered by the "Debug Adapter Components" section. The following sentence stood out to me:

[W]e propose using the component model to design an interface for a debug adapter. A debug adapter will import a private interface that gives it low-level access to the underlying Wasm debuggee, and will export a higher-level interface that the debugger will interact with and query.

This is basically describing the GDB remote protocol. In the world of native debugging, it's the standard for talking to a debugger like lldb and gdb. I definitely think it's something worth considering, and it sounds like Alex seems to be saying something similar.

I'm not personally aware of other runtimes that want to mirror what WAMR is doing here. The current plan for Wasmtime won't include mirroring the support natively, but that's not to say Wasmtime's plan is set in stone and not possible to change either.

The V8 JavaScript engine is another example, and in the context of Swift, WasmKit is interested in it as well.

Small clarification / additional context as the Wasmtime-focused team I am on just finished some planning: some folks (@cfallin and others) have actually started working on the implementation of parts of that debugging RFC already, and while they haven't yet started on the debug adapter protocol implementation that the lldb frontend (or others) could attach to, that is the next big milestone after the current one. The plan is to begin the debug adapter protocol implementation work roughly around the beginning of Q4.

When you talk about LLDB as a "client", I assume you mean the "debug adapter component" rather than the "debug adapter protocol"?

@fitzgen
Contributor

fitzgen commented Jul 24, 2025

This is basically describing the GDB remote protocol.

Clarification: The interfaces described in that RFC and that I am describing below are programmatic/function interfaces (specifically Wasm component interfaces) -- not remote protocols.

More on the GDB protocol specifically down below.

When you talk about LLDB as a "client", I assume you mean the "debug adapter component" rather than the "debug adapter protocol"?

I see two new ways that LLDB and Wasmtime could interact in the future (on top of the existing way by attaching LLDB to a native Wasmtime process and debugging Wasm (and native code) via the JIT code registration system APIs):

  1. LLDB's TUI frontend could be a client, connected to Wasmtime as a server, and they speak the debug adapter protocol to each other.

  2. LLDB could be compiled to Wasm (ideally as a library, but there are ways a Wasm binary with a main could be embedded) and used to implement a "debug adapter component". Debug adapter components are Wasm components that layer source-level debugging on top of Wasm-machine-level debugging. They export a source-level debugging interface and import a wasm-level debugging interface (which is implemented by Wasmtime itself). When the debugger frontend sends a set-breakpoint request over the debug adapter protocol, for example, Wasmtime will translate that protocol request into an invocation of the associated function in the debug adapter component's exported source-level interface. Then the debug adapter component would (in this case) invoke its LLDB library to determine where to set breakpoints in Wasm, and then finally invoke its imported Wasm-level debugging interface to actually have Wasmtime set those breakpoints in its JIT code.
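A rough sketch of the layering described in item 2, to make the import/export direction concrete. Everything here is hypothetical: the class and method names (`WasmLevelDebug`, `SourceLevelAdapter`, `set_code_breakpoint`), the line table, and the function index are invented for illustration and are not taken from the Wasmtime RFC or from LLDB. In a real adapter the line lookup would be done by a DWARF-aware library rather than a dictionary.

```python
from typing import Dict, List, Protocol, Tuple


class WasmLevelDebug(Protocol):
    """Imported interface (hypothetical): Wasm-level operations the runtime provides."""

    def set_code_breakpoint(self, func_index: int, code_offset: int) -> None: ...


class RecordingRuntime:
    """Stand-in for the runtime side; it only records what it was asked to do."""

    def __init__(self) -> None:
        self.breakpoints: List[Tuple[int, int]] = []

    def set_code_breakpoint(self, func_index: int, code_offset: int) -> None:
        self.breakpoints.append((func_index, code_offset))


class SourceLevelAdapter:
    """Exported interface (hypothetical): source-level requests from the frontend."""

    def __init__(
        self,
        runtime: WasmLevelDebug,
        line_table: Dict[Tuple[str, int], List[Tuple[int, int]]],
    ) -> None:
        self._runtime = runtime
        # Maps (file, line) to Wasm locations as (function index, code offset).
        self._line_table = line_table

    def set_breakpoint(self, path: str, line: int) -> int:
        """Resolve a source location and forward each hit to the Wasm-level interface."""
        locations = self._line_table.get((path, line), [])
        for func_index, code_offset in locations:
            self._runtime.set_code_breakpoint(func_index, code_offset)
        return len(locations)  # number of Wasm locations the request resolved to


# Made-up data: pretend test.c:4 maps to offset 0x1C in function 1.
runtime = RecordingRuntime()
adapter = SourceLevelAdapter(runtime, {("test.c", 4): [(1, 0x1C)]})
resolved = adapter.set_breakpoint("test.c", 4)
```

The point of the shape is that the adapter never touches JIT code itself; it only translates between the two interfaces.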

Both of the above could even happen at the same time.

We could also potentially implement a GDB protocol shim as a debug adapter component. Seems possible but I haven't fully thought it through.


Backing up, here are the main reasons we didn't choose the GDB protocol:

  • It excludes debugging interpreted-in-Wasm programs, as Alex mentioned. Today, we have many users that write Wasm programs in Python or JS that happen to be implemented via compiling the Python/JS interpreter to Wasm and running the user's program inside the interpreter. Users want to debug their Python/JS program, of course, not the interpreter. With debug adapter components, language runtimes have the flexibility to implement their own debug adapter component specific to themselves, enabling source-level debugging of their interpreted programs.

  • Wasm as a language has a bunch of things that don't fit neatly into the GDB protocol and its idea of an abstract ISA:

    • it is a Harvard architecture with a managed stack and structured control flow (for example, IIRC, the GDB protocol wants stack walking logic to be done on the client and implemented via inspecting raw stack slots the same as any other memory access on the server, and that is not possible with Wasm's managed stack),
    • it has multiple different kinds of "registers" (value-stack operands, locals, and globals),
    • it has managed, unforgeable references whose bit patterns cannot be inspected and have different valid operations depending on the kind of reference,
    • GC structs and arrays as core language primitives, whose raw layout, field offsets, alignment, and bits cannot be inspected, and must be accessed only via struct.get and similar,
    • many different memories (p is a pointer, but a pointer into which memory?),
    • tables that are kind of like memories but for managed references,
    • and probably more things I am forgetting off the top of my head.

    Now if you want to only support the exact shape of Wasm module that LLVM emits (or at least emits currently, who knows what new shapes it will emit as it grows new features in the future) where there is exactly one memory and managed references aren't used, etc... and you don't care about debugging interpreted programs, then a bunch of these concerns evaporate and/or can be papered over. And maybe then the GDB protocol is pretty viable. But Wasmtime intends to support debugging the full WebAssembly language, including all of its oddities, as well as all Wasm toolchains regardless of which Wasm features they leverage.
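To illustrate the "papering over" for the single-memory case, here is a minimal sketch of a 64-bit address that carries an address-space tag, so that code and linear memory can share one flat address type. The layout (2 tag bits, a 30-bit module id, a 32-bit offset) and all names are assumptions made for this example; the thread does not spell out an encoding. The layout was picked only because it is consistent with addresses such as 0x40000000000001ad shown in the PR description.

```python
from enum import IntEnum
from typing import NamedTuple


class Space(IntEnum):
    MEMORY = 0  # assumed tag for linear memory
    CODE = 1    # assumed tag for the module's code


class WasmAddress(NamedTuple):
    space: Space
    module_id: int
    offset: int


def encode(addr: WasmAddress) -> int:
    """Pack tag, module id and offset into one 64-bit value (assumed layout)."""
    if not 0 <= addr.module_id < (1 << 30):
        raise ValueError("module id does not fit in 30 bits")
    if not 0 <= addr.offset < (1 << 32):
        raise ValueError("offset does not fit in 32 bits")
    return (int(addr.space) << 62) | (addr.module_id << 32) | addr.offset


def decode(value: int) -> WasmAddress:
    """Split a 64-bit value back into its three fields."""
    return WasmAddress(
        space=Space(value >> 62),
        module_id=(value >> 32) & 0x3FFFFFFF,
        offset=value & 0xFFFFFFFF,
    )


pc = decode(0x40000000000001AD)
```

Note that a scheme like this does not distinguish several memories inside one module, nor tables or managed references, which is exactly the limitation raised above.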

@JDevlieghere
Member Author

I see two new ways that LLDB and Wasmtime could interact in the future (on top of the existing way by attaching LLDB to a native Wasmtime process and debugging Wasm (and native code) via the JIT code registration system APIs):

  1. LLDB's TUI frontend could be a client, connected to Wasmtime as a server, and they speak the debug adapter protocol to each other.

I'm personally skeptical of this part. LLDB's command line driver is very small and doesn't do much. You absolutely could turn that into a DAP client, but you wouldn't get much in return. Most of the "TUI" parts of LLDB are tightly coupled with the core of the library. For folks to retain the same "look and feel", you would have to reimplement all the commands in terms of DAP operations. Doable of course, but likely a lot of work for very little return.

  2. LLDB could be compiled to Wasm (ideally as a library, but there are ways a Wasm binary with a main could be embedded) and used to implement a "debug adapter component". Debug adapter components are Wasm components that layer source-level debugging on top of Wasm-machine-level debugging. They export a source-level debugging interface and import a wasm-level debugging interface (which is implemented by Wasmtime itself). When the debugger frontend sends a set-breakpoint request over the debug adapter protocol, for example, Wasmtime will translate that protocol request into an invocation of the associated function in the debug adapter component's exported source-level interface. Then the debug adapter component would (in this case) invoke its LLDB library to determine where to set breakpoints in Wasm, and then finally invoke its imported Wasm-level debugging interface to actually have Wasmtime set those breakpoints in its JIT code.

How about instead of compiling LLDB to Wasm, we compile a small binary to Wasm and run that in process, handling the callbacks. If that hypothetical server implemented the GDB remote protocol, you basically have how we do debugging on Linux/Windows (lldb-server) and Darwin (debugserver).

Both of the above could even happen at the same time.

We could also potentially implement a GDB protocol shim as a debug adapter component. Seems possible but I haven't fully thought it through.

I'd be happy to keep chatting about this if you're interested in exploring that path further, now or in the future. I also don't want to necessarily push you towards the GDB remote protocol. For languages like C++, Rust, Swift, or really anything targeting LLVM/supported by LLDB, I strongly believe that going with the GDB remote protocol is the best and easiest way to support debugging, but I also totally acknowledge that the trade-offs may be different for Wasmtime. That said, they also don't necessarily need to be mutually exclusive.

@jimingham
Collaborator

jimingham commented Jul 24, 2025

I see two new ways that LLDB and Wasmtime could interact in the future (on top of the existing way by attaching LLDB to a native Wasmtime process and debugging Wasm (and native code) via the JIT code registration system APIs):

  1. LLDB's TUI frontend could be a client, connected to Wasmtime as a server, and they speak the debug adapter protocol to each other.

I'm personally skeptical of this part. LLDB's command line driver is very small and doesn't do much. You absolutely could turn that into a DAP client, but you wouldn't get much in return. Most of the "TUI" parts of LLDB are tightly coupled with the core of the library. For folks to retain the same "look and feel", you would have to reimplement all the commands in terms of DAP operations. Doable of course, but likely a lot of work for very little return.

  2. LLDB could be compiled to Wasm (ideally as a library, but there are ways a Wasm binary with a main could be embedded) and used to implement a "debug adapter component". Debug adapter components are Wasm components that layer source-level debugging on top of Wasm-machine-level debugging. They export a source-level debugging interface and import a wasm-level debugging interface (which is implemented by Wasmtime itself). When the debugger frontend sends a set-breakpoint request over the debug adapter protocol, for example, Wasmtime will translate that protocol request into an invocation of the associated function in the debug adapter component's exported source-level interface. Then the debug adapter component would (in this case) invoke its LLDB library to determine where to set breakpoints in Wasm, and then finally invoke its imported Wasm-level debugging interface to actually have Wasmtime set those breakpoints in its JIT code.

How about instead of compiling LLDB to Wasm, we compile a small binary to Wasm and run that in process, handling the callbacks. If that hypothetical server implemented the GDB remote protocol, you basically have how we do debugging on Linux/Windows (lldb-server) and Darwin (debugserver).

Both of the above could even happen at the same time.
We could also potentially implement a GDB protocol shim as a debug adapter component. Seems possible but I haven't fully thought it through.

I'd be happy to keep chatting about this if you're interested in exploring that path further, now or in the future. I also don't want to necessarily push you towards the GDB remote protocol. For languages like C++, Rust, Swift, or really anything targeting LLVM/supported by LLDB, I strongly believe that going with the GDB remote protocol is the best and easiest way to support debugging, but I also totally acknowledge that the trade-offs may be different for Wasmtime. That said, they also don't necessarily need to be mutually exclusive.

The part of the debugger that handles symbol and debug information is big, requires access to lots of large files, allocates lots of memory, etc. So you really want to be able to run lldb itself on the biggest system you have to hand, not the system on which you run the program you are inspecting, which might be a resource constrained device. That's why the gdb-remote protocol or something like it is so attractive. The part that has to run on the system with the debuggee doesn't have to reason about symbols and types, and so is small and doesn't consume a lot of resources, and the part that has to deal with lots of data can run somewhere else if that's more appropriate.
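To give a feel for how little the stub side needs, here is a minimal sketch of the basic gdb-remote packet framing: a `$`, the payload, a `#`, and a two-digit hex checksum that is the sum of the payload bytes modulo 256. The function names are made up for this example, and the sketch deliberately ignores escaping, run-length encoding, and acknowledgements.

```python
def frame_packet(data: str) -> str:
    """Wrap a payload as $<data>#<checksum>."""
    checksum = sum(data.encode("ascii")) % 256
    return f"${data}#{checksum:02x}"


def unframe_packet(packet: str) -> str:
    """Validate the framing and checksum, then return the payload."""
    if len(packet) < 4 or not packet.startswith("$") or packet[-3] != "#":
        raise ValueError("malformed packet")
    data = packet[1:-3]
    received = int(packet[-2:], 16)
    if sum(data.encode("ascii")) % 256 != received:
        raise ValueError("checksum mismatch")
    return data


ok_reply = frame_packet("OK")
```

Nothing in this layer knows about symbols or types, which is the split being described: the stub shuffles bytes, and the debugger on the big machine interprets them.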

@mh4ck-Thales
Contributor

I'd be happy to keep chatting about this if you're interested in exploring that path further, now or in the future. I also don't want to necessarily push you towards the GDB remote protocol. For languages like C++, Rust, Swift, or really anything targeting LLVM/supported by LLDB, I strongly believe that going with the GDB remote protocol is the best and easiest way to support debugging, but I also totally acknowledge that the trade-offs may be different for Wasmtime. That said, they also don't necessarily need to be mutually exclusive.

I agree that the GDB remote protocol is the best / easiest way to debug everything supported by LLVM / that isn't interpreted in some way. Regarding the interpreted languages, I'd say that there is also a use for this kind of debugging for allowing runtime developers to debug their runtime when compiled to Wasm. Most issues in an interpreted code that are not Wasm-related can be debugged on the language's runtime directly without going through Wasm, and Wasm-related bugs will often need access to the runtime as the bug being Wasm-specific may have something to do with the runtime.

@mh4ck-Thales
Contributor

I tested this patch and while basic debugging works, some features that were available in #77949 are not working anymore (retrieval of variable values, disassembly of the Wasm bytecode...). I'm not sure if it's a bug or if it's because the #77949 patch has been split into incremental patches. Would you like me to report these problems here or in #150449?

@DavidSpickett
Collaborator

Hence why I think the "user" approach is the most promising, generally speaking.

Agreed. If I want to debug qemu-user I treat it like any other program, if I want the simulated process I connect to the internal stub. Some will want integrated solutions but that can come later.

I was under the impression that we had reached consensus on this in the various previous PRs. My read of the situation was that there were some practical issues (like the PRs being too big, etc) but no fundamental objections to the approach. Admittedly, I should've included that assumption explicitly in this PR.

You weren't wrong. The point of an RFC would have been more to provide a place to bring up all the weird details one might need to know to think about how this will work, e.g. the address spaces, the endianness, only having a PC, and so on.

But we're having that discussion now so it's fine. There's no massive issues with this PR itself so it's not causing confusion.

I'd be happy to keep chatting about this if you're interested in exploring that path further, now or in the future. I also don't want to necessarily push you towards the GDB remote protocol. For languages like C++, Rust, Swift, or really anything targeting LLVM/supported by LLDB, I strongly believe that going with the GDB remote protocol is the best and easiest way to support debugging, but I also totally acknowledge that the trade-offs may be different for Wasmtime. That said, they also don't necessarily need to be mutually exclusive.

Also the GDB remote protocol is very loosely a standard and has many problems, so in a general sense I do want to hear what projects who aren't required to use it want to do. I am interested to read the WASM debug proposals and see what fresh perspectives there are.

The part of the debugger that handles symbol and debug information is big, requires access to lots of large files, allocates lots of memory, etc. So you really want to be able to run lldb itself on the biggest system you have to hand, not the system on which you run the program you are inspecting, which might be a resource constrained device.

I wonder if it's intentional on WAMR's part to choose the gdb-remote protocol for this reason. I can understand why the "hosted" runtimes would want to take on more code for bigger benefits though.

I'm not sure if it's a bug or if it's because the #77949 patch has been split into incremental patches.

@JDevlieghere do you have a WIP tree they can use that has more changes?

@DavidSpickett
Collaborator

I'm personally skeptical of this part. LLDB's command line driver is very small and doesn't do much. You absolutely could turn that into a DAP client, but you wouldn't get much in return. Most of the "TUI" parts of LLDB are tightly coupled with the core of the library. For folks to retain the same "look and feel", you would have to reimplement all the commands in terms of DAP operations. Doable of course, but likely a lot of work for very little return.

Sounds like a thing some enthusiast will eventually do, but I don't see any command line DAP client debuggers out there now. Unless you count using neovim or emacs as the IDE but it's not the same sort of use, no low level commands there. One day when the DAP protocol grows a bunch of extensions maybe it'll happen.

I agree that most people will simply connect an existing IDE to this DAP server. Which sounds pretty neat, but doesn't require any input from lldb either way.

@JDevlieghere
Member Author

I tested this patch and while basic debugging works, some features that were available in #77949 are not working anymore (retrieval of variable values, disassembly of the Wasm bytecode...). I'm not sure if it's a bug or if it's because the #77949 patch has been split into incremental patches. Would you like me to report these problems here or in #150449?

Yep, that's expected: this PR contains only a subset: notably the foundation (the process plugin) and support for backtraces. I'll be adding the other features in separate PRs to make reviewing easier.

@JDevlieghere
Member Author

JDevlieghere commented Jul 25, 2025

@JDevlieghere do you have a WIP tree they can use that has more changes?

Not currently no, but it's only one or two more patches before we have parity with the original PR. I just need to decide how to split them up once this PR lands.

As much as I enjoy the discussion about the future of debugging Wasm, I want to make sure we don't lose focus in this PR. I think all outstanding issues have been addressed. @DavidSpickett you've been actively engaged here and in the previous PR so I'm mostly looking at you for sign off.

@DavidSpickett
Collaborator

I'll be adding the other features in separate PRs to make reviewing easier.

Just for good faith purposes, can you give me a list of the parts you've got it split into at the moment and what state they are in? Done, WIP, whatever. I assume from what you've said, that it's all basically done, just want to get a clear idea.

@DavidSpickett
Collaborator

And by "all" I mean your current stack, not "all" of wasm debugging support. That is of an unknown scope at the moment and I'm fine with that.


Get the Wasm call stack for the given thread id. This returns a hex-encoded list
of 64-bit addresses for the frame PCs. To match the Wasm specification, the
addresses are encoded in little endian byte order.
Collaborator

Add "even if the endian of the Wasm runtime's host is not little endian.".
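
For illustration, here is a minimal Python sketch of how a client might decode such a reply. It assumes the payload is simply the concatenated hex encoding of each 64-bit PC with no separators, which this thread does not spell out; `decode_wasm_call_stack` is a hypothetical helper name, not an LLDB or WAMR API.

```python
import struct


def decode_wasm_call_stack(payload):
    """Decode a hex string into a list of 64-bit frame PCs.

    Assumption: each PC is 8 bytes, little endian, back to back.
    """
    raw = bytes.fromhex(payload)
    if len(raw) % 8 != 0:
        raise ValueError("payload is not a whole number of 64-bit values")
    # "<Q" forces little endian regardless of the host's byte order.
    return [pc for (pc,) in struct.iter_unpack("<Q", raw)]


# Two PCs taken from the backtrace in the PR description, encoded by hand.
sample = "9c01000000000040" + "e501000000000040"
print([hex(pc) for pc in decode_wasm_call_stack(sample)])
# prints ['0x400000000000019c', '0x40000000000001e5']
```

Using an explicit `<` in the format string is what makes the decode independent of the host's endianness, which is the point of the suggested wording.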

@DavidSpickett
Collaborator

Closed all the comments I think have been addressed. Please look at the ones that remain open.

@JDevlieghere
Member Author

JDevlieghere commented Jul 28, 2025

I'll be adding the other features in separate PRs to make reviewing easier.

Just for good faith purposes, can you give me a list of the parts you've got it split into at the moment and what state they are in? Done, WIP, whatever. I assume from what you've said, that it's all basically done, just want to get a clear idea.

I have two patches, which I'll upload later today:

  1. Implement a RegisterContextWasm which exposes virtual registers for locals, globals, and stack values. (PR)
  2. Support DW_OP_WASM_location in the DWARF expression evaluator, which uses the virtual registers. (PR)

With those two patches, you can see locals:

* thread #1, name = 'nobody', stop reason = breakpoint 1.1
    frame #0: 0x400000000000019c wasm32_args.wasm`add(a=1, b=2) at test.c:4:12
   1    int
   2    add(int a, int b)
   3    {
-> 4        return a + b;
   5    }
   6
   7    int
(lldb) bt
* thread #1, name = 'nobody', stop reason = breakpoint 1.1
  * frame #0: 0x400000000000019c wasm32_args.wasm`add(a=1, b=2) at test.c:4:12
    frame #1: 0x40000000000001e5 wasm32_args.wasm`main at test.c:12:12
    frame #2: 0x40000000000001fe wasm32_args.wasm
(lldb) frame var
(int) a = 1
(int) b = 2
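
As a rough illustration of what the expression evaluator has to parse, the sketch below decodes a single `DW_OP_WASM_location` operation in Python. The encoding is an assumption taken from the DWARF-for-WebAssembly convention and is not stated in this thread: opcode `0xED`, one kind byte (0 = local, 1 = global, 2 = operand stack, each followed by a ULEB128 index; 3 = global followed by a 4-byte little endian index). The function names are hypothetical and this is not LLDB's implementation.

```python
DW_OP_WASM_LOCATION = 0xED  # assumed opcode value
KIND_NAMES = {0: "local", 1: "global", 2: "operand_stack", 3: "global"}


def read_uleb128(data, pos):
    """Read an unsigned LEB128 value; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def decode_wasm_location(expr):
    """Return (kind name, index, bytes consumed) for one operation."""
    if len(expr) < 3 or expr[0] != DW_OP_WASM_LOCATION:
        raise ValueError("not a DW_OP_WASM_location operation")
    kind = expr[1]
    if kind == 3:
        # Fixed-width 32-bit index.
        if len(expr) < 6:
            raise ValueError("truncated 32-bit index")
        return KIND_NAMES[kind], int.from_bytes(expr[2:6], "little"), 6
    if kind in (0, 1, 2):
        index, end = read_uleb128(expr, 2)
        return KIND_NAMES[kind], index, end
    raise ValueError("unknown location kind")


# A location that refers to Wasm local 2.
print(decode_wasm_location(bytes([0xED, 0x00, 0x02])))
```

The decoded (kind, index) pair is the sort of value a virtual register lookup would then resolve to an actual local, global, or stack slot.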

Extend support in LLDB for WebAssembly. This PR adds a new Process
plugin (ProcessWasm) that extends ProcessGDBRemote for WebAssembly
targets. It adds support for WebAssembly's memory model with separate
address spaces, and the ability to fetch the call stack from the
WebAssembly runtime.

I have tested this change with the WebAssembly Micro Runtime (WAMR,
https://github.com/bytecodealliance/wasm-micro-runtime) which implements
a GDB debug stub and supports the qWasmCallStack packet.

```
(lldb) process connect --plugin wasm connect://localhost:4567
Process 1 stopped
* thread #1, name = 'nobody', stop reason = trace
    frame #0: 0x40000000000001ad
wasm32_args.wasm`main:
->  0x40000000000001ad <+3>:  global.get 0
    0x40000000000001b3 <+9>:  i32.const 16
    0x40000000000001b5 <+11>: i32.sub
    0x40000000000001b6 <+12>: local.set 0
(lldb) b add
Breakpoint 1: where = wasm32_args.wasm`add + 28 at test.c:4:12, address = 0x400000000000019c
(lldb) c
Process 1 resuming
Process 1 stopped
* thread #1, name = 'nobody', stop reason = breakpoint 1.1
    frame #0: 0x400000000000019c wasm32_args.wasm`add(a=<unavailable>, b=<unavailable>) at test.c:4:12
   1    int
   2    add(int a, int b)
   3    {
-> 4        return a + b;
   5    }
   6
   7    int
(lldb) bt
* thread #1, name = 'nobody', stop reason = breakpoint 1.1
  * frame #0: 0x400000000000019c wasm32_args.wasm`add(a=<unavailable>, b=<unavailable>) at test.c:4:12
    frame #1: 0x40000000000001e5 wasm32_args.wasm`main at test.c:12:12
    frame #2: 0x40000000000001fe wasm32_args.wasm
```
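
The large `0x4000...` addresses in the session above reflect the separate address spaces. As a hedged sketch only, the Python below splits such an address under the assumption (not stated in this thread) that the top 2 bits carry an address-space tag, the next 30 bits a module id, and the low 32 bits an offset within the module. `split_wasm_address` is a hypothetical helper, not part of LLDB.

```python
def split_wasm_address(addr):
    """Split a 64-bit address into (tag, module_id, offset).

    Assumed layout: bits 63..62 tag, bits 61..32 module id,
    bits 31..0 offset.
    """
    tag = (addr >> 62) & 0x3
    module_id = (addr >> 32) & 0x3FFFFFFF
    offset = addr & 0xFFFFFFFF
    return tag, module_id, offset


# The breakpoint address reported for `add` in the session above.
tag, module_id, offset = split_wasm_address(0x400000000000019C)
print(tag, module_id, hex(offset))
# prints 1 0 0x19c
```

Under that assumption, the code addresses in the transcript all share one tag and module id and differ only in the 32-bit offset.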

This PR is based on an unmerged patch from Paolo Severini:
https://reviews.llvm.org/D78801. I intentionally stuck to the
foundations to keep this PR small. I have more PRs in the pipeline to
support the other features/packets.

My motivation for supporting Wasm is to support Swift compiled to
WebAssembly: https://www.swift.org/documentation/articles/wasm-getting-started.html
@DavidSpickett
Collaborator

Thanks! Ok good enough for me, and at some point I'll put some time into working with Wasm so I can better review in future.

Collaborator

@DavidSpickett DavidSpickett left a comment

LGTM.

@JDevlieghere JDevlieghere merged commit a28e7f1 into llvm:main Jul 29, 2025
10 checks passed
@JDevlieghere JDevlieghere deleted the qWasmCallStack branch July 29, 2025 17:07
JDevlieghere added a commit to JDevlieghere/llvm-project that referenced this pull request Jul 31, 2025
(cherry picked from commit a28e7f1)
@DavidSpickett
Collaborator

I just realised that none of this is conditional on build target or llvm backends enabled. Is that ok?

In lldb/source/Plugins/Process/CMakeLists.txt, we don't have a CMAKE_SYSTEM_NAME to check anyway. So we can't check even if we want to, and there may not be just one system name. Maybe each runtime would have their own idk.

The llvm backends, lldb/source/Plugins/Disassembler/LLVMC/CMakeLists.txt adds them all so you get WASM if it was enabled. If it wasn't, you don't get disassembly? This is true of all other targets tbf, so nothing to worry about I think.

Tell me if that's true or not.

I don't think anything added so far would require help from the llvm backend, outside of the disassembler.
