[lldb] Add WebAssembly Process Plugin #150143
@llvm/pr-subscribers-lldb

Author: Jonas Devlieghere (JDevlieghere)

Changes

Extend support in LLDB for WebAssembly. This PR adds a new Process plugin (ProcessWasm) that extends ProcessGDBRemote for WebAssembly targets. It adds support for WebAssembly's memory model with separate address spaces, and the ability to fetch the call stack from the WebAssembly runtime.

I have tested this change with the WebAssembly Micro Runtime (WAMR, https://github.com/bytecodealliance/wasm-micro-runtime), which implements a GDB debug stub and supports the qWasmCallStack packet.

This PR is based on an unmerged patch from Paolo Severini: https://reviews.llvm.org/D78801. I intentionally stuck to the foundations to keep this PR small. I have more PRs in the pipeline to support the other features/packets.

My motivation for supporting Wasm is to support Swift compiled to WebAssembly: https://www.swift.org/documentation/articles/wasm-getting-started.html

Patch is 23.14 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/150143.diff

13 Files Affected:
diff --git a/lldb/packages/Python/lldbsuite/test/lldbgdbclient.py b/lldb/packages/Python/lldbsuite/test/lldbgdbclient.py
index 459460b84fbae..599f7878e6edb 100644
--- a/lldb/packages/Python/lldbsuite/test/lldbgdbclient.py
+++ b/lldb/packages/Python/lldbsuite/test/lldbgdbclient.py
@@ -45,7 +45,7 @@ def createTarget(self, yaml_path):
self.yaml2obj(yaml_path, obj_path)
return self.dbg.CreateTarget(obj_path)
- def connect(self, target):
+ def connect(self, target, plugin="gdb-remote"):
"""
Create a process by connecting to the mock GDB server.
@@ -54,7 +54,7 @@ def connect(self, target):
listener = self.dbg.GetListener()
error = lldb.SBError()
process = target.ConnectRemote(
- listener, self.server.get_connect_url(), "gdb-remote", error
+ listener, self.server.get_connect_url(), plugin, error
)
self.assertTrue(error.Success(), error.description)
self.assertTrue(process, PROCESS_IS_VALID)
diff --git a/lldb/source/Plugins/Process/CMakeLists.txt b/lldb/source/Plugins/Process/CMakeLists.txt
index bd9b1b86dbf13..730fc9cd4056c 100644
--- a/lldb/source/Plugins/Process/CMakeLists.txt
+++ b/lldb/source/Plugins/Process/CMakeLists.txt
@@ -29,3 +29,4 @@ add_subdirectory(elf-core)
add_subdirectory(mach-core)
add_subdirectory(minidump)
add_subdirectory(FreeBSDKernel)
+add_subdirectory(Wasm)
diff --git a/lldb/source/Plugins/Process/Wasm/CMakeLists.txt b/lldb/source/Plugins/Process/Wasm/CMakeLists.txt
new file mode 100644
index 0000000000000..ff8a3c792ad53
--- /dev/null
+++ b/lldb/source/Plugins/Process/Wasm/CMakeLists.txt
@@ -0,0 +1,10 @@
+add_lldb_library(lldbPluginProcessWasm PLUGIN
+ ProcessWasm.cpp
+ ThreadWasm.cpp
+ UnwindWasm.cpp
+
+ LINK_LIBS
+ lldbCore
+ LINK_COMPONENTS
+ Support
+ )
diff --git a/lldb/source/Plugins/Process/Wasm/ProcessWasm.cpp b/lldb/source/Plugins/Process/Wasm/ProcessWasm.cpp
new file mode 100644
index 0000000000000..ee5377b33dc97
--- /dev/null
+++ b/lldb/source/Plugins/Process/Wasm/ProcessWasm.cpp
@@ -0,0 +1,127 @@
+//===----------------------------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "ProcessWasm.h"
+#include "ThreadWasm.h"
+#include "lldb/Core/Module.h"
+#include "lldb/Core/PluginManager.h"
+#include "lldb/Core/Value.h"
+#include "lldb/Utility/DataBufferHeap.h"
+
+#include "lldb/Target/UnixSignals.h"
+
+using namespace lldb;
+using namespace lldb_private;
+using namespace lldb_private::process_gdb_remote;
+using namespace lldb_private::wasm;
+
+LLDB_PLUGIN_DEFINE(ProcessWasm)
+
+ProcessWasm::ProcessWasm(lldb::TargetSP target_sp, ListenerSP listener_sp)
+ : ProcessGDBRemote(target_sp, listener_sp) {
+ /* always use linux signals for wasm process */
+ m_unix_signals_sp = UnixSignals::Create(ArchSpec{"wasm32-Ant-wasi-wasm"});
+}
+
+void ProcessWasm::Initialize() {
+ static llvm::once_flag g_once_flag;
+
+ llvm::call_once(g_once_flag, []() {
+ PluginManager::RegisterPlugin(GetPluginNameStatic(),
+ GetPluginDescriptionStatic(), CreateInstance,
+ DebuggerInitialize);
+ });
+}
+
+void ProcessWasm::DebuggerInitialize(Debugger &debugger) {
+ ProcessGDBRemote::DebuggerInitialize(debugger);
+}
+
+llvm::StringRef ProcessWasm::GetPluginName() { return GetPluginNameStatic(); }
+
+ConstString ProcessWasm::GetPluginNameStatic() {
+ static ConstString g_name("wasm");
+ return g_name;
+}
+
+const char *ProcessWasm::GetPluginDescriptionStatic() {
+ return "GDB Remote protocol based WebAssembly debugging plug-in.";
+}
+
+void ProcessWasm::Terminate() {
+ PluginManager::UnregisterPlugin(ProcessWasm::CreateInstance);
+}
+
+lldb::ProcessSP ProcessWasm::CreateInstance(lldb::TargetSP target_sp,
+ ListenerSP listener_sp,
+ const FileSpec *crash_file_path,
+ bool can_connect) {
+ if (crash_file_path == nullptr)
+ return std::make_shared<ProcessWasm>(target_sp, listener_sp);
+ return {};
+}
+
+bool ProcessWasm::CanDebug(lldb::TargetSP target_sp,
+ bool plugin_specified_by_name) {
+ if (plugin_specified_by_name)
+ return true;
+
+ if (Module *exe_module = target_sp->GetExecutableModulePointer()) {
+ if (ObjectFile *exe_objfile = exe_module->GetObjectFile())
+ return exe_objfile->GetArchitecture().GetMachine() ==
+ llvm::Triple::wasm32;
+ }
+ // However, if there is no wasm module, we return false, otherwise,
+ // we might use ProcessWasm to attach gdb remote.
+ return false;
+}
+
+std::shared_ptr<ThreadGDBRemote> ProcessWasm::CreateThread(lldb::tid_t tid) {
+ return std::make_shared<ThreadWasm>(*this, tid);
+}
+
+size_t ProcessWasm::ReadMemory(lldb::addr_t vm_addr, void *buf, size_t size,
+ Status &error) {
+ wasm_addr_t wasm_addr(vm_addr);
+
+ switch (wasm_addr.GetType()) {
+ case WasmAddressType::Memory:
+ case WasmAddressType::Object:
+ return ProcessGDBRemote::ReadMemory(vm_addr, buf, size, error);
+ case WasmAddressType::Invalid:
+ error.FromErrorStringWithFormat(
+ "Wasm read failed for invalid address 0x%" PRIx64, vm_addr);
+ return 0;
+ }
+}
+
+llvm::Expected<std::vector<lldb::addr_t>>
+ProcessWasm::GetWasmCallStack(lldb::tid_t tid) {
+ StreamString packet;
+ packet.Printf("qWasmCallStack:");
+ packet.Printf("%llx", tid);
+ StringExtractorGDBRemote response;
+ if (m_gdb_comm.SendPacketAndWaitForResponse(packet.GetString(), response) !=
+ GDBRemoteCommunication::PacketResult::Success)
+ return llvm::createStringError("failed to send qWasmCallStack");
+
+ if (!response.IsNormalResponse())
+ return llvm::createStringError("failed to get response for qWasmCallStack");
+
+ addr_t buf[1024 / sizeof(addr_t)];
+ size_t bytes = response.GetHexBytes(
+ llvm::MutableArrayRef<uint8_t>((uint8_t *)buf, sizeof(buf)), '\xdd');
+ if (bytes == 0)
+ return llvm::createStringError("invalid response for qWasmCallStack");
+
+ std::vector<lldb::addr_t> call_stack_pcs;
+ for (size_t i = 0; i < bytes / sizeof(addr_t); i++)
+ call_stack_pcs.push_back(buf[i]);
+
+ return call_stack_pcs;
+}
diff --git a/lldb/source/Plugins/Process/Wasm/ProcessWasm.h b/lldb/source/Plugins/Process/Wasm/ProcessWasm.h
new file mode 100644
index 0000000000000..d75921d76de8d
--- /dev/null
+++ b/lldb/source/Plugins/Process/Wasm/ProcessWasm.h
@@ -0,0 +1,87 @@
+//===----------------------------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLDB_SOURCE_PLUGINS_PROCESS_WASM_PROCESSWASM_H
+#define LLDB_SOURCE_PLUGINS_PROCESS_WASM_PROCESSWASM_H
+
+#include "Plugins/Process/gdb-remote/ProcessGDBRemote.h"
+
+namespace lldb_private {
+namespace wasm {
+
+/// Each WebAssembly module has separated address spaces for Code and Memory.
+/// A WebAssembly module also has a Data section which, when the module is
+/// loaded, gets mapped into a region in the module Memory.
+/// For the purpose of debugging, we can represent all these separated 32-bit
+/// address spaces with a single virtual 64-bit address space.
+///
+/// Struct wasm_addr_t provides this encoding using bitfields
+enum WasmAddressType { Memory = 0x00, Object = 0x01, Invalid = 0x03 };
+struct wasm_addr_t {
+ uint64_t offset : 32;
+ uint64_t module_id : 30;
+ uint64_t type : 2;
+
+ wasm_addr_t(lldb::addr_t addr)
+ : offset(addr & 0x00000000ffffffff),
+ module_id((addr & 0x00ffffff00000000) >> 32), type(addr >> 62) {}
+
+ wasm_addr_t(WasmAddressType type, uint32_t module_id, uint32_t offset)
+ : offset(offset), module_id(module_id), type(type) {}
+
+ WasmAddressType GetType() { return static_cast<WasmAddressType>(type); }
+ operator lldb::addr_t() { return *(uint64_t *)this; }
+};
+
+/// ProcessWasm provides the access to the Wasm program state
+/// retrieved from the Wasm engine.
+class ProcessWasm : public process_gdb_remote::ProcessGDBRemote {
+public:
+ ProcessWasm(lldb::TargetSP target_sp, lldb::ListenerSP listener_sp);
+ ~ProcessWasm() override = default;
+
+ static lldb::ProcessSP CreateInstance(lldb::TargetSP target_sp,
+ lldb::ListenerSP listener_sp,
+ const FileSpec *crash_file_path,
+ bool can_connect);
+
+ static void Initialize();
+ static void DebuggerInitialize(Debugger &debugger);
+ static void Terminate();
+ static ConstString GetPluginNameStatic();
+ static const char *GetPluginDescriptionStatic();
+
+ llvm::StringRef GetPluginName() override;
+
+ size_t ReadMemory(lldb::addr_t vm_addr, void *buf, size_t size,
+ Status &error) override;
+
+ bool CanDebug(lldb::TargetSP target_sp,
+ bool plugin_specified_by_name) override;
+
+ /// Retrieve the current call stack from the WebAssembly remote process.
+ llvm::Expected<std::vector<lldb::addr_t>> GetWasmCallStack(lldb::tid_t tid);
+
+protected:
+ std::shared_ptr<process_gdb_remote::ThreadGDBRemote>
+ CreateThread(lldb::tid_t tid) override;
+
+private:
+ friend class UnwindWasm;
+ process_gdb_remote::GDBRemoteDynamicRegisterInfoSP &GetRegisterInfo() {
+ return m_register_info_sp;
+ }
+
+ ProcessWasm(const ProcessWasm &);
+ const ProcessWasm &operator=(const ProcessWasm &) = delete;
+};
+
+} // namespace wasm
+} // namespace lldb_private
+
+#endif
diff --git a/lldb/source/Plugins/Process/Wasm/ThreadWasm.cpp b/lldb/source/Plugins/Process/Wasm/ThreadWasm.cpp
new file mode 100644
index 0000000000000..a6553ffffedaa
--- /dev/null
+++ b/lldb/source/Plugins/Process/Wasm/ThreadWasm.cpp
@@ -0,0 +1,34 @@
+//===----------------------------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "ThreadWasm.h"
+
+#include "ProcessWasm.h"
+#include "UnwindWasm.h"
+#include "lldb/Target/Target.h"
+
+using namespace lldb;
+using namespace lldb_private;
+using namespace lldb_private::wasm;
+
+Unwind &ThreadWasm::GetUnwinder() {
+ if (!m_unwinder_up) {
+ assert(CalculateTarget()->GetArchitecture().GetMachine() ==
+ llvm::Triple::wasm32);
+ m_unwinder_up.reset(new wasm::UnwindWasm(*this));
+ }
+ return *m_unwinder_up;
+}
+
+llvm::Expected<std::vector<lldb::addr_t>> ThreadWasm::GetWasmCallStack() {
+ if (ProcessSP process_sp = GetProcess()) {
+ ProcessWasm *wasm_process = static_cast<ProcessWasm *>(process_sp.get());
+ return wasm_process->GetWasmCallStack(GetID());
+ }
+ return llvm::createStringError("no process");
+}
diff --git a/lldb/source/Plugins/Process/Wasm/ThreadWasm.h b/lldb/source/Plugins/Process/Wasm/ThreadWasm.h
new file mode 100644
index 0000000000000..1c90f58767bc8
--- /dev/null
+++ b/lldb/source/Plugins/Process/Wasm/ThreadWasm.h
@@ -0,0 +1,38 @@
+//===----------------------------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLDB_SOURCE_PLUGINS_PROCESS_WASM_THREADWASM_H
+#define LLDB_SOURCE_PLUGINS_PROCESS_WASM_THREADWASM_H
+
+#include "Plugins/Process/gdb-remote/ThreadGDBRemote.h"
+
+namespace lldb_private {
+namespace wasm {
+
+/// ThreadWasm provides the access to the Wasm thread state
+/// retrieved from the Wasm engine.
+class ThreadWasm : public process_gdb_remote::ThreadGDBRemote {
+public:
+ ThreadWasm(Process &process, lldb::tid_t tid)
+ : process_gdb_remote::ThreadGDBRemote(process, tid) {}
+ ~ThreadWasm() override = default;
+
+ /// Retrieve the current call stack from the WebAssembly remote process.
+ llvm::Expected<std::vector<lldb::addr_t>> GetWasmCallStack();
+
+protected:
+ Unwind &GetUnwinder() override;
+
+ ThreadWasm(const ThreadWasm &);
+ const ThreadWasm &operator=(const ThreadWasm &) = delete;
+};
+
+} // namespace wasm
+} // namespace lldb_private
+
+#endif // LLDB_SOURCE_PLUGINS_PROCESS_WASM_THREADWASM_H
diff --git a/lldb/source/Plugins/Process/Wasm/UnwindWasm.cpp b/lldb/source/Plugins/Process/Wasm/UnwindWasm.cpp
new file mode 100644
index 0000000000000..0852160a8edfa
--- /dev/null
+++ b/lldb/source/Plugins/Process/Wasm/UnwindWasm.cpp
@@ -0,0 +1,81 @@
+//===----------------------------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "UnwindWasm.h"
+#include "Plugins/Process/gdb-remote/ThreadGDBRemote.h"
+#include "Plugins/Process/Wasm/ProcessWasm.h"
+#include "Plugins/Process/Wasm/ThreadWasm.h"
+#include "lldb/Utility/LLDBLog.h"
+#include "lldb/Utility/Log.h"
+
+using namespace lldb;
+using namespace lldb_private;
+using namespace process_gdb_remote;
+using namespace wasm;
+
+class WasmGDBRemoteRegisterContext : public GDBRemoteRegisterContext {
+public:
+ WasmGDBRemoteRegisterContext(ThreadGDBRemote &thread,
+ uint32_t concrete_frame_idx,
+ GDBRemoteDynamicRegisterInfoSP &reg_info_sp,
+ uint64_t pc)
+ : GDBRemoteRegisterContext(thread, concrete_frame_idx, reg_info_sp, false,
+ false) {
+ PrivateSetRegisterValue(0, pc);
+ }
+};
+
+lldb::RegisterContextSP
+UnwindWasm::DoCreateRegisterContextForFrame(lldb_private::StackFrame *frame) {
+ if (m_frames.size() <= frame->GetFrameIndex())
+ return lldb::RegisterContextSP();
+
+ ThreadSP thread = frame->GetThread();
+ ThreadGDBRemote *gdb_thread = static_cast<ThreadGDBRemote *>(thread.get());
+ ProcessWasm *wasm_process =
+ static_cast<ProcessWasm *>(thread->GetProcess().get());
+
+ return std::make_shared<WasmGDBRemoteRegisterContext>(
+ *gdb_thread, frame->GetConcreteFrameIndex(),
+ wasm_process->GetRegisterInfo(), m_frames[frame->GetFrameIndex()]);
+}
+
+uint32_t UnwindWasm::DoGetFrameCount() {
+ if (!m_unwind_complete) {
+ m_unwind_complete = true;
+ m_frames.clear();
+
+ ThreadWasm &wasm_thread = static_cast<ThreadWasm &>(GetThread());
+ llvm::Expected<std::vector<lldb::addr_t>> call_stack_pcs =
+ wasm_thread.GetWasmCallStack();
+ if (!call_stack_pcs) {
+ LLDB_LOG_ERROR(GetLog(LLDBLog::Unwind), call_stack_pcs.takeError(),
+ "Failed to get Wasm callstack: {0}");
+ m_frames.clear();
+ return 0;
+ }
+ m_frames = *call_stack_pcs;
+ }
+ return m_frames.size();
+}
+
+bool UnwindWasm::DoGetFrameInfoAtIndex(uint32_t frame_idx, lldb::addr_t &cfa,
+ lldb::addr_t &pc,
+ bool &behaves_like_zeroth_frame) {
+ if (m_frames.size() == 0)
+ DoGetFrameCount();
+
+ if (frame_idx < m_frames.size()) {
+ behaves_like_zeroth_frame = (frame_idx == 0);
+ cfa = 0;
+ pc = m_frames[frame_idx];
+ return true;
+ }
+
+ return false;
+}
diff --git a/lldb/source/Plugins/Process/Wasm/UnwindWasm.h b/lldb/source/Plugins/Process/Wasm/UnwindWasm.h
new file mode 100644
index 0000000000000..ff5e06d23d960
--- /dev/null
+++ b/lldb/source/Plugins/Process/Wasm/UnwindWasm.h
@@ -0,0 +1,51 @@
+//===----------------------------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLDB_SOURCE_PLUGINS_PROCESS_WASM_UNWINDWASM_H
+#define LLDB_SOURCE_PLUGINS_PROCESS_WASM_UNWINDWASM_H
+
+#include "lldb/Target/RegisterContext.h"
+#include "lldb/Target/Unwind.h"
+#include <vector>
+
+namespace lldb_private {
+namespace wasm {
+
+/// UnwindWasm manages stack unwinding for a WebAssembly process.
+class UnwindWasm : public lldb_private::Unwind {
+public:
+ UnwindWasm(lldb_private::Thread &thread) : Unwind(thread) {}
+ ~UnwindWasm() override = default;
+
+protected:
+ void DoClear() override {
+ m_frames.clear();
+ m_unwind_complete = false;
+ }
+
+ uint32_t DoGetFrameCount() override;
+
+ bool DoGetFrameInfoAtIndex(uint32_t frame_idx, lldb::addr_t &cfa,
+ lldb::addr_t &pc,
+ bool &behaves_like_zeroth_frame) override;
+
+ lldb::RegisterContextSP
+ DoCreateRegisterContextForFrame(lldb_private::StackFrame *frame) override;
+
+private:
+ std::vector<lldb::addr_t> m_frames;
+ bool m_unwind_complete = false;
+
+ UnwindWasm(const UnwindWasm &);
+ const UnwindWasm &operator=(const UnwindWasm &) = delete;
+};
+
+} // namespace wasm
+} // namespace lldb_private
+
+#endif
diff --git a/lldb/source/Plugins/Process/gdb-remote/ProcessGDBRemote.cpp b/lldb/source/Plugins/Process/gdb-remote/ProcessGDBRemote.cpp
index a2c34ddfc252e..14dfdec6a6f62 100644
--- a/lldb/source/Plugins/Process/gdb-remote/ProcessGDBRemote.cpp
+++ b/lldb/source/Plugins/Process/gdb-remote/ProcessGDBRemote.cpp
@@ -323,6 +323,11 @@ ProcessGDBRemote::~ProcessGDBRemote() {
KillDebugserverProcess();
}
+std::shared_ptr<ThreadGDBRemote>
+ProcessGDBRemote::CreateThread(lldb::tid_t tid) {
+ return std::make_shared<ThreadGDBRemote>(*this, tid);
+}
+
bool ProcessGDBRemote::ParsePythonTargetDefinition(
const FileSpec &target_definition_fspec) {
ScriptInterpreter *interpreter =
@@ -1594,7 +1599,7 @@ bool ProcessGDBRemote::DoUpdateThreadList(ThreadList &old_thread_list,
ThreadSP thread_sp(
old_thread_list_copy.RemoveThreadByProtocolID(tid, false));
if (!thread_sp) {
- thread_sp = std::make_shared<ThreadGDBRemote>(*this, tid);
+ thread_sp = CreateThread(tid);
LLDB_LOGV(log, "Making new thread: {0} for thread ID: {1:x}.",
thread_sp.get(), thread_sp->GetID());
} else {
@@ -1726,7 +1731,7 @@ ThreadSP ProcessGDBRemote::SetThreadStopInfo(
if (!thread_sp) {
// Create the thread if we need to
- thread_sp = std::make_shared<ThreadGDBRemote>(*this, tid);
+ thread_sp = CreateThread(tid);
m_thread_list_real.AddThread(thread_sp);
}
}
diff --git a/lldb/source/Plugins/Process/gdb-remote/ProcessGDBRemote.h b/lldb/source/Plugins/Process/gdb-remote/ProcessGDBRemote.h
index 7ae33837fd067..7c3dfb179a4b3 100644
--- a/lldb/source/Plugins/Process/gdb-remote/ProcessGDBRemote.h
+++ b/lldb/source/Plugins/Process/gdb-remote/ProcessGDBRemote.h
@@ -246,6 +246,8 @@ class ProcessGDBRemote : public Process,
ProcessGDBRemote(lldb::TargetSP target_sp, lldb::ListenerSP listener_sp);
+ virtual std::shared_ptr<ThreadGDBRemote> CreateThread(lldb::tid_t tid);
+
bool SupportsMemoryTagging() override;
/// Broadcaster event bits definitions.
diff --git a/lldb/source/Target/Platform.cpp b/lldb/source/Target/Platform.cpp
index 8000cd07565ae..f9bf5d1a9f160 100644
--- a/lldb/source/Target/Platform.cpp
+++ b/lldb/source/Target/Platform.cpp
@@ -2076,6 +2076,12 @@ size_t Platform::GetSoftwareBreakpointTrapOpcode(Target &target,
trap_opcode_size = sizeof(g_loongarch_opcode);
} break;
+ case llvm::Triple::wasm32: {
+ static const uint8_t g_wasm_opcode[] = {0x00}; // unreachable
+ tr...
[truncated]
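For readers following the address encoding in ProcessWasm.h above, here is a minimal sketch of the same 64-bit layout (2-bit type, 30-bit module ID, 32-bit offset) written with explicit shifts and masks instead of bitfields. The names WasmAddrKind, EncodeWasmAddr, DecodeKind, DecodeModuleId and DecodeOffset are hypothetical and not part of the patch. Note that the constructor in the diff masks the module ID with 0x00ffffff00000000, which keeps 24 bits, while the bitfield declares 30; the sketch uses the full 30-bit field, which is an assumption about the intent.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names for illustration; only the 2/30/32 bit split and the
// enumerator values are taken from the patch.
enum class WasmAddrKind : uint8_t { Memory = 0x00, Object = 0x01, Invalid = 0x03 };

constexpr uint64_t kModuleIdMask = 0x3fffffffULL; // 30 bits (assumed intent)

// Pack (kind, module_id, offset) into one 64-bit virtual address.
constexpr uint64_t EncodeWasmAddr(WasmAddrKind kind, uint32_t module_id,
                                  uint32_t offset) {
  return ((static_cast<uint64_t>(kind) & 0x3) << 62) |
         ((static_cast<uint64_t>(module_id) & kModuleIdMask) << 32) |
         static_cast<uint64_t>(offset);
}

// Top two bits select the address space.
constexpr WasmAddrKind DecodeKind(uint64_t addr) {
  return static_cast<WasmAddrKind>(addr >> 62);
}

// Bits 32..61 hold the module ID.
constexpr uint32_t DecodeModuleId(uint64_t addr) {
  return static_cast<uint32_t>((addr >> 32) & kModuleIdMask);
}

// Low 32 bits hold the offset inside the selected address space.
constexpr uint32_t DecodeOffset(uint64_t addr) {
  return static_cast<uint32_t>(addr & 0xffffffffULL);
}
```

Explicit shifts avoid relying on how a given compiler lays out bitfields, which the `*(uint64_t *)this` conversion in the patch depends on.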
46a7388 to 1d0ad45
How does this relate to / overlap with existing PRs #77949 and #78977? They are unlikely to get merged by the authors, so I assume you'll make your own equivalent.
This is a bit of a break from the usual reading unwind info, and I wonder what other stuff is different in this ecosystem. Can we get an RFC with an overview of that? Like are these runtimes generally implementing GDB stubs, is the debug model super different, that sort of thing. At least testing can be done anywhere the runtime can run, that's a cool feature. |
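To make the "break from the usual reading unwind info" concrete, here is a minimal sketch of the pattern UnwindWasm follows in the diff: the frame list is fetched once from the runtime, cached, and invalidated on Clear. RuntimeSuppliedUnwinder and FetchFn are hypothetical names, and the callback stands in for the qWasmCallStack round trip; this is not LLDB's Unwind interface.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical stand-in for an unwinder whose frames come from the runtime
// rather than from unwind info in the binary.
class RuntimeSuppliedUnwinder {
public:
  // Returns the PCs of all frames, or nullopt if the runtime query failed.
  using FetchFn = std::function<std::optional<std::vector<uint64_t>>()>;

  explicit RuntimeSuppliedUnwinder(FetchFn fetch) : m_fetch(std::move(fetch)) {}

  // Ask the runtime at most once per stop; later calls use the cache.
  std::size_t GetFrameCount() {
    if (!m_complete) {
      m_complete = true;
      m_frames.clear();
      if (auto pcs = m_fetch())
        m_frames = std::move(*pcs);
    }
    return m_frames.size();
  }

  // PC for a frame index, or nullopt when the index is out of range.
  std::optional<uint64_t> GetPC(std::size_t idx) {
    if (idx >= GetFrameCount())
      return std::nullopt;
    return m_frames[idx];
  }

  // Drop the cache, e.g. after the process resumes and stops again.
  void Clear() {
    m_frames.clear();
    m_complete = false;
  }

private:
  FetchFn m_fetch;
  std::vector<uint64_t> m_frames;
  bool m_complete = false;
};
```

As in the patch, a failed query leaves the cache empty but still marked complete, so the runtime is not asked again until Clear is called.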
I had totally forgotten about those PRs. Yes, they're iterations of the same patch by @paolosevMSFT (which I also based this patch on). The biggest difference is that I'm breaking it down into smaller pieces to make reviewing them easier.
I've been told that the runtime can have a different stack (separate from the native stack) for the Wasm code and also that the format of call frames may not match native stack frame ABI.
I'm not sure I'm well versed enough in the world of WASM to create an RFC, but I did do a survey of debugging the different runtimes.
I'm not aware of any other runtimes besides WAMR that support debugging through GDB remote. Of all the approaches, I think it's the most "desirable" going forward. The fact that there is a Bytecode Alliance runtime that supports it makes it even more compelling. Correction: Looks like Chrome V8 implements this as well, which is presumably the reason Paolo was supporting this in the first place: https://chromium-review.googlesource.com/c/v8/v8/+/2571341
Yes, long term it would be cool if we can build inferiors and decide where to run them (e.g. QEMU, WAMR, etc). In the short term, we can get pretty good test coverage with the GDB remote test cases. To put things more pragmatically, LLDB has partial support for Wasm debugging using gdb-remote. There's an "official" runtime that implements a gdb stub with the necessary extensions. My hope is that going forward, support in LLDB would encourage other runtimes to adopt this as well. |
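As an illustration of the qWasmCallStack extension mentioned above, here is a sketch of decoding a reply payload into a list of PCs. It assumes the payload is a hex string in which every frame is one 64-bit little-endian value; the patch copies the decoded bytes straight into an addr_t array, which gives the same result on a little-endian host. HexDigit and DecodeCallStack are hypothetical helpers, not LLDB APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string_view>
#include <vector>

// Value of one hex digit, or -1 if the character is not a hex digit.
inline int HexDigit(char c) {
  if (c >= '0' && c <= '9') return c - '0';
  if (c >= 'a' && c <= 'f') return c - 'a' + 10;
  if (c >= 'A' && c <= 'F') return c - 'A' + 10;
  return -1;
}

// Decode a hex payload into 64-bit PCs.
// ASSUMPTION: 8 bytes per frame, least significant byte first.
inline std::optional<std::vector<uint64_t>>
DecodeCallStack(std::string_view hex) {
  constexpr std::size_t kHexCharsPerFrame = 16;
  if (hex.empty() || hex.size() % kHexCharsPerFrame != 0)
    return std::nullopt; // empty or partial frame

  std::vector<uint64_t> pcs;
  for (std::size_t i = 0; i < hex.size(); i += kHexCharsPerFrame) {
    uint64_t pc = 0;
    for (std::size_t byte = 0; byte < 8; ++byte) {
      const int hi = HexDigit(hex[i + 2 * byte]);
      const int lo = HexDigit(hex[i + 2 * byte + 1]);
      if (hi < 0 || lo < 0)
        return std::nullopt; // not a hex string
      pc |= static_cast<uint64_t>((hi << 4) | lo) << (8 * byte);
    }
    pcs.push_back(pc);
  }
  return pcs;
}
```

Unlike the patch, which silently ignores trailing bytes that do not fill a whole addr_t, this sketch rejects a payload with a partial frame. That is a design choice made for the example, not something the PR specifies.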
Tagging @xujuntwt95329, @mh4ck-Thales and @xwang98 who took part in the discussion in the previous PRs. |
Sure, so this is like qemu-user? You want to debug the wasm process being hosted within the runtime not the runtime itself. Because the internet is 99% "use a web browser" for debug, then https://docs.wasmtime.dev/examples-debugging-native-debugger.html seems to be about debugging the runtime with a small amount of the wasm side, can't tell much from that page. |
This is an overly harsh way of phrasing it, but hopefully it gives you an idea of why I'm asking for an overview of this effort:
I think WASM is cool (all WASM userspace when?), I think not having bugs is cool too. I personally trust that you have done the leg work, but I have to verify that at least a bit, as I would with any other random contributor. I also know that WASM is a hot topic so for this work to get the best reception, a statement of intent might draw interest, and set expectations appropriately. But ok, as an RFC would be unlikely to generate actual objections, I'll say what I think here.
Great, they were a bit much.
So they have designed their server specifically to allow debugging both? I wonder how they manage that. A multi-architecture target would be interesting (ala ARM64EC on Windows) but I presume it's different targets. Can lldb handle that? It doesn't have to but it sounds pretty cool if it can.
So for this, you aren't stepping WASM instructions, you're debugging it like any other native JIT. The input to that JIT just happens to be WASM.
Yeah, 99% of the internet says "just use chrome" to debug. So we're not going to plug in to this part of it then.
We will need to state which runtime we intend to support and one from the people writing the standard is the best choice I agree. I suppose it running on small devices that cannot host a browser is why they went that way, do the debugger UI stuff elsewhere.
We can get a surprising amount of coverage that way, that is true. What we can't do is answer Ultimately I would prefer regular test suite runs using this WASM runtime, but I would settle for a manual run that gives us a snapshot of the state of WASM support. Finally, could you open a meta-issue for WASM support? Like we have for RISC-V and LoongArch. Ofc these issues get outdated pretty fast but at least it's something we can put on the website and folks can ask their "should X work" questions. ...now I've ranted enough about process let me review some actual code :) |
Also can we get a vague idea of how many more PRs are in the stack for this? Just to save me asking "why is X missing" over and over. |
On the original review, auto-detection of the plugin wasn't working which meant this |
Thanks for picking up the work on this!
I agree that this is the best method for implementing Wasm debugging. We want to be able to debug the Wasm user program transparently, on any supported runtime, without having to deal with runtime-related specificities.
For now only WAMR supports remote debugging with lldb, but I think (hope?) that other runtimes will be supported in the future. In this situation maybe it would be nice to discuss with the Bytecode Alliance what is expected from such a feature and if we can somewhat standardize it, so other runtimes know what they have to implement in order to support it. I'm bringing @alexcrichton into the loop, he's involved in the Bytecode Alliance and LLVM, maybe he'll be able to help us on these topics. |
Nice, yes once there is something working it will be interesting to see what we build with collaboration. To be clear: I say that lldb should state which runtime(s) it supports to set expectations. If changes to work with others are not disruptive then the more the merrier (assuming my beloved test coverage is there :) ). |
I'm not well-versed in lldb, but wouldn't it make more sense for lldb to propose a standard for debugging Wasm remotely and to support any runtime complying with that standard? It seems simpler to have the runtimes follow what lldb wants instead of adapting lldb to each runtime, especially as the list grows. For the test coverage we can have one or more reference runtimes (e.g. the ones of the Bytecode Alliance), but I would say it is up to the runtime developers to ensure they work with lldb. On our side we just want to ensure that we don't have any bugs / regressions in lldb, and for that testing on two or three runtimes should be enough. |
Long term, if the WASM standard makes recommendations for debugging then that's great. You are correct there. Short term, users will need to know what to expect so we need a short, simple, answer to "I tried lldb with whatever runtime and it didn't work". Which ideally is "X is known to work to this specific extent, patches are welcome to support Y and Z". |
Certainly lldb being in this conversation early would be a good way to prevent the reliance on GDB specific behaviour that often happens in the native debugging world. |
Yep, it's the typical choice between "user" vs "system" debugging. Both are valuable and have their own strengths. I think the approach wasmtime took is a good choice for it, but it wouldn't make sense for something like V8 where you'd have to debug the whole JavaScript runtime to debug your Wasm code. Hence why I think the "user" approach is the most promising, generally speaking.
I don't think that's harsh at all. I think I made almost entirely the same point in one of Paolo's original PRs.
I was under the impression that we had reached consensus on this in the various previous PRs. My read of the situation was that there were some practical issues (like the PRs being too big, etc.) but no fundamental objections to the approach. Admittedly, I should've included that assumption explicitly in this PR.
Not really, this is back to the difference between debugging your runtime (i.e. WAMR or V8) vs debugging the client. To use an analogy, it's the difference between debugging the kernel and looking at its data structures to figure out the processes vs using debugserver to debug a userspace program.
I wouldn't even call it debugging WASM. You're debugging native code that happened to be JITed from Wasm.
Correct, although since they use V8, if they exposed that debug stub, they'd get native debugging with LLDB out of the box.
Sounds good, we can list WAMR and V8 and say that we support anyone that implements the GDB remote protocol plus the handful of WASM extensions (which we'll document).
That seems fair and a good way to flush out issues once we've added support for everything we know is missing.
Haha, I appreciate the engagement. Thanks for all the input! |
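For runtime authors wondering what "the GDB remote protocol plus the handful of WASM extensions" looks like on the wire, here is a sketch of framing the qWasmCallStack request from the diff. GDB remote packets have the form `$<payload>#<checksum>`, where the checksum is the sum of the payload bytes modulo 256 written as two hex digits. FramePacket and MakeWasmCallStackPacket are hypothetical helpers; inside LLDB the framing is handled by the gdb-remote communication layer, and the payload shape (packet name followed by the thread ID in hex) is taken from GetWasmCallStack in the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Wrap a payload in GDB remote framing: '$' + payload + '#' + checksum.
inline std::string FramePacket(const std::string &payload) {
  uint8_t sum = 0;
  for (unsigned char c : payload)
    sum = static_cast<uint8_t>(sum + c); // checksum is modulo 256
  char trailer[4];                       // '#', two hex digits, NUL
  std::snprintf(trailer, sizeof(trailer), "#%02x", static_cast<unsigned>(sum));
  return "$" + payload + trailer;
}

// Build the request for one thread, mirroring the payload built in the patch.
inline std::string MakeWasmCallStackPacket(uint64_t tid) {
  char tid_hex[17]; // up to 16 hex digits plus NUL
  std::snprintf(tid_hex, sizeof(tid_hex), "%llx",
                static_cast<unsigned long long>(tid));
  return FramePacket(std::string("qWasmCallStack:") + tid_hex);
}
```

A stub that accepts this request would answer with a framed hex payload of program counters, which is the part ProcessWasm::GetWasmCallStack decodes.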
I didn't look into it yet, but I added it as a task to #150449 |
👋 FWIW I'm very much an outsider here so I don't fully understand all the dynamics in play per se, but what I can speak more to is plans on the Wasmtime side of things. For Wasmtime we have an approved RFC about the methodology and priorities for implementing debugging. This plan primarily goes through the Debug Adapter Protocol and intentionally does not natively implement GDB/LLDB integration mostly to handle wasm programs that are an interpreter for their own language (e.g. Python-compiled-to-wasm). We do not currently have anyone slated to implement this work as it hasn't been a priority for existing maintainers yet and we haven't had other volunteers. My assumption though would be that if LLDB had a native wasm plugin support then Wasmtime would implement a debug adapter component to bridge what Wasmtime would implement natively and what LLDB supported. I'm not personally aware of other runtimes that want to mirror what WAMR is doing here. The current plan for Wasmtime won't include mirroring the support natively, but that's not to say Wasmtime's plan is set in stone and not possible to change either. |
Great to see more interest in WebAssembly from
Small clarification / additional context as the Wasmtime-focused team I am on just finished some planning: some folks (@cfallin and others) have actually started working on the implementation of parts of that debugging RFC already, and while they haven't yet started on the debug adapter protocol implementation that the lldb frontend (or others) could attach to, that is the next big milestone after the current one. The plan is to begin the debug adapter protocol implementation work roughly around the beginning of Q4. |
Hey @alexcrichton and @fitzgen, thanks for chiming in here! The idea to implement a DAP server to support interpreted languages is an interesting idea. As someone who's been involved with lldb-dap I have some experience in that space. My immediate question about how you're planning on supporting that was answered by the "Debug Adapter Components" section. The following sentence stood out to me:
This is basically describing the GDB remote protocol. In the world of native debugging, it's the standard for talking to a debugger like lldb and gdb. I definitely think it's something worth considering, and it sounds like Alex seems to be saying something similar.
The V8 JavaScript engine is another example, and in the context of Swift, WasmKit is interested in it as well.
When you talk about LLDB as a "client", I assume you mean the "debug adapter component" rather than the "debug adapter protocol"? |
Clarification: The interfaces described in that RFC and that I am describing below are programmatic/function interfaces (specifically Wasm component interfaces) -- not remote protocols. More on the GDB protocol specifically down below.
I see two new ways that LLDB and Wasmtime could interact in the future (on top of the existing way by attaching LLDB to a native Wasmtime process and debugging Wasm (and native code) via the JIT code registration system APIs):
Both of the above could even happen at the same time. We could also potentially implement a GDB protocol shim as a debug adapter component. Seems possible but I haven't fully thought it through. Backing up, here are the main reasons we didn't choose the GDB protocol:
|
I'm personally skeptical of this part. LLDB's command line driver is very small and doesn't do much. You absolutely could turn that into a DAP client, but you wouldn't get much in return. Most of the "TUI" parts of LLDB are tightly coupled with the core of the library. For folks to retain the same "look and feel", you would have to reimplement all the commands in terms of DAP operations. Doable of course, but likely a lot of work for very little return.
How about, instead of compiling LLDB to Wasm, we compile a small binary to Wasm and run that in process, handling the callbacks? If that hypothetical server implemented the GDB remote protocol, you basically have how we do debugging on Linux/Windows (
I'd be happy to keep chatting about this if you're interested in exploring that path further, now or in the future. I also don't want to necessarily push you towards the GDB remote protocol. For languages like C++, Rust, Swift, or really anything targeting LLVM/supported by LLDB, I strongly believe that going with the GDB remote protocol is the best and easiest way to support debugging, but I also totally acknowledge that the trade-offs may be different for Wasmtime. That said, they also don't necessarily need to be mutually exclusive. |
The part of the debugger that handles symbol and debug information is big, requires access to lots of large files, allocates lots of memory, etc. So you really want to be able to run lldb itself on the biggest system you have to hand, not the system on which you run the program you are inspecting, which might be a resource constrained device. That's why the gdb-remote protocol or something like it is so attractive. The part that has to run on the system with the debuggee doesn't have to reason about symbols and types, and so is small and doesn't consume a lot of resources, and the part that has to deal with lots of data can run somewhere else if that's more appropriate. |
I agree that the GDB remote protocol is the best / easiest way to debug everything supported by LLVM / that isn't interpreted in some way. Regarding the interpreted languages, I'd say that there is also a use for this kind of debugging: it allows runtime developers to debug their runtime when compiled to Wasm. Most issues in interpreted code that are not Wasm-related can be debugged on the language's runtime directly without going through Wasm, and Wasm-related bugs will often need access to the runtime, as a Wasm-specific bug may have something to do with the runtime. |
I tested this patch and while basic debugging works, some features that were available in #77949 are not working anymore (retrieval of variable values, disassembly of the Wasm bytecode...). I'm not sure if it's a bug or if it's because the #77949 patch has been split into incremental patches. Would you like me to report these problems here or in #150449? |
Agreed. If I want to debug qemu-user I treat it like any other program, if I want the simulated process I connect to the internal stub. Some will want integrated solutions but that can come later.
You weren't wrong, the point of an RFC would have been more to be a place to bring up all the weird details one might need to know to think about how this will work, e.g. the address spaces and the endianness and only having a PC and so on. But we're having that discussion now so it's fine. There are no massive issues with this PR itself so it's not causing confusion.
Also the GDB remote protocol is very loosely a standard and has many problems, so in a general sense I do want to hear what projects who aren't required to use it want to do. I am interested to read the WASM debug proposals and see what fresh perspectives there are.
I wonder if it's intentional on WAMR's part to choose the gdb-remote protocol for this reason. I can understand why the "hosted" runtimes would want to take on more code for bigger benefits though.
@JDevlieghere do you have a WIP tree they can use that has more changes? |
Sounds like a thing some enthusiast will eventually do, but I don't see any command line DAP client debuggers out there now. Unless you count using neovim or emacs as the IDE but it's not the same sort of use, no low level commands there. One day when the DAP protocol grows a bunch of extensions maybe it'll happen. I agree that most people will simply connect an existing IDE to this DAP server. Which sounds pretty neat, but doesn't require any input from lldb either way. |
Yep, that's expected: this PR contains only a subset: notably the foundation (the process plugin) and support for backtraces. I'll be adding the other features in separate PRs to make reviewing easier. |
Not currently, no, but it's only one or two more patches before we have parity with the original PR. I just need to decide how to split them up once this PR lands. As much as I enjoy the discussion about the future of debugging Wasm, I want to make sure we don't lose focus in this PR. I think all outstanding issues have been addressed. @DavidSpickett you've been actively engaged here and in the previous PR so I'm mostly looking at you for sign off. |
Just for good faith purposes, can you give me a list of the parts you've got it split into at the moment and what state they are in? Done, WIP, whatever. I assume from what you've said, that it's all basically done, just want to get a clear idea. |
And by "all" I mean your current stack, not "all" of wasm debugging support. That is of an unknown scope at the moment and I'm fine with that. |
lldb/docs/resources/lldbgdbremote.md
Outdated
Get the Wasm call stack for the given thread id. This returns a hex-encoded list
of 64-bit addresses for the frame PCs. To match the Wasm specification, the
addresses are encoded in little endian byte order.
Add "even if the endian of the Wasm runtime's host is not little endian.".
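To make the encoding under review concrete, here is a minimal sketch of how a client could decode such a reply. It assumes the payload has already been stripped of the usual gdb-remote framing and is just the hex string; the function name `decode_wasm_call_stack` is hypothetical and is not part of LLDB or WAMR.

```python
import struct


def decode_wasm_call_stack(payload: str) -> list[int]:
    """Decode a hex string into a list of 64-bit frame PCs.

    Assumption: `payload` is only the hex-encoded body of the reply,
    with each PC stored as 8 bytes in little endian order.
    """
    raw = bytes.fromhex(payload)
    if len(raw) % 8 != 0:
        raise ValueError("payload is not a whole number of 64-bit addresses")
    # "<Q" forces little endian, independent of the host's byte order.
    return [pc for (pc,) in struct.iter_unpack("<Q", raw)]


if __name__ == "__main__":
    # Two frame PCs taken from the backtrace shown in this PR.
    reply = "9c01000000000040" + "e501000000000040"
    print([hex(pc) for pc in decode_wasm_call_stack(reply)])
```

Using an explicit `<` format keeps the decoding correct on big endian hosts, which is the point of the suggested wording.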
Closed all the comments I think have been addressed. Please look at the ones that remain open. |
I have two patches, which I'll upload later today:
With those two patches, you can see locals:
|
Extend support in LLDB for WebAssembly. This PR adds a new Process plugin (ProcessWasm) that extends ProcessGDBRemote for WebAssembly targets. It adds support for WebAssembly's memory model with separate address spaces, and the ability to fetch the call stack from the WebAssembly runtime.

I have tested this change with the WebAssembly Micro Runtime (WAMR, https://github.com/bytecodealliance/wasm-micro-runtime) which implements a GDB debug stub and supports the qWasmCallStack packet.

```
(lldb) process connect --plugin wasm connect://localhost:4567
Process 1 stopped
* thread #1, name = 'nobody', stop reason = trace
    frame #0: 0x40000000000001ad
wasm32_args.wasm`main:
->  0x40000000000001ad <+3>:  global.get 0
    0x40000000000001b3 <+9>:  i32.const 16
    0x40000000000001b5 <+11>: i32.sub
    0x40000000000001b6 <+12>: local.set 0
(lldb) b add
Breakpoint 1: where = wasm32_args.wasm`add + 28 at test.c:4:12, address = 0x400000000000019c
(lldb) c
Process 1 resuming
Process 1 stopped
* thread #1, name = 'nobody', stop reason = breakpoint 1.1
    frame #0: 0x400000000000019c wasm32_args.wasm`add(a=<unavailable>, b=<unavailable>) at test.c:4:12
   1    int
   2    add(int a, int b)
   3    {
-> 4      return a + b;
   5    }
   6
   7    int
(lldb) bt
* thread #1, name = 'nobody', stop reason = breakpoint 1.1
  * frame #0: 0x400000000000019c wasm32_args.wasm`add(a=<unavailable>, b=<unavailable>) at test.c:4:12
    frame #1: 0x40000000000001e5 wasm32_args.wasm`main at test.c:12:12
    frame #2: 0x40000000000001fe wasm32_args.wasm
```

This PR is based on an unmerged patch from Paolo Severini: https://reviews.llvm.org/D78801. I intentionally stuck to the foundations to keep this PR small. I have more PRs in the pipeline to support the other features/packets.

My motivation for supporting Wasm is to support Swift compiled to WebAssembly: https://www.swift.org/documentation/articles/wasm-getting-started.html
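The frame addresses in the session above (for example 0x40000000000001ad) have high bits set because the 64-bit value carries more than a plain offset. The sketch below shows one way to split such an address. The layout it uses (a 2-bit address-space tag, a 30-bit module id, and a 32-bit offset) is an assumption and is not spelled out in this thread, and the names `WasmAddress` and `split_wasm_address` are hypothetical.

```python
from typing import NamedTuple


class WasmAddress(NamedTuple):
    space: int      # assumed 2-bit address-space tag (top bits)
    module_id: int  # assumed 30-bit module identifier
    offset: int     # assumed 32-bit offset within that space


def split_wasm_address(addr: int) -> WasmAddress:
    """Split a 64-bit address under the assumed 2/30/32-bit layout."""
    if not 0 <= addr < (1 << 64):
        raise ValueError("address does not fit in 64 bits")
    return WasmAddress(
        space=addr >> 62,
        module_id=(addr >> 32) & 0x3FFFFFFF,
        offset=addr & 0xFFFFFFFF,
    )


if __name__ == "__main__":
    # PC of frame #0 at the first stop in the session above.
    print(split_wasm_address(0x40000000000001AD))
```

Under that assumption, every PC in the session carries the same tag and module id, and only the low 32 bits change from frame to frame.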
e68b5cc to 4294028
Thanks! Ok good enough for me, and at some point I'll put some time into working with Wasm so I can better review in future. |
LGTM.
Extend support in LLDB for WebAssembly. This PR adds a new Process plugin (ProcessWasm) that extends ProcessGDBRemote for WebAssembly targets. It adds support for WebAssembly's memory model with separate address spaces, and the ability to fetch the call stack from the WebAssembly runtime.

I have tested this change with the WebAssembly Micro Runtime (WAMR, https://github.com/bytecodealliance/wasm-micro-runtime) which implements a GDB debug stub and supports the qWasmCallStack packet.

```
(lldb) process connect --plugin wasm connect://localhost:4567
Process 1 stopped
* thread #1, name = 'nobody', stop reason = trace
    frame #0: 0x40000000000001ad
wasm32_args.wasm`main:
->  0x40000000000001ad <+3>:  global.get 0
    0x40000000000001b3 <+9>:  i32.const 16
    0x40000000000001b5 <+11>: i32.sub
    0x40000000000001b6 <+12>: local.set 0
(lldb) b add
Breakpoint 1: where = wasm32_args.wasm`add + 28 at test.c:4:12, address = 0x400000000000019c
(lldb) c
Process 1 resuming
Process 1 stopped
* thread #1, name = 'nobody', stop reason = breakpoint 1.1
    frame #0: 0x400000000000019c wasm32_args.wasm`add(a=<unavailable>, b=<unavailable>) at test.c:4:12
   1    int
   2    add(int a, int b)
   3    {
-> 4      return a + b;
   5    }
   6
   7    int
(lldb) bt
* thread #1, name = 'nobody', stop reason = breakpoint 1.1
  * frame #0: 0x400000000000019c wasm32_args.wasm`add(a=<unavailable>, b=<unavailable>) at test.c:4:12
    frame #1: 0x40000000000001e5 wasm32_args.wasm`main at test.c:12:12
    frame #2: 0x40000000000001fe wasm32_args.wasm
```

This PR is based on an unmerged patch from Paolo Severini: https://reviews.llvm.org/D78801. I intentionally stuck to the foundations to keep this PR small. I have more PRs in the pipeline to support the other features/packets.

My motivation for supporting Wasm is to support debugging Swift compiled to WebAssembly: https://www.swift.org/documentation/articles/wasm-getting-started.html

(cherry picked from commit a28e7f1)
I just realised that none of this is conditional on build target or llvm backends enabled. Is that ok? In lldb/source/Plugins/Process/CMakeLists.txt, we don't have a CMAKE_SYSTEM_NAME to check anyway. So we can't check even if we wanted to, and there may not be just one system name. Maybe each runtime would have their own idk. As for the llvm backends, lldb/source/Plugins/Disassembler/LLVMC/CMakeLists.txt adds them all so you get WASM if it was enabled. If it wasn't, you don't get disassembly? This is true of all other targets tbf, so nothing to worry about I think. Tell me if that's true or not. I don't think anything added so far would require help from the llvm backend, outside of the disassembler. |