Use ATTACH maps for array-sections/subscripts on pointers. #1
base: tgt-capture-mapped-ptrs-by-ref
offload/libomptarget/interface.cpp
The libomptarget code will disappear from this PR once llvm#149036 is merged.
@@ -7096,8 +7129,8 @@ class MappableExprsHandler {
const ValueDecl *Mapper = nullptr, bool ForDeviceAddr = false,
const ValueDecl *BaseDecl = nullptr, const Expr *MapExpr = nullptr,
ArrayRef<OMPClauseMappableExprCommon::MappableExprComponentListRef>
OverlappedElements = {},
bool AreBothBasePtrAndPteeMapped = false) const {
`AreBothBasePtrAndPteeMapped` was used to decide whether to use PTR_AND_OBJ maps for something like `map(p, p[0])`. We don't do that now, since we map them independently and attach them separately.
Extend support in LLDB for WebAssembly. This PR adds a new Process plugin (ProcessWasm) that extends ProcessGDBRemote for WebAssembly targets. It adds support for WebAssembly's memory model with separate address spaces, and the ability to fetch the call stack from the WebAssembly runtime. I have tested this change with the WebAssembly Micro Runtime (WAMR, https://github.com/bytecodealliance/wasm-micro-runtime) which implements a GDB debug stub and supports the qWasmCallStack packet. ``` (lldb) process connect --plugin wasm connect://localhost:4567 Process 1 stopped * thread #1, name = 'nobody', stop reason = trace frame #0: 0x40000000000001ad wasm32_args.wasm`main: -> 0x40000000000001ad <+3>: global.get 0 0x40000000000001b3 <+9>: i32.const 16 0x40000000000001b5 <+11>: i32.sub 0x40000000000001b6 <+12>: local.set 0 (lldb) b add Breakpoint 1: where = wasm32_args.wasm`add + 28 at test.c:4:12, address = 0x400000000000019c (lldb) c Process 1 resuming Process 1 stopped * thread #1, name = 'nobody', stop reason = breakpoint 1.1 frame #0: 0x400000000000019c wasm32_args.wasm`add(a=<unavailable>, b=<unavailable>) at test.c:4:12 1 int 2 add(int a, int b) 3 { -> 4 return a + b; 5 } 6 7 int (lldb) bt * thread #1, name = 'nobody', stop reason = breakpoint 1.1 * frame #0: 0x400000000000019c wasm32_args.wasm`add(a=<unavailable>, b=<unavailable>) at test.c:4:12 frame #1: 0x40000000000001e5 wasm32_args.wasm`main at test.c:12:12 frame #2: 0x40000000000001fe wasm32_args.wasm ``` This PR is based on an unmerged patch from Paolo Severini: https://reviews.llvm.org/D78801. I intentionally stuck to the foundations to keep this PR small. I have more PRs in the pipeline to support the other features/packets. My motivation for supporting Wasm is to support debugging Swift compiled to WebAssembly: https://www.swift.org/documentation/articles/wasm-getting-started.html
…erver (llvm#148774) Summary: A deadlock was introduced by [PR llvm#146441](llvm#146441) which changed `CurrentThreadIsPrivateStateThread()` to `CurrentThreadPosesAsPrivateStateThread()`. This change caused the execution path in [`ExecutionContextRef::SetTargetPtr()`](https://github.com/llvm/llvm-project/blob/10b5558b61baab59c7d3dff37ffdf0861c0cc67a/lldb/source/Target/ExecutionContext.cpp#L513) to now enter a code block that was previously skipped, triggering [`GetSelectedFrame()`](https://github.com/llvm/llvm-project/blob/10b5558b61baab59c7d3dff37ffdf0861c0cc67a/lldb/source/Target/ExecutionContext.cpp#L522) which leads to a deadlock. Thread 1 gets m_modules_mutex in [`ModuleList::AppendImpl`](https://github.com/llvm/llvm-project/blob/96148f92146e5211685246722664e51ec730e7ba/lldb/source/Core/ModuleList.cpp#L218), Thread 3 gets m_language_runtimes_mutex in [`GetLanguageRuntime`](https://github.com/llvm/llvm-project/blob/96148f92146e5211685246722664e51ec730e7ba/lldb/source/Target/Process.cpp#L1501), but then Thread 1 waits for m_language_runtimes_mutex in [`GetLanguageRuntime`](https://github.com/llvm/llvm-project/blob/96148f92146e5211685246722664e51ec730e7ba/lldb/source/Target/Process.cpp#L1501) while Thread 3 waits for m_modules_mutex in [`ScanForGNUstepObjCLibraryCandidate`](https://github.com/llvm/llvm-project/blob/96148f92146e5211685246722664e51ec730e7ba/lldb/source/Plugins/LanguageRuntime/ObjC/GNUstepObjCRuntime/GNUstepObjCRuntime.cpp#L57). This fixes the deadlock by adding a scoped block around the mutex lock before the call to the notifier, and moving the notifier call outside of the mutex-guarded section. 
The notifier call [`NotifyModuleAdded`](https://github.com/llvm/llvm-project/blob/96148f92146e5211685246722664e51ec730e7ba/lldb/source/Target/Target.cpp#L1810) should be thread-safe, since the module should be added to the `ModuleList` before the mutex is released, the notifier doesn't modify the module list further, and the call operates on local state and the `Target` instance. ### Deadlocked Thread backtraces: ``` * thread #3, name = 'dbg.evt-handler', stop reason = signal SIGSTOP * frame #0: 0x00007f2f1e2973dc libc.so.6`futex_wait(private=0, expected=2, futex_word=0x0000563786bd5f40) at futex-internal.h:146:13 /*... a bunch of mutex related bt ... */ liblldb.so.21.0git`std::lock_guard<std::recursive_mutex>::lock_guard(this=0x00007f2f0f1927b0, __m=0x0000563786bd5f40) at std_mutex.h:229:19 frame #8: 0x00007f2f27946eb7 liblldb.so.21.0git`ScanForGNUstepObjCLibraryCandidate(modules=0x0000563786bd5f28, TT=0x0000563786bd5eb8) at GNUstepObjCRuntime.cpp:60:41 frame #9: 0x00007f2f27946c80 liblldb.so.21.0git`lldb_private::GNUstepObjCRuntime::CreateInstance(process=0x0000563785e1d360, language=eLanguageTypeObjC) at GNUstepObjCRuntime.cpp:87:8 frame #10: 0x00007f2f2746fca5 liblldb.so.21.0git`lldb_private::LanguageRuntime::FindPlugin(process=0x0000563785e1d360, language=eLanguageTypeObjC) at LanguageRuntime.cpp:210:36 frame #11: 0x00007f2f2742c9e3 liblldb.so.21.0git`lldb_private::Process::GetLanguageRuntime(this=0x0000563785e1d360, language=eLanguageTypeObjC) at Process.cpp:1516:9 ... 
frame #21: 0x00007f2f2750b5cc liblldb.so.21.0git`lldb_private::Thread::GetSelectedFrame(this=0x0000563785e064d0, select_most_relevant=DoNoSelectMostRelevantFrame) at Thread.cpp:274:48 frame #22: 0x00007f2f273f9957 liblldb.so.21.0git`lldb_private::ExecutionContextRef::SetTargetPtr(this=0x00007f2f0f193778, target=0x0000563786bd5be0, adopt_selected=true) at ExecutionContext.cpp:525:32 frame #23: 0x00007f2f273f9714 liblldb.so.21.0git`lldb_private::ExecutionContextRef::ExecutionContextRef(this=0x00007f2f0f193778, target=0x0000563786bd5be0, adopt_selected=true) at ExecutionContext.cpp:413:3 frame #24: 0x00007f2f270e80af liblldb.so.21.0git`lldb_private::Debugger::GetSelectedExecutionContext(this=0x0000563785d83bc0) at Debugger.cpp:1225:23 frame #25: 0x00007f2f271bb7fd liblldb.so.21.0git`lldb_private::Statusline::Redraw(this=0x0000563785d83f30, update=true) at Statusline.cpp:136:41 ... * thread #1, name = 'lldb', stop reason = signal SIGSTOP * frame #0: 0x00007f2f1e2973dc libc.so.6`futex_wait(private=0, expected=2, futex_word=0x0000563785e1dd98) at futex-internal.h:146:13 /*... a bunch of mutex related bt ... */ liblldb.so.21.0git`std::lock_guard<std::recursive_mutex>::lock_guard(this=0x00007ffe62be0488, __m=0x0000563785e1dd98) at std_mutex.h:229:19 frame #8: 0x00007f2f2742c8d1 liblldb.so.21.0git`lldb_private::Process::GetLanguageRuntime(this=0x0000563785e1d360, language=eLanguageTypeC_plus_plus) at Process.cpp:1510:41 frame #9: 0x00007f2f2743c46f liblldb.so.21.0git`lldb_private::Process::ModulesDidLoad(this=0x0000563785e1d360, module_list=0x00007ffe62be06a0) at Process.cpp:6082:36 ... 
frame #13: 0x00007f2f2715cf03 liblldb.so.21.0git`lldb_private::ModuleList::AppendImpl(this=0x0000563786bd5f28, module_sp=ptr = 0x563785cec560, use_notifier=true) at ModuleList.cpp:246:19 frame #14: 0x00007f2f2715cf4c liblldb.so.21.0git`lldb_private::ModuleList::Append(this=0x0000563786bd5f28, module_sp=ptr = 0x563785cec560, notify=true) at ModuleList.cpp:251:3 ... frame #19: 0x00007f2f274349b3 liblldb.so.21.0git`lldb_private::Process::ConnectRemote(this=0x0000563785e1d360, remote_url=(Data = "connect://localhost:1234", Length = 24)) at Process.cpp:3250:9 frame #20: 0x00007f2f27411e0e liblldb.so.21.0git`lldb_private::Platform::DoConnectProcess(this=0x0000563785c59990, connect_url=(Data = "connect://localhost:1234", Length = 24), plugin_name=(Data = "gdb-remote", Length = 10), debugger=0x0000563785d83bc0, stream=0x00007ffe62be3128, target=0x0000563786bd5be0, error=0x00007ffe62be1ca0) at Platform.cpp:1926:23 ``` ## Test Plan: Built a hello world a.out Run server in one terminal: ``` ~/llvm/build/Debug/bin/lldb-server g :1234 a.out ``` Run client in another terminal ``` ~/llvm/build/Debug/bin/lldb -o "gdb-remote 1234" -o "b hello.cc:3" ``` Before: Client hangs indefinitely ``` ~/llvm/build/Debug/bin/lldb -o "gdb-remote 1234" -o "b main" (lldb) gdb-remote 1234 ^C^C ``` After: ``` ~/llvm/build/Debug/bin/lldb -o "gdb-remote 1234" -o "b hello.cc:3" (lldb) gdb-remote 1234 Process 837068 stopped * thread #1, name = 'a.out', stop reason = signal SIGSTOP frame #0: 0x00007ffff7fe4a60 ld-linux-x86-64.so.2`_start: -> 0x7ffff7fe4a60 <+0>: movq %rsp, %rdi 0x7ffff7fe4a63 <+3>: callq 0x7ffff7fe5780 ; _dl_start at rtld.c:522:1 ld-linux-x86-64.so.2`_dl_start_user: 0x7ffff7fe4a68 <+0>: movq %rax, %r12 0x7ffff7fe4a6b <+3>: movl 0x18067(%rip), %eax ; _dl_skip_args (lldb) b hello.cc:3 Breakpoint 1: where = a.out`main + 15 at hello.cc:4:13, address = 0x00005555555551bf (lldb) c Process 837068 resuming Process 837068 stopped * thread #1, name = 'a.out', stop reason = 
breakpoint 1.1 frame #0: 0x00005555555551bf a.out`main at hello.cc:4:13 1 #include <iostream> 2 3 int main() { -> 4 std::cout << "Hello World" << std::endl; 5 return 0; 6 } ```
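The locking change described in this commit message (mutate under the lock in a scoped block, then call the notifier after the lock is released) can be sketched roughly as follows. This is a minimal illustration, not the LLDB code: `ModuleRegistry`, `Append`, `Size`, and the `Notifier` type are hypothetical names, and a plain `std::mutex` is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a module list; not the LLDB ModuleList class.
class ModuleRegistry {
public:
  using Notifier = std::function<void(const std::string &)>;

  explicit ModuleRegistry(Notifier notifier) : notifier_(std::move(notifier)) {}

  void Append(const std::string &name) {
    {
      // Hold the lock only while the container is being modified.
      std::lock_guard<std::mutex> guard(mutex_);
      modules_.push_back(name);
    } // The lock is released here, before any callback runs.

    // The callback may acquire other locks (or this one again), so it is
    // invoked without mutex_ held to avoid a lock-order inversion.
    if (notifier_)
      notifier_(name);
  }

  std::size_t Size() const {
    std::lock_guard<std::mutex> guard(mutex_);
    return modules_.size();
  }

private:
  mutable std::mutex mutex_;
  std::vector<std::string> modules_;
  Notifier notifier_;
};
```

In this sketch a notifier that calls back into `Size()` completes normally, whereas it would self-deadlock if `Append` still held `mutex_` during the callback.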
…lvm#152156) With this new A320 in-order core, we follow adding the FeatureUseFixedOverScalableIfEqualCost feature to A510 and A520 (llvm#132246), which reaps the same code generation benefits of preferring fixed over scalable when the cost is equal. So when we have: ``` void foo(float* a, float* b, float* dst, unsigned n) { for (unsigned i = 0; i < n; ++i) dst[i] = a[i] + b[i]; } ``` When compiling without the feature enabled, we get: ``` ... ld1b { z0.b }, p0/z, [x0, x10] ld1b { z2.b }, p0/z, [x1, x10] add x12, x0, x10 ldr z1, [x12, #1, mul vl] add x12, x1, x10 ldr z3, [x12, #1, mul vl] fadd z0.s, z2.s, z0.s add x12, x2, x10 fadd z1.s, z3.s, z1.s dech x11 st1b { z0.b }, p0, [x2, x10] incb x10, all, mul #2 str z1, [x12, #1, mul vl] ... ``` When compiling with, we get: ``` ... ldp q0, q1, [x12, #-16] ldp q2, q3, [x11, #-16] subs x13, x13, #8 fadd v0.4s, v2.4s, v0.4s fadd v1.4s, v3.4s, v1.4s add x11, x11, #32 add x12, x12, #32 stp q0, q1, [x10, #-16] add x10, x10, #32 ... ```
Auto-generate check lines for scalable-loop-unpredicated-body-scalar-tail.ll, while also updating the input to be more compact and avoid unnecessary checks to keep auto-generated checks compact without loss of generality.
…e.td. (llvm#152547) Only set a target guard if it deviates from its default value[1]. When a target guard is set, it is automatically AND'd with its default value. This means there is no need to use SVETargetGuard="sve,bf16" because SVETargetGuard="bf16" is sufficient. [1] Defaults: SVETargetGuard="sve", SMETargetGuard="sme"
…m#153392) This introduced a 5% compile-time regression on AArch64, see https://llvm-compile-time-tracker.com/compare.php?from=b9138bde3562de5c28a239dbd303caf2406678c6&to=271688b87abe7cf45aceaff8266270a25eb7b436&stat=instructions:u. Reverts llvm#152505.
This commit optimizes `tok::isLiteral` by replacing a succession of `13` conditions with a range-based check. I am not sure whether this is allowed. I believe it is done nowhere else in the codebase; however, I have seen range-based conditions being used with other enums. --------- Co-authored-by: Corentin Jabot <corentinjabot@gmail.com>
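As a rough illustration of the idea, the sketch below compares a chain of equality tests with a range check over an enum whose literal kinds are declared contiguously. The `TokenKind` enum and both function names are hypothetical and much smaller than clang's real token list; the range form is only valid while the literal enumerators stay adjacent.

```cpp
#include <cassert>

// Hypothetical token kinds; the real list is far longer.
enum class TokenKind : unsigned {
  identifier,
  // Literal kinds are kept contiguous so that a range check is valid.
  numeric_constant,
  char_constant,
  string_literal,
  header_name,
  // End of literal kinds.
  l_paren,
  r_paren,
};

// Original style: one comparison per literal kind.
constexpr bool isLiteralChain(TokenKind k) {
  return k == TokenKind::numeric_constant || k == TokenKind::char_constant ||
         k == TokenKind::string_literal || k == TokenKind::header_name;
}

// Range-based style: two comparisons, relying on enumerator order.
constexpr bool isLiteralRange(TokenKind k) {
  return k >= TokenKind::numeric_constant && k <= TokenKind::header_name;
}

// Compile-time spot checks that both forms agree at the range boundaries.
static_assert(isLiteralRange(TokenKind::numeric_constant), "first literal");
static_assert(isLiteralRange(TokenKind::header_name), "last literal");
static_assert(!isLiteralRange(TokenKind::identifier), "before the range");
static_assert(!isLiteralRange(TokenKind::l_paren), "after the range");
```

The design trade-off is that inserting a non-literal enumerator inside the range silently changes the result, so the ordering assumption is worth documenting next to the enum.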
…#153042) Implement a framework to make it easier to detect if evaluate::Expr<T> has certain structure.
…tests (llvm#153383) We missed several +/-0.0 comparison mismatches due to only doing equality checks
…partial reland of llvm#152505) (llvm#153398)
…lvm#146328) This is a series of patches (1/4) to unify assembly/disassembly of recent AArch64 tests into a single file. The aim is to improve consistency, so that all instructions and system registers are thoroughly tested, and future test cases will be in a unified format. This patch: * unifies errorless .s and .txt tests into a single file * removes .txt tests which don't have feature requirements * makes the .s tests have a roundabout run line to test both encoding and assembly See also llvm#146329, llvm#146330 and llvm#146331. --------- Co-authored-by: Virginia Cangelosi <virginia.cangelosi@arm.com>
…139712) Updated version of llvm#134990 which was reverted because of the buildbot [openmp-offload-amdgpu-runtime-2](https://lab.llvm.org/buildbot/#/builders/10) failing. Its configuration has `LLVM_ENABLE_LIBCXX=ON` set although it does not even have libc++ installed. In addition to llvm#134990, this PR adds a check whether C++ libraries are actually available. `#include <chrono>` was chosen because this is the library that [openmp-offload-amdgpu-runtime-2 failed with](llvm#134990 (comment)). Original summary: The buildbot [flang-aarch64-libcxx](https://lab.llvm.org/buildbot/#/builders/89) is currently failing with an ABI issue. The suspected reason is that LLVMSupport.a is built using libc++, but the unittests are using the default C++ standard library, libstdc++ in this case. This predefined `llvm_gtest` target uses the LLVMSupport from `find_package(LLVM)`, which finds the libc++-built LLVMSupport. To fix, store the `LLVM_ENABLE_LIBCXX` setting in the LLVMConfig.cmake such that everything that links to LLVM libraries uses the same standard library. In the case discussed in llvm/llvm-zorg#387 it was the flang-rt unittests, but other runtimes with GTest unittests should have the same issue (e.g. offload), and any external project that uses `find_package(LLVM)`. This patch fixed the problem for me locally.
…ests Avoid nested intrinsics in constexpr tests - use __32qs instead to work correctly with -fno-signed-char tests
…th -fno-signed-char tests Work with explicit signed char types to avoid signed/unsigned char truncation out of bounds warnings
Catches a typo in the _mm512_adds_epu8 constexpr test
Caused by llvm#153297. This is likely not Windows on Arm specific but the other Windows bots don't have this problem. json::parse tries to construct llvm::Expected<Message> by moving another instance of Message into it. This would normally use this constructor: ``` /// Create an Expected<T> success value from the given OtherT value, which /// must be convertible to T. template <typename OtherT> Expected(OtherT &&Val, std::enable_if_t<std::is_convertible_v<OtherT, T>> * = nullptr) <...> ``` Note that llvm::Expected does not have a T&& constructor. Presumably the authors thought the converting one would be used. Except that in our build, using clang-cl 19.1.7, Visual Studio 2022, MSVC STL 202208, somehow is_convertible_v is false for Message. If you add a static_assert to check that this is the case, it suddenly becomes convertible. As best I can understand, this is because evaluation of the properties of Message is delayed. Delayed so much that by the time the constructor is called, it's still false. So we can "fix" this by asserting that it is convertible some time before it is checked by the constructor. I'm not sure if that's a compiler problem or the way MSVC STL's variant is written. I couldn't reproduce this behaviour in smaller examples or on Linux systems. This is the least invasive fix and only touches the new lldb code, so I'm going with it. Including the whole error here because it's a strange one and maybe later someone who has a clue about this can fix it in a better way. 
``` C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\build>ninja [5/15] Building CXX object tools\lldb\source\Pro...CP\CMakeFiles\lldbProtocolMCP.dir\Server.cpp.obj FAILED: tools/lldb/source/Protocol/MCP/CMakeFiles/lldbProtocolMCP.dir/Server.cpp.obj C:\Users\tcwg\scoop\shims\ccache.exe C:\Users\tcwg\scoop\apps\llvm-arm64\current\bin\clang-cl.exe /nologo -TP -DGTEST_HAS_RTTI=0 -DUNICODE -D_CRT_NONSTDC_NO_DEPRECATE -D_CRT_NONSTDC_NO_WARNINGS -D_CRT_SECURE_NO_DEPRECATE -D_CRT_SECURE_NO_WARNINGS -D_ENABLE_EXTENDED_ALIGNED_STORAGE -D_HAS_EXCEPTIONS=0 -D_SCL_SECURE_NO_DEPRECATE -D_SCL_SECURE_NO_WARNINGS -D_UNICODE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -IC:\Users\tcwg\llvm-worker\lldb-aarch64-windows\build\tools\lldb\source\Protocol\MCP -IC:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\source\Protocol\MCP -IC:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\include -IC:\Users\tcwg\llvm-worker\lldb-aarch64-windows\build\tools\lldb\include -IC:\Users\tcwg\llvm-worker\lldb-aarch64-windows\build\include -IC:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\llvm\include -IC:\Users\tcwg\scoop\apps\python\current\include -IC:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\llvm\..\clang\include -IC:\Users\tcwg\llvm-worker\lldb-aarch64-windows\build\tools\lldb\..\clang\include -IC:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\source -IC:\Users\tcwg\llvm-worker\lldb-aarch64-windows\build\tools\lldb\source /DWIN32 /D_WINDOWS /Zc:inline /Zc:__cplusplus /Oi /Brepro /bigobj /permissive- -Werror=unguarded-availability-new /W4 -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported /Gw -Wno-vla-extension /O2 /Ob2 /DNDEBUG -std:c++17 -MD -wd4018 -wd4068 -wd4150 
-wd4201 -wd4251 -wd4521 -wd4530 -wd4589 /EHs-c- /GR- /showIncludes /Fotools\lldb\source\Protocol\MCP\CMakeFiles\lldbProtocolMCP.dir\Server.cpp.obj /Fdtools\lldb\source\Protocol\MCP\CMakeFiles\lldbProtocolMCP.dir\lldbProtocolMCP.pdb -c -- C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\source\Protocol\MCP\Server.cpp In file included from C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\source\Protocol\MCP\Server.cpp:9: In file included from C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\include\lldb/Protocol/MCP/Server.h:12: In file included from C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\include\lldb/Protocol/MCP/Protocol.h:17: C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\llvm\include\llvm/Support/JSON.h(938,12): error: no viable conversion from returned value of type 'remove_reference_t<variant<Request, Response, Notification> &>' (aka 'std::variant<lldb_protocol::mcp::Request, lldb_protocol::mcp::Response, lldb_protocol::mcp::Notification>') to function return type 'Expected<variant<Request, Response, Notification>>' 938 | return std::move(Result); | ^~~~~~~~~~~~~~~~~ C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\source\Protocol\MCP\Server.cpp(57,30): note: in instantiation of function template specialization 'llvm::json::parse<std::variant<lldb_protocol::mcp::Request, lldb_protocol::mcp::Response, lldb_protocol::mcp::Notification>>' requested here 57 | auto message = llvm::json::parse<Message>(/*JSON=*/data); | ^ C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\llvm\include\llvm/Support/Error.h(485,40): note: candidate constructor (the implicit copy constructor) not viable: no known conversion from 'remove_reference_t<variant<Request, Response, Notification> &>' (aka 'std::variant<lldb_protocol::mcp::Request, lldb_protocol::mcp::Response, lldb_protocol::mcp::Notification>') to 'const Expected<variant<Request, Response, Notification>> &' for 1st 
argument 485 | template <class T> class [[nodiscard]] Expected { | ^~~~~~~~ C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\llvm\include\llvm/Support/Error.h(507,3): note: candidate constructor not viable: no known conversion from 'remove_reference_t<variant<Request, Response, Notification> &>' (aka 'std::variant<lldb_protocol::mcp::Request, lldb_protocol::mcp::Response, lldb_protocol::mcp::Notification>') to 'Error &&' for 1st argument 507 | Expected(Error &&Err) | ^ ~~~~~~~~~~~ C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\llvm\include\llvm/Support/Error.h(521,3): note: candidate constructor not viable: no known conversion from 'remove_reference_t<variant<Request, Response, Notification> &>' (aka 'std::variant<lldb_protocol::mcp::Request, lldb_protocol::mcp::Response, lldb_protocol::mcp::Notification>') to 'ErrorSuccess' for 1st argument 521 | Expected(ErrorSuccess) = delete; | ^ ~~~~~~~~~~~~ C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\llvm\include\llvm/Support/Error.h(539,3): note: candidate constructor not viable: no known conversion from 'remove_reference_t<variant<Request, Response, Notification> &>' (aka 'std::variant<lldb_protocol::mcp::Request, lldb_protocol::mcp::Response, lldb_protocol::mcp::Notification>') to 'Expected<variant<Request, Response, Notification>> &&' for 1st argument 539 | Expected(Expected &&Other) { moveConstruct(std::move(Other)); } | ^ ~~~~~~~~~~~~~~~~ C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\llvm\include\llvm/Support/Error.h(526,3): note: candidate template ignored: requirement 'std::is_convertible_v<std::variant<lldb_protocol::mcp::Request, lldb_protocol::mcp::Response, lldb_protocol::mcp::Notification>, std::variant<lldb_protocol::mcp::Request, lldb_protocol::mcp::Response, lldb_protocol::mcp::Notification>>' was not satisfied [with OtherT = remove_reference_t<variant<Request, Response, Notification> &>] 526 | Expected(OtherT &&Val, | ^ 
C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\llvm\include\llvm/Support/Error.h(544,3): note: candidate template ignored: could not match 'Expected' against 'variant' 544 | Expected(Expected<OtherT> &&Other, | ^ C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\llvm\include\llvm/Support/Error.h(552,12): note: explicit constructor is not a candidate 552 | explicit Expected( | ^ 1 error generated. [8/15] Building CXX object tools\lldb\source\Plu...nProtocolServerMCP.dir\ProtocolServerMCP.cpp.obj ninja: build stopped: subcommand failed. ```
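To make the shape of the workaround concrete, here is a minimal sketch under stated assumptions: `ExpectedLike` is a hypothetical, heavily simplified stand-in for `llvm::Expected` that keeps only the converting constructor, and `Request`, `Response`, and `parseMessage` are invented for illustration. It shows where an early `static_assert` on `std::is_convertible_v` would sit; it does not reproduce the reported toolchain behaviour, which the author could not reproduce in small examples either.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>
#include <variant>

// Hypothetical, simplified stand-in for llvm::Expected<T>: success path only.
template <typename T> class ExpectedLike {
public:
  // Converting constructor, enabled only when OtherT is convertible to T.
  template <typename OtherT>
  ExpectedLike(OtherT &&val,
               std::enable_if_t<std::is_convertible_v<OtherT, T>> * = nullptr)
      : value_(std::forward<OtherT>(val)) {}

  T &get() { return value_; }

private:
  T value_;
};

// Invented message types for the sketch.
struct Request { int id; };
struct Response { int id; };
using Message = std::variant<Request, Response>;

// Evaluate the trait early, before any use of the converting constructor.
static_assert(std::is_convertible_v<Message, Message>,
              "Message must be convertible to itself");

// Mirrors the 'return std::move(Result);' pattern from the error output.
ExpectedLike<Message> parseMessage(int id) {
  Message result = Request{id};
  return std::move(result);
}
```

On a conforming toolchain the assertion is trivially true, so its only effect is to force the trait to be evaluated at that point in the translation unit.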
…ned (llvm#148848) DriverKit doesn't define `os_log_error`, so fails to build. Fallback to `os_log` if on DriverKit. rdar://140295247
An atomic update expression of form x = x + a + b is technically illegal, since the right-hand side is parsed as (x+a)+b, and the atomic variable x should be an argument to the top-level +. When the type of x is integer, the result of (x+a)+b is guaranteed to be the same as x+(a+b), so instead of reporting an error, the compiler can treat (x+a)+b as x+(a+b). This PR implements this kind of reassociation for integral types, and for the two arithmetic associative/commutative operators: + and *.
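The integer guarantee can be checked with a small sketch, written in C++ purely for illustration (the function names are hypothetical and nothing here is taken from the PR). Unsigned arithmetic is used so that wraparound is well defined; the floating-point pair shows why the same rewrite is not applied to non-integral types.

```cpp
#include <cassert>
#include <cstdint>

// (x + a) + b, the shape the parser produces for x + a + b.
constexpr std::uint32_t leftAssoc(std::uint32_t x, std::uint32_t a,
                                  std::uint32_t b) {
  return (x + a) + b;
}

// x + (a + b), the shape where x is an operand of the top-level +.
constexpr std::uint32_t reassociated(std::uint32_t x, std::uint32_t a,
                                     std::uint32_t b) {
  return x + (a + b);
}

// Same two shapes for float, where rounding makes the results differ.
inline float leftAssocF(float x, float a, float b) { return (x + a) + b; }
inline float reassociatedF(float x, float a, float b) { return x + (a + b); }

// Modular integer addition is associative even when it wraps around.
static_assert(leftAssoc(0xFFFFFFF0u, 0x20u, 0x30u) ==
                  reassociated(0xFFFFFFF0u, 0x20u, 0x30u),
              "integer reassociation preserves the result");
```

With `x = 1e20f`, `a = -1e20f`, `b = 1.0f`, the left-associated form yields `1.0f` while the reassociated form yields `0.0f`, because `1.0f` is absorbed when added to `-1e20f`.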
…cit method call (llvm#153373) This PR introduces a mechanism to defer JIT engine initialization, enabling registration of required symbols before global constructor execution. ## Problem Modules containing `gpu.module` generate global constructors (e.g., kernel load/unload) that execute *during* engine creation. This can force premature symbol resolution, causing failures when: - Symbols are registered via `mlirExecutionEngineRegisterSymbol` *after* creation - Global constructors exist (even if not directly using unresolved symbols, e.g., an external function declaration) - GPU modules introduce mandatory binary loading logic ## Usage ```c // Create engine without initialization MlirExecutionEngine jit = mlirExecutionEngineCreate(...); // Register required symbols mlirExecutionEngineRegisterSymbol(jit, ...); // Explicitly initialize (runs global constructors) mlirExecutionEngineInitialize(jit); ``` --------- Co-authored-by: Mehdi Amini <joker.eph@gmail.com>
…lvm#146329) This is a series of patches (2/4) to unify assembly/disassembly of recent AArch64 tests into a single file. The aim is to improve consistency, so that all instructions and system registers are thoroughly tested, and future test cases will be in a unified format. This patch: * removes .txt tests which have only one feature required * makes the .s tests have a roundabout run line to test both encoding and assembly * creates diagnostic tests when needed * fixes naming convention of tests See also llvm#146328, llvm#146330 and llvm#146331. Co-authored-by: Virginia Cangelosi <virginia.cangelosi@arm.com>
Tag types like structs or enums didn't have a declaration attached to them. The source locations are present in the IPI stream in `LF_UDT_MOD_SRC_LINE` records: ``` 0x101F | LF_UDT_MOD_SRC_LINE [size = 18, hash = 0x1C63] udt = 0x1058, mod = 3, file = 1, line = 0 0x2789 | LF_UDT_MOD_SRC_LINE [size = 18, hash = 0x1E5A] udt = 0x1253, mod = 35, file = 93, line = 17069 ``` The file is an ID in the string table `/names`: ``` ID | String 1 | '\<unknown>' 12 | 'D:\a\_work\1\s\src\ExternalAPIs\WindowsSDKInc\c\Include\10.0.22621.0\um\wingdi.h' 93 | 'D:\a\_work\1\s\src\ExternalAPIs\WindowsSDKInc\c\Include\10.0.22621.0\um\winnt.h' ``` Here, we're not interested in `mod`. This would indicate which module contributed the UDT. I was looking at Rustc's PDB and found that it uses `<unknown>` for some types, so I added a check for that. This makes two DIA PDB shell tests work with the native PDB plugin. --------- Co-authored-by: Michael Buch <michaelbuch12@gmail.com>
Fix Mac build breakage (reported by aeubanks in llvm#153142 (comment)) by including stdint.h and using uintptr_t
…gather/store_scatter (llvm#152429) Lowering transfer_read/transfer_write to load_gather/store_scatter in case the target uArch doesn't support load_nd/store_nd. The high level steps: 1. compute Strides; 2. compute Offsets; 3. collapseMemrefTo1D; 4. create Load gather or store_scatter op
) Directly emit shl instead of a multiply if VF * Step is a power-of-2. The main motivation here is to prepare the code and test for directly generating and expanding a SCEV expression of the minimum iteration count. SCEVExpander will directly emit shl for multiplies with powers-of-2. InstCombine also performs this combine, so end-to-end this should effectively be NFC. PR: llvm#153495
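The arithmetic behind this is simply that multiplying by 2^k equals shifting left by k. A minimal sketch on plain integers, with hypothetical helper names (this is not the vectorizer or SCEVExpander code):

```cpp
#include <cassert>
#include <cstdint>

// True when v has exactly one bit set.
constexpr bool isPowerOf2(std::uint64_t v) {
  return v != 0 && (v & (v - 1)) == 0;
}

// log2 of a value already known to be a power of two.
constexpr unsigned log2Exact(std::uint64_t v) {
  unsigned shift = 0;
  while (v > 1) {
    v >>= 1;
    ++shift;
  }
  return shift;
}

// Scale n by vfTimesStep, using a shift when the factor is a power of two
// and falling back to a multiply otherwise.
constexpr std::uint64_t scaleBy(std::uint64_t n, std::uint64_t vfTimesStep) {
  return isPowerOf2(vfTimesStep) ? n << log2Exact(vfTimesStep)
                                 : n * vfTimesStep;
}

static_assert(scaleBy(5, 8) == 5 * 8, "shift path matches the multiply");
```

Both paths give the same value for unsigned operands, which is why the change is expected to be invisible end to end.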
The current implementation tries to (1) patch the existing readline module definition if it's already present in the inittab and (2) append our patched readline module to the inittab. The former (1) uses the non-stable Python API and I can't find a situation where this is necessary. We do this work before initialization, so for the readline module to exist, it either needs to be added by Python itself (which doesn't seem to be the case), or someone would have had to have added it without initializing.
`y` should be the first argument and `x` should be the second, otherwise the formula is wrong. This also matches the documentation [here](https://learn.microsoft.com/en-us/windows/win32/direct3dhlsl/dx-graphics-hlsl-step).
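For reference, a scalar C++ model of the documented HLSL behaviour, where `step(y, x)` returns 1 when `x >= y` and 0 otherwise. The names `stepRef` and `stepSwapped` are hypothetical; the second function exists only to show how swapping the operands changes the result.

```cpp
#include <cassert>

// Scalar model of HLSL step(y, x): 1 if x >= y, otherwise 0.
constexpr float stepRef(float y, float x) { return x >= y ? 1.0f : 0.0f; }

// The comparison with the operands the wrong way round, for contrast.
constexpr float stepSwapped(float y, float x) { return y >= x ? 1.0f : 0.0f; }

static_assert(stepRef(0.5f, 0.5f) == 1.0f, "x equal to the edge yields 1");
```

With an edge `y = 0.5f`, an input `x = 0.7f` gives `1.0f` from `stepRef` but `0.0f` from `stepSwapped`.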
…#150655) Instead of just outputting everything into the designated root folder, HTML and JSON output will be placed in html/ and json/ directories.
Fixing buildbot failures after PR llvm#153305, e.g. https://lab.llvm.org/buildbot/#/builders/203/builds/19861 Analysis already depends on `ProfileData`, so the transitive closure of the dependencies of `ScalarOpts` doesn't change. Also avoided an extra dependency (and very unnecessary) on `Instrumentation`. The API previously used doesn't need to live in Instrumentation to begin with, but that's something to address in a follow-up.
…nType (llvm#153646) This was a regression introduced in llvm#147835 Since this regression was never released, there are no release notes. Fixes llvm#153540
…m#145623) Emit safety guards for ptr accesses when cross partition loads exist which have a corresponding store to the same address in a different partition. This will emit the necessary ptr checks for these accesses. The test case was obtained from SuperTest, which SiFive runs regularly. We enabled LoopDistribution by default in our downstream compiler, this change was part of that enablement.
Remove extraneous argument from printf statement --------- Co-authored-by: Joachim <protze@rz.rwth-aachen.de>
The 'cfi_salt' attribute specifies a string literal that is used as a "salt" for Control-Flow Integrity (CFI) checks to distinguish between functions with the same type signature. This attribute can be applied to function declarations, function definitions, and function pointer typedefs. This attribute prevents function pointers from being replaced with pointers to functions that have a compatible type, which can be a CFI bypass vector. The attribute affects type compatibility during compilation and CFI hash generation during code generation. Attribute syntax: [[clang::cfi_salt("<salt_string>")]] GNU-style syntax: __attribute__((cfi_salt("<salt_string>"))) - The attribute takes a single string of non-NULL ASCII characters. - It only applies to function types; using it on a non-function type will generate an error. - All function declarations and the function definition must include the attribute and use identical salt values. Example usage: // Header file: #define __cfi_salt(S) __attribute__((cfi_salt(S))) // Convenient typedefs to avoid nested declarator syntax. typedef int (*fp_unsalted_t)(void); typedef int (*fp_salted_t)(void) __cfi_salt("pepper"); struct widget_ops { fp_unsalted_t init; // Regular CFI. fp_salted_t exec; // Salted CFI. fp_unsalted_t teardown; // Regular CFI. }; // bar.c file: static int bar_init(void) { ... } static int bar_salted_exec(void) __cfi_salt("pepper") { ... } static int bar_teardown(void) { ... } static struct widget_generator _generator = { .init = bar_init, .exec = bar_salted_exec, .teardown = bar_teardown, }; struct widget_generator *widget_gen = _generator; // 2nd .c file: int generate_a_widget(void) { int ret; // Called with non-salted CFI. ret = widget_gen.init(); if (ret) return ret; // Called with salted CFI. ret = widget_gen.exec(); if (ret) return ret; // Called with non-salted CFI. 
return widget_gen.teardown(); } Link: ClangBuiltLinux/linux#1736 Link: KSPP/linux#365 --------- Signed-off-by: Bill Wendling <morbo@google.com> Co-authored-by: Aaron Ballman <aaron@aaronballman.com>
…#153604) Like we did for the 'private' clause, this adds an easier to use helper function to add the 'firstprivate' clause + recipe to the Parallel and Serial ops.
This is NFC as this target does not have it.
llvm#153472) Use the Python limited API when building with SWIG 4.2 or later.
The stable function map could be huge for a large application. Fully loading it is slow and consumes a significant amount of memory, which is unnecessary and drastically slows down compilation especially for non-LTO and distributed-ThinLTO setups. This patch introduces an opt-in lazy loading support for the stable function map. The detailed changes are: - `StableFunctionMap` - The map now stores entries in an `EntryStorage` struct, which includes offsets for serialized entries and a `std::once_flag` for thread-safe lazy loading. - The underlying map type is changed from `DenseMap` to `std::unordered_map` for compatibility with `std::once_flag`. - `contains()`, `size()` and `at()` are implemented to only load requested entries on demand. - Lazy Loading Mechanism - When reading indexed codegen data, if the newly-introduced `-indexed-codegen-data-lazy-loading` flag is set, the stable function map is not fully deserialized up front. The binary format for the stable function map now includes offsets and sizes to support lazy loading. - The safety of lazy loading is guarded by the once flag per function hash. This guarantees that even in a multi-threaded environment, the deserialization for a given function hash will happen exactly once. The first thread to request it performs the load, and subsequent threads will wait for it to complete before using the data. For single-threaded builds, the overhead is negligible (a single check on the once flag). For multi-threaded scenarios, users can omit the flag to retain the previous eager-loading behavior.
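The per-entry once-flag scheme can be sketched as follows. This is a simplified illustration, not the real `StableFunctionMap`: `LazyFunctionMap`, `addIndexEntry`, `loads`, and the vector-of-strings "blob" are hypothetical, and the serialized format is reduced to an index into that vector. One likely reason a node-based `std::unordered_map` fits here is that `std::once_flag` is neither copyable nor movable, so entries must stay where they were constructed.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical sketch of lazy, thread-safe, per-entry deserialization.
class LazyFunctionMap {
  struct EntryStorage {
    std::size_t offset = 0; // where the serialized entry lives in the blob
    std::string value;      // deserialized payload, filled on first access
    std::once_flag loaded;  // guarantees the load happens exactly once
  };

public:
  explicit LazyFunctionMap(std::vector<std::string> blob)
      : blob_(std::move(blob)) {}

  // Registers an entry by offset only; nothing is deserialized yet.
  void addIndexEntry(std::uint64_t hash, std::size_t offset) {
    entries_[hash].offset = offset;
  }

  bool contains(std::uint64_t hash) const { return entries_.count(hash) != 0; }
  std::size_t size() const { return entries_.size(); }

  // Loads the requested entry on demand; later calls reuse the result.
  const std::string &at(std::uint64_t hash) {
    EntryStorage &entry = entries_.at(hash);
    std::call_once(entry.loaded, [&] {
      entry.value = blob_[entry.offset]; // stand-in for real deserialization
      ++loads_;
    });
    return entry.value;
  }

  int loads() const { return loads_.load(); }

private:
  std::vector<std::string> blob_;
  std::unordered_map<std::uint64_t, EntryStorage> entries_;
  std::atomic<int> loads_{0};
};
```

The sketch assumes the index is fully populated before any concurrent `at()` calls begin, since inserting into the map while other threads read it would need additional synchronization.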
This patch adds a symbol table to CIRGenFunction plus scopes and insertions to the table where we were missing them previously.
) As fmul and fmadd are so similar, their performance characteristics tend to be the same on most platforms, at least in terms of reciprocal throughputs. Processors capable of performing a given number of fmul per cycle can usually perform the same number of fma, with the extra add being relatively simple on top. This patch makes the scores of the two operations the same, which brings the throughput cost of a fma/fmuladd to 2, and the latency to 3, which are the defaults for fmul. Note that we might also want to change the throughput cost of a fmul to 1, as most processors have ample bandwidth for them, but they should still stay in-line with one another.
…ay-section-using-attach-maptype
This reverts commit ca4ebf9. Causes compile-time crashes for some inputs with RVV zvl512b/zvl1024b configurations. See here for a minimal reproducer: llvm#153393 (comment)
…on-using-attach-maptype
This is the initial clang change to support using `ATTACH` map-type for pointer-attachment.
This builds upon the following:
`target` by reference. llvm/llvm-project#145454
For example, for the following:
The following maps are now emitted by clang:
Previously, the two possible maps emitted by clang were:
(B) does not perform any pointer attachment, while (C) also maps the pointer p, both of which are incorrect.
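For illustration only, a minimal sketch of the kind of construct this change is about: an array section whose base is a pointer. This is not the example from the description; `scaleOnDevice` and its parameters are hypothetical, and the comments describe the intended effect in general terms rather than the exact maps clang emits.

```cpp
#include <cassert>

// Hypothetical helper: scales n ints reachable through the pointer p.
// Without -fopenmp the pragma is ignored and the loop runs on the host.
void scaleOnDevice(int *p, int n, int factor) {
  // The array section p[0:n] maps the pointee storage. The pointer p itself
  // is not meant to be mapped as data; it only needs to end up referring to
  // the device copy of the pointee, which is what pointer attachment does.
#pragma omp target map(tofrom : p[0 : n])
  for (int i = 0; i < n; ++i)
    p[i] *= factor;
}
```

A host-only build exercises the same code path minus the offload, which is enough to check the arithmetic but says nothing about the emitted map entries.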
With this change, we are using ATTACH-style maps, like `(A)`, for cases where the expression has a base-pointer. For example:
We also group mapping of clauses with the same base decl in the order of the increasing complexity of their base-pointers, e.g. for something like:
We first map `spp`, then `spp[0]`, then `spp[0][0]` and `spp[0][0].a`.
This allows us to also group "struct" allocation based on their attach pointers.
Cases that need handling:
- When `p` is a base-pointer in a map from a member function within the same class, `p` is not being privatized; instead, we still try to create an implicit map of `this[0:1]`, and access `p` through that, which is incorrect.
- The `use_device_addr` clause does not work properly, because we don't have a proper component-list set-up for it, just one component, so we cannot find the proper attach-ptr. For `use_device_addr`, we should match existing maps whose attach-ptr matches the attach-ptr of the `use_device_addr` operand.
- `use_device_ptr` handling has some issues too. Need debugging.
Some tests still haven't been updated. These include: