[clang] Add a CodeGen option to ignore compilation directories #149897

Open: 3 commits to be merged into base: main

Conversation

cachemeifyoucan
Collaborator

When enabling explicit gmodule builds, it is hard to enable the unused
current working directory optimization, because the CWD is embedded in the
debug info even when the dependency scanner thinks the option is not needed.
This is true even when you pass in an empty -fdebug-compilation-dir,
since codegen will directly ask the file system for the CWD as a fallback,
so gmodules compiled in different directories end up different.

Add a new cc1 flag, -fno-compilation-dir, that makes
DebugCompilationDir empty. This allows an explicit module build to be
truly free from CWD dependencies when it needs to be. That can
reduce the number of clang modules generated in a complex project,
with a slight overhead for string deduplication in debug info.
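
The lookup order described above can be sketched in isolation. This is a minimal illustration, not the patch itself: `DirOptions` and `resolveCompilationDir` are hypothetical names, and `std::filesystem::current_path` stands in for the file-system query that clang performs.

```cpp
#include <filesystem>
#include <string>
#include <system_error>

// Hypothetical stand-in for the relevant CodeGenOptions fields.
struct DirOptions {
  bool NoCompilationDir = false;    // models -fno-compilation-dir
  std::string DebugCompilationDir;  // models -fdebug-compilation-dir=
};

// Resolve the directory to embed in debug info: the opt-out flag wins,
// then an explicit directory, then the process CWD as the fallback.
std::string resolveCompilationDir(const DirOptions &Opts) {
  if (Opts.NoCompilationDir)
    return {};
  if (!Opts.DebugCompilationDir.empty())
    return Opts.DebugCompilationDir;
  std::error_code EC;
  std::filesystem::path CWD = std::filesystem::current_path(EC);
  return EC ? std::string() : CWD.string();
}
```

Without the first check, an empty `DebugCompilationDir` falls through to the CWD query, which is the behavior the description complains about.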

Created using spr 1.3.6
@llvmbot added the clang, clang:frontend, clang:codegen, and debuginfo labels on Jul 21, 2025
@llvmbot
Member

llvmbot commented Jul 21, 2025

@llvm/pr-subscribers-debuginfo
@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clang-codegen

Author: Steven Wu (cachemeifyoucan)

Changes


Full diff: https://github.com/llvm/llvm-project/pull/149897.diff

8 Files Affected:

  • (modified) clang/include/clang/Basic/CodeGenOptions.h (+3)
  • (modified) clang/include/clang/Driver/Options.td (+5)
  • (modified) clang/lib/CodeGen/CGDebugInfo.cpp (+6)
  • (modified) clang/lib/CodeGen/CoverageMappingGen.cpp (+3)
  • (modified) clang/lib/CodeGen/ObjectFilePCHContainerWriter.cpp (+2)
  • (modified) clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp (+6-24)
  • (modified) clang/test/ClangScanDeps/modules-debug-dir.c (+8-1)
  • (modified) clang/test/CodeGen/debug-info-compilation-dir.c (+7)
diff --git a/clang/include/clang/Basic/CodeGenOptions.h b/clang/include/clang/Basic/CodeGenOptions.h
index cdeedd5b4eac6..9ec2784f6ce1c 100644
--- a/clang/include/clang/Basic/CodeGenOptions.h
+++ b/clang/include/clang/Basic/CodeGenOptions.h
@@ -231,6 +231,9 @@ class CodeGenOptions : public CodeGenOptionsBase {
   /// The string to embed in coverage mapping as the current working directory.
   std::string CoverageCompilationDir;
 
+  /// No compilation directory, ignore debug/coverage compilation directory.
+  bool NoCompilationDir;
+
   /// The string to embed in the debug information for the compile unit, if
   /// non-empty.
   std::string DwarfDebugFlags;
diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index e30c152cbce2e..1307e8a6e9e12 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -1725,6 +1725,11 @@ def fcoverage_compilation_dir_EQ : Joined<["-"], "fcoverage-compilation-dir=">,
 def ffile_compilation_dir_EQ : Joined<["-"], "ffile-compilation-dir=">, Group<f_Group>,
     Visibility<[ClangOption, CLOption, DXCOption]>,
     HelpText<"The compilation directory to embed in the debug info and coverage mapping.">;
+def fno_compilation_dir: Flag<["-"], "fno-compilation-dir">,
+    Visibility<[ClangOption, CC1Option]>,
+    Group<f_Group>,
+    HelpText<"Ignore compilation directories">,
+    MarshallingInfoFlag<CodeGenOpts<"NoCompilationDir">>;
 defm debug_info_for_profiling : BoolFOption<"debug-info-for-profiling",
   CodeGenOpts<"DebugInfoForProfiling">, DefaultFalse,
   PosFlag<SetTrue, [], [ClangOption, CC1Option],
diff --git a/clang/lib/CodeGen/CGDebugInfo.cpp b/clang/lib/CodeGen/CGDebugInfo.cpp
index a371b6755f74d..8e4644e594d71 100644
--- a/clang/lib/CodeGen/CGDebugInfo.cpp
+++ b/clang/lib/CodeGen/CGDebugInfo.cpp
@@ -643,6 +643,9 @@ unsigned CGDebugInfo::getColumnNumber(SourceLocation Loc, bool Force) {
 }
 
 StringRef CGDebugInfo::getCurrentDirname() {
+  if (CGM.getCodeGenOpts().NoCompilationDir)
+    return StringRef();
+
   if (!CGM.getCodeGenOpts().DebugCompilationDir.empty())
     return CGM.getCodeGenOpts().DebugCompilationDir;
 
@@ -3246,6 +3249,9 @@ llvm::DIModule *CGDebugInfo::getOrCreateModuleRef(ASTSourceDescriptor Mod,
     std::string Remapped = remapDIPath(Path);
     StringRef Relative(Remapped);
     StringRef CompDir = TheCU->getDirectory();
+    if (CompDir.empty())
+      return Remapped;
+
     if (Relative.consume_front(CompDir))
       Relative.consume_front(llvm::sys::path::get_separator());
 
diff --git a/clang/lib/CodeGen/CoverageMappingGen.cpp b/clang/lib/CodeGen/CoverageMappingGen.cpp
index 4aafac349e3e9..0e90908c8db7b 100644
--- a/clang/lib/CodeGen/CoverageMappingGen.cpp
+++ b/clang/lib/CodeGen/CoverageMappingGen.cpp
@@ -2449,6 +2449,9 @@ CoverageMappingModuleGen::CoverageMappingModuleGen(
     : CGM(CGM), SourceInfo(SourceInfo) {}
 
 std::string CoverageMappingModuleGen::getCurrentDirname() {
+  if (CGM.getCodeGenOpts().NoCompilationDir)
+    return {};
+
   if (!CGM.getCodeGenOpts().CoverageCompilationDir.empty())
     return CGM.getCodeGenOpts().CoverageCompilationDir;
 
diff --git a/clang/lib/CodeGen/ObjectFilePCHContainerWriter.cpp b/clang/lib/CodeGen/ObjectFilePCHContainerWriter.cpp
index 95971e57086e7..a5be5bbee753b 100644
--- a/clang/lib/CodeGen/ObjectFilePCHContainerWriter.cpp
+++ b/clang/lib/CodeGen/ObjectFilePCHContainerWriter.cpp
@@ -164,6 +164,8 @@ class PCHContainerGenerator : public ASTConsumer {
     CodeGenOpts.DwarfVersion = CI.getCodeGenOpts().DwarfVersion;
     CodeGenOpts.DebugCompilationDir =
         CI.getInvocation().getCodeGenOpts().DebugCompilationDir;
+    CodeGenOpts.NoCompilationDir =
+        CI.getInvocation().getCodeGenOpts().NoCompilationDir;
     CodeGenOpts.DebugPrefixMap =
         CI.getInvocation().getCodeGenOpts().DebugPrefixMap;
     CodeGenOpts.DebugStrictDwarf = CI.getCodeGenOpts().DebugStrictDwarf;
diff --git a/clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp b/clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
index 37f8b945d785e..7226e06db9092 100644
--- a/clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
+++ b/clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
@@ -144,30 +144,9 @@ static void optimizeDiagnosticOpts(DiagnosticOptions &Opts,
 
 static void optimizeCWD(CowCompilerInvocation &BuildInvocation, StringRef CWD) {
   BuildInvocation.getMutFileSystemOpts().WorkingDir.clear();
-  if (BuildInvocation.getCodeGenOpts().DwarfVersion) {
-    // It is necessary to explicitly set the DebugCompilationDir
-    // to a common directory (e.g. root) if IgnoreCWD is true.
-    // When IgnoreCWD is true, the module's content should not
-    // depend on the current working directory. However, if dwarf
-    // information is needed (when CGOpts.DwarfVersion is
-    // non-zero), then CGOpts.DebugCompilationDir must be
-    // populated, because otherwise the current working directory
-    // will be automatically embedded in the dwarf information in
-    // the pcm, contradicting the assumption that it is safe to
-    // ignore the CWD. Thus in such cases,
-    // CGOpts.DebugCompilationDir is explicitly set to a common
-    // directory.
-    // FIXME: It is still excessive to create a copy of
-    // CodeGenOpts for each module. Since we do not modify the
-    // CodeGenOpts otherwise per module, the following code
-    // ends up generating identical CodeGenOpts for each module
-    // with DebugCompilationDir pointing to the root directory.
-    // We can optimize this away by creating a _single_ copy of
-    // CodeGenOpts whose DebugCompilationDir points to the root
-    // directory and reuse it across modules.
-    BuildInvocation.getMutCodeGenOpts().DebugCompilationDir =
-        llvm::sys::path::root_path(CWD);
-  }
+  // To avoid clang inferring working directory from CWD, set to ignore
+  // compilation directory.
+  BuildInvocation.getMutCodeGenOpts().NoCompilationDir = true;
 }
 
 static std::vector<std::string> splitString(std::string S, char Separator) {
@@ -222,6 +201,9 @@ void dependencies::resetBenignCodeGenOptions(frontend::ActionKind ProgramAction,
     CGOpts.ProfileInstrumentUsePath.clear();
     CGOpts.SampleProfileFile.clear();
     CGOpts.ProfileRemappingFile.clear();
+    // To avoid clang inferring compilation directory from CWD, set
+    // -fno-compilation-dir option.
+    CGOpts.NoCompilationDir = true;
   }
 }
 
diff --git a/clang/test/ClangScanDeps/modules-debug-dir.c b/clang/test/ClangScanDeps/modules-debug-dir.c
index c4fb4982ed791..066f076e94524 100644
--- a/clang/test/ClangScanDeps/modules-debug-dir.c
+++ b/clang/test/ClangScanDeps/modules-debug-dir.c
@@ -7,6 +7,12 @@
 // RUN:   experimental-full -optimize-args=all > %t/result.json
 // RUN: cat %t/result.json | sed 's:\\\\\?:/:g' | FileCheck %s
 
+// RUN: %deps-to-rsp %t/result.json --module-name=mod > %t/mod.rsp
+// RUN: %clang @%t/mod.rsp -o %t/mod.pcm
+// RUN: llvm-dwarfdump --debug-info %t/mod.pcm | FileCheck %s --check-prefix=DWARF
+// DWARF: DW_TAG_compile_unit
+// DWARF-NOT: DW_AT_comp_dir
+
 //--- cdb.json.in
 [{
   "directory": "DIR",
@@ -28,5 +34,6 @@ module mod {
 // directory when current working directory optimization is in effect.
 // CHECK:  "modules": [
 // CHECK: "command-line": [
-// CHECK: "-fdebug-compilation-dir={{\/|.*:(\\)?}}",
+// CHECK: "-fno-compilation-dir"
+// CHECK-NOT: -fdebug-compilation-dir
 // CHECK:  "translation-units": [
diff --git a/clang/test/CodeGen/debug-info-compilation-dir.c b/clang/test/CodeGen/debug-info-compilation-dir.c
index b49a0f5751f8e..195074764aa39 100644
--- a/clang/test/CodeGen/debug-info-compilation-dir.c
+++ b/clang/test/CodeGen/debug-info-compilation-dir.c
@@ -7,3 +7,10 @@
 // RUN: %clang_cc1 -emit-llvm -debug-info-kind=limited %s -o - | FileCheck -check-prefix=CHECK-DIR %s
 // CHECK-DIR: CodeGen
 
+/// Test path remapping.
+// RUN: %clang_cc1 -fdebug-compilation-dir %S -main-file-name %s -emit-llvm -debug-info-kind=limited %s -o - | FileCheck -check-prefix=CHECK-ABS %s -DPREFIX=%S
+// CHECK-ABS: DIFile(filename: "[[PREFIX]]/debug-info-compilation-dir.c", directory: "[[PREFIX]]")
+
+// RUN: %clang_cc1 -fno-compilation-dir -main-file-name %s -emit-llvm -debug-info-kind=limited %s -o - | FileCheck -check-prefix=CHECK-NOMAP %s -DPREFIX=%S
+// CHECK-NOMAP: DIFile(filename: "[[PREFIX]]/debug-info-compilation-dir.c", directory: "")
+

Contributor

@qiongsiwu qiongsiwu left a comment

LGTM! Thanks so much!

}
// To avoid clang inferring working directory from CWD, set to ignore
// compilation directory.
BuildInvocation.getMutCodeGenOpts().NoCompilationDir = true;
Contributor

Yes, I believe this change still makes sure that the pcms contain no working directory on the command line if the CWD optimization is on. The change looks safe to me.

@dwblaikie
Collaborator

I think at least with Bazel we use -fdebug-compilation-dir=/proc/cwd to create build-path-agnostic builds. Is that something you folks can use, rather than adding this new option? Or is there some other way we could make -fdebug-compilation-dir address this, such as differentiating between empty and non-present? Then -fdebug-compilation-dir= would do what you want (awkward, in that it redefines behavior that someone might be relying on), and maybe some flag value would mean "go back to the default", like -fdebug-compilation-dir=<default> or the like. There was also some attempt by Chrome folks to propose an explicit representation of relative paths in comp_dir, but I think it ended up as https://dwarfstd.org/issues/210628.1.html (the historic links seem to be dead, unfortunately).

@cachemeifyoucan
Collaborator Author

I am not planning to merge until the debug info people approve the direction. Like you said, I considered the -fdebug-compilation-dir= option, but using it to achieve an empty compilation directory actually alters its behavior. The underlying implementation can be tweaked to achieve the same effect (like using std::optional<std::string> for the compilation directory), but it seems we need a new flag anyway.

The reason I want to touch this code:

  • If you have a file system that doesn't have a current working directory, you can actually achieve an empty compilation directory, but that code path is not tested and has bugs when generating debug info (for example, it will strip the leading / from an absolute path).
  • I want to explore generating debug info that points to a CASID (or you can think of it as a URL) instead of a path on disk to achieve build-path-agnostic builds, and teaching debugging tools to load CAS/remote content. The current path resolution logic gets in the way of that, because URLs look like relative paths and the debugger will try to prefix /proc/cwd to the URL before resolving it.
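
The leading-slash bug in the first bullet appears to correspond to the `CompDir.empty()` guard that the diff adds in `getOrCreateModuleRef`. The sketch below reproduces the effect outside of clang, under some assumptions: `consumeFront` is a hand-written stand-in for `llvm::StringRef::consume_front`, `relativeTo` and its `GuardEmpty` switch are hypothetical, and the separator is hard-coded to `/`.

```cpp
#include <string>
#include <string_view>

// Minimal stand-in for llvm::StringRef::consume_front.
static bool consumeFront(std::string_view &S, std::string_view Prefix) {
  if (S.substr(0, Prefix.size()) != Prefix)
    return false;
  S.remove_prefix(Prefix.size());
  return true;
}

// Make Path relative to CompDir. GuardEmpty == false mirrors the logic
// before the patch; GuardEmpty == true adds the early return.
std::string relativeTo(std::string_view Path, std::string_view CompDir,
                       bool GuardEmpty) {
  if (GuardEmpty && CompDir.empty())
    return std::string(Path);
  std::string_view Relative = Path;
  // An empty CompDir always "matches", so the separator is stripped next.
  if (consumeFront(Relative, CompDir))
    consumeFront(Relative, "/");
  return std::string(Relative);
}
```

With an empty compilation directory and no guard, `relativeTo("/src/mod.h", "", false)` yields `src/mod.h`, which no longer names an absolute path; with the guard the path is returned unchanged.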

@cachemeifyoucan
Collaborator Author

Here is an idea that avoids adding a new option. I am going to remove the fallback in CodeGen but make sure the clang driver does the fallback computation, and clang cc1 will always respect that decision from the driver. In that case, it is a behavior change for the cc1 flag, but not for the driver flag.
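
A rough sketch of that split, assuming hypothetical function names and an `std::optional` to model "flag not given" on the driver side; it is not taken from either PR:

```cpp
#include <optional>
#include <string>

// Hypothetical driver step: the CWD fallback lives here and only here.
std::string driverCompilationDir(const std::optional<std::string> &UserDir,
                                 const std::string &CWD) {
  return UserDir.value_or(CWD);
}

// Hypothetical cc1 step: embed exactly what the command line carried,
// including an empty string, without consulting the file system.
std::string cc1CompilationDir(const std::string &DirFromCommandLine) {
  return DirFromCommandLine;
}
```

Under this arrangement a tool that generates cc1 command lines directly can pass an empty directory and get an empty one, while ordinary driver invocations keep the current default.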

@dwblaikie
Collaborator

Here is an idea that avoids adding a new option. I am going to remove the fallback in CodeGen but make sure the clang driver does the fallback computation, and clang cc1 will always respect that decision from the driver. In that case, it is a behavior change for the cc1 flag, but not for the driver flag.

& then for now, in your use case, you'd be using the cc1 flag via -Xclang, etc? I think that changing the driver to have the smarts there, but having cc1/frontend just respect whatever it's given sounds good to me, if a little subtle for now - but nothing drastic.

@cachemeifyoucan
Collaborator Author

& then for now, in your use case, you'd be using the cc1 flag via -Xclang, etc?

Our explicit module builds come from the dependency scanner, which can generate the cc1 arguments we want, including ones that achieve an empty compilation directory if needed.

@cachemeifyoucan
Collaborator Author

@dwblaikie The other possible implementation is here: #150112

It is so different, and I don't feel it is really better, so I used a separate PR. I guess the reason I feel it is not better is that the entire system has lots of holes that just don't work in some cases regardless of how it is implemented, e.g.:

  • If -fdebug-compilation-dir is a relative path, the DIFile will not point to an absolute path, which defeats the purpose.
  • The interaction with -fdebug-prefix-map is also weird and not correct in many cases. In the other PR, I added this test case to show the remapped path is simply wrong and cannot be recovered:
    // CHECK-REMAP-Y: !DIFile(filename: "y{{/|\\\\}}c.c", directory: "x")

@dwblaikie
Collaborator

@dwblaikie The other possible implementation is here: #150112

It is so different, and I don't feel it is really better, so I used a separate PR. I guess the reason I feel it is not better is that the entire system has lots of holes that just don't work in some cases regardless of how it is implemented,

I worry that adding more distinct options is likely to complicate things further, though...

  • If -fdebug-compilation-dir is a relative path, the DIFile will not point to an absolute path, which defeats the purpose.
  • The interaction with -fdebug-prefix-map is also weird and not correct in many cases. In the other PR, I added this test case to show the remapped path is simply wrong and cannot be recovered:
    // CHECK-REMAP-Y: !DIFile(filename: "y{{/|\\\\}}c.c", directory: "x")

Are these existing bugs, or bugs that occur with the alternative patch?

@cachemeifyoucan
Collaborator Author

Are these existing bugs, or bugs that occur with the alternative patch?

These are existing bugs, so I am not worried about going in either direction. I originally overdid the patch trying to address them but then realized that should probably be saved for later. I am actually fine with either direction now. @adrian-prantl Do you have any preference?

For the existing bugs, I think they can be fixed by:

  • Not encoding the directory in DIFile if the filename is an absolute path, and making sure directory + filename always constructs the full path.
  • Redefining the prefix-map ordering. I think this can usually be fixed by applying the prefix map in sorted order instead of command-line order.
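
Both suggestions can be sketched as small helpers. Everything here is an assumption for illustration: `fullPath` and `remap` are hypothetical names, paths are POSIX-style, and "sorted order" is interpreted as longest prefix first, which the comment does not specify.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// Rebuild the full path from the two DIFile fields. An absolute filename
// is used as-is, so the directory never has to be encoded for it.
std::string fullPath(const std::string &Directory,
                     const std::string &Filename) {
  if (Directory.empty() || (!Filename.empty() && Filename.front() == '/'))
    return Filename;
  return Directory + "/" + Filename;
}

// Apply the first matching rule after sorting rules longest prefix first
// (an assumed ordering), so the result does not depend on the order in
// which the rules were given.
std::string remap(const std::string &Path,
                  std::vector<std::pair<std::string, std::string>> Rules) {
  std::stable_sort(Rules.begin(), Rules.end(),
                   [](const auto &A, const auto &B) {
                     return A.first.size() > B.first.size();
                   });
  for (const auto &[From, To] : Rules)
    if (Path.compare(0, From.size(), From) == 0)
      return To + Path.substr(From.size());
  return Path;
}
```

For example, `remap("/a/b/c.c", {{"/a", "x"}, {"/a/b", "y"}})` and the same call with the two rules swapped both pick the `/a/b` rule.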
