Skip to content

[C++ BoundsSafety] Fix false positives when pointer argument is a function call #11046

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: next
Choose a base branch
from

Conversation

ziqingluo-90
Copy link

This PR attempts to fix false positives in such an example:

void cb(int *__counted_by(count) p, size_t count);
int *__counted_by(n) f(size_t n);

void test(size_t len) {
  cb(f(len), len); // false positive
}

The analysis needs to compare the size of f(len), which is specified by __counted_by(n), with the expected count of the first argument of cb, which is specified by __counted_by(count). The comparison interprets the two comparands at two "call contexts" respectively: the count n needs to be interpreted at the call f(len) with a mapping {n -> len} and the count count needs to be interpreted at the call cb(f(len), len) with a mapping {p -> f(len), count -> len}.

The existing compare algorithm is extended from assuming only one comparand needs a substitution map to assuming both comparands need substitution maps.

rdar://155952016

@ziqingluo-90 ziqingluo-90 changed the title [Draft][C++ BoundsSafety] Fix false positives when pointer argument is a function call [C++ BoundsSafety] Fix false positives when pointer argument is a function call Jul 22, 2025
@ziqingluo-90
Copy link
Author

@patrykstefanski This is my attempt to extend the existing CompatibleCountExprVisitor to take two substitution maps, though I think we've reached the limit of this Visitor-based comparison approach. It traverses only one of the comparands so it does not know when to apply substitution for the other comparand. The result is that I have to trySubstitute at each of the overloaded methods.
I'd like to refactor this comparison code later.

}
}
return {E, false};
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that both trySimplifyDerefAddressof and trySubstitute attempt to simplify and substitute expressions. Maybe we could consider combining them into a single function?

I was thinking about something like this:

  // Simplify '*&e' and/or substitute decl. We want to perform both at the same
  // time, since this pattern might occur after a substitution:
  //   E: *x, DependentValues: {x->&y} which results in *&y
  std::pair<const Expr *, bool>
  trySimplifyAndSubstitute(const Expr *const E, bool AllowSubst,
                           const DependentValuesTy *DependentValues) const {
    const Expr *X = E->IgnoreParenImpCasts();

    bool HasDeref = false;
    if (const auto *DerefOp = dyn_cast<UnaryOperator>(X);
        DerefOp && DerefOp->getOpcode() == UO_Deref) {
      HasDeref = true;
      X = DerefOp->getSubExpr()->IgnoreParenImpCasts();
    }

    auto trySubstitute = [&](const Expr *E) -> std::pair<const Expr *, bool> {
      if (AllowSubst && DependentValues) {
        if (const auto *DRE = dyn_cast<DeclRefExpr>(E)) {
          if (auto It = DependentValues->find(DRE->getDecl());
              It != DependentValues->end())
            return {It->second->IgnoreParenImpCasts(), true};
        }
      }
      return {E, false};
    };

    bool Substituted = false;
    std::tie(X, Substituted) = trySubstitute(X);

    if (HasDeref) {
      const auto *AddrOfOp = dyn_cast<UnaryOperator>(X);
      if (!AddrOfOp || AddrOfOp->getOpcode() != UO_AddrOf)
        return {E, false};
      X = AddrOfOp->getSubExpr()->IgnoreParenImpCasts();
    }

    if (!Substituted)
      std::tie(X, Substituted) = trySubstitute(X);

    return {X, Substituted};
  }

In that case, VisitDeclRefExpr and VisitUnaryOperator could look like this:

  bool VisitDeclRefExpr(const DeclRefExpr *SelfDRE, const Expr *Other,
                        bool hasSelfBeenSubstituted,
                        bool hasOtherBeenSubstituted) {
    const ValueDecl *SelfVD = SelfDRE->getDecl();

    const auto [Self, Substituted] = trySimplifyAndSubstitute(
        SelfDRE, !hasSelfBeenSubstituted, DependentValuesSelf);
    if (Self != SelfDRE) {
      // We only simplify '*&e', this couldn't happen here, since we have a DRE.
      assert(Substituted);
      // We substituted the expr, now recurse.
      return Visit(Self, Other, true, hasOtherBeenSubstituted);
    }

    std::tie(Other, hasOtherBeenSubstituted) = trySimplifyAndSubstitute(
        Other, !hasOtherBeenSubstituted, DependentValuesOther);

    const auto *O = Other->IgnoreParenImpCasts();

    if (const auto *OtherDRE = dyn_cast<DeclRefExpr>(O)) {
      // Both SelfDRE and OtherDRE can be transformed from member expressions:
      if (OtherDRE->getDecl() == SelfVD)
        return true;
      return false;
    }

    const auto *OtherME = dyn_cast<MemberExpr>(O);
    if (MemberBase && OtherME) {
      return OtherME->getMemberDecl() == SelfVD &&
             Visit(OtherME->getBase(), MemberBase, hasSelfBeenSubstituted,
                   hasOtherBeenSubstituted);
    }

    return false;
  }

  bool VisitUnaryOperator(const UnaryOperator *SelfUO, const Expr *Other,
                          bool hasSelfBeenSubstituted,
                          bool hasOtherBeenSubstituted) {
    if (SelfUO->getOpcode() != UO_Deref)
      return false; // We don't support any other unary operator

    const auto &[SimplifiedSelf, Substituted] = trySimplifyAndSubstitute(
        SelfUO, !hasSelfBeenSubstituted, DependentValuesSelf);
    if (SelfUO != SimplifiedSelf) {
      return Visit(SimplifiedSelf, Other, Substituted, hasOtherBeenSubstituted);
    }

    std::tie(Other, hasOtherBeenSubstituted) = trySimplifyAndSubstitute(
        Other, !hasOtherBeenSubstituted, DependentValuesOther);

    if (const auto *OtherUO =
            dyn_cast<UnaryOperator>(Other->IgnoreParenImpCasts())) {
      if (SelfUO->getOpcode() == OtherUO->getOpcode())
        return Visit(SelfUO->getSubExpr(), OtherUO->getSubExpr(),
                     hasSelfBeenSubstituted, hasOtherBeenSubstituted);
    }

    return false;
  }

What do you think?

// Handle `PtrArgNoImp` has the form of `(char *) p` and `p` is a sized_by
// pointer. This will be of pattern 5 after stripping the cast:
if (const auto *CastE = dyn_cast<CastExpr>(PtrArgNoImp); CastE && isSizedBy)
PtrArgNoImp = CastE->getSubExpr();

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have tests for such casts?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants