vbvictor (Contributor) commented Aug 29, 2025

WIP, do not review.

@vbvictor marked this pull request as draft on August 29, 2025 21:07
llvmbot (Member) commented Aug 29, 2025

@llvm/pr-subscribers-github-workflow

Author: Baranov Victor (vbvictor)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/156106.diff

2 Files Affected:

  • (modified) .github/workflows/email-check.yaml (+26-16)
  • (added) llvm/utils/git/email-check-helper.py (+123)
diff --git a/.github/workflows/email-check.yaml b/.github/workflows/email-check.yaml
index 9390fba4d4e3b..572eb81c96f9a 100644
--- a/.github/workflows/email-check.yaml
+++ b/.github/workflows/email-check.yaml
@@ -17,27 +17,37 @@ jobs:
         uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
         with:
           ref: ${{ github.event.pull_request.head.sha }}
+          fetch-depth: 2
 
-      - name: Extract author email
-        id: author
-        run: |
-          git log -1
-          echo "EMAIL=$(git show -s --format='%ae' HEAD~0)" >> $GITHUB_OUTPUT
-          # Create empty comment file
-          echo "[]" > comments
+      - name: Fetch email check utils
+        uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
+        with:
+          repository: ${{ github.event.pull_request.head.repo.full_name }}
+          ref: ${{ github.event.pull_request.head.ref }}
+          sparse-checkout: |
+            llvm/utils/git/requirements_formatting.txt
+            llvm/utils/git/email-check-helper.py
+          sparse-checkout-cone-mode: false
+          path: email-check-tools
+
+      - name: Setup Python env
+        uses: actions/setup-python@42375524e23c412d93fb67b49958b491fce71c38 # v5.4.0
+        with:
+          python-version: '3.11'
+
+      - name: Install python dependencies
+        run: pip install -r email-check-tools/llvm/utils/git/requirements_formatting.txt
 
       - name: Validate author email
-        if: ${{ endsWith(steps.author.outputs.EMAIL, 'noreply.github.com')  }}
         env:
-          COMMENT: >-
-            ⚠️ We detected that you are using a GitHub private e-mail address to contribute to the repo.<br/>
-            Please turn off [Keep my email addresses private](https://github.com/settings/emails) setting in your account.<br/>
-            See [LLVM Developer Policy](https://llvm.org/docs/DeveloperPolicy.html#email-addresses) and
-            [LLVM Discourse](https://discourse.llvm.org/t/hidden-emails-on-github-should-we-do-something-about-it) for more information.
+          GITHUB_PR_NUMBER: ${{ github.event.pull_request.number }}
+          PR_AUTHOR: ${{ github.event.pull_request.user.login }}
         run: |
-          cat << EOF > comments
-          [{"body" : "$COMMENT"}]
-          EOF
+          echo "[]" > comments &&
+          python ./email-check-tools/llvm/utils/git/email-check-helper.py \
+            --token ${{ secrets.GITHUB_TOKEN }} \
+            --issue-number $GITHUB_PR_NUMBER \
+            --pr-author "$PR_AUTHOR"
 
       - uses: actions/upload-artifact@26f96dfa697d77e81fd5907df203aa23a56210a8 #v4.3.0
         if: always()
diff --git a/llvm/utils/git/email-check-helper.py b/llvm/utils/git/email-check-helper.py
new file mode 100644
index 0000000000000..f9f07520fd649
--- /dev/null
+++ b/llvm/utils/git/email-check-helper.py
@@ -0,0 +1,123 @@
+#!/usr/bin/env python3
+#
+# ====- email-check-helper, checks for private email usage in PRs --*- python -*--==#
+#
+# Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+# See https://llvm.org/LICENSE.txt for license information.
+# SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+#
+# ==--------------------------------------------------------------------------------------==#
+"""A helper script to detect the private email address of a GitHub user.
+This script is run by GitHub Actions to ensure that contributors to PRs are not
+using GitHub's private email addresses.
+
+The script enforces the LLVM Developer Policy regarding email addresses:
+https://llvm.org/docs/DeveloperPolicy.html#email-addresses
+"""
+
+import argparse
+import json
+import os
+import subprocess
+import sys
+from typing import Optional
+
+
+COMMENT_TAG = "<!--LLVM EMAIL CHECK COMMENT-->"
+
+
+def get_commit_email() -> Optional[str]:
+    proc = subprocess.run(
+        ["git", "show", "-s", "--format=%ae", "HEAD"],
+        stdout=subprocess.PIPE,
+        stderr=subprocess.PIPE,
+        encoding="utf-8",
+        check=False
+    )
+    if proc.returncode == 0:
+        return proc.stdout.strip()
+    return None
+
+
+def is_private_email(email: Optional[str]) -> bool:
+    if not email:
+        return False
+    return (email.endswith("noreply.github.com") or
+            email.endswith("users.noreply.github.com"))
+
+
+def check_user_email(token: str, pr_author: str) -> bool:
+    try:
+        from github import Github
+
+        print(f"Checking email privacy for user: {pr_author}")
+        
+        api = Github(token)
+        user = api.get_user(pr_author)
+        
+        print(f"User public email: {user.email or 'null (private)'}")
+
+        if user.email is not None and is_private_email(user.email):
+            return True
+
+        return is_private_email(get_commit_email())
+    except Exception:
+        return False
+
+
+def generate_comment() -> str:
+    return f"""{COMMENT_TAG}
+⚠️ We detected that you are using a GitHub private e-mail address to contribute to the repo.<br/>
+Please turn off [Keep my email addresses private](https://github.com/settings/emails) setting in your account.<br/>
+See [LLVM Developer Policy](https://llvm.org/docs/DeveloperPolicy.html#email-addresses) and
+[LLVM Discourse](https://discourse.llvm.org/t/hidden-emails-on-github-should-we-do-something-about-it) for more information.
+"""
+
+
+def main():
+    """Main function."""
+    parser = argparse.ArgumentParser(
+        description="Check for private email usage in GitHub PRs"
+    )
+    parser.add_argument(
+        "--token", type=str, required=True, help="GitHub authentication token"
+    )
+    parser.add_argument(
+        "--repo",
+        type=str,
+        default=os.getenv("GITHUB_REPOSITORY", "llvm/llvm-project"),
+        help="The GitHub repository in the form of <owner>/<repo>",
+    )
+    parser.add_argument(
+        "--issue-number",
+        type=int,
+        required=True,
+        help="The PR number to check"
+    )
+    parser.add_argument(
+        "--pr-author",
+        type=str,
+        required=True,
+        help="The GitHub username of the PR author"
+    )
+
+    args = parser.parse_args()
+
+    has_private_email = check_user_email(args.token, args.pr_author)
+
+    comments = []
+    if has_private_email:
+        comments.append({"body": generate_comment()})
+
+    with open("comments", "w", encoding="utf-8") as f:
+        json.dump(comments, f)
+
+    print(f"Wrote {'comment' if has_private_email else 'empty comments'} to file")
+
+    if has_private_email:
+        print("Private email detected")
+        sys.exit(1)
+
+
+if __name__ == "__main__":
+    main()
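
The suffix test in `is_private_email` above can be read as a single domain comparison, since any address ending in `users.noreply.github.com` also ends in `noreply.github.com`. The following is a minimal sketch, not the PR's code: the helper name `is_github_noreply` is hypothetical, and comparing only the domain part after the last `@` is an assumption about the intended behavior, made so that an unrelated domain such as `notnoreply.github.com` is not matched.

```python
from typing import Optional


def is_github_noreply(email: Optional[str]) -> bool:
    """Return True if `email` looks like a GitHub noreply address.

    Hypothetical helper for illustration only; it is not part of the PR.
    """
    # Missing or malformed addresses are treated as "not private".
    if not email or "@" not in email:
        return False
    # Compare only the domain part, case-insensitively.
    domain = email.rsplit("@", 1)[1].lower()
    # Match the bare domain or any subdomain of it, e.g. users.noreply.github.com.
    return domain == "noreply.github.com" or domain.endswith(".noreply.github.com")
```

Compared with a plain `endswith("noreply.github.com")` on the whole string, this variant requires a dot or the `@` boundary before the suffix, which may or may not be what the workflow wants.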

github-actions bot commented Aug 29, 2025

⚠️ Python code formatter, darker found issues in your code. ⚠️

You can test this locally with the following command:
darker --check --diff -r origin/main...HEAD llvm/utils/git/email-check-helper.py

⚠️
The reproduction instructions above might return results for more than one PR
in a stack if you are using a stacked PR workflow. You can limit the results by
changing origin/main to the base branch/commit you want to compare against.
⚠️

Diff from darker:
--- email-check-helper.py	2025-08-30 06:35:14.000000 +0000
+++ email-check-helper.py	2025-08-30 06:37:19.434721 +0000
@@ -30,35 +30,36 @@
     proc = subprocess.run(
         ["git", "show", "-s", "--format=%ae", "HEAD"],
         stdout=subprocess.PIPE,
         stderr=subprocess.PIPE,
         encoding="utf-8",
-        check=False
+        check=False,
     )
     if proc.returncode == 0:
         return proc.stdout.strip()
     return None
 
 
 def is_private_email(email: Optional[str]) -> bool:
     if not email:
         return False
-    return (email.endswith("noreply.github.com") or
-            email.endswith("users.noreply.github.com"))
+    return email.endswith("noreply.github.com") or email.endswith(
+        "users.noreply.github.com"
+    )
 
 
 def check_user_email(token: str, pr_author: str) -> bool:
     try:
         from github import Github
 
         print(f"Checking email privacy for user: {pr_author}")
-        
+
         api = Github(token)
         user = api.get_user(pr_author)
         emails = user.get_emails()
         print(emails)
-        
+
         print(f"User public email: {user.email or 'null (private)'}")
 
         if user.email is not None or is_private_email(user.email):
             return True
 
@@ -93,11 +94,11 @@
     )
     parser.add_argument(
         "--pr-author",
         type=str,
         required=True,
-        help="The GitHub username of the PR author"
+        help="The GitHub username of the PR author",
     )
 
     args = parser.parse_args()
 
     has_private_email = check_user_email(args.token, args.pr_author)

@vbvictor changed the title from "Add check for private emails" to "WIP Add check for private emails" on Aug 29, 2025