Skip to content

Conversation

@tmc
Copy link
Contributor

@tmc tmc commented Oct 31, 2025

What issue is this addressing?

Addresses #352

What type of issue is this addressing?

bug

What this PR does | solves

This PR adds proper C-style argument packing support for callbacks on Darwin ARM64, completing the tight packing implementation for bidirectional Go↔C interop.

Background

PR #353 introduced tight argument packing for Darwin ARM64 function calls, allowing efficient stack usage for functions with many arguments. However, callbacks (C calling into Go) continued using the traditional 8-byte-per-slot layout, creating an asymmetry in the calling convention.

This PR completes the implementation by adding matching tight packing support for callbacks.

Note that this contains the commits from #353 and should be rebased once that PR is in.

tmc added 4 commits October 31, 2025 12:59
Implement comprehensive improvements to Darwin ARM64 stack argument handling
with C-style packing and smarter validation.

Key improvements:
- Replace simple argument counting with byte-based stack validation that
  accounts for Darwin ARM64's efficient argument packing
- Add C-style stack argument bundling that creates properly aligned packed
  structs for stack arguments, matching Darwin's calling convention
- Separate Darwin ARM64 struct handling from other ARM64 platforms with
  dedicated files (struct_darwin_arm64.go, struct_notdarwin_arm64.go)
- Implement shouldBundleStackArgs() to detect when to start stack packing
- Add collectStackArgs() to separate register vs stack arguments
- Add bundleStackArgs() for proper C-style struct packing with alignment

For Darwin ARM64 specifically:
- Non-HFA/HVA structs use byte-level copying in 8-byte chunks
- HFA/HVA structs continue using element-by-element placement
- Stack arguments are bundled into aligned structs before copying
- Supports complex mixed argument scenarios (primitives + structs)

Testing improvements:
- Add comprehensive TestABI_ArgumentPassing with 15+ test cases
- Test various combinations: 10+ int32s, mixed types, HFA structs
- Add TestABI_TooManyArguments to verify improved error handling
- Test cases cover register exhaustion and stack overflow scenarios

This enables Darwin ARM64 to handle significantly more arguments (e.g., 20
int32s) while maintaining ABI compatibility and providing better error
messages for truly excessive argument counts.
Adds callback detection to avoid applying C-style tight packing to callback
functions, since callback unpacking still uses the traditional 8-byte slot
convention and would break with tightly-packed arguments.

Key changes:
- Add isCallbackFunction() to detect purego callback addresses
- Skip tight packing in RegisterFunc when cfn is a callback
- Add helper functions getCallbackStart() and getMaxCB() for address range checks
- Include TODO comments marking this as temporary until callback unpacking
  is updated to handle tight packing

This ensures callbacks continue working correctly while the tight packing
feature is being developed for Darwin ARM64.
Add C-style argument packing support for callbacks on Darwin ARM64 to match
the tight packing behavior implemented for regular function calls.

Key changes:
- Split callbackWrap into generic and Darwin ARM64-specific implementations
- Add callbackWrapDarwinARM64() with C-style packed stack argument handling
- Implement proper alignment and byte-level argument placement for stack args
- Add callbackWrapGeneric() to preserve existing 8-byte slot behavior for other platforms
- Support mixed register/stack scenarios with correct argument positioning

Testing improvements:
- Add comprehensive callback packing tests with int32, mixed types, and small types
- Add TestCallbackFromC with C-to-Go callback validation using new test library
- Test various argument combinations: 12 int32s, mixed int64/int32, bool/int8/uint8/int16/uint16/int32
- Add testdata/libcbtest/callback_packing_test.c for C-side callback testing

This ensures callbacks work correctly with the C-style argument packing
implemented for Darwin ARM64, maintaining ABI compatibility while supporting
more complex argument scenarios.
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR implements C-style stack argument packing for Darwin ARM64 to match the platform's specific ABI calling conventions. The key issue being solved is that Darwin ARM64 uses byte-level packing for stack arguments (with natural alignment), rather than the 8-byte slot model used by other platforms.

  • Adds platform-specific callback handling for Darwin ARM64 with byte-level packing
  • Introduces stack argument bundling that creates properly aligned packed structs for Darwin ARM64
  • Updates argument validation to use byte-based limits for Darwin ARM64 instead of slot-based limits

Reviewed Changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
syscall_sysv.go Splits callback handling into generic and Darwin ARM64-specific paths with byte-level packing for stack arguments
func.go Adds stack bundling logic for Darwin ARM64 and byte-based validation; removes unused import
struct_darwin_arm64.go Implements Darwin ARM64-specific struct placement, stack bundling, and register allocation logic
struct_notdarwin_arm64.go Adds ARM64 non-Darwin stub that delegates to standard ARM64 register placement
struct_others.go Provides stubs for non-ARM64/non-Darwin platforms that panic if called
struct_arm64.go Renames placeRegisters to placeRegistersArm64 for reuse across platforms
func_test.go Adds comprehensive tests for stack argument passing with various argument combinations
callback_test.go Adds tests for callback packing with mixed types and from C
testdata files Adds C test functions for validating callback behavior and ABI compliance

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

func.go Outdated
Comment on lines 495 to 519
func estimateStackBytes(ty reflect.Type) int {
numInts, numFloats := 0, 0
stackBytes := 0

for i := 0; i < ty.NumIn(); i++ {
arg := ty.In(i)
size := int(arg.Size())

// Check if this goes to register or stack
usesInt := arg.Kind() != reflect.Float32 && arg.Kind() != reflect.Float64
if usesInt && numInts < numOfIntegerRegisters() {
numInts++
} else if !usesInt && numFloats < numOfFloatRegisters {
numFloats++
} else {
// Goes to stack - accumulate total bytes
stackBytes += size
}
}
// Round total to 8-byte boundary
if stackBytes > 0 && stackBytes%align8ByteSize != 0 {
stackBytes = (stackBytes + align8ByteMask) &^ align8ByteMask
}
return stackBytes
}
Copy link

Copilot AI Nov 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The estimateStackBytes function doesn't account for struct arguments, which can have complex register allocation rules (HFA/HVA, size <= 16 bytes). When arg.Kind() is reflect.Struct, it incorrectly treats it as an integer type and may underestimate stack usage. This could lead to the validation passing when it should fail. Consider adding special handling for struct types similar to the logic in structFitsInRegisters.

Copilot uses AI. Check for mistakes.
Move platform-specific code to appropriate files for better organization:

- Move align8ByteMask, align8ByteSize constants and roundUpTo8() from func.go to struct_arm64.go
- Move estimateStackBytes() from func.go to struct_darwin_arm64.go with proper build constraints
- Add estimateStackBytes() stub to struct_others.go for non-Darwin ARM64 platforms
- Clean up variable naming in RegisterFunc (numInts/numFloats for clarity)

This improves code organization by keeping Darwin ARM64-specific functionality
properly separated and constrained to the appropriate platform files.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants