-
Notifications
You must be signed in to change notification settings - Fork 98
Implement Darwin ARM64 callback argument packing #364
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Implement comprehensive improvements to Darwin ARM64 stack argument handling with C-style packing and smarter validation. Key improvements: - Replace simple argument counting with byte-based stack validation that accounts for Darwin ARM64's efficient argument packing - Add C-style stack argument bundling that creates properly aligned packed structs for stack arguments, matching Darwin's calling convention - Separate Darwin ARM64 struct handling from other ARM64 platforms with dedicated files (struct_darwin_arm64.go, struct_notdarwin_arm64.go) - Implement shouldBundleStackArgs() to detect when to start stack packing - Add collectStackArgs() to separate register vs stack arguments - Add bundleStackArgs() for proper C-style struct packing with alignment For Darwin ARM64 specifically: - Non-HFA/HVA structs use byte-level copying in 8-byte chunks - HFA/HVA structs continue using element-by-element placement - Stack arguments are bundled into aligned structs before copying - Supports complex mixed argument scenarios (primitives + structs) Testing improvements: - Add comprehensive TestABI_ArgumentPassing with 15+ test cases - Test various combinations: 10+ int32s, mixed types, HFA structs - Add TestABI_TooManyArguments to verify improved error handling - Test cases cover register exhaustion and stack overflow scenarios This enables Darwin ARM64 to handle significantly more arguments (e.g., 20 int32s) while maintaining ABI compatibility and providing better error messages for truly excessive argument counts.
Adds callback detection to avoid applying C-style tight packing to callback functions, since callback unpacking still uses the traditional 8-byte slot convention and would break with tightly-packed arguments. Key changes: - Add isCallbackFunction() to detect purego callback addresses - Skip tight packing in RegisterFunc when cfn is a callback - Add helper functions getCallbackStart() and getMaxCB() for address range checks - Include TODO comments marking this as temporary until callback unpacking is updated to handle tight packing This ensures callbacks continue working correctly while the tight packing feature is being developed for Darwin ARM64.
This reverts commit f497229.
Add C-style argument packing support for callbacks on Darwin ARM64 to match the tight packing behavior implemented for regular function calls. Key changes: - Split callbackWrap into generic and Darwin ARM64-specific implementations - Add callbackWrapDarwinARM64() with C-style packed stack argument handling - Implement proper alignment and byte-level argument placement for stack args - Add callbackWrapGeneric() to preserve existing 8-byte slot behavior for other platforms - Support mixed register/stack scenarios with correct argument positioning Testing improvements: - Add comprehensive callback packing tests with int32, mixed types, and small types - Add TestCallbackFromC with C-to-Go callback validation using new test library - Test various argument combinations: 12 int32s, mixed int64/int32, bool/int8/uint8/int16/uint16/int32 - Add testdata/libcbtest/callback_packing_test.c for C-side callback testing This ensures callbacks work correctly with the C-style argument packing implemented for Darwin ARM64, maintaining ABI compatibility while supporting more complex argument scenarios.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR implements C-style stack argument packing for Darwin ARM64 to match the platform's specific ABI calling conventions. The key issue being solved is that Darwin ARM64 uses byte-level packing for stack arguments (with natural alignment), rather than the 8-byte slot model used by other platforms.
- Adds platform-specific callback handling for Darwin ARM64 with byte-level packing
- Introduces stack argument bundling that creates properly aligned packed structs for Darwin ARM64
- Updates argument validation to use byte-based limits for Darwin ARM64 instead of slot-based limits
Reviewed Changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| syscall_sysv.go | Splits callback handling into generic and Darwin ARM64-specific paths with byte-level packing for stack arguments |
| func.go | Adds stack bundling logic for Darwin ARM64 and byte-based validation; removes unused import |
| struct_darwin_arm64.go | Implements Darwin ARM64-specific struct placement, stack bundling, and register allocation logic |
| struct_notdarwin_arm64.go | Adds ARM64 non-Darwin stub that delegates to standard ARM64 register placement |
| struct_others.go | Provides stubs for non-ARM64/non-Darwin platforms that panic if called |
| struct_arm64.go | Renames placeRegisters to placeRegistersArm64 for reuse across platforms |
| func_test.go | Adds comprehensive tests for stack argument passing with various argument combinations |
| callback_test.go | Adds tests for callback packing with mixed types and from C |
| testdata files | Adds C test functions for validating callback behavior and ABI compliance |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
func.go
Outdated
| func estimateStackBytes(ty reflect.Type) int { | ||
| numInts, numFloats := 0, 0 | ||
| stackBytes := 0 | ||
|
|
||
| for i := 0; i < ty.NumIn(); i++ { | ||
| arg := ty.In(i) | ||
| size := int(arg.Size()) | ||
|
|
||
| // Check if this goes to register or stack | ||
| usesInt := arg.Kind() != reflect.Float32 && arg.Kind() != reflect.Float64 | ||
| if usesInt && numInts < numOfIntegerRegisters() { | ||
| numInts++ | ||
| } else if !usesInt && numFloats < numOfFloatRegisters { | ||
| numFloats++ | ||
| } else { | ||
| // Goes to stack - accumulate total bytes | ||
| stackBytes += size | ||
| } | ||
| } | ||
| // Round total to 8-byte boundary | ||
| if stackBytes > 0 && stackBytes%align8ByteSize != 0 { | ||
| stackBytes = (stackBytes + align8ByteMask) &^ align8ByteMask | ||
| } | ||
| return stackBytes | ||
| } |
Copilot
AI
Nov 2, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The estimateStackBytes function doesn't account for struct arguments, which can have complex register allocation rules (HFA/HVA, size <= 16 bytes). When arg.Kind() is reflect.Struct, it incorrectly treats it as an integer type and may underestimate stack usage. This could lead to the validation passing when it should fail. Consider adding special handling for struct types similar to the logic in structFitsInRegisters.
Move platform-specific code to appropriate files for better organization: - Move align8ByteMask, align8ByteSize constants and roundUpTo8() from func.go to struct_arm64.go - Move estimateStackBytes() from func.go to struct_darwin_arm64.go with proper build constraints - Add estimateStackBytes() stub to struct_others.go for non-Darwin ARM64 platforms - Clean up variable naming in RegisterFunc (numInts/numFloats for clarity) This improves code organization by keeping Darwin ARM64-specific functionality properly separated and constrained to the appropriate platform files.
What issue is this addressing?
Addresses #352
What type of issue is this addressing?
bug
What this PR does | solves
This PR adds proper C-style argument packing support for callbacks on Darwin ARM64, completing the tight packing implementation for bidirectional Go↔C interop.
Background
PR #353 introduced tight argument packing for Darwin ARM64 function calls, allowing efficient stack usage for functions with many arguments. However, callbacks (C calling into Go) continued using the traditional 8-byte-per-slot layout, creating an asymmetry in the calling convention.
This PR completes the implementation by adding matching tight packing support for callbacks.
Note that this contains the commits from #353 and should be rebased once that PR is in.