perf: fix multiprocessing timing measurement #37
Merged
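Fixes the accuracy of timing measurements in multiprocessing mode and moves vector preprocessing outside the timed sections, so the reported numbers no longer include preprocessing time. As a result, the benchmark's performance measurements are more accurate.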
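As a rough illustration of the idea (not the code in this PR), the sketch below encodes all vectors before the clock starts and times only the request calls. The names `encode_vector`, `run_queries`, and `send` are hypothetical, and the float32 packing is an assumed stand-in for whatever preprocessing the framework actually performs.

```python
import struct
import time


def encode_vector(vec):
    """Hypothetical preprocessing step: pack floats as little-endian float32 bytes."""
    return struct.pack(f"<{len(vec)}f", *vec)


def run_queries(vectors, send):
    """Time only the send() calls; all encoding happens before timing begins."""
    # Preprocessing is done up front and is deliberately left untimed.
    payloads = [encode_vector(v) for v in vectors]

    latencies = []
    start = time.perf_counter()
    for payload in payloads:
        t0 = time.perf_counter()
        send(payload)  # the only work inside the per-request timer
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    return latencies, elapsed


if __name__ == "__main__":
    # A no-op send() stands in for a real client call.
    lat, total = run_queries([[1.0, 2.0], [3.0, 4.0]], lambda payload: None)
    print(f"{len(lat)} requests in {total:.6f}s")
```

With this layout, QPS computed as `len(latencies) / elapsed` reflects request time only; in a multiprocessing setup each worker would run the same pattern and report its own latencies.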
Summary
Key Changes
Performance Analysis
All comparisons are relative to the vanilla Python baseline (10,322 QPS), measured with 25 clients/processes:
Benchmark Commands
1. Base version: 9.3K QPS
2. redis-benchmark: 14.7K QPS
3. Vanilla Python: 10.3K QPS
4. This version: 9.9K QPS
Performance Comparison Summary
Key Performance Insights
Timing Accuracy Gains
The optimization brings the benchmark results significantly closer to the vanilla Python baseline.
redis-benchmark
redis-benchmark remains the performance ceiling at 14,656 QPS (+42.0% vs vanilla Python), representing the theoretical maximum for direct Redis usage with neither framework nor Python overhead.
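The percentages relative to the baseline can be recomputed with a small helper. This is only a sketch: `pct_vs_baseline` is a hypothetical name, and the two exact inputs (10,322 and 14,656 QPS) are the figures quoted in this description.

```python
BASELINE_QPS = 10_322  # vanilla Python baseline quoted above


def pct_vs_baseline(qps, baseline=BASELINE_QPS):
    """Return the percentage difference of qps relative to the baseline."""
    return (qps - baseline) / baseline * 100.0


if __name__ == "__main__":
    # redis-benchmark result quoted above
    print(f"redis-benchmark: {pct_vs_baseline(14_656):+.1f}%")  # +42.0%
```

The same helper applies to the other runs, but only rounded values (9.3K and 9.9K QPS) are listed for them here, so any percentage derived from those is approximate.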
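This PR significantly closes the gap between our benchmark framework and the vanilla Python baseline: the framework moves from about 9.3K QPS (roughly 10% below the 10.3K QPS baseline) to about 9.9K QPS (roughly 4% below).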