Welcome to the Pi-on-Py Speedtest, a Python project dedicated to exploring and benchmarking the performance of calculating Pi across various CPU architectures. This project aims to provide insights into how different computational strategies and optimizations can impact the efficiency and speed of Pi calculations on diverse hardware setups.
The Pi-on-Py Speedtest leverages advanced mathematical algorithms and Python's multiprocessing capabilities to divide and conquer the task of calculating Pi. By optimizing for different CPU architectures, this project sheds light on the fascinating world of computational mathematics and its practical implications in hardware performance.
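The benchmark scripts themselves are not reproduced here, but the divide-and-conquer idea can be sketched with the standard library's `multiprocessing` pool. This is a hypothetical illustration using the slowly converging Leibniz series, not the project's actual algorithm; function names are invented:

```python
from multiprocessing import Pool
import os

def partial_leibniz(args):
    """Sum one contiguous segment of the Leibniz series for pi/4."""
    start, stop = args
    return sum((-1.0) ** k / (2 * k + 1) for k in range(start, stop))

def parallel_pi(n_terms, workers=None):
    """Split the term range into one segment per worker and combine the sums."""
    workers = workers or os.cpu_count()
    step = n_terms // workers
    segments = [(i * step, (i + 1) * step) for i in range(workers)]
    segments[-1] = (segments[-1][0], n_terms)  # last segment absorbs remainder
    with Pool(workers) as pool:
        return 4 * sum(pool.map(partial_leibniz, segments))

if __name__ == "__main__":
    print(parallel_pi(1_000_000))  # converges slowly toward 3.14159...
```

The same segment-per-worker pattern applies to any series whose terms can be summed independently.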
- Multi-Architecture Support: Tailored optimizations for a variety of CPU architectures to ensure maximum performance.
- High Precision Calculations: Utilizes Python's `mpmath` library for high-precision Pi calculations.
- Benchmarking Tools: Includes tools for benchmarking and comparing performance across different systems.
- Progress Reporting: Real-time progress reporting for long-running calculations, providing insights into the calculation process.
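As a quick illustration of the high-precision feature, `mpmath` exposes π at arbitrary working precision. This is a minimal usage sketch, independent of the benchmark scripts:

```python
from mpmath import mp

mp.dps = 60  # decimal places of working precision
pi_str = mp.nstr(mp.pi, 55)  # render pi to 55 significant digits
print(pi_str)
```

Raising `mp.dps` before any computation is what makes subsequent `mpmath` results high-precision.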
The SpeedTest-PiOnPy project is designed to run on multiple CPU architectures, including Arm, AMD, and Intel. Each of these architectures has unique characteristics that can impact the performance of computational tasks. Here's a brief overview:
- Arm: Known for its power efficiency, Arm processors are widely used in mobile devices and increasingly in servers and desktops. The project's optimizations for Arm leverage multiprocessing to distribute the Pi calculation workload across all available CPU cores, maximizing performance per watt and making it ideal for energy-conscious environments.
- AMD: AMD CPUs, particularly those with the EPYC architecture, offer a high number of cores and threads, making them well-suited for parallel processing tasks. The optimizations for AMD aim to leverage this multi-threading capability to speed up the Pi calculation process.
- Intel: Intel processors are renowned for their high single-core performance, which is crucial for tasks that cannot be easily parallelized. The project includes specific optimizations for Intel CPUs to take advantage of their architecture, such as using PyPy for faster Python execution. PyPy is a Python interpreter with a JIT (Just-In-Time) compilation feature that accelerates the execution of Python code, making it a perfect match for Intel's high-performance cores.
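Detecting which interpreter is currently running is straightforward with the standard library. A small sketch (the project's own detection logic may differ):

```python
import platform
import sys

def running_under_pypy():
    """True when the current interpreter is PyPy rather than CPython."""
    return platform.python_implementation() == "PyPy"

print(f"Interpreter: {platform.python_implementation()} {sys.version.split()[0]}")
print("JIT available:", running_under_pypy())
```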
By tailoring the optimizations to each CPU architecture, Pi-on-Py ensures that users can achieve the best possible performance regardless of their hardware setup. This approach allows for a more accurate comparison of hardware capabilities across different systems and architectures.
To get started, follow these simple steps:
1. Clone the Repository

   ```shell
   git clone https://github.com/matthansen0/speedtest-PIonPY
   cd speedtest-PIonPY
   ```

2. (Optional) Ensure Python Base Tools

   Most systems already have what you need. If `python3 -m venv` fails, install the venv module (Debian/Ubuntu example):

   ```shell
   sudo apt update && sudo apt install python3 python3-venv -y
   ```

   No manual `pip install` steps are required; the preparation script handles dependencies inside a local `venv/`.

3. Prepare Environment (One-Time or Re-Runnable)

   ```shell
   python3 prepare_benchmark.py
   ```

   This creates/updates `./venv`, installs dependencies (from `requirements.txt`), and optionally suggests PyPy if you're on Intel/AMD.

4. Run the Benchmark

   ```shell
   python3 run_benchmark.py
   ```

   Output shows the detected vendor, elapsed time (color-coded), and the last 50 digits (approximate), and writes a JSON results file (e.g. `results_*.json`).
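The JSON artifacts can later be collected for comparison. The exact schema is whatever the benchmark recorded, so this hypothetical helper (not part of the repository) just loads every matching file:

```python
import glob
import json

def load_results(pattern="results_*.json"):
    """Collect all benchmark result artifacts matching the glob pattern."""
    runs = {}
    for path in sorted(glob.glob(pattern)):
        with open(path) as fh:
            runs[path] = json.load(fh)
    return runs

if __name__ == "__main__":
    for path, run in load_results().items():
        # Print whichever keys this run recorded.
        print(path, "->", ", ".join(sorted(run)))
```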
Automatic optimizations now included:
- Warm-up pass (1% of iterations) for cache/JIT stabilization
- Core affinity pinning (best-effort) to reduce migration
- ARM big.LITTLE frequency weighting (allocates more work to faster cores)
- Intel/AMD auto re-exec under PyPy via a local managed venv (.pypy_venv) for JIT speedups (safe under PEP 668)
- JSON result artifact for later comparison
- Always-on per-segment (10%) progress reporting
Progress checkpoints (10% per worker segment) are displayed by default. Iteration count is fixed internally (10,000) for consistent cross-architecture comparison; a warm-up (unreported) precedes the main run.
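The best-effort affinity pinning mentioned above can be approximated on Linux with `os.sched_setaffinity`. A sketch (the project's actual pinning logic may differ; the call is unavailable on macOS and Windows, hence the guard):

```python
import os

def pin_to_core(core_id):
    """Best-effort: pin the calling process to a single CPU core (Linux only)."""
    if hasattr(os, "sched_setaffinity"):
        try:
            os.sched_setaffinity(0, {core_id})  # 0 = current process
            return True
        except OSError:
            pass  # core offline, or a container restricts affinity
    return False

# A worker with index i might pin itself to core i % os.cpu_count().
print("pinned:", pin_to_core(0))
```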
- x64 Intel CPU on Azure Ds2 v5, $78.11/mo
- x64 AMD CPU on Azure D2ads v5, $83.95/mo
- RISC ARM CPU on Azure D2ps v6, $56.94/mo
When running on Intel or AMD, the script attempts to speed up execution by:
- Detecting a `pypy3` executable.
- Creating a project-local virtual environment at `.pypy_venv/` (if not already present).
- Ensuring `mpmath` is installed inside that venv.
- Re-executing the benchmark under that PyPy environment.
This approach avoids installing packages into the system Python (respects PEP 668) and keeps everything self‑contained in the repository folder.
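The re-exec pattern can be sketched as follows. This is a hypothetical illustration, not the project's actual script: the function name is invented, and the `bin/` venv layout assumed here is POSIX-only.

```python
import os
import shutil
import subprocess
import sys

PYPY_VENV = ".pypy_venv"

def reexec_under_pypy():
    """Re-launch this script under a local PyPy venv, if pypy3 is available."""
    if sys.implementation.name == "pypy":
        return  # already running under PyPy
    pypy = shutil.which("pypy3")
    if pypy is None:
        return  # no PyPy on PATH; stay on CPython
    venv_python = os.path.join(PYPY_VENV, "bin", "python")
    if not os.path.exists(venv_python):
        # Create the venv with PyPy itself, so its interpreter is PyPy.
        subprocess.run([pypy, "-m", "venv", PYPY_VENV], check=True)
        subprocess.run([venv_python, "-m", "pip", "install", "mpmath"],
                       check=True)
    # Replace the current process; the system Python is never touched (PEP 668).
    os.execv(venv_python, [venv_python] + sys.argv)
```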
The computation done here is indicative of one type of compute workload; each chipset has its benefits, and there are cases where each will outperform the others. The purpose here is simply to show that ARM CPUs, under the right circumstances, can be more cost-effective while simultaneously being more efficient than x64 CPUs.
The current single-script benchmark uses an approximate segmented parallel method to stress CPUs uniformly. It is suitable for relative throughput comparisons (the goal of this project) but is not a mathematically strict parallelization of the Chudnovsky series. Future improvements may include:
- Exact per-term or binary-splitting implementation (mathematically rigorous)
- JSON output mode for automated comparisons
- Optional correctness validation against known π prefixes
- Thermal / frequency sampling during runs
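For reference, the mathematically rigorous binary-splitting variant mentioned above can be sketched with plain Python integers. This is an illustrative implementation of the Chudnovsky binary-splitting recurrence, not part of the current benchmark:

```python
from math import isqrt

C3_OVER_24 = 640320 ** 3 // 24

def _bs(a, b):
    """Binary splitting of the Chudnovsky series over terms [a, b)."""
    if b - a == 1:
        if a == 0:
            p = q = 1
        else:
            p = (6 * a - 5) * (2 * a - 1) * (6 * a - 1)
            q = a * a * a * C3_OVER_24
        t = p * (13591409 + 545140134 * a)
        if a & 1:
            t = -t  # the series alternates in sign
        return p, q, t
    m = (a + b) // 2
    p1, q1, t1 = _bs(a, m)
    p2, q2, t2 = _bs(m, b)
    return p1 * p2, q1 * q2, t1 * q2 + p1 * t2

def chudnovsky_pi(digits):
    """Return pi as a string with `digits` exact digits after the decimal point."""
    n_terms = digits // 14 + 2  # each term adds ~14.18 digits
    _, q, t = _bs(0, n_terms)
    scale = 10 ** digits
    sqrt_c = isqrt(10005 * scale * scale)  # integer sqrt(10005) * 10^digits
    pi_scaled = (q * 426880 * sqrt_c) // t
    s = str(pi_scaled)
    return s[0] + "." + s[1:]

print(chudnovsky_pi(50))
```

Because every intermediate value is an exact integer, the output is correct to the requested number of digits, unlike the approximate segmented method used for benchmarking.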
I welcome contributions from the community! Whether it's adding new optimizations, improving the documentation, or reporting bugs, your contributions are greatly appreciated. Please refer to the CONTRIBUTING.md file for more information on how to contribute.
Here are the next steps for the SpeedTest-PiOnPy project to enhance its functionality and user experience:
- Optimize Algorithm Efficiency: Further refine the mathematical algorithms to improve calculation speed without sacrificing accuracy.
- Enhance User Interface: Develop a more interactive and user-friendly interface for the benchmarking tools.
This project is licensed under the MIT License - see the LICENSE file for details.