Welcome to the Pi-on-Py Speedtest, a Python project dedicated to exploring and benchmarking the performance of calculating Pi across various CPU architectures. This project aims to provide insights into how different computational strategies and optimizations can impact the efficiency and speed of Pi calculations on diverse hardware setups.
The Pi-on-Py Speedtest leverages advanced mathematical algorithms and Python's multiprocessing capabilities to divide and conquer the task of calculating Pi. By optimizing for different CPU architectures, this project sheds light on the fascinating world of computational mathematics and its practical implications in hardware performance.
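The benchmark scripts themselves are not reproduced here, but the divide-and-conquer idea can be sketched with the standard library's `multiprocessing` pool. This is a hypothetical illustration using the slowly converging Leibniz series, not the project's actual algorithm; function names are invented:

```python
from multiprocessing import Pool
import os

def partial_leibniz(args):
    """Sum one contiguous segment of the Leibniz series for pi/4."""
    start, stop = args
    return sum((-1.0) ** k / (2 * k + 1) for k in range(start, stop))

def parallel_pi(n_terms, workers=None):
    """Split the term range into one segment per worker and combine the sums."""
    workers = workers or os.cpu_count()
    step = n_terms // workers
    segments = [(i * step, (i + 1) * step) for i in range(workers)]
    segments[-1] = (segments[-1][0], n_terms)  # last segment absorbs remainder
    with Pool(workers) as pool:
        return 4 * sum(pool.map(partial_leibniz, segments))

if __name__ == "__main__":
    print(parallel_pi(1_000_000))  # converges slowly toward 3.14159...
```

The same segment-per-worker pattern applies to any series whose terms can be summed independently.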
- Multi-Architecture Support: Tailored optimizations for a variety of CPU architectures to ensure maximum performance.
- High Precision Calculations: Utilizes Python's `mpmath` library for high-precision Pi calculations.
- Benchmarking Tools: Includes tools for benchmarking and comparing performance across different systems.
- Progress Reporting: Real-time progress reporting for long-running calculations, providing insights into the calculation process.
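As a quick illustration of the high-precision feature, `mpmath` exposes π at arbitrary working precision. This is a minimal usage sketch, independent of the benchmark scripts:

```python
from mpmath import mp

mp.dps = 60  # decimal places of working precision
pi_str = mp.nstr(mp.pi, 55)  # render pi to 55 significant digits
print(pi_str)
```

Raising `mp.dps` before any computation is what makes subsequent `mpmath` results high-precision.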
The SpeedTest-PiOnPy project is designed to run on multiple CPU architectures, including Arm, AMD, and Intel. Each of these architectures has unique characteristics that can impact the performance of computational tasks. Here's a brief overview:
- Arm: Known for its power efficiency, Arm processors are widely used in mobile devices and increasingly in servers and desktops. The project's optimizations for Arm leverage multiprocessing to distribute the Pi calculation workload across all available CPU cores, maximizing performance per watt and making it ideal for energy-conscious environments.
- AMD: AMD CPUs, particularly those with the EPYC architecture, offer a high number of cores and threads, making them well-suited for parallel processing tasks. The optimizations for AMD aim to leverage this multi-threading capability to speed up the Pi calculation process.
- Intel: Intel processors are renowned for their high single-core performance, which is crucial for tasks that cannot be easily parallelized. The project includes specific optimizations for Intel CPUs to take advantage of their architecture, such as using PyPy for faster Python execution. PyPy is a Python interpreter with a JIT (Just-In-Time) compilation feature that accelerates the execution of Python code, making it a perfect match for Intel's high-performance cores.
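Detecting which interpreter is currently running is straightforward with the standard library. A small sketch (the project's own detection logic may differ):

```python
import platform
import sys

def running_under_pypy():
    """True when the current interpreter is PyPy rather than CPython."""
    return platform.python_implementation() == "PyPy"

print(f"Interpreter: {platform.python_implementation()} {sys.version.split()[0]}")
print("JIT available:", running_under_pypy())
```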
By tailoring the optimizations to each CPU architecture, Pi-on-Py ensures that users can achieve the best possible performance regardless of their hardware setup. This approach allows for a more accurate comparison of hardware capabilities across different systems and architectures.
To get started, follow these simple steps:
1. Clone the Repository

   ```shell
   git clone https://github.com/matthansen0/speedtest-PIonPY
   cd speedtest-PIonPY
   ```

2. (Optional) Ensure Python Base Tools

   Most systems already have what you need. If `python3 -m venv` fails, install the venv module (Debian/Ubuntu example):

   ```shell
   sudo apt update && sudo apt install python3 python3-venv -y
   ```

   No manual `pip install` steps are required; the preparation script handles dependencies inside a local `venv/`.

3. Prepare Environment (One-Time or Re-Runnable)

   ```shell
   python3 prepare_benchmark.py
   ```

   This creates/updates `./venv`, installs dependencies (from `requirements.txt`), and optionally suggests PyPy if you're on Intel/AMD.

4. Run the Benchmark

   ```shell
   python3 run_benchmark.py
   ```

   Output shows the detected vendor, elapsed time (color-coded), and the last 50 digits (approximate), and writes a JSON results file (e.g. `results_*.json`).
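The JSON artifacts can later be collected for comparison. The exact schema is whatever the benchmark recorded, so this hypothetical helper (not part of the repository) just loads every matching file:

```python
import glob
import json

def load_results(pattern="results_*.json"):
    """Collect all benchmark result artifacts matching the glob pattern."""
    runs = {}
    for path in sorted(glob.glob(pattern)):
        with open(path) as fh:
            runs[path] = json.load(fh)
    return runs

if __name__ == "__main__":
    for path, run in load_results().items():
        # Print whichever keys this run recorded.
        print(path, "->", ", ".join(sorted(run)))
```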
Automatic optimizations now included:
- Warm-up pass (1% of iterations) for cache/JIT stabilization
- Core affinity pinning (best-effort) to reduce migration
- ARM big.LITTLE frequency weighting (allocates more work to faster cores)
- Intel/AMD auto re-exec under PyPy via a local managed venv (.pypy_venv) for JIT speedups (safe under PEP 668)
- JSON result artifact for later comparison
- Always-on per-segment (10%) progress reporting
Progress checkpoints (10% per worker segment) are displayed by default. Iteration count is fixed internally (10,000) for consistent cross-architecture comparison; a warm-up (unreported) precedes the main run.
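The best-effort affinity pinning mentioned above can be approximated on Linux with `os.sched_setaffinity`. A sketch (the project's actual pinning logic may differ; the call is unavailable on macOS and Windows, hence the guard):

```python
import os

def pin_to_core(core_id):
    """Best-effort: pin the calling process to a single CPU core (Linux only)."""
    if hasattr(os, "sched_setaffinity"):
        try:
            os.sched_setaffinity(0, {core_id})  # 0 = current process
            return True
        except OSError:
            pass  # core offline, or a container restricts affinity
    return False

# A worker with index i might pin itself to core i % os.cpu_count().
print("pinned:", pin_to_core(0))
```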
- x64 Intel CPU on Azure Ds2 v5, $78.11/mo
- x64 AMD CPU on Azure D2ads v5, $83.95/mo
- RISC ARM CPU on Azure D2ps v6, $56.94/mo
When running on Intel or AMD, the script attempts to speed up execution by:
- Detecting a `pypy3` executable.
- Creating a project-local virtual environment at `.pypy_venv/` (if not already present).
- Ensuring `mpmath` is installed inside that venv.
- Re-executing the benchmark under that PyPy environment.
This approach avoids installing packages into the system Python (respects PEP 668) and keeps everything self‑contained in the repository folder.
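The re-exec pattern can be sketched as follows. This is a hypothetical illustration, not the project's actual script: the function name is invented, and the `bin/` venv layout assumed here is POSIX-only.

```python
import os
import shutil
import subprocess
import sys

PYPY_VENV = ".pypy_venv"

def reexec_under_pypy():
    """Re-launch this script under a local PyPy venv, if pypy3 is available."""
    if sys.implementation.name == "pypy":
        return  # already running under PyPy
    pypy = shutil.which("pypy3")
    if pypy is None:
        return  # no PyPy on PATH; stay on CPython
    venv_python = os.path.join(PYPY_VENV, "bin", "python")
    if not os.path.exists(venv_python):
        # Create the venv with PyPy itself, so its interpreter is PyPy.
        subprocess.run([pypy, "-m", "venv", PYPY_VENV], check=True)
        subprocess.run([venv_python, "-m", "pip", "install", "mpmath"],
                       check=True)
    # Replace the current process; the system Python is never touched (PEP 668).
    os.execv(venv_python, [venv_python] + sys.argv)
```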
The computation done here is indicative of one type of compute workload; each chipset has its benefits, and there are cases where each will outperform the others. The purpose here is simply to show that ARM CPUs, under the right circumstances, can be more cost-effective while simultaneously being more efficient than x64 CPUs.
The current single-script benchmark uses an approximate segmented parallel method to stress CPUs uniformly. It is suitable for relative throughput comparisons (the goal of this project) but is not a mathematically strict parallelization of the Chudnovsky series. Future improvements may include:
- Exact per-term or binary-splitting implementation (mathematically rigorous)
- JSON output mode for automated comparisons
- Optional correctness validation against known π prefixes
- Thermal / frequency sampling during runs
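For reference, the mathematically rigorous binary-splitting variant mentioned above can be sketched with plain Python integers. This is an illustrative implementation of the Chudnovsky binary-splitting recurrence, not part of the current benchmark:

```python
from math import isqrt

C3_OVER_24 = 640320 ** 3 // 24

def _bs(a, b):
    """Binary splitting of the Chudnovsky series over terms [a, b)."""
    if b - a == 1:
        if a == 0:
            p = q = 1
        else:
            p = (6 * a - 5) * (2 * a - 1) * (6 * a - 1)
            q = a * a * a * C3_OVER_24
        t = p * (13591409 + 545140134 * a)
        if a & 1:
            t = -t  # the series alternates in sign
        return p, q, t
    m = (a + b) // 2
    p1, q1, t1 = _bs(a, m)
    p2, q2, t2 = _bs(m, b)
    return p1 * p2, q1 * q2, t1 * q2 + p1 * t2

def chudnovsky_pi(digits):
    """Return pi as a string with `digits` exact digits after the decimal point."""
    n_terms = digits // 14 + 2  # each term adds ~14.18 digits
    _, q, t = _bs(0, n_terms)
    scale = 10 ** digits
    sqrt_c = isqrt(10005 * scale * scale)  # integer sqrt(10005) * 10^digits
    pi_scaled = (q * 426880 * sqrt_c) // t
    s = str(pi_scaled)
    return s[0] + "." + s[1:]

print(chudnovsky_pi(50))
```

Because every intermediate value is an exact integer, the output is correct to the requested number of digits, unlike the approximate segmented method used for benchmarking.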
I welcome contributions from the community! Whether it's adding new optimizations, improving the documentation, or reporting bugs, your contributions are greatly appreciated. Please refer to the CONTRIBUTING.md file for more information on how to contribute.
Here are the next steps for the SpeedTest-PiOnPy project to enhance its functionality and user experience:
- Optimize Algorithm Efficiency: Further refine the mathematical algorithms to improve calculation speed without sacrificing accuracy.
- Enhance User Interface: Develop a more interactive and user-friendly interface for the benchmarking tools.
This project is licensed under the MIT License - see the LICENSE file for details.