Commit eedcadc

Enhance GitHub Stats Analyzer with new features and improvements
- Updated README files (English and Chinese) to include support for multiple output formats (text, JSON, CSV) and clarified command line options.
- Incremented version number in setup.py to 1.0.0 to reflect major updates.
- Introduced a new test script to verify the inclusion of private repositories when the token belongs to the user.
- Improved the GitHubStatsAnalyzer to check token ownership and adjust repository access accordingly.
- Enhanced logging capabilities with configurable log levels and improved progress bar handling.
- Refactored code for better performance and clarity, including concurrent processing of repositories and commits.
- Updated command line argument parsing to include new options for output format and log level.
1 parent 3702c39 commit eedcadc
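
Reviewer note: the commit message mentions new command line options for output format and log level. The sketch below shows one way such options could be declared with `argparse`. It is an illustration only: `build_parser` is a hypothetical name, the option names and defaults are copied from the README changes in this commit, and the project's actual parser may be structured differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring options documented in the README diff."""
    parser = argparse.ArgumentParser(prog="github-stats")
    parser.add_argument("username", help="GitHub username to analyze")
    # Output format: README documents text, json, csv with text as default.
    parser.add_argument("--output", choices=["text", "json", "csv"], default="text")
    # Log level: README documents the five standard levels with INFO as default.
    parser.add_argument(
        "--log-level",
        choices=["DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"],
        default="INFO",
    )
    # Space-separated list of languages to leave out of the statistics.
    parser.add_argument("--exclude-languages", nargs="*", default=[])
    # README documents a default of 3 concurrent repositories.
    parser.add_argument("--max-concurrent-repos", type=int, default=3)
    return parser


args = build_parser().parse_args(
    ["octocat", "--output", "json", "--exclude-languages", "HTML", "CSS"]
)
print(args.output, args.log_level, args.exclude_languages)
# prints: json INFO ['HTML', 'CSS']
```

Using `choices` lets `argparse` reject unsupported formats or levels before any API request is made.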

10 files changed: +554 -226 lines changed

README.md

Lines changed: 26 additions & 4 deletions
@@ -20,6 +20,7 @@ This Python program analyzes a GitHub user's repositories to collect comprehensi
 - 📈 Total additions and deletions across all repositories (including forks, but only counting user's own contributions)
 - 🔤 Lines of code per programming language
 - 📚 Detailed repository information
+- 📊 Multiple output formats (text, JSON, CSV)
 
 <div align="center">
 <img src="./assets/sample_1.webp" width="49%" alt="Example Output 1" />
@@ -38,10 +39,12 @@ View the latest analysis results in the [stats branch](https://github.com/Sakura
 - **Accurate Line Counting**: Precisely measures actual code lines by analyzing commit data directly from GitHub's API
 - **Parallel Processing**: Efficiently processes multiple repositories concurrently
 - **Rich Output**: Beautiful console output with tables and colors
+- **Multiple Output Formats**: Support for text, JSON, and CSV output formats
 - **Detailed Logging**: Comprehensive logging for debugging
 - **Access Levels**: Supports both basic (no token) and full (with token) access modes
 - **Flexible Token Configuration**: Support for multiple ways to provide GitHub token
 - **Extensive Testing**: View our [test results and testing pipeline](https://github.com/SakuraPuare/github-stats-analyzer/blob/test-results/test_results/test_report.md) for quality assurance
+- **Configurable Analysis**: Control the depth and scope of analysis with various command-line options
 
 ## 🔧 Requirements
 
@@ -127,7 +130,7 @@ python main.py <github_username>
 The program supports the following command line options:
 
 ```bash
-github-stats <github_username> [--debug] [--include-all] [--access-level {basic|full}] [--token TOKEN] [--max-repos MAX_REPOS] [--max-commits MAX_COMMITS] [--max-concurrent-repos MAX_CONCURRENT_REPOS] [--max-retries MAX_RETRIES] [--retry-delay RETRY_DELAY]
+github-stats <github_username> [--debug] [--include-all] [--access-level {basic|full}] [--token TOKEN] [--max-repos MAX_REPOS] [--max-commits MAX_COMMITS] [--max-concurrent-repos MAX_CONCURRENT_REPOS] [--max-retries MAX_RETRIES] [--retry-delay RETRY_DELAY] [--output {text|json|csv}] [--log-level {DEBUG|INFO|WARNING|ERROR|CRITICAL}]
 ```
 
 - `--debug`: Enable debug output for more detailed logging
@@ -138,9 +141,12 @@ github-stats <github_username> [--debug] [--include-all] [--access-level {basic|
 - `--token`: GitHub Personal Access Token (can also be set via GITHUB_TOKEN environment variable)
 - `--max-repos`: Maximum number of repositories to analyze
 - `--max-commits`: Maximum number of commits to analyze per repository
-- `--max-concurrent-repos`: Maximum number of repositories to process concurrently (default: 10)
+- `--max-concurrent-repos`: Maximum number of repositories to process concurrently (default: 3)
 - `--max-retries`: Maximum number of retries for HTTP requests (default: 3)
 - `--retry-delay`: Initial delay between retries in seconds (default: 1.0)
+- `--output`: Output format (text, json, csv) (default: text)
+- `--log-level`: Set the logging level (DEBUG, INFO, WARNING, ERROR, CRITICAL) (default: INFO)
+- `--exclude-languages`: Languages to exclude from statistics (space-separated list)
 
 ### Access Levels
 
@@ -158,8 +164,8 @@ The program supports two access levels:
 
 #### Full Access (Token Required)
 - Access to all repositories (public and private)
-- No limit on number of repositories
-- No limit on number of commits
+- No limit on number of repositories (default: 1000)
+- No limit on number of commits (default: 1000)
 - Complete statistics
 - Private repository access
 - Fork analysis
@@ -205,6 +211,22 @@ The program will display:
 - Language statistics sorted by lines of code
 - List of repositories with star count and creation date (in full access mode)
 
+### Output Formats
+
+The program supports three output formats:
+
+#### Text (Default)
+- Rich console output with tables and colors
+- Detailed statistics and repository information
+
+#### JSON
+- Structured JSON output for programmatic use
+- Contains all statistics and repository information
+
+#### CSV
+- Comma-separated values for easy import into spreadsheets
+- Contains all statistics and repository information
+
 ## 📝 Notes
 
 - The program analyzes all repositories including forks, but only counts the user's own contributions
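
Reviewer note: the JSON and CSV formats documented in this README change could be produced along the following lines. This is a minimal sketch using only the Python standard library. The function name `render`, the shape of the `stats` dictionary, and the CSV column names are assumptions made for illustration and are not taken from the project's code.

```python
import csv
import io
import json


def render(stats: dict, fmt: str = "text") -> str:
    """Render per-language line counts in one of the documented formats."""
    # Sort languages by line count, largest first, as the README describes.
    rows = sorted(stats["languages"].items(), key=lambda kv: kv[1], reverse=True)
    if fmt == "json":
        return json.dumps(stats, indent=2, ensure_ascii=False)
    if fmt == "csv":
        buffer = io.StringIO()
        writer = csv.writer(buffer, lineterminator="\n")
        writer.writerow(["language", "lines"])  # assumed column names
        writer.writerows(rows)
        return buffer.getvalue()
    if fmt == "text":
        return "\n".join(f"{language}: {lines}" for language, lines in rows)
    raise ValueError(f"unsupported output format: {fmt}")


sample = {"username": "octocat", "languages": {"Shell": 80, "Python": 1200}}
print(render(sample, "csv"))
```

Keeping all formats behind one function means the analysis code only has to build the statistics once, and the `--output` value simply selects the serializer.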

README_CN.md

Lines changed: 30 additions & 7 deletions
@@ -19,10 +19,12 @@
 - 📈 Total additions and deletions across all repositories (including forks, but only counting the user's own contributions)
 - 🔤 Lines of code per programming language
 - 📚 Detailed repository information
+- 📊 Multiple output formats (text, JSON, CSV)
 
-![Example output](./assets/sample_1.webp)
-
-![Example output](./assets/sample_2.webp)
+<div align="center">
+<img src="./assets/sample_1.webp" width="49%" alt="Example Output 1" />
+<img src="./assets/sample_2.webp" width="49%" alt="Example Output 2" />
+</div>
 
 ## 📊 Latest Analysis Results
 
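@@ -36,10 +38,12 @@
 - **Accurate Line Counting**: Precisely measures actual lines of code by analyzing commit data directly from the GitHub API
 - **Parallel Processing**: Efficiently processes multiple repositories concurrently
 - **Rich Output**: Attractive console output with tables and colors
+- **Multiple Output Formats**: Supports text, JSON, and CSV output formats
 - **Detailed Logging**: Comprehensive logging for debugging
 - **Access Levels**: Supports both basic (no token) and full (with token) access modes
 - **Flexible Token Configuration**: Supports multiple ways to provide a GitHub token
 - **Extensive Testing**: See our [test results and testing pipeline](https://github.com/SakuraPuare/github-stats-analyzer/blob/test-results/test_results/test_report.md) for quality assurance
+- **Configurable Analysis**: Control the depth and scope of analysis with various command-line options
 
 ## 🔧 Requirements
 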
@@ -125,7 +129,7 @@ python main.py <github_username>
 The program supports the following command line options:
 
 ```bash
-github-stats <github_username> [--debug] [--include-all] [--access-level {basic|full}] [--token TOKEN] [--max-repos MAX_REPOS] [--max-commits MAX_COMMITS] [--max-concurrent-repos MAX_CONCURRENT_REPOS] [--max-retries MAX_RETRIES] [--retry-delay RETRY_DELAY]
+github-stats <github_username> [--debug] [--include-all] [--access-level {basic|full}] [--token TOKEN] [--max-repos MAX_REPOS] [--max-commits MAX_COMMITS] [--max-concurrent-repos MAX_CONCURRENT_REPOS] [--max-retries MAX_RETRIES] [--retry-delay RETRY_DELAY] [--output {text|json|csv}] [--log-level {DEBUG|INFO|WARNING|ERROR|CRITICAL}]
 ```
 
 - `--debug`: Enable debug output for more detailed logging
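@@ -136,9 +140,12 @@ github-stats <github_username> [--debug] [--include-all] [--access-level {basic|
 - `--token`: GitHub Personal Access Token (can also be set via the GITHUB_TOKEN environment variable)
 - `--max-repos`: Maximum number of repositories to analyze
 - `--max-commits`: Maximum number of commits to analyze per repository
-- `--max-concurrent-repos`: Maximum number of repositories to process concurrently (default: 10)
+- `--max-concurrent-repos`: Maximum number of repositories to process concurrently (default: 3)
 - `--max-retries`: Maximum number of retries for HTTP requests (default: 3)
 - `--retry-delay`: Initial delay between retries in seconds (default: 1.0)
+- `--output`: Output format (text, json, csv) (default: text)
+- `--log-level`: Set the logging level (DEBUG, INFO, WARNING, ERROR, CRITICAL) (default: INFO)
+- `--exclude-languages`: Languages to exclude from statistics (space-separated list)
 
 ### Access Levels
 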
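@@ -156,8 +163,8 @@ github-stats <github_username> [--debug] [--include-all] [--access-level {basic|
 
 #### Full Access (Token Required)
 - Access to all repositories (public and private)
-- No limit on number of repositories
-- No limit on number of commits
+- No limit on number of repositories (default: 1000)
+- No limit on number of commits (default: 1000)
 - Complete statistics
 - Private repository access
 - Analysis of forked repositories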
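@@ -203,6 +210,22 @@ asyncio.run(analyze_user("octocat", AccessLevel.FULL))
 - Language statistics sorted by lines of code
 - List of repositories with star count and creation date (full access mode)
 
+### Output Formats
+
+The program supports three output formats:
+
+#### Text (Default)
+- Rich console output with tables and colors
+- Detailed statistics and repository information
+
+#### JSON
+- Structured JSON output for programmatic use
+- Contains all statistics and repository information
+
+#### CSV
+- Comma-separated values for easy import into spreadsheets
+- Contains all statistics and repository information
+
 ## 📝 Notes
 
 - The program analyzes all repositories including forks, but only counts the user's own contributions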
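
Reviewer note: both READMEs now document a default of 3 for `--max-concurrent-repos`. One common way to enforce such a cap in `asyncio` code is a semaphore, sketched below. The names `analyze_repo` and `analyze_all` are hypothetical stand-ins, and the sleep replaces real GitHub API calls; this is not the project's implementation.

```python
import asyncio


async def analyze_repo(name: str) -> str:
    """Hypothetical per-repository analysis; the sleep stands in for API calls."""
    await asyncio.sleep(0.01)
    return name


async def analyze_all(repos: list[str], max_concurrent_repos: int = 3) -> list[str]:
    """Analyze repositories with at most max_concurrent_repos in flight."""
    semaphore = asyncio.Semaphore(max_concurrent_repos)

    async def bounded(name: str) -> str:
        # Each task waits here until one of the limited slots is free.
        async with semaphore:
            return await analyze_repo(name)

    # gather returns results in the same order as the input list.
    return await asyncio.gather(*(bounded(name) for name in repos))


results = asyncio.run(analyze_all(["repo-a", "repo-b", "repo-c", "repo-d"]))
print(results)
```

A lower cap trades throughput for fewer simultaneous requests, which can help stay within API rate limits.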
