
Implement pre-commit configurations, add security policy, and update project metadata #9


Merged
25 changes: 25 additions & 0 deletions .pre-commit-config.yaml
@@ -0,0 +1,25 @@
default_language_version:
python: python3.10

repos:
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v5.0.0
hooks:
- id: check-added-large-files
- id: check-toml
- id: check-yaml
- id: end-of-file-fixer
- id: trailing-whitespace

- repo: https://github.com/pycqa/isort
rev: 6.0.1
hooks:
- id: isort

- repo: https://github.com/astral-sh/ruff-pre-commit
rev: v0.11.12
hooks:
- id: ruff
args:
- --fix
- id: ruff-format
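
A brief sketch of how this hook configuration is typically activated locally (assuming `pre-commit` is available in the dev environment, as declared in `pyproject.toml`):

```shell
# One-time setup: install the git hooks defined in .pre-commit-config.yaml
pre-commit install

# Optionally run every hook against the entire repository once
pre-commit run --all-files
```

After `pre-commit install`, the configured hooks run automatically on each `git commit`.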
18 changes: 9 additions & 9 deletions README.md
@@ -27,7 +27,7 @@ Version 0.1.2 of **LinearBoost Classifier** is released. Here are the changes:
- Both SEFR and LinearBoostClassifier classes are refactored to fully adhere to Scikit-learn's conventions and API. Now, they are standard Scikit-learn estimators that can be used in Scikit-learn pipelines, grid search, etc.
- Added unit tests (using pytest) to ensure the estimators adhere to Scikit-learn conventions.
- Added fit_intercept parameter to SEFR similar to other linear estimators in Scikit-learn (e.g., LogisticRegression, LinearRegression, etc.).
- Removed random_state parameter from LinearBoostClassifier as it doesn't affect the result, since SEFR doesn't expose a random_state argument. According to Scikit-learn documentation for this parameter in AdaBoostClassifier:
> it is only used when estimator exposes a random_state.
- Added docstring to both SEFR and LinearBoostClassifier classes.
- Used uv for project and package management.
Expand All @@ -45,20 +45,20 @@ The documentation is available at https://linearboost.readthedocs.io/.

The following parameters yielded optimal results during testing. All results are based on 10-fold Cross-Validation:

- **`n_estimators`**:
A range of 10 to 200 is suggested, with higher values potentially improving performance at the cost of longer training times.

- **`learning_rate`**:
Values between 0.01 and 1 typically perform well. Adjust based on the dataset's complexity and noise.

- **`algorithm`**:
Use either `SAMME` or `SAMME.R`. The choice depends on the specific problem:
- `SAMME`: May be better for datasets with clearer separations between classes.
- `SAMME.R`: Can handle more nuanced class probabilities.

**Note:** As of scikit-learn v1.6, the `algorithm` parameter is deprecated and will be removed in v1.8. LinearBoostClassifier will only implement the `SAMME` algorithm in newer versions.

- **`scaler`**:
The following scaling methods are recommended based on dataset characteristics:
- `minmax`: Best for datasets where features are on different scales but bounded.
- `robust`: Effective for datasets with outliers.
@@ -200,10 +200,10 @@ params = {
LinearBoost's combination of **runtime efficiency** and **high accuracy** makes it a powerful choice for real-world machine learning tasks, particularly in resource-constrained or real-time applications.
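
A minimal sketch of the 10-fold cross-validation setup behind the parameter guidance above. It is shown with scikit-learn's `AdaBoostClassifier` (which `LinearBoostClassifier` subclasses) so it runs without the package installed; with `linearboost` installed, `LinearBoostClassifier` can be substituted with the same `n_estimators` and `learning_rate` arguments.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import cross_val_score

# Synthetic stand-in dataset for illustration only
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# n_estimators and learning_rate chosen from the suggested ranges above
clf = AdaBoostClassifier(n_estimators=50, learning_rate=0.5)

# All README results are based on 10-fold cross-validation
scores = cross_val_score(clf, X, y, cv=10)
print(f"mean accuracy: {scores.mean():.3f}")
```

Swapping in `LinearBoostClassifier` is a drop-in change because both follow the standard scikit-learn estimator API.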

### 📰 Featured in:
- [LightGBM Alternatives: A Comprehensive Comparison](https://nightwatcherai.com/blog/lightgbm-alternatives)
_by Jordan Cole, March 11, 2025_
*Discusses how LinearBoost outperforms traditional boosting frameworks in terms of speed while maintaining accuracy.*


Future Developments
-----------------------------
Expand All @@ -224,7 +224,7 @@ This project is licensed under the terms of the MIT license. See [LICENSE](https

Some portions of this code are adapted from the scikit-learn project
(https://scikit-learn.org), which is licensed under the BSD 3-Clause License.
See the `licenses/` folder for details. The modifications and additions made to the original code are licensed under the MIT License © 2025 Hamidreza Keshavarz, Reza Rawassizadeh.
The original code from scikit-learn is available at [scikit-learn GitHub repository](https://github.com/scikit-learn/scikit-learn)

Special Thanks to:
Expand Down
19 changes: 19 additions & 0 deletions SECURITY.md
@@ -0,0 +1,19 @@
# Security Policy

## Reporting a Vulnerability

If you believe you have found a vulnerability, even if you are not sure about it, please report it right away by sending an email to: `hamid9 at outlook dot com`. Please be as explicit as possible, describing all the steps and including example code needed to reproduce the security issue.

## Vulnerability Disclosures

Critical vulnerabilities will be disclosed via GitHub's [security advisory](https://github.com/LinearBoost/linearboost-classifier/security) system.

## Public Discussions

Please refrain from publicly discussing a potential security vulnerability.

It's better to discuss privately and try to find a solution first, to limit the potential impact as much as possible.

---

Thanks for your help!
7 changes: 5 additions & 2 deletions pyproject.toml
Expand Up @@ -38,13 +38,16 @@ dependencies = [
[dependency-groups]
dev = [
"isort",
"pre-commit>=3.5.0",
"pytest>=7.0.0",
"ruff>=0.9.2",
]

[project.urls]
Homepage = "https://github.com/LinearBoost/linearboost-classifier"
Source = "https://github.com/LinearBoost/linearboost-classifier"
Documentation = "https://linearboost.readthedocs.io"
Repository = "https://github.com/LinearBoost/linearboost-classifier"
Issues = "https://github.com/LinearBoost/linearboost-classifier/issues"

[tool.hatch.version]
path = "src/linearboost/__init__.py"
Expand All @@ -66,4 +69,4 @@ line-length = 120
atomic = true
profile = "black"
skip_gitignore = true
known_first_party = ["black", "blib2to3", "blackd", "_black_version"]
known_first_party = ["linearboost"]
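
Since the README notes the project uses uv, the `dev` dependency group above would typically be installed with (a sketch, assuming a uv version with PEP 735 dependency-group support):

```shell
# Create or refresh the project environment; uv installs the "dev" group by default
uv sync
```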
2 changes: 1 addition & 1 deletion requirements.txt
@@ -1,2 +1,2 @@
scikit-learn>=1.2.2
typing-extensions>=4.1.0; python_version < "3.11"
10 changes: 5 additions & 5 deletions src/linearboost/linear_boost.py
Expand Up @@ -67,7 +67,7 @@ class LinearBoostClassifier(AdaBoostClassifier):
"""A LinearBoost classifier.

A LinearBoost classifier is a meta-estimator based on AdaBoost and SEFR.
It is a fast and accurate classification algorithm built to enhance the
performance of the linear classifier SEFR.

Parameters
Expand Down Expand Up @@ -107,7 +107,7 @@ class LinearBoostClassifier(AdaBoostClassifier):
class_weight : {"balanced", "balanced_subsample"}, dict or list of dicts, \
default=None
Weights associated with classes in the form ``{class_label: weight}``.
If not given, all classes are supposed to have weight one.

The "balanced" mode uses the values of y to automatically adjust
weights inversely proportional to class frequencies in the input data
Expand All @@ -122,9 +122,9 @@ class LinearBoostClassifier(AdaBoostClassifier):

loss_function : callable, default=None
Custom loss function for optimization. Must follow the signature:

``loss_function(y_true, y_pred, sample_weight) -> float``

where:
- y_true: Ground truth (correct) target values.
- y_pred: Estimated target values.
Expand Down Expand Up @@ -160,7 +160,7 @@ class LinearBoostClassifier(AdaBoostClassifier):
estimator_errors_ : ndarray of floats
Classification error for each estimator in the boosted
ensemble.

n_features_in_ : int
Number of features seen during :term:`fit`.

Expand Down
Loading
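
The `loss_function` contract documented above (`loss_function(y_true, y_pred, sample_weight) -> float`) can be illustrated with a hypothetical callable; `weighted_error_rate` is an assumption for illustration, not part of the package.

```python
import numpy as np

def weighted_error_rate(y_true, y_pred, sample_weight):
    """Fraction of total sample weight assigned to misclassified samples.

    Matches the documented signature:
    loss_function(y_true, y_pred, sample_weight) -> float
    """
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    w = np.asarray(sample_weight, dtype=float)
    # Weight each misclassification, then normalize by total weight
    return float(np.sum(w * (y_true != y_pred)) / np.sum(w))
```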