From 3843d35996bd068bbcf99a98cf5a8060a48ceaaf Mon Sep 17 00:00:00 2001
From: SebastienMelo
Date: Wed, 13 Aug 2025 11:16:44 +0200
Subject: [PATCH 1/2] made it clear

---
 .../02_numerical_pipeline_introduction.py       | 14 ++++++++++----
 python_scripts/02_numerical_pipeline_scaling.py |  1 +
 2 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/python_scripts/02_numerical_pipeline_introduction.py b/python_scripts/02_numerical_pipeline_introduction.py
index 940065dc3..20ec0734b 100644
--- a/python_scripts/02_numerical_pipeline_introduction.py
+++ b/python_scripts/02_numerical_pipeline_introduction.py
@@ -101,11 +101,18 @@
 # ![Predictor fit diagram](../figures/api_diagram-predictor.fit.svg)
 #
 # In scikit-learn an object that has a `fit` method is called an **estimator**.
+# If the estimator additionally has :
+# - a `predict` method, it is called a **predictor**. Examples of predictors
+# are classifiers or regressors.
+# - a `transform` method, it is called a **transformer**. Examples of
+# transformers are scalers or encoders. We will see more about transformers in
+# the next notebook.
+#
 # The method `fit` is composed of two elements: (i) a **learning algorithm** and
 # (ii) some **model states**. The learning algorithm takes the training data and
 # training target as input and sets the model states. These model states are
-# later used to either predict (for classifiers and regressors) or transform
-# data (for transformers).
+# later used to either predict or transform data as explained above. See the
+# glossary for more detailed definitions.
 #
 # Both the learning algorithm and the type of model states are specific to each
 # type of model.
@@ -124,8 +131,7 @@
 target_predicted = model.predict(data)
 
 # %% [markdown]
-# An estimator (an object with a `fit` method) with a `predict` method is called
-# a **predictor**. We can illustrate the prediction mechanism as follows:
+# We can illustrate the prediction mechanism as follows:
 #
 # ![Predictor predict diagram](../figures/api_diagram-predictor.predict.svg)
 #
diff --git a/python_scripts/02_numerical_pipeline_scaling.py b/python_scripts/02_numerical_pipeline_scaling.py
index 4a0025f5d..b3e899333 100644
--- a/python_scripts/02_numerical_pipeline_scaling.py
+++ b/python_scripts/02_numerical_pipeline_scaling.py
@@ -88,6 +88,7 @@
 # We show how to apply such normalization using a scikit-learn transformer
 # called `StandardScaler`. This transformer shifts and scales each feature
 # individually so that they all have a 0-mean and a unit standard deviation.
+# We recall that transformers are estimators that have a `transform` method.
 #
 # We now investigate different steps used in scikit-learn to achieve such a
 # transformation of the data.

From a8e6351c353e8d4a3858e39c167e9a6a1ac5af43 Mon Sep 17 00:00:00 2001
From: SebastienMelo
Date: Wed, 13 Aug 2025 11:32:23 +0200
Subject: [PATCH 2/2] added the notebooks

---
 notebooks/02_numerical_pipeline_introduction.ipynb | 14 ++++++++++----
 notebooks/02_numerical_pipeline_scaling.ipynb      |  1 +
 2 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/notebooks/02_numerical_pipeline_introduction.ipynb b/notebooks/02_numerical_pipeline_introduction.ipynb
index a7bbcbd29..0e9b7c856 100644
--- a/notebooks/02_numerical_pipeline_introduction.ipynb
+++ b/notebooks/02_numerical_pipeline_introduction.ipynb
@@ -162,11 +162,18 @@
     "![Predictor fit diagram](../figures/api_diagram-predictor.fit.svg)\n",
     "\n",
     "In scikit-learn an object that has a `fit` method is called an **estimator**.\n",
+    "If the estimator additionally has :\n",
+    "- a `predict` method, it is called a **predictor**. Examples of predictors\n",
+    " are classifiers or regressors.\n",
+    "- a `transform` method, it is called a **transformer**. Examples of\n",
+    " transformers are scalers or encoders. We will see more about transformers in\n",
+    " the next notebook.\n",
+    "\n",
     "The method `fit` is composed of two elements: (i) a **learning algorithm** and\n",
     "(ii) some **model states**. The learning algorithm takes the training data and\n",
     "training target as input and sets the model states. These model states are\n",
-    "later used to either predict (for classifiers and regressors) or transform\n",
-    "data (for transformers).\n",
+    "later used to either predict or transform data as explained above. See the\n",
+    "glossary for more detailed definitions.\n",
     "\n",
     "Both the learning algorithm and the type of model states are specific to each\n",
     "type of model."
@@ -204,8 +211,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "An estimator (an object with a `fit` method) with a `predict` method is called\n",
-    "a **predictor**. We can illustrate the prediction mechanism as follows:\n",
+    "We can illustrate the prediction mechanism as follows:\n",
     "\n",
     "![Predictor predict diagram](../figures/api_diagram-predictor.predict.svg)\n",
     "\n",
diff --git a/notebooks/02_numerical_pipeline_scaling.ipynb b/notebooks/02_numerical_pipeline_scaling.ipynb
index 4fe003f24..14248425c 100644
--- a/notebooks/02_numerical_pipeline_scaling.ipynb
+++ b/notebooks/02_numerical_pipeline_scaling.ipynb
@@ -137,6 +137,7 @@
     "We show how to apply such normalization using a scikit-learn transformer\n",
     "called `StandardScaler`. This transformer shifts and scales each feature\n",
     "individually so that they all have a 0-mean and a unit standard deviation.\n",
+    "We recall that transformers are estimators that have a `transform` method.\n",
     "\n",
     "We now investigate different steps used in scikit-learn to achieve such a\n",
     "transformation of the data.\n",