@@ -575,82 +575,34 @@ def fit(
Fit both optimizes the machine learning models and builds an ensemble
out of them.

- # TODO PR1213
- #
- # `task: Optional[int]` and `is_classification`
- #
- # `AutoML` tries to identify the task itself with `sklearn.type_of_target`,
- # leaving little for the subclasses to do.
- # Except this fails when type_of_target(y) == "multiclass".
- #
- # "multiclass" can mean either REGRESSION or MULTICLASS_CLASSIFICATION,
- # and so this is where the subclasses are used to determine which.
- # However, this could also be deduced from the `is_classification`
- # parameter.
- #
- # In the future, there is little need for the subclasses of `AutoML`
- # and no need for the `task` parameter. The extra functionality
- # provided by `AutoMLClassifier` in predict could be moved to
- # `AutoSklearnClassifier`, leaving `AutoML` to just produce raw
- # outputs and simplifying the hierarchy.
- #
- # `load_models`
- #
- # This parameter is likely not needed as they are loaded upon demand
- # throughout `AutoML`.
- # Creating a @property models that loads models into self.models_ if
- # not loaded would remove the need for this parameter and simplify
- # the verification of `load if self.models_ is None` to one place.
- #
- # `only_return_configuration_space`
- #
- # This parameter is indicative of a need to create a separate method
- # for this as the functionality of `fit` and what it returns can vary.
-
Parameters
----------
- X : {array-like, sparse matrix}, shape (n_samples, n_features)
+ X : np.ndarray | pd.DataFrame | list | spmatrix
The training input samples.

- y : array-like, shape (n_samples) or (n_samples, n_outputs)
+ y : np.ndarray | pd.DataFrame | pd.Series | list
The target classes.

- task : Optional[int]
- The identifier for the task AutoML is to perform.
-
- X_test : Optional[{array-like, sparse matrix}, shape (n_samples, n_features)]
+ X_test : np.ndarray | pd.DataFrame | list | spmatrix | None = None
Test data input samples. Will be used to save test predictions for
all models. This allows evaluating the performance of Auto-sklearn
over time.

- y_test : Optional[array-like, shape (n_samples) or (n_samples, n_outputs)]
+ y_test : np.ndarray | pd.DataFrame | pd.Series | list | None = None
Test data target classes. Will be used to calculate the test error
of all models. This allows evaluating the performance of
Auto-sklearn over time.

- feat_type : Optional[ list] ,
+ feat_type : list[str] | None = None ,
List of str of length `X.shape[1]` describing the attribute type.
Possible types are `Categorical` and `Numerical`. `Categorical`
attributes will be automatically One-Hot encoded. The values
used for a categorical attribute must be integers, obtained for
example by `sklearn.preprocessing.LabelEncoder
<https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.LabelEncoder.html>`_.

- dataset_name : Optional[str]
- Create nicer output. If None, a string will be determined by the
- md5 hash of the dataset.
-
- only_return_configuration_space: bool = False
- If set to true, fit will only return the configuration space that will
- be used for model search. Otherwise fitting will be performed and an
- ensemble created.
-
- load_models: bool = True
- If true, this will load the models into memory once complete.
-
- is_classification: bool = False
- Indicates whether this is a classification task if True or a
- regression task if False.
+ dataset_name : str | None = None
+ Create nicer output. If None, a pseudo-random hash will be used

Returns
-------
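
Note on the removed `load_models` TODO: the idea of a `models` property that loads into `self.models_` on first access could look roughly like the sketch below. This is only an illustration and not code from this PR; `LazyModelStore`, its `loader` argument, and `fake_loader` are hypothetical names standing in for whatever `AutoML` actually uses to read fitted models from disk.

```python
from typing import Callable, Dict, Optional


class LazyModelStore:
    """Hypothetical stand-in for the part of `AutoML` that holds fitted models."""

    def __init__(self, loader: Callable[[], Dict[str, object]]) -> None:
        # `loader` is an assumed callable that reads the fitted models from disk.
        self._loader = loader
        self.models_: Optional[Dict[str, object]] = None

    @property
    def models(self) -> Dict[str, object]:
        # The `load if self.models_ is None` check lives only in this property.
        if self.models_ is None:
            self.models_ = self._loader()
        return self.models_


calls = []  # records how many times the loader actually ran


def fake_loader() -> Dict[str, object]:
    # Stand-in for an expensive load from disk.
    calls.append(1)
    return {"model_a": object()}


store = LazyModelStore(fake_loader)
first = store.models   # triggers the load
second = store.models  # reuses the cached dictionary
```

With this shape, callers read `store.models` everywhere and never need a `load_models` flag; the loader runs at most once per instance.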