|
29 | 29 | ----
|
30 | 30 |
|
31 | 31 | The TSNE algorithm consists of two components: KNN and Gradient Descent.
|
32 |
| -The overall accelration of TSNE depends on the acceleration of each of these algorithms. |
| 32 | +The overall acceleration of TSNE depends on the acceleration of each of these algorithms. |
33 | 33 |
|
34 | 34 | - The KNN part of the algorithm supports all parameters except:
|
35 |
| - |
| 35 | + |
36 | 36 | - ``metric`` != `'euclidean'` or `'minkowski'` with ``p`` != `2`
|
37 | 37 | - The Gradient Descent part of the algorithm supports all parameters except:
|
38 |
| - |
| 38 | + |
39 | 39 | - ``n_components`` = `3`
|
40 | 40 | - ``method`` = `'exact'`
|
41 | 41 | - ``verbose`` != `0`
|
42 | 42 |
|
43 |
| -To get better performance, use parameters supported by both components. |
| 43 | +To get better performance, use parameters supported by both components. |
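
The restrictions listed above can be expressed as a small check. The following is a minimal sketch, not part of the library: the function name ``unsupported_tsne_params`` and the plain-dictionary input are hypothetical, and it assumes the KNN restriction means "any metric other than Euclidean, or Minkowski with ``p`` other than 2". Defaults mirror the scikit-learn ``TSNE`` defaults; the default for ``p`` is an assumption.

```python
def unsupported_tsne_params(params):
    """Return the names of parameters that fall outside the accelerated paths.

    `params` is a plain dict of TSNE-style settings (hypothetical helper,
    not an API of the library).
    """
    blocked = []

    metric = params.get("metric", "euclidean")
    p = params.get("p", 2)  # assumed default; only relevant for 'minkowski'

    # KNN part: Euclidean distance, or Minkowski with p == 2, is accelerated.
    if not (metric == "euclidean" or (metric == "minkowski" and p == 2)):
        blocked.append("metric")

    # Gradient Descent part: the three exceptions listed in the text.
    if params.get("n_components", 2) == 3:
        blocked.append("n_components")
    if params.get("method", "barnes_hut") == "exact":
        blocked.append("method")
    if params.get("verbose", 0) != 0:
        blocked.append("verbose")

    return blocked


print(unsupported_tsne_params({"metric": "cosine", "method": "exact"}))
```

An empty result means that, under these assumptions, both components can use the accelerated path.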
| 44 | + |
| 45 | +.. _acceleration_rf: |
| 46 | + |
| 47 | +Random Forest |
| 48 | +------------- |
| 49 | + |
| 50 | +Random Forest models accelerated with |intelex| and using the `hist` splitting |
| 51 | +method discretize training data by creating a histogram with a configurable |
| 52 | +number of bins. The following keyword arguments can be used to influence the |
| 53 | +created histogram. |
| 54 | + |
| 55 | +.. list-table:: |
| 56 | + :widths: 10 10 10 30 |
| 57 | + :header-rows: 1 |
| 58 | + :align: left |
| 59 | + |
| 60 | + * - Keyword argument |
| 61 | + - Possible values |
| 62 | + - Default value |
| 63 | + - Description |
| 64 | + * - ``maxBins`` |
| 65 | + - `[0, inf)` |
| 66 | + - ``256`` |
| 67 | + - Number of bins in the histogram with the discretized training data. The |
| 68 | + value ``0`` disables data discretization. |
| 69 | + * - ``minBinSize`` |
| 70 | + - `[1, inf)` |
| 71 | + - ``5`` |
| 72 | + - Minimum number of training data points in each bin after discretization. |
| 73 | + * - ``binningStrategy`` |
| 74 | + - ``quantiles, averages`` |
| 75 | + - ``quantiles`` |
| 76 | + - Selects the algorithm used to calculate bin edges. ``quantiles`` |
| 77 | + results in bins with a similar number of training data points. ``averages`` |
| 78 | + divides the range of values observed in the training data set into |
| 79 | + equal-width bins of size `(max - min) / maxBins`. |
| 80 | + |
| 81 | +Note that using discretized training data can greatly reduce model training |
| 82 | +times, especially for larger data sets. However, due to the reduced fidelity of |
| 83 | +the data, the resulting model can present worse performance metrics compared to |
| 84 | +a model trained on the original data. In such cases, the number of bins can be |
| 85 | +increased with the ``maxBins`` parameter, or binning can be disabled entirely by |
| 86 | +setting ``maxBins=0``. |
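
To illustrate how the two ``binningStrategy`` values differ, here is a minimal pure-Python sketch. It is not the library's implementation: the helper names are hypothetical, ``minBinSize`` is ignored, and the quantile edges are approximated by cutting the sorted data at evenly spaced ranks.

```python
def equal_width_edges(values, max_bins):
    """'averages'-style edges: equal-width bins of size (max - min) / max_bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / max_bins
    return [lo + i * width for i in range(max_bins + 1)]


def quantile_edges(values, max_bins):
    """'quantiles'-style edges: cut the sorted data at evenly spaced ranks."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[min(n - 1, (i * n) // max_bins)] for i in range(max_bins + 1)]


def bin_counts(values, edges):
    """Count points per bin; bins are [left, right), the last bin includes its top edge."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        idx = 0
        while idx < len(counts) - 1 and v >= edges[idx + 1]:
            idx += 1
        counts[idx] += 1
    return counts


# Two clusters of values that are far apart.
data = [1, 2, 3, 4, 100, 101, 102, 103]

print(bin_counts(data, equal_width_edges(data, 4)))  # [4, 0, 0, 4]
print(bin_counts(data, quantile_edges(data, 4)))     # [2, 2, 2, 2]
```

With clustered data, the equal-width edges leave two bins empty, while the rank-based edges spread the points evenly. This is why the choice of strategy and the number of bins can affect how much detail the discretized data retains.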