mirror of
https://github.com/Llewellynvdm/php-ml.git
synced 2024-06-03 09:00:49 +00:00
create docs for StratifiedRandomSplit
parent f04cc04da5
commit ee6ea3b850
```diff
@@ -50,6 +50,7 @@ composer require php-ai/php-ml
     * [Accuracy](http://php-ml.readthedocs.io/en/latest/machine-learning/metric/accuracy/)
 * Cross Validation
     * [Random Split](http://php-ml.readthedocs.io/en/latest/machine-learning/cross-validation/random-split/)
+    * [Stratified Random Split](http://php-ml.readthedocs.io/en/latest/machine-learning/cross-validation/stratified-random-split/)
 * Preprocessing
     * [Normalization](http://php-ml.readthedocs.io/en/latest/machine-learning/preprocessing/normalization/)
     * [Imputation missing values](http://php-ml.readthedocs.io/en/latest/machine-learning/preprocessing/imputation-missing-values/)
```
````diff
@@ -1,4 +1,4 @@
-# RandomSplit
+# Random Split
 
 One of the simplest methods from Cross-validation is implemented as `RandomSpilt` class. Samples are split to two groups: train group and test group. You can adjust number of samples in each group.
 
@@ -6,7 +6,7 @@ One of the simplest methods from Cross-validation is implemented as `RandomSpilt
 
 * $dataset - object that implements `Dataset` interface
 * $testSize - a fraction of test split (float, from 0 to 1, default: 0.3)
-* $seed - seed for random generator (for tests)
+* $seed - seed for random generator (e.g. for tests)
 
 ```
 $randomSplit = new RandomSplit($dataset, 0.2);
````
@@ -0,0 +1,44 @@
# Stratified Random Split

Like the `RandomSplit` class, `StratifiedRandomSplit` divides samples into two groups: a train group and a test group.
The split takes the samples' targets into account and tries to divide the samples of each target between the two groups in the same proportion.
You can adjust the share of samples assigned to each group.

### Constructor Parameters

* $dataset - object that implements the `Dataset` interface
* $testSize - fraction of the samples assigned to the test group (float, from 0 to 1, default: 0.3)
* $seed - seed for the random generator (e.g. for tests)

```
$split = new StratifiedRandomSplit($dataset, 0.2);
```

### Samples and labels groups

To get the samples or labels of the train and test groups, use the getters:

```
$split = new StratifiedRandomSplit($dataset, 0.3, 1234);

// train group
$split->getTrainSamples();
$split->getTrainLabels();

// test group
$split->getTestSamples();
$split->getTestLabels();
```

### Example

```
$dataset = new ArrayDataset(
    $samples = [[1], [2], [3], [4], [5], [6], [7], [8]],
    $targets = ['a', 'a', 'a', 'a', 'b', 'b', 'b', 'b']
);

$split = new StratifiedRandomSplit($dataset, 0.5);
```

Each group of the split will contain an equal number of samples of each target: two of target `a` and two of target `b`.
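
To make the idea of stratification concrete, here is a minimal plain-PHP sketch of how such a split could be computed. It is not the php-ml implementation: the `stratifiedSplit()` function, its array-based return shape, and the rounding rule for the test group size are assumptions made purely for illustration. It also assumes targets are strings or integers, so they can be used as array keys.

```php
<?php

declare(strict_types=1);

/**
 * Illustrative stratified split (hypothetical helper, not part of php-ml).
 *
 * @param array      $samples  list of samples
 * @param array      $targets  target of each sample, in the same order
 * @param float      $testSize fraction of each target's samples sent to the test group
 * @param int|null   $seed     optional seed so the split can be repeated
 */
function stratifiedSplit(array $samples, array $targets, float $testSize = 0.3, ?int $seed = null): array
{
    if ($seed !== null) {
        // shuffle() draws from the Mt19937 generator (PHP 7.1+), so seeding it repeats the split.
        mt_srand($seed);
    }

    // Group sample indices by target so that every target is split on its own.
    $indicesByTarget = [];
    foreach ($targets as $index => $target) {
        $indicesByTarget[$target][] = $index;
    }

    $split = [
        'train' => ['samples' => [], 'labels' => []],
        'test'  => ['samples' => [], 'labels' => []],
    ];

    foreach ($indicesByTarget as $indices) {
        shuffle($indices);

        // Assumed rounding rule: nearest whole number of test samples per target.
        $testCount = (int) round(count($indices) * $testSize);

        foreach ($indices as $position => $index) {
            $group = $position < $testCount ? 'test' : 'train';
            $split[$group]['samples'][] = $samples[$index];
            $split[$group]['labels'][]  = $targets[$index];
        }
    }

    return $split;
}

$samples = [[1], [2], [3], [4], [5], [6], [7], [8]];
$targets = ['a', 'a', 'a', 'a', 'b', 'b', 'b', 'b'];

$split = stratifiedSplit($samples, $targets, 0.5, 1234);

// Count how often each target occurs in each group.
$testCounts  = array_count_values($split['test']['labels']);  // ['a' => 2, 'b' => 2]
$trainCounts = array_count_values($split['train']['labels']); // ['a' => 2, 'b' => 2]
```

Splitting each target separately is what keeps the class proportions the same in both groups; a plain random split over all eight samples could, for instance, put three `a` samples and one `b` sample into the test group.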