Create docs files

Arkadiusz Kondas 2016-04-09 00:36:48 +02:00
parent 83b4a9e19c
commit 5be2147784
10 changed files with 232 additions and 3 deletions

docs/index.md Normal file

@ -0,0 +1,37 @@
# PHP Machine Learning (PHP-ML)
[![Build Status](https://scrutinizer-ci.com/g/php-ai/php-ml/badges/build.png?b=develop)](https://scrutinizer-ci.com/g/php-ai/php-ml/build-status/develop)
[![Total Downloads](https://poser.pugx.org/php-ai/php-ml/downloads.svg)](https://packagist.org/packages/php-ai/php-ml)
[![License](https://poser.pugx.org/php-ai/php-ml/license.svg)](https://packagist.org/packages/php-ai/php-ml)
[![Scrutinizer Code Quality](https://scrutinizer-ci.com/g/php-ai/php-ml/badges/quality-score.png?b=develop)](https://scrutinizer-ci.com/g/php-ai/php-ml/?branch=develop)
A fresh approach to machine learning in PHP. Note that PHP is currently not the best choice for machine learning, but maybe this will change ...
## Installation
This library is still under development, but you can install it with Composer:
```
composer require php-ai/php-ml
```
## To-Do
* implement more algorithms
* integration with Lavacharts for data visualization
## Testing
After installation, you can run the test suite from the project root directory (you will need to install the dev requirements with Composer first):
```
bin/phpunit
```
## License
PHP-ML is released under the MIT License. See the bundled LICENSE file for details.
## Author
Arkadiusz Kondas (@ArkadiuszKondas)


@ -0,0 +1,35 @@
# KNearestNeighbors Classifier
Classifier implementing the k-nearest neighbors algorithm.
### Constructor Parameters
* $k - number of nearest neighbors to scan (default: 3)
```
$classifier = new KNearestNeighbors($k=4);
```
### Train
To train a classifier, simply provide the training samples and their labels (as `array`):
```
$samples = [[1, 3], [1, 4], [2, 4], [3, 1], [4, 1], [4, 2]];
$labels = ['a', 'a', 'a', 'b', 'b', 'b'];
$classifier = new KNearestNeighbors();
$classifier->train($samples, $labels);
```
### Predict
To predict the class of a sample, use the `predict` method. You can provide a single sample or an array of samples:
```
$classifier->predict([3, 2]);
// return 'b'
$classifier->predict([[3, 2], [1, 5]]);
// return ['b', 'a']
```
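For readers who want to see what the classifier does conceptually, here is a minimal, dependency-free sketch of k-nearest-neighbors voting. It is an illustration under stated assumptions, not PHP-ML's actual implementation: `sketchKnnPredict` is a hypothetical helper name, it assumes Euclidean distance and a simple majority vote, and it needs PHP 7.3+ for `array_key_first`.

```php
<?php
// Illustrative sketch only; this is not PHP-ML's actual implementation.
// sketchKnnPredict() is a hypothetical helper name.
function sketchKnnPredict(array $samples, array $labels, array $sample, int $k = 3)
{
    // Squared Euclidean distance from $sample to every training sample.
    // Squaring preserves the ordering, so the square root can be skipped.
    $distances = [];
    foreach ($samples as $index => $trainSample) {
        $sum = 0;
        foreach ($trainSample as $feature => $value) {
            $sum += ($value - $sample[$feature]) ** 2;
        }
        $distances[$index] = $sum;
    }

    asort($distances); // nearest first, original indices kept as keys
    $nearest = array_slice(array_keys($distances), 0, $k);

    // One vote per neighbor label.
    $votes = [];
    foreach ($nearest as $index) {
        $label = $labels[$index];
        $votes[$label] = ($votes[$label] ?? 0) + 1;
    }

    arsort($votes); // most frequent label first
    return array_key_first($votes);
}

$samples = [[1, 3], [1, 4], [2, 4], [3, 1], [4, 1], [4, 2]];
$labels = ['a', 'a', 'a', 'b', 'b', 'b'];

$predicted = sketchKnnPredict($samples, $labels, [3, 2]);
// the three nearest samples are [3, 1], [4, 2] and [4, 1], all labeled 'b'
```

Note that in this sketch labels are used as array keys, so numeric-string labels would come back as integers; ties between labels are not handled specially.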


@ -0,0 +1,29 @@
# RandomSplit
One of the simplest cross-validation methods is implemented as the `RandomSplit` class. Samples are split into two groups: a train group and a test group. You can adjust the number of samples in each group.
### Constructor Parameters
* $dataset - object that implements the `Dataset` interface
* $testSize - fraction of samples assigned to the test group (float, from 0 to 1, default: 0.3)
* $seed - seed for the random number generator (useful for reproducible tests)
```
$randomSplit = new RandomSplit($dataset, 0.2);
```
### Samples and labels groups
To get the samples or labels of the train and test groups, use the getters:
```
$dataset = new RandomSplit($dataset, 0.3, 1234);
// train group
$dataset->getTrainSamples();
$dataset->getTrainLabels();
// test group
$dataset->getTestSamples();
$dataset->getTestLabels();
```
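To make the roles of `$testSize` and `$seed` concrete, the following standalone sketch shuffles sample indices with a seeded generator and assigns the first fraction to the test group. It is an assumption-laden illustration, not how `RandomSplit` is necessarily implemented: `sketchRandomSplit` and its returned array keys are hypothetical, and rounding the test count is a choice made here.

```php
<?php
// Illustrative sketch only; not PHP-ML's RandomSplit implementation.
// sketchRandomSplit() and the returned array keys are hypothetical.
function sketchRandomSplit(array $samples, array $labels, float $testSize = 0.3, ?int $seed = null): array
{
    if ($seed !== null) {
        mt_srand($seed); // same seed => same shuffle => reproducible split
    }

    // Fisher-Yates shuffle of the sample indices, driven by mt_rand().
    $indices = array_keys($samples);
    for ($i = count($indices) - 1; $i > 0; $i--) {
        $j = mt_rand(0, $i);
        [$indices[$i], $indices[$j]] = [$indices[$j], $indices[$i]];
    }

    // The first $testCount shuffled indices form the test group.
    $testCount = (int) round(count($indices) * $testSize);
    $split = ['trainSamples' => [], 'trainLabels' => [], 'testSamples' => [], 'testLabels' => []];
    foreach ($indices as $position => $index) {
        $group = $position < $testCount ? 'test' : 'train';
        $split[$group . 'Samples'][] = $samples[$index];
        $split[$group . 'Labels'][] = $labels[$index];
    }

    return $split;
}

$split = sketchRandomSplit([[1, 1], [2, 1], [3, 2], [4, 1]], ['a', 'a', 'b', 'b'], 0.5, 1234);
// $split['testSamples'] and $split['trainSamples'] each hold 2 samples
```

Shuffling indices rather than the samples themselves keeps each sample paired with its label.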


@ -0,0 +1,21 @@
# ArrayDataset
Helper class that holds data as a PHP `array`. It implements the `Dataset` interface, which is used heavily in other classes.
### Constructor Parameters
* $samples - (array) of samples
* $labels - (array) of labels
```
$dataset = new ArrayDataset([[1, 1], [2, 1], [3, 2], [4, 1]], ['a', 'a', 'b', 'b']);
```
### Samples and labels
To get the samples or labels, use the getters:
```
$dataset->getSamples();
$dataset->getLabels();
```


@ -0,0 +1,15 @@
# CsvDataset
Helper class that loads data from a CSV file. It extends `ArrayDataset`.
### Constructor Parameters
* $filepath - (string) path to the `.csv` file
* $features - (int) number of feature columns, counted from the first column; the column that follows them must contain the label
* $headingRow - (bool) whether the file has a heading row (if `true`, the first row is ignored)
```
$dataset = new CsvDataset('dataset.csv', 2, true);
```
See ArrayDataset for more information.
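The way the three parameters interact can be sketched with plain PHP file functions. This is a hypothetical illustration (`sketchLoadCsv` is not part of PHP-ML) that assumes comma-separated values and takes the column right after the features as the label; values are left as strings, exactly as read from the file.

```php
<?php
// Illustrative sketch only; sketchLoadCsv() is a hypothetical helper,
// not the CsvDataset implementation.
function sketchLoadCsv(string $filepath, int $features, bool $headingRow = true): array
{
    $handle = fopen($filepath, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open {$filepath}");
    }

    if ($headingRow) {
        fgetcsv($handle, 0, ',', '"', '\\'); // read and discard the heading row
    }

    $samples = [];
    $labels = [];
    while (($row = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
        if ($row === [null]) {
            continue; // fgetcsv() returns [null] for a blank line
        }
        $samples[] = array_slice($row, 0, $features); // first $features columns
        $labels[] = $row[$features];                  // next column is the label
    }
    fclose($handle);

    return [$samples, $labels];
}

// Usage with a small temporary file:
$path = tempnam(sys_get_temp_dir(), 'csv');
file_put_contents($path, "sepal length,sepal width,class\n5.1,3.5,Iris-setosa\n7.0,3.2,Iris-versicolor\n");
[$samples, $labels] = sketchLoadCsv($path, 2, true);
unlink($path);
```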


@ -0,0 +1,34 @@
# Iris Dataset
A popular and widely available dataset of iris flower measurements and class names.
### Specification
| Property            | Value |
| ------------------- | ----- |
| Classes             | 3     |
| Samples per class   | 50    |
| Samples total       | 150   |
| Features per sample | 4     |
### Load
To load the Iris dataset, simply use:
```
$dataset = new Iris();
```
### Several samples
```
sepal length,sepal width,petal length,petal width,class
5.1,3.5,1.4,0.2,Iris-setosa
4.9,3.0,1.4,0.2,Iris-setosa
4.7,3.2,1.3,0.2,Iris-setosa
7.0,3.2,4.7,1.4,Iris-versicolor
6.4,3.2,4.5,1.5,Iris-versicolor
6.9,3.1,4.9,1.5,Iris-versicolor
6.3,3.3,6.0,2.5,Iris-virginica
5.8,2.7,5.1,1.9,Iris-virginica
7.1,3.0,5.9,2.1,Iris-virginica
6.3,2.9,5.6,1.8,Iris-virginica
```


@ -0,0 +1,24 @@
# Accuracy
Class for calculating classifier accuracy.
### Score
To calculate the accuracy score of a classifier, use the static `score` method. Parameters:
* $actualLabels - (array) true sample labels
* $predictedLabels - (array) predicted labels (e.g. from the test group)
* $normalize - (bool) if `true`, return the fraction of correct predictions; if `false`, return their count (default: true)
### Example
```
$actualLabels = ['a', 'b', 'a', 'b'];
$predictedLabels = ['a', 'a', 'a', 'b'];
Accuracy::score($actualLabels, $predictedLabels);
// return 0.75
Accuracy::score($actualLabels, $predictedLabels, false);
// return 3
```
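The arithmetic behind the two results above is simply "count the matches, then optionally divide by the total". A minimal sketch follows; `sketchAccuracy` is a hypothetical function written for illustration and is not claimed to match the body of `Accuracy::score`.

```php
<?php
// Illustrative sketch only; sketchAccuracy() is a hypothetical helper.
function sketchAccuracy(array $actualLabels, array $predictedLabels, bool $normalize = true)
{
    $total = count($actualLabels);
    if ($total === 0 || $total !== count($predictedLabels)) {
        throw new InvalidArgumentException('Label arrays must be non-empty and of equal size');
    }

    // Count positions where the prediction matches the true label.
    $correct = 0;
    foreach ($actualLabels as $index => $label) {
        if ($label === $predictedLabels[$index]) {
            $correct++;
        }
    }

    // Normalized: fraction of correct predictions; otherwise the raw count.
    return $normalize ? $correct / $total : $correct;
}

$fraction = sketchAccuracy(['a', 'b', 'a', 'b'], ['a', 'a', 'a', 'b']);      // 3 of 4 match: 0.75
$count = sketchAccuracy(['a', 'b', 'a', 'b'], ['a', 'a', 'a', 'b'], false);  // 3
```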


@ -0,0 +1,17 @@
# Distance
Utility class for calculating different types of distance.
### Euclidean
![euclidean](https://upload.wikimedia.org/math/8/4/9/849f040fd10bb86f7c85eb0bbe3566a4.png "Euclidean Distance")
To calculate the Euclidean distance:
```
$a = [4, 6];
$b = [2, 5];
Distance::euclidean($a, $b);
// return 2.2360679774998
```
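As a worked check of the value above, the Euclidean distance is the square root of the summed squared differences per dimension: (4 - 2)² + (6 - 5)² = 5, and √5 ≈ 2.236. The sketch below spells this out for vectors of any length; `sketchEuclidean` is a hypothetical stand-in and not the library's own code.

```php
<?php
// Illustrative sketch only; sketchEuclidean() is a hypothetical helper.
function sketchEuclidean(array $a, array $b): float
{
    if (count($a) !== count($b)) {
        throw new InvalidArgumentException('Both vectors must have the same number of dimensions');
    }

    // Sum the squared difference in each dimension.
    $sum = 0.0;
    foreach ($a as $i => $value) {
        $sum += ($value - $b[$i]) ** 2;
    }

    return sqrt($sum);
}

// (4 - 2)^2 + (6 - 5)^2 = 4 + 1 = 5, so the distance is sqrt(5)
$distance = sketchEuclidean([4, 6], [2, 5]);
```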

mkdocs.yml Normal file

@ -0,0 +1,17 @@
site_name: PHP Machine Learning (PHP-ML)
pages:
- Home: index.md
- Machine Learning:
    - Classification:
        - KNearestNeighbors: machine-learning/classification/knearestneighbors.md
    - Cross Validation:
        - RandomSplit: machine-learning/cross-validation/randomsplit.md
    - Datasets:
        - Array Dataset: machine-learning/datasets/array-dataset.md
        - CSV Dataset: machine-learning/datasets/csv-dataset.md
        - Demo:
            - Iris: machine-learning/datasets/demo/iris.md
    - Metric:
        - Accuracy: machine-learning/metric/accuracy.md
        - Distance: machine-learning/metric/distance.md
theme: readthedocs


@ -32,10 +32,10 @@ class RandomSplitTest extends \PHPUnit_Framework_TestCase
$labels = ['a', 'a', 'b', 'b']
);
- $randomSplit1 = new RandomSplit($dataset, 0.5);
+ $randomSplit = new RandomSplit($dataset, 0.5);
- $this->assertEquals(2, count($randomSplit1->getTestSamples()));
- $this->assertEquals(2, count($randomSplit1->getTrainSamples()));
+ $this->assertEquals(2, count($randomSplit->getTestSamples()));
+ $this->assertEquals(2, count($randomSplit->getTrainSamples()));
$randomSplit2 = new RandomSplit($dataset, 0.25);