update docs

Arkadiusz Kondas 2016-05-07 23:53:42 +02:00
parent 46197eba7b
commit 365a9baeca
6 changed files with 154 additions and 1 deletions

@@ -37,15 +37,19 @@ composer require php-ai/php-ml
## Features
* Classification
    * [SVC](http://php-ml.readthedocs.io/en/latest/machine-learning/classification/svc/)
    * [k-Nearest Neighbors](http://php-ml.readthedocs.io/en/latest/machine-learning/classification/k-nearest-neighbors/)
    * [Naive Bayes](http://php-ml.readthedocs.io/en/latest/machine-learning/classification/naive-bayes/)
* Regression
    * [Least Squares](http://php-ml.readthedocs.io/en/latest/machine-learning/regression/least-squares/)
    * [SVR](http://php-ml.readthedocs.io/en/latest/machine-learning/regression/svr/)
* Clustering
    * [k-Means](http://php-ml.readthedocs.io/en/latest/machine-learning/clustering/k-means)
    * [DBSCAN](http://php-ml.readthedocs.io/en/latest/machine-learning/clustering/dbscan)
* Cross Validation
    * [Random Split](http://php-ml.readthedocs.io/en/latest/machine-learning/cross-validation/random-split)
* Feature Extraction
    * [Token Count Vectorizer](http://php-ml.readthedocs.io/en/latest/machine-learning/feature-extraction/token-count-vectorizer)
* Datasets
    * [CSV](http://php-ml.readthedocs.io/en/latest/machine-learning/datasets/csv-dataset)
    * Ready to use:

@@ -1,4 +1,4 @@
-# PHP Machine Learning library
+# PHP-ML - Machine Learning library for PHP
[![Build Status](https://scrutinizer-ci.com/g/php-ai/php-ml/badges/build.png?b=develop)](https://scrutinizer-ci.com/g/php-ai/php-ml/build-status/develop)
[![Documentation Status](https://readthedocs.org/projects/php-ml/badge/?version=develop)](http://php-ml.readthedocs.org/en/develop/?badge=develop)
@@ -37,15 +37,19 @@ composer require php-ai/php-ml
## Features
* Classification
    * [SVC](http://php-ml.readthedocs.io/en/latest/machine-learning/classification/svc/)
    * [k-Nearest Neighbors](http://php-ml.readthedocs.io/en/latest/machine-learning/classification/k-nearest-neighbors/)
    * [Naive Bayes](http://php-ml.readthedocs.io/en/latest/machine-learning/classification/naive-bayes/)
* Regression
    * [Least Squares](http://php-ml.readthedocs.io/en/latest/machine-learning/regression/least-squares/)
    * [SVR](http://php-ml.readthedocs.io/en/latest/machine-learning/regression/svr/)
* Clustering
    * [k-Means](http://php-ml.readthedocs.io/en/latest/machine-learning/clustering/k-means)
    * [DBSCAN](http://php-ml.readthedocs.io/en/latest/machine-learning/clustering/dbscan)
* Cross Validation
    * [Random Split](http://php-ml.readthedocs.io/en/latest/machine-learning/cross-validation/random-split)
* Feature Extraction
    * [Token Count Vectorizer](http://php-ml.readthedocs.io/en/latest/machine-learning/feature-extraction/token-count-vectorizer)
* Datasets
    * [CSV](http://php-ml.readthedocs.io/en/latest/machine-learning/datasets/csv-dataset)
    * Ready to use:

@@ -0,0 +1,47 @@
# Support Vector Classification
Classifier implementing Support Vector Machine based on libsvm.
### Constructor Parameters
* $kernel (int) - kernel type to be used in the algorithm (default Kernel::LINEAR)
* $cost (float) - parameter C of C-SVC (default 1.0)
* $degree (int) - degree of the Kernel::POLYNOMIAL function (default 3)
* $gamma (float) - kernel coefficient for Kernel::RBF, Kernel::POLYNOMIAL and Kernel::SIGMOID. If gamma is null, 1/(number of features) is used instead.
* $coef0 (float) - independent term in kernel function. It is only significant in Kernel::POLYNOMIAL and Kernel::SIGMOID (default 0.0)
* $tolerance (float) - tolerance of termination criterion (default 0.001)
* $cacheSize (int) - cache memory size in MB (default 100)
* $shrinking (bool) - whether to use the shrinking heuristics (default true)
* $probabilityEstimates (bool) - whether to enable probability estimates (default false)
```
$classifier = new SVC(Kernel::LINEAR, $cost = 1000);
$classifier = new SVC(Kernel::RBF, $cost = 1000, $degree = 3, $gamma = 6);
```
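The roles of `$degree`, `$gamma` and `$coef0` are easier to see next to the kernel formulas documented by libsvm. The snippet below is only an illustrative sketch in plain PHP: the function names are hypothetical, they are not part of php-ml, and the library delegates the actual computation to libsvm.

```php
// Hypothetical helpers that mirror the libsvm kernel formulas.
// They are NOT part of php-ml; they only show what each parameter does.

function dot(array $u, array $v): float
{
    $sum = 0.0;
    foreach ($u as $i => $value) {
        $sum += $value * $v[$i];
    }

    return $sum;
}

// Kernel::LINEAR: u'v
function linearKernel(array $u, array $v): float
{
    return dot($u, $v);
}

// Kernel::POLYNOMIAL: (gamma * u'v + coef0) ^ degree
function polynomialKernel(array $u, array $v, float $gamma, float $coef0, int $degree): float
{
    return pow($gamma * dot($u, $v) + $coef0, $degree);
}

// Kernel::RBF: exp(-gamma * |u - v|^2)
function rbfKernel(array $u, array $v, float $gamma): float
{
    $squaredDistance = 0.0;
    foreach ($u as $i => $value) {
        $squaredDistance += ($value - $v[$i]) ** 2;
    }

    return exp(-$gamma * $squaredDistance);
}

// Kernel::SIGMOID: tanh(gamma * u'v + coef0)
function sigmoidKernel(array $u, array $v, float $gamma, float $coef0): float
{
    return tanh($gamma * dot($u, $v) + $coef0);
}

$u = [1, 3];
$v = [3, 1];
$gamma = 1 / count($u); // the 1/features fallback used when gamma is null

echo linearKernel($u, $v), PHP_EOL;                     // 6
echo polynomialKernel($u, $v, $gamma, 0.0, 3), PHP_EOL; // 27
```

Note that `$degree` only affects the polynomial kernel and `$coef0` only the polynomial and sigmoid kernels, which matches the parameter list above.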
### Train
To train a classifier, simply provide training samples and labels (as `array`). Example:
```
use Phpml\Classification\SVC;
use Phpml\SupportVectorMachine\Kernel;
$samples = [[1, 3], [1, 4], [2, 4], [3, 1], [4, 1], [4, 2]];
$labels = ['a', 'a', 'a', 'b', 'b', 'b'];
$classifier = new SVC(Kernel::LINEAR, $cost = 1000);
$classifier->train($samples, $labels);
```
### Predict
To predict a sample's label, use the `predict` method. You can provide one sample or an array of samples:
```
$classifier->predict([3, 2]);
// return 'b'
$classifier->predict([[3, 2], [1, 5]]);
// return ['b', 'a']
```

@@ -0,0 +1,50 @@
# Token Count Vectorizer
Transform a collection of text samples to a vector of token counts.
### Constructor Parameters
* $tokenizer (Tokenizer) - tokenizer object (see below)
* $minDF (float) - ignore tokens that have a sample frequency strictly lower than the given threshold. This value is also called the cut-off in the literature. (default 0)
```
use Phpml\FeatureExtraction\TokenCountVectorizer;
use Phpml\Tokenization\WhitespaceTokenizer;
$vectorizer = new TokenCountVectorizer(new WhitespaceTokenizer());
```
### Transformation
To transform a collection of text samples, use the `transform` method. Example:
```
$samples = [
    'Lorem ipsum dolor sit amet dolor',
    'Mauris placerat ipsum dolor',
    'Mauris diam eros fringilla diam',
];
$vectorizer = new TokenCountVectorizer(new WhitespaceTokenizer());
$vectorizer->transform($samples);
// return $vector = [
//     [0 => 1, 1 => 1, 2 => 2, 3 => 1, 4 => 1],
//     [5 => 1, 6 => 1, 1 => 1, 2 => 1],
//     [5 => 1, 7 => 2, 8 => 1, 9 => 1],
// ];
```
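To make the output format concrete, here is a minimal plain-PHP sketch of how such count vectors can be built. It assumes tokens are indexed in order of first appearance, which is consistent with the example output above; `countTokens` is a hypothetical helper and not php-ml's implementation.

```php
// Hypothetical helper, NOT php-ml code: builds a vocabulary in order of
// first appearance and counts how often each token occurs per sample.
function countTokens(array $samples): array
{
    $vocabulary = []; // token => index
    $vectors = [];

    foreach ($samples as $sample) {
        $counts = [];
        // Split on runs of whitespace and drop empty pieces.
        $tokens = preg_split('/\s+/', $sample, -1, PREG_SPLIT_NO_EMPTY);

        foreach ($tokens as $token) {
            if (!isset($vocabulary[$token])) {
                // A new token gets the next free index.
                $vocabulary[$token] = count($vocabulary);
            }
            $index = $vocabulary[$token];
            $counts[$index] = ($counts[$index] ?? 0) + 1;
        }

        $vectors[] = $counts;
    }

    return [$vectors, array_keys($vocabulary)];
}

list($vectors, $vocabulary) = countTokens([
    'Lorem ipsum dolor sit amet dolor',
    'Mauris placerat ipsum dolor',
]);
// $vectors[0] is [0 => 1, 1 => 1, 2 => 2, 3 => 1, 4 => 1]
// $vectors[1] is [5 => 1, 6 => 1, 1 => 1, 2 => 1]
```

Each vector is sparse: only tokens that occur in a sample get a key, so a missing index means a count of zero.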
### Vocabulary
You can extract the vocabulary using the `getVocabulary()` method. Example:
```
$vectorizer->getVocabulary();
// return $vocabulary = ['Lorem', 'ipsum', 'dolor', 'sit', 'amet', 'Mauris', 'placerat', 'diam', 'eros', 'fringilla'];
```
### Tokenizers
* WhitespaceTokenizer - select tokens by whitespace.
* WordTokenizer - select tokens of 2 or more alphanumeric characters (punctuation is completely ignored and always treated as a token separator).
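The difference between the two tokenizers shows up as soon as the text contains punctuation or one-letter words. The sketch below approximates both behaviours with regular expressions; the function names and the exact patterns are assumptions made for illustration and may differ from what the library uses.

```php
// Hypothetical stand-ins for the two tokenizers; the patterns are assumptions.

// Splits on whitespace only, so punctuation stays attached to the token.
function tokenizeByWhitespace(string $text): array
{
    return preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY);
}

// Keeps runs of two or more word characters; everything else separates tokens.
// Note that \w also matches the underscore.
function tokenizeWords(string $text): array
{
    preg_match_all('/\w\w+/u', $text, $matches);

    return $matches[0];
}

$text = 'Lorem ipsum, dolor sit a amet.';

tokenizeByWhitespace($text);
// ['Lorem', 'ipsum,', 'dolor', 'sit', 'a', 'amet.']

tokenizeWords($text);
// ['Lorem', 'ipsum', 'dolor', 'sit', 'amet']
```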

@@ -0,0 +1,44 @@
# Support Vector Regression
Class implementing Epsilon-Support Vector Regression based on libsvm.
### Constructor Parameters
* $kernel (int) - kernel type to be used in the algorithm (default Kernel::LINEAR)
* $degree (int) - degree of the Kernel::POLYNOMIAL function (default 3)
* $epsilon (float) - epsilon in loss function of epsilon-SVR (default 0.1)
* $cost (float) - parameter C of epsilon-SVR (default 1.0)
* $gamma (float) - kernel coefficient for Kernel::RBF, Kernel::POLYNOMIAL and Kernel::SIGMOID. If gamma is null, 1/(number of features) is used instead.
* $coef0 (float) - independent term in kernel function. It is only significant in Kernel::POLYNOMIAL and Kernel::SIGMOID (default 0.0)
* $tolerance (float) - tolerance of termination criterion (default 0.001)
* $cacheSize (int) - cache memory size in MB (default 100)
* $shrinking (bool) - whether to use the shrinking heuristics (default true)
```
$regression = new SVR(Kernel::LINEAR);
$regression = new SVR(Kernel::LINEAR, $degree = 3, $epsilon = 10.0);
```
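The `$epsilon` parameter sets the width of the tube around the regression function inside which errors are not penalised. As a rough illustration, the standard epsilon-insensitive loss can be written in plain PHP as follows; `epsilonInsensitiveLoss` is a hypothetical helper for this page and is not exposed by php-ml.

```php
// Hypothetical helper, NOT part of php-ml: the textbook epsilon-insensitive
// loss, max(0, |target - prediction| - epsilon).
function epsilonInsensitiveLoss(float $target, float $prediction, float $epsilon = 0.1): float
{
    return max(0.0, abs($target - $prediction) - $epsilon);
}

echo epsilonInsensitiveLoss(4.0, 4.03), PHP_EOL; // 0, the error is inside the tube
echo epsilonInsensitiveLoss(4.0, 4.5), PHP_EOL;  // 0.4, only the part beyond epsilon counts
```

A larger `$epsilon` tolerates bigger deviations without penalty, which typically yields a smoother model with fewer support vectors.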
### Train
To train a model, simply provide training samples and target values (as `array`). Example:
```
use Phpml\Regression\SVR;
use Phpml\SupportVectorMachine\Kernel;
$samples = [[60], [61], [62], [63], [65]];
$targets = [3.1, 3.6, 3.8, 4, 4.1];
$regression = new SVR(Kernel::LINEAR);
$regression->train($samples, $targets);
```
### Predict
To predict a sample's target value, use the `predict` method. You can provide one sample or an array of samples:
```
$regression->predict([64]);
// return 4.03
```

@@ -3,15 +3,19 @@ pages:
  - Home: index.md
  - Machine Learning:
    - Classification:
      - SVC: machine-learning/classification/svc.md
      - KNearestNeighbors: machine-learning/classification/k-nearest-neighbors.md
      - NaiveBayes: machine-learning/classification/naive-bayes.md
    - Regression:
      - LeastSquares: machine-learning/regression/least-squares.md
      - SVR: machine-learning/regression/svr.md
    - Clustering:
      - KMeans: machine-learning/clustering/k-means.md
      - DBSCAN: machine-learning/clustering/dbscan.md
    - Cross Validation:
      - RandomSplit: machine-learning/cross-validation/random-split.md
    - Feature Extraction:
      - Token Count Vectorizer: machine-learning/feature-extraction/token-count-vectorizer.md
    - Datasets:
      - Array Dataset: machine-learning/datasets/array-dataset.md
      - CSV Dataset: machine-learning/datasets/csv-dataset.md