update docs
parent 46197eba7b
commit 365a9baeca
@@ -37,15 +37,19 @@ composer require php-ai/php-ml

## Features

* Classification
    * [SVC](http://php-ml.readthedocs.io/en/latest/machine-learning/classification/svc/)
    * [k-Nearest Neighbors](http://php-ml.readthedocs.io/en/latest/machine-learning/classification/k-nearest-neighbors/)
    * [Naive Bayes](http://php-ml.readthedocs.io/en/latest/machine-learning/classification/naive-bayes/)
* Regression
    * [Least Squares](http://php-ml.readthedocs.io/en/latest/machine-learning/regression/least-squares/)
    * [SVR](http://php-ml.readthedocs.io/en/latest/machine-learning/regression/svr/)
* Clustering
    * [k-Means](http://php-ml.readthedocs.io/en/latest/machine-learning/clustering/k-means)
    * [DBSCAN](http://php-ml.readthedocs.io/en/latest/machine-learning/clustering/dbscan)
* Cross Validation
    * [Random Split](http://php-ml.readthedocs.io/en/latest/machine-learning/cross-validation/random-split)
* Feature Extraction
    * [Token Count Vectorizer](http://php-ml.readthedocs.io/en/latest/machine-learning/feature-extraction/token-count-vectorizer)
* Datasets
    * [CSV](http://php-ml.readthedocs.io/en/latest/machine-learning/datasets/csv-dataset)
    * Ready to use:
@@ -1,4 +1,4 @@
# PHP Machine Learning library
# PHP-ML - Machine Learning library for PHP

[![Build Status](https://scrutinizer-ci.com/g/php-ai/php-ml/badges/build.png?b=develop)](https://scrutinizer-ci.com/g/php-ai/php-ml/build-status/develop)
[![Documentation Status](https://readthedocs.org/projects/php-ml/badge/?version=develop)](http://php-ml.readthedocs.org/en/develop/?badge=develop)
@@ -37,15 +37,19 @@ composer require php-ai/php-ml

## Features

* Classification
    * [SVC](http://php-ml.readthedocs.io/en/latest/machine-learning/classification/svc/)
    * [k-Nearest Neighbors](http://php-ml.readthedocs.io/en/latest/machine-learning/classification/k-nearest-neighbors/)
    * [Naive Bayes](http://php-ml.readthedocs.io/en/latest/machine-learning/classification/naive-bayes/)
* Regression
    * [Least Squares](http://php-ml.readthedocs.io/en/latest/machine-learning/regression/least-squares/)
    * [SVR](http://php-ml.readthedocs.io/en/latest/machine-learning/regression/svr/)
* Clustering
    * [k-Means](http://php-ml.readthedocs.io/en/latest/machine-learning/clustering/k-means)
    * [DBSCAN](http://php-ml.readthedocs.io/en/latest/machine-learning/clustering/dbscan)
* Cross Validation
    * [Random Split](http://php-ml.readthedocs.io/en/latest/machine-learning/cross-validation/random-split)
* Feature Extraction
    * [Token Count Vectorizer](http://php-ml.readthedocs.io/en/latest/machine-learning/feature-extraction/token-count-vectorizer)
* Datasets
    * [CSV](http://php-ml.readthedocs.io/en/latest/machine-learning/datasets/csv-dataset)
    * Ready to use:
@@ -0,0 +1,47 @@
# Support Vector Classification

Classifier implementing a Support Vector Machine, based on libsvm.

### Constructor Parameters

* $kernel (int) - kernel type to be used in the algorithm (default Kernel::LINEAR)
* $cost (float) - parameter C of C-SVC (default 1.0)
* $degree (int) - degree of the `Kernel::POLYNOMIAL` function (default 3)
* $gamma (float) - kernel coefficient for `Kernel::RBF`, `Kernel::POLYNOMIAL` and `Kernel::SIGMOID`. If gamma is `null`, 1/features is used instead.
* $coef0 (float) - independent term in the kernel function. It is only significant in `Kernel::POLYNOMIAL` and `Kernel::SIGMOID` (default 0.0)
* $tolerance (float) - tolerance of the termination criterion (default 0.001)
* $cacheSize (int) - cache memory size in MB (default 100)
* $shrinking (bool) - whether to use the shrinking heuristics (default true)
* $probabilityEstimates (bool) - whether to enable probability estimates (default false)

```
$classifier = new SVC(Kernel::LINEAR, $cost = 1000);
$classifier = new SVC(Kernel::RBF, $cost = 1000, $degree = 3, $gamma = 6);
```
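
To make the roles of `$degree`, `$gamma` and `$coef0` more concrete, here is a minimal sketch of the kernel formulas that libsvm documents. The helper functions (`dot`, `linearKernel`, `polynomialKernel`, `rbfKernel`, `sigmoidKernel`) are hypothetical names written for illustration; they are not part of php-ml, which delegates the actual computation to libsvm.

```php
<?php
// Hypothetical helpers (not php-ml API) that evaluate the libsvm kernel formulas.

// Dot product of two equally sized numeric vectors.
function dot(array $u, array $v): float
{
    $sum = 0.0;
    foreach ($u as $i => $value) {
        $sum += $value * $v[$i];
    }
    return $sum;
}

// Linear kernel: u'v
function linearKernel(array $u, array $v): float
{
    return dot($u, $v);
}

// Polynomial kernel: (gamma * u'v + coef0)^degree
function polynomialKernel(array $u, array $v, float $gamma, float $coef0, int $degree): float
{
    return pow($gamma * dot($u, $v) + $coef0, $degree);
}

// RBF kernel: exp(-gamma * |u - v|^2)
function rbfKernel(array $u, array $v, float $gamma): float
{
    $squaredDistance = 0.0;
    foreach ($u as $i => $value) {
        $squaredDistance += ($value - $v[$i]) ** 2;
    }
    return exp(-$gamma * $squaredDistance);
}

// Sigmoid kernel: tanh(gamma * u'v + coef0)
function sigmoidKernel(array $u, array $v, float $gamma, float $coef0): float
{
    return tanh($gamma * dot($u, $v) + $coef0);
}

$u = [1, 3];
$v = [4, 1];
$gamma = 1 / count($u); // the 1/features fallback used when $gamma is null

printf("linear:     %.4f\n", linearKernel($u, $v));
printf("polynomial: %.4f\n", polynomialKernel($u, $v, $gamma, 0.0, 3));
printf("rbf:        %.6f\n", rbfKernel($u, $v, $gamma));
printf("sigmoid:    %.6f\n", sigmoidKernel($u, $v, $gamma, 0.0));
```

Note that `$degree` only appears in the polynomial kernel and `$coef0` only in the polynomial and sigmoid kernels, which matches the parameter descriptions above.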

### Train

To train a classifier, provide training samples and their labels (both as `array`). Example:

```
use Phpml\Classification\SVC;
use Phpml\SupportVectorMachine\Kernel;

$samples = [[1, 3], [1, 4], [2, 4], [3, 1], [4, 1], [4, 2]];
$labels = ['a', 'a', 'a', 'b', 'b', 'b'];

$classifier = new SVC(Kernel::LINEAR, $cost = 1000);
$classifier->train($samples, $labels);
```

### Predict

To predict the label of a sample, use the `predict` method. You can provide a single sample or an array of samples:

```
$classifier->predict([3, 2]);
// returns 'b'

$classifier->predict([[3, 2], [1, 5]]);
// returns ['b', 'a']
```
@@ -0,0 +1,50 @@
# Token Count Vectorizer

Transforms a collection of text samples into vectors of token counts.

### Constructor Parameters

* $tokenizer (Tokenizer) - tokenizer object (see below)
* $minDF (float) - ignore tokens whose sample frequency is strictly lower than the given threshold. This value is also called the cut-off in the literature. (default 0)

```
use Phpml\FeatureExtraction\TokenCountVectorizer;
use Phpml\Tokenization\WhitespaceTokenizer;

$vectorizer = new TokenCountVectorizer(new WhitespaceTokenizer());
```
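
The `$minDF` threshold can be read as a filter on the share of samples that contain a token. The following is a minimal sketch under that reading; `tokensAboveMinDF` is a hypothetical helper written for illustration, not php-ml's implementation, and it assumes plain whitespace tokenization.

```php
<?php
// Hypothetical helper (not php-ml API): returns the tokens whose sample
// frequency, i.e. the share of samples containing the token, is not
// strictly lower than $minDF.
function tokensAboveMinDF(array $samples, float $minDF): array
{
    $sampleCounts = []; // token => number of samples containing it
    foreach ($samples as $sample) {
        $tokens = preg_split('/\s+/', $sample, -1, PREG_SPLIT_NO_EMPTY);
        // array_unique ensures a token is counted once per sample.
        foreach (array_unique($tokens) as $token) {
            $sampleCounts[$token] = ($sampleCounts[$token] ?? 0) + 1;
        }
    }

    $kept = [];
    foreach ($sampleCounts as $token => $count) {
        if ($count / count($samples) >= $minDF) {
            $kept[] = (string) $token;
        }
    }
    return $kept;
}

$samples = [
    'Lorem ipsum dolor sit amet dolor',
    'Mauris placerat ipsum dolor',
    'Mauris diam eros fringilla diam',
];

// Tokens that occur in at least half of the samples.
print_r(tokensAboveMinDF($samples, 0.5));
```

With the default threshold of 0 no token is filtered out, since no frequency can be strictly lower than 0.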

### Transformation

To transform a collection of text samples, use the `transform` method. Example:

```
$samples = [
    'Lorem ipsum dolor sit amet dolor',
    'Mauris placerat ipsum dolor',
    'Mauris diam eros fringilla diam',
];

$vectorizer = new TokenCountVectorizer(new WhitespaceTokenizer());
$vectorizer->transform($samples);
// returns $vector = [
//    [0 => 1, 1 => 1, 2 => 2, 3 => 1, 4 => 1],
//    [5 => 1, 6 => 1, 1 => 1, 2 => 1],
//    [5 => 1, 7 => 2, 8 => 1, 9 => 1],
// ];
```
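
The indices in the example output are positions in a vocabulary that grows in order of first appearance. The sketch below reproduces that counting scheme in plain PHP; `countTokens` is a hypothetical function written for illustration and is not how php-ml is necessarily implemented.

```php
<?php
// Hypothetical helper (not php-ml API): counts whitespace-separated tokens
// per sample, indexing each token by its order of first appearance.
function countTokens(array $samples): array
{
    $vocabulary = []; // token => index
    $vectors = [];

    foreach ($samples as $sample) {
        $counts = [];
        foreach (preg_split('/\s+/', $sample, -1, PREG_SPLIT_NO_EMPTY) as $token) {
            // Assign the next free index to tokens seen for the first time.
            if (!isset($vocabulary[$token])) {
                $vocabulary[$token] = count($vocabulary);
            }
            $index = $vocabulary[$token];
            $counts[$index] = ($counts[$index] ?? 0) + 1;
        }
        $vectors[] = $counts;
    }

    return [$vectors, array_keys($vocabulary)];
}

$samples = [
    'Lorem ipsum dolor sit amet dolor',
    'Mauris placerat ipsum dolor',
    'Mauris diam eros fringilla diam',
];

[$vectors, $vocabulary] = countTokens($samples);
print_r($vectors);
print_r($vocabulary);
```

For these three samples, "dolor" receives index 2 and is counted twice in the first vector, and "Mauris" receives index 5 because five distinct tokens precede it.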

### Vocabulary

You can extract the vocabulary using the `getVocabulary()` method. Example:

```
$vectorizer->getVocabulary();
// returns $vocabulary = ['Lorem', 'ipsum', 'dolor', 'sit', 'amet', 'Mauris', 'placerat', 'diam', 'eros', 'fringilla'];
```

### Tokenizers

* WhitespaceTokenizer - splits text into tokens on whitespace.
* WordTokenizer - selects tokens of 2 or more alphanumeric characters (punctuation is completely ignored and always treated as a token separator).
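
The difference between the two tokenizers is easiest to see on text that contains punctuation. The functions below are a rough sketch of the two behaviours described above using PHP's PCRE functions; `whitespaceTokens` and `wordTokens` are hypothetical names, and the regular expressions are assumptions rather than the patterns php-ml itself uses.

```php
<?php
// Hypothetical sketch (not php-ml API) of whitespace-based tokenization:
// punctuation stays attached to the neighbouring word.
function whitespaceTokens(string $text): array
{
    return preg_split('/\s+/u', $text, -1, PREG_SPLIT_NO_EMPTY);
}

// Hypothetical sketch of word-based tokenization: runs of 2 or more word
// characters; anything else acts as a separator.
// Note: \w also matches the underscore, which is not strictly alphanumeric.
function wordTokens(string $text): array
{
    preg_match_all('/\w\w+/u', $text, $matches);
    return $matches[0];
}

$text = 'Hello, world! It is a PHP-ML demo.';

print_r(whitespaceTokens($text));
// ['Hello,', 'world!', 'It', 'is', 'a', 'PHP-ML', 'demo.']

print_r(wordTokens($text));
// ['Hello', 'world', 'It', 'is', 'PHP', 'ML', 'demo']
```

In this sketch the single-character token "a" disappears under word tokenization, and "PHP-ML" is split into two tokens because the hyphen is treated as a separator.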
@@ -0,0 +1,44 @@
# Support Vector Regression

Class implementing Epsilon-Support Vector Regression, based on libsvm.

### Constructor Parameters

* $kernel (int) - kernel type to be used in the algorithm (default Kernel::LINEAR)
* $degree (int) - degree of the `Kernel::POLYNOMIAL` function (default 3)
* $epsilon (float) - epsilon in the loss function of epsilon-SVR (default 0.1)
* $cost (float) - parameter C of epsilon-SVR (default 1.0)
* $gamma (float) - kernel coefficient for `Kernel::RBF`, `Kernel::POLYNOMIAL` and `Kernel::SIGMOID`. If gamma is `null`, 1/features is used instead.
* $coef0 (float) - independent term in the kernel function. It is only significant in `Kernel::POLYNOMIAL` and `Kernel::SIGMOID` (default 0.0)
* $tolerance (float) - tolerance of the termination criterion (default 0.001)
* $cacheSize (int) - cache memory size in MB (default 100)
* $shrinking (bool) - whether to use the shrinking heuristics (default true)

```
$regression = new SVR(Kernel::LINEAR);
$regression = new SVR(Kernel::LINEAR, $degree = 3, $epsilon = 10.0);
```
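
The `$epsilon` parameter defines a tolerance band around each target inside which errors are not penalised. A minimal sketch of the standard epsilon-insensitive loss follows; `epsilonInsensitiveLoss` is a hypothetical helper for illustration only, since the actual optimisation happens inside libsvm.

```php
<?php
// Hypothetical helper (not php-ml API): the epsilon-insensitive loss
// max(0, |target - prediction| - epsilon) used by epsilon-SVR.
function epsilonInsensitiveLoss(float $target, float $prediction, float $epsilon = 0.1): float
{
    return max(0.0, abs($target - $prediction) - $epsilon);
}

// An error of 0.07 lies inside the 0.1 band, so it costs nothing.
var_dump(epsilonInsensitiveLoss(4.1, 4.03));

// An error of 0.3 exceeds the band; only the excess (about 0.2) is penalised.
var_dump(epsilonInsensitiveLoss(4.1, 3.8));
```

A larger `$epsilon` therefore tolerates larger deviations before they contribute to the loss, which typically yields a flatter, less closely fitted model.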

### Train

To train a model, provide training samples and target values (both as `array`). Example:

```
use Phpml\Regression\SVR;
use Phpml\SupportVectorMachine\Kernel;

$samples = [[60], [61], [62], [63], [65]];
$targets = [3.1, 3.6, 3.8, 4, 4.1];

$regression = new SVR(Kernel::LINEAR);
$regression->train($samples, $targets);
```

### Predict

To predict the target value of a sample, use the `predict` method. You can provide a single sample or an array of samples:

```
$regression->predict([64]);
// returns 4.03
```
@@ -3,15 +3,19 @@ pages:
  - Home: index.md
  - Machine Learning:
    - Classification:
      - SVC: machine-learning/classification/svc.md
      - KNearestNeighbors: machine-learning/classification/k-nearest-neighbors.md
      - NaiveBayes: machine-learning/classification/naive-bayes.md
    - Regression:
      - LeastSquares: machine-learning/regression/least-squares.md
      - SVR: machine-learning/regression/svr.md
    - Clustering:
      - KMeans: machine-learning/clustering/k-means.md
      - DBSCAN: machine-learning/clustering/dbscan.md
    - Cross Validation:
      - RandomSplit: machine-learning/cross-validation/random-split.md
    - Feature Extraction:
      - Token Count Vectorizer: machine-learning/feature-extraction/token-count-vectorizer.md
    - Datasets:
      - Array Dataset: machine-learning/datasets/array-dataset.md
      - CSV Dataset: machine-learning/datasets/csv-dataset.md