update docs

2016-05-07 23:53:42 +02:00 · 2016-05-07 23:53:42 +02:00 · 365a9baeca
parent 46197eba7b
commit 365a9baeca
6 changed files with 154 additions and 1 deletions
--- a/README.md
+++ b/README.md
@ -37,15 +37,19 @@ composer require php-ai/php-ml
 ## Features

 * Classification
+    * [SVC](http://php-ml.readthedocs.io/en/latest/machine-learning/classification/svc/)
    * [k-Nearest Neighbors](http://php-ml.readthedocs.io/en/latest/machine-learning/classification/k-nearest-neighbors/)
    * [Naive Bayes](http://php-ml.readthedocs.io/en/latest/machine-learning/classification/naive-bayes/)
 * Regression
    * [Least Squares](http://php-ml.readthedocs.io/en/latest/machine-learning/regression/least-squares/)
+    * [SVR](http://php-ml.readthedocs.io/en/latest/machine-learning/regression/svr/)
 * Clustering
    * [k-Means](http://php-ml.readthedocs.io/en/latest/machine-learning/clustering/k-means)
    * [DBSCAN](http://php-ml.readthedocs.io/en/latest/machine-learning/clustering/dbscan)
 * Cross Validation
    * [Random Split](http://php-ml.readthedocs.io/en/latest/machine-learning/cross-validation/random-split)
+* Feature Extraction
+    * [Token Count Vectorizer](http://php-ml.readthedocs.io/en/latest/machine-learning/feature-extraction/token-count-vectorizer)
 * Datasets
    * [CSV](http://php-ml.readthedocs.io/en/latest/machine-learning/datasets/csv-dataset)
    * Ready to use:
--- a/docs/index.md
+++ b/docs/index.md
@ -1,4 +1,4 @@
-# PHP Machine Learning library
+# PHP-ML - Machine Learning library for PHP

 [![Build Status](https://scrutinizer-ci.com/g/php-ai/php-ml/badges/build.png?b=develop)](https://scrutinizer-ci.com/g/php-ai/php-ml/build-status/develop)
 [![Documentation Status](https://readthedocs.org/projects/php-ml/badge/?version=develop)](http://php-ml.readthedocs.org/en/develop/?badge=develop)
@ -37,15 +37,19 @@ composer require php-ai/php-ml
 ## Features

 * Classification
+    * [SVC](http://php-ml.readthedocs.io/en/latest/machine-learning/classification/svc/)
    * [k-Nearest Neighbors](http://php-ml.readthedocs.io/en/latest/machine-learning/classification/k-nearest-neighbors/)
    * [Naive Bayes](http://php-ml.readthedocs.io/en/latest/machine-learning/classification/naive-bayes/)
 * Regression
    * [Least Squares](http://php-ml.readthedocs.io/en/latest/machine-learning/regression/least-squares/)
+    * [SVR](http://php-ml.readthedocs.io/en/latest/machine-learning/regression/svr/)
 * Clustering
    * [k-Means](http://php-ml.readthedocs.io/en/latest/machine-learning/clustering/k-means)
    * [DBSCAN](http://php-ml.readthedocs.io/en/latest/machine-learning/clustering/dbscan)
 * Cross Validation
    * [Random Split](http://php-ml.readthedocs.io/en/latest/machine-learning/cross-validation/random-split)
+* Feature Extraction
+    * [Token Count Vectorizer](http://php-ml.readthedocs.io/en/latest/machine-learning/feature-extraction/token-count-vectorizer)
 * Datasets
    * [CSV](http://php-ml.readthedocs.io/en/latest/machine-learning/datasets/csv-dataset)
    * Ready to use:
--- a/docs/machine-learning/classification/svc.md
+++ b/docs/machine-learning/classification/svc.md
@ -0,0 +1,47 @@
+# Support Vector Classification
+
+A classifier implementing a Support Vector Machine, based on libsvm.
+
+### Constructor Parameters
+
+* $kernel (int) - kernel type to be used in the algorithm (default Kernel::LINEAR)
+* $cost (float) - parameter C of C-SVC (default 1.0)
+* $degree (int) - degree of the Kernel::POLYNOMIAL function (default 3)
+* $gamma (float) - kernel coefficient for Kernel::RBF, Kernel::POLYNOMIAL and Kernel::SIGMOID. If gamma is null, 1 / (number of features) is used instead.
+* $coef0 (float) - independent term in the kernel function. It is only significant for Kernel::POLYNOMIAL and Kernel::SIGMOID (default 0.0)
+* $tolerance (float) - tolerance of termination criterion (default 0.001)
+* $cacheSize (int) - cache memory size in MB (default 100)
+* $shrinking (bool) - whether to use the shrinking heuristics (default true)
+* $probabilityEstimates (bool) - whether to enable probability estimates (default false)
+
+```
+$classifier = new SVC(Kernel::LINEAR, $cost = 1000);
+$classifier = new SVC(Kernel::RBF, $cost = 1000, $degree = 3, $gamma = 6);
+```
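+
+The kernel parameters above correspond to the kernel functions documented by libsvm. The following is a minimal sketch in plain PHP, not php-ml code: the helper names (`dot`, `defaultGamma`, `polynomialKernel`, `rbfKernel`, `sigmoidKernel`) are hypothetical and only illustrate how `$degree`, `$gamma` and `$coef0` enter each formula.
+
+```php
+<?php
+
+// Illustrative only: these helpers are not part of php-ml.
+
+// Dot product of two samples with the same number of features.
+function dot(array $u, array $v): float
+{
+    $sum = 0.0;
+    foreach ($u as $i => $value) {
+        $sum += $value * $v[$i];
+    }
+
+    return $sum;
+}
+
+// Fallback used when gamma is null: 1 / number of features.
+function defaultGamma(array $sample): float
+{
+    return 1.0 / count($sample);
+}
+
+// Kernel::POLYNOMIAL: (gamma * u.v + coef0) ^ degree
+function polynomialKernel(array $u, array $v, float $gamma, float $coef0 = 0.0, int $degree = 3): float
+{
+    return pow($gamma * dot($u, $v) + $coef0, $degree);
+}
+
+// Kernel::RBF: exp(-gamma * |u - v|^2)
+function rbfKernel(array $u, array $v, float $gamma): float
+{
+    $squaredDistance = 0.0;
+    foreach ($u as $i => $value) {
+        $squaredDistance += ($value - $v[$i]) ** 2;
+    }
+
+    return exp(-$gamma * $squaredDistance);
+}
+
+// Kernel::SIGMOID: tanh(gamma * u.v + coef0)
+function sigmoidKernel(array $u, array $v, float $gamma, float $coef0 = 0.0): float
+{
+    return tanh($gamma * dot($u, $v) + $coef0);
+}
+
+$u = [1, 3];
+$v = [3, 1];
+$gamma = defaultGamma($u); // 0.5 for two features
+
+echo polynomialKernel($u, $v, $gamma), PHP_EOL; // (0.5 * 6 + 0)^3 = 27
+echo rbfKernel($u, $v, $gamma), PHP_EOL;        // exp(-0.5 * 8)
+```
+
+Kernel::LINEAR is the plain dot product, which is why `$degree`, `$gamma` and `$coef0` have no effect on it.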
+
+### Train
+
+To train a classifier, provide training samples and labels (as `array`). Example:
+
+```
+use Phpml\Classification\SVC;
+use Phpml\SupportVectorMachine\Kernel;
+
+$samples = [[1, 3], [1, 4], [2, 4], [3, 1], [4, 1], [4, 2]];
+$labels = ['a', 'a', 'a', 'b', 'b', 'b'];
+
+$classifier = new SVC(Kernel::LINEAR, $cost = 1000);
+$classifier->train($samples, $labels);
+```
+
+### Predict
+
+To predict the label of a sample, use the `predict` method. You can provide a single sample or an array of samples:
+
+```
+$classifier->predict([3, 2]);
+// return 'b'
+
+$classifier->predict([[3, 2], [1, 5]]);
+// return ['b', 'a']
+```
--- a/docs/machine-learning/feature-extraction/token-count-vectorizer.md
+++ b/docs/machine-learning/feature-extraction/token-count-vectorizer.md
@ -0,0 +1,50 @@
+# Token Count Vectorizer
+
+Transforms a collection of text samples into vectors of token counts.
+
+### Constructor Parameters
+
+* $tokenizer (Tokenizer) - tokenizer object (see below)
+* $minDF (float) - ignore tokens whose sample frequency is strictly lower than the given threshold. This value is also called the cut-off in the literature (default 0)
+
+```
+use Phpml\FeatureExtraction\TokenCountVectorizer;
+use Phpml\Tokenization\WhitespaceTokenizer;
+
+$vectorizer = new TokenCountVectorizer(new WhitespaceTokenizer());
+```
+
+### Transformation
+
+To transform a collection of text samples, use the `transform` method. Example:
+
+```
+$samples = [
+    'Lorem ipsum dolor sit amet dolor',
+    'Mauris placerat ipsum dolor',
+    'Mauris diam eros fringilla diam',
+];
+
+$vectorizer = new TokenCountVectorizer(new WhitespaceTokenizer());
+$vectorizer->transform($samples);
+// return $vector = [
+//    [0 => 1, 1 => 1, 2 => 2, 3 => 1, 4 => 1],
+//    [5 => 1, 6 => 1, 1 => 1, 2 => 1],
+//    [5 => 1, 7 => 2, 8 => 1, 9 => 1],
+//];
+        
+```
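+
+The indices in the result above are consistent with assigning each new token the next free column in first-seen order and storing only non-zero counts. A minimal sketch of that idea in plain PHP follows; `countTokens` is a hypothetical helper written for illustration and is not how php-ml is necessarily implemented.
+
+```php
+<?php
+
+// Illustrative only: this helper is not part of php-ml.
+// Returns [vectors, vocabulary] for whitespace-separated text samples.
+function countTokens(array $samples): array
+{
+    $vocabulary = []; // token => column index, in first-seen order
+    $vectors = [];
+
+    foreach ($samples as $sample) {
+        $counts = [];
+        $tokens = preg_split('/\s+/', $sample, -1, PREG_SPLIT_NO_EMPTY);
+        foreach ($tokens as $token) {
+            if (!isset($vocabulary[$token])) {
+                // A new token gets the next free column index.
+                $vocabulary[$token] = count($vocabulary);
+            }
+            $index = $vocabulary[$token];
+            $counts[$index] = ($counts[$index] ?? 0) + 1;
+        }
+        $vectors[] = $counts;
+    }
+
+    return [$vectors, array_keys($vocabulary)];
+}
+
+list($vectors, $vocabulary) = countTokens([
+    'Lorem ipsum dolor sit amet dolor',
+    'Mauris placerat ipsum dolor',
+]);
+
+print_r($vectors[1]); // [5 => 1, 6 => 1, 1 => 1, 2 => 1]
+```
+
+Each vector is sparse: a column that does not occur in a sample is simply absent rather than stored as 0.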
+
+### Vocabulary
+
+You can retrieve the vocabulary with the `getVocabulary()` method. Example:
+
+```
+$vectorizer->getVocabulary();
+// return $vocabulary = ['Lorem', 'ipsum', 'dolor', 'sit', 'amet', 'Mauris', 'placerat', 'diam', 'eros', 'fringilla'];
+```
+
+### Tokenizers
+
+* WhitespaceTokenizer - select tokens by whitespace.
+* WordTokenizer - select tokens of 2 or more alphanumeric characters (punctuation is completely ignored and always treated as a token separator).
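+
+The WordTokenizer rule can be approximated with a single regular expression. This is a hedged sketch, not the library's code: `wordTokens` is a hypothetical function, and `\w` also matches underscores, so it is slightly broader than strictly alphanumeric.
+
+```php
+<?php
+
+// Illustrative only: a hypothetical stand-in, not php-ml's WordTokenizer.
+function wordTokens(string $text): array
+{
+    // \w\w+ matches runs of at least two word characters;
+    // everything else (spaces, punctuation) acts as a separator.
+    preg_match_all('/\w\w+/u', $text, $matches);
+
+    return $matches[0];
+}
+
+print_r(wordTokens('Lorem ipsum, dolor: a sit-amet.'));
+// Lorem, ipsum, dolor, sit, amet ("a" is too short, punctuation is dropped)
+```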
--- a/docs/machine-learning/regression/svr.md
+++ b/docs/machine-learning/regression/svr.md
@ -0,0 +1,44 @@
+# Support Vector Regression
+
+Class implementing Epsilon-Support Vector Regression based on libsvm.
+
+### Constructor Parameters
+
+* $kernel (int) - kernel type to be used in the algorithm (default Kernel::LINEAR)
+* $degree (int) - degree of the Kernel::POLYNOMIAL function (default 3)
+* $epsilon (float) - epsilon in the loss function of epsilon-SVR (default 0.1)
+* $cost (float) - parameter C of epsilon-SVR (default 1.0)
+* $gamma (float) - kernel coefficient for Kernel::RBF, Kernel::POLYNOMIAL and Kernel::SIGMOID. If gamma is null, 1 / (number of features) is used instead.
+* $coef0 (float) - independent term in the kernel function. It is only significant for Kernel::POLYNOMIAL and Kernel::SIGMOID (default 0.0)
+* $tolerance (float) - tolerance of termination criterion (default 0.001)
+* $cacheSize (int) - cache memory size in MB (default 100)
+* $shrinking (bool) - whether to use the shrinking heuristics (default true)
+
+```
+$regression = new SVR(Kernel::LINEAR);
+$regression = new SVR(Kernel::LINEAR, $degree = 3, $epsilon = 10.0);
+```
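+
+The `$epsilon` parameter sets the width of the tube inside which prediction errors are not penalized. A minimal sketch of the standard epsilon-insensitive loss follows; `epsilonInsensitiveLoss` is a hypothetical helper for illustration, not a php-ml function.
+
+```php
+<?php
+
+// Illustrative only: this helper is not part of php-ml.
+// Epsilon-insensitive loss: errors smaller than epsilon cost nothing.
+function epsilonInsensitiveLoss(float $target, float $prediction, float $epsilon = 0.1): float
+{
+    return max(0.0, abs($target - $prediction) - $epsilon);
+}
+
+echo epsilonInsensitiveLoss(4.1, 4.03), PHP_EOL; // 0: the error is inside the epsilon tube
+echo epsilonInsensitiveLoss(3.1, 3.5), PHP_EOL;  // about 0.3
+```
+
+A larger `$epsilon` tolerates bigger deviations before they count as errors, which typically yields a simpler model with fewer support vectors.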
+
+### Train
+
+To train a model, provide training samples and target values (as `array`). Example:
+
+```
+use Phpml\Regression\SVR;
+use Phpml\SupportVectorMachine\Kernel;
+
+$samples = [[60], [61], [62], [63], [65]];
+$targets = [3.1, 3.6, 3.8, 4, 4.1];
+
+$regression = new SVR(Kernel::LINEAR);
+$regression->train($samples, $targets);
+```
+
+### Predict
+
+To predict the target value of a sample, use the `predict` method. You can provide a single sample or an array of samples:
+
+```
+$regression->predict([64]);
+// return 4.03
+```
--- a/mkdocs.yml
+++ b/mkdocs.yml
@ -3,15 +3,19 @@ pages:
  - Home: index.md
  - Machine Learning:
    - Classification:
+      - SVC: machine-learning/classification/svc.md
      - KNearestNeighbors: machine-learning/classification/k-nearest-neighbors.md
      - NaiveBayes: machine-learning/classification/naive-bayes.md
    - Regression:
      - LeastSquares: machine-learning/regression/least-squares.md
+      - SVR: machine-learning/regression/svr.md
    - Clustering:
      - KMeans: machine-learning/clustering/k-means.md
      - DBSCAN: machine-learning/clustering/dbscan.md
    - Cross Validation:
      - RandomSplit: machine-learning/cross-validation/random-split.md
+    - Feature Extraction:
+      - Token Count Vectorizer: machine-learning/feature-extraction/token-count-vectorizer.md
    - Datasets:
      - Array Dataset: machine-learning/datasets/array-dataset.md
      - CSV Dataset: machine-learning/datasets/csv-dataset.md