diff --git a/docs/machine-learning/association/apriori.md b/docs/machine-learning/association/apriori.md
index bbf829b..779ef28 100644
--- a/docs/machine-learning/association/apriori.md
+++ b/docs/machine-learning/association/apriori.md
@@ -15,7 +15,7 @@ $associator = new Apriori($support = 0.5, $confidence = 0.5);
 ### Train
-To train a associator simply provide train samples and labels (as `array`). Example:
+To train an associator, simply provide train samples and labels (as `array`). Example:
 ```
 $samples = [['alpha', 'beta', 'epsilon'], ['alpha', 'beta', 'theta'], ['alpha', 'beta', 'epsilon'], ['alpha', 'beta', 'theta']];
@@ -31,7 +31,7 @@ You can train the associator using multiple data sets, predictions will be based
 ### Predict
-To predict sample label use `predict` method. You can provide one sample or array of samples:
+To predict sample label use the `predict` method. You can provide one sample or array of samples:
 ```
 $associator->predict(['alpha','theta']);
@@ -43,7 +43,7 @@ $associator->predict([['alpha','epsilon'],['beta','theta']]);
 ### Associating
-Get generated association rules simply use `rules` method.
+To get generated association rules, simply use the `rules` method.
 ```
 $associator->getRules();
@@ -52,7 +52,7 @@ $associator->getRules();
 ### Frequent item sets
-Generating k-length frequent item sets simply use `apriori` method.
+To generate k-length frequent item sets, simply use the `apriori` method.
 ```
 $associator->apriori();
diff --git a/docs/machine-learning/classification/k-nearest-neighbors.md b/docs/machine-learning/classification/k-nearest-neighbors.md
index a4eb96c..a4ba53b 100644
--- a/docs/machine-learning/classification/k-nearest-neighbors.md
+++ b/docs/machine-learning/classification/k-nearest-neighbors.md
@@ -14,7 +14,7 @@ $classifier = new KNearestNeighbors($k=3, new Minkowski($lambda=4));
 ## Train
-To train a classifier simply provide train samples and labels (as `array`). Example:
+To train a classifier, simply provide train samples and labels (as `array`). Example:
 ```
 $samples = [[1, 3], [1, 4], [2, 4], [3, 1], [4, 1], [4, 2]];
@@ -28,7 +28,7 @@ You can train the classifier using multiple data sets, predictions will be based
 ## Predict
-To predict sample label use `predict` method. You can provide one sample or array of samples:
+To predict sample label use the `predict` method. You can provide one sample or array of samples:
 ```
 $classifier->predict([3, 2]);
diff --git a/docs/machine-learning/classification/naive-bayes.md b/docs/machine-learning/classification/naive-bayes.md
index af3b357..57fcdcf 100644
--- a/docs/machine-learning/classification/naive-bayes.md
+++ b/docs/machine-learning/classification/naive-bayes.md
@@ -4,7 +4,7 @@ Classifier based on applying Bayes' theorem with strong (naive) independence ass
 ### Train
-To train a classifier simply provide train samples and labels (as `array`). Example:
+To train a classifier, simply provide train samples and labels (as `array`). Example:
 ```
 $samples = [[5, 1, 1], [1, 5, 1], [1, 1, 5]];
@@ -18,7 +18,7 @@ You can train the classifier using multiple data sets, predictions will be based
 ### Predict
-To predict sample label use `predict` method. You can provide one sample or array of samples:
+To predict sample label use the `predict` method. You can provide one sample or array of samples:
 ```
 $classifier->predict([3, 1, 1]);
diff --git a/docs/machine-learning/classification/svc.md b/docs/machine-learning/classification/svc.md
index 99b4da0..3d87b62 100644
--- a/docs/machine-learning/classification/svc.md
+++ b/docs/machine-learning/classification/svc.md
@@ -21,7 +21,7 @@ $classifier = new SVC(Kernel::RBF, $cost = 1000, $degree = 3, $gamma = 6);
 ### Train
-To train a classifier simply provide train samples and labels (as `array`). Example:
+To train a classifier, simply provide train samples and labels (as `array`). Example:
 ```
 use Phpml\Classification\SVC;
@@ -38,7 +38,7 @@ You can train the classifier using multiple data sets, predictions will be based
 ### Predict
-To predict sample label use `predict` method. You can provide one sample or array of samples:
+To predict sample label use the `predict` method. You can provide one sample or array of samples:
 ```
 $classifier->predict([3, 2]);
@@ -74,7 +74,7 @@ $classifier = new SVC(
 $classifier->train($samples, $labels);
 ```
-Then use `predictProbability` method instead of `predict`:
+Then use the `predictProbability` method instead of `predict`:
 ```
 $classifier->predictProbability([3, 2]);
diff --git a/docs/machine-learning/clustering/dbscan.md b/docs/machine-learning/clustering/dbscan.md
index c82a195..ce01198 100644
--- a/docs/machine-learning/clustering/dbscan.md
+++ b/docs/machine-learning/clustering/dbscan.md
@@ -16,12 +16,12 @@ $dbscan = new DBSCAN($epsilon = 2, $minSamples = 3, new Minkowski($lambda=4));
 ### Clustering
-To divide the samples into clusters simply use `cluster` method. It's return the `array` of clusters with samples inside.
+To divide the samples into clusters, simply use the `cluster` method. It returns the `array` of clusters with samples inside.
 ```
 $samples = [[1, 1], [8, 7], [1, 2], [7, 8], [2, 1], [8, 9]];
 $dbscan = new DBSCAN($epsilon = 2, $minSamples = 3);
 $dbscan->cluster($samples);
-// return [0=>[[1, 1], ...], 1=>[[8, 7], ...]]
+// return [0=>[[1, 1], ...], 1=>[[8, 7], ...]]
 ```
diff --git a/docs/machine-learning/clustering/k-means.md b/docs/machine-learning/clustering/k-means.md
index 661f717..132c2dc 100644
--- a/docs/machine-learning/clustering/k-means.md
+++ b/docs/machine-learning/clustering/k-means.md
@@ -1,6 +1,6 @@
 # K-means clustering
-The K-Means algorithm clusters data by trying to separate samples in n groups of equal variance, minimizing a criterion known as the inertia or within-cluster sum-of-squares.
+The K-Means algorithm clusters data by trying to separate samples in n groups of equal variance, minimizing a criterion known as the inertia or within-cluster sum-of-squares.
 This algorithm requires the number of clusters to be specified.
 ### Constructor Parameters
@@ -15,11 +15,11 @@ $kmeans = new KMeans(4, KMeans::INIT_RANDOM);
 ### Clustering
-To divide the samples into clusters simply use `cluster` method. It's return the `array` of clusters with samples inside.
+To divide the samples into clusters, simply use the `cluster` method. It returns the `array` of clusters with samples inside.
 ```
 $samples = [[1, 1], [8, 7], [1, 2], [7, 8], [2, 1], [8, 9]];
-Or if you need to keep your indentifiers along with yours samples you can use array keys as labels.
+Or if you need to keep your identifiers along with yours samples you can use array keys as labels.
 $samples = [ 'Label1' => [1, 1], 'Label2' => [8, 7], 'Label3' => [1, 2]];
 $kmeans = new KMeans(2);
@@ -32,8 +32,8 @@ $kmeans->cluster($samples);
 #### kmeans++ (default)
 K-means++ method selects initial cluster centers for k-mean clustering in a smart way to speed up convergence.
-It use the DASV seeding method consists of finding good initial centroids for the clusters.
+It uses the DASV seeding method consists of finding good initial centroids for the clusters.
 #### random
-Random initialization method chooses completely random centroid. It get the space boundaries to avoid placing clusters centroid too far from samples data.
+Random initialization method chooses completely random centroid. It gets the space boundaries to avoid placing cluster centroids too far from samples data.
diff --git a/docs/machine-learning/cross-validation/random-split.md b/docs/machine-learning/cross-validation/random-split.md
index edfdded..a5bf402 100644
--- a/docs/machine-learning/cross-validation/random-split.md
+++ b/docs/machine-learning/cross-validation/random-split.md
@@ -1,20 +1,20 @@
 # Random Split
-One of the simplest methods from Cross-validation is implemented as `RandomSpilt` class. Samples are split to two groups: train group and test group. You can adjust number of samples in each group.
+One of the simplest methods from Cross-validation is implemented as `RandomSpilt` class. Samples are split to two groups: train group and test group. You can adjust the number of samples in each group.
 ### Constructor Parameters
 * $dataset - object that implements `Dataset` interface
 * $testSize - a fraction of test split (float, from 0 to 1, default: 0.3)
 * $seed - seed for random generator (e.g. for tests)
-
+
 ```
 $randomSplit = new RandomSplit($dataset, 0.2);
 ```
 ### Samples and labels groups
-To get samples or labels from test and train group you can use getters:
+To get samples or labels from test and train group, you can use getters:
 ```
 $dataset = new RandomSplit($dataset, 0.3, 1234);
diff --git a/docs/machine-learning/cross-validation/stratified-random-split.md b/docs/machine-learning/cross-validation/stratified-random-split.md
index d3f53be..1a6caa1 100644
--- a/docs/machine-learning/cross-validation/stratified-random-split.md
+++ b/docs/machine-learning/cross-validation/stratified-random-split.md
@@ -1,22 +1,22 @@
 # Stratified Random Split
-Analogously to `RandomSpilt` class samples are split to two groups: train group and test group.
+Analogously to `RandomSpilt` class, samples are split to two groups: train group and test group.
 Distribution of samples takes into account their targets and trying to divide them equally.
-You can adjust number of samples in each group.
+You can adjust the number of samples in each group.
 ### Constructor Parameters
 * $dataset - object that implements `Dataset` interface
 * $testSize - a fraction of test split (float, from 0 to 1, default: 0.3)
 * $seed - seed for random generator (e.g. for tests)
-
+
 ```
 $split = new StratifiedRandomSplit($dataset, 0.2);
 ```
 ### Samples and labels groups
-To get samples or labels from test and train group you can use getters:
+To get samples or labels from test and train group, you can use getters:
 ```
 $dataset = new StratifiedRandomSplit($dataset, 0.3, 1234);
@@ -41,4 +41,4 @@ $dataset = new ArrayDataset(
 $split = new StratifiedRandomSplit($dataset, 0.5);
 ```
-Split will have equals amount of each target. Two of the target `a` and two of `b`.
+Split will have equal amounts of each target. Two of the target `a` and two of `b`.
diff --git a/docs/machine-learning/datasets/array-dataset.md b/docs/machine-learning/datasets/array-dataset.md
index 8bbcc37..87bae48 100644
--- a/docs/machine-learning/datasets/array-dataset.md
+++ b/docs/machine-learning/datasets/array-dataset.md
@@ -2,7 +2,7 @@
 Helper class that holds data as PHP `array` type. Implements the `Dataset` interface which is used heavily in other classes.
-### Constructors Parameters
+### Constructor Parameters
 * $samples - (array) of samples
 * $labels - (array) of labels
@@ -15,7 +15,7 @@ $dataset = new ArrayDataset([[1, 1], [2, 1], [3, 2], [4, 1]], ['a', 'a', 'b', 'b
 ### Samples and labels
-To get samples or labels you can use getters:
+To get samples or labels, you can use getters:
 ```
 $dataset->getSamples();
@@ -24,7 +24,7 @@ $dataset->getTargets();
 ### Remove columns
-You can remove columns by index numbers, for example:
+You can remove columns by their index numbers, for example:
 ```
 use Phpml\Dataset\ArrayDataset;
diff --git a/docs/machine-learning/datasets/csv-dataset.md b/docs/machine-learning/datasets/csv-dataset.md
index d2efaaa..557b7fc 100644
--- a/docs/machine-learning/datasets/csv-dataset.md
+++ b/docs/machine-learning/datasets/csv-dataset.md
@@ -2,11 +2,11 @@
 Helper class that loads data from CSV file. It extends the `ArrayDataset`.
-### Constructors Parameters
+### Constructor Parameters
 * $filepath - (string) path to `.csv` file
 * $features - (int) number of columns that are features (starts from first column), last column must be a label
-* $headingRow - (bool) define is file have a heading row (if `true` then first row will be ignored)
+* $headingRow - (bool) define if the file has a heading row (if `true` then first row will be ignored)
 ```
 $dataset = new CsvDataset('dataset.csv', 2, true);
diff --git a/docs/machine-learning/datasets/files-dataset.md b/docs/machine-learning/datasets/files-dataset.md
index f050cfd..6d55b3f 100644
--- a/docs/machine-learning/datasets/files-dataset.md
+++ b/docs/machine-learning/datasets/files-dataset.md
@@ -2,7 +2,7 @@
 Helper class that loads dataset from files. Use folder names as targets. It extends the `ArrayDataset`.
-### Constructors Parameters
+### Constructor Parameters
 * $rootPath - (string) path to root folder that contains files dataset
@@ -42,7 +42,7 @@ data
 ...
 ```
-Load files data with `FilesDataset`:
+Load files data with `FilesDataset`:
 ```
 use Phpml\Dataset\FilesDataset;
diff --git a/docs/machine-learning/datasets/mnist-dataset.md b/docs/machine-learning/datasets/mnist-dataset.md
index 1ed5081..5c7a76e 100644
--- a/docs/machine-learning/datasets/mnist-dataset.md
+++ b/docs/machine-learning/datasets/mnist-dataset.md
@@ -1,6 +1,6 @@
 # MnistDataset
-Helper class that load data from MNIST dataset: [http://yann.lecun.com/exdb/mnist/](http://yann.lecun.com/exdb/mnist/)
+Helper class that loads data from MNIST dataset: [http://yann.lecun.com/exdb/mnist/](http://yann.lecun.com/exdb/mnist/)
 > The MNIST database of handwritten digits, available from this page, has a training set of 60,000 examples, and a test set of 10,000 examples. It is a subset of a larger set available from NIST. The digits have been size-normalized and centered in a fixed-size image. It is a good database for people who want to try learning techniques and pattern recognition methods on real-world data while spending minimal efforts on preprocessing and formatting.
@@ -18,7 +18,7 @@ $trainDataset = new MnistDataset('train-images-idx3-ubyte', 'train-labels-idx1-u
 ### Samples and labels
-To get samples or labels you can use getters:
+To get samples or labels, you can use getters:
 ```
 $dataset->getSamples();
diff --git a/docs/machine-learning/datasets/svm-dataset.md b/docs/machine-learning/datasets/svm-dataset.md
index 8ac1c26..93a8cfb 100644
--- a/docs/machine-learning/datasets/svm-dataset.md
+++ b/docs/machine-learning/datasets/svm-dataset.md
@@ -2,7 +2,7 @@
 Helper class that loads data from SVM-Light format file. It extends the `ArrayDataset`.
-### Constructors Parameters
+### Constructor Parameters
 * $filepath - (string) path to the file
diff --git a/docs/machine-learning/feature-extraction/tf-idf-transformer.md b/docs/machine-learning/feature-extraction/tf-idf-transformer.md
index c592b8d..4ac2e5d 100644
--- a/docs/machine-learning/feature-extraction/tf-idf-transformer.md
+++ b/docs/machine-learning/feature-extraction/tf-idf-transformer.md
@@ -19,7 +19,7 @@ $transformer = new TfIdfTransformer($samples);
 ### Transformation
-To transform a collection of text samples use `transform` method. Example:
+To transform a collection of text samples, use the `transform` method. Example:
 ```
 use Phpml\FeatureExtraction\TfIdfTransformer;
@@ -28,7 +28,7 @@ $samples = [
     [0 => 1, 1 => 1, 2 => 2, 3 => 1, 4 => 0, 5 => 0],
     [0 => 1, 1 => 1, 2 => 0, 3 => 0, 4 => 2, 5 => 3],
 ];
-
+
 $transformer = new TfIdfTransformer($samples);
 $transformer->transform($samples);
@@ -38,5 +38,5 @@ $samples = [
     [0 => 0, 1 => 0, 2 => 0, 3 => 0, 4 => 0.602, 5 => 0.903],
 ];
 */
-
+
 ```
diff --git a/docs/machine-learning/feature-extraction/token-count-vectorizer.md b/docs/machine-learning/feature-extraction/token-count-vectorizer.md
index 4dc5260..7d9405e 100644
--- a/docs/machine-learning/feature-extraction/token-count-vectorizer.md
+++ b/docs/machine-learning/feature-extraction/token-count-vectorizer.md
@@ -16,7 +16,7 @@ $vectorizer = new TokenCountVectorizer(new WhitespaceTokenizer());
 ### Transformation
-To transform a collection of text samples use `transform` method. Example:
+To transform a collection of text samples, use the `transform` method. Example:
 ```
 $samples = [
@@ -42,7 +42,7 @@ $vectorizer->transform($samples);
 ### Vocabulary
-You can extract vocabulary using `getVocabulary()` method. Example:
+You can extract vocabulary using the `getVocabulary()` method. Example:
 ```
 $vectorizer->getVocabulary();
diff --git a/docs/machine-learning/feature-selection/selectkbest.md b/docs/machine-learning/feature-selection/selectkbest.md
index 2d8024c..71d1ff9 100644
--- a/docs/machine-learning/feature-selection/selectkbest.md
+++ b/docs/machine-learning/feature-selection/selectkbest.md
@@ -5,7 +5,7 @@
 ## Constructor Parameters
 * $k (int) - number of top features to select, rest will be removed (default: 10)
-* $scoringFunction (ScoringFunction) - function that take samples and targets and return array with scores (default: ANOVAFValue)
+* $scoringFunction (ScoringFunction) - function that takes samples and targets and returns an array with scores (default: ANOVAFValue)
 ```php
 use Phpml\FeatureSelection\SelectKBest;
@@ -27,13 +27,13 @@ $selector->fit($samples = $dataset->getSamples(), $dataset->getTargets());
 $selector->transform($samples);
 /*
-$samples[0] = [1.4, 0.2];
+$samples[0] = [1.4, 0.2];
 */
 ```
 ## Scores
-You can get a array with the calculated score for each feature.
+You can get an array with the calculated score for each feature.
 A higher value means that a given feature is better suited for learning.
 Of course, the rating depends on the scoring function used.
@@ -56,7 +56,7 @@ $selector->scores();
   float(1179.0343277002)
   [3]=>
   float(959.32440572573)
-}
+}
 */
 ```
@@ -70,11 +70,11 @@ For classification:
 The test is applied to samples from two or more groups, possibly with differing sizes.
 For regression:
- - **UnivariateLinearRegression**
+ - **UnivariateLinearRegression**
 Quick linear model for testing the effect of a single regressor, sequentially for many regressors. This is done in 2 steps:
  - 1. The cross correlation between each regressor and the target is computed, that is, ((X[:, i] - mean(X[:, i])) * (y - mean_y)) / (std(X[:, i]) *std(y)).
- - 2. It is converted to an F score
+ - 2. It is converted to an F score
 ## Pipeline
diff --git a/docs/machine-learning/feature-selection/variance-threshold.md b/docs/machine-learning/feature-selection/variance-threshold.md
index 9c942e7..4021895 100644
--- a/docs/machine-learning/feature-selection/variance-threshold.md
+++ b/docs/machine-learning/feature-selection/variance-threshold.md
@@ -1,7 +1,7 @@
 # Variance Threshold
-`VarianceThreshold` is a simple baseline approach to feature selection.
-It removes all features whose variance doesn’t meet some threshold.
+`VarianceThreshold` is a simple baseline approach to feature selection.
+It removes all features whose variance doesn’t meet some threshold.
 By default, it removes all zero-variance features, i.e. features that have the same value in all samples.
 ## Constructor Parameters
@@ -16,10 +16,10 @@ $transformer = new VarianceThreshold(0.15);
 ## Example of use
-As an example, suppose that we have a dataset with boolean features and
+As an example, suppose that we have a dataset with boolean features and
 we want to remove all features that are either one or zero (on or off)
-in more than 80% of the samples.
-Boolean features are Bernoulli random variables, and the variance of such
+in more than 80% of the samples.
+Boolean features are Bernoulli random variables, and the variance of such
 variables is given by
 ```
 Var[X] = p(1 - p)
diff --git a/docs/machine-learning/metric/accuracy.md b/docs/machine-learning/metric/accuracy.md
index 5045973..efdab23 100644
--- a/docs/machine-learning/metric/accuracy.md
+++ b/docs/machine-learning/metric/accuracy.md
@@ -1,10 +1,10 @@
 # Accuracy
-Class for calculate classifier accuracy.
+Class for calculating classifier accuracy.
 ### Score
-To calculate classifier accuracy score use `score` static method. Parameters:
+To calculate classifier accuracy score, use the `score` static method. Parameters:
 * $actualLabels - (array) true sample labels
 * $predictedLabels - (array) predicted labels (e.x. from test group)
diff --git a/docs/machine-learning/metric/classification-report.md b/docs/machine-learning/metric/classification-report.md
index 53f125b..f5591a8 100644
--- a/docs/machine-learning/metric/classification-report.md
+++ b/docs/machine-learning/metric/classification-report.md
@@ -1,6 +1,6 @@
 # Classification Report
-Class for calculate main classifier metrics: precision, recall, F1 score and support.
+Class for calculating main classifier metrics: precision, recall, F1 score and support.
 ### Report
diff --git a/docs/machine-learning/metric/confusion-matrix.md b/docs/machine-learning/metric/confusion-matrix.md
index b07443a..4ff08c9 100644
--- a/docs/machine-learning/metric/confusion-matrix.md
+++ b/docs/machine-learning/metric/confusion-matrix.md
@@ -1,6 +1,6 @@
 # Confusion Matrix
-Class for compute confusion matrix to evaluate the accuracy of a classification.
+Class for computing confusion matrix to evaluate the accuracy of a classification.
 ### Example (all targets)
diff --git a/docs/machine-learning/neural-network/multilayer-perceptron-classifier.md b/docs/machine-learning/neural-network/multilayer-perceptron-classifier.md
index 7365a71..976d475 100644
--- a/docs/machine-learning/neural-network/multilayer-perceptron-classifier.md
+++ b/docs/machine-learning/neural-network/multilayer-perceptron-classifier.md
@@ -39,8 +39,7 @@ $mlp = new MLPClassifier(4, [$layer1, $layer2], ['a', 'b', 'c']);
 ## Train
-To train a MLP simply provide train samples and labels (as array). Example:
-
+To train a MLP, simply provide train samples and labels (as array). Example:
 ```
 $mlp->train(
@@ -71,7 +70,7 @@ $mlp->setLearningRate(0.1);
 ## Predict
-To predict sample label use predict method. You can provide one sample or array of samples:
+To predict sample label use the `predict` method. You can provide one sample or array of samples:
 ```
 $mlp->predict([[1, 1, 1, 1], [0, 0, 0, 0]]);
diff --git a/docs/machine-learning/preprocessing/imputation-missing-values.md b/docs/machine-learning/preprocessing/imputation-missing-values.md
index 219db22..302d89d 100644
--- a/docs/machine-learning/preprocessing/imputation-missing-values.md
+++ b/docs/machine-learning/preprocessing/imputation-missing-values.md
@@ -49,7 +49,7 @@ $data = [
 ```
-You can also use `$samples` constructer parameter instead of `fit` method:
+You can also use the `$samples` constructor parameter instead of the `fit` method:
 ```
 use Phpml\Preprocessing\Imputer;
diff --git a/docs/machine-learning/regression/least-squares.md b/docs/machine-learning/regression/least-squares.md
index 84a3279..5505f13 100644
--- a/docs/machine-learning/regression/least-squares.md
+++ b/docs/machine-learning/regression/least-squares.md
@@ -1,10 +1,10 @@
 # LeastSquares Linear Regression
-Linear model that use least squares method to approximate solution.
+Linear model that uses least squares method to approximate solution.
 ### Train
-To train a model simply provide train samples and targets values (as `array`). Example:
+To train a model, simply provide train samples and targets values (as `array`). Example:
 ```
 $samples = [[60], [61], [62], [63], [65]];
@@ -18,7 +18,7 @@ You can train the model using multiple data sets, predictions will be based on a
 ### Predict
-To predict sample target value use `predict` method with sample to check (as `array`). Example:
+To predict sample target value, use the `predict` method with sample to check (as `array`). Example:
 ```
 $regression->predict([64]);
@@ -27,8 +27,8 @@ $regression->predict([64]);
 ### Multiple Linear Regression
-The term multiple attached to linear regression means that there are two or more sample parameters used to predict target.
-For example you can use: mileage and production year to predict price of a car.
+The term multiple attached to linear regression means that there are two or more sample parameters used to predict target.
+For example you can use: mileage and production year to predict the price of a car.
 ```
 $samples = [[73676, 1996], [77006, 1998], [10565, 2000], [146088, 1995], [15000, 2001], [65940, 2000], [9300, 2000], [93739, 1996], [153260, 1994], [17764, 2002], [57000, 1998], [15000, 2000]];
@@ -42,7 +42,7 @@ $regression->predict([60000, 1996])
 ### Intercept and Coefficients
-After you train your model you can get the intercept and coefficients array.
+After you train your model, you can get the intercept and coefficients array.
 ```
 $regression->getIntercept();
diff --git a/docs/machine-learning/regression/svr.md b/docs/machine-learning/regression/svr.md
index 1678f5f..14f9e6a 100644
--- a/docs/machine-learning/regression/svr.md
+++ b/docs/machine-learning/regression/svr.md
@@ -21,7 +21,7 @@ $regression = new SVR(Kernel::LINEAR, $degree = 3, $epsilon=10.0);
 ### Train
-To train a model simply provide train samples and targets values (as `array`). Example:
+To train a model, simply provide train samples and targets values (as `array`). Example:
 ```
 use Phpml\Regression\SVR;
@@ -38,7 +38,7 @@ You can train the model using multiple data sets, predictions will be based on a
 ### Predict
-To predict sample target value use `predict` method. You can provide one sample or array of samples:
+To predict sample target value, use the `predict` method. You can provide one sample or array of samples:
 ```
 $regression->predict([64])
diff --git a/docs/machine-learning/workflow/pipeline.md b/docs/machine-learning/workflow/pipeline.md
index 34465eb..b89b88e 100644
--- a/docs/machine-learning/workflow/pipeline.md
+++ b/docs/machine-learning/workflow/pipeline.md
@@ -5,13 +5,12 @@ In machine learning, it is common to run a sequence of algorithms to process and
 * Split each document’s text into tokens.
 * Convert each document’s words into a numerical feature vector ([Token Count Vectorizer](machine-learning/feature-extraction/token-count-vectorizer/)).
 * Learn a prediction model using the feature vectors and labels.
-
-PHP-ML represents such a workflow as a Pipeline, which consists sequence of transformers and a estimator.
+PHP-ML represents such a workflow as a Pipeline, which consists of a sequence of transformers and an estimator.
 ### Constructor Parameters
-* $transformers (array|Transformer[]) - sequence of objects that implements Transformer interface
+* $transformers (array|Transformer[]) - sequence of objects that implements the Transformer interface
 * $estimator (Estimator) - estimator that can train and predict
 ```
@@ -29,7 +28,8 @@ $pipeline = new Pipeline($transformers, $estimator);
 ### Example
-First our pipeline replace missing value, then normalize samples and finally train SVC estimator. Thus prepared pipeline repeats each transformation step for predicted sample.
+First, our pipeline replaces the missing value, then normalizes samples and finally trains the SVC estimator.
+Thus prepared pipeline repeats each transformation step for predicted sample.
 ```
 use Phpml\Classification\SVC;
diff --git a/docs/math/distance.md b/docs/math/distance.md
index 6970742..c7c3a98 100644
--- a/docs/math/distance.md
+++ b/docs/math/distance.md
@@ -4,7 +4,7 @@ Selected algorithms require the use of a function for calculating the distance.
 ### Euclidean
-Class for calculation Euclidean distance.
+Class for calculating Euclidean distance.
 ![euclidean](https://upload.wikimedia.org/math/8/4/9/849f040fd10bb86f7c85eb0bbe3566a4.png "Euclidean Distance")
@@ -13,7 +13,7 @@ To calculate Euclidean distance:
 ```
 $a = [4, 6];
 $b = [2, 5];
-
+
 $euclidean = new Euclidean();
 $euclidean->distance($a, $b);
 // return 2.2360679774998
@@ -21,7 +21,7 @@ $euclidean->distance($a, $b);
 ### Manhattan
-Class for calculation Manhattan distance.
+Class for calculating Manhattan distance.
 ![manhattan](https://upload.wikimedia.org/math/4/c/5/4c568bd1d76a6b15e19cb2ac3ad75350.png "Manhattan Distance")
@@ -30,7 +30,7 @@ To calculate Manhattan distance:
 ```
 $a = [4, 6];
 $b = [2, 5];
-
+
 $manhattan = new Manhattan();
 $manhattan->distance($a, $b);
 // return 3
@@ -38,7 +38,7 @@ $manhattan->distance($a, $b);
 ### Chebyshev
-Class for calculation Chebyshev distance.
+Class for calculating Chebyshev distance.
 ![chebyshev](https://upload.wikimedia.org/math/7/1/2/71200f7dbb43b3bcfbcbdb9e02ab0a0c.png "Chebyshev Distance")
@@ -47,7 +47,7 @@ To calculate Chebyshev distance:
 ```
 $a = [4, 6];
 $b = [2, 5];
-
+
 $chebyshev = new Chebyshev();
 $chebyshev->distance($a, $b);
 // return 2
@@ -55,7 +55,7 @@ $chebyshev->distance($a, $b);
 ### Minkowski
-Class for calculation Minkowski distance.
+Class for calculating Minkowski distance.
 ![minkowski](https://upload.wikimedia.org/math/a/a/0/aa0c62083c12390cb15ac3217de88e66.png "Minkowski Distance")
@@ -64,7 +64,7 @@ To calculate Minkowski distance:
 ```
 $a = [4, 6];
 $b = [2, 5];
-
+
 $minkowski = new Minkowski();
 $minkowski->distance($a, $b);
 // return 2.080
@@ -83,7 +83,7 @@ $minkowski->distance($a, $b);
 ### Custom distance
-To apply your own function of distance use `Distance` interface. Example
+To apply your own function of distance use the `Distance` interface. Example:
 ```
 class CustomDistance implements Distance
@@ -103,7 +103,7 @@ class CustomDistance implements Distance
             $distance[] = $a[$i] * $b[$i];
         }
-        return min($distance);
+        return min($distance);
     }
 }
 ```
diff --git a/docs/math/statistic.md b/docs/math/statistic.md
index 626828e..a677a58 100644
--- a/docs/math/statistic.md
+++ b/docs/math/statistic.md
@@ -7,7 +7,7 @@ Selected statistical methods.
 Correlation coefficients are used in statistics to measure how strong a relationship is between two variables. There are several types of correlation coefficient.
 ### Pearson correlation
-
+
 Pearson’s correlation or Pearson correlation is a correlation coefficient commonly used in linear regression. Example: