Commit Graph

222 Commits

Author SHA1 Message Date
Mustafa Karabulut 1d73503958 Ensemble Classifiers : Bagging and RandomForest (#36)
* Fuzzy C-Means implementation

* Update FuzzyCMeans

* Rename FuzzyCMeans to FuzzyCMeans.php

* Update NaiveBayes.php

* Small fix applied to improve training performance

array_unique is replaced with array_count_values+array_keys which is way
faster

* Revert "Small fix applied to improve training performance"

This reverts commit c20253f16ac3e8c37d33ecaee28a87cc767e3b7f.

* Revert "Revert "Small fix applied to improve training performance""

This reverts commit ea10e136c4c11b71609ccdcaf9999067e4be473e.

* Revert "Small fix applied to improve training performance"

This reverts commit c20253f16ac3e8c37d33ecaee28a87cc767e3b7f.

* First DecisionTree implementation

* Revert "First DecisionTree implementation"

This reverts commit 4057a08679c26010c39040a48a3e6dad994a1a99.

* DecisionTree

* FCM Test

* FCM Test

* DecisionTree Test

* Ensemble classifiers: Bagging and RandomForests

* test

* Fixes for conflicted files

* Bagging and RandomForest ensemble algorithms

* Changed unit test

* Changed unit test

* Changed unit test

* Bagging and RandomForest ensemble algorithms

* Bagging and RandomForest ensemble algorithms

* Bagging and RandomForest ensemble algorithms

RandomForest algorithm is improved with changes to original DecisionTree

* Bagging and RandomForest ensemble algorithms

* Slight fix about use of global Exception class

* Fixed the error about wrong use of global Exception class

* RandomForest code formatting
2017-02-07 12:37:56 +01:00
Arkadiusz Kondas b7c9983524 Do not require file to exist for model manager 2017-02-03 17:48:15 +01:00
Arkadiusz Kondas 858d13b0fa Update phpunit to 6.0 2017-02-03 12:58:25 +01:00
David Monllaó 8f122fde90 Persistence class to save and restore models (#37)
* Models manager with save/restore capabilities

* Refactoring dataset exceptions

* Persistency layer docs

* New tests for serializable estimators

* ModelManager static methods to instance methods
2017-02-02 09:03:09 +01:00
David Monllaó c1b1a5d6ac Support for multiple training datasets (#38)
* Multiple training data sets allowed

* Tests with multiple training data sets

* Updating docs according to #38

Documenting all models whose predictions will be based on all
training data provided.

Some models already supported multiple training data sets.
2017-02-01 19:06:38 +01:00
Arkadiusz Kondas c3686358b3 Add rules for new cs-fixer 2017-01-31 20:33:08 +01:00
Mustafa Karabulut 87396ebe58 DecisionTree and Fuzzy C Means classifiers (#35)
* Fuzzy C-Means implementation

* Update FuzzyCMeans

* Rename FuzzyCMeans to FuzzyCMeans.php

* Update NaiveBayes.php

* Small fix applied to improve training performance

array_unique is replaced with array_count_values+array_keys which is way
faster

* Revert "Small fix applied to improve training performance"

This reverts commit c20253f16ac3e8c37d33ecaee28a87cc767e3b7f.

* Revert "Revert "Small fix applied to improve training performance""

This reverts commit ea10e136c4c11b71609ccdcaf9999067e4be473e.

* Revert "Small fix applied to improve training performance"

This reverts commit c20253f16ac3e8c37d33ecaee28a87cc767e3b7f.

* DecisionTree

* FCM Test

* FCM Test

* DecisionTree Test
2017-01-31 20:27:15 +01:00
Arkadiusz Kondas a78ebc159a Use assertCount in tests 2016-12-12 19:31:30 +01:00
Arkadiusz Kondas b6fe290c65 Fix for php7.1 accuracy test score 2016-12-12 19:28:26 +01:00
Arkadiusz Kondas 12d0adda62 Increase iterations number in Backpropagation test (sometimes it fails) 2016-11-20 22:56:18 +01:00
Arkadiusz Kondas cbdc049526 Update php-cs-fixer 2016-11-20 22:53:17 +01:00
Arkadiusz Kondas bca2196b57 Prevent Division by zero error in classification report 2016-11-20 22:49:26 +01:00
Arkadiusz Kondas 349ea16f01 Rename demo datasets and add Dataset suffix 2016-09-30 14:02:08 +02:00
Arkadiusz Kondas 84af842f04 Fix division by zero in ClassificationReport #21 2016-09-27 20:07:21 +02:00
Arkadiusz Kondas 1ce6bb544b Run php-cs-fixer 2016-09-21 21:51:19 +02:00
Arkadiusz Kondas 8072ddb2bf Update phpunit to 5.5 2016-09-21 21:46:16 +02:00
Patrick Florek fa87eca375 Add new class Set for simple Set-theoretical operations
### Features

* Works only with primitive types int, float, string
* Implements set-theoretic operations union, intersection, complement
* Modifies set by adding, removing elements
* Implements \IteratorAggregate for use in loops

### Implementation details

Based on array functions:
* array_diff,
* array_merge,
* array_intersect,
* array_unique,
* array_values,
* sort.

### Drawbacks

* **Does not work with objects.**
* Power set and Cartesian product returning array of Set
2016-09-10 13:24:43 +02:00
Patrick Florek 90038befa9 Apply comments / coding styles
* Remove user-specific gitignore
* Add return type hints
* Avoid global namespace in docs
* Rename rules -> getRules
* Split up rule generation

Todo:
* Move set theory out to math
* Extract rule generation
2016-09-02 00:26:01 +02:00
Patrick Florek c8bd8db601 # Association rule learning - Apriori algorithm
* Generating frequent k-length item sets
* Generating rules based on frequent item sets
* Algorithm has exponential complexity, be aware of it
* Apriori algorithm is split into apriori and candidates method
* Second step rule generation is implemented by rules method
* Internal methods are invoked for fine-grained unit tests
* Wikipedia's train samples and an alternative are provided for test cases
* Small documentation for public interface is also shipped
2016-08-23 15:44:53 +02:00
Arkadiusz Kondas 6421a2ba41 Develop to master (#18)
* Fix Backpropagation test with explicit random generator seed

* remove custom seed - not working :(

* Updated links in readme
2016-08-21 14:03:20 +02:00
Arkadiusz Kondas c506a84164 refactor Backpropagation methods and simplify things 2016-08-10 23:03:02 +02:00
Arkadiusz Kondas 66d029e94f implement and test Backpropagation training 2016-08-10 22:43:47 +02:00
Arkadiusz Kondas e5d39ee18a implement and test multilayer perceptron methods 2016-08-09 13:27:48 +02:00
Arkadiusz Kondas 64859f263f test abstraction from LayeredNetwork 2016-08-07 23:41:08 +02:00
Arkadiusz Kondas 95b29d40b1 add Layer, Input and Bias for neural network 2016-08-05 10:20:31 +02:00
Arkadiusz Kondas 7062ee29e1 add Neuron and Synapse classes 2016-08-02 20:30:20 +02:00
Arkadiusz Kondas 637fd613b8 implement activation function for neural network 2016-08-02 13:07:47 +02:00
Pablo Joán Iglesias 38deaaeb2e testScalarProduct check for non numeric values (#13)
* testScalarProduct check for non numeric values

test for non numeric values.

* updating pr #13

using global namespace for stdClass
2016-07-26 08:13:52 +02:00
Arkadiusz Kondas 403824d23b test exception on kmeans 2016-07-24 14:01:17 +02:00
Arkadiusz Kondas 448eaafd78 remove unused exception 2016-07-24 13:52:52 +02:00
Arkadiusz Kondas 2a76cbb402 add .coverage to git ignore 2016-07-24 13:42:50 +02:00
Arkadiusz Kondas 093e8fc89c add more tests for CReport 2016-07-19 22:01:39 +02:00
Arkadiusz Kondas 074dcf7470 php-cs-fixer 2016-07-19 21:59:23 +02:00
Arkadiusz Kondas 9665457159 implement ClassificationReport class 2016-07-19 21:58:59 +02:00
Arkadiusz Kondas 7abee3061a docs for files dataset and php-cs-fixer 2016-07-16 23:56:52 +02:00
Arkadiusz Kondas e0b560f31d create FilesDataset class 2016-07-16 23:29:40 +02:00
Arkadiusz Kondas 9f140d5b6f fix problem with token count vectorizer array order 2016-07-14 13:25:11 +02:00
Arkadiusz Kondas 7c0767c15a create docs for tf-idf transformer 2016-07-12 00:21:34 +02:00
Arkadiusz Kondas f04cc04da5 create StratifiedRandomSplit for cross validation 2016-07-10 14:13:35 +02:00
Arkadiusz Kondas 6c7416a9c4 implement ConfusionMatrix metric 2016-07-07 00:29:58 +02:00
Arkadiusz Kondas cce68997a1 implement StopWords in TokenCountVectorizer 2016-07-06 23:22:29 +02:00
Arkadiusz Kondas 601ff884e8 php-cs-fixer 2016-06-17 00:34:15 +02:00
Arkadiusz Kondas 424519cd83 implement fit for TokenCountVectorizer 2016-06-17 00:33:48 +02:00
Arkadiusz Kondas be7423350f add more tests for fit method in preprocessors 2016-06-17 00:23:27 +02:00
Arkadiusz Kondas 3e9e70810d implement fit on Imputer 2016-06-17 00:16:49 +02:00
Arkadiusz Kondas 557f344018 add fit method for Transformer interface 2016-06-17 00:08:10 +02:00
Arkadiusz Kondas 4554011899 rename labels to targets for Dataset 2016-06-16 23:56:15 +02:00
Arkadiusz Kondas 7f4a0b243f transform samples for prediction in pipeline 2016-06-16 16:10:46 +02:00
Arkadiusz Kondas 26f2cbabc4 fix Pipeline transformation 2016-06-16 10:26:29 +02:00
Arkadiusz Kondas d21a401365 implement Transformer interface on preprocessing classes 2016-06-16 10:03:57 +02:00
Arkadiusz Kondas 7c5e79d2c6 change transformer behavior to reference 2016-06-16 10:01:40 +02:00
Arkadiusz Kondas 15519ba122 simple pipeline test 2016-06-16 09:58:17 +02:00
Arkadiusz Kondas cc50d2c9b1 implement TfIdf transformation 2016-06-15 16:04:09 +02:00
Arkadiusz Kondas da6d94cc46 create stop words class 2016-06-14 11:54:04 +02:00
Arkadiusz Kondas 2f51716388 change token count vectorizer to return full token counts 2016-06-14 09:58:11 +02:00
Arkadiusz Kondas 23eff0044a add test with dataset example 2016-05-31 20:01:54 +02:00
Arkadiusz Kondas fb04b57853 implement data Normalizer with L1 and L2 norm 2016-05-08 20:35:01 +02:00
Arkadiusz Kondas 65cdfe64b2 implement Median and MostFrequent strategy for imputer 2016-05-08 19:33:39 +02:00
Arkadiusz Kondas a761d0e8f2 mode (dominant) from numbers 2016-05-08 19:23:54 +02:00
Arkadiusz Kondas ed1e07e803 median function in statistic 2016-05-08 19:12:39 +02:00
Arkadiusz Kondas b0ab236ab9 create imputer tool for completing missing values 2016-05-08 14:47:17 +02:00
Arkadiusz Kondas 46197eba7b add word tokenizer 2016-05-07 23:17:52 +02:00
Arkadiusz Kondas 078f543146 add word tokenizer 2016-05-07 23:17:46 +02:00
Arkadiusz Kondas 430c1078cf implement support vector regression 2016-05-07 23:04:58 +02:00
Arkadiusz Kondas c409658483 support vector classifier implementation 2016-05-07 22:17:12 +02:00
Arkadiusz Kondas 6cf6c5e768 add multi class svm test 2016-05-07 14:08:09 +02:00
Arkadiusz Kondas 7b5b6418f4 libsvm predict program implementation 2016-05-06 22:55:41 +02:00
Arkadiusz Kondas dfb7b6b108 datatransformer test set 2016-05-06 22:38:50 +02:00
Arkadiusz Kondas 4ac2ac8a35 fix index for training set 2016-05-06 22:33:04 +02:00
Arkadiusz Kondas 95caef8692 start to implement SVM with libsvm 2016-05-05 23:29:11 +02:00
Arkadiusz Kondas c05ce8c542 feature extraction tools - TokenCountVectorizer 2016-05-03 23:28:29 +02:00
Arkadiusz Kondas fadd003169 create whitespace tokenizer 2016-05-03 00:33:18 +02:00
Arkadiusz Kondas bb9e1aa4f0 test kmeans init methods 2016-05-01 23:44:04 +02:00
Arkadiusz Kondas 7572304d50 refactor kmeans subclasses 2016-05-01 23:36:33 +02:00
Arkadiusz Kondas c0513e9b82 kmeans clustering 2016-05-01 23:17:09 +02:00
Arkadiusz Kondas 01a2499754 cs-fixer 2016-05-01 00:56:43 +02:00
Arkadiusz Kondas 22963114c3 dbscan clustering algorithm 2016-05-01 00:47:44 +02:00
Arkadiusz Kondas f7b91bea72 change Classifier namespace to Classification 2016-04-30 23:45:21 +02:00
Arkadiusz Kondas ee9bb7b252 add tests for matrix class 2016-04-30 23:21:32 +02:00
Arkadiusz Kondas ff9adc267c better arguments format for regression 2016-04-30 13:54:58 +02:00
Arkadiusz Kondas ff79de7e14 better arguments format for regression 2016-04-30 13:54:01 +02:00
Arkadiusz Kondas b1c47d5e9d test intercept and coefficients of linear regression 2016-04-30 13:32:40 +02:00
Arkadiusz Kondas 633974fea0 php-cs-fixer 2016-04-30 00:59:10 +02:00
Arkadiusz Kondas 60c796f5d9 create matrix calculation for ls regression for multiple variable 2016-04-30 00:58:54 +02:00
Arkadiusz Kondas 9d74174a68 ls reg with error :( 2016-04-29 23:03:08 +02:00
Arkadiusz Kondas 3e4dc3ddf8 add test for mean with floats 2016-04-28 07:32:48 +02:00
Arkadiusz Kondas b5e4cbe66e add Mean::arithmetic tests 2016-04-27 23:57:23 +02:00
Arkadiusz Kondas 80a712e8a8 implement Least Squares Regression 2016-04-27 23:51:14 +02:00
Arkadiusz Kondas cbec77d247 pearson correlation function 2016-04-27 23:28:01 +02:00
Arkadiusz Kondas 66dcfcf2b7 implement standard deviation of population function 2016-04-27 23:04:59 +02:00
Arkadiusz Kondas af3b57692f linear regression is also hard 2016-04-25 22:55:34 +02:00
Arkadiusz Kondas 46da769ca6 typo in variable name 2016-04-25 20:16:53 +02:00
Arkadiusz Kondas 37782eba98 implement RBF kernel function 2016-04-21 22:54:38 +02:00
Arkadiusz Kondas b30f4cbf11 make scalar function static 2016-04-21 22:12:45 +02:00
Arkadiusz Kondas 34281e40ee add scalar product function 2016-04-21 00:23:03 +02:00
Arkadiusz Kondas 9330785a6f extract Math namespace 2016-04-20 23:56:33 +02:00
Arkadiusz Kondas a4ab370a48 create traits to reduce complexity 2016-04-16 21:24:40 +02:00
Arkadiusz Kondas 100205d767 simple Naive Bayes classifier 2016-04-14 22:56:54 +02:00
Arkadiusz Kondas 85243f2d92 cs-fixer 2016-04-12 23:10:33 +02:00
Arkadiusz Kondas 79b76fb1a4 implement minkowski distance metric function 2016-04-12 22:02:14 +02:00
Arkadiusz Kondas d82a12497a implement manhattan distance metric function 2016-04-12 21:43:25 +02:00
Arkadiusz Kondas aed37e247e knn with chebyshev distance metric test 2016-04-11 21:50:29 +02:00
Arkadiusz Kondas 14bffbe38a implement Chebyshev distance metric 2016-04-11 21:46:50 +02:00
Arkadiusz Kondas 4d77a16e12 implement Chebyshev distance metric 2016-04-11 21:44:48 +02:00
Arkadiusz Kondas d169ebf730 create Distance metric interface and refactor classifier 2016-04-11 21:35:17 +02:00
Arkadiusz Kondas 171c6974e7 remove accuracy score tests on datasets 2016-04-09 15:52:22 +02:00
Arkadiusz Kondas a992f65200 remove accuracy score tests on datasets 2016-04-09 15:50:48 +02:00
Arkadiusz Kondas c9c592cb09 add glass identification dataset 2016-04-09 15:46:54 +02:00
Arkadiusz Kondas dd53581309 wine class dataset 2016-04-09 15:33:05 +02:00
Arkadiusz Kondas 5be2147784 create docs files 2016-04-09 00:36:48 +02:00
Arkadiusz Kondas 62ec4ec2f2 integration tests for knn classifier 2016-04-08 22:49:17 +02:00
Arkadiusz Kondas e7d2780150 classifier predict array of samples or one sample 2016-04-08 22:25:15 +02:00
Arkadiusz Kondas f1c81638d6 accuracy score with test 2016-04-08 22:11:59 +02:00
Arkadiusz Kondas db9d57cd22 add tests for datasets 2016-04-07 22:36:02 +02:00
Arkadiusz Kondas 9c18a5a22d add tests for datasets 2016-04-07 22:35:49 +02:00
Arkadiusz Kondas bbcc8a3e68 random split implementation and tests 2016-04-07 22:12:36 +02:00
Arkadiusz Kondas 649cbdb9a6 prepare cross validation random splitter 2016-04-06 22:38:08 +02:00
Arkadiusz Kondas e521fb8f80 iris dataset loader 2016-04-06 21:46:17 +02:00
Arkadiusz Kondas 7cbeaecffb simple test for knn classifier 2016-04-05 21:35:06 +02:00
Arkadiusz Kondas bd31f9a025 php-cs-fixer 2016-04-04 22:49:54 +02:00
Arkadiusz Kondas 5c348e8452 add more Euclidean distance tests 2016-04-04 22:48:07 +02:00
Arkadiusz Kondas dd927ef981 create phpunit configuration and first tests 2016-04-04 22:38:51 +02:00