222 Commits

Author SHA1 Message Date
Yuji Uchiyama 53f8a89556 Fix support of a rule in Apriori (#229)
* Clean up test code

* Add test to check support and confidence (failed due to a bug)

* Fix support value of rules
2018-02-11 12:42:46 +01:00
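For readers unfamiliar with the two rule metrics named in the commit above, here is a minimal sketch of their textbook definitions. It is written in Python for illustration only and is not the library's PHP code; the function names `support` and `confidence` and the guard that returns 0.0 on an empty denominator are assumptions of this example. The log does not say what the original bug was.

```python
from typing import Iterable, List


def support(itemset: Iterable[str], transactions: List[List[str]]) -> float:
    """Fraction of transactions that contain every item of the item set."""
    if not transactions:
        return 0.0  # assumption: treat an empty data set as zero support
    items = set(itemset)
    hits = sum(1 for t in transactions if items <= set(t))
    return hits / len(transactions)


def confidence(antecedent: Iterable[str], consequent: Iterable[str],
               transactions: List[List[str]]) -> float:
    """support(antecedent and consequent together) / support(antecedent)."""
    antecedent = set(antecedent)
    base = support(antecedent, transactions)
    if base == 0.0:
        return 0.0  # assumption: avoid dividing by zero for unseen antecedents
    return support(antecedent | set(consequent), transactions) / base
```

Note that the support of a rule is computed over the union of its antecedent and consequent, not over the antecedent alone.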
Arkadiusz Kondas 3ba35918a3 Implement VarianceThreshold - simple baseline approach to feature selection. (#228)
* Add sum of squares deviations

* Calculate population variance

* Add VarianceThreshold - feature selection transformer

* Add docs about VarianceThreshold

* Add missing code for pipeline usage
2018-02-10 18:07:09 +01:00
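The steps listed in the commit above (sum of squared deviations, population variance, threshold-based selection) can be sketched roughly as follows. This is a Python illustration under stated assumptions, not the library's PHP implementation: the names `population_variance` and `variance_threshold`, the default threshold of 0.0, and the strict `>` comparison are all choices made for this example.

```python
from typing import List, Sequence


def population_variance(values: Sequence[float]) -> float:
    """Sum of squared deviations from the mean, divided by n (not n - 1)."""
    n = len(values)
    if n == 0:
        raise ValueError("variance of an empty sequence is undefined")
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / n


def variance_threshold(samples: List[List[float]],
                       threshold: float = 0.0) -> List[List[float]]:
    """Keep only the columns whose population variance exceeds the threshold."""
    if not samples:
        return []
    columns = list(zip(*samples))  # transpose rows into feature columns
    keep = [i for i, col in enumerate(columns)
            if population_variance(col) > threshold]
    return [[row[i] for i in keep] for row in samples]
```

With the default threshold, only constant features (variance exactly 0) are dropped, which is why the approach works as a simple baseline.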
Andreas Möller 4b5d57fd6f Enhancement: Flatten directory structure (#220) 2018-02-10 12:08:58 +01:00
Yuji Uchiyama 71cc633c8e Fix apriori generates an empty array as a part of the frequent item sets (#224) 2018-02-07 10:02:38 +01:00
Yuji Uchiyama ec091b5ea3 Support probability estimation in SVC (#218)
* Add test for svm model with probability estimation

* Extract buildPredictCommand method

* Fix test to use PHP_EOL

* Add predictProbability method (not completed)

* Add test for DataTransformer::predictions

* Fix SVM to use PHP_EOL

* Support probability estimation in SVM

* Add documentation

* Add InvalidOperationException class

* Throw InvalidOperationException before executing libsvm if probability estimation is not supported
2018-02-06 20:39:25 +01:00
Jonathan Baldie c32bf3fe2b Configure an Activation Function per hidden layer (#208)
* ability to specify per-layer activation function

* some tests for new addition to layer

* appease style CI whitespace issue

* more flexible addition of layers, and developer can pass Layer object in manually

* new test for layer object in mlp constructor

* documentation for added MLP functionality
2018-02-01 23:15:36 +01:00
Jonathan Baldie e318921076 Fix string representation of integer labels issue in NaiveBayes (#206)
* Update NaiveBayes.php

This fixes an issue using string labels that are string representations of integers, e.g. "1998" getting cast to (int)1998.

* Update NaiveBayes.php

fixes superfluous whitespace error

* added tests for naive bayes with numeric labels

* added array_unique

* nested array_flips for speed

* nested the array flips inside the array map

* to appease style CI test
2018-01-31 21:44:44 +01:00
Yuji Uchiyama 554c86af68 Choose averaging method in classification report (#205)
* Fix testcases of ClassificationReport

* Fix averaging method in ClassificationReport

* Fix divided by zero if labels are empty

* Fix calculation of f1score

* Add averaging methods (not completed)

* Implement weighted average method

* Extract counts to properties

* Fix default to macro average

* Implement micro average method

* Fix style

* Update docs

* Fix styles
2018-01-29 18:06:21 +01:00
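The three averaging methods named in the commit above (macro, weighted, micro) differ only in how per-class results are combined. The sketch below shows this for precision, in Python and under assumptions: the input layout (label mapped to true positives, false positives, false negatives), the names `safe_div` and `precision_averages`, and returning 0.0 on a zero denominator are hypothetical choices for this example, not the library's API.

```python
from typing import Dict, Tuple

# Hypothetical input: label -> (true positives, false positives, false negatives)
Counts = Dict[str, Tuple[int, int, int]]


def safe_div(num: float, den: float) -> float:
    """Return 0.0 instead of raising when the denominator is zero."""
    return num / den if den else 0.0


def precision_averages(counts: Counts) -> Dict[str, float]:
    # Per-class precision and support (number of true instances per class).
    per_class = {lb: safe_div(tp, tp + fp) for lb, (tp, fp, _) in counts.items()}
    support = {lb: tp + fn for lb, (tp, _, fn) in counts.items()}
    total_tp = sum(tp for tp, _, _ in counts.values())
    total_fp = sum(fp for _, fp, _ in counts.values())
    return {
        # macro: plain mean of the per-class values
        "macro": safe_div(sum(per_class.values()), len(per_class)),
        # weighted: mean of the per-class values weighted by class support
        "weighted": safe_div(sum(per_class[lb] * support[lb] for lb in counts),
                             sum(support.values())),
        # micro: computed once from the counts summed over all classes
        "micro": safe_div(total_tp, total_tp + total_fp),
    }
```

The `safe_div` guard also covers the empty-label case mentioned in the commit, where every denominator is zero.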
Marcin Michalski ba7114a3f7 Add libsvm exception tests (#202) 2018-01-26 22:07:22 +01:00
Arkadiusz Kondas 7435bece34 Add test for Pipeline save and restore with ModelManager (#191) 2018-01-12 10:54:20 +01:00
Yuji Uchiyama d953ef6bfc Fix the implementation of conjugate gradient method (#184)
* Add unit tests for optimizers

* Fix ConjugateGradient

* Fix coding style

* Fix namespace
2018-01-12 10:53:43 +01:00
David Monllaó e83f7b95d5 Fix activation functions support (#163)
- Backpropagation using the neuron activation functions derivative
- instead of hardcoded sigmoid derivative
- Added missing activation functions derivatives
- Sigmoid forced for the output layer
- Updated ThresholdedReLU default threshold to 0 (acts as a ReLU)
- Unit tests for derivatives
- Unit tests for classifiers using different activation functions
- Added missing docs
2018-01-09 11:09:59 +01:00
Yuji Uchiyama 9938cf2911 Rewrite DBSCAN (#185)
* Add testcases to DBSCAN

* Fix DBSCAN implementation

* Refactoring DBSCAN implementation

* Fix coding style
2018-01-09 10:53:02 +01:00
Tomáš Votruba 6660645ecd Update dev dependencies (#187)
* composer: update dev dependencies

* phpstan fixes

* phpstan fixes

* phpstan fixes

* phpstan fixes

* drop probably forgotten humbug configs

* apply cs

* fix cs bug

* composer: add coding standard and phpstan dev friendly scripts

* ecs: add skipped errors

* cs: fix PHP 7.1

* fix cs

* ecs: exclude strict fixer that break code

* ecs: cleanup commented sets

* travis: use composer scripts for testing to prevent duplicated setup
2018-01-06 21:25:47 +01:00
Tomáš Votruba a348111e97 Add PHPStan and level to max (#168)
* tests: update to PHPUnit 6.0 with rector

* fix namespaces on tests

* composer + tests: use standard test namespace naming

* update travis

* resolve conflict

* phpstan lvl 2

* phpstan lvl 3

* phpstan lvl 4

* phpstan lvl 5

* phpstan lvl 6

* phpstan lvl 7

* level max

* resolve conflict

* [cs] clean empty docs

* composer: bump to PHPUnit 6.4

* cleanup

* composer + travis: add phpstan

* phpstan lvl 1

* composer: update dev deps

* phpstan fixes

* update Contributing with new tools

* docs: link fixes, PHP version update

* composer: drop php-cs-fixer, cs already handled by ecs

* ecs: add old set rules

* [cs] apply rest of rules
2018-01-06 13:09:33 +01:00
David Monllaó c4ad117d28 Ability to update learningRate in MLP (#160)
* Allow people to update the learning rate

* Test for learning rate setter
2017-12-05 21:09:06 +01:00
Yuji Uchiyama c4f58f7f6f Fix logistic regression implementation (#169)
* Fix target value of LogisticRegression

* Fix probability calculation in LogisticRegression

* Change the default cost function to log-likelihood

* Remove redundant round function

* Fix for coding standard
2017-12-05 12:03:55 +01:00
Tomáš Votruba 946fbbc521 Tests: use PHPUnit (6.4) exception methods (#165)
* tests: update to PHPUnit 6.0 with rector

* [cs] clean empty docs

* composer: bump to PHPUnit 6.4

* tests: use class references over strings

* cleanup
2017-11-28 08:00:13 +01:00
Tomáš Votruba 726cf4cddf Added EasyCodingStandard + lots of code fixes (#156)
* travis: move coveralls here, decouple from package

* composer: use PSR4

* phpunit: simpler config

* travis: add ecs run

* composer: add ecs dev

* use standard vendor/bin directory for dependency bins, confuses with local bins and require gitignore handling

* ecs: add PSR2

* [cs] PSR2 spacing fixes

* [cs] PSR2 class name fix

* [cs] PHP7 fixes - return semicolon spaces, old rand functions, typehints

* [cs] fix less strict typehints

* fix typehints to make tests pass

* ecs: ignore typehint-less elements

* [cs] standardize arrays

* [cs] standardize docblock, remove unused comments

* [cs] use self where possible

* [cs] sort class elements, from public to private

* [cs] do not use yoda (found less yoda-cases, than non-yoda)

* space

* [cs] do not assign in condition

* [cs] use namespace imports if possible

* [cs] use ::class over strings

* [cs] fix defaults for arrays properties, properties and constants single spacing

* cleanup ecs comments

* [cs] use item per line in multi-items array

* missing line

* misc

* rebase
2017-11-22 22:16:10 +01:00
David Monllaó 333598b472 Fix backpropagation random error (#157) 2017-11-20 23:11:21 +01:00
Tomáš Votruba 653c7c772d Upgrade to PHP 7.1 (#150)
* upgrade to PHP 7.1

* bump travis and composer to PHP 7.1

* fix tests
2017-11-14 21:21:23 +01:00
Tomáš Votruba d85bfed468 [cs] remove more unused comments (#146)
* [cs] remove more unused comments

* [cs] remove unused array phpdocs

* [cs] remove empty lines in docs

* [cs] space-proof useless docs

* [cs] remove empty @param lines

* [cs] remove references arrays
2017-11-13 11:42:40 +01:00
Tomáš Votruba f4650c696c [coding standard] fix imports order and drop unused docs typehints (#145)
* fix imports order

* drop unused docs typehints, make use of return types where possible
2017-11-06 08:56:37 +01:00
Marcin Michalski 11d05ce89d Comparison - replace eval (#130)
* Replace eval with strategy

* Use Factory Pattern, add tests

* Add missing docblocks

* Replace strategy with simple object
2017-10-24 18:59:12 +02:00
Maxim Kasatkin b48b82bd34 DBSCAN fix for associative keys and array_merge performance optimization (#139) 2017-10-18 10:59:37 +02:00
Marcin Michalski 61d2b7d115 Ensure user-provided SupportVectorMachine paths are valid (#126) 2017-09-02 22:44:19 +02:00
Marcin Michalski ba2b8c8a9c Use C-style casts (#124) 2017-09-02 21:41:06 +02:00
Marcin Michalski 8c06a55a16 Make tests namespace consistent (#125) 2017-09-02 21:39:59 +02:00
Marcin Michalski b1be0574d8 Add PReLU activation function (#128)
* Implement RELU activation functions

* Add PReLUTest
2017-09-02 21:31:14 +02:00
Marcin Michalski 0e59cfb174 Add ThresholdedReLU activation function (#129) 2017-09-02 21:30:35 +02:00
Marcin Michalski 3e2708de17 Fix #120 (#121)
* Fix #120

* Add DecisionTreeLeafTest
2017-08-28 13:00:24 +02:00
Yuji Uchiyama 136a92c82b Support CSV with long lines (#119) 2017-08-21 08:08:54 +02:00
Marcin Michalski 3ac658c397 php-cs-fixer - more rules (#118)
* Add new cs-fixer rules and run them

* Do not align double arrows/equals
2017-08-17 08:50:37 +02:00
Ante Lucic 07041ec608 Run newest php-cs-fixer (#108) 2017-07-26 08:24:47 +02:00
Maxime COLIN 2d3b44f1a0 Fix samples transformation in Pipeline training (#94) 2017-05-24 09:06:54 +02:00
David Monllaó de50490154 Neural networks partial training and persistency (#91)
* Neural networks partial training and persistency

* cs fixes

* Add partialTrain to nn docs

* Test for invalid partial training classes provided
2017-05-23 09:03:05 +02:00
Maxime COLIN 3dff40ea1d Add french stopwords (#92)
* Add french stopwords

* Add french stopwords test
2017-05-22 23:18:27 +02:00
David Monllaó 4af8449b1c Neural networks improvements (#89)
* MultilayerPerceptron interface changes

- Signature closer to other algorithms
- New predict method
- Remove desired error
- Move maxIterations to constructor

* MLP tests for multiple hidden layers and multi-class

* Update all MLP-related tests

* coding style fixes

* Backpropagation included in multilayer-perceptron
2017-05-18 00:07:14 +02:00
Mustafa Karabulut 5b373fa7c2 Linear Discriminant Analysis (LDA) (#82)
* Linear Discriminant Analysis (LDA)

* LDA test file

* Matrix inverse via LUDecomposition

* LUDecomposition inverse() and det() applied

* Readme update for LDA
2017-04-25 08:58:02 +02:00
David Monllaó 12b8b118dd Fix division by 0 error during normalization (#83)
* Fix division by 0 error during normalization

std is 0 when a feature has the same value in samples.

* Expand std normalization test
2017-04-24 11:47:30 +02:00
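The failure described in the commit above arises because a feature that has the same value in every sample has a standard deviation of 0, so dividing by it fails. A minimal Python sketch of a guarded standardization follows. The log does not state how the library handles the zero case, so substituting 1.0 for a zero deviation is an assumption of this example, as is the function name `standardize`.

```python
from typing import List


def standardize(samples: List[List[float]]) -> List[List[float]]:
    """Scale each column to zero mean and unit (population) standard deviation."""
    if not samples:
        return []
    n = len(samples)
    columns = list(zip(*samples))  # transpose rows into feature columns
    means = [sum(col) / n for col in columns]
    stds = [(sum((v - m) ** 2 for v in col) / n) ** 0.5
            for col, m in zip(columns, means)]
    # Assumption: a constant column (std == 0) is divided by 1.0 instead,
    # so its values become 0.0 rather than triggering a division by zero.
    stds = [s if s > 0 else 1.0 for s in stds]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)]
            for row in samples]
```

For example, `standardize([[1, 5], [3, 5]])` leaves the constant second column at 0.0 in both rows instead of raising an error.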
Mustafa Karabulut a87859dd97 Linear algebra operations, Dimensionality reduction and some other minor changes (#81)
* Linear Algebra operations

* Covariance

* PCA and KernelPCA

* Tests for PCA, Eigenvalues and Covariance

* KernelPCA update

* KernelPCA and its test

* KernelPCA and its test

* MatrixTest, KernelPCA and PCA tests

* Readme update

* Readme update
2017-04-23 09:03:30 +02:00
David Monllaó e1854d44a2 Partial training base (#78)
* Cost values for multiclass OneVsRest uses

* Partial training interface

* Reduce linear classifiers memory usage

* Testing partial training and isolated training

* Partial trainer naming switched to incremental estimator

Other changes according to review's feedback.

* Clean optimization data once optimize is finished

* Abstract resetBinary
2017-04-19 22:26:31 +02:00
Mustafa Karabulut 49234429f0 LogisticRegression classifier & Optimization methods (#63)
* LogisticRegression classifier & Optimization methods

* Minor fixes to Logistic Regression & Optimizers PR

* Addition for getCostValues() method
2017-03-27 23:46:53 +02:00
Mustafa Karabulut 01bb82a2a7 One-v-Rest Classification technique applied to linear classifiers (#54)
* One-v-Rest Classification technique applied to linear classifiers

* Fix for Apriori

* Fixes for One-v-Rest

* One-v-Rest test cases
2017-03-05 09:43:19 +01:00
Arkadiusz Kondas 63c63dfba2 Add no_unused_imports rule to cs-fixer 2017-03-01 10:16:15 +01:00
Mustafa Karabulut c028a73985 AdaBoost improvements (#53)
* AdaBoost improvements

* AdaBoost improvements & test case resolved

* Some coding style fixes
2017-02-28 21:45:18 +01:00
Arkadiusz Kondas e8c6005aec Update changelog and cs fixes 2017-02-23 20:59:30 +01:00
Mustafa Karabulut 4daa0a222a AdaBoost algorithm along with some improvements (#51) 2017-02-21 10:38:18 +01:00
Mustafa Karabulut cf222bcce4 Linear classifiers: Perceptron, Adaline, DecisionStump (#50)
* Linear classifiers

* Code formatting to PSR-2

* Added basic test cases for linear classifiers
2017-02-16 23:23:55 +01:00
Povilas Susinskas f0a7984f39 Check if matrix is singular doing inverse (#49)
* Check if matrix is singular doing inverse

* add return bool type
2017-02-15 10:09:16 +01:00
Mustafa Karabulut 1d73503958 Ensemble Classifiers : Bagging and RandomForest (#36)
* Fuzzy C-Means implementation

* Update FuzzyCMeans

* Rename FuzzyCMeans to FuzzyCMeans.php

* Update NaiveBayes.php

* Small fix applied to improve training performance

array_unique is replaced with array_count_values+array_keys which is way
faster

* Revert "Small fix applied to improve training performance"

This reverts commit c20253f16ac3e8c37d33ecaee28a87cc767e3b7f.

* Revert "Revert "Small fix applied to improve training performance""

This reverts commit ea10e136c4c11b71609ccdcaf9999067e4be473e.

* Revert "Small fix applied to improve training performance"

This reverts commit c20253f16ac3e8c37d33ecaee28a87cc767e3b7f.

* First DecisionTree implementation

* Revert "First DecisionTree implementation"

This reverts commit 4057a08679c26010c39040a48a3e6dad994a1a99.

* DecisionTree

* FCM Test

* FCM Test

* DecisionTree Test

* Ensemble classifiers: Bagging and RandomForests

* test

* Fixes for conflicted files

* Bagging and RandomForest ensemble algorithms

* Changed unit test

* Changed unit test

* Changed unit test

* Bagging and RandomForest ensemble algorithms

* Bagging and RandomForest ensemble algorithms

* Bagging and RandomForest ensemble algorithms

RandomForest algorithm is improved with changes to original DecisionTree

* Bagging and RandomForest ensemble algorithms

* Slight fix about use of global Exception class

* Fixed the error about wrong use of global Exception class

* RandomForest code formatting
2017-02-07 12:37:56 +01:00
Arkadiusz Kondas b7c9983524 Do not require file to exist for model manager 2017-02-03 17:48:15 +01:00
Arkadiusz Kondas 858d13b0fa Update phpunit to 6.0 2017-02-03 12:58:25 +01:00
David Monllaó 8f122fde90 Persistence class to save and restore models (#37)
* Models manager with save/restore capabilities

* Refactoring dataset exceptions

* Persistency layer docs

* New tests for serializable estimators

* ModelManager static methods to instance methods
2017-02-02 09:03:09 +01:00
David Monllaó c1b1a5d6ac Support for multiple training datasets (#38)
* Multiple training data sets allowed

* Tests with multiple training data sets

* Updating docs according to #38

Documenting all models which predictions will be based on all
training data provided.

Some models already supported multiple training data sets.
2017-02-01 19:06:38 +01:00
Arkadiusz Kondas c3686358b3 Add rules for new cs-fixer 2017-01-31 20:33:08 +01:00
Mustafa Karabulut 87396ebe58 DecisionTree and Fuzzy C Means classifiers (#35)
* Fuzzy C-Means implementation

* Update FuzzyCMeans

* Rename FuzzyCMeans to FuzzyCMeans.php

* Update NaiveBayes.php

* Small fix applied to improve training performance

array_unique is replaced with array_count_values+array_keys which is way
faster

* Revert "Small fix applied to improve training performance"

This reverts commit c20253f16ac3e8c37d33ecaee28a87cc767e3b7f.

* Revert "Revert "Small fix applied to improve training performance""

This reverts commit ea10e136c4c11b71609ccdcaf9999067e4be473e.

* Revert "Small fix applied to improve training performance"

This reverts commit c20253f16ac3e8c37d33ecaee28a87cc767e3b7f.

* DecisionTree

* FCM Test

* FCM Test

* DecisionTree Test
2017-01-31 20:27:15 +01:00
Arkadiusz Kondas a78ebc159a Use assertCount in tests 2016-12-12 19:31:30 +01:00
Arkadiusz Kondas b6fe290c65 Fix for php7.1 accuracy test score 2016-12-12 19:28:26 +01:00
Arkadiusz Kondas 12d0adda62 Increase iterations number in Backpropagation test (sometimes it fails) 2016-11-20 22:56:18 +01:00
Arkadiusz Kondas cbdc049526 Update php-cs-fixer 2016-11-20 22:53:17 +01:00
Arkadiusz Kondas bca2196b57 Prevent Division by zero error in classification report 2016-11-20 22:49:26 +01:00
Arkadiusz Kondas 349ea16f01 Rename demo datasets and add Dataset suffix 2016-09-30 14:02:08 +02:00
Arkadiusz Kondas 84af842f04 Fix division by zero in ClassificationReport #21 2016-09-27 20:07:21 +02:00
Arkadiusz Kondas 1ce6bb544b Run php-cs-fixer 2016-09-21 21:51:19 +02:00
Arkadiusz Kondas 8072ddb2bf Update phpunit to 5.5 2016-09-21 21:46:16 +02:00
Patrick Florek fa87eca375 Add new class Set for simple Set-theoretical operations
### Features

* Works only with primitive types int, float, string
* Implements set-theoretic operations union, intersection, complement
* Modifies set by adding, removing elements
* Implements \IteratorAggregate for use in loops

### Implementation details

Based on array functions:
* array_diff,
* array_merge,
* array_intersect,
* array_unique,
* array_values,
* sort.

### Drawbacks

* **Do not work with objects.**
* Power set and Cartesian product returning array of Set
2016-09-10 13:24:43 +02:00
Patrick Florek 90038befa9 Apply comments / coding styles
* Remove user-specific gitignore
* Add return type hints
* Avoid global namespace in docs
* Rename rules -> getRules
* Split up rule generation

Todo:
* Move set theory out to math
* Extract rule generation
2016-09-02 00:26:01 +02:00
Patrick Florek c8bd8db601 # Association rule learning - Apriori algorithm
* Generating frequent k-length item sets
* Generating rules based on frequent item sets
* Algorithm has exponential complexity, be aware of it
* Apriori algorithm is split into apriori and candidates method
* Second step rule generation is implemented by rules method
* Internal methods are invoked for fine grain unit tests
* Wikipedia's train samples and an alternative are provided for test cases
* Small documentation for public interface is also shipped
2016-08-23 15:44:53 +02:00
Arkadiusz Kondas 6421a2ba41 Develop to master (#18)
* Fix Backpropagation test with explicit random generator seed

* remove custom seed - not working :(

* Updated links in readme
2016-08-21 14:03:20 +02:00
Arkadiusz Kondas c506a84164 refactor Backpropagation methods and simplify things 2016-08-10 23:03:02 +02:00
Arkadiusz Kondas 66d029e94f implement and test Backpropagation training 2016-08-10 22:43:47 +02:00
Arkadiusz Kondas e5d39ee18a implements and test multilayer perceptron methods 2016-08-09 13:27:48 +02:00
Arkadiusz Kondas 64859f263f test abstraction from LayeredNetwork 2016-08-07 23:41:08 +02:00
Arkadiusz Kondas 95b29d40b1 add Layer, Input and Bias for neural network 2016-08-05 10:20:31 +02:00
Arkadiusz Kondas 7062ee29e1 add Neuron and Synapse classes 2016-08-02 20:30:20 +02:00
Arkadiusz Kondas 637fd613b8 implement activation function for neural network 2016-08-02 13:07:47 +02:00
Pablo Joán Iglesias 38deaaeb2e testScalarProduct check for non numeric values (#13)
* testScalarProduct check for non numeric values

test for non numeric values.

* updating pr #13

using global namespace for stdClass
2016-07-26 08:13:52 +02:00
Arkadiusz Kondas 403824d23b test exception on kmeans 2016-07-24 14:01:17 +02:00
Arkadiusz Kondas 448eaafd78 remove unused exception 2016-07-24 13:52:52 +02:00
Arkadiusz Kondas 2a76cbb402 add .coverage to git ignore 2016-07-24 13:42:50 +02:00
Arkadiusz Kondas 093e8fc89c add more tests for CReport 2016-07-19 22:01:39 +02:00
Arkadiusz Kondas 074dcf7470 php-cs-fixer 2016-07-19 21:59:23 +02:00
Arkadiusz Kondas 9665457159 implement ClassificationReport class 2016-07-19 21:58:59 +02:00
Arkadiusz Kondas 7abee3061a docs for files dataset and php-cs-fixer 2016-07-16 23:56:52 +02:00
Arkadiusz Kondas e0b560f31d create FilesDataset class 2016-07-16 23:29:40 +02:00
Arkadiusz Kondas 9f140d5b6f fix problem with token count vectorizer array order 2016-07-14 13:25:11 +02:00
Arkadiusz Kondas 7c0767c15a create docs for tf-idf transformer 2016-07-12 00:21:34 +02:00
Arkadiusz Kondas f04cc04da5 create StratifiedRandomSplit for cross validation 2016-07-10 14:13:35 +02:00
Arkadiusz Kondas 6c7416a9c4 implement ConfusionMatrix metric 2016-07-07 00:29:58 +02:00
Arkadiusz Kondas cce68997a1 implement StopWords in TokenCountVectorizer 2016-07-06 23:22:29 +02:00
Arkadiusz Kondas 601ff884e8 php-cs-fixer 2016-06-17 00:34:15 +02:00
Arkadiusz Kondas 424519cd83 implement fit for TokenCountVectorizer 2016-06-17 00:33:48 +02:00
Arkadiusz Kondas be7423350f add more tests for fit method in preprocessors 2016-06-17 00:23:27 +02:00
Arkadiusz Kondas 3e9e70810d implement fit on Imputer 2016-06-17 00:16:49 +02:00
Arkadiusz Kondas 557f344018 add fit method for Transformer interface 2016-06-17 00:08:10 +02:00
Arkadiusz Kondas 4554011899 rename labels to targets for Dataset 2016-06-16 23:56:15 +02:00
Arkadiusz Kondas 7f4a0b243f transform samples for prediction in pipeline 2016-06-16 16:10:46 +02:00
Arkadiusz Kondas 26f2cbabc4 fix Pipeline transformation 2016-06-16 10:26:29 +02:00
Arkadiusz Kondas d21a401365 implement Transformer interface on preprocessing classes 2016-06-16 10:03:57 +02:00