63 Commits

Author SHA1 Message Date
Pol Dellaiera
02dab41830 Provide a new NGramTokenizer with minGram and maxGram support (#350)
* Issue #349: Provide a new NGramTokenizer.

* Issue #349: Add tests.

* Fixes from code review.

* Implement NGramTokenizer with min and max gram support

* Add missing tests for ngram

* Add info about NGramTokenizer to docs and readme

* Add performance test for tokenization
2019-02-15 17:31:10 +01:00
Arkadiusz Kondas
6844cf407a Fix typo in naive bayes docs 2019-01-23 09:41:44 +01:00
Arkadiusz Kondas
18c36b971f Mnist Dataset (#326)
* Implement MnistDataset

* Add MNIST dataset documentation
2018-11-07 08:02:56 +01:00
Arkadiusz Kondas
e255369636 Fix Imputer docs and check if train data was set (#314)
* Update docs for Imputer class

* Throw exception when trying to transform imputer without train data

* Update changelog
2018-10-10 21:36:18 +02:00
Arkadiusz Kondas
15adf9e252 Update build status badge from travis-ci 2018-07-31 23:28:29 +02:00
Yuji Uchiyama
ab22cc5b68 Change the default kernel type in SVC to Kernel::RBF (#267)
* Change the default kernel type in SVC to Kernel::RBF

* Update CHANGELOG.md
2018-06-20 23:28:11 +02:00
Yuji Uchiyama
8b0d12c219 Fix SVR documentation (#265) 2018-03-20 17:25:25 +01:00
Arkadiusz Kondas
a36fe086d3 Add performance test for LeastSquares (#263)
* Install phpbench 🚀

* Add first benchmark for LeastSquares

* Update README and CONTRIBUTING guide

* Fix typo
2018-03-10 21:48:16 +01:00
Ivana Momcilovic
af2d732194 KMeans associative clustering (#262)
* KMeans associative clustering added

* fix travis error

* KMeans will return provided keys as point label if they are provided

* fix travis

* fix travis
2018-03-08 22:27:16 +01:00
Arkadiusz Kondas
8976047cbc Add removeColumns function to ArrayDataset (#249)
* Add removeColumns function to ArrayDataset

* Add removeColumns to docs

* Fix cs
2018-03-03 16:04:21 +01:00
Yuji Uchiyama
9c195559df Update apriori documentation (#245)
* Fix a wrong word

* More precise description about support and confidence
2018-02-27 18:50:07 +01:00
Yuji Uchiyama
4562f1dfc9 Add a SvmDataset class for SVM-Light (or LibSVM) format files (#237)
* Add data loader for svm format

* Add tests for error cases

* Set proper exception messages

* Add documents

* Add error checking code for invalid column format

* Add missing documents
2018-02-24 11:17:35 +01:00
Arkadiusz Kondas
451f84c2e6 Add SelectKBest docs 2018-02-14 20:34:53 +01:00
Arkadiusz Kondas
3ba35918a3 Implement VarianceThreshold - simple baseline approach to feature selection. (#228)
* Add sum of squares deviations

* Calculate population variance

* Add VarianceThreshold - feature selection transformer

* Add docs about VarianceThreshold

* Add missing code for pipeline usage
2018-02-10 18:07:09 +01:00
Yuji Uchiyama
ec091b5ea3 Support probability estimation in SVC (#218)
* Add test for svm model with probability estimation

* Extract buildPredictCommand method

* Fix test to use PHP_EOL

* Add predictProbability method (not completed)

* Add test for DataTransformer::predictions

* Fix SVM to use PHP_EOL

* Support probability estimation in SVM

* Add documentation

* Add InvalidOperationException class

* Throw InvalidOperationException before executing libsvm if probability estimation is not supported
2018-02-06 20:39:25 +01:00
Yuji Uchiyama
ed775fb232 Fix documentation of apriori (#221)
* Fix the return value of the single sample prediction

* Fix typo
2018-02-05 18:50:45 +01:00
Jonathan Baldie
c32bf3fe2b Configure an Activation Function per hidden layer (#208)
* ability to specify per-layer activation function

* some tests for new addition to layer

* appease style CI whitespace issue

* more flexible addition of layers, and developer can pass Layer object in manually

* new test for layer object in mlp constructor

* documentation for added MLP functionality
2018-02-01 23:15:36 +01:00
Yuji Uchiyama
9f0723f7d0 Fix documentation of ClassificationReport (#209)
* Fix values in example code

* Remove inconsistent empty lines
2018-01-31 19:20:50 +01:00
Yuji Uchiyama
554c86af68 Choose averaging method in classification report (#205)
* Fix testcases of ClassificationReport

* Fix averaging method in ClassificationReport

* Fix divided by zero if labels are empty

* Fix calculation of f1score

* Add averaging methods (not completed)

* Implement weighted average method

* Extract counts to properties

* Fix default to macro average

* Implement micro average method

* Fix style

* Update docs

* Fix styles
2018-01-29 18:06:21 +01:00
David Monllaó
e83f7b95d5 Fix activation functions support (#163)
- Backpropagation using the neuron activation functions derivative instead of hardcoded sigmoid derivative
- Added missing activation functions derivatives
- Sigmoid forced for the output layer
- Updated ThresholdedReLU default threshold to 0 (acts as a ReLU)
- Unit tests for derivatives
- Unit tests for classifiers using different activation functions
- Added missing docs
2018-01-09 11:09:59 +01:00
Tomáš Votruba
a348111e97 Add PHPStan and level to max (#168)
* tests: update to PHPUnit 6.0 with rector

* fix namespaces on tests

* composer + tests: use standard test namespace naming

* update travis

* resolve conflict

* phpstan lvl 2

* phpstan lvl 3

* phpstan lvl 4

* phpstan lvl 5

* phpstan lvl 6

* phpstan lvl 7

* level max

* resolve conflict

* [cs] clean empty docs

* composer: bump to PHPUnit 6.4

* cleanup

* composer + travis: add phpstan

* phpstan lvl 1

* composer: update dev deps

* phpstan fixes

* update Contributing with new tools

* docs: link fixes, PHP version update

* composer: drop php-cs-fixer, cs already handled by ecs

* ecs: add old set rules

* [cs] apply rest of rules
2018-01-06 13:09:33 +01:00
David Monllaó
c4ad117d28 Ability to update learningRate in MLP (#160)
* Allow people to update the learning rate

* Test for learning rate setter
2017-12-05 21:09:06 +01:00
David Monllaó
b1d40bfa30 Change from theta to learning rate var name in NN (#159) 2017-11-20 23:39:50 +01:00
David Monllaó
f7537c049a documentation add tokenizer->fit required to build the dictionary (#155) 2017-11-16 21:40:11 +01:00
Arkadiusz Kondas
a11e3f69c3 Add support for coveralls.io (#153)
* Add support for coveralls.io

* Generate coverage report only on php 7.2 build

* Fix osx travis build and move tools to bin dir

* Update php version badge

* Fix travis conditional statement

* Fix travis conditional statement

* 🤦 fix bin path
2017-11-15 11:08:51 +01:00
Tomáš Votruba
f4650c696c [coding standard] fix imports order and drop unused docs typehints (#145)
* fix imports order

* drop unused docs typehints, make use of return types where possible
2017-11-06 08:56:37 +01:00
Arkadiusz Kondas
dda9e16b4c Add software quality awards 2017 badge by @yegor256 2017-10-24 08:31:29 +02:00
David Monllaó
de50490154 Neural networks partial training and persistency (#91)
* Neural networks partial training and persistency

* cs fixes

* Add partialTrain to nn docs

* Test for invalid partial training classes provided
2017-05-23 09:03:05 +02:00
David Monllaó
4af8449b1c Neural networks improvements (#89)
* MultilayerPerceptron interface changes

- Signature closer to other algorithms
- New predict method
- Remove desired error
- Move maxIterations to constructor

* MLP tests for multiple hidden layers and multi-class

* Update all MLP-related tests

* coding style fixes

* Backpropagation included in multilayer-perceptron
2017-05-18 00:07:14 +02:00
David Monllaó
c0463ae087 Fix wrong docs references (#79) 2017-04-13 21:34:55 +02:00
Bill Nunney
8be19567a2 Update imputation example to use transform method (#57) 2017-03-09 20:41:15 +01:00
David Monllaó
8f122fde90 Persistence class to save and restore models (#37)
* Models manager with save/restore capabilities

* Refactoring dataset exceptions

* Persistency layer docs

* New tests for serializable estimators

* ModelManager static methods to instance methods
2017-02-02 09:03:09 +01:00
David Monllaó
c1b1a5d6ac Support for multiple training datasets (#38)
* Multiple training data sets allowed

* Tests with multiple training data sets

* Updating docs according to #38

Documenting all models which predictions will be based on all
training data provided.

Some models already supported multiple training data sets.
2017-02-01 19:06:38 +01:00
Robert Boloc
aace5ff022 Fix documentation links 2017-01-05 20:37:48 +00:00
Ken Seah
8a0a9f09e2 Update array-dataset.md
Method has already changed name to getTargets() instead of getLabels()
2016-11-04 00:03:49 +11:00
Patrick Florek
1ff455ebed Add index entries 2016-09-17 22:06:13 +02:00
Patrick Florek
fa87eca375 Add new class Set for simple Set-theoretical operations
### Features

* Works only with primitive types int, float, string
* Implements set theoretic operations union, intersection, complement
* Modifies set by adding, removing elements
* Implements \IteratorAggregate for use in loops

### Implementation details

Based on array functions:
* array_diff,
* array_merge,
* array_intersect,
* array_unique,
* array_values,
* sort.

### Drawbacks

* **Does not work with objects.**
* Power set and Cartesian product returning array of Set
2016-09-10 13:24:43 +02:00
Patrick Florek
90038befa9 Apply comments / coding styles
* Remove user-specific gitignore
* Add return type hints
* Avoid global namespace in docs
* Rename rules -> getRules
* Split up rule generation

Todo:
* Move set theory out to math
* Extract rule generation
2016-09-02 00:26:01 +02:00
Patrick Florek
c8bd8db601 # Association rule learning - Apriori algorithm
* Generating frequent k-length item sets
* Generating rules based on frequent item sets
* Algorithm has exponential complexity, be aware of it
* Apriori algorithm is split into apriori and candidates method
* Second step rule generation is implemented by rules method
* Internal methods are invoked for fine grain unit tests
* Wikipedia's train samples and an alternative are provided for test cases
* Small documentation for public interface is also shipped
2016-08-23 15:44:53 +02:00
Arkadiusz Kondas
3599367ce8 Add docs for neural network 2016-08-14 19:14:56 +02:00
Arkadiusz Kondas
2f5b090188 create contributing guide 2016-07-26 21:57:15 +02:00
Arkadiusz Kondas
6ed4761427 add examples link to readme 2016-07-24 13:35:13 +02:00
Arkadiusz Kondas
52cd58acb0 add info about minimum php version required 2016-07-20 09:15:52 +02:00
Arkadiusz Kondas
963cfea551 add ClassificationReport docs 2016-07-19 22:17:03 +02:00
Arkadiusz Kondas
76d15e9691 add php-ml logo 2016-07-17 00:31:47 +02:00
Arkadiusz Kondas
7abee3061a docs for files dataset and php-cs-fixer 2016-07-16 23:56:52 +02:00
Arkadiusz Kondas
7c0767c15a create docs for tf-idf transformer 2016-07-12 00:21:34 +02:00
Arkadiusz Kondas
ba8927459c add docs for ConfusionMatrix 2016-07-12 00:11:18 +02:00
Arkadiusz Kondas
bb35d045ba add docs for Pipeline 2016-07-12 00:00:17 +02:00
Arkadiusz Kondas
212be20fe7 create changelog 2016-07-11 21:12:49 +02:00