* Issue #349: Provide a new NGramTokenizer.
* Issue #349: Add tests.
* Fixes from code review.
* Implement NGramTokenizer with min and max gram support
* Add missing tests for ngram
* Add info about NGramTokenizer to docs and readme
* Add performance test for tokenization
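The NGramTokenizer entries above describe splitting words into character n-grams between a minimum and maximum length. This is a PHP library, so the following is only a minimal Python sketch of the idea; the function name and signature are illustrative, not the library's API.

```python
def ngram_tokenize(text, min_gram=1, max_gram=2):
    """Split text into words, then emit every character n-gram of each
    word whose length lies between min_gram and max_gram (inclusive)."""
    tokens = []
    for word in text.split():
        for n in range(min_gram, max_gram + 1):
            for i in range(len(word) - n + 1):
                tokens.append(word[i:i + n])
    return tokens
```

For example, `ngram_tokenize("hey", 1, 2)` yields the unigrams `h`, `e`, `y` followed by the bigrams `he`, `ey`.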
* KMeans associative clustering added
* Fix Travis error
* KMeans returns the provided keys as point labels
* Fix Travis build
* Fix Travis build
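The associative-clustering change above means that when samples are passed with keys, the clusters carry those keys through as point labels. A minimal Python sketch of the assignment step under that idea (the library is PHP; names here are illustrative, and this shows only one nearest-centroid pass, not full KMeans):

```python
import math

def assign_clusters(samples, centroids):
    """Assign each keyed sample to its nearest centroid; each cluster
    keeps the caller's keys as point labels (the associative idea)."""
    clusters = [dict() for _ in centroids]
    for key, point in samples.items():
        distances = [math.dist(point, c) for c in centroids]
        nearest = distances.index(min(distances))
        clusters[nearest][key] = point
    return clusters
```

A caller can then look clustered points up by the same keys it supplied, instead of by positional index.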
* Add data loader for the SVM file format
* Add tests for error cases
* Set proper exception messages
* Add documentation
* Add error checking code for invalid column format
* Add missing documentation
* Add test for svm model with probability estimation
* Extract buildPredictCommand method
* Fix test to use PHP_EOL
* Add predictProbability method (not completed)
* Add test for DataTransformer::predictions
* Fix SVM to use PHP_EOL
* Support probability estimation in SVM
* Add documentation
* Add InvalidOperationException class
* Throw InvalidOperationException before executing libsvm if probability estimation is not supported
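The entries above cover a loader for the sparse SVM data format with error checking for invalid columns. A minimal Python sketch of parsing one line of that format, assuming the common libsvm layout `<label> <index>:<value> ...` (the library itself is PHP; this parser is illustrative only):

```python
def parse_svm_line(line):
    """Parse one line of the sparse SVM format:
    '<label> <index>:<value> <index>:<value> ...'.
    Raises ValueError on a malformed column, mirroring the
    error checking described above."""
    parts = line.strip().split()
    label = parts[0]
    features = {}
    for column in parts[1:]:
        if ':' not in column:
            raise ValueError(f'Invalid column format: {column!r}')
        index, value = column.split(':', 1)
        features[int(index)] = float(value)
    return label, features
```

For example, `parse_svm_line('1 1:0.5 3:2.0')` returns the label `'1'` and the sparse feature map `{1: 0.5, 3: 2.0}`, while a column without a colon raises the error.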
* Ability to specify a per-layer activation function
* Some tests for the new Layer functionality
* Appease the style CI whitespace check
* More flexible addition of layers; developers can pass a Layer object in manually
* New test for passing a Layer object to the MLP constructor
* Documentation for the added MLP functionality
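The changes above let each layer own its activation function instead of the network imposing one globally. A minimal Python sketch of that design, assuming illustrative names (the real library is PHP and its constructor signatures differ):

```python
import math

ACTIVATIONS = {
    'sigmoid': lambda x: 1.0 / (1.0 + math.exp(-x)),
    'tanh': math.tanh,
    'relu': lambda x: max(0.0, x),
}

class Layer:
    """A layer that owns its activation function, so a network can be
    built from a mix of per-layer activations."""
    def __init__(self, size, activation='sigmoid'):
        self.size = size
        self.activation = ACTIVATIONS[activation]

# Layers can be constructed individually and passed in manually,
# each with its own activation.
layers = [Layer(4, 'relu'), Layer(3, 'tanh'), Layer(1, 'sigmoid')]
```

The point of the design is that the network code asks each layer for its own activation instead of hardcoding one.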
* Backpropagation uses the neuron's activation function derivative instead of a hardcoded sigmoid derivative
* Added missing activation function derivatives
* Sigmoid forced for the output layer
* Updated ThresholdedReLU default threshold to 0 (so it acts as a ReLU)
* Unit tests for derivatives
* Unit tests for classifiers using different activation functions
* Added missing docs
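The derivative changes above can be sketched as follows. During backpropagation the derivatives are conveniently expressed in terms of the activation's output, and with the default threshold of 0 the ThresholdedReLU derivative reduces to the plain ReLU derivative. This is a generic Python sketch, not the PHP library's code:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Derivatives expressed in terms of the activation output `y`,
# as used during backpropagation (instead of a hardcoded
# sigmoid derivative everywhere).
def sigmoid_derivative(y):
    return y * (1.0 - y)

def tanh_derivative(y):
    return 1.0 - y * y

def thresholded_relu_derivative(y, theta=0.0):
    # With the default theta = 0 this is the plain ReLU derivative.
    return 1.0 if y > theta else 0.0
```

For instance, at the sigmoid's midpoint `y = sigmoid(0) = 0.5` the derivative is `0.5 * (1 - 0.5) = 0.25`.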
* Add support for coveralls.io
* Generate coverage report only on php 7.2 build
* Fix osx travis build and move tools to bin dir
* Update php version badge
* Fix travis conditional statement
* Fix travis conditional statement
* Fix bin path
* Multiple training data sets allowed
* Tests with multiple training data sets
* Updating docs according to #38
Documented all models whose predictions are based on all
training data provided.
Some models already supported multiple training data sets.
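The behavior described above, where repeated training calls accumulate data so predictions are based on everything provided so far, can be sketched in Python as follows (the class and method names are illustrative; the library itself is PHP):

```python
class AccumulatingEstimator:
    """Each call to train() adds to the stored data set, so predictions
    are based on all training data provided so far."""
    def __init__(self):
        self.samples = []
        self.targets = []

    def train(self, samples, targets):
        # Append rather than replace: multiple training data sets
        # are merged into one.
        self.samples.extend(samples)
        self.targets.extend(targets)

est = AccumulatingEstimator()
est.train([[1], [2]], ['a', 'b'])
est.train([[3]], ['c'])
```

After the two calls the estimator holds all three samples, which is the contract the documentation change spells out.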
### Features
* Works only with the primitive types int, float, and string
* Implements the set-theoretic operations union, intersection, and complement
* Modifies the set by adding and removing elements
* Implements \IteratorAggregate for use in loops
### Implementation details
Based on the PHP array functions:
* array_diff
* array_merge
* array_intersect
* array_unique
* array_values
* sort
### Drawbacks
* **Does not work with objects.**
* Power set and Cartesian product return an array of Set objects
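The operations listed under Features map directly onto the PHP array functions named above: union is array_merge + array_unique, intersection is array_intersect, and complement is array_diff. A minimal Python sketch of the same mapping (Python's built-in sets stand in for the sorted-unique arrays the PHP implementation maintains):

```python
def union(a, b):
    # array_merge + array_unique + sort in the PHP implementation
    return sorted(set(a) | set(b))

def intersection(a, b):
    # array_intersect
    return sorted(set(a) & set(b))

def complement(a, b):
    # array_diff: elements of a not present in b
    return sorted(set(a) - set(b))
```

For example, `union([1, 2], [2, 3])` gives `[1, 2, 3]` while `complement([1, 2], [2, 3])` gives `[1]`.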
* Remove user-specific gitignore
* Add return type hints
* Avoid global namespace in docs
* Rename rules -> getRules
* Split up rule generation
Todo:
* Move set theory out to math
* Extract rule generation
* Generating frequent k-length item sets
* Generating rules based on frequent item sets
* The algorithm has exponential complexity; be aware of this
* The Apriori algorithm is split into the apriori and candidates methods
* Second-step rule generation is implemented by the rules method
* Internal methods are invoked for fine-grained unit tests
* Training samples from Wikipedia and an alternative set are provided for the test cases
* Brief documentation for the public interface is also shipped
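The frequent-itemset step described above (grow k-item candidate sets level by level, keep those meeting a minimum support) can be sketched generically in Python. This is not the library's PHP implementation, only an illustration of the technique, and the exponential blow-up in candidates is exactly the complexity warning above:

```python
def frequent_itemsets(transactions, min_support):
    """First Apriori step: grow frequent k-item sets level by level,
    keeping only sets whose support (fraction of transactions that
    contain them) meets min_support."""
    items = sorted({item for t in transactions for item in t})
    result = []
    k = 1
    current = [frozenset([i]) for i in items]
    while current:
        frequent = []
        for candidate in current:
            support = sum(candidate <= set(t) for t in transactions) / len(transactions)
            if support >= min_support:
                frequent.append(candidate)
                result.append((candidate, support))
        # Candidate generation: join frequent k-sets into (k+1)-sets.
        current = list({a | b for a in frequent for b in frequent if len(a | b) == k + 1})
        k += 1
    return result
```

Rule generation (the second step) would then walk each frequent itemset and split it into antecedent/consequent pairs, filtering by confidence; that step is omitted here for brevity.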