# Imputation of missing values

For various reasons, many real-world datasets contain missing values, often encoded as blanks, NaNs or other placeholders. To handle them, you can use the `Imputer` class.

## Constructor Parameters

* $missingValue (mixed) - the value that will be replaced (default null)
* $strategy (Strategy) - imputation strategy (ready to use: MeanStrategy, MedianStrategy, MostFrequentStrategy)
* $axis (int) - axis for the strategy, Imputer::AXIS_COLUMN or Imputer::AXIS_ROW
* $samples (array) - array of samples to train on (can be passed instead of calling `fit`)

```
$imputer = new Imputer(null, new MeanStrategy(), Imputer::AXIS_COLUMN);
$imputer = new Imputer(null, new MedianStrategy(), Imputer::AXIS_ROW);
```

## Strategy

* MeanStrategy - replace missing values using the mean along the axis
* MedianStrategy - replace missing values using the median along the axis
* MostFrequentStrategy - replace missing values using the most frequent value along the axis

## Example of use

```
use Phpml\Preprocessing\Imputer;
use Phpml\Preprocessing\Imputer\Strategy\MeanStrategy;

$data = [
    [1, null, 3, 4],
    [4, 3, 2, 1],
    [null, 6, 7, 8],
    [8, 7, null, 5],
];

// Replace each null with the mean of the known values in its column.
$imputer = new Imputer(null, new MeanStrategy(), Imputer::AXIS_COLUMN);
$imputer->fit($data);
$imputer->transform($data);

/*
$data = [
    [1, 5.33, 3, 4],
    [4, 3, 2, 1],
    [4.33, 6, 7, 8],
    [8, 7, 4, 5],
];
*/
```

You can also use the `$samples` constructor parameter instead of the `fit` method:

```
use Phpml\Preprocessing\Imputer;
use Phpml\Preprocessing\Imputer\Strategy\MeanStrategy;

$data = [
    [1, null, 3, 4],
    [4, 3, 2, 1],
    [null, 6, 7, 8],
    [8, 7, null, 5],
];

// Passing $data as the fourth argument trains the imputer immediately.
$imputer = new Imputer(null, new MeanStrategy(), Imputer::AXIS_COLUMN, $data);
$imputer->transform($data);
```
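
To make the column-wise mean strategy concrete, the sketch below reproduces the arithmetic behind the example output in plain PHP. It is only an illustration and not the library's implementation: `imputeColumnMean` is a hypothetical helper, and it assumes every row has the same number of columns and that missing values can be detected with a strict comparison.

```php
<?php

/**
 * Hypothetical helper (not part of php-ml): replaces every $missingValue
 * in $samples with the mean of the known values in the same column.
 * Assumes all rows have the same length.
 */
function imputeColumnMean(array $samples, $missingValue = null): array
{
    if ($samples === []) {
        return $samples;
    }

    $columnCount = count($samples[0]);

    for ($col = 0; $col < $columnCount; $col++) {
        // Collect the known (non-missing) values of this column.
        $known = [];
        foreach ($samples as $row) {
            if ($row[$col] !== $missingValue) {
                $known[] = $row[$col];
            }
        }

        if ($known === []) {
            continue; // nothing to average, leave the column untouched
        }

        $mean = array_sum($known) / count($known);

        // Fill the gaps in this column with the column mean.
        foreach ($samples as $i => $row) {
            if ($row[$col] === $missingValue) {
                $samples[$i][$col] = $mean;
            }
        }
    }

    return $samples;
}

$data = [
    [1, null, 3, 4],
    [4, 3, 2, 1],
    [null, 6, 7, 8],
    [8, 7, null, 5],
];

$filled = imputeColumnMean($data);
// Column 1 has known values 3, 6, 7, so $filled[0][1] is 16 / 3 (about 5.33).
// Column 0 has known values 1, 4, 8, so $filled[2][0] is 13 / 3 (about 4.33).
```

Unlike `Imputer::transform`, which updates the array passed to it, this helper returns a new array and leaves `$data` unchanged; that choice is only for clarity in the sketch.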