MapReduce
implements
IteratorAggregate
Implements a simplistic version of the popular Map-Reduce algorithm. Acts as an iterator over the results obtained once every record of the original data has been processed, thus offering a transparent wrapper for results coming from any source.
Table of Contents
Interfaces
- IteratorAggregate
Properties
- $_counter : int
- Count of elements emitted during the Reduce phase
- $_data : Traversable
- Holds the original data that needs to be processed
- $_executed : bool
- Whether the Map-Reduce routine has been executed already on the data
- $_intermediate : array<string|int, mixed>
- Holds the shuffled results that were emitted from the map phase
- $_mapper : callable
- A callable that will be executed for each record in the original data
- $_reducer : callable|null
- A callable that will be executed for each intermediate record emitted during the Map phase
- $_result : array<string|int, mixed>
- Holds the results as emitted during the reduce phase
Methods
- __construct() : mixed
- Constructor
- emit() : void
- Appends a new record to the final list of results and optionally assigns a key to this record.
- emitIntermediate() : void
- Appends a new record to the bucket labelled with $bucket, usually as a result of mapping a single record from the original data.
- getIterator() : Traversable
- Returns an iterator with the end result of running the Map and Reduce phases on the original data
- _execute() : void
- Runs the actual Map-Reduce algorithm. It iterates over the original data and calls the mapper function for each record, then calls the reduce function for each intermediate bucket created during the Map phase.
Properties
$_counter
Count of elements emitted during the Reduce phase
protected
int
$_counter
= 0
$_data
Holds the original data that needs to be processed
protected
Traversable
$_data
$_executed
Whether the Map-Reduce routine has been executed already on the data
protected
bool
$_executed
= false
$_intermediate
Holds the shuffled results that were emitted from the map phase
protected
array<string|int, mixed>
$_intermediate
= []
$_mapper
A callable that will be executed for each record in the original data
protected
callable
$_mapper
$_reducer
A callable that will be executed for each intermediate record emitted during the Map phase
protected
callable|null
$_reducer
$_result
Holds the results as emitted during the reduce phase
protected
array<string|int, mixed>
$_result
= []
Methods
__construct()
Constructor
public
__construct(Traversable $data, callable $mapper[, callable|null $reducer = null ]) : mixed
Example:
Separate all unique odd and even numbers in an array
$data = new \ArrayObject([1, 2, 3, 4, 5, 3]);
$mapper = function ($value, $key, $mr) {
    $type = ($value % 2 === 0) ? 'even' : 'odd';
    $mr->emitIntermediate($value, $type);
};
$reducer = function ($numbers, $type, $mr) {
    $mr->emit(array_unique($numbers), $type);
};
$results = new MapReduce($data, $mapper, $reducer);
The previous example will generate the following result:
['odd' => [1, 3, 5], 'even' => [2, 4]]
Parameters
- $data : Traversable
-
the original data to be processed
- $mapper : callable
-
the mapper callback. This function receives three arguments: the current value, the key of the current record, and this class instance, so that the result emitters can be called.
- $reducer : callable|null = null
-
the reducer callback. This function receives three arguments: the list of values inside a bucket, the name of the bucket that was created during the mapping phase, and an instance of this class.
emit()
Appends a new record to the final list of results and optionally assigns a key to this record.
public
emit(mixed $val[, mixed $key = null ]) : void
Parameters
- $val : mixed
-
The value to be appended to the final list of results
- $key : mixed = null
-
An optional key to assign to the value
emitIntermediate()
Appends a new record to the bucket labelled with $bucket, usually as a result of mapping a single record from the original data.
public
emitIntermediate(mixed $val, mixed $bucket) : void
Parameters
- $val : mixed
-
The record itself to store in the bucket
- $bucket : mixed
-
the name of the bucket where to put the record
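The bucketing performed during the Map phase can be pictured with a plain array. This is a minimal sketch rather than the class's actual source: `$buckets` and `$emitIntermediate` are hypothetical stand-ins for the `$_intermediate` property and the `emitIntermediate()` method, and it assumes each bucket is simply a list stored under its name.

```php
<?php
// Hypothetical stand-in for the $_intermediate property.
$buckets = [];

// Hypothetical stand-in for emitIntermediate():
// append $val to the list stored under the name $bucket.
$emitIntermediate = function ($val, $bucket) use (&$buckets): void {
    $buckets[$bucket][] = $val;
};

// Same input and bucket names as the constructor example.
foreach ([1, 2, 3, 4, 5, 3] as $value) {
    $emitIntermediate($value, $value % 2 === 0 ? 'even' : 'odd');
}

// Each bucket keeps every record it received, duplicates included.
print_r($buckets);
```

Under these assumptions the duplicate 3 is still present in the 'odd' bucket at this point; removing it is left to the reducer, as the `array_unique()` call in the constructor example does.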
getIterator()
Returns an iterator with the end result of running the Map and Reduce phases on the original data
public
getIterator() : Traversable
Return values
Traversable
_execute()
Runs the actual Map-Reduce algorithm. It iterates over the original data and calls the mapper function for each record, then calls the reduce function for each intermediate bucket created during the Map phase.
protected
_execute() : void
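To make the two phases concrete, here is a simplified sketch of how such a class could be put together. It is an illustration under stated assumptions, not the library's implementation: `SketchMapReduce` and its un-prefixed property names are hypothetical, the exception thrown for a missing reducer is an assumption, and so is the use of a running counter as the key for records emitted without one.

```php
<?php
// Hypothetical, simplified stand-in for the documented class.
class SketchMapReduce implements IteratorAggregate
{
    protected Traversable $data;
    protected array $intermediate = [];
    protected array $result = [];
    protected bool $executed = false;
    protected int $counter = 0;
    /** @var callable */
    protected $mapper;
    /** @var callable|null */
    protected $reducer;

    public function __construct(Traversable $data, callable $mapper, ?callable $reducer = null)
    {
        $this->data = $data;
        $this->mapper = $mapper;
        $this->reducer = $reducer;
    }

    public function emitIntermediate($val, $bucket): void
    {
        $this->intermediate[$bucket][] = $val;
    }

    public function emit($val, $key = null): void
    {
        // Assumption: records emitted without a key are keyed by a running counter.
        $this->result[$key ?? $this->counter] = $val;
        $this->counter++;
    }

    public function getIterator(): Traversable
    {
        // Run the algorithm lazily, and only once.
        if (!$this->executed) {
            $this->execute();
        }

        return new ArrayIterator($this->result);
    }

    protected function execute(): void
    {
        // Map phase: call the mapper once per original record.
        foreach ($this->data as $key => $val) {
            ($this->mapper)($val, $key, $this);
        }
        // Assumption: buckets without a reducer are treated as an error.
        if ($this->intermediate !== [] && $this->reducer === null) {
            throw new LogicException('No reducer function was provided');
        }
        // Reduce phase: call the reducer once per bucket.
        foreach ($this->intermediate as $bucket => $values) {
            ($this->reducer)($values, $bucket, $this);
        }
        $this->intermediate = [];
        $this->executed = true;
    }
}

// Usage: the odd/even example from the constructor documentation.
$data = new ArrayObject([1, 2, 3, 4, 5, 3]);
$mapper = function ($value, $key, $mr) {
    $mr->emitIntermediate($value, ($value % 2 === 0) ? 'even' : 'odd');
};
$reducer = function ($numbers, $type, $mr) {
    $mr->emit(array_unique($numbers), $type);
};
$results = iterator_to_array(new SketchMapReduce($data, $mapper, $reducer));
print_r($results);
```

Because nothing runs until `getIterator()` is called, constructing the object is cheap; the work happens on the first iteration, and the `executed` flag keeps later iterations from repeating it.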