Fr3nch13/CakePHP Utilities

MapReduce
in package
implements IteratorAggregate

Implements a simplistic version of the popular Map-Reduce algorithm. Acts like an iterator for the original passed data after each result has been processed, thus offering a transparent wrapper for results coming from any source.

Table of Contents

Interfaces

IteratorAggregate

Properties

$_counter  : int
Count of elements emitted during the Reduce phase
$_data  : Traversable
Holds the original data that needs to be processed
$_executed  : bool
Whether the Map-Reduce routine has been executed already on the data
$_intermediate  : array<string|int, mixed>
Holds the shuffled results that were emitted from the map phase
$_mapper  : callable
A callable that will be executed for each record in the original data
$_reducer  : callable|null
A callable that will be executed for each intermediate record emitted during the Map phase
$_result  : array<string|int, mixed>
Holds the results as emitted during the reduce phase

Methods

__construct()  : mixed
Constructor
emit()  : void
Appends a new record to the final list of results and optionally assigns a key to this record.
emitIntermediate()  : void
Appends a new record to the bucket labelled with $bucket, usually as a result of mapping a single record from the original data.
getIterator()  : Traversable
Returns an iterator with the end result of running the Map and Reduce phases on the original data
_execute()  : void
Runs the actual Map-Reduce algorithm: iterates over the original data and calls the mapper function for each record, then calls the reducer function for each intermediate bucket created during the Map phase.

Properties

$_counter

Count of elements emitted during the Reduce phase

protected int $_counter = 0

$_data

Holds the original data that needs to be processed

protected Traversable $_data

$_executed

Whether the Map-Reduce routine has been executed already on the data

protected bool $_executed = false

$_intermediate

Holds the shuffled results that were emitted from the map phase

protected array<string|int, mixed> $_intermediate = []

$_mapper

A callable that will be executed for each record in the original data

protected callable $_mapper

$_reducer

A callable that will be executed for each intermediate record emitted during the Map phase

protected callable|null $_reducer

$_result

Holds the results as emitted during the reduce phase

protected array<string|int, mixed> $_result = []

Methods

__construct()

Constructor

public __construct(Traversable $data, callable $mapper[, callable|null $reducer = null ]) : mixed

Example:

Separate all unique odd and even numbers in an array

 $data = new \ArrayObject([1, 2, 3, 4, 5, 3]);
 $mapper = function ($value, $key, $mr) {
     $type = ($value % 2 === 0) ? 'even' : 'odd';
     $mr->emitIntermediate($value, $type);
 };

 $reducer = function ($numbers, $type, $mr) {
     $mr->emit(array_unique($numbers), $type);
 };
 $results = new MapReduce($data, $mapper, $reducer);

Iterating over $results in the previous example yields the following:

 ['odd' => [1, 3, 5], 'even' => [2, 4]]
Parameters
$data : Traversable

The original data to be processed.

$mapper : callable

The mapper callback. It receives three arguments: the current value, the key of that value in the original data, and this class instance, so that the callback can call the result emitters.

$reducer : callable|null = null

The reducer callback. It receives three arguments: the list of values inside a bucket, the name of that bucket as created during the Map phase, and this class instance.

emit()

Appends a new record to the final list of results and optionally assigns a key to this record.

public emit(mixed $val[, mixed $key = null ]) : void
Parameters
$val : mixed

The value to be appended to the final list of results

$key : mixed = null

An optional key to assign to the value
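
The difference between the keyed and keyless forms can be sketched as follows. Here emit() is called straight from the mapper, so no reducer is passed. The block assumes the class is CakePHP's Cake\Collection\Iterator\MapReduce and that a Composer autoloader sits at vendor/autoload.php; neither the namespace nor the path is stated on this page.

```php
<?php
// Assumptions: Composer autoloader path and the Cake\Collection\Iterator namespace.
require 'vendor/autoload.php';

use Cake\Collection\Iterator\MapReduce;

$data = new ArrayObject(['a' => 1, 'b' => 2, 'c' => 3, 'd' => 4]);

// Without a key: even values are simply appended to the result list.
$evens = new MapReduce($data, function ($value, $key, $mr) {
    if ($value % 2 === 0) {
        $mr->emit($value);
    }
});

// With a key: each squared value is stored under the key of its source record.
$squares = new MapReduce($data, function ($value, $key, $mr) {
    $mr->emit($value * $value, $key);
});

$evenList = iterator_to_array($evens);
$squareMap = iterator_to_array($squares);
```

Because the mapper never calls emitIntermediate(), the Reduce phase has nothing to process and the optional $reducer argument can be left out.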

emitIntermediate()

Appends a new record to the bucket labelled with $bucket, usually as a result of mapping a single record from the original data.

public emitIntermediate(mixed $val, mixed $bucket) : void
Parameters
$val : mixed

The record itself to store in the bucket

$bucket : mixed

The name of the bucket in which to store the record
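
As a small illustration of bucketing, the sketch below groups words by their first letter and lets the reducer count each bucket. The word list and variable names are made up for the example. The namespace Cake\Collection\Iterator and the autoloader path are assumptions, since this page does not state them.

```php
<?php
// Assumptions: Composer autoloader path and the Cake\Collection\Iterator namespace.
require 'vendor/autoload.php';

use Cake\Collection\Iterator\MapReduce;

$words = new ArrayIterator(['apple', 'avocado', 'banana', 'blueberry', 'cherry']);

// Map phase: store each word in a bucket named after its first letter.
$mapper = function ($word, $key, $mr) {
    $mr->emitIntermediate($word, $word[0]);
};

// Reduce phase: each bucket arrives as the list of values stored in it.
$reducer = function ($bucketWords, $letter, $mr) {
    $mr->emit(count($bucketWords), $letter);
};

$counts = iterator_to_array(new MapReduce($words, $mapper, $reducer));
// $counts is ['a' => 2, 'b' => 2, 'c' => 1]
```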

getIterator()

Returns an iterator with the end result of running the Map and Reduce phases on the original data

public getIterator() : Traversable
Return values
Traversable
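
Since the class implements IteratorAggregate, PHP calls getIterator() implicitly when an instance is used in foreach or passed to iterator_to_array(). The sketch below reuses the odd/even example from the constructor section. The namespace and the autoloader path are assumptions not stated on this page.

```php
<?php
// Assumptions: Composer autoloader path and the Cake\Collection\Iterator namespace.
require 'vendor/autoload.php';

use Cake\Collection\Iterator\MapReduce;

$data = new ArrayObject([1, 2, 3, 4, 5, 3]);

$mapper = function ($value, $key, $mr) {
    $type = ($value % 2 === 0) ? 'even' : 'odd';
    $mr->emitIntermediate($value, $type);
};

$reducer = function ($numbers, $type, $mr) {
    $mr->emit(array_unique($numbers), $type);
};

$results = new MapReduce($data, $mapper, $reducer);

// foreach obtains the iterator through getIterator() behind the scenes.
$collected = [];
foreach ($results as $type => $numbers) {
    $collected[$type] = array_values($numbers);
}
// $collected is ['odd' => [1, 3, 5], 'even' => [2, 4]]
```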

_execute()

Runs the actual Map-Reduce algorithm: iterates over the original data and calls the mapper function for each record, then calls the reducer function for each intermediate bucket created during the Map phase.

protected _execute() : void
Tags
throws
LogicException

if emitIntermediate was called but no reducer function was provided
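
The flow described above can be approximated with a standalone sketch. SketchMapReduce is a hypothetical stand-in written for illustration, not the library's source: its property names, the exception message, and the way keyless results are appended are all assumptions.

```php
<?php
// Hypothetical stand-in for illustration only; not the library's implementation.
final class SketchMapReduce implements IteratorAggregate
{
    private Traversable $data;
    private $mapper;
    private $reducer;
    private array $intermediate = [];
    private array $result = [];
    private bool $executed = false;

    public function __construct(Traversable $data, callable $mapper, ?callable $reducer = null)
    {
        $this->data = $data;
        $this->mapper = $mapper;
        $this->reducer = $reducer;
    }

    public function emitIntermediate($val, $bucket): void
    {
        $this->intermediate[$bucket][] = $val;
    }

    public function emit($val, $key = null): void
    {
        // Simplification: keyless values are appended with the next integer key.
        if ($key === null) {
            $this->result[] = $val;
        } else {
            $this->result[$key] = $val;
        }
    }

    public function getIterator(): Traversable
    {
        if (!$this->executed) {
            $this->execute();
        }

        return new ArrayIterator($this->result);
    }

    private function execute(): void
    {
        // Map phase: call the mapper once per original record.
        foreach ($this->data as $key => $val) {
            ($this->mapper)($val, $key, $this);
        }
        // Buckets without a reducer cannot be processed.
        if ($this->intermediate !== [] && $this->reducer === null) {
            throw new LogicException('No reducer function was provided');
        }
        // Reduce phase: call the reducer once per bucket.
        foreach ($this->intermediate as $bucket => $values) {
            ($this->reducer)($values, $bucket, $this);
        }
        $this->intermediate = [];
        $this->executed = true;
    }
}

// A mapper that fills a bucket, with no reducer to consume it.
$noReducer = new SketchMapReduce(new ArrayIterator([1, 2]), function ($v, $k, $mr) {
    $mr->emitIntermediate($v, 'all');
});

try {
    iterator_to_array($noReducer);
    $threw = false;
} catch (LogicException $e) {
    $threw = true;
}
// $threw is true
```

The usage at the end shows the documented failure case: the mapper fills a bucket but no reducer was supplied, so the first attempt to iterate throws a LogicException.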

