About the Immutable Collections I added to bingo-functional

Lochemem Bruno Michael
3 min readAug 22, 2018

--

PHP arrays are supercharged. Arrays in PHP are almost a defacto data structure. Useful for a litany of operations — serialization, JSON data manipulation, and even flow-control to mention but a few, PHP arrays are incredibly ubiquitous. There are, however, certain things that are negated by many array users.

Arrays are, in PHP, hash table implementations. Said fact might be tacit — the intricacies of low-level hash table implementations are not. PHP has, through many iterations of its existence, seen multiple changes to its core. Among the myriad of changes made to the language is the modification of the hash table hashing, collision, and index-resolution algorithms. A hash table works in such a way that string index to integer value conversion occurs via some hashing mechanism. Combine said mechanism with bucket value allocation and, the result is a large set. A lot more information about the more arcane workings of the PHP Virtual Machine’s parsing of arrays is available on Nikita Popov’s website.

The utility of arrays, ever-manifested in their malleability, stems from their existence as functors. A functor is a container the contents of which can be mapped over. Functors are — in PHP implemented as type classes that support ad-hoc polymorphism. The problem with PHP arrays (hashtables in disguise) is that they consume large chunks of memory when exceedingly large (upwards of one million entries). SPL data structures are one mitigation of said encumbrance but, despite their potency, they are mostly difficult to implement in a codebase. The other solution, the php-ds extension, is particularly interesting — it has a polyfill for non-extension-present installations of PHP that looks promising.

The solution I implemented in bingo-functional is one that relies on SPL data structures — in particular, the SPL fixed array. I consider said structure to be the most integrable of all SPL utilities hence its usage in an ad-hoc polymorphic data structure aptly named Collection. The inspiration for this package extension — the immutable-php library built by Joshua Koudys is one with a similar offering. SPL fixed arrays significantly outperform their hashtable key-value counterparts and can — in the right conditions — be a potent storage container. Below is a snippet of sample usage:


use Chemem\Bingo\Functional\Immutable\Collection;
$list = Collection::of(range(1, 1000000));$list->map(function ($val) { return $val * 20; });

The code above creates a list containing one million entries and multiplies each value by twenty. The same snippet merely emphasizes the robustness of the Collection component. Also, the code shown above demonstrates usage of the map helper function which — like the map function native to the PHP userland — applies a function (in this case the lambda that multiplies a supplied argument by twenty) to each value in the list. If this makes any sense — the Collection class is more conspicuously functor-esque than a hash table: it has some inbuilt methods that include said map function, flat map, and filter functions to mention but a few. Also, the Collection class is hash table storage-capable; the snippet below demonstrates this.

const GOT_EPISODES = [
[
'no' => 1,
'title' => 'Winter is Coming'
],
[
'no' => 2,
'title' => 'The Kingsroad'
],
[
'no' => 12,
'title' => 'The Night Lands'
]
];
$firstSeason = Collection::from(...GOT_EPISODES)
->filter(function (array $episode) { return $episode['no'] < 11; });

Because I am no stranger to Westeros material, the example above suffices as proof of concept. Ideally, the data in the immutable array presents itself as the result of one of either an API call or database interaction — the filter operation on the underlying nested hash tables occurs within a Collection item and thus — SPL fixed array element.

PHP arrays (hash tables) are useful in a litany of ways but their nascent large size (a product of hash calculation and collision truncation optimizations at the PHP-VM layer) — like that of many constructs in computing — is directly proportional to available memory. The persistent data structure, the Collection — recently added to bingo-functional — relies on the SPL fixed array, a less memory intensive relative to the hash table that is convenient for functional programming.

--

--