More Functions — Functional Programming in PHP (Part 4)
This is the fourth part of a series of articles about Functional Programming in PHP. Previous parts:
Introduction
Currying
Function Composition
The first Functional Programming language, Lisp, derives its name from ‘LISt Processing’. Working with lists is a fundamental concept in FP. Imagine your favorite necklace of pearls and the reassuring feeling of moving your fingers through it one by one. This simple activity has such an enormous effect that strings of beads appear in the spiritual practices of major world religions from Buddhism to Christianity.
We will dwell on this concept for a while, as it lets us slice the elephant and cope with any amount of data in consumable chunks. To let our functions cope with arbitrarily large amounts of data, we have to teach them how to handle lists effectively. We will focus on two common kinds of lists in PHP: strings and arrays.
Chopping heads
There are many things that we can organize into lists, but if we want to use lists as the primary means of data processing, we will soon find ourselves distinguishing parts of the list. One of these distinguished parts is the head. In PHP’s terms, that means, for example, the first element of an array. Assume we want to walk our list, string, stream, or anything similar, and process it for as long as it takes. To walk the list in manageable chunks, we need to start somewhere: stop right at the first element, take it, operate on it, and continue with the next, which in turn is just the first of the remaining elements of the list.
I intentionally blurred the difference between different types of lists here, because we treat a list as an abstraction: elements linked one after another. In this sense a `string` is just another list containing elements of type `char`, which is indeed how we find it implemented in languages like C and Haskell.
We will call the function that returns the first element of the list `head`, and we will use it heavily for various reasons, but most often to walk our list element by element, recursively.
Implementing `head` in PHP is easy. We can implement it for multiple data types, even `object`.
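A minimal sketch of what such a `head` could look like, assuming PHP 7.1 or later. It treats a string as a list of single bytes (so multibyte characters are not handled) and returns `null` for an empty list; both choices are assumptions made for this illustration.

```php
<?php

/**
 * Returns the first element of a list, or null when the list is empty.
 * Accepts strings, arrays, Traversables, and plain objects.
 *
 * @param string|iterable|object $xs
 * @return mixed
 */
function head($xs)
{
    if (is_string($xs)) {
        // A string is treated as a list of single-byte characters.
        return $xs === '' ? null : $xs[0];
    }

    if (is_iterable($xs) || is_object($xs)) {
        // foreach works on arrays, Traversables, and the public properties
        // of plain objects; we return during the first iteration.
        foreach ($xs as $x) {
            return $x;
        }
    }

    // Empty list or unsupported type.
    return null;
}
```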
Our function takes a list and returns the first element of that list, if any. Again, we don’t care what kind of list it is. We could implement it for `Iterator` and streams as well. In statically typed languages this would be implemented with generics, taking `List<T>` as input and simply returning `T`.
The output using this function could be symbolized like this:
head([ a, b, c ]) = a
head({ a, b, c }) = a
head(Iterator([ a, b, c ])) = a
Pull the tail
As stated in the previous section, to walk our list recursively we need another function, one that returns every element of the list but the first. This part of the list is called the tail. We can implement `tail` for arrays by simply calling `array_slice($xs, 1)`, or `substr($xs, 1)` for strings. We can use a wrapper function similar to the one for `head` to blur the difference if we wish to do so. Usage example:
tail("!rest") = rest
tail([ a, b, c ]) = array (
0 => 'b',
1 => 'c',
)
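A wrapper along these lines might look like the sketch below. The `length` function is a hypothetical helper added here only to show how a list can be walked recursively through its tail; it is not part of the original set of functions.

```php
<?php

/**
 * Returns every element of a list except the first one.
 *
 * @param string|array $xs
 * @return string|array
 */
function tail($xs)
{
    if (is_string($xs)) {
        // Before PHP 8, substr() could return false for an empty string,
        // so the result is cast back to a string.
        return (string) substr($xs, 1);
    }

    // array_slice() reindexes integer keys, so the tail starts at 0 again.
    return array_slice($xs, 1);
}

/**
 * Hypothetical helper: counts elements recursively. An empty list has
 * length 0; any other list is one element longer than its tail.
 *
 * @param string|array $xs
 */
function length($xs): int
{
    return ($xs === [] || $xs === '') ? 0 : 1 + length(tail($xs));
}
```

Recursion like this is fine for short lists; PHP does not optimize tail calls, so very long lists would still call for a loop.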
Take it or leave it
Dividing a list into head and tail has an undeniable benefit when walking the list recursively, but we’re not restricted to walking it element by element. We can also walk it in groups of elements, where a group can consist of an arbitrary number of elements. With tools this powerful, a wide range of data structures become representable as mere lists. For example, a list of pairs (tuples) can be reduced to a simple list: we take the first two elements from the list, then the first two elements from the rest of the list, and so on. We can also walk the list dynamically, assuming that some elements define how many elements to take next, or how many to skip.
For this we need to implement two functions, `take` and `drop`. Both accept two arguments: an integer representing how many elements to take (or skip) at once, and a list to walk.
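One possible shape for these two functions is sketched below. The currying is done by hand inside each function, which is an assumption of this sketch: the currying helper from the earlier part of the series is not reproduced here. Boundaries such as negative counts are not checked.

```php
<?php

/**
 * take(int $n, string|array $xs): the first $n elements of $xs.
 * Called with only $n, it returns a function that still expects $xs.
 */
function take(...$args)
{
    // The underlying function with its real signature.
    $take = function (int $n, $xs) {
        return is_string($xs)
            ? (string) substr($xs, 0, $n)
            : array_slice($xs, 0, $n);
    };

    if (count($args) >= 2) {
        // Both arguments are known: call the underlying function right away.
        return $take($args[0], $args[1]);
    }

    // Only $n is known: return a function waiting for the list.
    return function ($xs) use ($take, $args) {
        return $take($args[0], $xs);
    };
}

/**
 * drop(int $n, string|array $xs): $xs without its first $n elements.
 * Called with only $n, it returns a function that still expects $xs.
 */
function drop(...$args)
{
    $drop = function (int $n, $xs) {
        return is_string($xs)
            ? (string) substr($xs, $n)
            : array_slice($xs, $n);
    };

    if (count($args) >= 2) {
        return $drop($args[0], $args[1]);
    }

    return function ($xs) use ($drop, $args) {
        return $drop($args[0], $xs);
    };
}

// Partial application: feed only the first argument now, the list later.
$takePair = take(2);
$firstPair = $takePair([1, 2, 3, 4]); // [1, 2]
$rest = drop(2, [1, 2, 3, 4]);        // [3, 4]
```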
These two functions, while also blurring the difference between strings and arrays, have some peculiarities not seen in `head` and `tail` that are worth talking about. Both functions are curried, allowing the caller to provide the two arguments one at a time, as they become available. Like this, I could create a new function called `takePair` by simply feeding `take` with only one argument: `take(2)`. The result is a new function which takes only one argument, the list. This technique is called partial application; it provides even more flexibility for creating new functions, and is therefore heavily used.
FP languages tend to do currying automatically, but in PHP we have to do it manually whenever we think we might need it. This adds some noise to our code and also obscures the signature of our function, which would normally be what we see inside the curried function: `fn($n, $xs)`, where `$n` is the number of elements to take or drop, and `$xs` is just the list.
We also use a variable-length argument list again to pass whatever is provided straight to our curried function, so that if both required arguments are provided, we can simply call the underlying function immediately.
It would make sense to add comments referring to the real expected parameter list of the function.
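To come back to the list of pairs mentioned earlier in this section, here is a small sketch of walking a flat list two elements at a time. The `pairs` function is hypothetical, and `array_slice` stands in for `take(2)` and `drop(2)` so that the snippet runs on its own.

```php
<?php

/**
 * Hypothetical helper: groups a flat list into pairs by repeatedly
 * taking two elements and continuing with the rest of the list.
 */
function pairs(array $xs): array
{
    if (count($xs) < 2) {
        // Nothing left to pair up; a trailing single element is dropped.
        return [];
    }

    $pair = array_slice($xs, 0, 2); // what take(2, $xs) would return
    $rest = array_slice($xs, 2);    // what drop(2, $xs) would return

    return array_merge([$pair], pairs($rest));
}
```

Dropping a trailing unpaired element is a design choice of this sketch; returning it as a one-element group would be just as defensible.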
Last but not least
One may wonder why we start processing lists at their first element. For the same reason we start indexing arrays at zero. In C, an array or a string is a contiguous block of memory referred to by a pointer to its first element. To reach the element at index i, we add i times the size of the element’s data type to that pointer, so if we add zero, we move nowhere and get the first element. Hence array access is just syntactic sugar for pointer arithmetic. A singly linked list works similarly: all we hold is a pointer to the first element, which in turn holds a pointer to the next element of the list.
This also means that getting the first element is cheap, for we don’t need to perform any calculation on the address, whereas in a linked list the further we want to walk, the more effort it takes, especially if we need to start over again and again. Under such circumstances, it made perfect sense to keep track of the first element and of the rest (the tail).
Consequently, reaching the last element of a long list can be expensive. With that in mind, we can still create functions for accessing the last element, or for getting every element of the list but the last one.
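A sketch of these two functions follows. The names `last` and `init` are borrowed from Haskell and are an assumption here, since the text does not name them; `last` returns `null` for an empty list.

```php
<?php

/**
 * Returns the last element of a list, or null when the list is empty.
 *
 * @param string|array $xs
 * @return mixed
 */
function last($xs)
{
    if (is_string($xs)) {
        return $xs === '' ? null : $xs[strlen($xs) - 1];
    }

    // end() only moves the internal pointer of the local copy,
    // because $xs was passed by value.
    return $xs === [] ? null : end($xs);
}

/**
 * Returns every element of a list except the last one.
 *
 * @param string|array $xs
 * @return string|array
 */
function init($xs)
{
    // A negative length tells both functions to stop one element early.
    return is_string($xs)
        ? (string) substr($xs, 0, -1)
        : array_slice($xs, 0, -1);
}
```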
These functions have only one parameter, the list, so they don’t need to be curried; it would add no extra benefit.
I need to mention that although I provided implementations for `string` and `array`, our list ideally is neither a string nor an array, but an abstraction of any elements linked one after another sequentially. If we had this abstraction, interface, or more accurately container, we wouldn’t need to care what kind of values it holds.
Scrutinizing objects
While lists can hold any kind of value (especially if we use PHP’s arrays as the underlying implementation, which also lets us mix types within a list), it still makes sense to use more complex data structures, such as associative arrays, for encapsulating and managing data that belongs together and acts as a unit. If we want to use OOP libraries, we will most likely get results nicely packed into class instances or objects. To access the properties of such objects we don’t need to switch paradigms all of a sudden; we can create a function that takes the name of a property and returns the value of that property. The function’s signature would be something like this:
prop(string $prop, object | array $obj): mixed
What does this signature tell you? Would it make sense to curry this function? Definitely! For example, if we have a JSON string returned from an API and parsed into an object or associative array, chances are we want to access some of its properties sooner than others. In that case, we could simply feed `prop` with the property name and the object, and it would return the value we’re looking for. But if the value is nested deeply in the structure, or we want to reuse code without repeating ourselves, we could also define accessor functions in advance by feeding only the property name to our curried `prop` function, which in turn would reply with a new function expecting only the object to work with.
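A minimal sketch of a curried `prop` for associative arrays, with the currying again done by hand. The JSON payload and the `$user` and `$name` accessors are made up for the illustration, and a missing key yields `null`.

```php
<?php

/**
 * prop(string $prop, array $obj): the value stored under $prop, or null.
 * Called with only $prop, it returns a function that still expects $obj.
 */
function prop(...$args)
{
    // The underlying function with its real signature.
    $prop = function (string $name, array $obj) {
        return $obj[$name] ?? null;
    };

    if (count($args) >= 2) {
        return $prop($args[0], $args[1]);
    }

    return function (array $obj) use ($prop, $args) {
        return $prop($args[0], $obj);
    };
}

// A made-up API response, decoded into nested associative arrays.
$response = json_decode('{"user": {"name": "Ada", "email": "ada@example.com"}}', true);

// Accessors defined in advance by feeding only the property name.
$user = prop('user');
$name = prop('name');

$userName = $name($user($response)); // "Ada"
```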
For the sake of simplicity, I only provided an implementation for associative arrays here, but with PHP’s powerful Reflection API we can easily implement this for any object, even when its class doesn’t want us to access its properties directly.
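One way such a Reflection-based helper might look is sketched below, assuming PHP 7.4 or later. The function name `objectProp` and the `Account` class are hypothetical; the helper returns `null` when the property is missing or uninitialized.

```php
<?php

/**
 * Hypothetical helper: reads a property from an object, even a private
 * or protected one. Returns null when the property is missing or has
 * not been initialized yet.
 *
 * @return mixed
 */
function objectProp(string $name, object $obj)
{
    // ReflectionObject also sees properties added at runtime (e.g. stdClass).
    $reflection = new ReflectionObject($obj);

    if (!$reflection->hasProperty($name)) {
        return null;
    }

    $property = $reflection->getProperty($name);

    if (PHP_VERSION_ID < 80100) {
        // Needed before PHP 8.1 to read non-public properties.
        $property->setAccessible(true);
    }

    return $property->isInitialized($obj) ? $property->getValue($obj) : null;
}

// Made-up class whose properties are not meant to be read directly.
class Account
{
    private $owner = 'Ada';
    protected $balance = 42;
}

$owner = objectProp('owner', new Account()); // "Ada"
```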
Maybe we’re making too many assumptions here, but we also applied some safety checks, and eventually we can use this helper function to make our original `prop` function capable of dealing with class instances and standard objects as well. This little bit of magic helps us stay in the FP domain while dealing with output from OOP libraries.
Summary
As we can see, most of these functions are simple, and if we disregard the multiple implementations for slightly different data types, each of them has only one responsibility too. Frankly, I’d prefer them to work with abstractions instead, but implementing these concepts for well-known data types may make them easier to understand.
The point here is that our functions are meant to be simple. They have no state to maintain, and they are straight mappers of the input values to the output. They can be tested easily, and could run in parallel without a problem.
No wonder that I could deploy any of these functions to AWS Lambda, and they could live cloud native like Lucy with the diamonds.
As far as FP is concerned, everything should be a function just like these. With function composition, functors, applicatives, higher-kinded types, and monads, side effects can be kept under control: we can draw a clear line between what our code is meant to do and when exactly it is allowed to launch operations that cause side effects or are the result of side effects, while leaving the how to the language.
The good news is that one does not need to know the theory behind these concepts in order to use them. The even better news is that they can tame complexity and control side effects so that we can keep on writing simple and testable functions while developing real-world applications.
While I tend to emphasize how easy and testable these functions are, it’s not too difficult to point out some of their shortcomings. For example, I don’t check boundaries, and I also return `null` values when data is missing. The problem with `null` values is that we need to check for them every time we want to use them. FP offers better solutions for that than our regular `if`-`else`.
In the upcoming parts, we will examine those, and how to contain async in a standardized way, while eventually trying to build a Sinatra-style web app in PHP using ReactPHP and what we’ve gathered of FP so far. Looking forward to seeing you there!