Function Composition — Functional Programming in PHP (Part 3)
This is the third part of a series of articles about Functional Programming in PHP.
The Zen of modularity
If we ever want to reuse our code, we need to divide it into smaller chunks that have well-defined responsibilities, and keep those parts modular enough to serve as building blocks for a bigger structure. We may know from first-hand experience that balancing pebbles on top of each other on a river bank is considerably more of a challenge than building a wall from angular bricks.
It’s not enough to have building blocks of uniform shape, because some shapes, like spheres, don’t really play nice on their own when building a structure, unless a constraint forces them to. It is much better to have building blocks like the aforementioned bricks, which were designed to fit each other and inherently form a structure.
Fortunately, functions are exactly that kind of building block. Not only can the output of one function be used as the input of another, so that they form a chain, but a higher-order function can also accept other functions as input or return a new function as output, and is therefore capable of forming arbitrary ‘shapes’ that fit each other flexibly. If we pass functions as inputs instead of the values returned by other functions, we can postpone execution until the result is actually needed.
Consider the following scenario:
We have an array of bytes. We want to convert them into characters. We want to order them in reverse. Then join them into a string. And finally capitalize that string.
We can write or use existing functions for each of these steps, and then chain them one after the other:
$bytes = [ 111, 108, 108, 101, 104 ];
$chars = bytes_to_chars($bytes);
$rev_chars = array_reverse($chars);
$my_string = join($rev_chars);
$my_word = ucfirst($my_string);
This is the common way of doing consecutive steps of computation, and it’s quite readable, but it’s not very efficient, because each step is computed as soon as it is reached, and a temporary variable is created for every intermediate result. We could put this into a function and call it when needed, but the individual steps would still behave the same.
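The snippet above relies on bytes_to_chars, which is not a PHP built-in and is not defined in this part. A minimal sketch, assuming it does nothing more than map each byte value to its character with chr, makes the steps runnable:

```php
<?php
// Hypothetical helper (not a PHP built-in): assumed to turn byte values
// into one-character strings.
function bytes_to_chars(array $bytes): array
{
    return array_map('chr', $bytes);
}

$bytes     = [111, 108, 108, 101, 104];
$chars     = bytes_to_chars($bytes);  // ['o', 'l', 'l', 'e', 'h']
$rev_chars = array_reverse($chars);   // ['h', 'e', 'l', 'l', 'o']
$my_string = join($rev_chars);        // 'hello'
$my_word   = ucfirst($my_string);     // 'Hello'

echo $my_word, PHP_EOL;
```

Every intermediate result gets its own variable here, which is exactly the property the rest of the section tries to get rid of.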
We could also have written it all as one nested call, sparing the extra function and the variables:
$my_word = ucfirst(
    join(
        array_reverse(
            bytes_to_chars([ 111, 108, 108, 101, 104 ])
        )
    )
);
This is function composition. The benefit of this is that now the chain of operations acts as one unit. We execute this chain only when we need it. If it depends on something else, we can easily add a condition, as if we had put them into a separate method.
Unfortunately, the syntax is somewhat lame, because each step is indented more than the previous one. It shows dependency, but also decreases readability, because the flow is not entirely clear, and the closing parentheses are mere noise.
Since this is a repeating pattern, we could simply create a higher-order function that takes an arbitrary number of arguments of callable type, then returns a function that takes optional initial values to feed the first function, and executes each remaining function with the output of the previous one.
We could apply it like this:
compose(
    ucfirst,
    join,
    array_reverse,
    bytes_to_chars
)([ 111, 108, 108, 101, 104 ]);
If only PHP supported passing functions in their own right, that is. This is not the case, however: an argument list like this would be interpreted as a list of constants, and we’d be rewarded with an error:
PHP Fatal error: Uncaught Error: Undefined constant...
This is not what we mean. Of course we could do something like this:
compose(
    'ucfirst',
    'join',
    'array_reverse',
    'bytes_to_chars'
)([ 111, 108, 108, 101, 104 ]);
which would work right out of the box, but the usage is somewhat misleading, because the function names read as plain strings rather than as callables.
Another option would be to use anonymous functions or arrow functions, but that also increases the noise considerably:
compose(
    fn($x) => ucfirst($x),
    fn($x) => join($x),
    fn($x) => array_reverse($x),
    fn($x) => bytes_to_chars($x)
)([ 111, 108, 108, 101, 104 ]);
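On PHP 8.1 or later there is one more spelling worth knowing: the first-class callable syntax, written as the function name followed by (...), turns a named function into a Closure without a string or a hand-written wrapper. The sketch below assumes PHP 8.1+; compose and bytes_to_chars are assumed stand-in definitions, since this part does not show them at this point.

```php
<?php
// Requires PHP 8.1+ for the name(...) first-class callable syntax.

// Hypothetical helper, assumed to map byte values to characters.
function bytes_to_chars(array $bytes): array
{
    return array_map('chr', $bytes);
}

// Assumed stand-in for compose: applies the callables from last to first.
function compose(callable ...$fns): callable
{
    return function ($x) use ($fns) {
        foreach (array_reverse($fns) as $fn) {
            $x = $fn($x);
        }
        return $x;
    };
}

$word = compose(
    ucfirst(...),
    join(...),
    array_reverse(...),
    bytes_to_chars(...)
)([111, 108, 108, 101, 104]);

echo $word, PHP_EOL;
```

The argument list stays almost as quiet as the bare-name version, at the price of requiring a newer PHP.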
And finally, we could also use our immutable values, if we defined them earlier like this:
function capitalize(): callable { return fn($x) => ucfirst($x); }
Then we could write the same composition like this:
compose(
    capitalize(),
    str_join(),
    arr_rev(),
    bytes_to_chars()
)([ 111, 108, 108, 101, 104 ]);
In my opinion, this is much easier to read, and it also hints more clearly that we’re dealing with callables here. It has a serious overhead though: we need a wrapper function for already existing functions too, because they can’t be passed around by their bare name, or called without the required number of arguments. (If they were curried by default, then ucfirst() would simply return a curried function which compose could call later, when the input value for it becomes available.)
We can also combine these approaches: wrap built-in functions in arrow functions, and design our own functions so that they return a function.
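A minimal sketch of that combined style follows. The definition of bytes_to_chars as a function that returns a callable is an assumption, and compose is again an assumed stand-in, not the implementation from this series.

```php
<?php
// Our own function is designed to return a one-argument callable
// (assumed definition for illustration).
function bytes_to_chars(): callable
{
    return fn(array $bytes) => array_map('chr', $bytes);
}

// Assumed stand-in for compose: applies the callables from last to first.
function compose(callable ...$fns): callable
{
    return function ($x) use ($fns) {
        foreach (array_reverse($fns) as $fn) {
            $x = $fn($x);
        }
        return $x;
    };
}

$word = compose(
    fn($x) => ucfirst($x),        // built-in wrapped in an arrow function
    fn($x) => join($x),           // built-in wrapped in an arrow function
    fn($x) => array_reverse($x),  // built-in wrapped in an arrow function
    bytes_to_chars()              // our own function hands back the callable
)([111, 108, 108, 101, 104]);

echo $word, PHP_EOL;
```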
Go with the flow
The compose function presented above puts some mental load on us, because it presents the order of execution reversed. To read it in the correct order, we need to start from the bottom and proceed towards the top.
If that bothers us, we can use the reversed version of compose instead, which simply applies function composition the other way round. The function is called pipe, and it reflects the natural flow of function calls more accurately:
pipe(
    'bytes_to_chars',
    'array_reverse',
    'join',
    'ucfirst'
)([ 111, 108, 108, 101, 104 ]);
Pulling the trigger
In the previous example, the initial argument still remained at the end, but we can consider that as a trigger that initiates the whole chain of operations.
In fact, we’re not forced to pull that trigger immediately. We’re free to use our composition, pass it, name it in whichever way we like.
For example, creating a new function:
function bytes_to_word(array $bytes): string {
    return pipe(
        fn($x) => bytes_to_chars($x),
        fn($x) => array_reverse($x),
        fn($x) => join($x),
        fn($x) => ucfirst($x)
    )($bytes);
}
Or assigning it to a variable:
$bytes_to_word = pipe(
    fn($x) => bytes_to_chars($x),
    fn($x) => array_reverse($x),
    fn($x) => join($x),
    fn($x) => ucfirst($x)
);
Then calling it:
$bytes_to_word([ 111, 108, 108, 101, 104 ]);
Also works.
We’re not forced to take this approach though. With functional thinking, it’s easy to restructure the pipe function to take the initial value first, and let the returned function take the list of callables. In that case, the usage would look like this:
pipe($initial_val)(
    op1(),
    op2(),
    opn()
);
The order becomes natural, but we lose the opportunity to give our function composition a name of its own, because the call already contains all the required arguments: it is executed immediately, and only the result is returned.
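A minimal sketch of such a value-first variant is shown below. It is named pipe_value here only to keep it apart from the callables-first pipe; the name and the implementation are assumptions made for illustration.

```php
<?php
// Hypothetical value-first variant: takes the initial value and returns a
// function that accepts the callables and runs them at once, left to right.
function pipe_value($initial): callable
{
    return function (callable ...$fns) use ($initial) {
        $acc = $initial;
        foreach ($fns as $fn) {
            $acc = $fn($acc);
        }
        return $acc;
    };
}

$word = pipe_value([111, 108, 108, 101, 104])(
    fn($bytes) => array_map('chr', $bytes),
    fn($chars) => array_reverse($chars),
    fn($chars) => join($chars),
    fn($str)   => ucfirst($str)
);

echo $word, PHP_EOL;
```

Because the second call supplies the last missing piece, there is no intermediate "composition" left to store or pass around, only the final value.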
Under the hood
The core of these patterns is so common that PHP already implements it as array_reduce. All we have to do is add the triggering mechanism, and reverse the arguments where necessary. PHP’s own version actually prefers the pipe order, so we will reverse the arguments for compose.
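One way this could look is sketched below, assuming array_reduce as the core and a single initial value for brevity (the version described earlier accepts optional initial values). The arrow function standing in for bytes_to_chars is also an assumption.

```php
<?php
// Sketch of pipe: array_reduce walks the callables left to right and feeds
// each one the result of the previous one.
function pipe(callable ...$fns): callable
{
    return fn($initial) => array_reduce(
        $fns,
        fn($acc, callable $fn) => $fn($acc),
        $initial
    );
}

// Sketch of compose: the same thing with the callables reversed.
function compose(callable ...$fns): callable
{
    return pipe(...array_reverse($fns));
}

$bytes          = [111, 108, 108, 101, 104];
$bytes_to_chars = fn(array $xs) => array_map('chr', $xs); // assumed stand-in

$piped    = pipe($bytes_to_chars, 'array_reverse', 'join', 'ucfirst')($bytes);
$composed = compose('ucfirst', 'join', 'array_reverse', $bytes_to_chars)($bytes);

echo $piped, ' ', $composed, PHP_EOL;
```

Calling the returned function is the "trigger": nothing runs until the initial value arrives.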
array_reduce is itself very useful, and we can use it whenever we want to gain one single result out of an array of values using a function which is called for each item. The result is ‘accumulated’: it starts out as the initial value and is passed to each iteration, which can then update it. Check out the docs if you’d like to know more about that.
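As a small illustration of the accumulator mechanics, here is array_reduce on a plain list of numbers (the sample values are arbitrary):

```php
<?php
$numbers = [3, 8, 2, 5];

// array_reduce(array, callback, initial): the callback receives the
// accumulated value and the current item, and returns the new accumulator.
$sum = array_reduce($numbers, fn($acc, $n) => $acc + $n, 0);
$max = array_reduce($numbers, fn($acc, $n) => max($acc, $n), PHP_INT_MIN);

echo $sum, ' ', $max, PHP_EOL; // prints "18 8"
```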
Functional languages usually have several versions of this function, which are capable of walking the list from either direction. They are mostly called foldLeft and foldRight. They differ slightly in effect though.
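The difference shows up as soon as the combining function is not associative. The sketch below uses hypothetical helpers fold_left and fold_right (PHP has no built-ins by these names) built on array_reduce, with subtraction as the operation:

```php
<?php
// Hypothetical left fold: combines from the first element,
// ((initial op a) op b) op c.
function fold_left(callable $fn, $initial, array $items)
{
    return array_reduce($items, $fn, $initial);
}

// Hypothetical right fold: combines from the last element,
// a op (b op (c op initial)).
function fold_right(callable $fn, $initial, array $items)
{
    return array_reduce(
        array_reverse($items),
        fn($acc, $item) => $fn($item, $acc),
        $initial
    );
}

$subtract = fn($a, $b) => $a - $b;

$left  = fold_left($subtract, 0, [1, 2, 3]);  // ((0 - 1) - 2) - 3 = -6
$right = fold_right($subtract, 0, [1, 2, 3]); // 1 - (2 - (3 - 0)) = 2

echo $left, ' ', $right, PHP_EOL;
```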
Keeping your secrets
Being modular is great. Organizing our code is also great. Leaking and polluting is not great. If you have secrets, you’d better keep them to yourself, and by no means bring them anywhere near Twitter. It can happen that others are not that eager to have them known.
PHP assists you to organize your code into files, and include the files you need into other files. It also helps you to distinguish areas of your code base by putting them into directory structures and namespaces to avoid conflicts with other libraries.
What it does not do, though, is assist you with elementary information hiding. No matter how deep in a filesystem hierarchy you hide your files, or how long a chain of sub-namespaces you have, they can be included easily, and then all of your code becomes available unless it is put into classes.
require_once 'Stuff/Things.php';
use Stuff\Things as T;
echo T\super_secret(); // Bamm!
The concept of modules targets this issue, and would let you hide the parts of the code in your file that are not explicitly offered to the public. This may be familiar from the JavaScript world, where the parts of the code that are offered for public use need to be exported, and the rest remains unavailable, even when it’s not wrapped in classes. It’s also the preferred encapsulation technique among functional languages like Elixir or Haskell.
Unfortunately, PHP does not support modules just yet, but that cannot prevent us from using the concept of modules and information hiding to our own benefit. As mentioned above, classes are equipped with this feature in the name of encapsulation. We need nothing more than that, so we can use classes to serve our own purposes as modules. We need to take some extra safety measures though to prevent others from using our module-classes as OOP classes, because we don’t need inheritance and instantiation.
That’s as easy as declaring the constructor private, and then we’re good to go. Of course, we don’t want to have ‘methods’ in the original OOP sense either. We just collect our pure functions here. To access them, we need to declare them as static.
The good thing is that this way we can freely declare visibility for our functions, which otherwise wouldn’t be possible. Of the available levels, only private and public make sense to us.
Consider this self-explaining syntax from Elixir:
Our corresponding PHP fake module would look like this:
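A minimal sketch of such a module-class is given below. The names Things, super_public and super_secret are illustrative assumptions, and the final keyword is an added assumption to switch off inheritance as well.

```php
<?php
// Hypothetical module-class: it only groups functions and hides helpers.
final class Things
{
    // Instantiation is switched off; nobody should create a Things object.
    private function __construct() {}

    // Public: the part of the module offered to the outside world.
    public static function super_public(string $name): string
    {
        return self::super_secret($name) . '!';
    }

    // Private: a helper that stays hidden inside the module.
    private static function super_secret(string $name): string
    {
        return 'Hello, ' . $name;
    }
}

echo Things::super_public('World'), PHP_EOL; // prints "Hello, World!"
// Things::super_secret('World'); // would throw an Error: private method
```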
It’s not too bad after all. The external world has no access to our private helper function dressed as a method.
The major drawback of this approach is that it mixes concepts once again. It’s not a good practice to use something for a purpose it was not designed for. Although classes can easily be used as modules, they bring along a bunch of other features which we don’t need, and whose use we also have to prevent. This can lead to confusion when someone with an OOP background bumps into our module.
Of course we can add some extra comments, but along with the safety measures we take to turn off OOP features (the private constructor), and the burden we carry to fit in (calling our own functions through self::), it gets to be quite a hassle.
After all, while preventing the use of non-public functions is a good practice, and we can use this technique as a workaround to ensure safer code usage, we shouldn’t get too comfortable with it and forget that we’ve just introduced a mixed concept here, instead of having a clear distinction.
Hopefully, PHP will also have its own modules one day, and then our functions can take another leap forward towards becoming the first-class citizens of the language that they deserve to be.
In the next part, we’ll add more essential functions to our tool set, the ones we tend to use most often. I will guide you through how to use and implement them in a fashion that you might have gotten used to already. Then we will sail into the territory of some big guys, and examine monads in use. Eventually we will go split brain with async, and write complete (although small) applications using the tools we gathered, so stay tuned!