EDIT1 (2019-07-30): I have since found a way to implement this without using unsafe
at all. I have added the updated code at the end of this post.
Aside: I did mention a couple blog posts ago that I would write a bit about my latest game, Stay Sharp!, but I want to postpone that until I have something a bit more substantial to say about the project.
About Env-IO
Env-IO is a fun little side project that I have been working on from time to time when I get stuck or bored from working on games. It is a library for purely functional programming in Rust through trifunctor IO. As the title of this post suggests, this is simply a port of the Scala library, ZIO. Env-IO can be found here, but it is currently unfinished. As a result of working on this project, I have learned a lot about traits, some limitations of Rust, how free monad interpreters work, and how to use unsafe
. Since I feel as though I am close to having a working port that supports synchronous effects, I felt like this was a good time to talk a bit about the process of porting this library.
Why Env-IO?
Admittedly, Env-IO may be a bit out of place in the Rust ecosystem. Rust already makes mutations quite safe with its compiler enforced rules around mutability of variables and mutable references. However, I figured that some people might want to try pure FP in Rust, and there weren’t really any other good libraries for doing so.
One of the main motivations for using ZIO in Scala is faster, purely functional programming. This is achieved by using a technique known as effect rotation. For context, let’s say I have some type that needs to be able to encode nullability, the ability to perform some kind of side-effect, and fallibility. Assuming Rust has some basic kind of IO monad, maybe from some generic IO monad crate, we might represent this type like so:
Result<Option<IO<A>>, E>
If we want to treat this type as a monad, we will have to nest our calls to and_then
because these effects are stacked vertically. In Scala, we would have to write monad tranformers and there would be extra allocations, which would be quite expensive. This would probably be a bit cheaper in Rust due to fewer allocations, but the monad transformers are still a bit of extra boilerplate. We can avoid using monad transformers and flatten our types by using effect rotation to compose effects horizontally. For example, we would change the represention of the above type to the following:
<A, E> EnvIO
So this brings us to bifunctor IO, which is how ZIO used to be implemented. ZIO ended up switching to trifunctor IO, as explained in this talk, which is why EnvIO
is implemented as a trifunctor. Specifically, the addition of the reader effect to ZIO makes it more testable. Typically, the final tagless approach might be used instead of adding a reader effect like this, but, as mentioned in the talk, it is very difficult to use with a bifunctor. It also comes with a plethora of other drawbacks, such as requiring users to understand advanced FP material, such as higher-kinded types. Since Rust does not support HKTs and ZIO specifically avoids them, this actually works out quite nicely when porting ZIO to Rust. Thus, we arrive at the EnvIO<R, A, E>
trifunctor (analogous to ZIO[R, E, A]
). For the curious, I swapped the type arguments A
and E
simply because Rust’s Result
type constructor is left-biased.
How Env-IO was Built
As we all know by now, Rust is a very restrictive language. It lacks subtyping (outside of lifetimes) and enforces an affine type system at compile time. It also forces the programmer to be very explicit about heap usage. These factors alone are enough to make porting just about any library from another language non-trivial. This includes zero-dependency libraries, such as ZIO. In order to demonstrate just how difficult this was to port, I will be walking you through the process of constructing Env-IO.
For background, the process of purely functional programming in Env-IO (and also free monads) centers around defining a value level programming language. To achieve this, we define a set of instructions that the library user can build a tree out of, which will later be interpreted by the EnvIO
interpreter. In this case, when dealing with side-effects, such as a println!
, we lift them into an Effect
instruction, like so:
// Our set of instructions
enum EnvIO<R, A, E> {
Box<dyn Fn() -> Box<A>>),
Effect(...
}
fn effect<R, A, E, F: 'static>(eff: F) -> EnvIO<R, A, E> where F: Fn() -> Box<A> {
EnvIO::Effect(Box::new(eff))
}
// Note that Rust does not have by-name parameters, unlike Scala.
// We can add a macro to make lifting statements into values a bit more ergonomic.
macro_rules! effect {
$e:expr) => {
({
$crate::effect(move || Box::new($e))
}
}
}
fn main() {
let effect_instr = effect!(println!("hello world"));
...
}
In this case, println!(”hello world”)
is not going to be executed at all, since the macro will rewrite this code as an anonymous function. Thus, the actual printing does not happen until the instruction is processed by the EnvIO
interpreter later on. We’ve defined our first instruction! Wahoo! Unfortunately, it can’t really help us much on its own. We have no way of chaining together statements. For example, we can’t really represent a program like this one:
println!("hello world");
println!("meow");
We need some kind of binding instruction that allows us to chain two effects together. Something like a monadic bind
function. Let’s add a FlatMap instruction:
// A newtype wrapper for Kleisli arrows.
// A Kleisli arrow is basically just a function A => F[B]. The term comes from category theory.
struct Kleisli<R, A, E> {
: Box<dyn Fn(???) -> EnvIO<R, A, E>>
k}
// Our set of instructions
enum EnvIO<R, A, E> {
Box<dyn Fn() -> Box<A>>),
Effect(Box<EnvIO<R, ???, E>>, Kleisli<R, A, E>)
FlatMap(...
}
This is where the Rust limitations start to really kick in. You’ll notice that we have a bit of a problem here. EnvIO only has 3 type parameters! It’s a trifunctor, remember? How do we represent a FlatMap enum variant? Well, in ZIO, this is done by representing each instruction as a separate type. For instance, ZIO defines a FlatMap class which extends the ZIO trait. But we can’t really do that in Rust! We would run into issues of object safety if we tried defining a generic flat_map
method on an EnvIO
trait object. Believe me, I’ve tried. Okay, fine, the Future
trait has its own and_then
method, but Future
does not use trait objects and does not require effect rotation or an interpreter of any kind. It also isn’t as concerned about stack safety as EnvIO is
. Future
is much simpler than EnvIO
. In our example, we need something more powerful. Something more… dangerous.
struct Kleisli<R, A, E> {
: Box<dyn Fn(Box<Opaque>) -> EnvIO<Opaque, Opaque, Opaque>>,
k: PhantomData<(R, A, E)>
_pd}
enum Opaque {}
// Our set of instructions
enum EnvIO<R, A, E> {
Box<dyn Fn() -> Box<A>>),
Effect(Box<EnvIO<R, Opaque, E>>, Kleisli<R, A, E>)
FlatMap(...
}
impl<R, A, E> EnvIO<R, A, E> {
unsafe fn flat_map_unsafe<B>(self, k: Box<dyn Fn(Box<A>) -> EnvIO<R, B, E>>) -> EnvIO<R, B, E> {
EnvIO::FlatMap(Box::new(mem::transmute(self)), Kleisli { k: mem::transmute(k), _pd: PhantomData })
}
fn flat_map<B, F: 'static>(self, k: F) -> EnvIO<R, B, E> where F: Fn(A) -> EnvIO<R, B, E> {
unsafe { self.flat_map_unsafe(Box::new(move |a: Box<A>| k(*a))) }
}
}
In this case, we now have a new Opaque
data type, which is just an empty enum. It’s an empty type, like !
(or Scala’s Nothing
), only it’s actually usable on stable. The reality is, this opaque type could have literally been any type. Here, we use the dreaded unsafe
to turn Kleisli arrows into functions from an opaque type to an EnvIO
containing opaque types. This is important because our interpreter is going to need to use some variable to store the current instruction to process, and that variable needs to have some kind of type that holds arbitrary instructions. We could have tried Box<dyn Any>
, but we really need to be able to downcast to an EnvIO<Box<dyn Any>, Box<dyn Any>, Box<dyn Any>>
so that we can pattern match on our instructions. A Box<dyn Any>
only allows you to downcast to the original type you had upcasted from, so this won’t work and the interpreter would just panic. Since unsafe
is being used, we need to make sure that the user-facing API is safe and that no undefined behaviour leaks out of the library, so PhantomData is used here to ensure that our Kleisli arrows have the correct type. PhantomData is zero-sized, so this is just a nice way of enforcing type safety when type parameters are unused in the actual data contained by a struct. When we call flat_map_unsafe
, we also have to do something a bit clever to ensure that the function that the caller supplies us with actually has the correct type. We want the input argument to be boxed for sizedness reasons. The astute reader will note that functions are also profunctors. Thus, we can contramap over the input argument in order to get the desired type. Here, I don’t explicitly define contramap anywhere because I am inlining the implementation directly.
As an aside, here is how contramap works. Take a function like this one: A => B
. Consider the typical signature of the map
function, (F[B], B => C) => F[C]
. We can treat our function A => B
as a functor by fixing one of the two types. First, I will fix A
and allow B
to vary. I will also replace F[_]
with A => _
to give map
the following signature: ((A => B), B => C) => (A => C)
. By playing a bit of type tetris, we can implement it like so (in pseudo-code for brevity):
fn map<A, B, C>(fb: Fn(A) -> B, f: Fn(B) -> C) -> Fn(A) -> C {
|a: A| f(fb(a))
}
That’s cool. We can take a function A => B
and turn it into a function A => C
where C is any arbitrary type. But what if we want to vary the input argument instead? Things become more difficult. Here’s an attempt at what that might look like:
fn map<A, B, C>(fb: Fn(A) -> B, f: Fn(A) -> C) -> Fn(C) -> B {
??? // I can't implement this!
}
We immediately run into problems. No amount of type tetris will permit this function to produce a function of the correct type. This shows us that functions are asymmetric. We will have to be a bit more clever than this. In category theoretic terms, functions are contrafunctorial in their input argument. This means that we need to reverse the direction of our arrows. In code, reversing the arrows looks something like this:
fn contramap<A, B, C>(fb: Fn(A) -> B, f: Fn(C) -> A) -> Fn(C) -> B {
|c: C| fb(f(c))
}
Notice how f
went from A => C
to C => A
. That’s it! That’s all we need to do. Now we can change our function type to accept a Box<A>
instead of A
.
With all that out of the way, let’s write a cute little program that prints a few lines of text!
fn main() {
let io = effect!(println!("hello world"));
let io2 = io.flat_map(|_| effect!(println!("meow")));
}
Alright, we did it! Err, wait, this actually doesn’t do anything still. Remember, this is all just a program description. We haven’t actually interpreted these instructions anywhere yet. So now, we’re going to take off our functional programmer hats and go back to the imperative Rust we all know and love. Ready?
fn run<R, A, E>(envio: EnvIO<R, A, E>) -> A {
unsafe {
let envio_opaque: EnvIO<Opaque, Opaque, Opaque> = mem::transmute(envio);
interpret(envio_opaque)}
}
unsafe fn interpret<A>(mut envio: EnvIO<Opaque, Opaque, Opaque>) -> A {
let mut stack: Vec<Kleisli<Opaque, Opaque, Opaque>> = vec![];
loop {
match envio {
EnvIO::FlatMap(inner, kleisli) => {
match *inner {
EnvIO::Effect(eff) => {
= (kleisli.k)(eff())
envio }
=> {
_ .push(kleisli);
stack= *inner;
envio }
}
},
EnvIO::Effect(eff) => {
if let Some(kleisli) = stack.pop() {
= (kleisli.k)(eff());
envio } else {
return *mem::transmute::<Box<Opaque>, Box<A>>(eff());
}
}
=> panic!("unknown instruction")
_ }
}
}
Let’s break this down, starting with run
. As you may have guessed, run
is our “safe” API that the programmer will be calling into. It will be marked pub
in a future update, since it really should be. All that run
does is convert the program we pass it into an opaque version of the same program. That opaque program will be passed into interpret
, which will do all of the heavy lifting. Note that because we are working with our Opaque
type again, we have to use mem::transmute
some more. This actually makes interpret
incredibly unsafe. Note the signature:
unsafe fn interpret<A>(mut envio: EnvIO<Opaque, Opaque, Opaque>) -> A
We can literally pass it any kind of EnvIO
we want and pull a result of whatever type we want. An easy way to exhibit UB in this case is to transmute an EnvIO<u32, u32, u32>
into EnvIO<Opaque, Opaque, Opaque>
and call interpret with type argument EnvIO<u32, bool, u32>>
. This will give you garbage as an output. To enforce proper usage, run
takes 3 type arguments; the three type arguments of the EnvIO
program we are passing to the interpreter. run
then passes the type arguments as needed to interpret. Right now, since the only instructions we have are FlatMap
and Effect
, we won’t be using R
or E
.
Now let’s dive into the core logic of the interpreter itself. It is important to note that the correctness of this interpreter is critical. If it is not implemented correctly, we can very easily allow UB to leak into the outer world of safe Rust. First, we start of by defining an empty stack of Kleisli
s. These will be very important later. Next, we start up an infinite loop. No, it’s not actually infinite, we can break out when we’re done processing instructions. First we take our current EnvIO
instruction and pattern match to see what kind of instruction it really is. If it’s an EnvIO::Effect
then we can simply run the effect, which will produce a Box<Opaque>
. From here, we can transmute to a Box<A>
and then dereference it to get the final output value. Note that the transmute only works while the values are boxed. While transmute
is very unsafe, it still has a safety check that ensures that the types you are converting between have the same size. Thus, since the two boxed types have the same size, we are allowed to transmute in this case. Things get a little more complicated if our first instruction is a EnvIO::FlatMap
, however. First, we extract the inner EnvIO
and run a mini-interpreter here to determine what the next instruction is. This is what ZIO does under the hood as well, for performance reasons. If we see an EnvIO::Effect
then we know that we can evaluate it and feed the result it to our Kleisli
that we got from the outer EnvIO::FlatMap
instruction to get back the next instruction. However, if we see another EnvIO::FlatMap
, we want to push the current instruction’s Kleisli
on to the stack and set our current instruction to the next one. Note that in the outer pattern match, if we get an EnvIO::Effect
, we check the stack to see if it’s empty or not. If not, we just repeatedly apply the Kleisli
s in the stack until the we run into a different instruction variant or we empty out the stack. To make this a bit easier to understand, here is an example:
fn main() {
let io: EnvIO<(), (), ()> = effect!(println!("hello world"));
let io2 = io.flat_map(|_| effect!(println!("meow")));
let io3 = io2.flat_map(|_| effect!(println!("woof")));
run::<(), (), ()>(io3);
}
This produces the following tree of instructions:
FlatMap(io2, |_| { Effect(|| println!("woof")) })
which expands to:
FlatMap(FlatMap(io, |_| { Effect(|| println!("meow")) }), |_| { Effect(|| println!("woof")) })
which expands to:
FlatMap(FlatMap(Effect(|| println!("hello world")), |_| { Effect(|| println!("meow")) }), |_| { Effect(|| println!("woof")) })

Let’s interpret this program. First, we see a FlatMap
and then we see that the next instruction is also a FlatMap
, so we push the current instruction’s Kleisli
on to the stack. That is to say, we would push |_: ()| EnvIO::Effect(|| println!(”woof”))
on to our stack of Kleisli
s. We also set the current instruction to the inner instruction, which, as we saw, was a FlatMap
. On the next loop iteration, we will see a FlatMap
instruction and the next instruction is an Effect
. Thus, we will run the effect and apply the current instruction’s Kleisli
to it. The effect we run will be println!(”hello world”)
and the Kleisli
we apply is going to take the result of printing “hello world”, which is just ()
, and return Effect(println!(”meow”))
. We then set the current instruction to this effect (Effect(println!(”meow”))
). On our next iteration of the loop, we will see that the current instruction is an Effect
and check the stack of Kleisli
s. We will see that there is something on the stack and pop it off. Now, we will run our effect || println!(”meow”)
which prints “meow” and returns ()
. This value is then passed to our Kleisli
, |_: ()| EnvIO::Effect(|| println!(”woof”))
, which returns Effect(println!(”woof”))
. We then set the current instruction to this effect (Effect(println!(”woof”))
). One more iteration to go! On this iteration, we will see that the current instruction is Effect
, so we check the stack again, but this time, there are no more Kleisli
s. At this point, we run our last effect, || println!(”woof”)
, which prints “woof” and returns ()
, breaking out of the loop.
So what’s next?
At the current time of this writing, I have yet to add the rest of the effects. For the next release, I intend on getting all of the major synchronous effects working. I would also like to try and make sure that there are no soundness issues with this crate. Aside from that, I will be adding documentation and tests. I am still on the fence about adding a monad comprehension macro to go along with this. I know mdo
exists and it’s fairly usable as is, but with so many effects rolled into one monad, I think Env-IO could use its own monad comprehension macro that does not use duck-typing, if possible. I’ll definitely need to investigate that a bit more though and it isn’t a high priority at the moment.
Final thoughts
The usage of unsafe
in this crate is definitely something I wish I could cut down on or remove completely. I’ve tried countless times over the last several months to come up with a clean solution that did not use unsafe
whatsoever, but they all had critical flaws that made them non-viable. Scala gets away with this thanks to subtyping and variance. It can just mark ZIO’s type parameters as -R, +E, +A
, which allows a ZIO[Int, Int, Int]
to be a subtype of ZIO[Nothing, Any, Any]
, which allows for easy downcasting. I wonder if there is a way we could emulate this in Rust once !
is stabilized. It would be a simple 3 tiered hierarchy of types with Box<dyn Any>
on top, !
at the bottom, and all other types in between. A safe API around this could be pretty useful for applications like this one, where I had to resort to mem::transmute
to achieve the same sort of thing. I also ended up looking at the zerocopy
crate used in Google Fuchsia. Unfortunately, that crate does not work with types containing type parameters. Maybe some day there will be a way to safely express these kinds of programs in Rust with the right crates, but for now, it seems as though unsafe
is here to stay.
EDIT1: Here is the code that I came up with that allows me to avoid using the unsafe
keyword.
use std::any::Any;
use std::marker::PhantomData;
type BAny = Box<dyn Any>;
type Kleisli = Box<Fn(BAny) -> Box<Instr>>;
enum Instr {
,
Succeed(BAny)Box<Fn() -> BAny>),
Effect(Box<Instr>, Kleisli)
FlatMap(}
struct EnvIO<R, A, E> {
: Instr,
instr: PhantomData<(R, A, E)>
_pd}
impl<R: 'static, A: 'static, E: 'static> EnvIO<R, A, E> {
fn flat_map<B, K: Fn(A) -> EnvIO<R, B, E> + 'static>(self, k: K) -> EnvIO<R, B, E> {
let boxed_input_k = move |a: Box<A>| {
*a)
k(};
let any_input_k = move |bany: BAny| {
let a: Box<A> = bany
.downcast::<A>()
.unwrap_or_else(|_| panic!("flat_map: Could not downcast Any to A"));
boxed_input_k(a)};
let instr_output_k = move |bany: BAny| {
Box::new(any_input_k(bany).instr)
};
{
EnvIO : Instr::FlatMap(Box::new(self.instr), Box::new(instr_output_k)),
instr: PhantomData
_pd}
}
}
macro_rules! effect {
$e:expr) => {
({
$crate::effect(move || Box::new($e))
}
}
}
fn effect<R, A: 'static, E, F: 'static>(eff: F) -> EnvIO<R, A, E> where F: Fn() -> Box<A> {
let effect_any = move || {
let bany: BAny = eff();
bany};
{
EnvIO : Instr::Effect(Box::new(effect_any)),
instr: PhantomData
_pd}
}
fn success<R, A: 'static, E>(a: A) -> EnvIO<R, A, E> {
{
EnvIO : Instr::Succeed(Box::new(a)),
instr: PhantomData
_pd}
}
fn run<R, A: 'static, E>(envio: EnvIO<R, A, E>) -> A {
interpret::<A>(envio.instr)
}
fn interpret<A: 'static>(mut instr: Instr) -> A {
let mut stack: Vec<Kleisli> = vec![];
loop {
match instr {
Instr::FlatMap(inner, kleisli) => {
match *inner {
Instr::Effect(eff) => {
= *kleisli(eff())
instr }
Instr::Succeed(a) => {
= *kleisli(a);
instr }
=> {
_ .push(kleisli);
stack= *inner;
instr }
}
}
Instr::Effect(eff) => {
if let Some(kleisli) = stack.pop() {
= *kleisli(eff());
instr } else {
return *eff()
.downcast::<A>()
.unwrap_or_else(|_| panic!("interpret (effect): Could not downcast Any to A"));
}
}
Instr::Succeed(a) => {
if let Some(kleisli) = stack.pop() {
= *kleisli(a);
instr } else {
return *a.downcast::<A>()
.unwrap_or_else(|_| panic!("interpret (succeed): Could not downcast Any to A"));
}
}
}
}
}
This isn’t the final API (mostly due to issues related to variance and subtyping; perhaps I will blog about my approach to that problem too), however it does accomplish everything that the previous implementation in this post was able to. For clarity, the EnvIO
enum from the earlier version of this post was changed to the Instr
enum here. The major change that allowed me to remove unsafe
was removing all type parameters from the instruction enum. Instead, the type parameters are tied to a newtype wrapper, which I have named EnvIO
for now. The type parameters are only used in PhantomData
, which serves the same purpose as before: keeping things typesafe. As you can see from the code, I ended up replacing most type parameter usages in the instruction enum with BAny
, which is just a shorter alias for Box<dyn Any>
. This doesn’t change much of how the interpreter works. We were already using an opaque type before, so switching to BAny
just means we have to downcast instead of using mem::transmute
. This does complicate all of the functions that take closures as arguments, since they now have to perform more map/contramaps to massage the closure type into something that uses BAny
. However, I think that this trade-off is worthwhile, since I would much rather avoid the extra responsibility of having to verify that my code does not exhibit UB. One note about the downcasts: I use unwrap_or_else
because unwrap_or
is eagerly evaluated and will trigger panic!
too early. There’s probably a cleaner way to panic!
than this (probably could use expect
?), but I’ll tidy that up later when I come up with better error messages for easier debugging.