Automatic parallelization of for-comprehensions in Scala 3

With the ever-growing popularity of functional programming, effect systems are getting more and more traction. Unfortunately, they are still a long way from being beginner-friendly since reading functional pipelines requires some getting used to. It is specifically the case when we are dealing with parallel programs. Writing them requires remembering a multitude of names of functional combinators. Additionally, the resulting code looks very different from the mainstream imperative code. In this post, we will try to aid this problem by introducing a way to automatically parallelize effectful code written with for-comprehensions.

Effect systems

Let’s start with a quick summary of effect systems. An effect system is a library that provides a way to track effects in the types. Some examples of such libraries would be Zio or cats-effect. They do that by having a type of effectful computations, ZIO and IO, respectively.

So, for example, a function that simply gives us an integer value will have a type Int, but a function that has to make an HTTP request to get the number can have a type IO[Int]. That’s because making an HTTP request is a side-effect.

Effect types API

As with all APIs, having the types/data structures is not enough – we should also have a set of operations to perform on them. Each of those libraries provides tons of helper functions, from handling console interaction to handling files and streaming data.

In our case, we are interested in the most generic ones, i.e. the ones used most often. That would be map and flatMap.

Now, I know what you’re thinking. And no, this is not going to be another Monad tutorial, so I’ll summarise those functions in two sentences:

map is usually provided by a type class/interface Functor and performs a pure operation on the result of the effect, and its type is F[A] => (A => B) => F[B].
flatMap is usually provided by a type class/interface Monad and performs an effectful operation sequentially after getting the result of the first computation, and its type is F[A] => (A => F[B]) => F[B].

The running example

Let’s introduce an example of a function we might want to write in a real-world program using effect systems. Our function will be responsible for loading the data for a product on a marketplace website. To get specific parts of data, we will use methods on other service classes. Those will be:

java

productService.getInfo: ProductID => IO[ProductInfo]
userService.getInfo: UserID => IO[UserInfo]
recommendationService.getSimilarProducts: ProductID => IO[List[ProductID]]
ratingService.getUserRating: UserID => IO[Int]
renderingService.renderProductSite: ProductInfo => UserInfo => List[ProductID] => Int => HTML

Now that we have decided on the example, there are two styles of writing this type of code. We will label them with greek letters to be able to distinguish them easily.

Sigma

The Sigma approach is straightforward. We use calls to map and flatMap to glue together the calls to the required methods. A possible solution will look like this:

java

def renderProductSiteSigma(productID: ProductID): IO[HTML] =
  productService.getInfo(productID).flatMap { product =>
    recommendationService.getSimilarProducts(productID).flatMap { recommended =>
      userService.getInfo(product.ownerID).flatMap { owner =>
        ratingService.getUserRating(product.ownerID).map { rating =>
          renderingService.renderProductSite(product, owner, recommended, rating)
        }
      }
    }
  }

That’s it. We used all of the required functions from the API and spliced them together using flatMap‘s and map‘s.

Some people don’t like this approach because it lacks clarity and is too explicit. That is where the second approach comes in.

Beta

The beta approach uses for-comprehensions. This fit quite nicely here since in Scala, for‘s are just syntactic sugar for chains of map, flatMap, and withFilter. So a solution that uses for-comprehensions will look like this:

def renderProductSiteBeta(productID: ProductID): IO[HTML] = for { product <- productService.getInfo(productID) recommended <- recommendationService.getSimilarProducts(productID) owner <- userService.getInfo(product.ownerID) rating <- ratingService.getUserRating(product.ownerID) } yield renderingService.renderProductSite(product, owner, recommended, rating)

You should decide which approach you prefer since it’s just a matter of style. I’d argue that the main advantage of the Beta approach is being more beginner-friendly since it mimics the imperative style. On the other hand, the Sigma approach is more explicit, which makes it easier

Parallelism

There is an old Polish saying: “If you can get the same result faster and slower, then it’s better to get it faster”. I’d say that I agree, and probably so would most software clients. That’s why we would like to make our example application faster. But how can we make it faster, without knowing the implementation details of our methods? The answer is that we are going to introduce parallelism i.e. whenever two method calls can be done at the same time, we will allow them to be run in parallel.

You might have noticed that calls to productService.getInfo(productID) and recommendationService.getSimilarProducts(productID) can be performed in parallel. That’s because, for both of them, we just need the productID, which is an argument to the function. The same goes for getting owner and rating, though both of those depend on product.

The main questions we have to answer now are:

What other operations on effect types do we need?
How hard will it be to introduce parallelism to our solutions, Sigma and Beta?

The middle child of functional programming

What other operations on effect types do we need? There is a type in functional programming that allows for parallel executions. Unfortunately, it is often overlooked when learning functional programming. I am, of course, talking about Applicative.

Applicative usually provides two functions, pure and ap, with their respective signatures (for IO) pure: A => IO[A] and ap: `IO[A] => IO[A => B] => IO[B]. For our use case, we only care about ap. But because its signature is quite hard to understand, we’ll use an alternative function that is as expressive as ap – let’s call it zipPar: IO[A] => IO[B] => IO[(A, B)].

Now Applicative should be way easier to understand, with zipPar that takes two effectful computations and combines them into an effectful computation that returns the results of both. Now that is exactly what we need to express parallelism.

Parallel Sigma

How can we change the Sigma approach to utilize parallelism, then? The solution will not change that much:

java

def renderProductSiteSigmaPar(productID: ProductID): IO[HTML] =
  productService.getInfo(productID)
    .zipPar(recommendationService.getSimilarProducts(productID))
    .flatMap { case (product, recommended) =>
      userService.getInfo(product.ownerID)
        .zipPar(ratingService.getUserRating(product.ownerID))
        .map { case (owner, rating) =>
          renderingService.renderProductSite(product, owner, recommended, rating)
        }
    }

We changed two calls from flatMap to zipPar, but other than that, the code looks roughly similar.

Parallel Beta

Now, what about Beta? We might be tempted to write the following code:

java

def renderProductSiteBetaPar(productID: ProductID): IO[HTML] =
  for {
    (product, recommended) <- productService.getInfo(productID)
    	.zipPar(recommendationService.getSimilarProducts(productID))
    (owner, rating) <- userService.getInfo(product.ownerID)
    	.zipPar(ratingService.getUserRating(product.ownerID))
  } yield renderingService.renderProductSite(product, owner, recommended, rating)

Which again doesn’t look that dissimilar to the previous approach. The problems here are that:

It is now a hybrid approach between Sigma and Beta.
Current Scala versions will desugar this for to use withFilter, which is not ideal since most common effect types do not implement withFilter.

Oh no! Does this mean that we cannot use for-comprehensions for parallel code? Is there a solution?

Of course, there is! Let’s take a look at it below.

The Alpha solution – AvocADO

Our solution is to copy what Haskell does with the ApplicativeDo language extension, but instead of a language extension, we will use a macro that changes the for-comprehension into its parallel version.

I’ll summarise what the macro does in three steps:

Go through every binding
Change it to zipPar if it doesn’t depend on any of the currency zipped values
Leave it as a flatMap if it does, and go to the next one

How to use AvocADO

Provided you added a dependency for avocADO in your build tool, the code for our example will look like this:

java

import avocado.*
import avocado.instances.your_effect_system.given

def renderProductSiteBetaParAcocADO(productID: ProductID): IO[HTML] =
  ado {
    for {
      product <- productService.getInfo(productID)
      recommended <- recommendationService.getSimilarProducts(productID)
      owner <- userService.getInfo(product.ownerID)
      rating <- ratingService.getUserRating(product.ownerID)
    } yield renderingService.renderProductSite(product, owner, recommended, rating)
  }

That’s it. You just import the macro function and instance, pass the for-comprehension as an argument to ado and let the library do the hard work.

Conclusion

I hope this blog post gave you a good explanation as to why we need Applicative in functional programming and what it can offer us.

AvocADO library can be found here: https://github.com/VirtusLab/avocADO

If you think that the library sounds interesting, then consider trying it out. It is still very much in the experimental stage, so bug reports, feature requests, and contributions are very much welcome. Also, consider giving it a star.

If your team is working with Scala 3 or if you’re planning on migrating, we can help. We offer engineering support for Scala 3 projects as well as free support for Scala 2 to 3 migration. Learn more here:

Scala 3 support subscription

Automatic parallelization of for-comprehensions in Scala 3

Effect systems

Effect types API

The running example

java

Sigma

java

Beta

Parallelism

The middle child of functional programming

Parallel Sigma

java

Parallel Beta

java

The Alpha solution – AvocADO

How to use AvocADO

java

Conclusion

Liked the article?

explore more on

Partner flexibly with VirtusLab

Consulting

Team augmentation

End to end delivery

Bussiness sectors

Capabilities

Take the first step to a sustained competitive edge for your business