Published: May 16, 2023 | 12 min read
Since Akka is now a paid tool, we also decided to move towards Pekko, so you can still use an open-source serialization helper.
Every message that leaves a JVM boundary in Pekko needs to be serialized first. However, the existing solutions for serialization in Scala leave a lot of room for runtime failures. These failures result from programmer oversights that the compiler does not report as possible runtime errors at compile time. This is why VirtusLab is glad to introduce the Pekko Serialization Helper. It’s a toolkit with a Circe-based, runtime-safe serializer and a set of Scala compiler plugins to counteract the common pitfalls in Pekko serialization.
Although Pekko is a great tool to work with, it also has some downsides when it comes to serialization. Several situations might cause unexpected errors when working with standard Pekko serialization, such as:
Missing serialization binding
Incompatibility of persistent data
Jackson Pekko serializer drawbacks
Missing codec registration
What these situations have in common is that they are bugs in the application code caused by programmer mistakes. The Scala compiler, on its own, passes over these bugs, which can easily break your app at runtime. Fortunately, there is a way to catch them during compilation with Pekko Serialization Helper, or PSH.
Once this is done, let’s enable the sbt plugin in the target project:
lazy val app = (project in file("app"))
  .enablePlugins(PekkoSerializationHelperPlugin)
Pekko-specific objects that get serialized include Messages, Events and States, and their serialization can fail at runtime. Pekko Serialization Helper helps to spot these errors during compilation, before they reach a running application. Let’s look at the common runtime errors related to Pekko serialization.
If you want to read more about events and persistent states, follow this link.
1. Missing serialization binding
Proper serialization in Pekko follows a set pattern:
First, define a Scala trait that marks the messages, persistent states and events to be serialized:
package org
trait MySer
Second, bind a serializer to this trait in a configuration file:
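A binding of this kind could look like the following sketch. It assumes the Jackson JSON serializer, which the pekko-serialization-jackson module registers under the name jackson-json, and the org.MySer trait defined above; a project using a different serializer would substitute its own name here.

```hocon
# Assumption: pekko-serialization-jackson is on the classpath,
# so the jackson-json serializer name is already defined.
pekko.actor {
  serialization-bindings {
    # Every class extending org.MySer is handled by jackson-json.
    "org.MySer" = jackson-json
  }
}
```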
Now let’s wire up Pekko Serialization Helper. The serializability-checker-plugin, part of PSH, detects messages, events and persistent states. It checks whether they extend the given base trait and reports an error when they don't.
This ensures that Pekko uses the specified serializer and protects a running application against an unintended fallback to Java serialization or outright serialization failure.
This plugin requires an @SerializabilityTrait annotation on the base trait. With serializability-checker-plugin enabled and the annotation in place, the compiler catches errors like this during compilation:
test0.scala:7: error: org.random.project.BehaviorTest.Command is used as Pekko message
but does not extend a trait annotated with org.virtuslab.psh.annotation.SerializabilityTrait.
Passing an object of a class that does NOT extend a trait annotated with SerializabilityTrait as a message may cause Pekko to
2. Incompatibility of persistent data
A common problem with persistence in Pekko is the incompatibility of already persisted data with schemas defined in a new version of an application.
The solution for this incompatibility is the dump-persistence-schema-plugin, another part of the Pekko Serialization Helper toolbox. It is a mix of a compiler plugin and an sbt task. The plugin dumps the pekko-persistence schema to a file, so that accidental changes to events (journal) and states (snapshots) can be detected with a simple diff.
If you want to dump a persistence schema for each sbt module where Pekko Serialization Helper Plugin is enabled, run:
sbt ashDumpPersistenceSchema
It saves the created dump into a yaml file. The default is target/<sbt-module-name>-dump-persistence-schema-<version>.yaml
Then, a simple diff command can be used to compare the schema version from the develop/main branch with the version from the current commit. Such a comparison lets us catch possible incompatibilities of persisted data.
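To illustrate what such a comparison reports, here is a minimal sketch in plain Scala. It is not part of Pekko Serialization Helper: the SchemaDiff object is a hypothetical helper, and it treats a dump as plain text with one schema entry per line, which is a simplification of the real YAML output. In practice, running diff on the two dump files does the same job.

```scala
// Hypothetical helper for illustration; not part of Pekko Serialization Helper.
object SchemaDiff {
  // Non-empty, trimmed lines of a dump.
  private def entries(dump: String): List[String] =
    dump.split("\n").toList.map(_.trim).filter(_.nonEmpty)

  // Entries present in `before` but absent from `after`:
  // these are the changes that can break already persisted data.
  def removed(before: String, after: String): List[String] = {
    val kept = entries(after).toSet
    entries(before).filterNot(kept)
  }

  // Entries present in `after` but absent from `before`.
  def added(before: String, after: String): List[String] =
    removed(after, before)
}
```

A removed or changed entry is the signal to look for: an event or state that existing journals and snapshots may still contain no longer matches the code.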
3. Jackson Pekko Serializer drawbacks
One more pitfall is using the Jackson Serializer for Pekko. Let’s dive into some examples of problems that might occur when combining Jackson with Scala code:
Jackson Serializer – Scala example 1
Let’s take a look at some code that is dangerous with Jackson:
case class Message(animal: Animal) extends MySer

sealed trait Animal

final case class Lion(name: String) extends Animal
final case class Tiger(name: String) extends Animal
This code seems to be alright, but unfortunately it will not work with the Jackson serialization. At runtime, there will be an exception with a message such as: “Cannot construct instance of Animal(...)”. The reason behind it is that abstract types need to be mapped to concrete types explicitly in code. If you want to make this code work, you need to add a lot of Jackson annotations:
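import com.fasterxml.jackson.annotation.{JsonSubTypes, JsonTypeInfo}

case class Message(animal: Animal) extends MySer

// Tell Jackson which concrete class each "type" value maps to.
@JsonTypeInfo(use = JsonTypeInfo.Id.NAME, property = "type")
@JsonSubTypes(
  Array(
    new JsonSubTypes.Type(value = classOf[Lion], name = "lion"),
    new JsonSubTypes.Type(value = classOf[Tiger], name = "tiger")))
sealed trait Animal

final case class Lion(name: String) extends Animal
final case class Tiger(name: String) extends Animal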
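To make the idea of an explicit mapping concrete, here is a hand-rolled sketch in plain Scala, without Jackson. The AnimalCodec object and the "type" and "name" field names are illustrative assumptions, not part of any library. It shows the information a decoder needs in order to pick a concrete class for the abstract Animal type, which is what the annotations supply to Jackson.

```scala
sealed trait Animal
final case class Lion(name: String) extends Animal
final case class Tiger(name: String) extends Animal

// Hypothetical codec for illustration; field names are assumptions.
object AnimalCodec {
  // Store a discriminator next to the data so the concrete type can be recovered.
  def encode(animal: Animal): Map[String, String] = animal match {
    case Lion(name)  => Map("type" -> "lion", "name" -> name)
    case Tiger(name) => Map("type" -> "tiger", "name" -> name)
  }

  // Without the "type" entry there is no way to know which subclass to build.
  def decode(fields: Map[String, String]): Either[String, Animal] =
    (fields.get("type"), fields.get("name")) match {
      case (Some("lion"), Some(name))  => Right(Lion(name))
      case (Some("tiger"), Some(name)) => Right(Tiger(name))
      case (Some(other), Some(_))      => Left(s"unknown animal type: $other")
      case _                           => Left("missing 'type' or 'name' field")
    }
}
```

Because Animal is sealed, the compiler can warn when encode misses a subclass. A reflection-based serializer gives no such warning, and the gap only shows up at runtime.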
Jackson Serializer – Scala example 2
If a Scala object is defined:
case object Tick
Then there will be no exceptions during serialization. During deserialization, though, Jackson will create another instance of object Tick’s underlying class instead of restoring object Tick’s underlying singleton. This means that deserialization results in unexpected behavior that goes unreported:
actorRef ! Tick

// Inside the actor:
def receive = {
  case Tick => // this won't get matched !!
} // message will be unhandled !!
Pekko Serialization Helper as alternative
Pekko Serialization Helper provides a more Scala-friendly serializer that uses Circe.
Use our Circe-based Pekko serializer to get rid of the problems shown in the examples above. The Circe Pekko Serializer comes with the Pekko Serialization Helper toolbox. It uses Circe codecs that are derived using Shapeless and generated during compilation. This ensures that the serializer doesn’t crash at runtime, as reflection-based serializers might do.
The Circe Pekko Serializer is easy to use. Add it to the project dependencies, then create a custom serializer by extending CircePekkoSerializer:
  override lazy val codecs = Seq(Register[CommandOne], Register[CommandTwo])

  override lazy val manifestMigrations = Nil

  override lazy val packagePrefix = "org.project"
}
MySerializable in the example above is the name of the base trait.
Last but not least, remember to add your custom serializer to the Pekko configuration. In short, add the following two settings to the .conf file, pekko.actor.serializers and pekko.actor.serialization-bindings:
pekko {
  actor {
    serializers {
      circe-json = "org.example.ExampleSerializer"
    }
    serialization-bindings {
      "org.example.MySerializable" = circe-json
    }
  }
}
From now on you’ll have a safe Circe-based serializer to cope with the serialization of your objects.
4. Missing Codec registration
The last situation in which unexpected runtime exceptions might occur during serialization is a missing codec registration.
  override lazy val codecs = Seq(Register[CommandOne]) // WHOOPS someone forgot to register CommandTwo...
}
java.lang.RuntimeException: Serialization of [CommandTwo] failed. Call Register[A]
for this class or its supertype and append the result to `def codecs`.
Pekko Serialization Helper can help by using the @Serializer annotation.
The Pekko Serialization Helper toolbox includes the codec-registration-checker-plugin. During compilation, it gathers all direct descendants of the class marked with @SerializabilityTrait and checks that the body of each class annotated with @Serializer contains a reference to every one of these direct descendants.
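The check can be pictured as a set difference. The sketch below models it at runtime in plain Scala. The real plugin inspects the code during compilation, and CodecRegistrationCheck and its parameter names are hypothetical, introduced only for this illustration.

```scala
// Hypothetical runtime model of the compile-time check; not the plugin's code.
object CodecRegistrationCheck {
  // Names of direct descendants that the serializer body never references.
  def missing(
      directDescendants: Set[String],
      referencedInSerializer: Set[String]): Set[String] =
    directDescendants -- referencedInSerializer
}
```

For the serializer above that registers only CommandOne, `CodecRegistrationCheck.missing(Set("CommandOne", "CommandTwo"), Set("CommandOne"))` yields `Set("CommandTwo")`, the type whose serialization would otherwise fail at runtime.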
Let’s take a look at how we can apply the plugin to check a class extending CircePekkoSerializer:
Pekko Serialization Helper is the right tool to make Pekko serialization bulletproof by catching possible runtime exceptions during compilation. It is free to use and easy to configure. Moreover, it is already used in commercial projects, although it has not reached full maturity yet. If you want to know more, check out PSH readme on GitHub.