Handling errors in direct-style Scala

12 Mar 2024 · 9 minutes read


Error handling is the cornerstone of any library that orchestrates the execution of user-provided code. That's also the case in ox, a library for safe direct-style concurrency and resiliency for Scala on the JVM.


Many of ox's features are centered around error handling: structured concurrency, retries, error propagation in channels. However, so far, exceptions have been the primary error signaling mechanism in ox. In Scala, all exceptions are unchecked, a design choice that stems from the many reasons why Java's checked exceptions are considered a failed feature of the language.

However, while having only unchecked exceptions alleviates some of the problems of their checked flavor, it also makes a dent in the otherwise type-safe nature of Scala codebases.

Let's first explore different ways in which we might represent errors and then examine how such representations are supported in ox.

Errors as values

To rectify the dent in type-safety that we mentioned, it's often recommended to represent errors differently: as values. When using such a representation, the type of the error is part of each method's signature, allowing for more precision, compile-time safety, and better documentation. A primary example of such an approach is ZIO, where each computation description is parameterized by the type of error it might encounter. However, ZIO uses monadic composition to define programs and requires an entirely different programming style—something the direct style approach tries to avoid.

Still, we can represent errors as values when using direct style. It's worth noting that exceptions are still possible when an error is represented as a value. Hence, we need to make a distinction as to which errors should use which mechanism. I think the approach ZIO took, where exceptions are used to signal "defects" in code (bugs), is quite sensible. On the other hand, any of the "expected" errors should be represented by values.

Of course, this is not a universal recipe, and it's quite possible that you'll need to adjust these definitions for your use case. However, it's worth specifying upfront what kind of error-handling mechanisms should be used and when. To distinguish bugs from the "expected" errors, represented as values, we'll use the term application errors going forward.

Most I/O libraries, especially Java ones, use exceptions to signal all kinds of errors—both the "expected" ones and bugs. Thus, it might be necessary to create utilities or use library wrappers that convert such exceptions to an as-a-value representation. Moreover, an unexpected error at one level might be an expected error at another, so we'll also need utilities to convert between the two representations.
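As a rough illustration of such conversion utilities, here's a minimal sketch that uses only the standard library. The InvalidPort error and both function names are hypothetical, made up for this example. The first function turns one specific, expected exception into a value and lets everything else propagate as a defect. The second goes the other way, for a layer where the same error is considered a bug:

```scala
// Hypothetical application error, used only for this illustration
final case class InvalidPort(raw: String)

// exception -> value: only the expected failure is captured,
// any other exception still propagates as a defect
def parsePort(raw: String): Either[InvalidPort, Int] =
  try Right(raw.trim.toInt)
  catch { case _: NumberFormatException => Left(InvalidPort(raw)) }

// value -> exception: at this level, an invalid port is treated as a bug
def parsePortOrThrow(raw: String): Int =
  parsePort(raw) match {
    case Right(port) => port
    case Left(err)   => throw new IllegalStateException(s"Invalid port: ${err.raw}")
  }
```

Which exceptions count as "expected" is a per-project decision; the point is only that the boundary between the two representations is explicit.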

Picking a representation

Now that we have the basic terminology set up, we need some specific mechanisms to represent errors as values. One approach is to make the error states members of the data types returned by your "business" functions. After all, since we are dealing with "expected errors", the business logic should handle the error states and the successful states. This might become hard when cross-cutting concerns are at play since we want to avoid repeating an error state in each ADT returned by a business logic method, and we might still want to have some generic error representation.
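To make that first option concrete, here's a minimal sketch in which the error state is simply one of the cases of the result type. The LookupResult ADT, the User fields, and the stubbed lookup logic are all assumptions made for the example:

```scala
final case class User(id: Int, email: String)

// Hypothetical result ADT: the error state is just another case
sealed trait LookupResult
object LookupResult {
  final case class Found(user: User) extends LookupResult
  case object NotFound extends LookupResult
}

// stubbed lookup: only the user with id 1 exists
def lookupUser(id: Int): LookupResult =
  if (id == 1) LookupResult.Found(User(1, "a@example.com"))
  else LookupResult.NotFound

// the business logic handles both the successful and the error state
def describe(result: LookupResult): String = result match {
  case LookupResult.Found(user) => s"found ${user.email}"
  case LookupResult.NotFound    => "no such user"
}
```

This reads well in isolation, but every result type needs its own error cases, which is where the repetition mentioned above starts to hurt.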

That's why I think a very reasonable representation of errors-as-values is using Scala's Eithers, where by convention, the left side represents an error, and the right side—success. For example, a method such as:

def lookupUser(id: Int): Either[NotFound, User] = ???

clearly states what kind of error might be encountered. What about composition, though? We wanted to do direct-style, and we're back to for comprehensions. For example:

def lookupUser(id: Int): Either[NotFound, User] = ???
def updateEmail(user: User, email: String): Either[Conflict, User] = ???

val result = for {
  u <- lookupUser(1120)
  _ <- updateEmail(u, "test@example.com")
} yield ()

Yes, that's true: composing two computations that might report errors requires a flatMap or a for-comprehension. However, as one participant of Scala.IO pointed out during a recent conversation, the situation is different from that of asynchronous monads: we can deconstruct the value "directly" to obtain a non-Either result. That's not possible with various flavors of IOs, as they should only be run "at the end of the world".

Hence, Eithers used to represent errors are much less viral. We can use pattern matching or fold to collapse the potentially erroneous result:

val user = lookupUser(10) match {
  case Left(NotFound()) => registerUser(email)
  case Right(user) => user
}
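The same collapse can be written with fold. Here's a self-contained sketch; the User fields and the stubbed lookupUser and registerUser bodies are assumptions made so that the snippet runs on its own:

```scala
final case class User(id: Int, email: String)
final case class NotFound()

// stubbed for illustration: only the user with id 10 exists
def lookupUser(id: Int): Either[NotFound, User] =
  if (id == 10) Right(User(10, "known@example.com")) else Left(NotFound())

// stubbed for illustration: "registers" a user with a placeholder id
def registerUser(email: String): User = User(-1, email)

// fold takes one function per side: the first handles Left, the second Right
val user: User =
  lookupUser(11).fold(_ => registerUser("new@example.com"), identity)
```

Either way, we end up with a plain User, with no Either left in the type.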

As a side note, some consider Java's checked exceptions to be mere syntactic sugar for Eithers, arguing that "real" exceptions should not influence the caller methods and their code/signatures.

Moreover, we can use the freshly introduced boundary-break integrated with Either to get scoped direct syntax (getEither comes from a gist; it's not part of Scala's stdlib):

import getEither.?
val result: Either[AppError, User] = getEither {
  val user = lookupUser(1120).?
  val _ = updateEmail(user, "test@example.com").?
  user
}

Note the ? postfix operators, which "unwrap" the Right value and short-circuit the computation to the getEither boundary in case of a Left. Boundary-break uses exceptions underneath; however, in some cases, the compiler optimizes this.
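To give an idea of how such a helper might be built on top of scala.util.boundary (available since Scala 3.3), here's a minimal sketch. It is not the code from the gist: the eitherBoundary name and the half/quarter functions are made up for illustration, and this non-inlined version always goes through break:

```scala
import scala.util.boundary
import scala.util.boundary.{break, Label}

// Hypothetical stand-in for getEither; not the gist's actual code
object eitherBoundary {
  // runs the body; a `.?` on a Left inside it breaks out to this boundary
  def apply[E, A](body: Label[Left[E, Nothing]] ?=> A): Either[E, A] =
    boundary(Right(body))

  extension [E, A](e: Either[E, A]) {
    // unwraps a Right, or short-circuits to the enclosing boundary with the Left
    def ? (using Label[Left[E, Nothing]]): A = e match {
      case Right(a)  => a
      case Left(err) => break(Left[E, Nothing](err))
    }
  }
}

import eitherBoundary.*

def half(n: Int): Either[String, Int] =
  if (n % 2 == 0) Right(n / 2) else Left(s"$n is odd")

// the declared result type lets the compiler infer the error type of the boundary
def quarter(n: Int): Either[String, Int] = eitherBoundary {
  val h = half(n).?
  half(h).?
}
```

Since Label is contravariant, the label created by the boundary can also be used by `.?` calls whose error type is a subtype of the boundary's error type.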

Summing up: Eithers are definitely a decent error representation mechanism, not always allowing us to strictly keep writing our code in direct style, but probably as close as we can get. After all, if we want to represent errors as values, we have to somehow account for that fact in our code.

Alternative representation

As an alternative to Eithers, we can also consider using union types. They are untagged—only checked at compile time—hence, they bear no run-time overhead. However, you might get unchecked warnings when discriminating the union type's members if at least one of them is generic. Also, you cannot represent nested errors using union types since they are flattened.

Still, the alternative exists and might work quite well in some cases. As an upside, we don't need to use for-comprehensions. However, we still need to inspect every result to determine whether it is an error or not:

def lookupUser(id: Int): NotFound | User = ???
def updateEmail(user: User, email: String): Conflict | User = ???

val result: NotFound | Conflict | User = lookupUser(1120) match {
  case u: User => updateEmail(u, "test@example.org")
  case e       => e
}

We can also combine the either-approach with union types, using them to represent possible errors. This might give us the best of both worlds, as we don't need common ancestors for otherwise disparate error types (note the type ascription for result):

def lookupUser(id: Int): Either[NotFound, User] = ???
def updateEmail(user: User, email: String): Either[Conflict, User] = ???

val result: Either[NotFound | Conflict, Unit] = for {
  u <- lookupUser(10)
  _ <- updateEmail(u, "x")
} yield ()
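Here's a runnable sketch of the combined approach, this time also handling the resulting union of errors with a single pattern match. The fields of User, NotFound and Conflict, as well as the stubbed method bodies, are assumptions made for the example:

```scala
final case class User(id: Int, email: String)
final case class NotFound(id: Int)
final case class Conflict(email: String)

// stubbed for illustration: only the user with id 10 exists
def lookupUser(id: Int): Either[NotFound, User] =
  if (id == 10) Right(User(10, "old@example.com")) else Left(NotFound(id))

// stubbed for illustration: one email address is already taken
def updateEmail(user: User, email: String): Either[Conflict, User] =
  if (email == "taken@example.com") Left(Conflict(email))
  else Right(user.copy(email = email))

// the declared result type widens the error side to the union
def changeEmail(id: Int, email: String): Either[NotFound | Conflict, User] =
  for {
    u <- lookupUser(id)
    updated <- updateEmail(u, email)
  } yield updated

// no common supertype is needed to handle both kinds of errors
def describe(result: Either[NotFound | Conflict, User]): String = result match {
  case Right(u)              => s"updated to ${u.email}"
  case Left(NotFound(id))    => s"no user $id"
  case Left(Conflict(email)) => s"$email already in use"
}
```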

In ox

As mentioned at the beginning, so far ox only supported exceptions as an error signaling mechanism. Recently, we have also added support for application errors. We don't mandate a fixed representation of such errors, to accommodate the solutions mentioned above, but also any project-specific error representations.

The central construct is a trait ErrorMode[E, F[_]], which describes the representation of application errors. Such errors have a fixed error type E, which is reported in context F. Support for Eithers is built-in and implements the ErrorMode with the following type parameters:

class EitherMode[E] extends ErrorMode[E, [T] =>> Either[E, T]]
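To get a feeling for what such an abstraction involves, here's a simplified, self-contained sketch. It is a stand-in, not ox's actual definition: the ErrorModeSketch and EitherModeSketch names, the member names, and the zipSeq helper are all assumptions made for illustration. The idea is that generic code only needs to ask "is this an error?", extract either side, and wrap values back into the context F:

```scala
// Hypothetical, simplified stand-in for ErrorMode; member names are assumptions
trait ErrorModeSketch[E, F[_]] {
  def isError[T](f: F[T]): Boolean
  def getError[T](f: F[T]): E
  def getValue[T](f: F[T]): T
  def pure[T](t: T): F[T]
  def pureError[T](e: E): F[T]
}

class EitherModeSketch[E] extends ErrorModeSketch[E, [T] =>> Either[E, T]] {
  def isError[T](f: Either[E, T]): Boolean = f.isLeft
  def getError[T](f: Either[E, T]): E = f match {
    case Left(e)  => e
    case Right(_) => throw new IllegalStateException("not an error")
  }
  def getValue[T](f: Either[E, T]): T = f match {
    case Right(t) => t
    case Left(_)  => throw new IllegalStateException("not a value")
  }
  def pure[T](t: T): Either[E, T] = Right(t)
  def pureError[T](e: E): Either[E, T] = Left(e)
}

// sequential (not parallel!) combinator, written once for any error mode:
// the second computation is skipped if the first one reports an error
def zipSeq[E, F[_], A, B](em: ErrorModeSketch[E, F])(a: => F[A], b: => F[B]): F[(A, B)] = {
  val ra = a
  if (em.isError(ra)) em.pureError[(A, B)](em.getError(ra))
  else {
    val rb = b
    if (em.isError(rb)) em.pureError[(A, B)](em.getError(rb))
    else em.pure((em.getValue(ra), em.getValue(rb)))
  }
}

val em = new EitherModeSketch[String]
val ok: Either[String, Int] = Right(1)
val failed: Either[String, Int] = Left("boom")
```

With these definitions, zipSeq(em)(ok, ok) yields a Right with a tuple, while zipSeq(em)(ok, failed) yields the Left unchanged. A project-specific error representation would only need its own implementation of the trait.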

Error modes are currently supported in supervised blocks, par, race, and retry methods. Let's examine a couple of examples.

Parallel computations with different error modes

First, we'll examine par and its variants. The variant that uses exceptions to short-circuit computation remains unchanged:

def par[T1, T2](t1: => T1, t2: => T2): (T1, T2) = ???

def computation1: Int = ???
def computation2: String = ???

val result: (Int, String) = par(computation1, computation2)

If any of the computations throws an exception, the other is interrupted, and the exception is propagated to the caller. However, in the latest release of ox, we also have the parEither method, which short-circuits the computation if any returns a Left:

def parEither[E, T1, T2](
    t1: => Either[E, T1], 
    t2: => Either[E, T2]): Either[E, (T1, T2)] = ???

def computation1: Either[Int, Result1] = ???
def computation2: Either[Int, Result2] = ???

val result: Either[Int, (Result1, Result2)] = 
  parEither(computation1, computation2)

Hence, if all computations succeed, we get a tuple of the results in the Either context (a Right). If any fails, we get the first application error. Of course, parEither still satisfies the structured concurrency property, and only returns after all created forks have been completed, either successfully, with an exception, or due to an interrupt.

Finally, we've got a variant which accepts an arbitrary error mode:

def par[E, F[_], T1, T2](em: ErrorMode[E, F])(
    t1: => F[T1], t2: => F[T2]): F[(T1, T2)] = ???

Application-error-aware supervised scopes

Supervised scopes are the basic building block for structured concurrency in ox. They provide a syntactic boundary, within which asynchronous computations might be forked (backed by virtual threads), and which guarantees that all forks will be completed before the block completes.

Previously, they were aware only of exceptions: that is, when any supervised fork threw an exception, the scope ended, interrupting any forks that were still in progress. You might recognize this pattern from Akka's "let it crash": a single unhandled error winds down the whole hierarchy (in Akka it's actors, in ox it's forks).

Now, we can also create supervised scopes by providing an arbitrary error mode. Such a scope can start application-error-aware forks using forkError or, as in the example below, forkUserError. Such a fork will end the scope whenever an application error is encountered, propagating the error as the result of the whole scope:

supervisedError(EitherMode[Int]) {
  // the fork below completes with an application error
  forkUserError { Left(10) }
  // the body of the scope completes successfully
  Right(())
}
// returns Left(10)

supervisedError scopes are used behind the scenes to implement the variants of par and race mentioned above.

Using such supervised scopes is safe: thanks to context functions, the compiler statically checks which kinds of forks might be started in which kinds of scopes.

To dive a bit more into the technical details, there are two capabilities that a method may require. First, there's Ox, which allows starting "regular" (exception-aware) forks. Second, there's OxError[E, F[_]], which allows starting application-error-aware forks. That's an addition to the default error handling: these forks are still exception-aware, with the same behavior if an exception is thrown.

The type parameters to OxError are usually inferred, and they only need to be provided by hand if there's a custom function that needs to fork asynchronous computations as part of its logic. Finally, OxError is a subtype of Ox, so we can use any method that requires the capability of "regular" forking in a supervisedError scope.
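The capability mechanism can be illustrated without ox at all. Below is a minimal sketch in which the Forking and ErrorForking traits only mirror the Ox / OxError subtyping relationship; all names are hypothetical and nothing is actually forked. The scopes provide a capability through a context function, and methods demand it through a using clause:

```scala
// Hypothetical capability markers; they only mirror the Ox / OxError subtyping
trait Forking
trait ErrorForking[E] extends Forking

// "forks" are stubbed as strings; only the required capabilities matter here
def fork(name: String)(using Forking): String = s"fork($name)"
def forkWithError[E](name: String)(using ErrorForking[E]): String = s"forkError($name)"

// each scope provides its capability to the body via a context function
def scope[T](body: Forking ?=> T): T = body(using new Forking {})
def errorScope[E, T](body: ErrorForking[E] ?=> T): T = body(using new ErrorForking[E] {})

// the error-aware scope allows both kinds of forks, thanks to subtyping
val inErrorScope: String = errorScope[Int, String] {
  fork("a") + " " + forkWithError[Int]("b")
}

// the regular scope only allows regular forks
val inScope: String = scope { fork("c") }

// scope { forkWithError[Int]("d") } would not compile:
// no ErrorForking[Int] capability is available in a regular scope
```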

There's more to add

Going forward, we'll be adding more utilities around error handling, converting between error representations, as well as syntactical improvements. The goal of these changes will be to streamline working with errors when using direct-style Scala, in accordance with ox's motto: "safe direct-style concurrency and resiliency for Scala on the JVM".

If you have any suggestions about what areas related to error handling should be covered, please head over to our community forum. It would be great to get your feedback! Ox is an open-source project available under the Apache 2 license and waiting for your testing in Maven Central.

Last but not least, if you find Ox interesting, please star the repository!
