Monads: Your App as a Function, Part 2

This is part 2 of Monads: Your App As A Function.

In the previous post we started looking at monads, and summarized that their core purpose is to transport values or computations along a call chain. (Functional programming is all about inputs and outputs.) To enable this behavior, we said monads have the following properties:

  • Monads are types with two functions unit and flatMap
  • Monads are constructed via unit to contain arbitrary data
  • Monads are chained together (potentially transforming the contained data) via flatMap

We concluded the article by saying that this is not the entire truth. What we have neglected so far are the laws that govern unit and flatMap, and that have to hold for a type that looks like a monad to actually be one.

Let me begin by saying that there is no deeper meaning in these laws beyond supporting common sense. I still think it’s worth looking at them because they close the loop to imperative programming, which you might be more familiar with: the second and third laws in particular are what earned monads the nickname “programmable semicolons”, in case you’ve stumbled upon that phrase before.

The three monad laws

As I mentioned, it’s not actually sufficient to merely provide the unit and flatMap functions in order to write a monad type. These two functions must also behave in a certain way. This behavior is summarized in three short algebraic laws, and they are as follows.

Associativity

The law of associativity says exactly what you’d expect it to. It demands that for any monad m and any two functions f and g, it must not make a difference whether you first flatMap f over m and then flatMap g over the result, or whether you first compose f and g into a single function (recall that functions passed to flatMap return monads, so you can flatMap g over the result of f) and then flatMap that composite function over m. Or in short:

(m flatMap f) flatMap g == m flatMap ((x) -> f(x) flatMap g)

What does that mean, and why is it important? It simply means that the order in which you compose individual steps must not affect the overall outcome. This doesn’t have any deeper meaning beyond not violating common sense: if you pour yourself a coffee, it shouldn’t make any difference whether you first pour the coffee, then add sugar and cream, or first mix sugar and cream and add coffee to it. (I deliberately chose coffee, not tea, since any true Englishman would disagree with this statement!) To clear up a common misunderstanding: the law of associativity says nothing about the order of execution, only about the order of composition. Since the computation defined by f often involves side effects, the order of execution obviously does matter (writing a file to disk and then formatting the hard drive has a different outcome than formatting the hard drive and then writing files to it!)
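
To make this concrete, here’s a quick sketch of the law using the simple Monad&lt;T&gt; type from part 1, RxJava’s Func1 as the function interface, and Java 8 lambda syntax for brevity:

Monad<Integer> m = Monad.unit(16);
Func1<Integer, Monad<Double>> f = x -> Monad.unit(Math.sqrt(x));
Func1<Double, Monad<Double>> g = x -> Monad.unit(x + 1);

// compose left to right: (m flatMap f) flatMap g
double left = m.flatMap(f).flatMap(g).get();
// compose f and g first, then flatMap the composite over m
double right = m.flatMap(x -> f.call(x).flatMap(g)).get();
// both yield 5.0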

Left and right identity

Now these two are more interesting. The left unit law (“left identity”) states that:

unit(x) flatMap f == f(x)

Or in prose: flatMap must behave in such a way that, for any function f passed to it and any value x lifted into the monad via unit, the result is the same as calling f on x directly. This might be a bit more difficult to grasp at first, but beyond the aspect of chainability that we’ve already discovered, this law enables the “3rd C”: confinement.

Essentially it says: flatMap allows you to peek at the value contained in the monad and apply a transformation to it, all without leaving the monad. This is also why monads are called “programmable semicolons”: they allow you to “lift” an imperative statement into the confinement of the monad using a higher order function (flatMap) and chain it to the next computation, rather than separating the two computations by calling them explicitly and placing a semicolon between them (The “semicolon” here is not meant to be understood literally, but as a metaphor for demarcating expressions in an imperative call style. Even in languages that do not use semicolons, this applies.) That is, given two monad types M1 and M2, instead of saying:

m1 = M1.unit(x);
m2 = f(m1.get()); // f returns an instance of M2
result = m2.get();
...

You can say:

result = M1.unit(x).flatMap(f).get();

That is, we have obtained the result without explicitly peeking at x, simply by replacing imperative calls with a more fluent functional call style. An important takeaway here is: this leaves no room for further side effects in flatMap. Since the outcome must be equivalent as per this law, flatMap must not “squeeze” any extra side effects between the “semicolons”.

That leaves only the right unit law (“right identity”). Let’s have a look at its formal definition:

m flatMap unit == m

Or in plain English, flatMapping the unit constructor over a monad has the exact same outcome as not calling flatMap at all. Again this makes perfect sense if you translate it to an imperative programming style. Instead of saying:

result = m.get();

You can say:

result = m.flatMap(Monad.unit).get();

How does that make sense? It turns out that this law is important to support a fluent call style, since it allows us to build up long monad call chains without worrying about whether a function passed to flatMap merely re-wraps the value we’ve already obtained. This becomes more obvious if in the code snippet above you replace the call to Monad.unit with some arbitrary f that might do nothing more than call the unit constructor. Again, this law states that there is no room for side effects here. For a nice example of why this law is important, I suggest having a look at Scala’s Option[T] monad, or Guava’s Optional<T> if you prefer Java.
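
To make this a little more tangible, here’s a tiny sketch using the Monad&lt;T&gt; type from part 1; a Java 8 method reference stands in for the function passed to flatMap:

Monad<Integer> m = Monad.unit(42);

int before = m.get();                      // 42
int after = m.flatMap(Monad::unit).get();  // also 42: flatMapping unit changes nothing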

Now that you know the laws, forget about them

You saw this coming, right? I honestly think that the monad laws mostly exist to prevent behavior that would be counter-intuitive. In fact, there is at least one very good case where the left unit law should be violated: exception handling. To recall, the law states that:

unit(x) flatMap f == f(x)

But what if f throws an exception? If flatMap is supposed to behave the same way as the right-hand side, then it, too, will throw an exception and terminate the call chain. That’s bad, since it destroys one of the most valuable aspects of monads: chaining computations together. In defense of the purists, in mathematics there is no such thing as exceptions. Signals don’t magically stop functions and make them return nothing. Unfortunately, in the real world we’re faced with functions that are not pure in a mathematical sense, so instead we take the “third C”, confinement, a little further and transform the exception into a value and trap it in the monad. This is exactly what RxJava does when trapping an exception and re-routing it to Observer#onError, or the Try type in Scala. So while their type structure is monadic, they actually violate the left identity law and hence are not true monads. But who cares, as long as they get the job done!
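
To illustrate, here’s a minimal sketch of what such an exception-trapping type might look like. This is hypothetical code, neither Scala’s Try nor RxJava’s actual implementation, and it borrows RxJava’s Func1 as the function interface:

public abstract class Try<T> {

  public static <T> Try<T> unit(T value) {
    return new Success<>(value);
  }

  public abstract <R> Try<R> flatMap(Func1<T, Try<R>> func);

  static final class Success<T> extends Try<T> {
    private final T value;

    Success(T value) {
      this.value = value;
    }

    @Override
    public <R> Try<R> flatMap(Func1<T, Try<R>> func) {
      try {
        return func.call(value);
      } catch (Exception e) {
        // this is where left identity breaks: unit(x) flatMap f yields a Failure,
        // whereas f(x) on its own would have thrown
        return new Failure<>(e);
      }
    }
  }

  static final class Failure<T> extends Try<T> {
    private final Exception error;

    Failure(Exception error) {
      this.error = error;
    }

    @Override
    public <R> Try<R> flatMap(Func1<T, Try<R>> func) {
      // skip all subsequent steps and carry the error along the chain
      return new Failure<>(error);
    }
  }
}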

Speaking about getting the job done: we’re now able to rewrite that piece of code we kicked things off with in the first article, and transform it to a fluent code style using RxJava’s monad(-ish) Observable type.

Tying up the ends

Now that you understand what monads are and why they’re useful, let’s put the knowledge to practice and revisit our initial code snippet. Here it is again for your convenience:

class FetchJsonObject extends AsyncTask<String, Void, JsonObject> {

  protected JsonObject doInBackground(String... args) {
    final String url = args[0];
    String json = serviceApi.get(url).readString();
    cache.persist(url, json);
    return JsonObject.parse(json);
  }

}

Remember how I said that we can think of each line of code here as a step in a series of transformations, and that the monad helps us transport the results from one step to the next. We also said that each step may terminate the entire task by throwing an exception. Just to emphasize: this is bad. It means the outcome of the entire task is unpredictable, and what we’re missing are well-defined “exit points” that allow us to terminate the sequence gracefully. It’s all a bit like a really fragile soap bubble, where every line risks bursting the bubble, without giving us a good exit strategy.

We can rewrite this using a monadic type now, in this case an RxJava Observable, which lets us turn the soap bubble into something less fragile:

final String url = "...";
serviceApi.get(url)
    .doOnNext(json -> cache.persist(url, json))
    .flatMap(json -> Observable.from(JsonObject.parse(json)))
    .subscribe(new JsonObjectObserver());

I’ve used Java 8 closure syntax here to keep the example concise and clear. Note how we transformed our semicolons into a monadic call chain. serviceApi.get does not immediately return a value anymore; instead, it returns an Observable<String> holding the API call result. We then want to perform a side effect by caching this result, which we do using another monad transformation called doOnNext, an action that’s performed for every value emitted. We then transform the fetch result into another monad, namely from Observable<String> to Observable<JsonObject> by passing a function to flatMap that parses the JSON String and sticks it in a new monad/observable. We finally subscribe a listener that receives the final result.
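
In case you’re wondering about JsonObjectObserver: it’s just an ordinary RxJava Observer. A minimal sketch of what it might look like:

class JsonObjectObserver implements Observer<JsonObject> {

  @Override
  public void onNext(JsonObject jsonObject) {
    // consume the parsed result, e.g. hand it to the UI
  }

  @Override
  public void onCompleted() {
    // nothing left to do
  }

  @Override
  public void onError(Throwable error) {
    // the single, well-defined exit point for failures anywhere in the chain
  }
}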

There are a few key things here that make this implementation superior to what we started out with, and I suggest comparing them against the Q&A we went through in the previous article:

  1. RxJava ensures that if in any of the above steps an exception is thrown, it will be propagated to the observer and subsequent steps will be skipped. In other words, we don’t have to worry about errors until we actually need to deal with them.

  2. You can gracefully terminate the call chain yourself by calling onCompleted on the given observer. This allows you to skip any subsequent steps in case there is nothing meaningful to return, i.e. there’s no need to return null anywhere. The observer will simply receive nothing through onNext and complete straight away.

  3. In RxJava, scheduling an individual step to run on a different thread than the one you started on is treated as just another transformation of the call chain. This means you can use a monad transformation to specify concurrency, making it a simple and natural aspect to deal with (see the sketch after this list).

  4. Most importantly, all of the above applies to all possible steps in the call chain, freeing you from the burden of making decisions for every single step in your sequence.
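
To illustrate the third point, here’s a sketch of the same chain with scheduling added as just another step. The schedulers shown here (Schedulers.newThread and AndroidSchedulers.mainThread) are just one possible choice:

serviceApi.get(url)
    .doOnNext(json -> cache.persist(url, json))
    .flatMap(json -> Observable.from(JsonObject.parse(json)))
    .subscribeOn(Schedulers.newThread())        // run the fetch/cache/parse steps off the UI thread
    .observeOn(AndroidSchedulers.mainThread())  // deliver the result on the main UI thread
    .subscribe(new JsonObjectObserver());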

Instead of a soap bubble, we have an assembly line now, where results of individual steps are transported in a resilient and well defined way. Exit points are clear and dealt with uniformly, regardless of where we leave the monad–even in the case of error.

Monads: Your App as a Function, Part 1

A paper by Twitter’s Marius Eriksen (“Your server as a function”), which introduces the key concepts behind Finagle, is what made me choose the title for this post. I believe functional programming (FP) will be just as important to mobile application development in the future as it is for web development today. Since I first jumped on the “reactive bandwagon” about a year ago, other companies like Parse/Facebook and Spotify have started to move to functional programming on mobile to simplify concurrent programming (via the BoltsFramework and trickle library, respectively).

The reason is quite simple: it’s easier to write resilient code in functional languages, and resilience is key. Performance might be a feature, but resilience is a must. If the critical paths through your business logic are brittle, then your app can be as fast as light, but your users will still scoff at it and look elsewhere for value or entertainment.

I write Android applications, and Java is not a functional language. It’s not even an object-oriented language, at least not in a puristic sense. However, that doesn’t stop us from adopting some of the good practices found in FP to improve on existing Java code. In this post I’ll try to explain how adopting one of the most fundamental type patterns in FP, monadic types, can dramatically simplify and improve the robustness of your core application logic.

I have already written about RxJava and functional reactive programming and how we make use of it in our mobile applications at SoundCloud. I hope it served as a good introduction to using that library specifically, and to how expressing expensive operations through Observables makes your code more resilient to failure and easier to compose.

However, there’s a reason why Observables are so universally useful: they’re monads. This post is my own attempt at explaining monads, why they’re so valuable, and why you should consider using them.

Before you keep reading–or, heaven forbid, consider dropping out here!–let me say that none of the following paragraphs will assume you have experience with FP or any functional language for that matter. I will use Java for all examples, so that you have something familiar to work with if you’re a Java (or Android) developer already.

Say Monad one more time…

It’s almost a joke these days. People hate it when FP folks start talking about monads. People hate it because they have at best a vague idea of what a monad is, and that makes them feel like idiots. No one wants to be an idiot! Let me tell you: you’re not an idiot, and monads are not difficult to understand. It’s just surprisingly difficult to explain them.

I’m not a mathematician. I don’t know category theory. But I believe I have understood monads to a degree that I can make effective use of them in the code I write day in and day out, and that I can even write my own monadic types. Here’s another piece of good news: if you understand monads, you understand most of the underpinnings of functional languages. You’ll quickly find that when jumping from one FPL to the next, monads will follow you around. It’s a bit like understanding classes in object-oriented languages. If all you’ve ever known is procedures and value types, then classes may seem odd at first. But once you understand classes, it doesn’t matter which OOPL you use; the concepts remain the same.

So what’s a monad? I think monads are best explained (and appreciated!) by realizing what a poor situation you as an imperative programmer are actually in. So I’ll start by showing you a piece of code that I bet you’ve written yourself in some way, shape, or form at some point in time, and then make you reflect on why your code sucks. No offense by the way, my code sucks too. But that’s the great thing about being a developer, right? We strive to make code suck a little less every single day.

Let’s look at the example.

A piece of code you’ve written before

If you’re an app developer, chances are you connect to some service API to download JSON or XML that describes your business objects. At SoundCloud, everything revolves around tracks, so we download track metadata a lot. If you’re doing it right, then you’re also caching this data somewhere. It doesn’t really matter where or how; it could be in a database or just flat files. Here’s a very typical way of doing this in Android using the AsyncTask class:

class FetchJsonObject extends AsyncTask<String, Void, JsonObject> {

  protected JsonObject doInBackground(String... args) {
    final String url = args[0];
    String json = serviceApi.get(url).readString();
    cache.persist(url, json);
    return JsonObject.parse(json);
  }

}

Don’t worry too much about what types JsonObject or serviceApi are here. This is just pseudo code serving to get my point across. It should still be easy to see that we’re trying to achieve the following:

  1. Download a JSON document from a given URL
  2. Cache it to disk using the URL as the cache key
  3. Parse it into an in-memory representation

Instead of actually pointing out what the problem with all this is, let’s turn this into a short Q&A. Have a look at the following questions. What are your answers to each of them?

Q: Every single line here can throw an exception. Where is it handled?

A: Simple, you wrap everything in a try/catch block. Fair enough. Then what? How do you propagate the exception to the caller? Recall that this job is running on a background thread, so there might be visibility issues. Moreover, how do you signal the error? You have to return something from doInBackground. Thinking about returning null? You might want to listen to what Tony Hoare has to say about null references (he invented them by the way.)

Q: If the API request fails, what do we return?

A: Easy, we return null! Sorry, but Tony Hoare says NO, so let’s put our foot down on that one okay?

Q: If the API request succeeds but caching fails, do we throw out the result?

A: Uhm, maybe? We haven’t really thought about how to propagate the data in this series of steps. Aha! There’s our first clue about what a monad is: transporting data as a series of steps. I’ll pick this up again later.

Q: Should caching to local storage happen on the same thread as sending API requests?

A: Probably not. Because this could mean that API requests block local storage I/O from happening in case they take longer than expected, right? It almost looks as if caching to local storage should happen in its own task. Or maybe it has something to do with propagating data as a series of steps… (This is where you should picture me waving a flag in your face that says monad on it.)

Q: If I want to just make an API request, or just cache to local storage, do I write a new AsyncTask for each?

A: I suppose so. If we did that, however, how would we combine them to arrive at the definition above, which performs both steps in succession and pipes data from one task to the next? I smell sulfur, we might be well on our way to callback hell.

I hope the picture is beginning to take shape. It appears there are a number of related problems here, most of them having to do with processing data as a series of potentially asynchronous steps where failure in each step is anticipated.

Let’s finally look at what monads are and how they solve this for us.

Monads explained

We’ll get a little more concrete now and jump straight into the definition of what monads are and what has to hold true for a monad to actually be a monad. I will then show what a simple monadic type could look like in Java.

Let’s first look at some of the existing definitions that attempt to put monads in a single sentence. Erik Meijer, the man behind the Reactive Extensions and my personal hero for wearing a SoundCloud t-shirt on stage at GOTO Berlin, has this to say about monads:

Monads are return types that guide you through the happy path.

This might be my favorite definition, because it catches the gist of what monads are all about. However, it’s still a bit vague and doesn’t really help in understanding what the structure of a monad is. Martin Odersky, of EPFL fame and inventor of the Scala programming language, looks at it this way:

Monads are parametric types with two operations flatMap and unit that obey some algebraic laws.

So this is rather the opposite: this definition doesn’t really tell us what monads are good for, but it contains some important clues about their structure, i.e. they are types, parameterized over another type, and they consist of just two operations. I told you monads were simple!

Both these definitions I took almost verbatim from the reactive programming course on Coursera, which I highly recommend. Let’s have a look at what Wikipedia has to say:

Monads are structures that represent computations defined as sequences of steps.

Sounds familiar? I told you I’d come back to the whole sequence of steps thing. Finally, here’s my own attempt at putting monads in a sentence, and it’s the definition I will use throughout the rest of this article:

Monads are chainable container types that trap values or computations and allow them to be transformed in confinement.

The key takeaway from this definition is the “three Cs”: containment, chainability, and confinement.

I will now explain how monads enable these properties for arbitrary data or computations and why that’s super awesome.

Monads are types

We just learned that monads are parametric types that define just two operations, unit (also called return) and flatMap (also called bind or mapMany). I will show you shortly what these methods do and what a full definition of a monad looks like in Java, but just to put your worries to rest a little: if you’ve used Scala or RxJava before, then you’ve already seen and used monads. All lists in Scala are monads with the List constructor method as unit and a flatMap method to transform them. Observables in RxJava are monads with Observable.from as unit and mapMany to transform them (mapMany in RxJava is actually aliased to flatMap, so you can use either one.)
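
For instance, here’s a trivial Observable chain, sketched with Java 8 lambda syntax for brevity:

// Observable.from plays the role of unit; flatMap chains transformations
Observable<Integer> numbers = Observable.from(1, 2, 3);
Observable<Integer> doubled = numbers.flatMap(n -> Observable.from(n * 2));
doubled.subscribe(n -> System.out.println(n)); // prints 2, 4, 6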

That said, let’s have a look at the structure of a monadic type in Java:

public class Monad<T> {
  private final T value;

  private Monad(T value) {
    this.value = value;
  }

  // unit: traps an arbitrary value in a new monad instance
  public static <T> Monad<T> unit(T value) {
    return new Monad<T>(value);
  }

  // flatMap: applies a function that turns the contained T into a Monad<R>
  // (Func1 is the single-argument function interface borrowed from RxJava)
  public <R> Monad<R> flatMap(Func1<T, Monad<R>> func) {
    return func.call(this.value);
  }

  public T get() {
    return value;
  }
}

I’ve added a third method here, get, but it’s hardly worth talking about a getter function, so let’s skip it as part of the discussion and turn straight to unit and flatMap.

The 1st C: unit enables containment

There’s an obvious take away from the snippet above: a monad is defined over a type T which it contains values of. Containment here is enabled by the unit function: it takes a T and traps it in the monad by creating a new instance of the monad with that value passed into it. At this point I should mention that T can be anything, including collection types like lists. Remember RxJava and observables? An Observable is a monad defined over collections of values. Let’s keep things simple though and let’s create a monad of integers:

Monad<Integer> intMonad = Monad.unit(2);

So we stuff the number 2 in our monad. Cool, but not so useful thus far. What makes it useful is the flatMap method, so let’s turn to flatMap.

The 2nd C: flatMap enables chainability

I admit this one might look a bit more puzzling, but it’s actually pretty straightforward once you look past the awkward syntax. The flatMap function itself is defined over a new type variable, R, and it returns a new monad of that type, Monad<R>. It does so, however, not just by trapping the value in it like unit does, but by applying a function to the current value, a function which knows how to turn Ts into monads of R. (I borrowed the Func1 type from RxJava here: it means it’s a function object that takes one argument of type T and returns something of type Monad<R>.)

Let this sink in for a second, since this is perhaps the most important aspect of monads. It’s important because it allows us to chain monads together using transformations of the values they contain. I can take a monad containing, say, an integer, and flatMap it using a function which takes this integer, transforms it (say, by taking the square root of that number) and sticks it in a new monad. The last part is critical, since it means we can do this forever and ever, because the return value will be a monad again with a flatMap function which can again take a function which returns another monad which has… you get the idea.

Let’s take the square root example using the monad we just created:

double result = intMonad.flatMap(new Func1<Integer, Monad<Double>>() {
    public Monad<Double> call(Integer input) {
      return Monad.unit(Math.sqrt(input));
    }
}).get();

Now that looks more useful! What we’ve done here is take our initial integer monad and transformed it using flatMap to obtain a new monad of type Monad<Double> that contains the square root of the initial value. This is fundamentally different from applying the sqrt function directly to some input, since there’s no flatMap method defined on double that you could use to apply further transformations; you’d effectively lose the property of chainability.

This also enables entirely new perspectives in terms of code structure and reuse: since the monad structure never changes, your business logic is entirely expressed in terms of functions, which transform values stepwise, are defined and tested in isolation, and are composed together to form new pieces of functionality. It’s like Legos, but using functions.
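
For example, using the Monad&lt;T&gt; type from above (with Java 8 lambda syntax for brevity; the two functions are made up purely for illustration):

// small, reusable transformation steps, each testable in isolation
Func1<Integer, Monad<Double>> squareRoot = x -> Monad.unit(Math.sqrt(x));
Func1<Double, Monad<String>> format = d -> Monad.unit(String.format("%.2f", d));

// ...composed into a new piece of functionality
String result = Monad.unit(2)
    .flatMap(squareRoot)
    .flatMap(format)
    .get(); // "1.41" (in an English locale)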

The 3rd C: To be continued…

If you paid attention at the beginning, you might have noticed that the “3rd C”, namely confinement, is still missing. This is because it has to do with the algebraic laws a monad has to obey. Frankly, I haven’t been completely honest with you. Types with unit and flatMap follow a monadic structure, but purists will say they’re not actually monads unless they obey some algebraic laws.

What the three monad laws are, what implications they have, and how we can piece everything together to turn our initial AsyncTask example into something fundamentally better using monads is what I will cover in part 2 of this article.

Functional Reactive Programming on Android With RxJava

Shameless plug: if after reading this article, you want to know more, come hear me talk at DroidCon UK 2013!

If you are an application developer, there are two inconvenient truths:

  1. Modern applications are inherently concurrent.
  2. Writing concurrent programs that are correct is difficult.

In the domain of mobile or desktop applications, parallel execution allows for responsive user interfaces because we can move computations into the background while the UI responds to ongoing user interactions. Code must execute concurrently in order to meet this fundamental requirement. Writing such programs is difficult because on mobile they are typically written in imperative languages like C or Java. Writing concurrent code in imperative languages is difficult because code is written in terms of interwoven, temporal instructions that move objects or data structures from one state to another. This imperative style of programming inherently produces side effects, and it presents several problems when running instructions in parallel, such as race conditions when writing to a shared resource.

Resistance is futile–or is it?

Developers have grown accustomed to the drawbacks of expressing concurrency in imperative languages. On platforms like Android, where Java is (still) the dominant language, the prevailing attitude is that concurrency simply sucks and we should just give in and deal with it. I personally keep a close eye on the server-side end of the spectrum. Over the past few years, functional programming has made an astounding comeback in terms of rate of adoption and innovation, the details of which I will not get into here. In the case of concurrency, functional programming has a very simple answer to dealing with shared state: don’t have it.

Problems of concurrent programming with AsyncTask

Being based on Java, Android comes with the standard set of Java concurrency primitives such as Threads and Futures. While these tools make it easy to perform simple asynchronous tasks, they are fairly low level and require a substantial amount of diligence when you use them to model complex interactions between concurrent objects. A frequent use case on Android, or in any UI-driven application, is to perform a background job and then update the UI with the result of the operation. Android provides AsyncTask for exactly that:

class DownloadTask extends AsyncTask<String, Void, File> {

  protected File doInBackground(String... args) {
    final String url = args[0];
    try {
      byte[] fileContent = downloadFile(url);
      File file = writeToFile(fileContent);
      return file;
    } catch (Exception e) {
      // ???
    }
  }

  protected void onPostExecute(File file) {
    Context context = getContext(); // ???
    Toast.makeText(context,
        "Downloaded: " + file.getAbsolutePath(),
        Toast.LENGTH_SHORT)
        .show();
  }
}

This looks straightforward. Define a method doInBackground that accepts something through its formal parameters, and returns something as the result of the operation. Android guarantees that this code will execute in a thread that is not the main user-interface thread. We also define a UI callback function onPostExecute that receives the result of the computation and can consume it on the main UI thread, since Android guarantees that this method will always be invoked on the main thread.

In search of the jigsaw-puzzle pieces

So far so good. What’s wrong with this picture? Let’s start with doInBackground, which downloads a file–a costly operation because it involves network and disk I/O. There are many things that can go wrong, so we want to recover from errors, and add a try-catch block. What do we do in the catch block? Log the error? Perhaps we want to inform the user about this error too, which likely involves interacting with the UI. Wait, we cannot do that because we are not allowed to update any user-interface elements from a background thread. Bummer.

It should be easy to handle that error in onPostExecute. We might reason that it is as simple as holding on to the exception in a private field (i.e. writing it on the background thread), checking in onPostExecute (i.e. reading it on the UI thread) whether that field is set to something other than null (did I mention we love null checks), and displaying it to the user in some way, shape, or form. But wait, how do we obtain a reference to a Context, without which we cannot do anything meaningful with the UI? Apparently, we have to bind it to the task instance up front, at the point of instantiation, and keep a reference to it throughout the task’s execution. But what if the download takes a minute to run? Do we want to hold on to an Activity instance for an entire minute? And what if the user decides to back out of the Activity that triggered the task? Then we are holding on to a stale reference, which not only creates a substantial memory leak, but is also worthless, because the Activity has meanwhile been detached from the application window. A problem that everyone is well aware of.
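
Here is roughly what that workaround looks like in practice; a sketch only, with the details made up for illustration:

class DownloadTask extends AsyncTask<String, Void, File> {
  private volatile Exception error; // written on the background thread, read on the UI thread

  protected File doInBackground(String... args) {
    try {
      byte[] fileContent = downloadFile(args[0]);
      return writeToFile(fileContent);
    } catch (Exception e) {
      error = e;   // stash the exception for later
      return null; // ...and return the dreaded null
    }
  }

  protected void onPostExecute(File file) {
    if (error != null) {
      // show the error, which requires the Context we captured at instantiation time
    } else {
      // show the file
    }
  }
}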

Beyond the basics

There are other problems with all this. The preceding task is incredibly simple. Picture a more complicated scenario where we need to orchestrate a number of such operations. For example, we might want to fetch some JSON from a service API, parse it, map it, filter it, cache it to disk, and only then feed the result to the UI. All the aforementioned operations should–as per the single responsibility principle–exist as separate objects, perhaps exposed through different services. It is difficult and non-intuitive to use AsyncTask because it requires grouping any number of combinations of service interactions into separate task classes. This results in a proliferation of meaningless task classes, from the perspective of your business logic.

Another option is to have one task class per service-object invocation, or wrap the service objects themselves in AsyncTasks. Composing service objects means nesting AsyncTask, which leads to what is commonly referred to as “callback hell” because you start tasks from a task callback from a task callback from a … you get the idea.

Last but not least, AsyncTask’s scheduling behavior varies significantly across different versions of Android. It changed from a capped thread pool in the 1.x days (with varying bounds depending on the API level) to a single-thread executor model in 4.x. Read that again. Your tasks (plural) do not run concurrently to each other on ICS devices and beyond (although they do run concurrently to the main UI thread). Why did Google decide to serialize task execution? Developers could not get it right: applications suffered from nasty problems due to race conditions and incorrectly synchronized code.
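
If you do rely on tasks running in parallel on newer Android versions, you have to opt in explicitly; as far as I know this works from API level 11 onwards:

// opt back into parallel execution explicitly
new DownloadTask().executeOnExecutor(AsyncTask.THREAD_POOL_EXECUTOR, url);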

The inconvenient truth

Should we still use Thread and AsyncTask?

The answer is “probably”. For simple, one-shot jobs that do not require much orchestration, AsyncTask is fine. For anything more complex it is doable, but requires juggling volatiles, WeakReferences, null checks, and other defensive, unconfident mechanisms. Perhaps worst of all, it requires you to think about things that have nothing to do with the problem that you set out to solve, which is to download a file.

Enter RxJava–now with more Android

To come back to the initial problem statement, do we have to give in to the lack of high-level abstractions and deal with it, or do better solutions exist? Turns out that functional programming might have an answer to this. “But wait”, you might say, “I still wanna use Java?” Turns out, yes, you can. It is not super pretty (at least not unless Google whips out its magic wand and gives us Java 8 and closures on Dalvik, or unless you feel attracted to anonymous classes and six levels of indentation). However, it solves all of the problems in one fell swoop:

  • No standard mechanism to recover from errors
  • Lack of control over thread scheduling (unless you like to dig deep)
  • No obvious way to compose asynchronous operations
  • No obvious and hassle-free way of attaching to Context

RxJava is an implementation of the Reactive Extensions (Rx) on the JVM, courtesy of Netflix. Rx was first conceived by Erik Meijer on the Microsoft .NET platform, as a way of combining data or event streams with reactive objects and functional composition. In Rx, events are modeled as observable streams to which observers are subscribed. These streams, or observables for short, can be filtered, transformed, and composed in various ways before their results are emitted to an observer. Every observer is defined in terms of three messages: onNext, onCompleted, and onError. Concurrency is a variable in this equation, and abstracted away in the form of schedulers. Generally, every observable stream exposes an interface that is modeled after concurrent execution flows (i.e. you don’t call it, you subscribe to it), but by default is executed synchronously. Introducing schedulers can make an observable execute using various concurrency primitives such as threads, thread pools, or even Scala actors. Here is an example:

Subscription sub = Observable.from(1, 2, 3, 4, 5)
    .subscribeOn(Schedulers.newThread())
    .observeOn(AndroidSchedulers.mainThread())
    .subscribe(observer);

// ...

sub.unsubscribe();

This creates a new, observable stream from the given list of integers, and emits them one after another on the given observer. The use of subscribeOn and observeOn configures the stream to emit the numbers on a new Thread, and to receive them on the Android main UI thread. For example, the observer’s onNext method is called on the main thread. Eventually, you unsubscribe from the observable. Here is an example Observer implementation:

public class IntObserver implements Observer<Integer> {

  @Override
  public void onNext(Integer value) {
     System.out.println("received: " + value);
  }

  // onCompleted and onError omitted
  ...
}

For something more interesting, you can implement the download task as an Rx Observable:

private Observable<File> downloadFileObservable() {
    return Observable.create(new OnSubscribeFunc<File>() {
        @Override
        public Subscription onSubscribe(Observer<? super File> fileObserver) {
            try {
                byte[] fileContent = downloadFile();
                File file = writeToFile(fileContent);
                fileObserver.onNext(file);
                fileObserver.onCompleted();
            } catch (Exception e) {
                fileObserver.onError(e);
            }
            return Subscriptions.empty();
        }
    });
}

The preceding example creates a method that builds an Observable stream, which in this case only ever emits a single item (the file) to which a File observer can connect. Whenever this observable is subscribed to, its onSubscribe function triggers and executes the task at hand. If the task can be carried out successfully, deliver the result to the observer through onNext so the observer can properly react to it. Then signal completion by using onCompleted. If an exception is raised, deliver it to the observer through onError. As an example, you can use this from a Fragment:

class MyFragment extends Fragment implements Observer<File> {
  private Subscription subscription;

  @Override
  public void onCreate(Bundle savedInstanceState) {
    super.onCreate(savedInstanceState);
    subscription = AndroidObservables.fromFragment(this, downloadFileObservable())
                          .subscribeOn(Schedulers.newThread())
                          .subscribe(this);
  }

  private Observable<File> downloadFileObservable() { /* as above */ }

  @Override
  public void onDestroy() {
    super.onDestroy();
    subscription.unsubscribe();
  }

  public void onNext(File file) {
    Toast.makeText(getActivity(),
        "Downloaded: " + file.getAbsolutePath(),
        Toast.LENGTH_SHORT)
        .show();
  }

  public void onCompleted() {}

  public void onError(Throwable error) {
    Toast.makeText(getActivity(),
        "Download failed: " + error.getMessage(),
        Toast.LENGTH_SHORT)
        .show();
  }
}

By using RxJava, the aforementioned issues are solved all at the same time. The fromFragment call transforms the given source observable in such a way that events will only be emitted to the fragment if it’s still alive and attached to its host activity. Call unsubscribe in onDestroy to ensure that all references to the fragment, which is also the observer, are released.

You can have proper error handling through an observer’s onError callback. Also, you can execute the task on any given scheduler with a simple method call. Doing so gives you fine-grained control over where the expensive code is run and where the callbacks will run, all without you having to write a single line of synchronization logic. Furthermore, RxJava allows you to compose and transform observables to obtain new ones, which enables you to reuse code easily. For example, to not emit the File itself, but merely its path, transform the existing observable:

Observable<String> filePathObservable = downloadFileObservable().map(new Func1<File, String>() {
    @Override
    public String call(File file) {
        return file.getAbsolutePath();
    }
});

// now emits file paths, not `File`s
subscription = filePathObservable.subscribe(/* Observer<String> */);

You can see how powerful this way of expressing asynchronous computations is. At SoundCloud, we are moving most of our code that relies heavily on event-based and asynchronous operations to Rx observables. For convenience, we contributed AndroidSchedulers that schedule an observer to receive callbacks on a Handler thread. See rxjava-android. We are also in the process of contributing back the operators that allow observing observables from Fragments and Activities in an easy and safe way, as seen in the previous example.

In a nutshell, RxJava finally makes concurrency and event-based programming on Android hassle free. Note that we follow the same strategy on iOS using GitHub’s Reactive Cocoa library because we have committed ourselves to the functional-reactive paradigm. We think that it is an exciting development that leads to code that is more stable, easier to unit test, and free of low-level state or concurrency concerns that would otherwise take over your service objects.

To hear more about this topic, watch this interview with our Director of Mobile Engineering on Root Access Berlin and come see me at DroidCon UK 2013 where I will be speaking about RxJava and its use in the SoundCloud application on the developer track.

Herding HTTP Requests, or Why Your Keep-Alive Connection May Already Be Dead

One of the more expensive things you can do on Android, or mobile in general, is sending and receiving data over a mobile network connection. Holding a dedicated channel in particular causes high battery consumption, as much as a hundred times more compared to being idle. Moreover, there is overhead on various levels when establishing a dedicated connection: First, it has been shown that poor timing w.r.t. sending network requests can leave the connection in states with high power consumption before going back to idle, even when it’s not actively used. Second, establishing an HTTP connection, especially when using secure sockets (HTTPS), will result in a handshake being performed between the mobile client and the server, even when multiple requests are sent in quick succession. If no proper precaution is taken, these handshakes will occur N times when N requests are being sent, and your app will incur the associated extra cost of network traffic and battery consumption.

To mitigate these issues, it is generally advisable to batch HTTP requests whenever possible and send them over a persistent HTTP connection. This can be implemented using request queues (persisted or in memory) which are emptied at given intervals over an HTTP Keep-Alive connection. At SoundCloud we have recently implemented a tracking component which batches up a number of requests and flushes them at certain intervals to minimize the aforementioned overhead. However, we stumbled a few times when trying to make persistent connections work with HttpURLConnection, so make sure to double-check your own implementation against the next few paragraphs, or you may unwittingly send a large number of requests over separate connections, even when Keep-Alive is enabled. So let me break it to you:

HTTP Keep-Alive on Android does NOT just work

Fortunately for us, Android sets the Keep-Alive header by default, which a quick glance at the header fields of a newly opened HttpURLConnection shows. However, we were surprised to see that Android would still happily open new connections when using code like the snippet below:

Collection<String> requestUrls = ...;
HttpURLConnection connection = null;

for (String url : requestUrls) {
    try {
        connection = (HttpURLConnection) new URL(url).openConnection();

        connection.setConnectTimeout(CONNECT_TIMEOUT);
        connection.setReadTimeout(READ_TIMEOUT);
        connection.connect();

        final int response = connection.getResponseCode();

        if (response == 200) {
          // handle success
        } else {
          // handle error code
        }
    } catch (IOException e) {
        // ...
    }
}

if (connection != null) {
    connection.disconnect();
}

Sniffing these requests with Wireshark showed that they were sent using separate TCP streams coming from different client ports (indicating separate connections.) Looking at the code, we’re not closing the connection until we’ve sent the last request, and we made sure Keep-Alive is actually set for every request, so what’s the problem here?

Finding the culprit

Revisiting Persistent Connections, we found this curious paragraph:

When the application finishes reading the response body or when the application calls close() on the InputStream returned by URLConnection.getInputStream(), the JDK’s HTTP protocol handler will try to clean up the connection and if successful, put the connection into a connection cache for reuse by future HTTP requests.

Apparently, in order to actually be able to reuse a connection, the implementation must know where one HTTP request on the same TCP connection ends and the next one starts. In our case, we weren’t interested in the response payload since these were fire-and-forget style requests, so we simply added a

connection.getInputStream().close();

as the documentation suggests. Still, this did not work for us. Again Android would not reuse TCP sockets but opened a new one for every request; we haven’t found out why this is happening, so please leave me a comment if you do. From here on you’re left with two options: you can fully consume the response payload, which you may want to do anyway depending on your mileage. This would mean reading bytes from the input stream until the end of the stream is reached (see the sketch after the next snippet). Alternatively, if you’re not interested in the server’s response, you should do as the document above suggests and send an HTTP HEAD request instead:

...
for (String url : requestUrls) {
    try {
        connection = (HttpURLConnection) new URL(url).openConnection();

        connection.setRequestMethod("HEAD");

        // as above
    } catch (IOException e) {
        // ...
    }
}
...
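
And in case you do need the response and go with the first option, fully draining the body might look roughly like this (a sketch; the buffer size is arbitrary):

InputStream in = connection.getInputStream();
try {
    byte[] buffer = new byte[4096];
    // read until end of stream so the connection can be returned to the pool
    while (in.read(buffer) != -1) {
        // discard the bytes; we only read them to allow connection reuse
    }
} finally {
    in.close();
}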

Note that with the HEAD approach, neither closing the input stream nor consuming it is required, since there is no response payload to read from. Returning to Wireshark, here’s what the communication with the server looks like now:

This is what we wanted to begin with: client and server establish a single TCP stream, then send all six requests over the same connection. After doing our work, either the client or the server will terminate the connection by starting a FIN exchange. Here it is important to understand that calling connection.disconnect() does not guarantee triggering FIN. From the documentation of HttpURLConnection:

Once the response body has been read, the HttpURLConnection should be closed by calling disconnect().
Disconnecting releases the resources held by a connection so they may be closed or reused.

The emphasis here is on may: as you can see from the previous screenshot, it was the server that decided to terminate the connection at some point, and not our call to disconnect, since the first FIN is sent by the destination, not the source.

Use Android’s @+id Notation With Care

Being sloppy when it comes to managing your application’s resource IDs can lead to subtle bugs that are difficult to find and debug. Imagine a scenario where foo_activity.xml holds the layout definition for FooActivity, and you define three different variants for different screen configurations:

res/layout/foo_activity.xml
res/layout-land/foo_activity.xml
res/layout-sw600dp/foo_activity.xml

Now, with some certainty these three layouts will share the same views, with the same IDs, just slightly differently styled or arranged. Let’s furthermore assume that in all three layouts we have a TextView whose ID, my_text, is created using the @+id notation.

FooActivity will of course retrieve a reference to this TextView via findViewById.
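
Something along these lines (the local variable name doesn’t matter; my_text is the ID shared by all three layouts):

TextView myText = (TextView) findViewById(R.id.my_text);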

Now what happens if in the main layout file (layout/foo_activity) you change the view’s ID? You may be surprised to hear that your application will still compile. That’s because the old view ID, my_text, still exists in R.java, since while now gone from layout/foo_activity.xml, it’s still (re)created using the @+id notation in the other two layouts, thus continuing to exist in the ID pool. Whenever these layouts are loaded and you reference the new ID from FooActivity, then of course your application will crash.

The problem here stems from violating the DRY principle: we’re carelessly repeating the code which creates a resource ID, when ideally, it should only ever be found in one, and only one part of the application. To recall what @+id does, it’s an idempotent “create this ID” action. In other words, if that ID has not been defined yet, it will get defined, otherwise it will be used. So it’s safe to use this notation multiple times with the same ID, which may be the reason why people overuse it: it looks like a safe bet, when it’s actually not.

There are three approaches I have tried to deal with this issue:

1 - Pulling view IDs into styles

When redefining views multiple times in different layouts, one approach could be to extract the respective view IDs into a style, and then apply that single shared style to all three variants of the view.

While I first favored this, there are several problems with this approach: first, it reduces the visibility of IDs, which can be confusing when dealing with views in RelativeLayout, where you reference views using IDs. Moreover, IntelliJ IDEA at least will get terribly confused and issue an error, since it doesn’t resolve styles to inspect the correctness of a layout file (it’ll assume the view is missing the ID attribute.) Lastly, and this is purely a style question, one could argue that styles should be concerned with only visual appearance, not structural attributes like IDs.

2 - Pulling view IDs into ids.xml

Another option is to pull the shared view IDs into a global resource file, e.g. res/values/ids.xml.

This will turn the respective view IDs into first-class resources themselves, and by extension make them reachable via R.id and in all layout files. Here, too, no @+id notation is required in any layout file anymore. You would define shared IDs in one, and only one, location.

The problems with this approach are similar to those of option 1, but it clears up the stylistic problem of defining IDs in a style sheet.

3 - How We Do It (TM)

We ended up taking a third route, which is one of convention. We’ve agreed on establishing a rule which says that it’s fine to use @+id in layout files, but only in the default layout file, i.e. the one located in res/layout. Whenever a layout is overloaded for different configurations, then even for the same views, @id should be used. That way we ensure that there is only a single location where an ID actually gets defined, without taking it completely out of context when working with layout XML. Moreover, changing the view’s ID will lead to compilation errors, since all overloaded variants now reference a non-existing ID.

I’d be interested to hear how everyone else deals with this. 

 

Update Them Remotes!

In an act of malevolence, I decided to update my Twitter and GitHub to use the same username. This means that repository URLs have changed and everything can now be found under:

https://github.com/mttkay

So update those remotes! Here’s how:

$ git remote set-url <remote> <new_url>

Sorry for the confusion.

Unblock Us Manager for Android Updated With Region Switcher

I have pushed out an update to Unblock Us Manager on the Google Play store which adds the content region switcher a few users, including myself, were asking for. I’ve added a few small UI improvements along the way.

Drop me a line if you find the app is not working well for you.

 

Unblock Us website

Unblock Us on Twitter

Unblock-Us Manager Now Available on Google Play

I’m a big proponent of Unblock Us, a VPN-less solution for using online services that are not available in your country, such as Netflix, Hulu, Pandora, etc. It’s purely based on DNS, which means it works on all network-connected devices and has practically zero performance impact.

One thing that always bugged me was that you have to reactivate your IP address with Unblock Us whenever it changes, e.g. after rebooting your WiFi router. Previously, you had to go to unblock-us.com, wait for the service checker widget to detect that you had a new IP address, and manually reactivate it by clicking a pop-up link. On Google TV this was particularly annoying, since the browsing experience isn’t exactly great, and you often didn’t even realize the address had changed until e.g. Netflix stopped working.

Long story short, I’ve written an Android application which does all that for you, automatically. It sits idle on your Android device and will be woken up by the system whenever your network connectivity changes. It will then automatically register with unblock-us.com with the current IP address. This should ensure a seamless viewing / listening experience for Unblock Us supported services.

The app is tiny, requires merely internet access permissions, and doesn’t even require a password–just your email address you used to sign up with Unblock Us. Your email address will be used to send an activation request to unblock-us.com.

Ideas for new features:

  • Provide a region chooser similar to the one on the website
  • Provide a service monitor that will notify you about outages of Netflix and other services

Download for free on Google Play

Why the Google TV SDK Lacks an In-app Live TV Widget

I have been following Google TV for a while now, and right after its announcement I was pretty excited, pondering the vast amount of opportunities for creating Android apps that would enrich your TV viewing experience.

That said, I was pretty dumbstruck to learn that a PIP (picture-in-picture) component, i.e. an Android view displaying the live TV stream to be embedded in your applications, does not exist (or more precisely: is not available for you to use). In other words, the most obvious and powerful use case for an Android-based TV platform, namely building Android applications around a live TV signal, is not possible to realize. Quoting from the documentation [1]:

Note: The picture-in-picture (PIP) feature of the Live TV app is not available to other Android applications. Also, you can’t run an Android application in the small PIP window.

 

As puzzling as Google’s decision not to provide such a component may sound at first, there is actually a very simple explanation for it. After last week’s DevFest event in Berlin, I had a chance to speak to Google’s Matt Gaunt, developer advocate for Google TV in London, about this very issue. Apparently, the lack of a PIP component is due purely to legal reasons. Think about it: if any Android application were able to overlay its content over a TV channel, that channel’s content could easily be compromised and the viewer tricked into believing that the overlaid content is actually part of the show. While this would not be an issue with “trusted” services such as, say, a content overlay containing background information for a news show, malicious apps could easily exploit this to misrepresent or otherwise distort the content aired by the TV channel.

Hence, to make this happen, Google would have to ask every single TV channel in the world for permission to allow content being overlaid by third party apps. You can easily see how unlikely this will ever be to happen, since Fox News is probably not interested in apps waving an Obama flag in front of their channel logo, to name one example.

For now, unfortunately, Google TV remains a mere TV-or-apps experience, not both.

[1] https://developers.google.com/tv/android/docs/gtv_android?hl=en#HardwareFeatures

Android’s New Build System: Good Times Guaranteed.

By now, most of you have probably heard that Google is abandoning Ant and moving to a new, modern build system based on the excellent Gradle. I have been a big fan of Gradle for some time now, and tried promoting and working on its Android features in my spare time (as part of gradle-android-plugin), so this comes as fantastic news to me.

One thing I was worried about is that with a big change like this, the de facto standard for building, packaging, and distributing Android library projects as created and promoted by the great guys behind android-maven-plugin, i.e. the apklib artifact type, would now become obsolete shortly after it had proven so successful. Keep in mind that all work on Maven-related features was and is entirely community-driven, and not directly supported by Google.

However, android-maven-plugin’s Manfred Moser just clarified on the developer mailing list that in fact they had been working together with both Xavier Ducrohet (the Google SDK tools lead) and Hans Dockter (founder of Gradle and Gradleware’s CEO) to streamline their efforts. Here’s what will happen:

  • Programmatic access to SDK tools functionality will be provided by an open-source library built and maintained by Google, and will be available on Maven Central
  • The former apklib artifact type will be improved upon and renamed to .aar (Android Archive), in the spirit of Java’s JAR and WAR types

The Maven Android plugin will start supporting this new format in the forthcoming releases.

Now what does all this mean for developers? It means mostly two things: First, with Google’s move to Gradle, there will be an excellent build system available out of the box with all the neat features Gradle provides, such as a nice DSL based on the Groovy programming language, dependency management via Ivy, plugin support, and so on. Second, no one using Maven now to build their Android apps will have to make the switch: android-maven-plugin will continue to work and co-exist, and even better, it will be a symbiotic co-existence between the two systems.

If you want to get a 5 minute head start into Gradle for Android, have a look at my slide deck “Hands on the Gradle”.

Good times guaranteed!