Code

Motivation

Work in progress

What is a "specification language"?

In a lecture, back in 2010, Joe Armstrong said:

"We need languages to describe encodings and protocols not machine instructions"

He then went on to elaborate saying that we have plenty (too many) programming languages (e.g. C, Java, Go, Python, Erlang, etc) which at a high-level allow us to manipulate machines via machine instructions (and syscalls). However we only have a few specification languages which describe encodings and protocols, i.e. describe what's going on between two, or more, machines!

In a later talk (2013) Joe has the following slide:

Joe Armstrong's talk "The How and Why of Fitting Things Together"

Where the black boxes are written using programming languages while the big red arrow between is what Joe means with encodings and protocols. It's this big red arrow which is what we need different languages for, and to distinguish them from "normal" programming languages, we shall call them specification languages.

Spex tries to be such a specification language.

The difference between APIs, encoding and protocols

Before we can start talking about the shortcomings of current specification languages, it's helpful to first break down Joe's red arrow. It actually has three components:

  1. The interface of the blackbox (the API), i.e. what messages the component accepts;
  2. The encoding of the messages, i.e. what bytes (or bits) are sent over the communication channel;
  3. The protocol, i.e. what sequences of messages are allowed.

Hopefully it's clear what encodings are (e.g. UTF-8 encoded JSON or binary Protobuf messages), but it might be worth pondering the difference between APIs and protocols.

In the latter of the above mentioned talks, Joe gives an example where merely knowing the API isn't good enough. The example he gives is the POSIX filesystem API:

Nothing in the API says, for example, that we can't continue writing to a file descriptor that has been closed! This is where protocols come in, saying for example that we can only write to a file descritor that is open, etc.

One last point to note is that APIs can be synchronous or asynchronous, as in the client of the API gets a response back immediately upon making the call or it (potentially) gets the response sometime in the future.

What are the problems with past approaches?

Now that we've established the differences between APIs, encodings and protocols, we can start talking about where the current specification languages fall short.

Choose two out of three

The first thing to note is that very few specification languages support specifying all three aspects. For example:

On the other side of the fence, we have things like session types which support specifications of APIs and protocols, but don't talk about encodings.

There's one exception which supports specifying all three aspects and that's Joe's Universal Binary Format (2002), it seems to have been largely forgotten about though.

Syntax and tooling

In addition to the aspets of what you can specify, it's also important to consider how pleasant it's to read and write such specifications and what you can do with them -- i.e. what tooling is available once we have a specification.

Ultimately the low use of specification languages must at least somewhat be a reflection of the fact that specifications are not worth the effort?

OpenAPI specifications are written in JSON or YAML, which verbose and hard to read. Microsoft's TypeSpec (2024) introduces a sane syntax inspired by TypeScript.

What the Spex specification language tries to do different

Spex tries to address this shortcoming of specifications by in addition to being a specification language, it's also a verifier that checks if some system respects the specification -- thereby always ensuring that the two are in sync.

In the process of testing the real system against the specification is will also produce minimal test cases for potential problems it notices along the way:

(We'll look at examples of how exactly this gets reported in the next section on features.)

In the future we'd like to derive more useful functionality from specifications, including:

With this future functionality we hope to get to the point where there's a clear benefit to writing specifications!

Conclusion