Seamless computing

In an earlier post (“Remember machine independence?“) I pointed out my dissatisfaction with the so-called domain specific languages (DSL) which are popping up everywhere nowadays, in support of new and esoteric programming platforms. The name DSL bugs me because I find it misleading. The right name would be ASL: architecture-specific language. They are not meant to solve problems in a particular domain, they are meant to program a particular architecture. Many such languages have some of the trappings of higher-level programming languages but also enjoy unconventional syntax and semantics, designed to match the platform.

I am not buying it. I think it is a mistake. I think the first and foremost principle of a higher-level programming language is that of machine independence. A higher-level programming language should allow the programmer to solve a problem rather than drive a physical system. This was a crazy idea in the 1950s, which went through all the stages that a crazy idea goes through:

First they ignore you. Then they ridicule you. And then they attack you and want to burn you. And then they build monuments to you. (Nicholas Klein)

The you in this quote was John Backus, who I am not sure whether he got a monument out of it (or whether anyone really wanted to burn him), but he did get a Turing Award, which is not bad. Not accidentally, the title of his lecture on the occasion of the award was Can Programming Be Liberated From the von Neumann Style? [PDF]. And machine independence figured high on the agenda.

Now the idea of machine independence is getting ridiculed again. As it was back in the 1950s the reason is exactly the same: computer programming experts are comfortable with machine-dependence in a programming language, and are worried that machine-independence will cause a substantial loss of efficiency. Which is both true and misguided at the same time, unless you are working on system-level programming. For application-level programming (scientific, corporate, etc) it rarely helps that your program is twice as fast if it takes twice as long to write.

In a 1958 paper called Languages, Logic, Learning and Computers, (J. of Computers and Automation) J.W. Carr III writes this:

One would like to be able to use these computers up to the day of the next one’s arrival, with interchange of problems and yet efficiency. Can this be done?

Carr is referring to the upgrade from IBM-650 to the “totally different” IBM-701 and IBM-704. This is before FORTRAN had been invented so the answer was not obvious at the time. But can we extend this point to our situation and replace ‘computers’ with ‘architectures’. Should our programs, written for the CPU, not work for GPU, FPGA, cloud, etc? The skeptics say ‘no’ and often argue that a mismatch between the programming language model and the system model leads to inefficiency. Perhaps, but in many circumstances loss of performance is a price worth paying. If we look at the languages of today, especially functional languages, they do not match the CPU-based architecture very well at all. They are declarative where computers are imperative. They use functions where computers use jumps. To compile (efficiently) one for the other a lot of optimization work is put in and the native systems are augmented by runtime systems to provide essential services such as garbage collection. But this does not invalidate the principle of having good programming languages even if they don’t match the architecture on which they run. Designing machines especially to run functional languages is actually a cool and interesting idea (see the REDUCERON).

Here is the same objection, taken from Carr’s paper:

The problems of efficiency have been often emphasized by programmers who believe strongly in the continuance of hand-programming over translational techniques.

“Hand-programming” means programming in assembly or even binary. “Translational” means using what we now call compilers. It’s safe to say history vindicates Carr and not those overly concerned with efficiency.

Bob Harper’s blog has a number of excellent posts on the duality between programming in the language and programming for the machine. They are all quite relevant to the point I am making, so have a look.

Machine-independence is clearly the necessary ingredient to achieve the kind of portability considered desirable by Carr. So we are doing something about it. We are developing a compiler for a conventional imperative and functional programming language (we call it Verity) which would allow effective and efficient compilation to any architecture. We already have a back end for compiling to FPGAs, which we call the Geometry of Synthesis project. It supports higher-order functions, recursion, separate compilation and foreign function interfaces to native HDL, all in an architecture-agnostic way. I wrote about this project on this blog before.

Distributed seamless computing

Today I want to talk distributed computing. From the point of view of supporting architecture-independent programming, distributed computing is a mess. Lets start with a simplistic program, just to illustrate a point, a program that in a functional programming language I might write as:

let f x = x * x in f 3 + f 4

And suppose that I want to distribute this code on two nodes, one that would execute the function f, one that would execute the main program. I can choose a high-level distributed programming language such as ERLANG. Here is how the code looks:

c(A_pid) -> receive X -> A_pid ! X * X end, c(A_pid).
main() ->
 C_pid = spawn(f, c, [self()]), C_pid ! 3,
 receive X -> C_pid ! 4, receive Y -> X + Y end
 end.

In the first program I describe logical operations such as multiplication and addition, and a logical structure to the program. In the second program I describe a system: I spawn processes and send messages and handle messages. A program which should be trivial is no longer trivial.

Things get even worse if I choose to distribute my program on three nodes: one for f, one for the function applications, one for whatever is left of the main program. This is the ERLANG code for this:

c() -> receive {Pid, X} -> Pid ! X * X end, c().
b(A_pid, C_pid) ->
 receive
 request0 -> C_pid ! {self(), 3}, receive X -> A_pid ! X end;
 request1 -> C_pid ! {self(), 4}, receive X -> A_pid ! X end
 end,
 b(A_pid, C_pid).
main() ->
 C_pid = spawn(f2, c, []),
 B_pid = spawn(f2, b, [self(), C_pid]),
 B_pid ! request0,
 receive X -> B_pid ! request1, receive Y -> X + Y end
 end.

Can you tell that these two ERLANG programs compute the same result? If you initially wrote the first two-node version and decided to re-factor to three nodes how hard would the transition be? You need to rewrite most of the code.

There is a better way. The way you might want to indicate nodes to the compiler should not be more than that:

let f x = (x * x)@A in f 3 + f 4

or

let f x = (x * x)@A in (f 3)@B + (f 4)@B

Is this possible? To the programmer who needs to struggle with the intricacies of writing distributed code this may seem like a pipe dream. But it is possible. Theorists of programming languages have known since the late 80s and early 90s that computation and communication are closely related. Whether it was Milner’s Functions as Processes [PDF], Girard’s Geometry of Interaction [DOI] or the early work on Game Semantics [see my survey] the message is the same: function calls are nothing but structured communication protocols. And this is not just pure esoteric theory, we really knew how to write compilers and interpreters based on it. Nick Benton wrote a very cool tutorial which also applies functions-as-processes to distribution. Why didn’t we write this compiler yet? Because we were afraid it might be too inefficient.

I think we should be braver and more sanguine and have a go at it. This worry could be misplaced or exaggerated. With my student Olle Fredriksson we made the first step, which is writing a compiler for a core functional programming language which can be distributed using light-weight compiler annotations just as described above. We call it seamless computing because it is distributed computing where all the communication between components is via function calls rather than via explicit sending and handling of messages. Explicit communication is the seam in distributed computing. This distribution is communication-free.

A paper describing the compiler was presented at the International Symposium on Trustworthy Global Computing earlier this year [PDF] and there is a video based on it which you can watch online. The compiler is available for download from veritygos.org. It is based on Girard’s Geometry of Interaction and it compiles down to C and MPI. There are some good things and some bad things regarding its efficiency. On the one hand the compiler does not require a garbage collector, which is very desirable in the distributed tokens. On the other, the tokens being exchanged during the computation have a size proportional to the depth of the call stack, which can be large especially for recursive calls. But this can be avoided by using a better implementation, still without requiring a distributed garbage collector. This is ongoing work. We are also currently expanding the compiler to handle the whole of Verity, so that it has concurrency and shared state. We are making great progress and it’s a lot of fun both as a theoretical challenge and as an implementation and engineering one.

We will keep you posted as things develop!

About Dan Ghica

Reader in Semantics of Programming Languages // University of Birmingham
This entry was posted in seamless computing. Bookmark the permalink.

2 Responses to Seamless computing

  1. Adrien says:

    Hello Dan,

    I’m curious: what do you make of the fallacies of distributed computing?
    http://nighthacks.com/roller/jag/resource/Fallacies.html

    Recently we began building upon your GoS work, while revisiting old areas such as finite mealy automata. Your work was very inspiring, but the practical impact in terms of programming languages may take a while to percolate (at least in our case!).

    • Dan Ghica says:

      It’s a bit like saying “Top fallacy of Turing machines: the memory is infinite”. Of course all those caveats are true, but you want your programming language and the runtime systems to shield you of all that complexity, as a programmer. So these issues must not be ignored, but they should be kept under the hood.