Moving sites

From now on new posts will appear on my Blogger site. This is for both technical (the comment system on this site, the occasional crashes) and personal (I may want to post personal or political opinions which need not be related to my employer) reasons.

Please update your subscriptions!

Posted in General | Leave a comment

Categorical semantics of digital circuits

Isn’t it funny that the world of programming languages is defined by syntax while the world of circuit design is defined by semantics? One can never be sure, but it seems to me that when people design software they imagine the code, whereas when people design circuits they imagine the behaviour. This seems also to apply to the design of programming languages, where an endless variety of syntaxes exist, all essentially doing the same thing (crudely speaking) but in weirdly different ways. In hardware design there are only two syntaxes (languages) worth considering, VHDL and Verilog. Unlike programming languages, hardware languages don’t even trap you into a particular ecosystem, and most tools allow the two to interoperate happily. These hardware languages may be, from the point of view of a programmer, quite horrible but, since I am typing this post on a working computer and sending it over to a global network of working computers, these horrible hardware languages seem to get the job done.

The hardware-software contrast carries over into the semantics of the two classes of languages. The semantics of programming languages is most commonly specified as syntactic transformation rules called operational semantics. For example in any programming language there is a rule that says that the phrase "1+1" can be safely replaced with "2". Of course, these rules can be very complicated, describing how functions are called, exceptions are managed, input-output interactions are handled and so on. But the idea is that a program and some additional configuration information can be reduced to a "simpler" program with some (potentially different) configuration information.

On the other hand, the semantics of a circuit is most commonly given as sets of waveforms which describe directly the evolution in time of a measurable parameter at any chosen point in the circuit — a simulation, informally speaking. There are similar behavioural approaches to giving semantics to programming languages (denotational semantics) but they are not as widely used in applications (although they are very interesting things to study!).

What is quite surprising is that there is no equivalent of operational semantics for digital circuits — at least nothing I could find. The closest I can mention is the ground-breaking work by Mary Sheeran in the 1980s on using functional languages to specify circuit designs. As a part of this approach she specifies some equations for simplifying circuits, which can serve as part of an operational semantics — but more axioms are needed for that.

One thing that Sheeran’s work makes implicitly clear is that the conventional syntax of hardware languages is too unstructured to be useful for the formulation of axioms, and consequently for operational syntax-level modelling. Her Haskell-based syntax has the requisite mathematical structure so that axioms can be formulated. But how do we know what axioms are needed?

Circuits in general are diagrams of interconnected components (e.g. logical gates) and the way we like to think of them is up to topological isomorphism. This means that when we design a circuit we prefer not to think about the geometry of the interconnect (the lengths and shapes of the wires) but only the topology (what connects to what). The geometric details (in particular the presence of very long wires or especially very long parallel wires) are of course important for the physical realisation of the circuit, but at the logical design level we want to abstract away from geometric detail. Specialised place-and-route tools make sure that the physical realisation of logical designs does not violate the abstractions we use.

The wonderful thing about diagrams-up-to-topological-isomorphism is that they can usually be described very precisely in the mathematical language of monoidal categories. This gives us a good language, and a good set of equations to work with. Unsurprisingly, the categorical specification language for circuit diagrams looks a lot like Sheeran’s Haskell-based specifications — this similarity has a name, the Curry-Howard-Lambek correspondence. The extra thing that we get though is the equations — all the equations that we need in order to start reasoning syntactically about circuits!

In a paper, jointly authored with my colleague Achim Jung, about to be published at the conference Formal Methods in Computer-Aided Design, we pursue this observation and produce a categorical semantics of digital circuits. The main idea of the paper is that digital circuits can be shown to belong to a class of monoidal categories called control categories, which are (informally speaking) diagrams equipped with an unfoldable iterator. The rule of the iterator unfolding can be expressed diagrammatically as:

So whenever we have a circuit with feedback we can replicate the circuit, connect the two in a certain way, and get an equivalent circuit. This unfolding of the iterated circuit is the diagrammatic equivalent of the unfolding of a fixpoint combinator: Yf = f(Yf). This equation provides the essential means of syntactic reasoning about feedback.
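The same unfolding law is familiar from conventional functional languages as the fixpoint combinator. Here is a minimal OCaml sketch of it (my own illustration, not from the paper; the combinator is eta-expanded so that call-by-value evaluation terminates):

```ocaml
(* a fixpoint combinator satisfying fix f = f (fix f),
   eta-expanded for call-by-value evaluation *)
let rec fix f x = f (fix f) x

(* instantiating the unfolding at factorial *)
let fact = fix (fun self n -> if n = 0 then 1 else n * self (n - 1))
```

Each recursive call performs exactly one unfolding of the fixpoint, just as each unfolding of the iterated circuit replicates the circuit once.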

Note that we prove that circuits satisfy this equation, we don’t simply require it axiomatically. The proof relies on smaller, more obvious, axioms which circuits must satisfy, such as the interaction between constant circuits denoting fixed logical levels and structural circuit components such as splitting and joining wires.

How is this syntactic approach helpful? We were mainly motivated by filling this gap in syntactic reasoning about circuits, but it turns out that interesting applications do exist. One such application is the so-called combinational circuits with feedback, where feedback loops are in some sense fake, i.e. they don’t create conventional feedback effects (latching, oscillation, etc.). Conventionally, digital design tools ban such circuits even though they can be very useful. Here is an example of a circuit which, informally speaking, behaves like if c then F(G(x)) else G(F(x)) efficiently, while using only one instance each of circuits F and G:

The main example in our paper shows how we can easily reason about such circuits without descending into complicated simulations.

The full paper is available from this link [PDF].

Posted in Geometry of Synthesis | 1 Comment

Why is lambda calculus named after that specific Greek letter? Do not believe the rumours!

A common myth in theoretical computer science is that the ‘λ’ in λ-calculus comes from some kind of typographical error, where the intended notation was a circumflex accent ŷ.M, which became a caret ^y.M then finally a lambda λy.M. At least on the math stack-exchange this is the most popular explanation [link].

We recently had λ-calculus pioneer Prof. Dana Scott give a series of lectures in our School. He was a PhD student of Alonzo Church. In the introduction of one of his lectures he told us the story of the Greek letters used in the λ-calculus: the α of α-conversion, the β of β-expansion, the η of η-expansion and of course λ itself.

He says that when Church was asked what the meaning of the λ was, he just replied "Eeny, meeny, miny, moe.", which can only mean one thing: it was a random, meaningless choice. Prof. Scott claimed that the typographical origin myth was mainly propagated by Henk Barendregt and is just pure whimsy. He asked us to stop perpetuating this silly story.

I am attaching the link to the YouTube video with apologies for the low quality.

Posted in General, programming languages, teaching | Leave a comment

So what is a “pure programming language” anyway?

This is one of those questions that is inflammatory enough to start a flame war, so it should make a good title for a blog post. But instead of feeding the flames I want to pour on some water, or whatever flame-retardant material is appropriate. This post is largely related to a recent post of my friend Neel Krishnaswami who outrageously called Agda an impure functional programming language — to add to the confusion "pure" and "functional" are often unhelpfully conflated.

So what is "pure"? Languages which are not pure are commonly said to have "side-effects". So conversely we could ask, what are "side-effects"? Changing the values of memory locations is perhaps universally considered a side-effect, and, more generally, interacting with the "outside" world is pretty much considered a side-effect (see Haskell’s IO Monad). But there are other kinds of programming language features which do not involve, strictly speaking, an interaction with the outside world but are largely seen as effects, perhaps most importantly continuations. Finally, there are yet other programming language features which researchers disagree on whether they are effects or not, for example non-determinism but especially non-termination. In his post on Agda Neel has pointed out yet another feature which can be considered an effect, the nominal generativity of type declarations in languages such as Agda, also encountered in other languages.

This debate on what is and what is not an effect, and implicitly what is and what is not a pure language, is not entirely silly because the badge of purity is worn with pride by languages such as Haskell and Agda, so attributing effects to them is an insult, really. As an insult to Haskell, someone once said that it is not a pure language but it operates in the "blessed monad" of non-termination and uncaught pattern-matching failures — which is actually 100% true.

The problem is that this endless is-it-an-effect-or-not controversy doesn’t bring us closer to an understanding of what is pure and what is effectful, so I will propose a definition of pure vs. impure which is very general, in fact it transcends lambda calculus or even programming languages in general. It is a systemic proposal of purity vs. impurity — in the process I also propose to discard the term “effects” which is silly, really, because any computation has some effect — as someone pointed out, any computation will get your computer slightly warmer. And if you think this is silly, let me tell you that there are side-channel attacks based on detecting temperature changes in processors. My proposal is also in line with my greater project to give semantics to systems rather than just syntax.

So here goes: suppose you design a system which, as all well-designed systems, has a certain equational theory. If your system is "the lambda calculus" then the key equations are alpha, beta and eta. These equations are of course essential in reasoning about your system. Now suppose that you realise that your system must be connected to other systems which do not observe the same equational theory — you have no choice. As I have argued elsewhere, any programming language to be useful requires a foreign function interface which allows it to interface with code written in a different, unknown, language. Your original system is the "pure" system and the external systems are the "impure" parts.

Programming languages (the civilised ones) are such extensions of the simply-typed lambda calculus. Add a fix-point operator, arithmetic and an if statement and you get PCF, but you also get non-termination and a previously-absent distinction between calling conventions (call-by-name and call-by-value have different equational theories). Is PCF pure? Relative to the simply-typed lambda calculus it is not, since it breaks its equational theory, but if you give it its own equational theory then it is pure. To PCF add local variables and you get Algol which, again, breaks PCF’s equational theory. Yet Algol is a very civilised language, with a very good equational theory.
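A minimal OCaml sketch of how the calling convention breaks an equation (my own illustration): unrestricted beta says that (fun x -> 0) applied to any term equals 0, but under call-by-value the argument is evaluated first, so a diverging argument makes the whole application diverge. Thunking the argument recovers the call-by-name reading.

```ocaml
(* a diverging computation *)
let rec loop () : int = loop ()

(* beta says (fun _ -> 0) (loop ()) = 0, but evaluating that
   in call-by-value OCaml would not terminate *)
let k (_ : int) = 0

(* thunking the argument restores the call-by-name equation *)
let k' (_ : unit -> int) = 0

(* k' loop returns 0 without ever forcing loop *)
```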

So “pure” is an entirely relative term which we can meaningfully use to describe a system as equationally understood, and “impure” to denote an extension which doesn’t obey the same equations, making reasoning difficult or impossible. So I would say no language is “pure” but we have pure and impure extensions to any given reference language, which is always pure. Another way to think about impure extensions is simply as extrinsics or foreign functions.

An important role that type systems could (and should) serve is to delimit “pure” and “impure” code in the same program. By “impure” code I mean code that is observably influenced by impure features. As far as I know the only comprehensive attempt at this is John Reynolds’s Syntactic Control of Interference type system, which works on top of Algol syntax, to delimit “pure” code (i.e. without assignments) which he calls “passive” from “impure” code (i.e. which relies on memory locations) which he calls “active”. I find this a much more reasonable approach at dealing with “effects” by simply isolating them, rather than trying to give types to these effects as is the standard “monadic” approach. As I argued elsewhere this is simply too ambitious and it is unlikely to work when we interface with arbitrary systems, just because arbitrary systems are too messy and we have no right to expect them to be uniform enough to be typeable.

Posted in programming languages, system level semantics | Leave a comment

Learn OCaml!

I have been teaching OCaml to first year undergraduate students for a few years now. Why not share it with the world? To access the course follow the link at the bottom of the page.

Here is what you will learn:

  1. The pure fragment of OCaml: functions, polymorphism, patterns, variants, lists, (tail) recursion, map and fold
  2. Basic algorithms and data types: list algorithms, searching, sorting, queues
  3. Basic proofs of program correctness using structural induction, applied to Peano numbers and lists and supported by our home-brewed proof assistant Yup
  4. Some general theoretical concepts: isomorphisms, the Curry-Howard correspondence

On the web site you will find: lecture notes, lecture slides, lecture videos and over 30 programming assignments. The assignments come with automatic test harnesses so you can test your answers for correctness.
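To give a flavour of the style taught in items 1 and 3, here is a small example of my own (not taken from the course materials): Peano numbers as an OCaml variant, with structural recursion and a fold over lists.

```ocaml
(* Peano numbers as a variant type *)
type nat = Zero | Succ of nat

(* structural recursion on nat *)
let rec add m n = match m with
  | Zero -> n
  | Succ m' -> Succ (add m' n)

let rec to_int = function
  | Zero -> 0
  | Succ n -> 1 + to_int n

(* a fold over lists, as in the pure fragment *)
let sum xs = List.fold_left ( + ) 0 xs
```

Properties such as add m Zero = m are exactly the kind of statement proved by structural induction in the course.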

The course web site is accessible from

Posted in General | Leave a comment

Compilers as lazy denotational interpreters

This year I had the pleasure to teach, for the first time, a course on compilers. I have been doing research on compilers for a while now but I suppose I never really asked myself seriously “what is a compiler?”. I could recognize a compiler if I saw one, but this is not good enough for teaching. One paper that gives an excellent conceptual explanation is Ager et al.’s [PDF] but it might be a little hard to read if you are not familiar with abstract machines (Krivine, SECD) and with Standard ML syntax. In this post I will follow the same methodology, but as it applies to a very simple expression language which has only arithmetic, input, output and variable binding, and a very simple abstract machine. This will, I hope, make it clearer what an evaluator, an interpreter and a compiler are, and how they are connected to the key notions of abstract and virtual machine.

I should specify that this is not a recipe for writing efficient or realistic compilers. This is an exercise in understanding what a compiler is. The second caveat is that the perspective presented here is quite common for functional or other higher-level programming languages but not quite as relevant for C, which is close enough to the concrete machine that an abstract machine is not really necessary. (Unless we consider LLVM-based compilation as using a virtual machine.)

The code used in this post is OCaml. Why? Because OCaml is the best programming language bar none for implementing compilers. If you don’t believe me feel free to translate or reimplement my code in your favourite language and see how it goes.

The object language

Let us consider this simple expression language:

e ::= n | x | e ⊙ e | let x = e in e | print e | read

where n stands for numeric constants, x for variable identifiers, and ⊙ for arbitrary arithmetic operators. We shall not deal with parsing at all as it is a totally separate topic, but we start from the abstract syntax tree, specified by this OCaml type:

type expression =
 | Const of int
 | Var of string
 | Binary of string * expression * expression
 | Let of string * expression * expression
 | Output of expression
 | Input

For example, the following expression

let z = (let x = read in let y = read in x + y) in print z

has the syntax tree:

Let ("z", 
    Let ("x", Input, (Let ("y", Input, 
        (Binary ("+", Var "x", Var "y"))))), 
Output (Var "z"))

The evaluator

The first step in the formal definition of a programming language is giving its evaluation rules in an effective way. There are many ways to do this and they fall under the general topic of operational semantics, which can have several flavours (e.g. big step, small step, etc.) of which you can learn more in Andy Pitts’s excellent tutorial [link]. Without too much further theorising here is an obvious evaluator for our expression language:

let rec evaluate = function
 | Const n -> n
 | Binary (op, e, e') ->
   let v = evaluate e in
   let v' = evaluate e' in
   op_of_opcode op v v'
 | Let (x, e, e') ->
   let v = evaluate e in 
   let e' = substitute x v e' in
   evaluate e'
 | Var x -> failwith ("undefined: " ^ x)
 | Output e ->
   let v = evaluate e in
   print_int v; print_newline ();
   v
 | Input -> read_int ()

Let’s discuss the code:

  • A constant expression Const n is evaluated to just n.
  • An operator Binary evaluates the two arguments then applies the operator. The helper function op_of_opcode decodes the operator symbol into an actual (OCaml) operation:
    let op_of_opcode = function
     | "+" -> ( + )
     | "*" -> ( * )
     | ";" -> (fun x y -> ignore x; y)
     (* and so on *)
     | op -> failwith ("unknown operator: " ^ op)
  • The Let binder evaluates the definition of x then replaces x with it in the body expression e’ using a helper function to do the substitution:
    let rec substitute (x : string) (v : int) = function
     | Const n -> Const n
     | Binary (op, e1, e2) -> Binary (op, substitute x v e1, substitute x v e2)
     | Let (y, e1, e2) when x = y -> Let (y, substitute x v e1, e2)
     | Let (y, e1, e2) -> Let (y, substitute x v e1, substitute x v e2)
     | Var y when x = y -> Const v
     | Var y -> Var y
     | Output e -> Output (substitute x v e)
     | Input -> Input

    Note that substituting in a Let expression must ensure that the defined variable is never substituted as it would violate the obvious scoping rules of a local definition, hence the special case.

  • As a consequence, all variables in a program are substituted at their point of definition, so we shouldn’t encounter them. If we do encounter one, it must be undefined, so we raise a failure.
  • Output evaluates its argument, prints it and returns it as a result. Since we are in an expression language we always need to produce some kind of integer result and this seems sensible.
  • Input reads an integer and returns it.

This is a reasonable definition of our language’s semantics but it is rather impractical for two reasons:

  1. In order for this evaluator to execute the program it needs to modify the abstract syntax tree during the evaluation. This is clearly a bad idea for several reasons:
    1. Substitution or any other kind of AST manipulation will be a complicated operation which is unlikely to be easily mapped into smaller, machine-friendly operations.
    2. Substitution or any other kind of AST manipulation will always be inefficient in the extreme.
    3. Substitution is incompatible with using libraries, foreign-function interfaces, or runtime services.
  2. In order for the evaluator to execute a program it relies on the full power of the meta-language, in our case OCaml. Why is this a problem? We know that eventually we want our program to be executed by some machine, and it seems like a bad idea to assume that machine will be able to run OCaml, which is a big and complicated language requiring a sophisticated run-time environment. We need something simpler.

A compositional evaluator

We first remedy the problem of modifying the AST by giving a compositional evaluator in which the meaning of a term is defined from the meanings of its subterms. This is the key principle of denotational semantics which goes back to Scott and Strachey for programming languages and to Tarski for logics. To implement it we will add an environment to our evaluator, which is a dictionary mapping variable identifiers to values:

let rec cevaluate env = function
 | Const n -> n
 | Binary (op, e, e') ->
   let n = cevaluate env e in
   let n' = cevaluate env e' in
   op_of_opcode op n n'
 | Let (x, e, e') ->
   let env = (x, cevaluate env e) :: env in 
   cevaluate env e'
 | Var x -> lookup x env
 | Output e ->
   let v = cevaluate env e in
   print_int v; print_newline ();
   v
 | Input -> read_int ()

The helper lookup function is:

let rec lookup x = function
 | [] -> failwith ("undefined: " ^ x)
 | (x', n) :: env when x = x' -> n
 | _ :: env -> lookup x env

Except for the treatment of variables, which are looked up rather than substituted, the two evaluators are identical. A mathematically-minded reader could prove (e.g. in Agda or Coq) some soundness results, e.g.

∀M.∀n. evaluate M = n ⇒ cevaluate [] M = n

Note that this is not entirely a trivial exercise because of the variable scoping rules! Also, keeping track of effects (input and output) presents some further formal challenges.

Abstract machine and interpretation

The second step in our process is to try to write an evaluator restricted to particular operations. What operations? That depends on what our ultimate target architecture might be. We don’t necessarily want to use at this stage a genuine concrete machine specification (e.g. x86) because it would be extremely complicated, albeit doable — barely. What we want is an intermediate layer of abstraction. What we want is a set of operations that we believe can be refined efficiently into concrete machine operations. What we want is an abstract machine.

For functional languages the standard abstract machines are Krivine for call-by-name evaluation and SECD for call-by-value, but other abstract machines exist, for example the Token abstract machine used to interpret linear logic [link]. Designing abstract machines is a fascinating and fun exercise, a beautiful mix of engineering and mathematics. They truly are the nexus between language and machine which make compilation of higher-level languages possible. 

For the purpose of illustration I will use a totally made-up and very simple abstract machine, the PSM (pure stack machine). It is a stack equipped with some operations:

  • push a new element onto the stack
  • pop the top of the stack
  • peek at the n-th element of the stack, copying it to the top
  • apply an operation to the top two elements of the stack and leave the result instead
  • swap the top two elements of the stack
  • output the top of the stack
  • input a new element at the top of the stack.

The OCaml implementation is this:

let stack = ref []
let push x = stack := (x :: !stack)
let pop () = stack := List.tl !stack
let peek n = stack := (List.nth !stack n :: !stack)
let apply op = stack := (function x :: x' :: xs -> (op x' x) :: xs | _ -> failwith "underflow") !stack
let swap () = stack := (function x :: x' :: xs -> x' :: x :: xs | _ -> failwith "underflow") !stack
let output () = print_int (List.hd !stack); print_newline ()
let input () = stack := (read_int () :: !stack)

Note that apply applies the operation with the second element of the stack as its first operand, matching the evaluation order of the evaluator.

When we implement the abstract machine we purposefully move into the imperative world because machines are naturally imperative. Machines have a state which is changed by their operations. In our case this state is simply a global stack. This presentation can be made even nicer, at the cost of some sophistication, by packaging the stack as a module which would hide all the implementation details from the user. I leave this as an exercise for the sake of a more elementary description.
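As a hint for that exercise, here is one possible shape such a packaging could take (names are my own; the top observer is an extra, added only so the machine can be inspected):

```ocaml
(* one possible interface for the PSM, hiding the stack representation *)
module type PSM = sig
  val push   : int -> unit
  val pop    : unit -> unit
  val peek   : int -> unit
  val apply  : (int -> int -> int) -> unit
  val swap   : unit -> unit
  val output : unit -> unit
  val input  : unit -> unit
  val top    : unit -> int   (* observer, for inspecting the machine *)
end

module Psm : PSM = struct
  let stack = ref []
  let push x = stack := x :: !stack
  let pop () = stack := List.tl !stack
  let peek n = stack := List.nth !stack n :: !stack
  let apply op = match !stack with
    | x :: x' :: xs -> stack := op x' x :: xs
    | _ -> failwith "underflow"
  let swap () = match !stack with
    | x :: x' :: xs -> stack := x' :: x :: xs
    | _ -> failwith "underflow"
  let output () = print_int (List.hd !stack); print_newline ()
  let input () = stack := read_int () :: !stack
  let top () = List.hd !stack
end
```

Client code can now only manipulate the stack through the machine’s instructions, which is exactly the discipline the interpreter below obeys anyway.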

We are now in a position to write an interpreter for this abstract machine, i.e. a compositional evaluator which will only use the abstract machine operations:

let rec interpret symt = function 
 | Const n -> push n
 | Binary (op, e, e') -> 
   interpret symt e;
   interpret (inc symt) e';
   apply (op_of_opcode op)
 | Let (x, e, e') -> 
   interpret symt e;
   interpret ((x, 0) :: (inc symt)) e';
   swap ();
   pop ()
 | Var x -> 
   let n = lookup x symt in
   peek n
 | Output e ->
   interpret symt e;
   output ()
 | Input ->
   input ()

Let’s discuss the code:

  • A constant expression is interpreted by pushing the constant onto the stack.
  • A binary expression will interpret the operands and, assuming that the two values are left on top of the stack, will apply the operation.
  • A let expression will interpret the definition e, which leaves its value at the top of the stack, then interpret the body e’, remembering the stack offset of the value of x. We want only the value of the body to remain on the stack, so we swap it with the value of x, now in the second position, and pop that out.
  • Variables are looked up at their known stack offset.
  • Input and output are obvious.

The interpreter uses an extra argument, a symbol table mapping variable identifiers to stack offsets, with helper function

let inc = List.map (fun (x, n) -> (x, n + 1))

All the “execution” is now done in terms of abstract machine operations, even though we are using a few other operations for managing the symbol table. That is not a serious problem. In fact this mapping between variable identifiers and stack indices could be considered part of the parsing process.

At this point we have our interpreter!

The mathematically minded reader could now think about proving some soundness results about this interpreter, which even for a small language such as this one is quite a (formal, if not mathematical) effort:

∀M.∀n. evaluate M = n ⇒ run M = [n]

where

let run m = stack := []; interpret [] m; !stack

which means that running the interpreter leaves a singleton stack containing the evaluated value.

Compilation, virtual machines and native code generation

Once we have the interpreter, the compiler is surprisingly easy to implement. The interpreter runs the abstract machine instructions as it processes the program. A compiler will simply collect, in a lazy way, the instructions (and save them to a file)!

let rec compile symt = function 
 | Const n -> "push " ^ (string_of_int n) ^ "\n"
 | Binary (op, e, e') -> 
   (compile symt e) ^
   (compile (inc symt) e') ^
   "apply " ^ op ^ "\n"
 | Let (x, e, e') -> 
   (compile symt e) ^ 
   (compile ((x, 0) :: (inc symt)) e') ^
   "swap\n" ^
   "pop\n"
 | Var x -> 
   let n = lookup x symt in
   "peek " ^ (string_of_int n) ^ "\n"
 | Output e ->
   (compile symt e) ^
   "output\n"
 | Input ->
   "input\n"
The code is strikingly similar. We could (almost) textually replace sequential composition with concatenation, and replace the calls to the functions implementing the abstract machine instructions with the names of the instructions themselves. I have renamed both functions to f and placed the interpreter and the compiler side-by-side:

For illustration let’s look at one test expression, m = let x = 6 in let y = 3 in x * y:

# m;;
- : expression =
Let ("x", Const 6, 
Let ("y", Const 3, 
Binary ("*", Var "y", Var "x")))
# evaluate m;;
- : int = 18
# cevaluate [] m;;
- : int = 18
# run m;;
- : int list = [18]
# compile [] m |> print_string;;
push 6
push 3
peek 0
peek 2
apply *
swap
pop
swap
pop
- : unit = ()

The very last step is processing the output of the compiler, the bytecode. Here we have two choices:

  1. The bytecode can be interpreted by what is usually called a virtual machine, which runs the bytecode on some concrete machine in the same way that our interpreter ran the original program on the abstract machine. Note that the bytecode is just a simpler programming language, but the principle remains the same.
  2. The bytecode can be itself compiled just by tracing the execution of the virtual machine, in the same way in which we traced the execution of the abstract machine to derive the object language compiler. Again, the principle is the same.


Compiler development is still, unfortunately, somewhat of a black art. I hope this post was informative, and especially I hope that the methodological importance of the abstract/virtual machine came through. This is indeed the key concept that makes compilers work. A good abstract/virtual machine must be such that the object language can be interpreted/compiled by/to it efficiently and its own instructions, the bytecode, must be interpreted/compiled efficiently by/to the concrete hardware platform targeted.

I also wanted to illustrate the role that operational semantics (the non-compositional evaluator) and denotational semantics (the compositional evaluator) can play in rational compiler development, and how they can be gradually refined into interpreters and compilers. I hope that made sense!

Posted in teaching | Leave a comment

A short fable of software engineering vs. regular engineering

An acquaintance recently took up an assembly line job with McLaren automotive — the people who make F1 and luxury sports cars. Before this gig, he used to work from his garage doing custom body and paint work, and developed quite a reputation for skill and quality. Working on the assembly line pays better but he is not very pleased with the monotony of it: all he does all day long is fit one small body panel on the chassis. The panel is part of the front wheel arch and it costs about £7,000.

One day a panel didn’t fit properly. It was off by a few millimetres, nothing that my friend could not have fixed with a few minutes of careful sawing and sanding. To his surprise, what happened was very different. That chassis was taken to one side and work on it was suspended for days while a whole bunch of engineers tried to figure out not how to fix the problem but why the problem had happened in the first place. My friend, the assembly line worker, put it down to the stupidity of engineers and the bureaucracy of management. In the end he was not told how the problem was solved or what the investigation uncovered.

My friend was quite obviously an accomplished hacker. His hacker mentality would have fared quite well in software production, where debugging is part of the normal workflow, but was out of place in a conventional engineering setting. His story made me reflect on the huge gap between car and software engineering.

I wonder how many years until software production will reach the level of maturity where we don’t just fix bugs but, when bugs happen, we stop in our tracks, shocked that bugs happened at all in the first place, and only proceed after we had a serious look at our processes. I wonder how many years until our software will look like this

rather than like this

Posted in General | 2 Comments

Computability: The Greatest Law of Physics

Wigner famously declared mathematics to be “unreasonably effective” in the natural sciences. There is indeed something magical about the fact that if we have a mathematical theory and some mathematically representable aspect of reality then we can carry out abstract, conceptual, mathematical reasoning and formulate some new facts about reality which will turn out to be true. Even more amazing, such theoretical reasoning can lead to genuine discoveries about reality — new observations. According to Lakatos this is the most important thing about a scientific theory. Indeed Eddington would never have ventured to the island of Principe to take measurements of the Hyades during a solar eclipse unless Relativity predicted that he would see something interesting. Halley would never have predicted the return of the comet that now bears his name.

The validity of a scientific theory rests on the correspondence between a mathematical model and measurements made of the physical world. The same relationship between an abstract, essentially mathematical model, and the physical world, usually a special-purpose device, lies also at the heart of computation. This relation, called the implementation relation by Chalmers, was first introduced in the context of trying to figure out what things can be said to compute. For most Computer Scientists this is not particularly exciting or interesting because of our Engineering-minded approach. We build computers so that they satisfy an implementation relation. In fact we rarely distinguish between “abstract” mathematical computation and “concrete” physical computation. We casually deal in “abstract machines”, such as the Turing Machine, without worrying too much whether we are talking about its mathematical definition or an actual physical instance. We just know that the latter can be easily built so that it implements the former.

However, for those interested in Natural Computing the implementation relation is far more interesting, because an implementation relation is discovered rather than constructed. Some naturally occurring phenomenon (the behaviour of slime mold, for example) can be used to calculate with because it implements a complex computation. Natural Computing works because such phenomena, implementing complex computations, exist. In some trivial sense any physical system can be used to compute facts about its own mathematical model, although that is hardly interesting. Natural Computing is interesting because particular physical systems can compute fast solutions to problems of general interest, which can actually encode the evolutions of other physical systems that evolve much more slowly or that are complicated or expensive to build.

What is truly fascinating is that no physical system can implement an abstract model which is mathematically uncomputable. For me this is indeed the greatest law of physics in terms of its generality, as it relates any abstract mathematical model to any physical system. The non-implementability of uncomputable models is an impossibility result of staggering magnitude. And it is truly a falsifiable (in the logical, Popperian sense) scientific law. Scientists, most notably Penrose, suggest that some neuronal processes in the brain may be physical yet uncomputable, but the evidence for this is very sketchy. More generally, philosophers propose that the brain is an instance of a physical yet non-computable process, mainly based on interpretations of Gödel’s incompleteness theorem which I find rather naive. But at least this shows that we can logically conceive of such systems.

In fact I would say that not just computability in the sense of Church-Turing, but the whole of computability theory, including complexity analysis, is essentially formulating extremely general laws of physics between whole classes of formal models and any physical system which might implement them. Computer Science is truly deserving of its name!

Posted in armchair philosophy | 1 Comment

A simple proof checker for teaching

For the last few years I have been teaching functional programming to first-year students. One of the main benefits of using functional programming is that you can prove simple, but non-trivial, properties of simple programs; for example, that the empty list is right-neutral for append:

append xs [] = xs

The practical problem with teaching a large group of students is that you end up with a large pile of assignments to check, which means a large number of teaching assistants of varying degrees of competence in reading and understanding proofs. Note that for programming assignments we didn’t have this problem because we could use automated random unit testing. Pedagogically, having inconsistent marking standards is not only irritating to students but potentially devastating in the final exam if TAs habitually accept wrong proofs as correct in the assignments. The final exam is marked by me, and I may not be as blind to mistakes.
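
The random-testing approach we used for programming assignments can be sketched quite simply. Below is a minimal illustration in Python (the course itself used a functional language); `student_append` stands in for a hypothetical submission, and all the names are mine:

```python
import random

def student_append(xs, ys):
    # Hypothetical student submission under test.
    return xs + ys

def reference_append(xs, ys):
    # Trusted reference implementation used for marking.
    if not xs:
        return ys
    return [xs[0]] + reference_append(xs[1:], ys)

def random_list(max_len=10):
    # A random int list: the generator behind "random unit testing".
    return [random.randint(0, 99) for _ in range(random.randint(0, max_len))]

def check(trials=1000):
    # Compare the submission against the reference on random inputs.
    for _ in range(trials):
        xs, ys = random_list(), random_list()
        if student_append(xs, ys) != reference_append(xs, ys):
            return False
    return True
```

The same idea, with smarter generators and shrinking of counterexamples, is what property-based testing tools in the QuickCheck family automate.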

One possibility is to use proof assistants such as Coq, Agda or Isabelle, but they are a bit too intricate for first-year students who are just learning the basics of proofs, logic, recursion and induction. What we want is a proof checker tailored to the needs of the beginning student. These were our design principles:

  1. The syntax of production-level proof assistants aims for concision, elegance and expressiveness. Our aim was a syntax that matched as much as possible the informal mathematical vernacular that students are used to from pen-and-paper proofs.
  2. Production-level proof assistants have various levels of interaction and automation which are not only useless to the beginning student but actively get in the way of learning proof techniques, because they may encourage a trial-and-error approach to proofs. Also, very simple fragments such as pure propositional logic may be entirely automated in such proof assistants.
  3. Even production-level proof assistants can be annoyingly clumsy when it comes to writing equational proofs. However, most of the proofs that we ask our students to write are equational. Congruence reasoning comes very easily to students, but doing it formally in a proof assistant is a pain. This aspect is something we wanted automated.
  4. We also need a generally high level of user-friendliness.

The source code, documentation and a range of examples, from the simplest to the moderately complicated, are available on GitHub. Here I am just going to give the proof for the very simple example above. If you find it appealing, I recommend you read the full documentation online.

append : 'a list -> 'a list -> 'a list ;
[append nil]: forall xs:'a list.
              append [] xs = xs : 'a list ;
[append ys]:  forall xs:'a list. forall ys:'a list. forall y:'a.
              append (y::ys) xs = y :: append ys xs : 'a list ;
Theorem [append neutral]:
Statement: forall xs:'a list.
           append xs [] = xs :'a list 
Proof: by induction on list :
case []:
 we know [0]: append [] [] = [] :'a list
              because [append nil] with ([]) .
 by [0]
case (hd :: tl):
 [IH]: append tl [] = tl : 'a list .
 we know [1]: append (hd :: tl) [] = hd :: (append tl []) :'a list
              because [append ys] with ([], tl, hd) .
 we know [2]: append (hd :: tl) [] = hd :: tl : 'a list
              because equality on ([1], [IH]) .
 by [2]
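
The two defining equations of append transcribe directly into an executable program. Here is a sketch in Python for readers who want to experiment; note that evaluation can only confirm instances of the theorem, while the inductive proof above establishes it for all lists:

```python
def append(xs, ys):
    # [append nil]: append [] xs = xs
    if not xs:
        return ys
    # [append ys]: append (y::ys) xs = y :: append ys xs
    y, rest = xs[0], xs[1:]
    return [y] + append(rest, ys)

# Instances of the theorem [append neutral]: append xs [] = xs
assert append([1, 2, 3], []) == [1, 2, 3]
assert append([], []) == []
```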

A comparison with the Agda equivalent is perhaps a good illustration of our trading off succinctness for readability. Note especially the need for an explicit congruence principle in the Agda proof. The list data type and append (++) are pre-defined. (Coq is not really comparable because it is tactics-based.)

cong-∷ : {A : Set} → (x : A) → ∀ xs ys → xs ≡ ys → x ∷ xs ≡ x ∷ ys
cong-∷ x xs .xs refl = refl
unitr[] : {A : Set} → (xs : List A) → xs ++ [] ≡ xs
unitr[] [] = refl
unitr[] (x ∷ xs) = Goal where
 p : xs ++ [] ≡ xs
 p = unitr[] xs 
 Goal : x ∷ (xs ++ []) ≡ x ∷ xs
 Goal = cong-∷ x (xs ++ []) xs p

The author of the tool is my intern student Yu-Yang Lin, with essential supervision and help from Neel Krishnaswami. Funding for the project was provided by the College of Engineering and Physical Science, University of Birmingham.

Your feedback is most welcome! Even more welcome are more examples, bug reports and bug fixes. If you are looking to make a more substantial contribution, the parser’s error reporting could be vastly improved.

Posted in General | Leave a comment

What things compute?

I suppose we can agree that computers compute. We know that they do that because we make them do that. We have a program, on paper or in our head, and we have certain facilities to load that program into the computer memory. This is called implementing the program. Then, when the program runs we can compare the actual behaviour of the physical computer with the expected behaviour defined by the abstract program. Note that this expected behaviour we can determine “by hand” via manual calculations, at least in principle. Programs in the abstract are mathematical objects, programs in the concrete are behaviours of physical devices we call computers. I suspect trained computer scientists will find very little controversial (or exciting) in the clarification above. This is the so-called “implementation theory” of computing, advocated among others by the influential cog-sci philosopher David Chalmers.

This definition is, I would say, sound, in the sense that the things and behaviours it describes are always computation. But is it complete? Are there notions of computation that fall outside of it? If we say “yes” then we should expand the definition somehow; let’s get back to that. But if we say “no”, then what are the consequences? Implementation theory assumes that the abstract program defines the concrete program, which is an instance of it. A computer is then a device that accelerates computation by some physical means. The crucial assumption is that the abstract program, as a mathematical object, requires design and therefore intent. Accordingly, only artificial things can compute. We cannot say, for example, that a brain computes. There seems to be something resembling “information processing” or “computation” in the way a brain behaves, but we don’t actually have the anchoring reference of an abstract program to compare it with and say that it is an implementation of it. This may seem like nit-picking but it’s a genuine problem.

Outside artificial things designed to compute, i.e. computers, what else computes? What is computation, if anything, as a natural phenomenon? I haven’t seen a good answer so far. I have seen however two classes of bad answers.

The first bad answer is something that has to do with a “system” having “inputs” and “outputs”. This is bad on two accounts, without even going into the deeper philosophical problems of causation as a scientific concept. The first problem is that any natural system has many things that can be considered “inputs” and “outputs”. For example, if we throw a rock and take the initial velocity vector as an input and its space-time coordinates as an output, we can say that it is a system that computes a parabola tangent to that vector. But there is necessarily intent in the choice of which measurements are considered inputs and outputs. Any object has a trajectory in a high-dimensional space of properties, and any projection of that trajectory onto a lower-dimensional space of measurements defines a different computation. There is no one special computation the system performs. If we take intent out of the equation we have an incomprehensibly large state space and an incomprehensibly complicated trajectory through it, tracking everything down to atomic-level configurations. The second problem is that such broad definitions quickly degenerate into pancomputationalism: everything computes. And if everything computes, computation is an uninteresting concept. The brain computes, but so does a piece of wood. Computation is simply another word for behaviour subject to (natural) laws. The Universe is a big computer, computing its own future: grand but somewhat vacuous.

The second bad answer is something that has to do with “processing symbolic information”. As in the case of selecting particular measurements of a system as inputs and outputs, the notion of “symbol” requires not only intent but also, outside of a formal setting, very special powers of abstraction: think about how many concrete shapes a symbol such as the letter ‘a’ can take. We know the human brain can process symbolic information, but we are far from certain that any other kind of brain can do that, so this definition rules them out as computing things. Also, the process of defining and recognising symbols is itself a prerequisite of computation, not part of computation itself, which strikes me as odd. Whereas the attempted input-output definition is too broad, this one is both too precise and too reliant on concepts that require further clarification. It is tantamount to saying “computation is what the human brain can do”. It does not explain the brain using computation; it uses the brain as an explanation for computation.

In conclusion I want to shatter our presumed consensus: computers do not compute. Not even computers. We compute; computers just help us along. Implementation theory gives a lucid definition of computation: a relation between a pre-existing mathematical concept and the behaviour of a physical thing. The pre-existence of the mathematical concept is essential. Trying to fit a mathematical model onto a pre-existing physical behaviour has a different name: scientific discovery. Where does this leave computationalism, the theory that, broadly speaking, the brain is a computer and the mind a computation? I think it exposes its vacuity. A “computational theory of the mind” is along the same lines as a “computational theory of the weather”, in which the Earth is seen as a computer and the weather what it computes. It is a metaphor that gives the illusion of understanding without telling us whether or not it will rain tomorrow, akin to using élan vital to “explain” life or scansoriality to “explain” a propensity for climbing.

But perhaps I am being harsh. “Computation” is vacuous only when used as an explanation, as a first-class scientific concept. But it is not vacuous as an abstract concept or as a characterisation of a style of mathematical model or as a paradigm. The parallel between energy/work as used in physics and information/computation as used in cognitive science is quite striking. In his Lectures on Physics, Feynman has this to say about energy and energy conservation principles [PDF]. I think similar considerations apply to information and computation:

[Conservation of Energy] is a most abstract idea, because it is a mathematical principle; it says that there is a numerical quantity which does not change when something happens. It is not a description of a mechanism, or anything concrete; it is just a strange fact that we can calculate some number and when we finish watching nature go through her tricks and calculate the number again, it is the same. [...]

It is important to realize that in physics today, we have no knowledge of what energy is. We do not have a picture that energy comes in little blobs of a definite amount. It is not that way. However, there are formulas for calculating some numerical quantity, and when we add it all together it gives “28”—always the same number. It is an abstract thing in that it does not tell us the mechanism or the reasons for the various formulas.

For a good survey of the problem of defining computation I recommend Piccinini’s chapter on Computationalism in the Oxford Handbook of Philosophy and Cognitive Science.

Posted in anticomputationalism | 2 Comments