
The hardware-software contrast carries over into the *semantics* of the two classes of languages. The semantics of programming languages is most commonly specified as syntactic transformation rules called *operational semantics*. For example, in any programming language there is a rule that says that the phrase “1+1” can be safely replaced with “2”. Of course, these rules can be very complicated, describing how functions are called, exceptions are managed, input-output interactions are handled, and so on. But the idea is that a *program* and some additional configuration information can be *reduced* to a “simpler” program with some (potentially different) configuration information.

On the other hand, the semantics of a circuit is most commonly given as sets of waveforms which describe directly the evolution in time of a measurable parameter at any chosen point in the circuit — a *simulation*, informally speaking. There are similar *behavioural* approaches to giving semantics to programming languages (*denotational semantics*) but they are not as widely used in applications (although they are very interesting things to study!).

What is quite surprising is that there is no equivalent of *operational semantics* for digital circuits — at least nothing I could find. The closest I can mention is the ground-breaking work by Mary Sheeran in the 1980s on using functional languages to specify circuit designs. As a part of this approach she specifies some *equations* for simplifying circuits, which can serve as part of an operational semantics — but more axioms are needed for that.

One thing that Sheeran’s work makes implicitly clear is that the conventional syntax of hardware languages is too unstructured to be useful for the formulation of axioms, and consequently for operational syntax-level modelling. Her Haskell-based syntax has the requisite mathematical structure so that axioms can be formulated. But how do we know what axioms are needed?

Circuits in general are diagrams of interconnected components (e.g. logical gates) and the way we like to think of them is *up to topological isomorphism*. This means that when we design a circuit we prefer not to think about the *geometry* of the interconnect (the wires' lengths and shapes) but only the topology (what connects to what). The geometric details (in particular the presence of very long wires, or especially very long parallel wires) are of course important for the physical realisation of the circuit, but at the *logical design* level we want to *abstract away* from geometric detail. Specialised *place-and-route* tools make sure that the physical realisation of a logical design does not violate the abstractions we use.

The wonderful thing about diagrams-up-to-topological-isomorphism is that they can usually be described very precisely in the mathematical language of *monoidal categories*. This gives us a good language, and a good set of equations to work with. Unsurprisingly, the categorical specification language for circuit diagrams looks a lot like Sheeran’s Haskell-based specifications — this similarity has a name, the *Curry-Howard-Lambek* correspondence. The extra thing that we get though is the equations — all the equations that we need in order to start reasoning syntactically about circuits!

In a paper, jointly authored with my colleague Achim Jung, about to be published at the conference Formal Methods in Computer-Aided Design, we pursue this observation and produce a categorical semantics of digital circuits. The main idea of the paper is that digital circuits can be shown to belong to a class of monoidal categories called *control categories*, which are (informally speaking) diagrams equipped with an unfoldable iterator. The rule of iterator unfolding can be expressed diagrammatically as:

So whenever we have a circuit *f* with feedback we can replicate the circuit, connect the two copies in a certain way, and get an *equivalent* circuit. This unfolding of the iterated circuit is the diagrammatic equivalent of the unfolding of a fixpoint combinator: *Yf = f(Yf)*. This equation provides the essential means of syntactic reasoning about feedback.
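For readers more comfortable with programs than diagrams, the same unfolding behaviour can be illustrated in OCaml with a call-by-value variant of the Y combinator (this example is mine, not from the paper; the eta-expansion is needed precisely because OCaml is call-by-value):

```ocaml
(* A call-by-value fixpoint combinator: y f behaves like f (y f).
   The eta-expansion (fun x -> ...) delays the recursive unfolding so
   that it does not diverge under call-by-value evaluation. *)
let rec y f = fun x -> f (y f) x

(* Example: factorial obtained as a fixed point of a one-step unfolding. *)
let fact = y (fun self n -> if n = 0 then 1 else n * self (n - 1))

let () = assert (fact 5 = 120)
```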

Note that we *prove* that circuits satisfy this equation, we don’t simply require it axiomatically. The proof relies on smaller, more obvious, axioms which circuits must satisfy, such as the interaction between constant circuits denoting fixed logical levels and *structural* circuit components such as splitting and joining wires.

How is this syntactic approach helpful? We were mainly motivated by filling this gap in syntactic reasoning about circuits, but it turns out interesting applications do exist. One such application is the so-called *combinational circuits with feedback*, where feedback loops are in some sense *fake*, i.e. they don’t create conventional feedback effects (latching, oscillation, etc.). Conventionally, digital design tools ban such circuits even though they can be very useful. Here is an example of a circuit which, informally speaking, behaves like *if c then F(G(x)) else G(F(x))* efficiently, while using only one instance each of circuits *F* and *G*:

The main example in our paper is to show how we can easily reason about such circuits without descending into complicated simulations.

The full paper is available from this link [PDF].

We recently had λ-calculus pioneer Prof. Dana Scott give a series of lectures in our School. He was a PhD student of Alonzo Church. In the introduction of one of his lectures he told us the story of the Greek letters used in the λ-calculus: the α of α-conversion, the β of β-expansion, the η of η-expansion and of course λ itself.

He says that when Church was asked what the meaning of the λ was, he just replied “*Eeny, meeny, miny, moe*”, which can only mean one thing: it was a random, meaningless choice. Prof. Scott claimed that the typographical origin myth was mainly propagated by Henk Barendregt and is just pure whimsy. He asked us to stop perpetuating this silly story.

I am attaching the link to the YouTube video with apologies for the low quality.

So what is “pure”? Languages which are not pure are commonly said to have “side-effects”. So conversely we could ask, what are “side-effects”? Changing the values of memory locations is perhaps universally considered a side-effect and, more generally, interacting with the “outside” world is pretty much considered a side-effect (see Haskell’s IO Monad). But there are other kinds of programming language features which do not involve, strictly speaking, an interaction with the outside world but are largely seen as effects, perhaps most importantly continuations. Finally, there is yet another kind of programming language feature which researchers disagree on whether it is an effect or not, for example non-determinism but especially non-termination. In his post on Agda, Neel has pointed out yet another feature which can be considered an effect: the nominal generativity of type declarations in languages such as Agda, also encountered in other languages.

This debate on what is and what is not an effect, and implicitly what is and what is not a pure language, is not entirely silly because the badge of purity is worn with pride by languages such as Haskell and Agda, so attributing effects to them is an insult, really. As an insult to Haskell, someone once said that it is not a pure language but that it operates in the “blessed monad” of non-termination and uncaught pattern-matching failures — which is actually 100% true.

The problem is that this endless is-it-an-effect-or-not controversy doesn’t bring us closer to an understanding of what is pure and what is effectful, so I will propose a definition of pure vs. impure which is very general, in fact it transcends lambda calculus or even programming languages in general. It is a systemic proposal of purity vs. impurity — in the process I also propose to discard the term “effects” which is silly, really, because any computation has some effect — as someone pointed out, any computation will get your computer slightly warmer. And if you think this is silly, let me tell you that there are side-channel attacks based on detecting temperature changes in processors. My proposal is also in line with my greater project to give semantics to systems rather than just syntax.

So here goes: suppose you design a system which, like all well-designed systems, has a certain equational theory. If your system is “the lambda calculus” then the key equations are alpha, beta and eta. These equations are of course essential in reasoning about your system. Now suppose that you realise that your system must be connected to other systems which do not observe the same equational theory — you have no choice. As I have argued elsewhere, any programming language, to be useful, requires a foreign function interface which allows it to interface with code written in a different, unknown, language. Your original system is the “pure” system and the external systems are the “impure” parts.

Programming languages (the civilised ones) are such extensions of the simply-typed lambda calculus. Add a fix-point operator, arithmetic and an if statement and you get PCF, but you also get non-termination and a previously-absent distinction between calling conventions (call-by-name and call-by-value have different equational theories). Is PCF pure? Relative to the simply-typed lambda calculus it is not, since it breaks its equational theory, but if you give it its own equational theory then it is pure. To PCF add local variables and you get Algol which, again, breaks PCF’s equational theory. Yet Algol is a very civilised language, with a very good equational theory.

So “pure” is an entirely relative term which we can meaningfully use to describe a system as equationally understood, and “impure” to denote an extension which doesn’t obey the same equations, making reasoning difficult or impossible. So I would say no language is “pure” but we have pure and impure extensions to any given reference language, which is always pure. Another way to think about impure extensions is simply as extrinsics or foreign functions.

An important role that type systems could (and should) serve is to delimit “pure” and “impure” code in the same program. By “impure” code I mean code that is observably influenced by impure features. As far as I know the only comprehensive attempt at this is John Reynolds’s Syntactic Control of Interference type system, which works on top of Algol syntax, to delimit “pure” code (i.e. without assignments), which he calls “passive”, from “impure” code (i.e. which relies on memory locations), which he calls “active”. I find this a much more reasonable approach to dealing with “effects”: simply isolating them, rather than trying to give types to these effects as is the standard “monadic” approach. As I argued elsewhere, this is simply too ambitious and unlikely to work when we interface with arbitrary systems, just because arbitrary systems are too messy and we have no right to expect them to be uniform enough to be typeable.

Here is what you will learn:

- The pure fragment of OCaml: functions, polymorphism, patterns, variants, lists, (tail) recursion, map and fold
- Basic algorithms and data types: list algorithms, searching, sorting, queues
- Basic proofs of program correctness using structural induction, applied to Peano numbers and lists and supported by our home-brewed proof assistant
- Some general theoretical concepts: isomorphisms, the Curry-Howard correspondence

On the web site you will find: lecture notes, lecture slides, lecture videos and over 30 programming assignments. The assignments come with automatic test harnesses so you can test your answers for correctness.

The course web site is accessible from https://birmingham.instructure.com/courses/15627.

I should specify that this is not a recipe for writing efficient or realistic compilers. This is an exercise in **understanding** what a compiler is. The second caveat is that the perspective presented here is quite common for functional or other higher-level programming languages but not quite as relevant for C, which is close enough to the concrete machine that an abstract machine is not really necessary. (Unless we consider LLVM-based compilation as using a virtual machine.)

The code used in this post is OCaml. Why? Because OCaml is the best programming language bar none for implementing compilers. If you don’t believe me feel free to translate or reimplement my code in your favourite language and see how it goes.

Let us consider this simple expression language:

e ::= n | x | e ⊙ e | let x = e in e | print e | read

where *n* stands for numeric constants, *x* for variable identifiers, and ⊙ for arbitrary arithmetic operators. We shall not deal with parsing at all as it is a totally separate topic, but we start from the abstract syntax tree, specified by this OCaml type:

type expression =
  | Const of int
  | Var of string
  | Binary of string * expression * expression
  | Let of string * expression * expression
  | Output of expression
  | Input

For example, the following expression

let z = (let x = read in let y = read in x + y) in print z

has the syntax tree:

Let ("z",
     Let ("x", Input,
          Let ("y", Input,
               Binary ("+", Var "x", Var "y"))),
     Output (Var "z"))

The first step in the formal definition of a programming language is giving its evaluation rules in an effective way. There are many ways to do this and they fall under the general topic of **operational semantics**, which can have several flavours (e.g. *big step*, *small step*, etc.) of which you can learn more in Andy Pitts’s excellent tutorial [link]. Without too much further theorising here is an obvious evaluator for our expression language:

let rec evaluate = function
  | Const n -> n
  | Binary (op, e, e') ->
      let v = evaluate e in
      let v' = evaluate e' in
      op_of_opcode op v v'
  | Let (x, e, e') ->
      let v = evaluate e in
      let e' = substitute x v e' in
      evaluate e'
  | Var x -> failwith ("undefined: " ^ x)
  | Output e ->
      let v = evaluate e in
      print_int v; print_newline ();
      v
  | Input -> read_int ()

Let’s discuss the code:

- A constant expression *Const n* is evaluated to just *n*.
- An operator *Binary* evaluates the two arguments then applies the operator. The helper function *op_of_opcode* decodes the operator symbol into an actual (OCaml) operation:

let op_of_opcode = function
  | "+" -> ( + )
  | "*" -> ( * )
  | ";" -> (fun x y -> ignore x; y)
  (* and so on *)

- The *Let* binder evaluates the definition *e* of *x* then replaces *x* with it in the body expression *e’* using a helper function to do the substitution:

let rec substitute (x : string) (v : int) = function
  | Const n -> Const n
  | Binary (op, e1, e2) -> Binary (op, substitute x v e1, substitute x v e2)
  | Let (y, e1, e2) when x = y -> Let (y, substitute x v e1, e2)
  | Let (y, e1, e2) -> Let (y, substitute x v e1, substitute x v e2)
  | Var y when x = y -> Const v
  | Var y -> Var y
  | Output e -> Output (substitute x v e)
  | Input -> Input

Note that substituting in a *Let* expression must ensure that the defined variable is *never* substituted, as that would violate the obvious scoping rules of a local definition, hence the special case.
- As a consequence, all *Var*iables in a program are substituted at the point of definition, so we shouldn’t encounter them. If we do, they must have been undefined, so we raise a failure.
- *Output* evaluates its argument, prints it and returns it as a result. Since we are in an expression language we always need to produce some kind of integer result and this seems sensible.
- *Input* reads an integer and returns it.

This is a reasonable *definition* of our language’s semantics but it is rather impractical for two reasons:

- In order for this evaluator to execute the program it needs to modify the abstract syntax tree during the evaluation. This is clearly a bad idea for several reasons:
- Substitution or any other kind of AST manipulation will be a complicated operation which is unlikely to be easily mapped into smaller, machine-friendly operations.
- Substitution or any other kind of AST manipulation will always be inefficient in the extreme.
- Substitution is incompatible with using libraries, foreign-function interfaces, or runtime services.

- In order for the evaluator to execute a program it relies on the full power of the meta-language, in our case OCaml. Why is this a problem? We know that eventually we want our program to be executed by some machine, and it seems like a bad idea to assume that machine will be able to run OCaml, which is a big and complicated language requiring a sophisticated run-time environment. We need something simpler.

A compositional evaluator

We first remedy the problem of modifying the AST by giving a *compositional* evaluator in which the meaning of a term is defined from the meanings of its subterms. This is the key principle of *denotational semantics* which goes back to Scott and Strachey for programming languages and to Tarski for logics. To implement it we will add an *environment* to our evaluator, which is a dictionary mapping variable identifiers to values:

let rec cevaluate env = function
  | Const n -> n
  | Binary (op, e, e') ->
      let n = cevaluate env e in
      let n' = cevaluate env e' in
      op_of_opcode op n n'
  | Let (x, e, e') ->
      let env = (x, cevaluate env e) :: env in
      cevaluate env e'
  | Var x -> lookup x env
  | Output e ->
      let v = cevaluate env e in
      print_int v; print_newline ();
      v
  | Input -> read_int ()

The helper *lookup* function is:

let rec lookup x = function
  | [] -> failwith ("undefined: " ^ x)
  | (x', n) :: env when x = x' -> n
  | _ :: env -> lookup x env

Except for the treatment of variables, which are looked up rather than substituted, the two evaluators are identical. A mathematically-minded reader could *prove* (e.g. in Agda or Coq) some soundness results, e.g.

∀M.∀n. evaluate M = n ⇒ cevaluate [] M = n

Note that this is not entirely a trivial exercise because of the variable scoping rules! Also, keeping track of effects (input and output) presents some further formal challenges.
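As a quick sanity check (emphatically not a proof), one can at least test that the two evaluators agree on sample closed terms. The sketch below restricts the language to its input/output-free fragment and lightly compresses the definitions from above; the sample term is my own illustrative choice:

```ocaml
(* The expression language restricted to its pure fragment. *)
type expression =
  | Const of int
  | Var of string
  | Binary of string * expression * expression
  | Let of string * expression * expression

let op_of_opcode = function
  | "+" -> ( + )
  | "*" -> ( * )
  | _ -> failwith "unknown operator"

(* Substitution-based evaluator. *)
let rec substitute x v = function
  | Const n -> Const n
  | Binary (op, e1, e2) -> Binary (op, substitute x v e1, substitute x v e2)
  | Let (y, e1, e2) when x = y -> Let (y, substitute x v e1, e2)
  | Let (y, e1, e2) -> Let (y, substitute x v e1, substitute x v e2)
  | Var y -> if x = y then Const v else Var y

let rec evaluate = function
  | Const n -> n
  | Binary (op, e, e') -> op_of_opcode op (evaluate e) (evaluate e')
  | Let (x, e, e') -> evaluate (substitute x (evaluate e) e')
  | Var x -> failwith ("undefined: " ^ x)

(* Environment-based (compositional) evaluator. *)
let rec lookup x = function
  | [] -> failwith ("undefined: " ^ x)
  | (x', n) :: env -> if x = x' then n else lookup x env

let rec cevaluate env = function
  | Const n -> n
  | Binary (op, e, e') -> op_of_opcode op (cevaluate env e) (cevaluate env e')
  | Let (x, e, e') -> cevaluate ((x, cevaluate env e) :: env) e'
  | Var x -> lookup x env

(* let x = 2 in let y = 3 in x * (x + y)  --  evaluates to 10 *)
let sample =
  Let ("x", Const 2,
       Let ("y", Const 3,
            Binary ("*", Var "x", Binary ("+", Var "x", Var "y"))))

let () = assert (evaluate sample = cevaluate [] sample)
let () = assert (evaluate sample = 10)
```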

The second step in our process is to try to write an evaluator restrained to particular operations. What operations? That depends on what our ultimate target architecture might be. We don’t necessarily want to use at this stage a genuine concrete machine specification (e.g. x86) because it would be extremely complicated, albeit doable — barely. What we want is an intermediate layer of abstraction. What we want is a set of operations that we believe can be refined efficiently into concrete machine operations. What we want is an *abstract machine.*

For functional languages the standard abstract machines are Krivine for call-by-name evaluation and SECD for call-by-value, but other abstract machines exist, for example the *token abstract machine* used to interpret linear logic [link]. Designing abstract machines is a fascinating and fun exercise, a beautiful mix of engineering and mathematics. They truly are the nexus between *language* and *machine* which makes compilation of higher-level languages possible.

For purpose of illustration I will use a totally made-up and very simple abstract machine, the PSM (*pure stack machine*). It is a stack equipped with some operations:

- *push* a new element onto the stack
- *pop* the top of the stack
- *peek* at the n-th element of the stack
- *apply* an operation to the top two elements of the stack and leave the result instead
- *swap* the top two elements of the stack
- *output* the top of the stack
- *input* at the top of the stack

The OCaml implementation is this:

let stack = ref []
let push x = stack := (x :: !stack)
let pop () = stack := List.tl !stack
let peek n = stack := (List.nth !stack n :: !stack)
let apply op = stack := (function x :: x' :: xs -> (op x x') :: xs) !stack
let swap () = stack := (function x :: x' :: xs -> x' :: x :: xs) !stack
let output () = print_int (List.hd !stack); print_newline ()
let input () = stack := (read_int () :: !stack)

When we implement the abstract machine we purposefully move into the *imperative* world because machines are naturally imperative. Machines have a *state* which is changed by its operations. In our case this state is simply a global stack. This presentation can be made even nicer, at the cost of some sophistication, by packaging the stack as a functor which would hide all the implementation details from the user. I leave this as an exercise for the sake of a more elementary description.
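The packaging mentioned above could look something like the following sketch — a plain module behind a signature rather than a full functor, with an extra *contents* inspection function added purely for testing; all interface names here are my own illustrative choices:

```ocaml
(* The global stack hidden behind a signature so that clients can only
   use the machine operations. "contents" is an inspection function
   added here for testing only. *)
module type PSM = sig
  val push : int -> unit
  val pop : unit -> unit
  val peek : int -> unit
  val apply : (int -> int -> int) -> unit
  val swap : unit -> unit
  val output : unit -> unit
  val input : unit -> unit
  val contents : unit -> int list
end

module Psm : PSM = struct
  let stack = ref []
  let push x = stack := x :: !stack
  let pop () = stack := List.tl !stack
  let peek n = stack := List.nth !stack n :: !stack
  let apply op =
    match !stack with
    | x :: x' :: xs -> stack := op x x' :: xs
    | _ -> failwith "stack underflow"
  let swap () =
    match !stack with
    | x :: x' :: xs -> stack := x' :: x :: xs
    | _ -> failwith "stack underflow"
  let output () = print_int (List.hd !stack); print_newline ()
  let input () = stack := read_int () :: !stack
  let contents () = !stack
end

(* 6 + 3 computed on the machine, leaving [9] on the stack. *)
let () =
  Psm.push 6; Psm.push 3; Psm.apply ( + );
  assert (Psm.contents () = [9])
```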

We are now in a position to write an *interpreter* for this abstract machine, i.e. a *compositional evaluator* which will only use the abstract machine operations:

let rec interpret symt = function
  | Const n -> push n
  | Binary (op, e, e') ->
      interpret symt e;
      interpret (inc symt) e';
      apply (op_of_opcode op)
  | Let (x, e, e') ->
      interpret symt e;
      interpret ((x, 0) :: (inc symt)) e';
      swap (); pop ()
  | Var x -> let n = lookup x symt in peek n
  | Output e -> interpret symt e; output ()
  | Input -> input ()

Let’s discuss the code:

- A constant expression is interpreted by pushing the constant onto the stack.
- A binary expression will interpret the operands and, assuming that the two values are left on top of the stack, will apply the operation.
- A let expression will interpret the definition *e*, which will be left at the top of the stack, then interpret the body *e’* remembering the stack offset of the value for *x*. We want only the value of the body to remain on the stack, so we swap the value of *x*, which is in the second position from the top, then pop it out.
- Variables are looked up at their known stack offset.
- Input and output are obvious.

The interpreter uses an extra argument, a symbol table mapping variable identifiers to stack offsets, maintained with the helper function

let inc = List.map (fun (x, n) -> (x, n+1))

All the “execution” is now done in terms of abstract machine operations, even though we are using a few other operations for managing the symbol table. That is not a serious problem. In fact this mapping between variable identifiers and stack indices could be considered part of the parsing process.

At this point we have our *interpreter!*

The mathematically minded reader could now think about proving some *soundness* results about this interpreter, which even for a small language such as this one is quite a (formal, if not mathematical) effort:

∀M.∀n. evaluate M = n ⇒ run M = [n]

where

let run m = stack := []; interpret [] m; !stack

which means running the interpreter leaves a singleton stack containing the evaluated value.

Once we have the interpreter, the compiler is surprisingly easy to implement. The interpreter *runs* the abstract machine instructions as it processes the program. A compiler will simply *collect*, in a lazy way, the instructions (and save them to a file)!

let rec compile symt = function
  | Const n -> "push " ^ (string_of_int n) ^ "\n"
  | Binary (op, e, e') ->
      (compile symt e) ^ (compile (inc symt) e') ^
      "apply " ^ op ^ "\n"
  | Let (x, e, e') ->
      (compile symt e) ^ (compile ((x, 0) :: (inc symt)) e') ^
      "swap\n" ^ "pop\n"
  | Var x -> let n = lookup x symt in "peek " ^ (string_of_int n) ^ "\n"
  | Output e -> (compile symt e) ^ "output\n"
  | Input -> "input\n"

The code is strikingly similar. We could (almost) textually replace sequential composition with concatenation, and each call to a function implementing an abstract machine instruction with the name of the instruction itself. I have renamed both functions to *f* and placed the interpreter and the compiler side-by-side:

For illustration let’s look at one test expression, *m = let x = 6 in let y = 3 in x * y*:

# m;;
- : expression =
Let ("x", Const 6, Let ("y", Const 3, Binary ("*", Var "y", Var "x")))
# evaluate m;;
- : int = 18
# cevaluate [] m;;
- : int = 18
# run m;;
- : int list = [18]
# compile [] m |> print_string;;
push 6
push 3
peek 0
peek 2
apply *
swap
pop
swap
pop
- : unit = ()

The very last step is processing the output of the compiler, the *bytecode*. Here we have two choices:

- The bytecode can be interpreted by what is usually called a *virtual machine*, which runs the bytecode on some concrete machine in the same way that our interpreter ran the original program on the abstract machine. Note that the bytecode is just a simpler programming language, but the principle remains the same.
- The bytecode can itself be compiled, just by tracing the execution of the *virtual machine*, in the same way in which we traced the execution of the abstract machine to derive the object-language compiler. Again, the principle is the same.
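To make the first option concrete, here is a minimal sketch of such a bytecode interpreter for the textual PSM instructions emitted by *compile* (the input/output instructions are omitted for simplicity, and all names are my own illustrative choices):

```ocaml
(* A minimal "virtual machine": interprets the textual PSM bytecode
   produced by compile and returns the final stack. *)
let op_of_opcode = function
  | "+" -> ( + )
  | "*" -> ( * )
  | _ -> failwith "unknown operator"

let step stack line =
  match String.split_on_char ' ' line with
  | ["push"; n] -> int_of_string n :: stack
  | ["pop"] -> List.tl stack
  | ["peek"; n] -> List.nth stack (int_of_string n) :: stack
  | ["swap"] ->
      (match stack with
       | x :: x' :: xs -> x' :: x :: xs
       | _ -> failwith "stack underflow")
  | ["apply"; op] ->
      (match stack with
       | x :: x' :: xs -> op_of_opcode op x x' :: xs
       | _ -> failwith "stack underflow")
  | _ -> failwith ("unknown instruction: " ^ line)

let run_bytecode code =
  String.split_on_char '\n' code
  |> List.filter (fun l -> l <> "")
  |> List.fold_left step []

(* The bytecode compiled earlier from: let x = 6 in let y = 3 in x * y *)
let () =
  assert (run_bytecode
            "push 6\npush 3\npeek 0\npeek 2\napply *\nswap\npop\nswap\npop\n"
          = [18])
```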

Compiler development is still, unfortunately, somewhat of a black art. I hope this post was informative, and especially I hope that the methodological importance of the abstract/virtual machine came through. This is indeed the key concept that makes compilers work. A good abstract/virtual machine must be such that the object language can be interpreted/compiled by/to it efficiently and its own instructions, the bytecode, must be interpreted/compiled efficiently by/to the concrete hardware platform targeted.

I also wanted to illustrate the role that operational semantics (the non-compositional evaluator) and denotational semantics (the compositional evaluator) can play in rational compiler development, and how they can be gradually refined into interpreters and compilers. I hope that made sense!

One day a panel didn’t fit properly. It was off by a few millimeters, nothing that my guy could not fix in a few minutes of careful sawing and sanding. To his surprise what happened was very different. That chassis was taken to one side and work on it was suspended for days while a whole bunch of engineers tried to figure out not **how** to fix the problem but **why** the problem happened in the first place. My friend, the assembly line worker, put it down to the stupidity of engineers and the bureaucracy of management. In the end he was not informed how the problem was solved and what the investigation uncovered.

My friend was quite obviously an accomplished **hacker**. His hacker mentality would have fared quite well in software production, where **debugging** is part of the normal workflow, but was out of place in a conventional engineering setting. His story made me reflect on the huge gap between car and software engineering.

I wonder how many years until software production will reach the level of maturity where we don’t just fix bugs but, when bugs happen, we stop in our tracks, shocked that bugs happened at all in the first place, and only proceed after we had a serious look at our processes. I wonder how many years until our software will look like this

rather than like this

The validity of a scientific theory rests on the correspondence between a mathematical model and measurements made of the physical world. The same relationship between an abstract, essentially mathematical model, and the physical world, usually a special-purpose device, lies also at the heart of computation. This relation, called **the implementation relation** by Chalmers, was first introduced in the context of trying to figure out what things can be said to compute. For most Computer Scientists this is not particularly exciting or interesting because of our Engineering-minded approach. We build computers so that they satisfy an implementation relation. In fact we rarely distinguish between “abstract” mathematical computation and “concrete” physical computation. We casually deal in “abstract machines”, such as the Turing Machine, without worrying too much whether we are talking about its mathematical definition or an actual physical instance. We just know that the latter can be easily built so that it implements the former.

However for those interested in *Natural Computing*, the implementation relation is far more interesting because an implementation relation is **discovered** rather than constructed. Some naturally occurring phenomenon (the behaviour of slime mold, for example) can be used to **calculate** with because it **implements** a complex computation. Natural Computing works because such phenomena, implementing complex computations, exist. In some trivial sense any physical system can be used to compute facts about its own mathematical model, although that is hardly interesting. Natural Computing is interesting because particular physical systems can compute **fast** solutions to problems of general interest, which can actually encode the evolutions of other physical systems that evolve much slower or that are complicated or expensive to build.

What is truly fascinating is that no physical system can implement an abstract model which is mathematically uncomputable. For me this is indeed the greatest law of physics in terms of its generality, as it relates **any** abstract mathematical model to **any** physical system. The non-implementability of uncomputable models is an impossibility result of a staggering magnitude. And it is truly a falsifiable (in the logical, Popperian sense) scientific law. Scientists, most notably Penrose, suggest that some neuronal processes in the brain may be physical yet uncomputable, but the evidence for this is very sketchy. More generally, philosophers propose that the brain is an instance of a physical yet non-computable process, mainly based on interpretations of Goedel’s incompleteness theorem which I find rather naive. But at least this shows that we can logically conceive of such systems.

In fact I would say that not just computability in the sense of Church-Turing, but the whole of computability theory, including complexity analysis, is essentially formulating extremely general laws of physics between whole classes of formal models and any physical system which might implement them. Computer **Science** is truly deserving of its name!

append xs [] = xs
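As a point of contrast with proving, the equation above can of course be tested (but not proved) automatically. A minimal OCaml sketch, where the *append* definition and the random generator are my own illustrative choices:

```ocaml
(* An illustrative recursive definition of append. *)
let rec append xs ys =
  match xs with
  | [] -> ys
  | x :: rest -> x :: append rest ys

(* Randomly test (not prove!) that [] is a right unit for append. *)
let random_list () = List.init (Random.int 10) (fun _ -> Random.int 100)

let () =
  for _ = 1 to 100 do
    let xs = random_list () in
    assert (append xs [] = xs)
  done
```

Random tests like this catch most wrong answers, but only a proof by structural induction establishes the property for all lists.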

The practical problem with teaching a large group of students is that you end up with a large pile of assignments to check, which means a large number of teaching assistants of various degrees of competence in reading and understanding proofs. Note that for programming assignments we didn’t have this problem because we could use automated random unit testing. Pedagogically, having inconsistent marking standards is not only irritating to students but potentially devastating in the final exam if TAs habitually accept wrong proofs as correct in the assignments. The final exam is marked by me, and I may not be as blind to mistakes.

One possibility is to use proof assistants such as Coq, Agda or Isabelle, but they are a bit too intricate for first year students who are just learning the basics of proofs, logic, recursion and induction. What we want is a proof checker tailored to the needs of the beginning student. These were our design principles:

- The syntax of production-level proof assistants aims for concision, elegance and expressiveness. Our aim was a syntax that matched as much as possible the informal mathematical vernacular that students are used to from pen-and-paper proofs.
- Production-level proof assistants have various levels of interaction and automation which are not only useless to the beginning student but actively get in the way of learning proof techniques, because they may encourage a trial-and-error approach to proofs. Also, very simple fragments, such as pure propositional logic, may be entirely automated in such proof assistants.
- Even production-level proof assistants can be annoyingly clumsy when it comes to writing equational proofs. However, most of the proofs that we ask our students to write are equational. Congruence reasoning comes very easily to students, but doing it formally in proof assistants is a pain. This aspect is something we wanted automated.
- We also need a generally high level of user-friendliness.

The source code, documentation and a range of examples from the simplest to moderately complicated are available on Github. Here I am just going to give the proof for the very simple example above. If you find it appealing I recommend you read the full documentation on-line.

Signatures:
  append : 'a list -> 'a list -> 'a list ;

Definitions:
  [append nil]: forall xs:'a list.
    append [] xs = xs : 'a list ;
  [append ys]: forall xs:'a list. forall ys:'a list. forall y:'a.
    append (y::ys) xs = y :: append ys xs : 'a list ;

Theorem [append neutral]:
Statement: forall xs:'a list. append xs [] = xs :'a list
Proof:
by induction on list :
case []:
  we know [0]: append [] [] = [] :'a list
    because [append nil] with ([]) .
  by [0]
case (hd :: tl):
  [IH]: append tl [] = tl : 'a list .
  we know [1]: append (hd :: tl) [] = hd :: (append tl []) :'a list
    because [append ys] with ([], tl, hd) .
  we know [2]: append (hd :: tl) [] = hd :: tl : 'a list
    because equality on ([1], [IH]) .
  by [2]
QED.
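The two defining equations and the shape of the induction have a direct computational reading. Here is a minimal Python sketch (the names are mine, not part of the tool) in which the proof’s two cases reappear as the two branches of a structurally recursive check:

```python
def append(xs, ys):
    # [append nil]: append [] ys = ys
    if not xs:
        return ys
    # [append ys]: append (hd::tl) ys = hd :: append tl ys
    hd, tl = xs[0], xs[1:]
    return [hd] + append(tl, ys)

def check_append_neutral(xs):
    # case []: append [] [] = [] by [append nil]
    if not xs:
        return append([], []) == []
    hd, tl = xs[0], xs[1:]
    # case (hd :: tl):
    #   [1] append (hd::tl) [] = hd :: append tl []  by [append ys]
    #   [2] hd :: append tl [] = hd :: tl            by [IH] and congruence
    return (check_append_neutral(tl)                        # [IH]
            and append(xs, []) == [hd] + append(tl, [])     # [1]
            and [hd] + append(tl, []) == [hd] + tl == xs)   # [2]

assert check_append_neutral([1, 2, 3])
```

Of course, running this only confirms the theorem on particular lists, which is exactly the gap between random testing and proof that motivated the tool in the first place.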

A comparison with the Agda equivalent is perhaps a good illustration of how we trade succinctness for readability. Note especially the need for an explicit congruence principle in the Agda proof. The list data type and append (++) are pre-defined. (Coq is not really comparable because it is tactics-based.)

cong-∷ : {A : Set} → (x : A) → ∀ xs ys → xs ≡ ys → x ∷ xs ≡ x ∷ ys
cong-∷ x xs .xs refl = refl

unitr[] : {A : Set} → (xs : List A) → xs ++ [] ≡ xs
unitr[] [] = refl
unitr[] (x ∷ xs) = Goal
  where
    p : xs ++ [] ≡ xs
    p = unitr[] xs
    Goal : x ∷ (xs ++ []) ≡ x ∷ xs
    Goal = cong-∷ x (xs ++ []) xs p

The author of the tool is my intern student Yu-Yang Lin, with essential supervision and help from Neel Krishnaswami. Funding for the project was provided by the College of Engineering and Physical Science, University of Birmingham.

Your feedback is most welcome! Even more welcome are more examples, bug reports and bug fixes. If you are looking to make a more substantial contribution, the reporting of parsing errors could be vastly improved.

This definition is, I would say, *sound* in the sense that things and behaviours described by it are always computation. But is it *complete*? Are there notions of computation that fall outside of it? If we say “no”, it is not complete, then we should expand the definition somehow; let’s get back to that. But if we say “yes”, then what are the consequences? Implementation theory assumes that the abstract program defines the concrete program, which is an instance of it. A computer is then a device that accelerates computation by some physical means. The crucial assumption is that the abstract program, as a mathematical object, requires design and therefore intent. Accordingly, only artificial things can compute. We cannot say, for example, that *a brain computes*. There seems to be something that resembles “information processing” or “computation” in the way a brain behaves, but we don’t actually have the anchoring reference that is the abstract program to compare it with and say that it is an implementation of it. This may seem like nit-picking, but it is a genuine problem.

Outside artificial things designed to compute, i.e. computers, what else *computes*? What is computation, if anything, as a *natural phenomenon*? I haven’t seen a good answer so far. I have, however, seen two classes of bad answers.

The first bad answer is something that has to do with a “system” having “inputs” and “outputs”. This is bad on two accounts, without even going into the deeper philosophical problems of causation as a scientific concept. The first problem is that any natural system has many things that can be considered as “inputs” and “outputs”. For example, if we throw a rock, and we take the initial velocity vector as an input and its space-time coordinates as an output, we can say that it is a system that computes a parabola tangent to that vector. But the choice of which measurements are considered as inputs and outputs necessarily involves intent. Any object has a trajectory in a highly-dimensional space of properties, and any projection of that trajectory onto a lower-dimensional space of measurements defines a different computation. There is no one special computation the system performs. If we take intent out of the equation we have an incomprehensibly large state space and an incomprehensibly complicated trajectory through it, tracking everything down to atomic-level configurations.

The second problem is that such broad definitions can quickly degenerate into *pancomputationalism*: everything computes. And, if everything computes, computation is an uninteresting concept. The brain computes, but so does a piece of wood. Computation is simply another word for behaviour subject to (natural) laws. The Universe is a big computer, computing its own future: grand but somewhat vacuous.

The second bad answer is something that has to do with “processing symbolic information”. As in the case of selecting particular measurements of a system as inputs and outputs, the notion of “symbol” requires not only intent but also, outside of a formal setting, very special powers of abstraction: think about how many concrete shapes a symbol such as the letter ‘a’ can take. We know the human brain can process symbolic information, but we are far from certain that any other kind of brain can do so, so this definition rules them out as computing things. Also, the process of defining and recognising symbols is itself a prerequisite of computation, but not part of computation itself, which strikes me as odd. Whereas the attempted input-output definition is too broad, this one is both too precise and too reliant on concepts that require further clarification. It is tantamount to saying “computation is what the human brain can do”. It does not explain the brain using computation; it uses the brain as an explanation for computation.

In conclusion, I want to shatter our presumed consensus: computers do not compute. Not even computers. *We compute.* Computers just help us along. Implementation theory gives a lucid definition of *computation*: a relation between a pre-existing mathematical concept and the behaviour of a physical thing. The pre-existence of the mathematical concept is essential. Trying to fit a mathematical model onto a pre-existing physical behaviour has a different name: scientific discovery. Where does this leave *computationalism*, the theory that, broadly speaking, the brain is a computer and the mind a computation? I think it exposes its vacuity. A “computational theory of the mind” is along the same lines as a “computational theory of the weather”, in which the Earth is seen as a computer and the weather what it computes. It is a metaphor that gives the illusion of understanding, without telling us whether or not it will rain tomorrow, akin to using élan vital to “explain” life or scansoriality to “explain” a propensity for climbing.

But perhaps I am being harsh. “Computation” is vacuous only when used as an explanation, as a first-class scientific concept. But it is not vacuous as an abstract concept, as a characterisation of a style of mathematical model, or as a paradigm. The parallel between *energy/work* as used in physics and *information/computation* as used in cognitive science is quite striking. In his *Lectures on Physics*, Feynman has this to say about energy and energy conservation principles [PDF]. I think similar considerations apply to information and computation:

[Conservation of energy] is a most abstract idea, because it is a mathematical principle; it says that there is a numerical quantity which does not change when something happens. It is not a description of a mechanism, or anything concrete; it is just a strange fact that we can calculate some number and when we finish watching nature go through her tricks and calculate the number again, it is the same. […] It is important to realize that in physics today, we have no knowledge of what energy is. We do not have a picture that energy comes in little blobs of a definite amount. It is not that way. However, there are formulas for calculating some numerical quantity, and when we add it all together it gives “28” — always the same number. It is an abstract thing in that it does not tell us the mechanism or the reasons for the various formulas.

For a good survey of the problem of defining *computation*, I recommend Piccinini’s chapter on *Computationalism* in the *Oxford Handbook of Philosophy and Cognitive Science*.