Learn OCaml!

I have been teaching OCaml to first year undergraduate students for a few years now. Why not share it with the world? To access the course follow the link at the bottom of the page.

Here is what you will learn:

  1. The pure fragment of OCaml: functions, polymorphism, patterns, variants, lists, (tail) recursion, map and fold
  2. Basic algorithms and data types: list algorithms, searching, sorting, queues
  3. Basic proofs of program correctness using structural induction, applied to Peano numbers and lists and supported by our home-brewed proof assistant Yup
  4. Some general theoretical concepts: isomorphisms, the Curry-Howard correspondence

On the web site you will find: lecture notes, lecture slides, lecture videos and over 30 programming assignments. The assignments come with automatic test harnesses so you can test your answers for correctness.

The course web site is accessible from https://birmingham.instructure.com/courses/15627.

Posted in General

Compilers as lazy denotational interpreters

This year I had the pleasure to teach, for the first time, a course on compilers. I have been doing research on compilers for a while now but I suppose I never really asked myself seriously “what is a compiler?”. I could recognize a compiler if I saw one, but this is not good enough for teaching. One paper that gives an excellent conceptual explanation is Ager et al.’s [PDF], but it might be a little hard to read if you are not familiar with abstract machines (Krivine, SECD) and with Standard ML syntax. In this post I will follow the same methodology, but applied to a very simple expression language, which has only arithmetic, input, output and variable binding, and to a very simple abstract machine. This will make it clearer, I hope, what an evaluator, an interpreter and a compiler are, and how they are connected to the key notions of abstract and virtual machine.

I should specify that this is not a recipe for writing efficient or realistic compilers. This is an exercise in understanding what a compiler is. The second caveat is that the perspective presented here is quite common for functional or other higher-level programming languages, but not quite as relevant for C, which is close enough to the concrete machine that an abstract machine is not really necessary. (Unless we consider LLVM-based compilation as using a virtual machine.)

The code used in this post is OCaml. Why? Because OCaml is the best programming language bar none for implementing compilers. If you don’t believe me feel free to translate or reimplement my code in your favourite language and see how it goes.

The object language

Let us consider this simple expression language:

e ::= n | x | e ⊙ e | let x = e in e | print e | read

where n stands for numeric constants, x for variable identifiers, and ⊙ for arbitrary arithmetic operators. We shall not deal with parsing at all as it is a totally separate topic, but we start from the abstract syntax tree, specified by this OCaml type:

type expression =
 | Const of int
 | Var of string
 | Binary of string * expression * expression
 | Let of string * expression * expression
 | Output of expression
 | Input

For example, the following expression

let z = (let x = read in let y = read in x + y) in print z

has the syntax tree:

Let ("z", 
    Let ("x", Input, (Let ("y", Input, 
        (Binary ("+", Var "x", Var "y"))))), 
Output (Var "z"))

The evaluator

The first step in the formal definition of a programming language is giving its evaluation rules in an effective way. There are many ways to do this and they fall under the general topic of operational semantics, which can have several flavours (e.g. big-step, small-step, etc.) of which you can learn more in Andy Pitts’s excellent tutorial [link]. Without too much further theorising, here is an obvious evaluator for our expression language:

let rec evaluate = function
 | Const n -> n
 | Binary (op, e, e') ->
   let v = evaluate e in
   let v' = evaluate e' in
   op_of_opcode op v v'
 | Let (x, e, e') ->
   let v = evaluate e in 
   let e' = substitute x v e' in
   evaluate e'
 | Var x -> failwith ("undefined: " ^ x)
 | Output e ->
   let v = evaluate e in
   print_int v; print_newline ();
   v
 | Input -> read_int ()

Let’s discuss the code:

  • A constant expression Const n is evaluated to just n.
  • An operator Binary evaluates the two arguments then applies the operator. The helper function op_of_opcode decodes the operator symbol into an actual (OCaml) operation:
    let op_of_opcode = function
     | "+" -> ( + ) 
     | "*" -> ( * )
     | ";" -> (fun x y -> ignore x; y)
     | op -> failwith ("unknown operator: " ^ op) (* and so on *)
  • The Let binder evaluates the definition of x then replaces x with it in the body expression e’ using a helper function to do the substitution:
    let rec substitute (x : string) (v : int) = function
     | Const n -> Const n
     | Binary (op, e1, e2) -> Binary (op, substitute x v e1, substitute x v e2)
     | Let (y, e1, e2) when x = y -> Let (y, substitute x v e1, e2)
     | Let (y, e1, e2) -> Let (y, substitute x v e1, substitute x v e2)
     | Var y when x = y -> Const v
     | Var y -> Var y
     | Output e -> Output (substitute x v e)
     | Input -> Input

    Note that substituting in a Let expression must ensure that the defined variable is never substituted as it would violate the obvious scoping rules of a local definition, hence the special case.

  • As a consequence, all variables in a program are substituted at their point of definition, so we shouldn’t encounter them. If we do, they must have been undefined, so we raise a failure.
  • Output evaluates its argument, prints it and returns it as a result. Since we are in an expression language we always need to produce some kind of integer result and this seems sensible.
  • Input reads an integer and returns it.
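
As a quick sanity check, here is how the evaluator runs on the example expression from above (a hypothetical snippet, not from the original post; the two integers are read from standard input):

let example =
  Let ("z",
      Let ("x", Input, Let ("y", Input, Binary ("+", Var "x", Var "y"))),
  Output (Var "z"))

(* evaluate example reads two integers, say 6 and 3, prints 9 and returns 9 *)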

This is a reasonable definition of our language’s semantics but it is rather impractical for two reasons:

  1. In order for this evaluator to execute the program it needs to modify the abstract syntax tree during the evaluation. This is clearly a bad idea for several reasons:
    1. Substitution or any other kind of AST manipulation will be a complicated operation which is unlikely to be easily mapped into smaller, machine-friendly operations.
    2. Substitution or any other kind of AST manipulation will always be inefficient in the extreme.
    3. Substitution is incompatible with using libraries, foreign-function interfaces, or runtime services.
  2. In order for the evaluator to execute a program it relies on the full power of the meta-language, in our case OCaml. Why is this a problem? We know that eventually we want our program to be executed by some machine, and it seems like a bad idea to assume that machine will be able to run OCaml, which is a big and complicated language requiring a sophisticated run-time environment. We need something simpler.

A compositional evaluator

We first remedy the problem of modifying the AST by giving a compositional evaluator in which the meaning of a term is defined from the meanings of its subterms. This is the key principle of denotational semantics which goes back to Scott and Strachey for programming languages and to Tarski for logics. To implement it we will add an environment to our evaluator, which is a dictionary mapping variable identifiers to values:

let rec cevaluate env = function
 | Const n -> n
 | Binary (op, e, e') ->
   let n = cevaluate env e in
   let n' = cevaluate env e' in
   op_of_opcode op n n'
 | Let (x, e, e') ->
   let env = (x, cevaluate env e) :: env in 
   cevaluate env e'
 | Var x -> lookup x env
 | Output e ->
   let v = cevaluate env e in
   print_int v; print_newline ();
   v
 | Input -> read_int ()

The helper lookup function is:

let rec lookup x = function
 | [] -> failwith ("undefined: " ^ x)
 | (x', n) :: env when x = x' -> n
 | _ :: env -> lookup x env

Except for the treatment of variables, which are looked up rather than substituted, the two evaluators are identical. A mathematically-minded reader could prove (e.g. in Agda or Coq) some soundness results, e.g.

∀M.∀n. evaluate M = n ⇒ cevaluate [] M = n

Note that this is not entirely a trivial exercise because of the variable scoping rules! Also, keeping track of effects (input and output) presents some further formal challenges.

Abstract machine and interpretation

The second step in our process is to try to write an evaluator restricted to a particular set of operations. What operations? That depends on what our ultimate target architecture might be. We don’t necessarily want to use at this stage a genuine concrete machine specification (e.g. x86) because it would be extremely complicated, albeit doable — barely. What we want is an intermediate layer of abstraction. What we want is a set of operations that we believe can be refined efficiently into concrete machine operations. What we want is an abstract machine.

For functional languages the standard abstract machines are Krivine for call-by-name evaluation and SECD for call-by-value, but other abstract machines exist, for example the Token abstract machine used to interpret linear logic [link]. Designing abstract machines is a fascinating and fun exercise, a beautiful mix of engineering and mathematics. They truly are the nexus between language and machine which make compilation of higher-level languages possible. 

For the purpose of illustration I will use a totally made-up and very simple abstract machine, the PSM (pure stack machine). It is a stack equipped with some operations:

  • push a  new element onto the stack
  • pop the top of the stack
  • peek at the n-th element of the stack
  • apply an operation to the top two elements of the stack and leave the result instead
  • swap the top two elements of the stack
  • output the top of the stack
  • input at the top of the stack.
The OCaml implementation is this:
let stack = ref []
let push x = stack := (x :: !stack)
let pop () = stack := List.tl !stack
let peek n = stack := (List.nth !stack n :: !stack)
let apply op = stack := (function x :: x' :: xs -> (op x' x) :: xs) !stack (* x' was pushed first, so it is the left operand *)
let swap () = stack := (function x :: x' :: xs -> x' :: x :: xs) !stack
let output () = print_int (List.hd !stack); print_newline ()
let input () = stack := (read_int () :: !stack)

When we implement the abstract machine we purposefully move into the imperative world because machines are naturally imperative. Machines have a state which is changed by their operations. In our case this state is simply a global stack. This presentation can be made even nicer, at the cost of some sophistication, by packaging the stack as a module (or functor) which would hide all the implementation details from the user. I leave this as an exercise, for the sake of keeping the description elementary.
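
For the curious, here is a minimal sketch of how the packaging might go, using a plain module with an explicit signature rather than a full functor (the underflow checks are an addition, not in the code above):

module type PSM = sig
  val push   : int -> unit
  val pop    : unit -> unit
  val peek   : int -> unit
  val apply  : (int -> int -> int) -> unit
  val swap   : unit -> unit
  val output : unit -> unit
  val input  : unit -> unit
end

module Psm : PSM = struct
  (* the stack itself is invisible outside the module *)
  let stack = ref []
  let push x = stack := x :: !stack
  let pop () = stack := List.tl !stack
  let peek n = stack := List.nth !stack n :: !stack
  let apply op = match !stack with
    | x :: x' :: xs -> stack := op x' x :: xs
    | _ -> failwith "stack underflow"
  let swap () = match !stack with
    | x :: x' :: xs -> stack := x' :: x :: xs
    | _ -> failwith "stack underflow"
  let output () = print_int (List.hd !stack); print_newline ()
  let input () = stack := read_int () :: !stack
end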

We are now in a position to write an interpreter for this abstract machine, i.e. a compositional evaluator which will only use the abstract machine operations:

let rec interpret symt = function 
 | Const n -> push n
 | Binary (op, e, e') -> 
   interpret symt e;
   interpret (inc symt) e';
   apply (op_of_opcode op)
 | Let (x, e, e') -> 
   interpret symt e;
   interpret ((x, 0) :: (inc symt)) e';
   swap ();
   pop ()
 | Var x -> 
   let n = lookup x symt in
   peek n
 | Output e ->
   interpret symt e;
   output ()
 | Input ->
   input ()

Let’s discuss the code:

  • A constant expression is interpreted by pushing the constant onto the stack.
  • A binary expression will interpret the operands and, assuming that the two values are left on the top of the stack, will apply the operation.
  • A let expression will interpret the definition e, whose value will be left at the top of the stack, then it will interpret the body e’ remembering the offset of the value for x. We want only the value of the body to remain on the stack, so we swap the value of x, which is in the second position from the top, and then we pop it off.
  • Variables are looked up at their known stack offset.
  • Input and output are obvious.

The interpreter uses an extra argument, a symbol table mapping variable identifiers to stack offsets, with the helper function

let inc = List.map (fun (x, n) -> (x, n+1))

All the “execution” is now done in terms of abstract machine operations, even though we are using a few other operations for managing the symbol table. That is not a serious problem. In fact this mapping between variable identifiers and stack indices could be considered part of the parsing process.

At this point we have our interpreter!

The mathematically minded reader could now think about proving some soundness results about this interpreter, which even for a small language such as this one is quite a (formal, if not mathematical) effort:

∀M.∀n. evaluate M = n ⇒ run M = [n]

where

let run m = stack := []; interpret [] m; !stack

which means running the interpreter leaves a singleton stack containing the evaluated value.

Compilation, virtual machines and native code generation

Once we have the interpreter, the compiler is surprisingly easy to implement. The interpreter runs the abstract machine instructions as it processes the program. A compiler will simply collect, in a lazy way, the instructions (and save them to a file)!

let rec compile symt = function 
 | Const n -> "push " ^ (string_of_int n) ^ "\n"
 | Binary (op, e, e') -> 
   (compile symt e) ^
   (compile (inc symt) e') ^
   "apply " ^ op ^ "\n"
 | Let (x, e, e') -> 
   (compile symt e) ^ 
   (compile ((x, 0) :: (inc symt)) e') ^
   "swap\n" ^
   "pop\n" 
 | Var x -> 
   let n = lookup x symt in
   "peek " ^ (string_of_int n) ^ "\n"
 | Output e ->
   (compile symt e) ^
   "output\n"
 | Input ->
   "input\n"

The code is strikingly similar. We could (almost) textually replace sequential composition with string concatenation, and each call to a function implementing an abstract machine instruction with the name of the instruction itself. Renaming both functions to f and placing the interpreter and the compiler side by side makes the correspondence plain.

For illustration let’s look at one test expression, m = let x = 6 in let y = 3 in x * y:

# m;;
- : expression =
Let ("x", Const 6, 
Let ("y", Const 3, 
Binary ("*", Var "y", Var "x")))
# evaluate m;;
- : int = 18
# cevaluate [] m;;
- : int = 18
# run m;;
- : int list = [18]
# compile [] m |> print_string;;
push 6
push 3
peek 0
peek 2
apply *
swap
pop
swap
pop
- : unit = ()

The very last step is processing the output of the compiler, the bytecode. Here we have two choices:

  1. The bytecode can be interpreted by what is usually called a virtual machine, which runs the bytecode on some concrete machine in the same way that our interpreter ran the original program on the abstract machine. Note that the bytecode is just a simpler programming language, but the principle remains the same. (A small sketch of such a bytecode interpreter is given after this list.)
  2. The bytecode can be itself compiled just by tracing the execution of the virtual machine, in the same way in which we traced the execution of the abstract machine to derive the object language compiler. Again, the principle is the same.
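
To make option 1 concrete, here is a minimal sketch of a virtual machine for the PSM bytecode, reusing the stack operations and op_of_opcode defined above and assuming exactly the instruction names and operand format emitted by compile:

let execute line =
  match String.split_on_char ' ' (String.trim line) with
  | ["push"; n]   -> push (int_of_string n)
  | ["pop"]       -> pop ()
  | ["peek"; n]   -> peek (int_of_string n)
  | ["apply"; op] -> apply (op_of_opcode op)
  | ["swap"]      -> swap ()
  | ["output"]    -> output ()
  | ["input"]     -> input ()
  | _             -> failwith ("unknown instruction: " ^ line)

let run_bytecode code =
  stack := [];
  String.split_on_char '\n' code
  |> List.filter (fun l -> String.trim l <> "")
  |> List.iter execute;
  !stack

Running run_bytecode (compile [] m) on the example below should leave the same singleton stack [18] as run m.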

Conclusion

Compiler development is still, unfortunately, somewhat of a black art. I hope this post was informative, and especially I hope that the methodological importance of the abstract/virtual machine came through. This is indeed the key concept that makes compilers work. A good abstract/virtual machine must be such that the object language can be interpreted/compiled by/to it efficiently and its own instructions, the bytecode, must be interpreted/compiled efficiently by/to the concrete hardware platform targeted.

I also wanted to illustrate the role that operational semantics (the non-compositional evaluator) and denotational semantics (the compositional evaluator) can play in rational compiler development, and how they can be gradually refined into interpreters and compilers. I hope that made sense!

Posted in teaching

A short fable of software engineering vs. regular engineering

An acquaintance recently took up an assembly line job with McLaren Automotive — the people who make Formula 1 and luxury sports cars. Before this gig, he used to work from his garage doing custom body and paint work, and developed quite a reputation for skill and quality. Working on the assembly line pays better but he is not very pleased with the monotony of it: all he does all day long is fit one small body panel on the chassis. The panel is part of the front wheel arch and it costs about £7,000.

One day a panel didn’t fit properly. It was off by a few millimetres, nothing that he could not fix with a few minutes of careful sawing and sanding. To his surprise, what happened was very different. That chassis was taken to one side and work on it was suspended for days while a whole bunch of engineers tried to figure out not how to fix the problem but why the problem had happened in the first place. My friend, the assembly line worker, put it down to the stupidity of engineers and the bureaucracy of management. In the end he was not told how the problem was solved or what the investigation uncovered.

My friend was quite obviously an accomplished hacker. His hacker mentality would have fared quite well in software production, where debugging is part of the normal workflow, but was out of place in a conventional engineering setting. His story made me reflect on the huge gap between car and software engineering.

I wonder how many years until software production reaches the level of maturity where we don’t just fix bugs but, when bugs happen, we stop in our tracks, shocked that they happened at all, and only proceed after we have had a serious look at our processes. I wonder how many years until our software will look like this

rather than like this

Posted in General

Computability: The Greatest Law of Physics

Wigner famously declared mathematics to be “unreasonably effective” in the natural sciences. There is indeed something magical about the fact that if we have a mathematical theory and some mathematically representable aspect of reality then we can carry out abstract, conceptual, mathematical reasoning and formulate new facts about reality which will turn out to be true. Even more amazing, such theoretical reasoning can lead to genuine discoveries about reality — new observations. According to Lakatos this is the most important thing about a scientific theory. Indeed, Eddington would never have ventured to the island of Principe to take measurements of the Hyades during a solar eclipse unless Relativity had predicted that he would see something interesting, and Halley would never have predicted the return of the comet that now bears his name without Newtonian mechanics.

The validity of a scientific theory rests on the correspondence between a mathematical model and measurements made of the physical world. The same relationship between an abstract, essentially mathematical model, and the physical world, usually a special-purpose device, lies also at the heart of computation. This relation, called the implementation relation by Chalmers, was first introduced in the context of trying to figure out what things can be said to compute. For most Computer Scientists this is not particularly exciting or interesting because of our Engineering-minded approach. We build computers so that they satisfy an implementation relation. In fact we rarely distinguish between “abstract” mathematical computation and “concrete” physical computation. We casually deal in “abstract machines”, such as the Turing Machine, without worrying too much whether we are talking about its mathematical definition or an actual physical instance. We just know that the latter can be easily built so that it implements the former.

However, for those interested in Natural Computing the implementation relation is far more interesting, because here an implementation relation is discovered rather than constructed. Some naturally occurring phenomenon (the behaviour of slime mold, for example) can be used to calculate with because it implements a complex computation. Natural Computing works because such phenomena, implementing complex computations, exist. In some trivial sense any physical system can be used to compute facts about its own mathematical model, although that is hardly interesting. Natural Computing is interesting because particular physical systems can compute fast solutions to problems of general interest, which can actually encode the evolutions of other physical systems that evolve much slower or that are complicated or expensive to build.

What is truly fascinating is that no physical system can implement an abstract model which is mathematically uncomputable. For me this is indeed the greatest law of physics in terms of its generality, as it relates any abstract mathematical model to any physical system. The non-implementability of uncomputable models is an impossibility result of a staggering magnitude. And it is truly a falsifiable (in the logical, Popperian sense) scientific law. Scientists, most notably Penrose, suggest that some neuronal processes in the brain may be physical yet uncomputable, but the evidence for this is very sketchy. More generally, philosophers propose that the brain is an instance of a physical yet non-computable process, mainly based on interpretations of Goedel’s incompleteness theorem which I find rather naive. But at least this shows that we can logically conceive of such systems.

In fact I would say that not just computability in the sense of Church-Turing, but the whole of computability theory, including complexity analysis, is essentially formulating extremely general laws of physics between whole classes of formal models and any physical system which might implement them. Computer Science is truly deserving of its name!

Posted in armchair philosophy

A simple proof checker for teaching

For the last few years I have been teaching functional programming to first year students. One of the main benefits of using functional programming is that you can prove simple, but non-trivial, properties of simple programs, for example, that the empty list is right-neutral for append:

append xs [] = xs
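
In OCaml, the append function in question is the standard recursive definition (shown here for reference; it matches the two defining equations used in the proof further down):

let rec append xs ys =
  match xs with
  | [] -> ys                        (* append [] ys = ys *)
  | x :: xs' -> x :: append xs' ys  (* append (x :: xs') ys = x :: append xs' ys *)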

The practical problem with teaching a large group of students is that you end up with a large pile of assignments to check, which means a large number of teaching assistants of various degrees of competence in reading and understanding proofs. Note that for programming assignments we didn’t have this problem because we could use automated random unit testing. Pedagogically, having inconsistent marking standards is not only irritating to students but potentially devastating in the final exam if TAs habitually accept wrong proofs as correct in the assignments. The final exam is marked by me, and I may not be as blind to mistakes.

One possibility is to use proof assistants such as Coq, Agda or Isabelle, but they are a bit too intricate for first year students who are just learning the basics of proofs, logic, recursion and induction. What we want is a proof checker tailored to the needs of the beginning student. These were our design principles:

  1. The syntax of production-level proof assistants aims for concision, elegance and expressiveness. Our aim was a syntax that matched as much as possible the informal mathematical vernacular that students are used to from pen-and-paper proofs.
  2. Production-level proof assistants have various levels of interaction and automation which are not only useless to the beginning student but actively get in the way of learning proof techniques, because they may encourage a trial-and-error approach to proofs. Also, very simple fragments such as pure propositional logic may be entirely automated in such proof assistants.
  3. Even production-level proof assistants can be annoyingly clumsy when it comes to writing equational proofs. However, most of the proofs that we ask our students to write are equational. Congruence reasoning comes very easily to students, but doing it formally in proof assistants is a pain. This aspect is something we wanted automated.
  4. We also need a generally high level of user-friendliness.

The source code, documentation and a range of examples from the simplest to moderately complicated are available on Github. Here I am just going to give the proof for the very simple example above. If you find it appealing I recommend you read the full documentation on-line.

Signatures:
append : 'a list -> 'a list -> 'a list ;
Definitions:
[append nil]: forall xs:'a list.
              append [] xs = xs : 'a list ;
[append ys]:  forall xs:'a list. forall ys:'a list. forall y:'a.
              append (y::ys) xs = y :: append ys xs : 'a list ;
Theorem [append neutral]:
Statement: forall xs:'a list.
           append xs [] = xs :'a list 
Proof: by induction on list :
case []:
 we know [0]: append [] [] = [] :'a list
              because [append nil] with ([]) .
 by [0]
case (hd :: tl):
 [IH]: append tl [] = tl : 'a list .
 we know [1]: append (hd :: tl) [] = hd :: (append tl []) :'a list
              because [append ys] with ([], tl, hd) .
 we know [2]: append (hd :: tl) [] = hd :: tl : 'a list
              because equality on ([1], [IH]) .
 by [2]
QED.

A comparison with the Agda equivalent is perhaps a good illustration of our trading off succinctness for readability. Note especially the need for an explicit congruence principle in the Agda proof. The list data type and append (++) are pre-defined. (Coq is not really comparable because it is tactics-based.)

cong-∷ : {A : Set} → (x : A) → ∀ xs ys → xs ≡ ys → x ∷ xs ≡ x ∷ ys
cong-∷ x xs .xs refl = refl
unitr[] : {A : Set} → (xs : List A) → xs ++ [] ≡ xs
unitr[] [] = refl
unitr[] (x ∷ xs) = Goal where
 p : xs ++ [] ≡ xs
 p = unitr[] xs 
 Goal : x ∷ (xs ++ []) ≡ x ∷ xs
 Goal = cong-∷ x (xs ++ []) xs p

The author of the tool is my intern student Yu-Yang Lin, with essential supervision and help from Neel Krishnaswami. Funding for the project was provided by the College of Engineering and Physical Science, University of Birmingham.

Your feedback is most welcome! Even more welcome are more examples, bug reports and bug fixes. If you are looking to make a more substantial contribution, the parsing error messages could be vastly improved.

Posted in General

What things compute?

I suppose we can agree that computers compute. We know that they do that because we make them do that. We have a program, on paper or in our head, and we have certain facilities to load that program into the computer memory. This is called implementing the program. Then, when the program runs we can compare the actual behaviour of the physical computer with the expected behaviour defined by the abstract program. Note that this expected behaviour we can determine “by hand” via manual calculations, at least in principle. Programs in the abstract are mathematical objects, programs in the concrete are behaviours of physical devices we call computers. I suspect trained computer scientists will find very little controversial (or exciting) in the clarification above. This is the so-called “implementation theory” of computing, advocated among others by the influential cog-sci philosopher David Chalmers.

This definition is, I would say, sound in the sense that things and behaviours that are described by it are always computation. But is it complete? Are there notions of computation that fall outside of it? If we think it is not complete then we should expand the definition somehow — let’s get back to that. But if we accept it as complete, then what are the consequences? Implementation theory assumes that the abstract program defines the concrete program, which is an instance of it. A computer is then a device that accelerates computation by some physical means. The crucial assumption is that the abstract program, as a mathematical object, requires design and therefore intent. Accordingly, only artificial things can compute. We cannot say, for example, that a brain computes. There seems to be something that resembles “information processing” or “computation” in the way a brain behaves, but we don’t actually have the anchoring reference that is the abstract program to compare it with and say that it is an implementation of it. This may seem like nit-picking but it’s a genuine problem.

Outside artificial things designed to compute, i.e. computers, what else computes? What is computation, if anything, as a natural phenomenon? I haven’t seen a good answer so far. I have seen however two classes of bad answers.

The first bad answer is something that has to do with a “system” having “inputs” and “outputs”. This is bad on two accounts — without even going into the deeper philosophical problems of causation as a scientific concept. The first problem is that any natural system has many things that can be considered as “inputs” and “outputs”. For example if we throw a rock, and we take the initial velocity vector as an input and its space-time coordinates as an output, we can say that it is a system that computes a parabola tangent to that vector. But there is necessarily intent in the choice of what measurements are considered as inputs and outputs. Any object has a trajectory in a highly-dimensional space of properties, and any projection of that trajectory on a lower-dimensional space of measurements defines a different computation. There is no one special computation the system performs. If we take intent out of the equation we have an incomprehensibly large state space and an incomprehensibly complicated trajectory through it, tracking everything down to atomic-level configurations. The second problem is that such broad definitions can quickly degenerate into pancomputationalism: everything computes. And, if everything computes, computation is an uninteresting concept. The brain computes, but so does a piece of wood. Computation is simply another word for behaviour subject to (natural) laws. The Universe is a big computer, computing its own future — grand but somewhat vacuous.

The second bad answer is something that has to do with “processing symbolic information”. As in the case of selecting particular measurements of a system as inputs and outputs, the notion of “symbol” requires not only intent but also, outside of a formal setting, very special powers of abstraction — think about how many concrete shapes a symbol such as the letter ‘a’ can take. We know the human brain can process symbolic information, but we are far from certain that any other kind of brain can do that, so this definition rules them out as computing things. Also, the process of defining and recognising symbols is itself a prerequisite of computation, but not part of computation itself — which strikes me as odd. Whereas the attempted input-output definition is too broad, this one is both too precise and too reliant on concepts that require further clarification. It is tantamount to saying “computation is what the human brain can do”. It does not explain the brain using computation, it uses the brain as an explanation for computation.

In conclusion I want to shatter our presumed consensus: computers do not compute. Not even computers. We compute. Computers just help us along. Implementation theory gives a lucid definition of computation: a relation between a pre-existing mathematical concept and the behaviour of a physical thing. The pre-existence of the mathematical concept is essential. Trying to fit a mathematical model on a pre-existing physical behaviour has a different name: scientific discovery. Where does this leave computationalism, the theory that, broadly speaking, the brain is a computer and the mind a computation? I think it exposes its vacuity. A “computational theory of the mind” is along the same lines as having “a computational theory of the weather”, in which the earth is seen as a computer and the weather what it computes. It’s a metaphor that gives the illusion of understanding, without telling us whether it will rain tomorrow or not — akin to using élan vital to “explain” life or scansoriality to “explain” a propensity for climbing.

But perhaps I am being harsh. “Computation” is vacuous only when used as an explanation, as a first-class scientific concept. But it is not vacuous as an abstract concept or as a characterisation of a style of mathematical model or as a paradigm. The parallel between energy/work as used in physics and information/computation as used in cognitive science is quite striking. In his Lectures on Physics, Feynman has this to say about energy and energy conservation principles [PDF]. I think similar considerations apply to information and computation:

[Conservation of Energy] is a most abstract idea, because it is a mathematical principle; it says that there is a numerical quantity which does not change when something happens. It is not a description of a mechanism, or anything concrete; it is just a strange fact that we can calculate some number and when we finish watching nature go through her tricks and calculate the number again, it is the same. [...]

It is important to realize that in physics today, we have no knowledge of what energy is. We do not have a picture that energy comes in little blobs of a definite amount. It is not that way. However, there are formulas for calculating some numerical quantity, and when we add it all together it gives “28”—always the same number. It is an abstract thing in that it does not tell us the mechanism or the reasons for the various formulas.

For a good survey of the problem of defining computation I recommend Piccinini’s chapter on Computationalism in the Oxford Handbook of Philosophy and Cognitive Science.

Posted in anticomputationalism

Leaving the Nest: variables with interleaving scopes

A paper [PDF] with this title, co-written with Jamie Gabbay and Daniela Petrişan, will appear in Computer Science Logic (CSL 2015). Actually the full title is Leaving the Nest: Nominal techniques for variables with interleaving scopes. The “nominal techniques” part is in reference to a new mathematics of names pioneered by Gabbay and Pitts in the late 1990s. In this post I will give a very brief and informal taster/introduction to nominal techniques then tell you why our paper is actually quite interesting.

Names are a peculiar data structure. There are only two things you can do with them. Given two names you can compare them, and given a finite set of names you can find a fresh name relative to them, i.e. a name that is not in that set. There is a perhaps subtle distinction to be made between names and concrete representations of names, which can be strings or integers. Concrete representations might allow other operations as well, such as concatenation or addition, but qua names they only have equality testing and fresh generation.

Because there are only very few things we can do with names and to names, nominal structures, i.e. data structures incorporating names, have an interesting property called equivariance, which means informally that all meaningful properties of such structures are preserved by name permutations. What does this really mean? Consider the following OCaml program:

let x = ref () in let y = ref () in x == y;;

A new unit reference, a pointer, is created and assigned to x, then another one to y, then they are compared for equality. This little program is always equivalent to false. No matter what pointer-values are returned by the memory management runtime of OCaml, they will always be different. The actual value of the pointers qua names must be irrelevant! This is why the invariance is under permutations: a permutation can change the names, but it will never map different names to the same name. This simple and elegant principle leads to a powerful theory of names about which you can read in Andy Pitts’s excellent book.

Nominal techniques really come into their own when dealing with syntax which uses locally-scoped names, such as that of the lambda calculus (e.g. λx.x) or predicate logic (e.g. ∀x.x=x). In a syntactic context, the principle of equivariance leads to an equivalence of terms up to renaming, commonly known as alpha-equivalence. So terms such as λx.x and λy.y are equivalent. Moreover, this equivalence is also a congruence (it is preserved by the syntax) so that we can work with equivalence classes under alpha-equivalence instead. This is often done informally in the study of syntax, but a solid mathematical understanding of names is actually required to ensure that recursive definitions over nominal structures — i.e. something as banal as substitution — are actually well defined.

A common theme in the literature, motivated by syntax, has been the use of names in inductively-defined structures, where name abstraction is a term-forming operation. By definition, these names are always well scoped, which means that their scopes are well nested. However, names can be used in a different way, not in the language itself but in the semantics, at the meta level, to mediate access to resources. Memory locations, file handles, network sockets, etc. are all examples of such resources which, much like names, can only be tested for equality and created: they are names. However, unlike syntactic names they can be not just created but also destroyed. And the pattern of creation and destruction doesn’t need to be well nested anymore.

For example, this is a sequence where the scopes of a, b, c are interleaved (we mark name-creation with ⟨ and name-destruction with ⟩): ⟨a ⟨b a⟩ ⟨c b⟩ c⟩. The name b is introduced after a is created, and when a is destroyed b is still “active”. With the (obvious) intuition of scope and alpha-equivalence, we should expect the sequence above to be equivalent to ⟨a ⟨c a⟩ ⟨b c⟩ b⟩.
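
To make such sequences concrete, here is a small OCaml sketch (not from the paper): scope delimiters as a datatype, with a simple check that every name is closed only while open and that every opened name is eventually closed. Names are represented as strings purely for illustration, and shadowing is ignored.

type delim = Open of string | Close of string

(* the example sequence ⟨a ⟨b a⟩ ⟨c b⟩ c⟩ *)
let example =
  [Open "a"; Open "b"; Close "a"; Open "c"; Close "b"; Close "c"]

let well_scoped s =
  let rec go active = function
    | [] -> active = []
    | Open x :: rest -> go (x :: active) rest
    | Close x :: rest ->
      List.mem x active && go (List.filter (fun y -> y <> x) active) rest
  in
  go [] s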

In our paper we give a mathematical treatment of this phenomenon, focussing on what we believe to be the most interesting aspects: What does it mean for a variable to be free or bound in this setting? What is the right notion of alpha-equivalence when scopes are not well nested? How can we define proper functions on sequences that use names with interleaved scopes? We give syntactic, relational and algebraic models for alpha equivalence which coincide — this is certainly reassuring that we gave the right definitions.

An example which illustrates alpha equivalence and name capture quite well is interleaving of sequences with names. For example, what happens if we want to interleave the sequence ⟨a a⟩ with itself? The standard definition of sequence interleaving gives us two sequences: ⟨a a⟩ ⟨a a⟩ and ⟨a ⟨a a⟩ a⟩. However, taking into account the fact that ⟨a a⟩ = ⟨b b⟩ this should be the same as interleaving those two, which seems to give us six sequences: ⟨a a⟩ ⟨b b⟩, ⟨a ⟨b b⟩ a⟩, ⟨a ⟨b a⟩ b⟩, ⟨b b⟩ ⟨a a⟩, ⟨b ⟨a a⟩ b⟩ and ⟨b ⟨a b⟩ a⟩, but quotienting by alpha-equivalence (in the obvious way) we are left with three: ⟨a a⟩ ⟨b b⟩, ⟨a ⟨b b⟩ a⟩, ⟨a ⟨b a⟩ b⟩, which is still not the same as two. What is going on here?

⟨a a⟩ ⟨b b⟩ = ⟨a a⟩ ⟨a a⟩  – we can re-use bound variable ‘a’ instead of ‘b’
⟨a ⟨b b⟩ a⟩ = ⟨a ⟨a a⟩ a⟩  – we can reuse bound variable ‘a’ instead of ‘b’, noting that there is some name-shadowing going on

So we are left with ⟨a ⟨b a⟩ b⟩, which is the only one which genuinely requires two names in a situation akin to variable capture in the lambda calculus. With a proper notion of alpha-equivalence our definition of interleaving just works and we can skip such tedious analyses.

This has a neat diagrammatic representation — if you have read this blog you may have guessed that I like diagrams! The sequence ⟨a a⟩ is drawn as an arc whose two endpoints delimit the scope. This is nice because the name ‘a’ no longer appears in the picture, so it is “obvious” that

  • ⟨a a⟩ ⟨b b⟩ = ⟨a a⟩ ⟨a a⟩ (two arcs side by side)
  • ⟨a ⟨b b⟩ a⟩ = ⟨a ⟨a a⟩ a⟩ (one arc nested inside the other)
  • ⟨a ⟨b a⟩ b⟩ (two overlapping arcs)

In terms of more serious applications, formalisations of game semantics and other trace semantics should now further improve.

Before I sign off, a shout out to my student Bertie Wheen for proving some of the syntactic results in Agda.

Posted in programming languages

Inventing an algebraic knot theory for eight year olds (V)

Towards Equations

We finally reconvened our maths club after a mid-term break (children not available) combined with the university exams period (me not available). I correctly assumed nobody remembered anything we had talked about before. At least not in detail. Also, we had a significant number of new children joining our group, who had no idea what we had been up to.

In the previous sessions we discovered and invented our notation, which consisted of three combinators (X, C, I) and three operations: composition (“gluing”), tensoring (“stacking”) and dual (“flipping”).
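
For readers who like to see the notation pinned down, here is a minimal OCaml sketch of it (not something we used in the club itself): the three combinators and the three operations as a datatype.

type tangle =
  | X                            (* a crossing *)
  | C                            (* a turn-back (cap/cup) *)
  | I                            (* a plain wire *)
  | Compose of tangle * tangle   (* “gluing”, written K ; K' *)
  | Tensor  of tangle * tangle   (* “stacking”, written K ⨂ K' *)
  | Dual    of tangle            (* “flipping”, written K* *)

(* the unknot C;C* and the formula C;X;C* discussed below *)
let unknot  = Compose (C, Dual C)
let twisted = Compose (C, Compose (X, Dual C))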

This time we just took this as a given, the starting point of our development. If the previous sessions were analytic, trying to decompose knots into basic constants and operations, this session was synthetic, starting with formulas and drawing them up and seeing what we get.

The first formula we contemplated was C;C*, which everyone immediately identified as a circle. The knot-theoretic name for this is the unknot, a word which caused a certain amount of amusement. One student cleverly observed, unprompted, that (C;C*)* would have been the same thing, also an unknot, and we had a short discussion about how symmetry of a knot K boils down to K=K*. That was quite rewarding and fun and prepared the ground, somewhat serendipitously, for the next formula, C;X;C*.

The same student rashly pointed out that this shape is also symmetric, i.e. C;X;C*=(C;X;C*)*. Is this true or false, qua knots?

The ensuing discussion was the most interesting and important part of the session. Initially the majority of students supported the validity of the equation, then someone pointed out that X≠X*, so, they reckoned, it must be that C;X;C*≠(C;X;C*)*.

Indeed, the shapes are not identical. The opinion of the group swung, so they all agreed that the equation is not valid and the two shapes (original and flipped version) are not equal.

But aren’t they? What do you think? We are talking knots here, and two knots should be equal when they are topologically equal, i.e. they can be deformed into each other without any tearing and gluing. So in fact they are equal, because one shape can be smoothly deformed into the other.

So here we have a genuinely interesting equation, where two formulae are equal not because the shapes they correspond to are geometrically equal (i.e. “identical”) but because they are topologically equal, i.e. there is a smooth deformation between them. Also note that the deformation must be in 3D by “twisting” the left (or right) side of the unknot — a 2D transformation would make the wire “pinch” which may or may not be allowable, depending on how we precisely set things up (if we were very serious and careful about what we are doing).

The point was quickly grasped and we moved on. It is a subtle point which I think can be damaged by over-explaining if the initial intuitions are in the right place, which they seemed to be.

Next we looked at a much more complicated formula that took a long time to draw correctly: C;(I⨂C⨂I);(X⨂X*);(I⨂C*⨂I);C*.

As you may see, if we draw this as a knot and tidy things up a bit, what we actually have is just two unknots — they don’t interlock, so they can be pulled apart with no tearing and gluing, i.e. topologically. A much simpler formula for the same diagram is C;C*;C;C*!

This point was not easy to make, and it seemed difficult for the students to appreciate it. By this time they were quite tired and the drawing of the more complex formulation of the two unknots diagram drained them of energy and patience — they made quite a few errors and had to restart several times. Showing them a much easier way to achieve the same knot diagram almost made them angry!

I hope to use this as a motivation for the future: how to simplify a formula before we draw it. Until now we have looked at equality semantically, using the underlying model. In due course, as students become more used to manipulating knot diagrams using formulas, I will start introducing some general equational properties.

Posted in teaching

Types: computation vs. interaction

Type Theory is at the moment the workhorse of programming language theory. It is an elegant, clever and incredibly useful framework for thinking about programming (for full disclosure, I have written several papers with the word “type” in the title). In this post I will not elaborate on the positives of Types but I will try to find at what point they become less useful. The argument below should be accessible to any programmer. To the theorist it will say nothing new, but it will say things that are not said often enough or clearly enough — I think. I hope it will not be too impolite. 

For me, this is an important exercise, because Type Theory is a collection of beautiful and seductive ideas, and as such it has the potential to get one carried away to extremes — like Marxism or Libertarianism. The beauty of Type Theory is not a pragmatic beauty. One is not (yet) in awe at what has been achieved with Type Theory. C and Javascript have this kind of pragmatic beauty, akin to Democratic Capitalism or Social Democracy. They succeed despite being messy and riddled with internal inconsistencies. They emerged, were not designed — or at least not in a principled way. Type Theory on the other hand has a beauty arising out of internal coherence, elegance and simplicity. The messy, illogical stuff is frowned upon. But can it be avoided? 

What does a computer do when it does not compute?

Let’s ask a silly question first: What does a computer do? Like all simple questions, this one is steeped in ambiguity, so a simple answer cannot follow absent clarifications. What is a computer? Here there are two answers, depending on whether we mean a conceptual or a physical computer. For a conceptual computer the answer could be A Turing machine! or A von Neumann architecture! These are fine answers. But if you mean a physical computer you could say A MacBook! or An Android phone! These are also fine answers. But the language is already playing tricks on us. A conceptual computer computes, but a physical computer computes and interacts with its physical environment.

A physical computer that merely computes will still interact with the environment, but in an unsatisfactory way: by consuming energy and warming up. For us, as users of physical computers, the computation is the boring part. Computation is what happens when a spinning wheel appears on the screen and the device becomes unresponsive. We don’t like computers when they compute, we like them when they interact. We consider them to do useful stuff when they drive physical devices. When they draw. When they sing and dance. When they forward that packet to the right node on the network. When they retrieve that interesting nugget of information from the TB hard-drive. 

Computation v. interaction provides some creative tension in the world of programming languages. Programming languages come from two directions. Some of them were birthed by logicians who wanted to get a grip on the foundations of mathematics, and who understood that computation played an important role there. This was the genesis of the lambda calculus. Modern programming languages that hail from this tradition include Haskell and the ML family (OCaml, SML, F#). Other programming languages were birthed by engineers who made physical computers and they needed to write programs for them. Fortran and C came about this way. And, of course, nowadays the two families seem to converge. Computational languages (aka “functional”) reluctantly allow interaction-oriented features (“imperative”) and all languages seem to boast of “lambdas” and “closures”, hallmarks of computational/functional languages.

Much energy has been spent proposing and defending precise definitions of “functional” (or “declarative”) versus “imperative” languages, so let’s not go there. In a futile attempt to avoid this controversy I will say “computational” languages for those languages and language features that focus on computation, and “interaction” languages for those languages and features that focus on interacting with the environment.

What are Types for?

The concept of types, broadly speaking, is essentially computational. It was introduced by Bertrand Russell to avoid certain paradoxes in the foundations of mathematics. In programming languages, types are syntactic guarantees that certain errors will not happen. For example

  • F is code defining a function that takes an integer as an argument and produces an integer as a result.
  • M is code computing an integer.
  • Therefore F(M) is code in which the function F is guaranteed to run correctly, as it receives an integer according to expectations, and which produces an integer as a result.

So just like in Principia Mathematica, types can be used to prevent certain kinds of errors. This is very reasonable, although there are some strings attached. Because type systems are defined compositionally on the syntax they will rule out some programs that would not cause run-time problems. They are sound (no faulty programs accepted) but not complete (some non-faulty programs are rejected). For example, here is a failing OCaml program:

# if true then 0 else 'a';;
Error: This expression has type char but an expression was expected of type int

The error is caused by the fact that, when calculating the type of the program, the two branches of the if statement must have compatible types, even though the ‘else’ branch is irrelevant dead code. Here is how Python handles this problem:

>>> 0 if 1 else 'a'
0

The type is not checked and the program runs alright. Everyone happy?

People who think they understand something about the theory of programming languages, including me, tend to agree that what Python does is wrong. Of course, millions of happy Python users disagree and get things done with it on a daily basis. Such is life — if only they listened to the experts.

Is Curry-Howard an isomorphism?

But is this all that there is to be said about types, that they inject a bit of sanity and discipline in programming languages? No, there is more to the story. Enter Curry-Howard. These two logicians noticed a quite fascinating analogy between type systems of programming languages and logics. You can formulate the types of a programming language using logical connectives: A→B can be read either as “A implies B” or “function of argument A and result B”. And then, the calculations required to determine the type of a program, where the type is written as some formula P, are closely related to the calculations required to prove that the logical formula P is true. This is truly an amazing, fascinating analogy and it holds remarkably well for a broad range of logics. Via the correspondence we can derive some truly elegant programming languages, for example Agda, truly a Shangri-La — or maybe a land of the Lotophagi, depending on one’s taste — of types.
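
As a small OCaml illustration of the analogy (a sketch, not from the original argument): read 'a -> 'b as “A implies B”. The terms below then have the types of two propositional tautologies, and the programs themselves are the proofs.

(* (A ⇒ B) ⇒ A ⇒ B: given evidence f for A ⇒ B and evidence x for A,
   produce evidence for B *)
let modus_ponens : ('a -> 'b) -> 'a -> 'b = fun f x -> f x

(* A ⇒ B ⇒ A ∧ B: pairing is conjunction introduction *)
let conj_intro : 'a -> 'b -> 'a * 'b = fun x y -> (x, y)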

But is it really all that surprising that a rich formal system, such as a logic’s proof system, can compute? Any complex enough formal system can compute, see e.g. rewrite systems. Without taking anything away from the elegance of the approach, the fact that it works should not be all that shocking. But the Curry-Howard method teaches us how to awaken the computation that is lurking within any logical system. Only a fool wouldn’t see the beauty and the usefulness of this.

But what about the opposite direction, can we always associate a type system with a “computer program”? That is a harder question because all sorts of things can be used to compute, some of them quite extravagant. Neural networks are an old example, as old as conventional computers. If neural networks compute, what is a “program”? Can we associate a type system with it? Not that I know of. The very question seems strange. How about chemical or DNA-based computing? Types are intrinsically associated with composition, and if a computing device is not made out of components, hoping for a meaningful type system to emerge seems to me unjustified.

To many programmers the notion of program seems to be more basic than that of type, and the two are not intrinsically connected. If a program is a proof, then what does type inference do? We already have the proof, but we do not yet know what, if anything, it is a proof of. This is a curious thing, isn’t it?

Effects

To start questioning the utility, or the possibility, of a type system we don’t need to go as far as computers made of slime. Whenever we need to employ programming languages which are computation-oriented (“functional”) to interact with the real world there is a general feeling of unease. What is the right way to do it? At one end of the spectrum we have ML-languages which simply ignore, at the level of the type system, the interaction aspects. A program of type int->int will take an int, produce an int, and may or may not interact with the world. This is reasonable but to many it seems like an opportunity lost. Why not use the type system to say something about interaction as well?

Haskell does that, by using a type-theoretic device called a monad. Each monad indicates one particular kind of interaction, such as using memory or performing input-output. I am not a fan of monads, I confess. Monads, introduced by Eugenio Moggi, can be used as a language mechanism, the way Haskell uses them, or as a meta-language construct to formulate semantic models of languages that combine computation and interaction, a la ML. The latter is unquestionably useful, a matter of mathematical necessity. The former, I am not sure. Some monads (I/O) package real interaction, some have nothing to do with interaction (the “maybe” monad) and others are mock interactions (the “state” monad). The first one is reasonable, the second one is also reasonable but for completely different reasons (which I suppose is already becoming reasonable), and the third one is unreasonable on the face of it. It is not the same thing to program with state as to program as if state were available, for the same reason that driving a car simulator and driving a real car are not quite the same thing.
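
To make the “mock interaction” point concrete, here is a minimal sketch of a state monad in OCaml (an illustration only; Haskell’s State monad is the analogous, more general construction). A stateful “computation” is just a pure function that threads a store through; nothing happens to any real store until run is applied.

type 'a t = int -> 'a * int          (* a computation threading an int store *)

let return (x : 'a) : 'a t = fun s -> (x, s)
let bind (m : 'a t) (f : 'a -> 'b t) : 'b t =
  fun s -> let (x, s') = m s in f x s'
let get : int t = fun s -> (s, s)
let put (s' : int) : unit t = fun _ -> ((), s')
let run (m : 'a t) (s : int) : 'a * int = m s

(* increment the store and return its old value; run incr 5 gives (5, 6) *)
let incr = bind get (fun x -> bind (put (x + 1)) (fun () -> return x))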

In the research literature there are many other effect systems. They are clever pieces of mathematical thinking, and very elegant within the fairly restricted bounds of what they can do. But to me, at the heart of it, they seem like languages of mock interaction rather than real interaction. They are reasonable to use as meta-languages, to describe certain interactions, but not as programming languages where the interaction should be real.

Compositionality, but at a cost

Compositionality, another word for plug-and-play, is a very pleasant property of systems. It guarantees that if the parts have some good property, the whole will have a good property. It is nice, but it comes at a cost. For example, a global optimisation can always do at least as well as a composition of local optimisations, and a global analysis is at least as precise as a composition of local analyses.

If we want to compose, we need to push all the properties to the interface and ignore the innards. This requires abstraction. This is a cost. For example, a language like OCaml won’t let you write a heterogeneously branched if statement because a type cannot be assigned to it.

if x then 0 else 'a'

For the same reason it won’t let you create a heterogeneous list

# [0; 'a'; fun x -> x + 1];;
Error: This expression has type char but an expression was expected of type int

However, unlike heterogeneous if statements, which are silly, heterogeneous lists could be useful. Here is a fairly exhaustive list of how various languages handle heterogeneous lists. In fact you can program heterogeneous lists in dependently typed languages, but it’s unreasonably complicated. Python makes no complaints:

>>> g = lambda x: x**2
>>> [1, 'a', "hello", g]
[1, 'a', 'hello', <function <lambda> at 0x103e4aed8>]
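
For contrast, here is one standard OCaml workaround (a sketch, not from the original post): wrap every element in a variant. It works, but the admissible kinds of element must be declared up front, which is the abstraction cost mentioned above.

type item =
  | Int  of int
  | Char of char
  | Str  of string
  | Fun  of (int -> int)

let xs = [Int 0; Char 'a'; Str "hello"; Fun (fun x -> x + 1)]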

To me this is one methodological weakness of type theory: the commitment to having types for all terms in the language. Why is that? Types are designed to facilitate composition, but the natural unit for program composition is not the term but the function, or even the module. It makes sense to assign types to modules — it is methodologically consistent. But how we calculate these types could conceivably be more flexible, allowing a global perspective within the unit of composition. Types, as properties at the interface, could be calculated using the entire arsenal of (decidable!) logics and static analyses. I understand it would make things like language specification and error reporting more difficult, but it should make the programming easier, if the success of languages such as Python is teaching us anything.

Compositionality v. interaction

To me it seems that the abstraction penalty induced by compositionality plays an important role in the questionable suitability of type theories for capturing interaction. Interaction with real-life systems means interacting with messy systems. In order to interact with them we need an interface language that suits both the programming language and the real-life physical system. This is possible, but only if we do not require the interface language to model the physical system itself.

Hardware design is an interesting example, because it works on two levels of abstraction. At the high level we have hardware description languages (HDLs), which deal with the logical description of the circuit — the language of Booleans, sometimes enhanced with extra values such as “undefined”. In this language the order of events can be described, but not precise timings, and combinatorial feedback loops are almost always banned, because their precise behaviour depends on extra-logical factors such as exact timings. Yet without combinatorial feedback loops, flip-flops and other memory elements cannot be defined.

[Figure: SR flip-flop]

The result is that memory elements are provided as black-box library components, defined and tested in a different, lower-level language. But at the interface the two levels of abstraction are reconciled and composition can occur.
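To see why the purely logical level cannot pin down the behaviour, here is a small OCaml sketch (mine, only a Boolean model, with made-up names) that enumerates the stable states of the cross-coupled NOR equations behind an SR flip-flop. With S = R = 0 there are two stable states, and choosing between them is a matter of timing, which lives below this level of abstraction.

(* A Boolean-level model only; timing is deliberately absent. *)
let nor a b = not (a || b)

(* Stable states (fixed points) of the feedback equations
     q  = nor r q'
     q' = nor s q
   for given inputs s and r. *)
let stable_states ~s ~r =
  List.filter
    (fun (q, q') -> q = nor r q' && q' = nor s q)
    [ (false, false); (false, true); (true, false); (true, true) ]

let () =
  stable_states ~s:false ~r:false
  |> List.map (fun (q, q') -> Printf.sprintf "(q=%b, q'=%b)" q q')
  |> String.concat "; "
  |> print_endline
  (* Prints two stable states: the logic alone cannot choose between them. *)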

In a follow-on post I will explain how a similar approach to types can be taken in the case of programming languages.

Posted in programming languages | Leave a comment

Inventing an algebraic knot theory for eight year olds (IV)

Equations

Here is a quote that captures the ethos against which our maths club militates:

We take other men’s knowledge and opinions upon trust; which is an idle and superficial learning. We must make them our own. We are just like a man who, needing fire, went to a neighbor’s house to fetch it, and finding a very good one there, sat down to warm himself without remembering to carry any back home. What good does it do us to have our belly full of meat if it is not digested, if it is not transformed into us, if it does not nourish and support us?          – Montaigne, “Of Pedantry”.

This is how mathematics is usually delivered to students: knowledge taken upon trust. And it is knowledge of a particularly mystical variety, in which magical symbols come together in incomprehensible ways. This is the myth we need to dispel; this is the knowledge we need our kids to digest. Sometimes algebra can be nothing more than a way to write down knots and their properties.

The last session, using the notation we had developed together, went well. One unexpected hiccup was that my students had not seen parentheses before, but we took that in our stride. They also hadn’t used variables before, so most of the examples were concrete; the little use of variables we did make was not confusing.

I even introduced some proper maths terminology, the Big Maths Words. First I asked them to give a name to the composition operation, and they came up with some reasonable suggestions such as gluing, linking or joining. When I told them the Big Maths Word was composition they liked it — it was “like music”, where you “compose” a song from notes. Indeed! They also had reasonable suggestions for the tensor, things such as stacking or building, but they didn’t much like the Big Maths Word tensor. I don’t like it very much myself, I confess.

For the first 20 minutes we went over the new, tidied-up notation. For the rest of the time, 30 minutes or so, we worked out examples of going from a drawn knot to its formula. It is not easy, in fact it is quite a challenge (try the exercises), so they often got things wrong. One pervasive error was getting confused by repeated tensoring (“stacking”) of tangles, so that K⨂K’⨂K” was often written as K⨂K’K” — an interesting error.

In tomorrow’s session we will finally reach equations. We have already seen the equation I*=I, which was noticed while we were developing the notation. We will start with similar coherence equations, which basically say that two notations denote the same knot. Things like associativity of composition or of the tensor are typical coherence equations, but because the children have no appreciation of parentheses and operator precedence, associativity is perhaps not the best example. Functoriality of the tensor is much more interesting.

  1. Draw the tangle for (C◦C*)⨂(C◦C*).
  2. Draw the tangle for (C⨂C)◦(C*⨂C*).

In both cases the result is a couple of loops. The difference is in how we put the loops together, not in what the result is.

Functoriality means that no matter what tangles H1, H2, H3, H4 we choose (for some reason the kids chose H as the standard variable for knots), the following two tangles, whenever they can be constructed, are always equal:
 
(H1◦H2)⨂(H3◦H4)=(H1⨂H3)◦(H2⨂H4).
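For readers who, like me, find it easier to think in code, the notation can also be written down as a small OCaml datatype (my own sketch, not part of the club material), with the two sides of the functoriality equation appearing as two different terms that denote the same tangle:

(* A sketch; the constructor names follow the post's notation. *)
type tangle =
  | I                               (* a plain wire *)
  | C                               (* the unit: a cup *)
  | CStar                           (* the co-unit: a cap *)
  | X                               (* a crossing of two wires *)
  | Compose of tangle * tangle      (* ◦ : joining tangles end to end *)
  | Tensor of tangle * tangle       (* ⨂ : stacking tangles *)

(* The two sides of (H1◦H2)⨂(H3◦H4) = (H1⨂H3)◦(H2⨂H4), as terms. *)
let lhs h1 h2 h3 h4 = Tensor (Compose (h1, h2), Compose (h3, h4))
let rhs h1 h2 h3 h4 = Compose (Tensor (h1, h3), Tensor (h2, h4))

A faithful formalisation would also track how many loose ends each tangle has, which is exactly what the proviso “whenever they can be constructed” is about.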

The unit and the co-unit also suggest compact-closed-like coherence axioms, which have a more topological flavour. Try this knot: (I⨂C)◦(C*⨂I). It looks like this:

But this is only a wire with some irrelevant bends. We have the equation

(I⨂C)◦(C*⨂I) = I

There is another similar equation, where the shape is like an ‘S’ rather than like a ‘Z’. Can you guess it?

The trove of equations for this algebraic theory of knots and tangles is deep, and there is plenty of scope for children to discover equations for themselves once they understand what equations represent. In addition to the compact closed structure we can also find equations coming from traced, braided and closed monoidal categories.

Most importantly, though, I think the point is not simply to hand such equations to the students so that they can develop skill in calculating knot equivalences by equational reasoning, so I won’t do that. That is exactly the kind of mystical maths I want to avoid. What matters are these two points:

  1. Understanding the idea of equational reasoning, that the same structure can be expressed in several equivalent ways, and that equations are the mathematical way to show this.
  2. Exploring the space of equations. Ideally, after a while the issues of soundness and completeness should emerge naturally.

Now, a brief comment for the cognoscenti. The categorical structure of this algebra of tangles is not an obvious one. We are using both the unit and the co-unit of compact closed categories, but our category is braided rather than symmetric. However, compact closed categories are symmetric, so we are not in a standard setting. The interaction of the braiding with “compact closure” (informally speaking) gives equations like this:

C◦X◦C* = C◦C*

which correspond to this topological isomorphism of tangles:

These are two ways of describing the unknot. The category is clearly non-degenerate, because we seem to be able to describe any knot, so combinatorial completeness seems an achievable result. Equational completeness (i.e. that any two equivalent knots can be proved equivalent), however, seems like a challenging and interesting question!

In terms of related work, this seems close enough: Peter Freyd and David Yetter, Braided compact closed categories with applications to low dimensional topology, Advances in Mathematics 77, 156–182, 1989.

If you are interested in another diagrammatic take on algebra, also have a look at Pawel Sobocinski’s entertaining and informative blog.

Posted in teaching | 1 Comment