Searching/fumbling for a proof assistant

I was always very impressed with people who use proof assistants to do their maths. It seems like something from the future, mechanised, unerring, complicated, slightly incomprehensible, and it makes traditional pen-and-paper (well, LaTeX) maths seem quaint, artisanal, even obsolete. The contrast cannot be more striking when it comes to proving some complicated system (e.g. an operational semantics) satisfies some boring sanity property (e.g. reduction preserves typing). For such work lying at the border between mathematics and bureaucracy the use of proof assistants makes such a big difference that it is quite likely that in the not-too-distant future their use will be mandated. Mathematically, such proofs are rarely interesting (and the interesting bits can always be factored out into nice-to-present lemmas) but quite important. It’s a lot more like making sure a piece of code is free of bugs than showing that a piece of maths is interesting.

So I am sold on the importance of proof assistants, but which one to use? The first one I’ve tried was Coq. It is used very broadly, it has an established community of developers, and has been the engine behind some heroic achievements such as finally conclusively proving the Four Colours Theorem.

After a couple of weeks of reading about it, watching some tutorials, playing with it, I produced a tutorial/demo/taster where I prove some simple things about lists in Coq. There was enough information in this tutorial to allow some (smarter than average) first year students to work out some of their assignments using Coq, which was very impressive to me.

More recently I became exposed to Agda and, for the sake of comparison, I reworked the same tutorial/demo/taster in Agda this time. Agda is newer and smaller than Coq, and not quite as well known.

It is something a bit strange about doing a tutorial when yourself are a beginner. From a certain point of view it is dangerous because you may say or do things that are, for the expert, unusual or wrong. On the other hand, there is something positive to say about the innocent approach of the non-expert. Where I fumble you are likely to fumble, where I get stuck, you are likely to get stuck. These are the kind of tutorials where you don’t actually learn Coq/Agda, but you learn whether you want to learn Coq/Agda.

You can obviously draw your own conclusions from what you see in the videos, but I will also share my impressions. Both systems are wonderful and there is something about them that tells me that this is what the future should be like.

Working with Coq felt like driving a tank. Working with Agda felt like driving a Formula 1 car. Coq feels big and powerful, but not very elegant. Agda feels slick and futuristic but you can only wonder how the lack of a tactics language will work out in the trenches of a big boring project. Some of the more wonderful features of Agda (mixfix operators, seamless Unicode support, interactive development) I can see trickling into mainstream programming languages. The elegant proof-term syntax of Agda I can see more appealing to mathematically inclined users. Whereas the nuclear-powered tactic language of Coq is bound to appeal to hackers who just want to get (impressive) things done. But that’s just my opinion, and I don’t think it’s a very original or unique one.

Perhaps the main difference, for my money, between the two is the way I get stuck trying to get things done. In Coq it is quite often the case that I know how to achieve a proof goal but I cannot figure out what is the tactic I should use to achieve it or how to find it — so I blame Coq. Whereas in Agda, when I get stuck it’s because I made a mistake and I get an incomprehensible error message — but then I blame myself. Because of that, I end up being quite frustrated working with Coq whereas I derive a lot of satisfaction working with Agda.

 

Posted in proof assistants | 4 Comments

Seamless computing

In an earlier post (“Remember machine independence?“) I pointed out my dissatisfaction with the so-called domain specific languages (DSL) which are popping up everywhere nowadays, in support of new and esoteric programming platforms. The name DSL bugs me because I find it misleading. The right name would be ASL: architecture-specific language. They are not meant to solve problems in a particular domain, they are meant to program a particular architecture. Many such languages have some of the trappings of higher-level programming languages but also enjoy unconventional syntax and semantics, designed to match the platform.

I am not buying it. I think it is a mistake. I think the first and foremost principle of a higher-level programming language is that of machine independence. A higher-level programming language should allow the programmer to solve a problem rather than drive a physical system. This was a crazy idea in the 1950s, which went through all the stages that a crazy idea goes through:

First they ignore you. Then they ridicule you. And then they attack you and want to burn you. And then they build monuments to you. (Nicholas Klein)

The you in this quote was John Backus, who I am not sure whether he got a monument out of it (or whether anyone really wanted to burn him), but he did get a Turing Award, which is not bad. Not accidentally, the title of his lecture on the occasion of the award was Can Programming Be Liberated From the von Neumann Style? [PDF]. And machine independence figured high on the agenda.

Now the idea of machine independence is getting ridiculed again. As it was back in the 1950s the reason is exactly the same: computer programming experts are comfortable with machine-dependence in a programming language, and are worried that machine-independence will cause a substantial loss of efficiency. Which is both true and misguided at the same time, unless you are working on system-level programming. For application-level programming (scientific, corporate, etc) it rarely helps that your program is twice as fast if it takes twice as long to write.

In a 1958 paper called Languages, Logic, Learning and Computers, (J. of Computers and Automation) J.W. Carr III writes this:

One would like to be able to use these computers up to the day of the next one’s arrival, with interchange of problems and yet efficiency. Can this be done?

Carr is referring to the upgrade from IBM-650 to the “totally different” IBM-701 and IBM-704. This is before FORTRAN had been invented so the answer was not obvious at the time. But can we extend this point to our situation and replace ‘computers’ with ‘architectures’. Should our programs, written for the CPU, not work for GPU, FPGA, cloud, etc? The skeptics say ‘no’ and often argue that a mismatch between the programming language model and the system model leads to inefficiency. Perhaps, but in many circumstances loss of performance is a price worth paying. If we look at the languages of today, especially functional languages, they do not match the CPU-based architecture very well at all. They are declarative where computers are imperative. They use functions where computers use jumps. To compile (efficiently) one for the other a lot of optimization work is put in and the native systems are augmented by runtime systems to provide essential services such as garbage collection. But this does not invalidate the principle of having good programming languages even if they don’t match the architecture on which they run. Designing machines especially to run functional languages is actually a cool and interesting idea (see the REDUCERON).

Here is the same objection, taken from Carr’s paper:

The problems of efficiency have been often emphasized by programmers who believe strongly in the continuance of hand-programming over translational techniques.

“Hand-programming” means programming in assembly or even binary. “Translational” means using what we now call compilers. It’s safe to say history vindicates Carr and not those overly concerned with efficiency.

Bob Harper’s blog has a number of excellent posts on the duality between programming in the language and programming for the machine. They are all quite relevant to the point I am making, so have a look.

Machine-independence is clearly the necessary ingredient to achieve the kind of portability considered desirable by Carr. So we are doing something about it. We are developing a compiler for a conventional imperative and functional programming language (we call it Verity) which would allow effective and efficient compilation to any architecture. We already have a back end for compiling to FPGAs, which we call the Geometry of Synthesis project. It supports higher-order functions, recursion, separate compilation and foreign function interfaces to native HDL, all in an architecture-agnostic way. I wrote about this project on this blog before.

Distributed seamless computing

Today I want to talk distributed computing. From the point of view of supporting architecture-independent programming, distributed computing is a mess. Lets start with a simplistic program, just to illustrate a point, a program that in a functional programming language I might write as:

let f x = x * x in f 3 + f 4

And suppose that I want to distribute this code on two nodes, one that would execute the function f, one that would execute the main program. I can choose a high-level distributed programming language such as ERLANG. Here is how the code looks:

c(A_pid) -> receive X -> A_pid ! X * X end, c(A_pid).
main() ->
 C_pid = spawn(f, c, [self()]), C_pid ! 3,
 receive X -> C_pid ! 4, receive Y -> X + Y end
 end.

In the first program I describe logical operations such as multiplication and addition, and a logical structure to the program. In the second program I describe a system: I spawn processes and send messages and handle messages. A program which should be trivial is no longer trivial.

Things get even worse if I choose to distribute my program on three nodes: one for f, one for the function applications, one for whatever is left of the main program. This is the ERLANG code for this:

c() -> receive {Pid, X} -> Pid ! X * X end, c().
b(A_pid, C_pid) ->
 receive
 request0 -> C_pid ! {self(), 3}, receive X -> A_pid ! X end;
 request1 -> C_pid ! {self(), 4}, receive X -> A_pid ! X end
 end,
 b(A_pid, C_pid).
main() ->
 C_pid = spawn(f2, c, []),
 B_pid = spawn(f2, b, [self(), C_pid]),
 B_pid ! request0,
 receive X -> B_pid ! request1, receive Y -> X + Y end
 end.

Can you tell that these two ERLANG programs compute the same result? If you initially wrote the first two-node version and decided to re-factor to three nodes how hard would the transition be? You need to rewrite most of the code.

There is a better way. The way you might want to indicate nodes to the compiler should not be more than that:

let f x = (x * x)@A in f 3 + f 4

or

let f x = (x * x)@A in (f 3)@B + (f 4)@B

Is this possible? To the programmer who needs to struggle with the intricacies of writing distributed code this may seem like a pipe dream. But it is possible. Theorists of programming languages have known since the late 80s and early 90s that computation and communication are closely related. Whether it was Milner’s Functions as Processes [PDF], Girard’s Geometry of Interaction [DOI] or the early work on Game Semantics [see my survey] the message is the same: function calls are nothing but structured communication protocols. And this is not just pure esoteric theory, we really knew how to write compilers and interpreters based on it. Nick Benton wrote a very cool tutorial which also applies functions-as-processes to distribution. Why didn’t we write this compiler yet? Because we were afraid it might be too inefficient.

I think we should be braver and more sanguine and have a go at it. This worry could be misplaced or exaggerated. With my student Olle Fredriksson we made the first step, which is writing a compiler for a core functional programming language which can be distributed using light-weight compiler annotations just as described above. We call it seamless computing because it is distributed computing where all the communication between components is via function calls rather than via explicit sending and handling of messages. Explicit communication is the seam in distributed computing. This distribution is communication-free.

A paper describing the compiler was presented at the International Symposium on Trustworthy Global Computing earlier this year [PDF] and there is a video based on it which you can watch online. The compiler is available for download from veritygos.org. It is based on Girard’s Geometry of Interaction and it compiles down to C and MPI. There are some good things and some bad things regarding its efficiency. On the one hand the compiler does not require a garbage collector, which is very desirable in the distributed tokens. On the other, the tokens being exchanged during the computation have a size proportional to the depth of the call stack, which can be large especially for recursive calls. But this can be avoided by using a better implementation, still without requiring a distributed garbage collector. This is ongoing work. We are also currently expanding the compiler to handle the whole of Verity, so that it has concurrency and shared state. We are making great progress and it’s a lot of fun both as a theoretical challenge and as an implementation and engineering one.

We will keep you posted as things develop!

Posted in seamless computing | 2 Comments

The ultimate programming language?

In several earlier posts I was considering how the conventional notion of contextual equivalence, which relies relies on quantification over syntactic contexts, is sometimes not suitable and a more direct notion of equivalence should be used. This notion of equivalence, which I like to call system-level semantics, should be induced directly by what is observable by the execution environment as defined not only by language but also by abstract models of the compiler and the operating system.

More concretely, what I proposed (in a paper [PDF] with Nikos Tzevelekos) is that the system context should be defined along the lines of a Dolev-Yao attacker. This context is omnipotent but not omniscient: the program can hide information (data, code) from the context, but once the context gets this information it can use it in an unrestricted way. What we show then in our paper is how this attacker can produce “low-level” attacks against code, i.e. attacks which cannot be carried out in the language itself.

But what if the system-level context and the syntactic context are just as expressive? Then you have a fully abstract system-level semantics. This means that the attacker/context cannot produce any behaviours that cannot be produced in the language itself. Why is this cool? Because it means that all your correctness arguments about source level can be applied to the compiled code as well. You can think of a fully abstract system-level semantics as an abstract specification for a tamper-proof compilation scheme.

In a previous post I describe such a fully abstract SLS. Full abstraction is achieved by using monitors that serve as tamper detectors: if the environment behaves in ways which are illegal then the program takes defensive action (stops executing). So the full abstraction is achieved by introducing runtime checks for detecting bad behaviour, as in control-flow integrity, for example (see Abadi &al [PDF]).

But can you achieve a fully abstract SLS the other way around? By having a programming language that is so expressive that it can implement any attacks performed by a DY-style environment? I think the answer is ‘yes’. And such a language is surprisingly civilised: lambda calculus with state and control introduced in a uniform way. This language, called lambda-mu-hashref, was introduced especially to study security properties in a language capable of producing very powerful contexts. A fully abstract SLS-type semantics is given in the paper.

I find this quite remarkable, the fact that a civilised programming language which is not that much different from ML can induce contexts as powerful as a DY attacker. Any (reasonable) behaviour should be possible to implement using it. There is nothing else you can add to it to make it more expressive. It is, from this point of view, the ultimate programming language. 

Posted in system level semantics | Comments Off

Two consequences of full abstraction

In a recent paper [pdf] with my student Zaid Al-Zobaidi we describe two enhancements to the GoI compiler.

The first one is a tamper-proof mechanism which guarantees that no low-level attacks against the compiled code are possible. This is very nice to have because it means that all properties that you can prove about a program at code level are preserved by the compilation process. Of course, since we are generating hardware (rather than machine code) the ways in which a malicious environment can mount an attack are very constrained. Most importantly, the code cannot be changed, as there is no code just a fixed circuit. However, the compiler is compositional and can interface with libraries written in other languages, and such libraries can violate language invariants in unexpected ways. For example, in the hardware model calls/returns to/from functions are associated with fixed ports in the interface, so a malicious environment can return from a function that wasn’t even called!

The way our synthesised hardware can be made tamper-proof is by providing interaction monitors which take defensive action (e.g. halting the circuit) if the environment behaves in ways which are inconsistent with the behaviour of circuits obtained from compiling legal programs — e.g. circuits which act like functions that return without being called. This is possible because we have a fully abstract model of our programming language, i.e. a model which correctly and completely characterises programs written in the language. With the addition of tamper-detecting monitors, the game semantics and the system-level semantics of the language become the same: the system must play by the rules we impose!

We are not the first to note the importance of fully abstract compilation in the prevention of low-level attacks [Abadi 98] but as far as we can tell our technique is the first compiler to do that for a substantial programming language. Of course, if compiling to machine code this task becomes hard-to-impossible.

The second improvement is due to the fact that the environment can observe the code only in a limited way, so more states of the system seem equivalent from the outside. This is quite similar in principle to bisimulation in a typed context, although the technical details differ. Our laxer notion of ‘equivalence’ lacks transitivity so it’s not really. However, from the point of view of reducing the state space of the automata, hiding behind a monitor turns out to be very effective.

The paper was presented by Zaid at ICE 2012.

Posted in game semantics, Geometry of Synthesis, system level semantics | 1 Comment

Remember machine-independence?

An interesting quotation from Dijkstra, written in 1961:

I would require of a programming language that it should facilitate the work of the programmer as much as possible, especially in the most difficult aspects of his task, such as creating confidence in the correctness of his program. This is already difficult in the case of a specific program that must produce a finite set of results. But then the programmer only has the duty to show (afterwards) that if there were any flaws in his program they apparently did not matter (e.g. when the converging of his process is not guaranteed beforehand). The duty of verification becomes much more difficult once the programmer sets himself the task of constructing algorithms with the pretence [sic] of general applicability. But the publication of such algorithms is precisely one of the important fields of application for a generally accepted machine independent programming language. In this connection, the dubious quality of many ALGOL 60 programs published so far is a warning not to be ignored. [On the Design of Machine Independent Programming Languages]

I am bringing this up because to me it seems a lesson forgotten. There is a tidal wave of new architectures (manycore, GPU, FPGA, hybrid, cloud, distributed, etc) coming and people are happily producing special-purpose languages for each of them. There is a fashion now to call such languages Domain Specific Languages (DSL), but that is to me a misnomer. DSLs should be problem specific, whereas these are architecture-specific. They are ASLs.

At the low-level we certainly need (C-like) languages to program these systems. But what worries me is the embedding of “DSLs” into languages such as Haskell, resulting in strange mongrels, architecture-specific high-level languages. They offer you the poor performance of a high-level language in exchange for the poor abstraction of a low-level language. In fairness, they do offer you a much better syntax than the native languages, but is this enough? Aren’t we settling for too little?

I don’t think we should give up on machine independence just yet. I have recently given two talks on compiling conventional higher level languages into unconventional architectures. The first one is on the Geometry of Synthesis (compiling into circuits), invited talk at Mathematics of Program Construction 2012. The second is on compiling to distributed architectures, a talk at the Games for Logic and Languages 2012.

Posted in Geometry of Synthesis | Comments Off

Game semantics, nominally

I have an upcoming paper in MFPS28, about using nominal techniques to represent “justification pointers” in game semantics (jointly with Murdoch Gabbay).

In game semantics we represent “plays” as sequences with “pointers” from each element to some earlier elements. This is an awkward structure  to formalise, and many papers (including my own) tend to gloss over the annoying bureaucratic details involved in formalising them (e.g. by using integer indices into the sequence).

I don’t mind very much in general this aspect of game semantics, but one situation when it comes into play quite painfully is in teaching it. It’s hard not to feel a bit embarrassed when explaining them chalk-in-hand at the blackboard. It was in fact during such a lecture when Andy Pitts jokingly (I hope) booed me from the audience when I explained pointers and plays.

Ever since, I wanted to fix this annoying little aspect of game semantics and now I think we did, by using nominal techniques: every move in a sequence introduces a fresh name (the tip of the pointer) and uses an existing name (the tail of the arrow). It is as simple as that, really, and everything seems to just work out nicely.

Mathematically, this is perhaps trivial, but this is pretty much the point of it. The contribution of the paper is “philological”, it shows that the nominal setting gives the right language for pointer sequences.

Link to the PDF.

Posted in game semantics | Comments Off

Armchair philosophy

I was reading recently a fun paper by Wesley Phoa, Should computer scientists read Derrida? [pdf]. I was attracted by what seemed to me a preposterous title, being quite sure the paper is a parody. Instead, I found myself confronted with a serious, beautifully written and thoughtful paper.

I will not comment on the paper itself because it deserves to be commented on seriously. I’ll just say that although I liked it quite a lot and although I recommend it as an excellent and thought-provoking read, I didn’t quite agree with its conclusions. Yet I was intrigued by the question. So I’ll have a go at giving a different answer.

First, we need to frame the entire history of semantics of programming languages in a blog post and then lets see how we can fit Derrida in there. We need to start with logic, of course.

There are two kinds of semantics (to use JY Girard’s terminology): essentialist and existentialist. I disclose that I am 10% essentialist and 90% existentialist, so that the reader can calibrate my account taking personal bias into consideration.

Essentialists take after Tarski and aim to understand language by expressing meaning into a meta-language. So that the proposition the grass is green is true if the grass is indeed green. The meaning of grass is ‘grass’, of is is ‘is’, etc. To the uninitiated this may seem funny, but it is serious business. In formal logic symbols get involved, so it seems less silly. The proposition A∧B is true if A is true and B is true, so the meaning of “” is ‘and’.

In programming languages this approach is the oldest, and is called “Scott-Strachey” semantics according to the two influential researchers who promoted it. It is also called ‘denotational’ (because symbols ‘denote’ their meaning) or ‘mathematical’ (because the meaning is expressed mathematically). To map symbols into meaning it is conventional to use fat square brackets 〚  and 〛.

So what is the meaning of the ‘2+2‘? It is 〚2+2〛 = 〚2〛 + 〚2= 2 + 2 = 4. Students are sometimes baffled by this interpretation even though we do our best to try to make things clearer by using different fonts to separate object and meta-language. (Because this blog is written in a meta-meta-language I need to use even more fonts.) It is easy to laugh at this and some do by calling it, maliciously, ‘renotational pedantics’. Well, much of it perhaps is, but not all of it. It is really interesting and subtle when ascribing meaning to types, i.e. things like 〚intint〛 which can be read as ‘what are the functions from ints to ints describable by some programming language’.

In programming language semantics, and in semantics in general, the paramount question is that of equivalence: two programs are equivalent if they can be interchanged to no observable effect. In an essentialist semantics A is equivalent to B if 〚A〛=〚B〛. This justifies the name quite well, since A and B are ‘essentially’ the same because their meanings are equal. Above we have a proof that 2+2 is equivalent to 4, because 〚2+2〛 =〚4〛= 4.

Essentialism has come under some fierce criticism. Wittgenstein, a promoter of it, famously reneged it. Famous contemporary logicians JY Girard (The Blind Spot) and J Hintikka (Principles of Mathematics Revisited) both attack it viciously (and amusingly) because of its need for a meta-language: what is the meaning of the meta-language? “Matrioshka-turtles” is what Girard calls the hierarchy of language, meta-language, meta-meta-language, etc. — it goes all the way down to the foundations of logic, language and mathematics.

Existentialism takes a different approach: language is self-contained and we should study its dynamics in its own right. Wittgenstein uses the concept of ‘language games‘ both to understand the properties of language (which are still called ‘meaning’ although the non-essentialist reading of ‘meaning’ is a bit confusing) and to explain how language arises. For some reason existentialist approaches to meaning always end up in games. Lorenzen used games to give a model theory of logic and Gentzen used games to explain proof theory. Proving something to be true means winning a debate. When something is true a winning strategy for the debate-game exists.

In programming languages the first existentialist semantics was operational semantics, which describes meaning of language in terms of itself, by means of syntactic transformations. It is intuitive and incredibly useful, but it is quite simplistic. Most importantly, it does not give a good treatment of equivalence of programs. It can only formalise the definition of equivalence: A and B are equivalent if for all language-contexts C[-] such that C[A] and C[B] are executable programs, C[A] and C[B] produce the same result. It is not a very useful definition, as induction over contexts is so complicated as to be useless.

Something very nice about programming language semanticists is that we’re not dogmatic. The battles over ‘which one is better’, denotational (essentialist) or operational (existentialist), were non-existent, largely restrained to friendly pub chats. In fact one of the first great open problems was how to show that the two are technically equivalent (the technical term is ‘fully abstract’ for a denotational model which captures operational equivalence perfectly). Milner, who coined the term, also showed how an essentialist (denotational) semantics can be constructed (where the word ‘constructed’ is not necessarily meant constructively) out of an existentialist (operational) semantics. I am not sure how aware philosophers of language are of this construction, because its implications are momentous. It establishes the primacy of existentialist semantics and reduces essentialist semantics to a mathematical convenience (publishing a research paper on programming language semantics without a proof of ‘adequacy’ is a serious challenge).

Milner’s construction is not very useful (which takes nothing away from its importance) so finding more palatable mathematical models is still important, and proving ‘full abstraction’ is the ultimate technical challenge. This turned out to be very hard, but it was eventually done (for many languages), also using the ‘game semantics‘ I mentioned above.

Curiously, in programming language semantics games were introduced as a denotational (essentialist!) semantics: a program denotes a strategy rather than an operational (existentialist) semantics. In fact, some technical aspects of the first game models [see Inf. Comp. 163(2)] made PL Curien (and others) complain that game models were too much like Milner’s construction of a denotational semantics out of an operational semantics [link].

Again, this full abstraction result was not the intended goal: we wanted a fully abstract model of PCF for short. But Loader’s later result settled the question negatively: he showed that the observational equivalence for PCF is not effective. As a matter of fact, the game-theoretic models of PCF given in 1994 by Abramsky, Jagadeesan, and Malacaria (AJM), and by Hyland and Ong (H0) offer syntax-free presentations of term-models, and the fully abstract model of PCF is obtained from them by a rather brutal quotient, called “extensional collapse”, which gives little more information than Milner’s original term model construction of the fully abstract model.

However, these technical issues were only manifest in the PCF model and most subsequent models, starting with Abramsky and McCusker model of Algol were syntax-free, complaint-free, fully abstract models.

Still, this is somewhat of an anomaly, at least historically and I think methodologically. Things were, however, set right by J Rathke and A Jeffrey who came up with a neat combination of operational and game semantics which they called ‘trace semantics‘. P Levy renamed this to ‘operational game semantics’, although I think for historical reasons it would be nice if it was all called just ‘game semantics’. Unlike vanilla operational semantics, operational game semantics offers a neat treatment of equivalence. Two programs A and B are equivalent if whatever ‘move’ one can make in its interaction with the environment the other can match it (the technical term is ‘bisimilar’). Also, unlike vanilla game semantics operational game semantics doesn’t really need the operational semantics.

What about Derrida then?

Broadly speaking, Derrida makes a critique (called deconstruction) of existing approaches to understanding language and logic on the basis that they ‘ignore the context’:

… [deconstruction is] the effort to take this limitless context into account, to pay the sharpest and broadest attention possible to context, and thus to an incessant movement of recontextualisation. The phrase … ‘there is nothing outside the text’, means nothing else: there is nothing outside the context.

This is obviously a statement exaggerated for rhetoric effect, but it’s not silly. In understanding language, context is indeed essential. In understanding formal logic, however, I don’t see how context plays a role, because the context is known. It is sort-of the point of formalisation to do away with the difficult questions that context raises. How about programming then? Is context important? Yes, it is. I’ve been saying this for a while, including on this blog. Realistic programming languages exist in a complex and uncontrollable context. Take these two little programs in simplified C-like languages.

int f() { int x = 0; g(); return x; } 
versus
int f() { g(); return 0; }

Are they equivalent? We don’t know what g() is. Importantly, we don’t know if it’s even a C function (C has a foreign function interface). If g() was “civilised”, then the two are equivalent because a civilised function g() has no access to local variable x. However, g() may be uncivilised and poke through the stack and change x. It’s not hard to write such a g() and hackers write uncivilised functions all the time.

This may seem like a hopeless problem, but it’s not. In fact, operational game semantics can model this situation very well. I call it system-level semantics, and it’s game semantics with a twist. In game semantics we model program+environment as a game with rules, which embody certain assumptions on the behaviour of the environment. In a system-level semantics the game has no rules, but it has knowledge. The environment is an omnipotent but not omniscient god, sort-of like Zeus. We can hide things from it. This is the same principle with which the Dolev-Yao attacker model in protocol security operates.

This does not necessarily lead to chaos, just like Derrida’s deconstruction need not degenerate in relativism:

… to the extent to which it … is itself rooted in a given context, … [deconstruction] does not renounce … the “values” that are dominant in this context (for example, that of truth, etc).

The point is that any enforcement of properties are up to the program and cannot rely on the context. In a game, the program should be able to beat any opponent-context, not just those behaving in a certain way (just like a security protocol).

A system-level semantics of C can be given where local variables are not hidden (a la gcc), and in such a semantics the two programs above are not equivalent. But a system-level semantics of C can also be given where local variables are hidden (e.g. via memory randomisation), which makes the two programs equivalent in any context. In a civilised context both semantics would be equivalent, but in an unrestricted context they are different.

So should computer scientists read Derrida? No. However, like Derrida urges, they should worry more about context.

Posted in game semantics, system level semantics | 11 Comments

Verity now with call-by-value

Call-by-name is generally seen as an annoyance by programmers. Fortunately, if your programming language has mutable state you can force evaluation at ground type by using assignment, so by writing

new v in v := x; f(!v)

you achieve the same effect as calling f(x) by-value. This is tedious, so ALGOL use

f(val x)

to achieve the same effect by using simpler syntax. We have added this feature to Verity now as well.

This was also an opportunity to me to get used to the compiler, which was written by my student Alex Smith. So I sat down with him and had him guide me in implementing this syntactic transformation in the GOS compiler. So that I don’t forget all this stuff I screencasted our session, which you can watch here.

Warning: it is not a very exciting video, other than the mild amusement generated by my constant fumbling.

On the plus side, the whole compiler change took as long as the video, i.e. about one hour including some simple testing. And it worked, no bugs, first time, which is yet another testament to what a nice programming OCaml is.

Posted in Geometry of Synthesis | Comments Off

Back to the future

The programming language ALGOL was “a language so far ahead of its time, that it was not only an improvement on its predecessors, but also on nearly all its successors,” at least according to C.A.R. Hoare. Few programming languages have been more studied [this] or better understood semantically; perhaps a dozen variations on this language obtained by adding various features have known fully abstract semantic models, mostly using game semantics. Few programming languages have a better programming logic than Reynolds’s awesome ALGOL specification logic [this]. And Reynolds’s Syntactic Control of Interference type system for ALGOL is a very clever way to handle effects in functional programming languages, avoiding some of the complications of monadic type systems.

Yet nobody is actually programming in this amazing language. What a pity. Call-by-name is certainly a problem, but it is mainly a problem for compiler writers rather than programmers. If you stick with first order functions, which is most of the code one normally writes, it is straightforward to force evaluation. And who is smart enough to use higher-order functions should be smart enough to use them efficiently.

Our old-new programming language Verity is a pretty faithful version of ALGOL, with the syntax mildly modernized (type inference, etc). Right now we only use this language to do higher-level synthesis, but I think it deserves to grow into a more general purpose language.

Posted in Geometry of Synthesis | 2 Comments

The case for a system-level semantics

In an earlier post I was deploring the way research in programming language semantics fetishises syntax. Because adding to a language anything other than syntactic sugar changes contextual equivalence in the language, the semantic model induced by syntactic context is (sometimes fundamentally) different. Defining a language becomes a neat game of matching a sensible syntax with a sensible semantics; this meet-in-the-middle approach is the most productive in leading to full abstraction results. I cannot think of a better example than the many versions of Idealized ALGOL: with or without “snap-back state”, with or without side effects in expressions, with or without “bad variables”. The differences are tiny yet the models are incredibly different. This is not great.

But what is the alternative? Can we have a meaningful notion of context which does away with syntax?

Lets go back to the conventional definition of equality for terms:

M\equiv N iff \forall C[-].C[M]\equiv C[N].

This definition works because C[M], C[N] are programs, i.e. closed terms, which can be executed in the operational semantics. They will either reduce to some ground-type value which can be compared directly, or will diverge, a non-terminating computation. The quantification over all context is just a way to get around the fact that we cannot execute open terms M, N.

My claim is that we can execute open terms. Real-life software is rarely self-contained. We can certainly compile open terms, this is how we get libraries. Then we link open object code against other open object code to get executables–but this linking can be actually done at run time if we use dynamic libraries. So the system (compiler+linker+RTS+operating system+computer) has a way to run open terms. Why doesn’t the semantics match the behaviour of the system?

Curiously enough, the technical apparatus for “executing open terms” has been developed. Not one, but several such techniques exist: trace semantics, operational game semantics, environmental bisimulation and of course game semantics. What they all have in common is that they model the interaction between an open term and the environment, usually in an effective way — you can build an interpreter out of the description. These are all very powerful techniques and they can describe very powerful, expressive environments.

What usually happens in such papers (and it’s not meant to be a very harsh criticism because I’ve written a couple of papers just like that) is that a lot of technical effort is dedicated to restricting the power of the environment, by adding various rules about what it can and what it cannot do, so that its expressivity matches perfectly the discriminating power of the syntactic context: full abstraction!

As a technical tour de force, full abstraction is usually a great accomplishment. In terms of usefulness, not so much, for reasons already discussed.

An alternative approach is to take the semantic power of the environment at face value and consider that it is a model of the computational environment. This is what the Dolev-Yao model of security is: a purely semantic description of the environment, i.e. the attacker. Precisely the same idea is eminently suitable for a programming language system-level context. The model is unrestricted computationally, only informationally (or epistemically); it can do whatever it wants with the information at hand but you can still hide things from it.

So when you specify a programming language you must explain not only what computations occur, as is normally the case in operational semantic specifications, but also what information is leaked to the environment and what sanity checks are built in the language when handling input from the environment. A specification for a language must be tied in with a high-level specification for a compiler and run-time system! This may sound like a lot, and it does involve more work than the usual operational semantics, but the result is a realistic basis for verification of language properties, especially against possibly antagonistic environments. More details in our paper and in this lecture.

One final comment: is a syntactic context ever appropriate? Yes! If we are talking about a calculus, which is not meant to be compiled and executed, but just studied in the abstract, a syntactic context is a perfect way to study its equational theory. In fact I would go as far as to say that the key difference between a calculus and a programming language is that the right notion of context for the former is syntactic and for the latter is systemic.

Posted in system level semantics | 4 Comments