# A case against syntactic context

A program is what a program does, so if two programs do the same thing then they are the same thing. Semantics starts when you try to pin down what “doing the same thing” means. What is observable? Input and output are observable, but information such as duration of execution or memory usage tends to be ignored. If the program is non-deterministic then things complicate, because non-terminating behaviour is not observable (you cannot run a program until it doesn’t terminate). Depending on how you want to account for such behaviour, different notions of observation arise.

But these choices are deliberate, and the consequences of committing to one choice or another are well understood. At least the fact that there will be consequences is well understood. Here I want to talk about a different choice, a choice made so universally, a choice so ingrained in the way we do semantics of programming languages that it doesn’t even feel like a choice. It feels like “the right thing to do”. I want to explain why it is rarely actually so, and why it is mostly the wrong thing to do.

I am talking about terms and contexts. Programs that include free identifiers are not considered to be executable in the same sense in which closed programs are, i.e. programs in which all identifiers are locally defined. In fact we call the former terms and we only call the latter (proper) programs. So $f(x)$ is a term and cannot be executed, whereas $(\lambda f\lambda x.f(x))(\lambda x.x+x)1$ is a program which we can run.

Semantics becomes really interesting when we talk about terms. And the main goal of semantics is establishing that two terms are the same, so that any use of one is equivalent with any use of the other. They can be substituted for each other with no observable effect (the notion of observation is assumed fixed). But if we cannot rely on running a term, what can we do?

The universal approach is this: we take two terms to be equivalent if for all contexts which together with the term form a program, these programs that we get are equivalent.

$M\equiv N$ iff $\forall C[-].C[M]\equiv C[N]$

For a term such as $f(x)$ above, a context could be $(-)(\lambda x.x)1$. The dash indicates where the hole in the context is, where the term is plugged in to make a program.

I am not sure where this definition of equivalence of terms was introduced first, but I have seen it in the oldest papers of Scott, Milner and Plotkin. It is a definition which seems to have been proposed and accepted by default from the beginning of semantics of programming languages.

And it is wrong.

At least, most of the time.

The reason why it is wrong is that the context in which a term is executed is rarely constrained by the syntax of a particular programming language. The undefined identifiers represent function calls, which can be defined in other parts of the program. But they can also be library calls, or calls to the runtime services of the operating system. Any reasonably mature compiler supports separate compilation and foreign function interface, and once the conventions of the application binary interface are obeyed the linker can connect any compiled term with any other piece of object code, written in any language. For runtime services it is even more difficult to conceive of a syntactic notion of context, since the function can interact directly with a hardware device which has a physical behaviour unspecified syntactically.

Thinking of a context as a syntactic context is a dangerous over-simplification. Any property of a program proved relative to a syntactic context will be valid only in a syntactically valid context. For example if you prove the correctness of a library up to syntactic context, it will not be correct when used from a program which does not constitute a proper syntactic context, either because it is written in a different programming language or because it is malformed yet it passed by the compiler checks. This can actually happen in the case of C. Even worse, if you prove security properties of a piece of code up to syntactic context you only prove that an attacker which only uses that programming language will be unable to violate the property. But attackers rarely do that.

The use of syntactic contexts has led to a lot of intrinsically interesting work in programming languages, culminating in various full abstraction results, i.e. denotational models that perfectly capture the discriminating power of the syntactic context. But unless the properties of the syntactic context can be enforced in the object code and runtime system (and there is work on that) properties based on syntactic context will only be valid in a fictitious, highly idealised world.

What is a more realistic way? The context should be seen as the computational context, the combined capabilities of the language, the compiler, the operating system and the user. This may sound complicated but it is actually simpler than capturing the vagaries of syntax. And it is a scientific exercise in the sense that it should capture actual system behaviour. I like to call it system-level semantics.

More about it in an upcoming post.

Note: If you are impatient you can read the paper or watch the video, but they are on the technical side.