Alcides Fonseca

40.197958, -8.408312

Posts tagged as Programming Languages and Compilers

Z3 Python in the Browser in 10 minutes

Last night, while I was catching up on email, I wanted to make use of my time and our Claude subscription. I decided to scratch an old itch.

Our aeon programming language has Liquid Types (e.g., {x:int | x > 0}) and we rely on an SMT solver to type check the implications of subtyping (e.g., when passing something of type {x:int | x > 3} to a function that accepts {x:int | x > 0}, we need to verify whether x > 3 -> x > 0, for all x).

But there was an issue: aeon is written in Python and relies on the z3 bindings that contain C++ code. We can run Python code in the browser with Pyodide, but the native libraries are not directly supported (at least this one, that relies on multi-threading).

On the other hand, there is a z3 port to web assembly (by alectryon’s Clément Pit-Claudel, no less) but it follows the C API, and has no browser-Python bindings.

So while I went through the ivory tower tall pile of emails, Claude reimplemented the z3 bindings in a different package, used the export to SMT-lib format feature, already present in z3, and passed that to the z3-wasm package.

I asked for examples, and piping the errors I found in the browser back to Claude, I gave up on being a middle man, and instructed Claude to use Rodney to interact with the browser directly (I was running this on a linux server, not my local machine). It then went ahead and made the examples work on Chrome, in a nice demo page. Unfortunately, it did not work on Safari due to the lack of stack switching support in web assembly, so I needed to make another prompt to fix that issue. It deployed to GitHub and Pipit automatically, with little effort.

Of course you get what you paid for: I provide no assurance that there are no bugs. But it’s useful enough for me to prepare some demos and materials for students that do not require any installation or compilation in their machines. That’s a win in my book.

And now, you have support for z3 in Python within the browser for your existing z3 Python projects, or just to play with z3 because the Python bindings are by far the most easy to just play with.

Foundations for hacking on OCaml

How do you acquire the fundamental computer skills to hack on a complex systems project like OCaml? What’s missing and how do you go about bridging the gap?

KC Sivaramakrishnan

KC gives several resources for students to get up to speed with contributing to OCaml.

One of the interesting resources is MIT’s the Missing Semester. This semester I created our own version of this course, covering git, docker, VMs, terminal/bash, testing, static analysis and LLMs for code.

While we cover how to do a Pull Request, I don’t believe students are ready to actually contribute. Reading large codebases is a skill that even our graduate MSc students don’t have. Courses are designed to be contained, with projects that need to be graded with few human effort, resulting in standard assignments for all the students.

I would love to run something like the Fix a real-world bug course Nuno Lopes runs. But being able to review so many PRs is a bottleneck in a regular course.

Should proofs have to be readable

It seems to us that the scenario envisioned by the proponents of verification goes something like this: The programmer inserts his 300-line input/output package into the verifier. Several hours later, he returns. There is his 20,000-line verification and the message “VERIFIED.”

Social Processes and Proofs of Theorems and Programs by Richard DeMillo, Richard Lipton and Alan Perlis

Although this is another straw man, many people claim to have verified something, offering as evidence a formal proof using their favourite tool that cannot be checked except by running that very tool again, or possibly some other automatic tool. A legible formal proof allows a human reader to check and understand the reasoning. We must insist on this.

Lawrence C Paulson

In my research group, we have been thinking not only on how to improve error messages, but also how to improve the understandability of proofs. It feels good to read such reinsuring take.

Scope of Generics in Python

Thanks to Continuous Integration, I have found a typing problem in our genetic engine program synthesis framework. It boiled down to me not defining a scope for a type variable.

I started with some code that looked like the following:

with the following error:

main.py:18: error: Argument 1 has incompatible type "P@consume"; expected "P@__init__"  [arg-type]
Found 1 error in 1 file (checked 1 source file)

You can load this example on the MyPy playground if you want to play around with it.

In this case, MyPy is inferring the type of data as dict[str, Callable[[P@__init__], bool], where the key is the init part of the type variable that ends up being different than the use o P inside the consume function. This behavior is because type vars are, by default, bound to the function/method, and not the class. The first step is to actually introduce the explicit annotation for data with the dict[str, Callable[[P],bool]] type, inside Subclass. Now we get a different error:

main.py:17: error: Dict entry 0 has incompatible type "str": "Callable[[P@__init__], bool]"; expected "str": "Callable[[P], bool]" [dict-item]

Now the P type variable in the field annotation is different than the ones inside the method. To actually bind the type variable to the whole class, we need to extend Generic[P]:

Now, we have no typing errors, and we do not even need the explicit type declaration for data.

Most of this issue was due to me not clearly understanding the default behaviors of type variables1. Luckily, if you are able to only support Python 3.12 and upwards, you can use the new, saner syntax. And maybe someday I’ll finish the draft post where I explain why Python’s approach to typing is the best (for prototyping type systems and meta-programming techniques, like we do in GeneticEngine) and the worst (for real-world use).

1 Who the hell creates a type variable through the definition of a variable??

Writing a compiler using Python, Lex, Yacc and LLVM

I found a good post on how to build your own toy compiler using Flex, Bison and LLVM. I saw one disadvantage right in the beginning: you had to use C++. If I were just prototyping a compiler, I wouldn’t use C++ but rather a dynamic language. And last semester for the Compilers course that’s what I did.

Students were assigned to build a Pascal compiler (actually a subset, but not that small) and the tools suggested were Lex, Yacc (using the C language) and compiling the code into C. I took a different approach and decided to do the project in Python (I actually tried ruby first, but the ruby-lex and ruby-yacc projects didn’t pass my basic tests).

I wrote the language grammar using PLY (the lex and yacc DSLs for python) and it was pretty simple. As for the AST generation, I had only a class Node that accepted an type and a list of arguments while my colleagues using C had to make 1001 structs for each kind of node. Not that it wasn’t impossible using C, but dynamic languages make the code simpler and more clear.

For the code generation, I decided to go with LLVM. It is a very promising project. Just take a look at google’s unladen-swallow or macruby, even parrot is planing on using llvm for their JIT.

For writing the code in Python, I had to use the llvm-py which I may say it’s in a early stages and lacks documentation. That was my major problem using. I had only three resources: the official guide, a presentation in japanese with some source code, and the actual source of the project (in C and C++).

Since every time I got an error in the llvm code generation it crashed the program, I had to dig into the source code of the project and find that error message and reverse engineer what was wrong with my code (usually I was giving values or pointers instead of references and vice-versa). So if you are doing something more complex, you actually need some C++ reading skills.

The project however worked, and I’m making it available so anyone may use the code as an example until better resources are published.