Alcides Fonseca

40.197958, -8.408312

Posts tagged as Programming Languages and Compilers

Z3 Python in the Browser in 10 minutes

Last night, while I was catching up on email, I wanted to make use of my time and our Claude subscription. I decided to scratch an old itch.

Our aeon programming language has Liquid Types (e.g., {x:int | x > 0}) and we rely on an SMT solver to type check the implications of subtyping (e.g., when passing something of type {x:int | x > 3} to a function that accepts {x:int | x > 0}, we need to verify whether x > 3 -> x > 0, for all x).

But there was an issue: aeon is written in Python and relies on the z3 bindings that contain C++ code. We can run Python code in the browser with Pyodide, but the native libraries are not directly supported (at least this one, that relies on multi-threading).

On the other hand, there is a z3 port to web assembly (by alectryon’s Clément Pit-Claudel, no less) but it follows the C API, and has no browser-Python bindings.

So while I went through the ivory tower tall pile of emails, Claude reimplemented the z3 bindings in a different package, used the export to SMT-lib format feature, already present in z3, and passed that to the z3-wasm package.

I asked for examples, and piping the errors I found in the browser back to Claude, I gave up on being a middle man, and instructed Claude to use Rodney to interact with the browser directly (I was running this on a linux server, not my local machine). It then went ahead and made the examples work on Chrome, in a nice demo page. Unfortunately, it did not work on Safari due to the lack of stack switching support in web assembly, so I needed to make another prompt to fix that issue. It deployed to GitHub and Pipit automatically, with little effort.

Of course you get what you paid for: I provide no assurance that there are no bugs. But it’s useful enough for me to prepare some demos and materials for students that do not require any installation or compilation in their machines. That’s a win in my book.

And now, you have support for z3 in Python within the browser for your existing z3 Python projects, or just to play with z3 because the Python bindings are by far the most easy to just play with.

Foundations for hacking on OCaml

How do you acquire the fundamental computer skills to hack on a complex systems project like OCaml? What’s missing and how do you go about bridging the gap?

KC Sivaramakrishnan

KC gives several resources for students to get up to speed with contributing to OCaml.

One of the interesting resources is MIT’s the Missing Semester. This semester I created our own version of this course, covering git, docker, VMs, terminal/bash, testing, static analysis and LLMs for code.

While we cover how to do a Pull Request, I don’t believe students are ready to actually contribute. Reading large codebases is a skill that even our graduate MSc students don’t have. Courses are designed to be contained, with projects that need to be graded with few human effort, resulting in standard assignments for all the students.

I would love to run something like the Fix a real-world bug course Nuno Lopes runs. But being able to review so many PRs is a bottleneck in a regular course.

Should proofs have to be readable

It seems to us that the scenario envisioned by the proponents of verification goes something like this: The programmer inserts his 300-line input/output package into the verifier. Several hours later, he returns. There is his 20,000-line verification and the message “VERIFIED.”

Social Processes and Proofs of Theorems and Programs by Richard DeMillo, Richard Lipton and Alan Perlis

Although this is another straw man, many people claim to have verified something, offering as evidence a formal proof using their favourite tool that cannot be checked except by running that very tool again, or possibly some other automatic tool. A legible formal proof allows a human reader to check and understand the reasoning. We must insist on this.

Lawrence C Paulson

In my research group, we have been thinking not only on how to improve error messages, but also how to improve the understandability of proofs. It feels good to read such reinsuring take.

Scope of Generics in Python

Thanks to Continuous Integration, I have found a typing problem in our genetic engine program synthesis framework. It boiled down to me not defining a scope for a type variable.

I started with some code that looked like the following:

with the following error:

main.py:18: error: Argument 1 has incompatible type "P@consume"; expected "P@__init__"  [arg-type]
Found 1 error in 1 file (checked 1 source file)

You can load this example on the MyPy playground if you want to play around with it.

In this case, MyPy is inferring the type of data as dict[str, Callable[[P@__init__], bool], where the key is the init part of the type variable that ends up being different than the use o P inside the consume function. This behavior is because type vars are, by default, bound to the function/method, and not the class. The first step is to actually introduce the explicit annotation for data with the dict[str, Callable[[P],bool]] type, inside Subclass. Now we get a different error:

main.py:17: error: Dict entry 0 has incompatible type "str": "Callable[[P@__init__], bool]"; expected "str": "Callable[[P], bool]" [dict-item]

Now the P type variable in the field annotation is different than the ones inside the method. To actually bind the type variable to the whole class, we need to extend Generic[P]:

Now, we have no typing errors, and we do not even need the explicit type declaration for data.

Most of this issue was due to me not clearly understanding the default behaviors of type variables1. Luckily, if you are able to only support Python 3.12 and upwards, you can use the new, saner syntax. And maybe someday I’ll finish the draft post where I explain why Python’s approach to typing is the best (for prototyping type systems and meta-programming techniques, like we do in GeneticEngine) and the worst (for real-world use).

1 Who the hell creates a type variable through the definition of a variable??

Catching the Java Train

If you are into Node.js, Elixir and all those new cool tools, you might have not heard of Java. Java is a bloated language and runtime from the 90ies that is used mostly by your bank.

Java has been loosing adoption because it has not been up to date with features that developers want from a language. Specially compared with its evil twin, C#/.NET, that has been the recipient of several new and awesome features, driving innovation in the ecosystem (LINQ, Type Providers, etc…).

They have now announced that Java will follow the train model of Ubuntu: a new release every 6 months with whatever’s ready at the time, and a LTS (Long-term-support) release every three years for enterprise customers.

This is they response for having the same product for both enterprise and new kids. My opinion is that they are hiding from the truth. Java8 was delayed until Lambda was ready. Java9 was delayed until Jigsaw (module system) was ready. I believe they took the right approach. Except I would have released Java7.1 and Java8.1 with some new stuff that was ready before the main incomplete feature was. The real problem here, the one they are avoiding, is that it took them ages to develop and mature both Lambda and Jigsaw (and at least on the Lambda part I think it was really incomplete).

The enterprise world was mostly OK with not having nor Lambda nor Jigsaw at the predicted time – they do not care. They may have been annoyed for waiting so long for security updates that were waiting for the main feature. Something that with my minor releases would be fixed easily. This is what happens with other Open-Source languages such as Python or Ruby.

The main problem is that the new kids were missing these features and switching to Scala, Groovy, Kotlin or whatever was trending at the time on Hacker News. So much that Google accepted Kotlin as an official Android language, which IMHO sucks for Oracle. But Oracle has made it clear that they do not care about the success of Android. And changing to the train model does not fix this type of problem. If the main features will be delayed, developers will not care about it if they have better usable alternatives.

So, dear Mark Reinhold, I believe this is more of a marketing stunt than actually solving the real problem: JDK development is slow as hell.

And I’ve experienced this first hand: I joined project Sumatra to bring GPGPU to Java, which I first did via the AeminiumGPU project. However, no one in the JDK team made this an effort. AMD tried to bring the Aparapi, but the lack of efforts from the remaining members lead the project nowhere. Even the prototype source tree was abandoned.

I believe the main reason is that despite Java being Open-Source in theory, it is not. It is a Oracle-controlled environment. Python has advanced much more in the same years being completely open-source (despite some investments from Google and Dropbox, among many others). Python does not need the train-model: it has a bleeding edge version in 3.6 (or whatever 3.X we are in right now), and still has 2.7 (or 2.6 if you are lazy like me) running fine in many servers.

Train model is not the solution: a better (and more open) development process is.

Self-taught developers

Source for first website – table based layout, a lot of view source, a lot of Notepad, a lot of IE 6. Used to work mostly in HTML and CSS. With the help from books like “HTML for the World Wide Web – Visual Quickstart Guide”, learned a lot as a tinkerer.

Two years in: good with HTML (table layouts) and moderate CSS (fairly new), basic PHP, could use FTP and do basic web config. Could get a site up and running from scratch. This was enough to get my first developer job. This was without any computer science background.

Now: front end developer with 10 years experience, not an engineer, or a code ninja. I don’t know Angular, React, WebPack. I don’t even know JavaScript inside out. I am valuable to my team. Need more: empathy, honesty, being able to see stuff from a user’s perspective.

Self taught developers today, via Tom Morris’ live blogging.

Back in my day, we learnt how to do things. Nowadays, kids learn how use high-level APIs, without any idea how things work underneath. They might learn Meteor, but have no idea about HTTP or Sockets or how HTTP Sessions are implemented. Which is fine for developing tiny little apps, but they miss the I understand all this sh*t feeling.

Supposedly high-level frameworks allow developers to write more complex programs in the same timeframe. However, I don’t believe this is true for small projects, because the setup time is increasing exponentially. Let’s start a new single page app, what do we need? Node, npm, webpack, angular or react or any other trendy framework. Say what you will about PHP, but it was a single one-click WAMP install away from your fingertips.

If you were a 13 year old kid wanting to develop your own app, what would you use?

Writing a compiler using Python, Lex, Yacc and LLVM

I found a good post on how to build your own toy compiler using Flex, Bison and LLVM. I saw one disadvantage right in the beginning: you had to use C++. If I were just prototyping a compiler, I wouldn’t use C++ but rather a dynamic language. And last semester for the Compilers course that’s what I did.

Students were assigned to build a Pascal compiler (actually a subset, but not that small) and the tools suggested were Lex, Yacc (using the C language) and compiling the code into C. I took a different approach and decided to do the project in Python (I actually tried ruby first, but the ruby-lex and ruby-yacc projects didn’t pass my basic tests).

I wrote the language grammar using PLY (the lex and yacc DSLs for python) and it was pretty simple. As for the AST generation, I had only a class Node that accepted an type and a list of arguments while my colleagues using C had to make 1001 structs for each kind of node. Not that it wasn’t impossible using C, but dynamic languages make the code simpler and more clear.

For the code generation, I decided to go with LLVM. It is a very promising project. Just take a look at google’s unladen-swallow or macruby, even parrot is planing on using llvm for their JIT.

For writing the code in Python, I had to use the llvm-py which I may say it’s in a early stages and lacks documentation. That was my major problem using. I had only three resources: the official guide, a presentation in japanese with some source code, and the actual source of the project (in C and C++).

Since every time I got an error in the llvm code generation it crashed the program, I had to dig into the source code of the project and find that error message and reverse engineer what was wrong with my code (usually I was giving values or pointers instead of references and vice-versa). So if you are doing something more complex, you actually need some C++ reading skills.

The project however worked, and I’m making it available so anyone may use the code as an example until better resources are published.

Thoughts on PDC

Some may accuse me of being a Microsoft guy, but using a mac in the past or so, I can’t really say that about me. Nevertheless, I keep an eye on Microsoft Conferences ( and I even got to attend one or two) because really cool stuff come from them. I’m not kidding about this. Let’s see PDC 2008:

Windows 7

I’ve been following Engineering Windows 7 blog, so I was pretty up to date with this stuff, but seeing real screenshots was pretty impressive. I have mixed feelings about the taskbar redesign. While I really liked the old one, I understand that this way it’s more usable in smaller resolutions (say notebooks or even mobile phones, think Shift or Advantage). But in bigger displays, that are cheaper and cheaper each day, the old style was pretty cool.

The vista style of the windows was predictable, but I really hate it. I do! I hope they get a real theming engine, and not make us use some third party software to make them more macish.

One cool surprise was to see that they fixed the horrible wifi icon in the traybar. Linux and Mac did it right years ago, and in Windows up to Vista and even in Windows Mobile it’s a pain to connect to networks.

About the multi-touch? Well, they had it all along with Surface (and Surface SDK), so no big surprise. We’ll see MS release the iTablet before Apple does.

The Cloud Stuff

a.k.a. Windows Azure

Well, startups are going the Cloud way. Amazon Web Services and Google App Engine are just a first step. Microsoft wants Entreprise costumers to join this trend, and be able to have their business in the cloud. I don’t know if this is going to be such as a success and they think. a) real small business don’t want their data on the clould. They want it in their small server in their intranet. b) Large companies that have the need for a cloud server probably can support having their own infrastructure and not relying on Microsoft. Maybe I’m mistaken, but we’ll see.

James Governor has written a really interesting post on this matter and even mentions OpenID in Azure Services.

More Cloud Stuff

a.k.a. Live Mesh

Live Mesh is the Mobile Me for the rest of us. It syncs files P2P or through the cloud and for those, like me, with several computers rocks.

Since the Mac and Windows Mobile clients came out, I guess I’ll have to give it a try some day.

Dale Lane writes about the transition from USB syncing to Cloud syncing. It’s true Google doesn’t provide a offline sync out of the box in the Android, but I like to have the oldschool method available when needed.

Yet More Cloud Stuff

a.k.a. Live Services

Angus got extra points for the shirt and for spreading the social word among the entreprise developers there.

It’s true that Microsoft has a different view form Google and Yahoo that are embracing the OpenID+OAuth way, but this might change in the future. You can already see some little steps being made.

Dynamic Languages

Oddly, the first dynamic language I noticed in PDC was C#. Really! C# is now lightyears away from Java, and is evolving continuously. Version 4 brings a lot of new features and one of them is the ability to integrate dynamic languages directly in C# using the dynamic type. I believe C# is becoming more of a glue language (LINQ, Dynamic Languages, F#) that allows programmers to switch smoothly to other languages.

As usual, I love John Lam’s talk on IronRuby that besides the usual C#, Silverlight and Testing/Mocking stuff, demoed a Visual Studio Plugin in Ruby and Web Services using Sinatra. You should really take a look at it.

Oslo Modeling tools

DSLs are becoming popular in the several business software. and is something Microsoft was looking at a while ago. While I’d say IronRuby was the way to go (see RSpec examples), they took it further and made their own toolkit, Oslo, to develop both visually and textually Models The language they created to achieve that purpose is called M, and right now is supported through the IntelliPad editor.

In fact this editor was what got my interest in this area, since it’s codename was Emacs.NET, and since I’m in the quest for the perfect editor I wanted to take a look. Well, right now it supports the M language, but “you can extend it using IronPython”:hhttp://www.masteringbiztalk.com/blogs/jon/PermaLink,guid,92ec6f1f-45e5-4b7d-b675-548be5131a07.aspx. I’ll wait to see the first plugins to support different languages in the IntelliPad.

In the meanwhile, take a look at the different Oslo sessions at PDC

Mono

Yeah, Mono gets to be one of the main points of this post, as it should also be very important to Microsoft. The work Miguel and the team is doing gives much more value to .NET and Microsoft, than any other technology they presented in my opinion. Since the Mac and Linux worlds are raising their share, it’s important to let developers target those platforms too. And their doing interesting new stuff too, like the C# compiler service, the C# interpreter and even running .NET apps in the iPhone!

So take a look at his talk, one of the best in the whole PDC.

Of course this wasn’t everything PDC was about, but the stuff that I really care about. And I really liked some of this stuff!

Now I can touch ASP.NET again

So after my first real project in ASP.NET 2.0, I’ve never touched ASP.NET again. It’s simply ugly. And coding for the web in a language like C#, or Java is really a PITA. I just want my logic explained, and it’s one of the reasons for Ruby on Rails success.

But today Microsoft has made a small step that may make me experiment some stuff in their web technology again:

This afternoon we released a refresh of our DLR/IronPython support for ASP.NET, now called “ASP.NET Dynamic Language Support”, on our CodePlex site.

This means I will be able to do MVC web applications in Python (or Ruby). This is their response to the RoR success. Of course I like Django the most and I may even use it in the MS stack. This because the Microsoft teams for the IronRuby and IronPython are working to get Rails and Django working in their platforms, which is a really cool thing coming from the company that we all know well.