Alcides Fonseca

40.197958, -8.408312

The value of OpenSource

We estimate the supply-side value of widely-used OSS is $4.15 billion, but that the demand-side value is much larger at $8.8 trillion. We find that firms would need to spend 3.5 times more on software than they currently do if OSS did not exist. […] Further, 96% of the demand-side value is created by only 5% of OSS developers.

The Value of Open Source Software, Harvard Business School Strategy Unit, via Simon Willison

FFI troubleshoot in Haskell

When installing Haskell’s HLint via:

stack install hlint

I was getting the following error:


base-compat-batteries            > [  2 of 118] Compiling Control.Concurrent.Compat.Repl.Batteries                                          
hashable                         >                                                                                                          
hashable                         > /private/var/folders/3g/7tpj8zwx14qddcr3bt_w8jnr0000gn/T/stack-e594623fe9dbdb84/hashable-1.3.5.0/In file included from /var/folders/3g/7tpj8zwx14qddcr3bt_w8jnr0000gn/T/ghc52550_0/ghc_20.c:4:0: error:
hashable                         >                                                                                                          
hashable                         >                                                                                                          
hashable                         > In file included from /Users/alcides/.stack/programs/aarch64-osx/ghc-9.0.2/lib/ghc-9.0.2/lib/../lib/aarch64-osx-ghc-9.0.2/rts-1.0.2/include/ffi.h:66:0: error:
hashable                         >                                                                                                          
hashable                         >                                                                                                          
hashable                         > /Users/alcides/.stack/programs/aarch64-osx/ghc-9.0.2/lib/ghc-9.0.2/lib/../lib/aarch64-osx-ghc-9.0.2/rts-1.0.2/include/ffitarget.h:6:10: error:
hashable                         >      fatal error: 'ffitarget_arm64.h' file not found                                                     
hashable                         >   |                                                                                                      
hashable                         > 6 | #include "ffitarget_arm64.h"                                                                         
hashable                         >   |          ^                                                                                           
hashable                         > #include "ffitarget_arm64.h"                                                                             
hashable                         >          ^~~~~~~~~~~~~~~~~~~                                                                             
hashable                         > 1 error generated.                                                                                       
hashable                         > `gcc' failed in phase `C Compiler'. (Exit code: 1)   

The problem is that my clang was installed via home-brew. As such, there were a few missing paths. The solution was to run the following to find the location of this header:

find / -iname ffitarget_arm64.h

My output was:

find: /usr/sbin/authserver: Permission denied
find: /Library/Application Support/Apple/Screen Sharing/Keys: Permission denied
find: /Library/Application Support/Apple/ParentalControls/Users: Permission denied
find: /Library/Application Support/Apple/AssetCache/Data: Permission denied
find: /Library/Application Support/Apple/Remote Desktop/Task Server: Permission denied
find: /Library/Application Support/Apple/Remote Desktop/Client: Permission denied
find: /Library/Application Support/ApplePushService: Permission denied
find: /Library/Application Support/com.expressvpn.ExpressVPN/data: Permission denied
/Library/Developer/CommandLineTools/SDKs/MacOSX13.3.sdk/usr/include/ffi/ffitarget_arm64.h
/Library/Developer/CommandLineTools/SDKs/MacOSX14.0.sdk/usr/include/ffi/ffitarget_arm64.h
...

Now you need to pick the first one (second to last line in the previous output) as copy the path up to include (inclusive). Now use it in the following line, or add it to your ~/.zshrc.

C_INCLUDE_PATH=/Library/Developer/CommandLineTools/SDKs/MacOSX13.3.sdk/usr/include/

Thank your family with Backups

It’s thanksgiving time in the US and we’re nearing Xmas time in many other places. I would recommend setting up backups for your loved ones as a wonderful and thoughtful gift.

Last month I had to deal with the loss of a relative’s entire photo library. She copied all the loose photos in ~/Pictures to Apple Photo’s (and therefore to ~/Pictures/Photos.photolibrary/) and, once she was done, she selected everything inside the Pictures folder, and deleted them. And to be sure, she also emptied her trashcan.

Because I hadn’t set up a backup for this particular machine, I had to spend two weekends recovering all deleted photos among many trash lost in the remains of the HDD. For this purpose, I recommend Disk Drill.

As for Backups, I really recommend Backblaze, as it automatically backups all files to their cloud. You have optional encryption (which I don’t use, so I can access the file explorer on their website, to do selective backups). It’s a solution for non-technical folks that happens in the background with no need for user action. They might need your help to do a restore, but that’s very acceptable in my opinion. It has saved me several times for my own laptop.

If you are using this for a more technical use, you should be aware that by default, it does not back up large files like virtual machines or disk images. You should change this in the settings. Having said this, I also keep one time machine in an external disk, and another one in my NAS.

Aventura com as Giras

Apesar de adorar o conceito das Giras, — bicicletas partilhadas, com docas para não ficarem em qualquer sítio, com um passe anual tão baixo (25 euros) que incentiva as pessoas a usarem-nas em vez de transportes públicos como o metro ou autocarros — a implementação não tem sido nada feliz.

É certo que as demoras causadas pela falha de contrato de fornecedor não ajudaram, mas a minha principal queixa é a qualidade do software. Não que as bicicletas estejam sempre perfeitas, mas há forma de reportar e pedir que as arranjem.

Software pouco aperfeiçoado

Ora o primeiro problema (de muito menor importância) é que a aplicação em iOS não integra com o gestor de passwords nativo, pelo que é necessário autenticar-me sempre que abro a aplicação. É irritante, mas não me impede de usar o sistema. E mostra a pouca atenção que dão à experiência de utilização.

Assumem que tudo é perfeito.

Um problema bem mais significativo é o facto de quando se levanta uma bicicleta e se descobre que ela não funciona (desde coisas simples como o motor não funcionar, até não ter travões) e a devolvemos na própria estação ou na seguinte num prazo de 2 minutos, ele considera uma viagem completa e não deixa levantar outra num espaço de 5 minutos. A minha sugestão é não ter esse intervalo se a viagem tiver sido curta, e ter sido dado uma pontuação baixa indicando o problema com a bicicleta. Para ser usada como meio de transporte, não pode ter tantos entraves. E não é algo tão incomum que não deva ser levado a sério. Acontece-me uma vez a cada duas semanas.

Usam dark patterns.

Por falar em reportar problemas: no final de cada viagem é obrigatório pontuar a viagem. E no caso da pontuação ser baixa, é obrigatório escrever algo. Com cinco estrelas já não é. Ora, quando uma pessoa tem pressa vai dar 5 estrelas só para não escrever nada.

Para além de não dever ser obrigatório escrever algo, devia haver uma lista de problemas mais comuns por onde escolher. Torna-se a denúncia de problemas muito mais simples, e evita terem alguém no back-office a processar as mensagens dos vários utilizadores.

A história

Ora certo dia fui de bicicleta para o trabalho. Ao chegar, entreguei a bicicleta na doca e ela trancou a bicicleta. No entanto, na aplicação a viagem continuava a decorrer. Não a pude cancelar na aplicação, porque isso permitiria a qualquer um cancelar viagens sem devolver a bicicleta. Como infelizmente tem sido normal (e confirmado pelo suporte técnico), a doca não comunicou com sucesso com o servidor que a bicicleta tinha sido recebida. E como não foi a primeira vez, fiz o procedimento recomendado: ligar para o apoio técnico e esperar 5 minutos que o assunto ficasse resolvido.

Mas ninguém me atendeu a chamada. Segui para o meu trabalho e fiz o dia completo, esquecendo-me completamente do assunto. Quando quis regressar a casa, abri a aplicação e reparei que tinha tido a viagem a decorrer até ao momento. Reiniciei a aplicação e ele (finalmente) apercebeu-se que a viagem terminou. Mas tinha-me levado a saldo negativo. Tentei ligar para o suporte técnico outra vez, mas também ninguém me atendeu desta vez. Tive de carregar o saldo da aplicação, para poder ter saldo positivo e fazer a minha viagem de regresso.

Em casa escrevi um email muito zangado. Não era a primeira vez que me acontecia e já tinha oferecido os meus serviços para tentar solucionar o problema.
Responderam ao email num PDF em anexo (??).Devolveram-me o custo da viagem, afirmando que era uma excepção e que a culpa era minha. Isto depois do suporte técnico ao telefone em vezes anteriores me confirmar que eram as docas que nem sempre reportavam bem a actividade. Quanto ao dinheiro extra que tive de usar para carregar o saldo desnecessariamente fica do lado dele, desse não abrem a mão.

Conclusão: o software da Gira é mau, não tem sido melhorado, e o suporte técnico atira primeiro a culpa para os utilizadores, e demora imenso a admitir que o problema é deles, e não está disponível a devolver dinheiro introduzido na plataforma. Claramente as plataformas concorrentes privadas não sofrem destes problemas, porque têm concorrência, e pretendem que os utilizadores usufruam ao máximo da plataforma. Em justiça para a Gira, nenhuma plataforma privada faria preços tão baixos. Mas não é motivo para não se esforçarem na qualidade do software.

The First Lecture of My Compilers Class

Every year I have taught compilers, I start the first class by writing a compiler, even before going into the syllabus. Students are aware of what a compiler is, but not necessarily how it works. This introduction gives them this context, and allows students to make better sense of the syllabus and the contents for the rest of the semester.

As this is a very limited compiler (both in scope, and in quality), identifying those challenges also gives a perspective of why is it relevant to spend one or two classes on parsing.

Weekly Links

Academia

Development

Programming Languages

Another Puzzle with z3

On twitter, I’ve found another puzzle that could be trivially solved using z3, and serves as a good introductory example. Here is a very simple solution:

Software Engineering Practices

  • Documentation in the same repo as the code
  • Mechanisms for creating test data
  • Rock solid database migrations
  • Templates for new projects and components
  • Automated code formatting
  • Tested, automated process for new development environments
  • Automated preview environments

(I really like his rejection of the term “best practices” here: I always feel it’s prescriptive and misguiding to announce something as “best”.) — Simon Willison

Understandable and Useful Error Messages for Liquid Types @ WITS2022

Early at the WITS workshop at this year’s POPL I presented why we need to improve the error messages of languages with liquid types, in order to increase adoption. This presentation resulted from our experiences using (LiquidHaskell) and developing (Aeon, LiquidJava and MLVP) languages with liquid types. They are very useful to have guarantees of correctness, but you usually don’t know if it is an error in the usage of functions, or in their specification. We end up proposing some suggestions to improve the state of the art, in which we are working on.

The video:

And the slides:

LiquidHaskell on the M1

Getting a M1 Max MacBook Pro, I had to figure out how to install all the software I had on my old Intel Mac. One of the most challenging was LiquidHaskell. Here are the instructions for others to save some time.

Install z3

LiquidHaskell requires an SMT solver to handle constraint-solving. Z3 is my go to solver.

brew install z3
z3 --version
# Z3 version 4.8.14 - 64 bit

Install LLVM 12

Because of this bug in homebrew’s LLVM 13 that solves other problems, we need LLVM 12.

brew install llvm@12

Install latest version of stack.

The stable version of stack will not install GHC on OSX arm installations. To overcome that, we need to install the latest version.

brew install --HEAD stack

Setup a Demo Repository

I recommend using this scaffold project to get LiquidHaskell running on the command line and VS Code. It might work on other editors, but I have not tested them.

git clone https://github.com/alcides/lh-plugin-demo.git
cd lh-plugin-demo
stack build
stack ghci

This last line should open ghci, doing all the liquid type verifications necessary. You should see two green lines with 1 and 3 constraints checked, respectively.

You can do the exercise of editing src/Demo/Lib.sh and changing x + 1 to x - 1 and running the command again. You should now see an error.

VSCode Configuration

Now that we have error reporting on the command-line, we can move to the more convenient code editor. In my case that means VSCode1.

The first step is to uninstall all Haskell plugins. Just trust me on this, they are all pretty much incompatible with each other, except when they aren’t. Now you need to install Simple GHC (Haskell) Integration. On the CLI:

brew install visual-studio-code
code --install-extension dramforever.vscode-ghc-simple

However, this will not work for a M1-unrelated reason: LiquidHaskell and Haddock are not compatible (unfortunately). So the first step is to disable it.

Go to Settings > Extensions > GHC Simple config > Ghc Simple > Startup Commands and remove :set -haddock.

Cmd+shit+P and select GHC: Restart GHCi Sessions if it doesn’t restart on its own.

And now you should have localized errors in your Haskell files.

1 Textmate’s support for Haskell is pretty minimal. Stack and Cabal support is completely missing.

Enforcement by Software

Suppose, for example, that your driver’s license is on your mobile phone. (…) You no longer need to go to a government office to get your new driver’s license. However, it now makes it possible for a civil servant to decide that you cannot drive your car on Tuesdays. They do not need a new law, they do not need your consent, they can just switch a flag inside a server.

Enforcement by Software by Daniel Lemire

This is something that has been worrying me, especially in the cases where software doesn’t follow the law and leads tens or thousands of users to follow processes illegally, just because there is no other way of doing it. Maybe society should invest more in legal verification of software and processes.

Installing Lean4 on the M1 architecture

I upgraded to a new M1Max MacbookPro (will have to review it next), and I had to install lean4 today. Unfortunately, elan does not work in the M1 architecture as expected, so I had to use rosetta. here’s how I set it up:

1. Install VSCode: brew install --cask visual-studio-code && code --install-extension jroesch.lean
2. Install Rosetta2 /usr/sbin/softwareupdate --install-rosetta --agree-to-license
3. Install elan manually: curl https://raw.githubusercontent.com/leanprover/elan/master/elan-init.sh -sSf | arch -x86_64 sh
4. Install latest lean: elan default leanprover/lean4:nightly
5. Configure the lean path in the lean4 vs code extension to point to this lean: “Lean4: Executable Path” to arch -x86_64 lean

It will work. Now let’s hope that elan adds supports for arm64 binary fetching.

T-shaped People and Academia

When I’m looking for potential students, I always try to identify T-shaped people: those who can go deep into one topic, but also have a broad view of other aspects of Software Engineering. But what Bastien Rieck says is completely true.

In any case, many graduate schools and resume reviewers tend to ignore the breadth of a person’s skills, focusing instead only on the depth.

T-Shaped People and Academia, by Bastien Rieck

T-shaped people not only help in several aspects of the development of a project, but also identify bridges with other areas, and allow your work to be applied to other problems.

ImportError in numpy caused by missing libgfortran

Today I got the following error when executing regular numpy code in my homebrew Python3.9 installation.

ImportError: dlopen(/usr/local/lib/python3.9/site-packages/scipy/linalg/_fblas.cpython-39-darwin.so, 0x0002): Library not loaded: /usr/local/opt/gcc/lib/gcc/10/libgfortran.5.dylib Referenced from: /usr/local/lib/python3.9/site-packages/scipy/linalg/_fblas.cpython-39-darwin.so Reason: tried: '/usr/local/opt/gcc/lib/gcc/10/libgfortran.5.dylib' (no such file), '/usr/local/lib/libgfortran.5.dylib' (no such file), '/usr/lib/libgfortran.5.dylib' (no such file)

The issue was that I had gcc 11 installed (you can check via brew info gcc), but homebrew-installed bumpy expected gcc 10 to be installed.

Reinstalling numpy from bottle or from source didn’t work.

The workaround:

brew install gcc@10
sudo ln -s /usr/local/opt/gcc@10/lib/gcc/10/libgfortran.dylib /usr/local/lib/libgfortran.5.dylib  
sudo ln -s /usr/local/opt/gcc@10/lib/gcc/10/libquadmath.0.dylib /usr/local/lib/libquadmath.0.dylib

Cuda not working on Google Colab

For the 4th year in a row, I am teaching Parallel and Concurrent Programming. One of the main sections of the course is general-purpose GPU programming (GPGPU), where I mostly use CUDA, but I also show them how to the same using OpenCL, bindings in Java and Python, as well as Numba. For their projects, all students use either CUDA or Numba.

Because we want to provide students with access to a GPU, even if outside campus, we show them how to use Google Colab, a Jupiter Notebook environment with optional GPU or TPU, and it has been working fine for 3 straight years.

When preparing this week’s class, I ran the examples on my local machine and it all worked. Fast forward to the lab, and I was demoing the usage of Google Colab and it wouldn’t work. In panic, I gave students temporary access to my machine while I debugged the issue.

The Example

I’m still using the code prepared by NVIDIA’s Nuno Subtil for the GPGPU workshop we organized back in 2011. It is a simplified version of the NVIDIA examples, with a very handy HANDLE_ERRORS macro that checks the return code of each cuda function (cudaMalloc, cudaMemcpy, cudaFree. The example in particular would take two arrays A (containing elements from 0 to 1000) and B (from 1000 to 0) and would return the array C, with the sum of the corresponding positions in both arrays.

The Error

The program would output an array of all zeroes. CUDA functions return 0, showing they are working correctly. I reviewed all the code and it should work. And it worked on my machine ( famous last words ).

Finding the Error

After modifying the arrays manually to check where the error was, I restricted it to the kernel call. And it made sense, it was the only call whose result wasn’t being checked. Knowing this, I’ve added line 47 to understand the error: Device Variable Copying: no kernel image is available for execution on the device

Understanding the Error

Kernel is the name of the main function that executes on the GPU and it is called from the CPU using a kernelName<NTHREADS>(args). The kernel image should be the binary on the GPU that will be execute. If it isn’t there, the kernel does nothing. But the program continues silently. This is a really bad design decision on their part.

Fixing the Error

Because different GPUs support different binary versions (due to architectural differences), I tried to compile to the right architecture. Google Colab gave me a Tesla K80, which worked with SM_37 and compute_37. I’ve added those as flags in the mvvm compiler:

!nvcc -arch=sm_37 -gencode=arch=compute_37,code=sm_37 vector-sum.cu -o vector-sum

And it now worked. Why is this required? Because Tesla K80 is only supported between CUDA 5 and 10. Google Colab is running on CUDA 11 (despite providing unsupported GPUs) and users get silent errors.

Spending 30 minutes juggling debugging this issue while giving students a fallback alternative was very stressful. I mostly blame Nvidia for not having a better exception handling of kernel calls. There’s also a little blame in not supporting old cars in new SDK versions (even if not supporting all features). Google also shares the blame by loading a version of CUDA incompatible with the provided GPU.

Runtime, Run time and Run-time

During my PhD, my advisor and I disagreed about the spelling of runtime/run time. There was no definite answer as different papers used different spelling in different usages. This week I found a rationale that makes sense to me, even if it means that we were both wrong.

There are three variants of the word “run time” in computer science:
run-time — adjective (“the run-time performance”)
run time — noun — a moment in time (“occurs at run time”), or an amount of time (“the run time is 8 hours”)
runtime — noun — a program that runs/supports another program (“the runtime handles memory allocation”); a synonym for “runtime system”

Refinement Types Tutorial

I am (re)writing an implementation for the Refinement Types tutorial by Ranjit Jhala and Niki Vazou. I am hoping that it helps others understand these concepts, and promote this research among your researchers.

I am streaming its development on Twitch, so if you are interested make sure you either have your notifications turned on, or you follow me on Twitter.

Chapters:

Scientific Plots

Here are a few resources for plotting scientific data that is accessible and accurate.

  • YellowBrick — A matplotlib-based library for data-science. It has several templates for evaluating classifiers, exploring data, …

  • missingno — A visualization library for understanding missing data in datasets.
  • LovelyPlots – A theme for beautiful and TeX-compatible matplotlib themes.

Haskell, Cabal and QuickCheck on MacOS (Big Sur) and Ubuntu

The current approach for having a complex Haskell project with dependencies is using stack. While it is suitable for larger projects, we avoid it for our undergrad students, who only need ghc, ghci and small editor support.

We found a subset of VSCode plugins that provide an integrated interpreter, provide type information, autocomplete and even runs (sometimes too strict) hlint on your code automatically. Neither of this plugins requires Stack.

The main issue is that we teach QuickCheck and we need to install it globally. This is far from being straightforward in Windows and MacOS.

In Ubuntu it is quite direct using apt sudo apt-get install hlint haskell-platform libghc-quickcheck2-dev

Here is the solution for the Mac, just make sure to adapt to your own GHC version:


brew cask install cabal-install
ln -s ~/.cabal/store/ghc-8.10.1/package.db ~/.ghc/x86_64-darwin-8.10.1/package.conf.d
cabal install QuickCheck

If you now open ghci, it will import Test.QuickCheck successfully.

For reference, here are the plugins that work successfully without any Stack dependencies:

code --install-extension jcanero.hoogle-vscode
code --install-extension justusadam.language-haskell
code --install-extension hoovercj.haskell-linter
code --install-extension dramforever.vscode-ghc-simple
code --install-extension meowcolm024.has-go