Alcides Fonseca

40.197958, -8.408312

Aventura com as Giras

Apesar de adorar o conceito das Giras, — bicicletas partilhadas, com docas para não ficarem em qualquer sítio, com um passe anual tão baixo (25 euros) que incentiva as pessoas a usarem-nas em vez de transportes públicos como o metro ou autocarros — a implementação não tem sido nada feliz.

É certo que as demoras causadas pela falha de contrato de fornecedor não ajudaram, mas a minha principal queixa é a qualidade do software. Não que as bicicletas estejam sempre perfeitas, mas há forma de reportar e pedir que as arranjem.

Software pouco aperfeiçoado

Ora o primeiro problema (de muito menor importância) é que a aplicação em iOS não integra com o gestor de passwords nativo, pelo que é necessário autenticar-me sempre que abro a aplicação. É irritante, mas não me impede de usar o sistema. E mostra a pouca atenção que dão à experiência de utilização.

Assumem que tudo é perfeito.

Um problema bem mais significativo é o facto de quando se levanta uma bicicleta e se descobre que ela não funciona (desde coisas simples como o motor não funcionar, até não ter travões) e a devolvemos na própria estação ou na seguinte num prazo de 2 minutos, ele considera uma viagem completa e não deixa levantar outra num espaço de 5 minutos. A minha sugestão é não ter esse intervalo se a viagem tiver sido curta, e ter sido dado uma pontuação baixa indicando o problema com a bicicleta. Para ser usada como meio de transporte, não pode ter tantos entraves. E não é algo tão incomum que não deva ser levado a sério. Acontece-me uma vez a cada duas semanas.

Usam dark patterns.

Por falar em reportar problemas: no final de cada viagem é obrigatório pontuar a viagem. E no caso da pontuação ser baixa, é obrigatório escrever algo. Com cinco estrelas já não é. Ora, quando uma pessoa tem pressa vai dar 5 estrelas só para não escrever nada.

Para além de não dever ser obrigatório escrever algo, devia haver uma lista de problemas mais comuns por onde escolher. Torna-se a denúncia de problemas muito mais simples, e evita terem alguém no back-office a processar as mensagens dos vários utilizadores.

A história

Ora certo dia fui de bicicleta para o trabalho. Ao chegar, entreguei a bicicleta na doca e ela trancou a bicicleta. No entanto, na aplicação a viagem continuava a decorrer. Não a pude cancelar na aplicação, porque isso permitiria a qualquer um cancelar viagens sem devolver a bicicleta. Como infelizmente tem sido normal (e confirmado pelo suporte técnico), a doca não comunicou com sucesso com o servidor que a bicicleta tinha sido recebida. E como não foi a primeira vez, fiz o procedimento recomendado: ligar para o apoio técnico e esperar 5 minutos que o assunto ficasse resolvido.

Mas ninguém me atendeu a chamada. Segui para o meu trabalho e fiz o dia completo, esquecendo-me completamente do assunto. Quando quis regressar a casa, abri a aplicação e reparei que tinha tido a viagem a decorrer até ao momento. Reiniciei a aplicação e ele (finalmente) apercebeu-se que a viagem terminou. Mas tinha-me levado a saldo negativo. Tentei ligar para o suporte técnico outra vez, mas também ninguém me atendeu desta vez. Tive de carregar o saldo da aplicação, para poder ter saldo positivo e fazer a minha viagem de regresso.

Em casa escrevi um email muito zangado. Não era a primeira vez que me acontecia e já tinha oferecido os meus serviços para tentar solucionar o problema.
Responderam ao email num PDF em anexo (??).Devolveram-me o custo da viagem, afirmando que era uma excepção e que a culpa era minha. Isto depois do suporte técnico ao telefone em vezes anteriores me confirmar que eram as docas que nem sempre reportavam bem a actividade. Quanto ao dinheiro extra que tive de usar para carregar o saldo desnecessariamente fica do lado dele, desse não abrem a mão.

Conclusão: o software da Gira é mau, não tem sido melhorado, e o suporte técnico atira primeiro a culpa para os utilizadores, e demora imenso a admitir que o problema é deles, e não está disponível a devolver dinheiro introduzido na plataforma. Claramente as plataformas concorrentes privadas não sofrem destes problemas, porque têm concorrência, e pretendem que os utilizadores usufruam ao máximo da plataforma. Em justiça para a Gira, nenhuma plataforma privada faria preços tão baixos. Mas não é motivo para não se esforçarem na qualidade do software.

The First Lecture of My Compilers Class

Every year I have taught compilers, I start the first class by writing a compiler, even before going into the syllabus. Students are aware of what a compiler is, but not necessarily how it works. This introduction gives them this context, and allows students to make better sense of the syllabus and the contents for the rest of the semester.

As this is a very limited compiler (both in scope, and in quality), identifying those challenges also gives a perspective of why is it relevant to spend one or two classes on parsing.

Weekly Links

Academia

Development

Programming Languages

Software Engineering Practices

  • Documentation in the same repo as the code
  • Mechanisms for creating test data
  • Rock solid database migrations
  • Templates for new projects and components
  • Automated code formatting
  • Tested, automated process for new development environments
  • Automated preview environments

(I really like his rejection of the term “best practices” here: I always feel it’s prescriptive and misguiding to announce something as “best”.) — Simon Willison

Understandable and Useful Error Messages for Liquid Types @ WITS2022

Early at the WITS workshop at this year’s POPL I presented why we need to improve the error messages of languages with liquid types, in order to increase adoption. This presentation resulted from our experiences using (LiquidHaskell) and developing (Aeon, LiquidJava and MLVP) languages with liquid types. They are very useful to have guarantees of correctness, but you usually don’t know if it is an error in the usage of functions, or in their specification. We end up proposing some suggestions to improve the state of the art, in which we are working on.

The video:

And the slides:

LiquidHaskell on the M1

Getting a M1 Max MacBook Pro, I had to figure out how to install all the software I had on my old Intel Mac. One of the most challenging was LiquidHaskell. Here are the instructions for others to save some time.

Install z3

LiquidHaskell requires an SMT solver to handle constraint-solving. Z3 is my go to solver.

brew install z3
z3 --version
# Z3 version 4.8.14 - 64 bit

Install LLVM 12

Because of this bug in homebrew’s LLVM 13 that solves other problems, we need LLVM 12.

brew install llvm@12

Install latest version of stack.

The stable version of stack will not install GHC on OSX arm installations. To overcome that, we need to install the latest version.

brew install --HEAD stack

Setup a Demo Repository

I recommend using this scaffold project to get LiquidHaskell running on the command line and VS Code. It might work on other editors, but I have not tested them.

git clone https://github.com/alcides/lh-plugin-demo.git
cd lh-plugin-demo
stack build
stack ghci

This last line should open ghci, doing all the liquid type verifications necessary. You should see two green lines with 1 and 3 constraints checked, respectively.

You can do the exercise of editing src/Demo/Lib.sh and changing x + 1 to x - 1 and running the command again. You should now see an error.

VSCode Configuration

Now that we have error reporting on the command-line, we can move to the more convenient code editor. In my case that means VSCode1.

The first step is to uninstall all Haskell plugins. Just trust me on this, they are all pretty much incompatible with each other, except when they aren’t. Now you need to install Simple GHC (Haskell) Integration. On the CLI:

brew install visual-studio-code
code --install-extension dramforever.vscode-ghc-simple

However, this will not work for a M1-unrelated reason: LiquidHaskell and Haddock are not compatible (unfortunately). So the first step is to disable it.

Go to Settings > Extensions > GHC Simple config > Ghc Simple > Startup Commands and remove :set -haddock.

Cmd+shit+P and select GHC: Restart GHCi Sessions if it doesn’t restart on its own.

And now you should have localized errors in your Haskell files.

1 Textmate’s support for Haskell is pretty minimal. Stack and Cabal support is completely missing.

Enforcement by Software

Suppose, for example, that your driver’s license is on your mobile phone. (…) You no longer need to go to a government office to get your new driver’s license. However, it now makes it possible for a civil servant to decide that you cannot drive your car on Tuesdays. They do not need a new law, they do not need your consent, they can just switch a flag inside a server.

Enforcement by Software by Daniel Lemire

This is something that has been worrying me, especially in the cases where software doesn’t follow the law and leads tens or thousands of users to follow processes illegally, just because there is no other way of doing it. Maybe society should invest more in legal verification of software and processes.

Installing Lean4 on the M1 architecture

I upgraded to a new M1Max MacbookPro (will have to review it next), and I had to install lean4 today. Unfortunately, elan does not work in the M1 architecture as expected, so I had to use rosetta. here’s how I set it up:

1. Install VSCode: brew install --cask visual-studio-code && code --install-extension jroesch.lean
2. Install Rosetta2 /usr/sbin/softwareupdate --install-rosetta --agree-to-license
3. Install elan manually: curl https://raw.githubusercontent.com/leanprover/elan/master/elan-init.sh -sSf | arch -x86_64 sh
4. Install latest lean: elan default leanprover/lean4:nightly
5. Configure the lean path in the lean4 vs code extension to point to this lean: “Lean4: Executable Path” to arch -x86_64 lean

It will work. Now let’s hope that elan adds supports for arm64 binary fetching.

T-shaped People and Academia

When I’m looking for potential students, I always try to identify T-shaped people: those who can go deep into one topic, but also have a broad view of other aspects of Software Engineering. But what Bastien Rieck says is completely true.

In any case, many graduate schools and resume reviewers tend to ignore the breadth of a person’s skills, focusing instead only on the depth.

T-Shaped People and Academia, by Bastien Rieck

T-shaped people not only help in several aspects of the development of a project, but also identify bridges with other areas, and allow your work to be applied to other problems.

ImportError in numpy caused by missing libgfortran

Today I got the following error when executing regular numpy code in my homebrew Python3.9 installation.

ImportError: dlopen(/usr/local/lib/python3.9/site-packages/scipy/linalg/_fblas.cpython-39-darwin.so, 0x0002): Library not loaded: /usr/local/opt/gcc/lib/gcc/10/libgfortran.5.dylib Referenced from: /usr/local/lib/python3.9/site-packages/scipy/linalg/_fblas.cpython-39-darwin.so Reason: tried: '/usr/local/opt/gcc/lib/gcc/10/libgfortran.5.dylib' (no such file), '/usr/local/lib/libgfortran.5.dylib' (no such file), '/usr/lib/libgfortran.5.dylib' (no such file)

The issue was that I had gcc 11 installed (you can check via brew info gcc), but homebrew-installed bumpy expected gcc 10 to be installed.

Reinstalling numpy from bottle or from source didn’t work.

The workaround:

brew install gcc@10
sudo ln -s /usr/local/opt/gcc@10/lib/gcc/10/libgfortran.dylib /usr/local/lib/libgfortran.5.dylib  
sudo ln -s /usr/local/opt/gcc@10/lib/gcc/10/libquadmath.0.dylib /usr/local/lib/libquadmath.0.dylib

Cuda not working on Google Colab

For the 4th year in a row, I am teaching Parallel and Concurrent Programming. One of the main sections of the course is general-purpose GPU programming (GPGPU), where I mostly use CUDA, but I also show them how to the same using OpenCL, bindings in Java and Python, as well as Numba. For their projects, all students use either CUDA or Numba.

Because we want to provide students with access to a GPU, even if outside campus, we show them how to use Google Colab, a Jupiter Notebook environment with optional GPU or TPU, and it has been working fine for 3 straight years.

When preparing this week’s class, I ran the examples on my local machine and it all worked. Fast forward to the lab, and I was demoing the usage of Google Colab and it wouldn’t work. In panic, I gave students temporary access to my machine while I debugged the issue.

The Example

I’m still using the code prepared by NVIDIA’s Nuno Subtil for the GPGPU workshop we organized back in 2011. It is a simplified version of the NVIDIA examples, with a very handy HANDLE_ERRORS macro that checks the return code of each cuda function (cudaMalloc, cudaMemcpy, cudaFree. The example in particular would take two arrays A (containing elements from 0 to 1000) and B (from 1000 to 0) and would return the array C, with the sum of the corresponding positions in both arrays.

The Error

The program would output an array of all zeroes. CUDA functions return 0, showing they are working correctly. I reviewed all the code and it should work. And it worked on my machine ( famous last words ).

Finding the Error

After modifying the arrays manually to check where the error was, I restricted it to the kernel call. And it made sense, it was the only call whose result wasn’t being checked. Knowing this, I’ve added line 47 to understand the error: Device Variable Copying: no kernel image is available for execution on the device

Understanding the Error

Kernel is the name of the main function that executes on the GPU and it is called from the CPU using a kernelName<NTHREADS>(args). The kernel image should be the binary on the GPU that will be execute. If it isn’t there, the kernel does nothing. But the program continues silently. This is a really bad design decision on their part.

Fixing the Error

Because different GPUs support different binary versions (due to architectural differences), I tried to compile to the right architecture. Google Colab gave me a Tesla K80, which worked with SM_37 and compute_37. I’ve added those as flags in the mvvm compiler:

!nvcc -arch=sm_37 -gencode=arch=compute_37,code=sm_37 vector-sum.cu -o vector-sum

And it now worked. Why is this required? Because Tesla K80 is only supported between CUDA 5 and 10. Google Colab is running on CUDA 11 (despite providing unsupported GPUs) and users get silent errors.

Spending 30 minutes juggling debugging this issue while giving students a fallback alternative was very stressful. I mostly blame Nvidia for not having a better exception handling of kernel calls. There’s also a little blame in not supporting old cars in new SDK versions (even if not supporting all features). Google also shares the blame by loading a version of CUDA incompatible with the provided GPU.

Runtime, Run time and Run-time

During my PhD, my advisor and I disagreed about the spelling of runtime/run time. There was no definite answer as different papers used different spelling in different usages. This week I found a rationale that makes sense to me, even if it means that we were both wrong.

There are three variants of the word “run time” in computer science:
run-time — adjective (“the run-time performance”)
run time — noun — a moment in time (“occurs at run time”), or an amount of time (“the run time is 8 hours”)
runtime — noun — a program that runs/supports another program (“the runtime handles memory allocation”); a synonym for “runtime system”

Refinement Types Tutorial

I am (re)writing an implementation for the Refinement Types tutorial by Ranjit Jhala and Niki Vazou. I am hoping that it helps others understand these concepts, and promote this research among your researchers.

I am streaming its development on Twitch, so if you are interested make sure you either have your notifications turned on, or you follow me on Twitter.

Chapters:

Scientific Plots

Here are a few resources for plotting scientific data that is accessible and accurate.

  • YellowBrick — A matplotlib-based library for data-science. It has several templates for evaluating classifiers, exploring data, …

  • missingno — A visualization library for understanding missing data in datasets.
  • LovelyPlots – A theme for beautiful and TeX-compatible matplotlib themes.

Haskell, Cabal and QuickCheck on MacOS (Big Sur) and Ubuntu

The current approach for having a complex Haskell project with dependencies is using stack. While it is suitable for larger projects, we avoid it for our undergrad students, who only need ghc, ghci and small editor support.

We found a subset of VSCode plugins that provide an integrated interpreter, provide type information, autocomplete and even runs (sometimes too strict) hlint on your code automatically. Neither of this plugins requires Stack.

The main issue is that we teach QuickCheck and we need to install it globally. This is far from being straightforward in Windows and MacOS.

In Ubuntu it is quite direct using apt sudo apt-get install hlint haskell-platform libghc-quickcheck2-dev

Here is the solution for the Mac, just make sure to adapt to your own GHC version:


brew cask install cabal-install
ln -s ~/.cabal/store/ghc-8.10.1/package.db ~/.ghc/x86_64-darwin-8.10.1/package.conf.d
cabal install QuickCheck

If you now open ghci, it will import Test.QuickCheck successfully.

For reference, here are the plugins that work successfully without any Stack dependencies:

code --install-extension jcanero.hoogle-vscode
code --install-extension justusadam.language-haskell
code --install-extension hoovercj.haskell-linter
code --install-extension dramforever.vscode-ghc-simple
code --install-extension meowcolm024.has-go

How to write a CV for students or recent graduates

Over the last couple of years, I have spent a significant amount of time browsing and scoring resumés (Vitae is the common name in Portugal) for different positions, both in academia and industry. The majority of those had been poorly designed. In this post, I try to help you design the perfect resumé, and avoid common pitfalls. While this is designed towards Software Engineering, you might find this relevant for other areas.

The Header

You should have your basic information on the top of your CV: Name, email address, location and optionally phone number. A few tips:

  • If your student email is something like sch34134123@…, please use a personal email address instead.
  • Never use your current employer’s email. Use a personal email address.
  • Do not use your full address. Your street number is useless for this purpose and you don’t want your personal information on the internet. Just list your current city/county.
  • I would suggest having the Country specified, as it helps companies in understanding if they can’t hire you (due to visa and immigration issues).

Things you should omit from the header:

Things you may or may not include in the header:

  • A plain (meaning non-party) picture. I would suggest towards including if there is an expectation that the hirer would know you (e.g. if you are applying for a research grant with a professor).

First things first.

The next two sections are Education and Experience. If you do not have real-world experience or if you are applying to an academic position, you should have Education first. Otherwise, I would have Experience first.

Experience

Experience should come in reverse chronological order. Have your last (or current) job first. This is usually what most recruiters will focus on. Besides the start and end dates, title and company, you should have a small paragraph explaining what were the main activities and responsibilities of your position. If you were a software developer, you should explain what kind of software you worked on, and what was your contribution. Some people also like to include technologies they have worked on, which works well with HR people.

If you had odd-jobs outside your field, please leave them off the first page and in “Other Experience”. Not that you should feel ashamed of them, but let recruiters focus on what’s important for the role they are pursuing.

Education

Stick with reverse chronological order. If you are currently taking (or just signed up for) a degree, please include it, as well as the expected graduation year. If you have other degrees like Biology or Music Theory, please leave them off the front-page (and move them to Other Education) except if that might be relevant for the job application! Examples would be a company that works on software for DNA processing, or on software for Audio Synthesis.

For each degree, you should list your average grade, as well as highlight the top 5 or top 10 courses in which you had the best grades, or which are relevant for the job application.

Projects

This section is specific for Computer Science and it is only relevant if you do not have a lot of experience. In this case, you should write one paragraph about the projects (even if it’s coursework) that you did, what technologies you used and what you learned. 3 projects are more than enough, and if they were developed outside course work (opensource or in a research environment), it’s more valuable. Include a link for the code repository (Github or similar), making sure you already have the final grade for that course.

Other Activities and Awards.

You can have some other activities listed in your CV, but make sure they are relevant for the job if they are in the first page. As a rule-of-thumb, you should list those that reveal organizational or leadership. Being a casual chess player is not a relevant activity (It’s actually a hobby, and you should leave it off the Resumé) but being the President of your local Chess chapter can be relevant.

Final Remarks

Show your CV to more senior colleagues and even professors and ask for feedback. You want your CV to shine compared to those Europass form-generated CVs.

As an example, you can look at my CV in 2011 (I also had a cooler web version).

More resources:

Assinar como professor de uma escola

Muito se tem escrito sobre os catedráticos da NovaSBE não quererem que se assine artigos de opinião com o nome da escola.

Ora eu cá acho que falta trazer o argumento mais importante à discussão: para se ler um artigo de opinião é necessário perceber o background do autor. Vou ler um artigo de forma diferente se souber que é uma parte interessada do tópico. Não só o autor deve identificar o(s) cargo(s) que exerce e onde, como também mencionar a relação directa com os intervenientes da peça que está a escrever (ex: a empresa XPTO é mecenas da universidade a que pertenço, embora não tenha relação directa).

Como exemplos de autores que levam esta transparência a sério, têm James Governor e Tim Bray. Os seus artigos identificam claramente onde existem conflitos de interesse, muitas vezes especificando a relação (“empresa x é cliente, mas não trabalhamos com esta divisão da empresa” ou “trabalhámos em tempos com a empresa X, mas não temos qualquer relação actualmente”).

Claro que o bottom line desta discussão é se as universidades públicas deveriam estar assim tão dependentes financeiramente de privados (que é a origem deste assunto ser sequer debatido). Claro que não, mas para resolver isso é necessário que o governo aumente o orçamento destas instituições.

Sobre este assunto, há uma carta assinada por 50 cientistas e 50 empresários (e a crescer) a pedir que se faça um investimento contínuo e de base para a ciência em Portugal.

Lean Tutorial for Mere Mortals

Today I gave a tutorial on the Lean programming language and its use for theorem proving. This presentation has an interesting background and I am interested in improving it.

I have known about interactive theorem provers for a while, and during the past couple of years I have taught a course where we ask students to prove properties by hand. Since then, I have been trying tutorials, watching talks and I couldn’t get into any of the different languages: Agda, Idris and Coq.

The main issue is that people who give 50-minute talks about it already know everything about it and can’t really see the issue from the eyes of an ignorant. In particular, ignorants regarding mathematics. Even watching and reading Philip Wadler’s Propositions as Types excellent talk and articles.

Since i’m a frequent z3 user, I’ve heard about the Lean theorem prover (and language) but I really didn’t pay much attention, until I had some free time (actually a deadline for a boring task) and tried the Natural Number Game (NNG) and it blew my mind.

NNG presents theorem proving as a game, in which by completing a level, you learn new tools (lemmas and tactics) that you can use in the next levels. If feels like a puzzle!

Sokoban is a great example that can be compared with this game. If you would try all possible combinations of movements, you would solve any level in an unlimited time. But by looking at the screen, you use your pattern recognition skills to try alternatives that are more probable to solve the level, using a bit of backtracking if you make a wrong decision. Sudoku is a similar game.

NNG takes the same approach, but the Goal window of lean (that shows context and goals) as the game viewport. You see the current scenario (in local variables) and the direction where you want to go (on the goals). You can then make movements (using tactics) that hopefully move you towards your goal. If it doesn’t, you can go back and delete some code.

What NNG lacks is a small explanation of the basic syntax and tactics. Luckily, the Lean window is interactive and you can move your cursor around and learn by moving around the proof. In a sense, it’s similar to the Fog of War in RTS games.

After finishing NNG, I tried to write some useful proofs for work, but I ended up spending 90% of the time reading the lean source code to understand what is available to me. Only after a while I learnt about mathlib and library_search. But even having a productive lemma searcher, I need to understand what tactics I can use to manipulate my context in order to advance towards something I have in hand. This is still a lot of effort to me, and the official documentation (both lean’s and mathlib) requires much more examples and explanations for newbies.

With the goal of trying to get more people to use Lean (and help me!), I gave a tutorial for a mixed audience ranging from undergraduates to full professors. I felt I was the right person to give this tutorial because I know almost nothing about theorem proving and Lean in particular. I showed just the basic steps and left some notes for attendees to make their own path afterwards.

I have made the materials available, including a written version of the tutorial, including exercises and solutions. Feel free to follow them and provide me feedback.

Lean supports several idioms and it is one of the biggest barriers to entering this world. Python’s motto of “one obvious way to write it” that doesn’t even apply to Python would help novices here. Reading the standard library is quite hard because it is written in a “difficult” style that makes use of monads and utf-8 characters. It makes it easier for mathematicians that understand what they are doing, but makes it very hard for non-mathematicians looking for a tactic or lemma that is missing in their proof.

I present Lean as being three different languages: the one for executing code, the one for logic and proofs, and the tactic language (meta). I understand they all boil down to being the same language, but for the user, they are used in different contexts.

I know that you can define a lemma or theorem using def, but that makes it hard for novices to understand what code will execute and what code is used just for proofs. I present proofs in begin-end blocks, showing a procedimental style that mimics the one we use in paper. I understand that they all can be written using lambdas and monads, but that is code quite hard to understand.

This style was the one that clicked for me and I wrote the tutorial to help hackers that can learn by example and have no strong mathematical foundations other can proofs by induction and a bit of Haskell understanding.