Alcides Fonseca

40.197958, -8.408312

ChatGPT in Papers

Google Scholar for certainly, here is turns up a huge number of academic papers that include parts that were evidently written by ChatGPT—sections that start with “Certainly, here is a concise summary of the provided sections:” are a dead giveaway.

Simon Willison

Peer review isn’t built to handle the flood of AI content, especially as not all of it will be obvious, and not all will be malicious (lots of scholars pay editors to help make their writing better, now they will use chat).

Ethan Mollick

Misha Teplitskiy

The AI boom will soon crash

Put in the simplest way: Things have been too good for too long in InvestorWorld: low interest, high profits, the unending rocket rise of the Big-Tech sector, now with AI afterburners. Wile E. Coyote hasn’t actually run off the edge of the cliff yet, but there are just way more ways for things to go wrong than right in the immediate future.

Money Bubble by Tim Bray

Tim (correctly) points out that when investors throw money at things that are not well understood (.com, web2.0, blockchain, AI), it will eventually disappoint and crash the markets. Enjoy it while you can.

Mamba: The Easy Way

Today, basically any language model you can name is a Transformer model. OpenAI’s ChatGPT, Google’s Gemini, and GitHub’s Copilot are all powered by Transformers, to name a few. However, Transformers suffer from a fundamental flaw: they are powered by Attention, which scales quadratically with sequence length. Simply put, for quick exchanges (asking ChatGPT to tell a joke), this is fine. But for queries that require lots of words (asking ChatGPT to summarize a 100-page document), Transformers can become prohibitively slow. […] Mamba appears to outperform similarly-sized Transformers while scaling linearly with sequence length.

Mamba: The Easy Way, by Jack Cook

A wonderful explanation of the architectural differences in mamba and how it is much faster than existing transformer implementations. May require some CNN/RNN background to fully understand.

Power Metal Data Analysis

Bands from Spain, Germany and Finland have an average of more than 1600 words vocabulary; in comparison native countries like UK, US and Scotland have an average of 925, 1383 and 1501 words respectively, The most metal words are deliverance, defender, honour, forevermore, realm and the least are shit, baby, fuck, girl, verse. The most negative song is Condemned To Hell by Gamma Ray and the most positive There’s Something In The Skies by Dark Moor.

Power Metal: is it really about dragons? by Matt D.

I really wish there was code available, because I think the 58 bands provide a quite limited dataset, and I’m curious about my own Power Metal collection.

Bloom Filters Explained

While this looks almost identical to a Set, there are some key differences. Bloom filters are what’s called a probabalistic data structure. Where a Set can give you a concrete “yes” or “no” answer when you call contains, a bloom filter can’t. Bloom filters can give definite “no“s, but they can’t be certain about “yes.”

Bloom Filters by Sam Rose

Setting up SLURM for Single-Node usage in Ubuntu 22.04

SLURM is one of the most popular schedulers for clusters and High-Performance Computing (HPC). It takes care of two tasks. Firstly, it prevents everyone from starting processes on the same machine in a way that none of the processes can run successfully (due to not enough RAM, Disk or CPU time). Secondly, it allows to submit a set of programs to multiple computers automatically.

Typically, SLURM is used in a single, weaker computer (called the login node). Users submit jobs (a single program that can be executed many times, in parallel) and these jobs are scheduled in more power machines, which the user has no access to (for consistency sake).

These instructions are for the case where you want SLURM controlling a single computer (node). This is useful when you do not have a cluster, but a single powerful machine. Many of the instructions are taken from How to quickly set up Slurm on Ubuntu 20.04 for single node workload scheduling.

Install SLURM

sudo apt update -y
sudo apt install slurmd slurmctld -y
sudo mkdir /etc/slurm-llnl/
sudo chmod 777 /etc/slurm-llnl
sudo mkdir /var/lib/slurm-llnl/
sudo mkdir /var/log/slurm-llnl/
sudo chmod 777 /var/lib/slurm-llnl/
sudo chmod 777 /var/log/slurm-llnl/

And update the permissions to your liking.
Then we need to create two files: /etc/slurm-llnl/slurm.conf and /etc/slurm/slurm.conf. They should be the same, but they are in two different locations because of the multimode support (not in use in our scenario). As such, I end up creating a soft link between the two:

sudo ln -s /etc/slurm-llnl/slurm.conf /etc/slurm/slurm.conf

Now we edit the contents of /etc/slurm/slurm.conf and of /etc/slurm/gres.conf to the following:

Note that this configuration sets up two Nvidia A30 GPUs. If you have no Nvidia GPUs, then you can delete gres.conf and remove Gres=gpu:2,mps:200 from slurm.conf.

Now you can start the slurm processes (one to manage the execution, the other to manage the queues):

sudo service slurmctld restart && sudo service slurmd restart

To troubleshoot, you should check the following files: /var/log/slurm-llnl/slurmd.log and /var/log/slurm-llnl/slurmctld.log.

The European AI act, and how to ensure properties of AI?

During my current visit to UCL’s SOLAR group, I attended this week’s SSE Seminar on the challenges posed by the AI Act, presented by Paolo Falcarin.

The European Union Artificial Intelligence act is EU’s first attempt of regulating AI products and services. It defines different requirements based on the risk level of the application (ranges between high, medium and low).

In particular, high-risk usage (healthcare, toys, security, …) have stricter requirements. To begin with, they need to be registered in an European database, and frequently updated as the implementation or requirements change. Furthermore, the service or product should be documented, should be traceable, transparent, secure and overridable by humans. Despite these requirements, there is no clear definition, or path forward on how these properties can be ensured, especially when applications are closed-source and frequently trained and updated. One of the challenges we addressed in CAMELOT was how to build interprable Machine Learning models. We explored the use of Genetic Programming and Domain-Specific Languages to create inherently interpretable models. My research team is continuing exploring that possibility.

General-Purpose models have a special categorization within the AI act: Providers of general-purpose models (think OpenAI or Google) must provide a good understanding of the capabilities and limitations, comply with European copyright law, and provide a sufficiently detailed summary about the content used for training of the general-purpose model, following a given template.

Despite all the effort in understanding the potential of deep neural networks, and generative models in particular, it is not clear year what exactly are the capabilities or limitations of what they can produce. Without a clear standard of what is expected, organizations might be able to completely ignore this requirement.

As an example, take the Gandalf AI game, in which you can (easily) trick the LLM into telling you the password of the next level, even though it was instructed not to. Even with the aditional blocks introduced by each level, it is still easy to win the game. This is the state of the art in protecting LLMs from producing a known output. On a larger scale, an earlier version of Microsoft’s AI-powered Bing also generated output that went against the policies it was purposely trained against.

As for the copyright compliance, this goes in direct conflict with what OpenAI is defending. In fact, using copyrighted materials for free in the context of learning is allowed by European law. As such, it is not clear what this article entails in practice. My guess is that this is going to require a reform on the copyright law, possible to distinguish human versus automated learning. Otherwise, models this efficient may never (legally) exist again. This is something the law and computer science communities should debate before politicians take the initiative.

Finally, general-purpose AI with systemic risks (probably all of Large Language Models) have stricter restrictions: they need to evaluate models based on standardised protocols and tools, documenting adversarial testing of the model. While there are good practices for evaluating models, I do not believe the community will agree on an universal metric for general-purpose AI, and different metrics will arise.

Overall, I think it is positive that the EU is trying to regulate the use of AI. Unfortunately, I think it is a lost battle, as the technology is very new and evolves at a very high pace (something that beaurocracy might slow down). I defend that the EU should invest more on the evaluation and monitoring of AI, maybe more than on its development. After all, we cannot compete with the likes of NVIDIA, OpenAI/Microsoft, Google, Apple or Amazon, since they aquire all the relevant hardware before Europe does. Not even Intel or TSMC investing in Europe will allow us to beat US companies (or adjacent universities) in AI research. However, we can beat them in studying the societal impact of AI.

Windows 11 manages power depending on clean energy availability

Starting with this build, we are introducing the Power Grid Forecast API. This API empowers app developers to optimize app behavior, minimizing environmental impact by shifting background tasks to times when more renewable energy is available in the local electrical grid.
Announcing Windows 11 Insider Preview Build 26052 (via Terence Eden)

Guilherme, Paulo and I submitted a proposal of a similar service (but for server workloads, where it makes more sense) back in 2020 for an EDP competition. It was not fancy enough as it was mostly transparent (the API was a job queue).

Our team is working on making energy usage first-class in programming languages, so developers have a better understanding of their impact when making design decisions.

A Golden Era of Blogging

“It reminds me a lot of how blogging changed around 2005-2009, when ad money came pouring in, and while it was great for bloggers that previously were just publishing for the heck of it (myself included), eventually the money tainted the process as many people rushed to improve their bottom line, often at the expense of whole reason they created their sites.” — Today’s YouTubers are repeating the mistakes of yesterday’s bloggers by Matt Haughey

Love, passion, and curiosity — more than money — fuel the majority of posts that show up in my RSS feed every day and I love it.
Forget the days of Google Reader, now is a golden era of blogging.

A Golden Era of Blogging by Jim Nielsen

I have always only subscribed to passion-driven blogs, and not for profit endeavors (RedMonk maybe being the exception). But you do see the trend on Youtube as the quality decreases as YouTubers become need-to-pay-my-employees sized.

Portable EPUBs

A simple answer is to improve the PDF format. After all, we already have billions of PDFs — why reinvent the wheel?

— Will Crichton, in Portable EPUBs

I’ve met Will last year during SPLASH and besides having awesome game host presentation skills, he is also very passionate about this topic. LaTeX was made for a world where paper is king. But I don’t read papers in paper anymore, I read them on my laptop, frequently on external screens. Sometimes even on my phone. And let me tell you that most of the time I have to pan around to read a single line. We desperately need responsive layouts in most written form. eBooks got it right (but not all books were ported properly, and some will never be, and that’s okay).

[1]

I’ve learned a lot from his post, mainly about the advanced PDF capabilities that open-source software usually doesn’t support. You wouldn’t even need to extend the PDF format.

He proposes that the best practical solution is to use self-contained ePUB written in a safe subset of HTML, CSS and Javascript. His notion of safe is left too much for interpretation to my liking, but the overall idea is a good one.

And while ACM is looking into improving the status of accessibility in PDF papers and whitelisting packages that support HTML exporting, antagonizing computer scientists have relied on advanced macros for decades, ArXiV did without asking anyone’s permission.

I’m still not sure that an HTML-based format is the solution. I don’t think we have the proper authoring tools. Yes, we have TinyMCE and friends, but that has limited support for templating. Heck, even Microsoft FrontPage would give you better control over the layout, at the cost of unreadable source code. But designers want Adobe Indesign and QuarkXPress so they can have some control about pagination and whitespace. Maybe we need a new generation of those tools that also targets responsive HTML views?

But what doesn’t convince me the most is HTML, CSS and Javascript evolutions. Those are languages that have and will continue to evolve at a faster pace than PDF or Postscript. I argue for the tradeoff of having a very basic layout and content language with Active-X plugins that authors can use, at the cost of being lost in time, just like those awesome little Flash games that no-one can play anymore.

1 Curiously, I couldn’t hot link this image from his own post, probably due to the way the ePUB is being dynamically uncompressed.

The risks of open sourcing (imaginary) government LLMs

Let’s take a theoretical example. Suppose the Government trains an AI to assess appeals to, say, benefits sanctions. An AI is fed the text of all the written appeals and told which ones are successful and which ones aren’t. It can now read a new appeal and decide whether it is successful of not. Now let’s open source it.

Terrence Eden

Terrence explores the repercussions of open sourcing the training data, the training code and the trained weights of government data. I suggest reading the whole article, especially given that many Portuguese administrative organizations offer ChatGPT-based help.

How to generate random numbers in your head

Choose a 2-digit number, say 23, your “seed”.
Form a new 2-digit number: the 10’s digit plus 6 times the units digit.
The example sequence is 23 –> 20 –> 02 –> 12 –> 13 –> 19 –> 55 –> 35 –> …
and its period is the order of the multiplier, 6, in the group of residues relatively prime to the modulus, 10. (59 in this case).
The “random digits” are the units digits of the 2-digit numbers, ie, 3,0,2,2,3,9,5,… the sequence mod 10.

George Marsaglia via Hillel Wayne

FFI troubleshoot in Haskell

When installing Haskell’s HLint via:

stack install hlint

I was getting the following error:


base-compat-batteries            > [  2 of 118] Compiling Control.Concurrent.Compat.Repl.Batteries                                          
hashable                         >                                                                                                          
hashable                         > /private/var/folders/3g/7tpj8zwx14qddcr3bt_w8jnr0000gn/T/stack-e594623fe9dbdb84/hashable-1.3.5.0/In file included from /var/folders/3g/7tpj8zwx14qddcr3bt_w8jnr0000gn/T/ghc52550_0/ghc_20.c:4:0: error:
hashable                         >                                                                                                          
hashable                         >                                                                                                          
hashable                         > In file included from /Users/alcides/.stack/programs/aarch64-osx/ghc-9.0.2/lib/ghc-9.0.2/lib/../lib/aarch64-osx-ghc-9.0.2/rts-1.0.2/include/ffi.h:66:0: error:
hashable                         >                                                                                                          
hashable                         >                                                                                                          
hashable                         > /Users/alcides/.stack/programs/aarch64-osx/ghc-9.0.2/lib/ghc-9.0.2/lib/../lib/aarch64-osx-ghc-9.0.2/rts-1.0.2/include/ffitarget.h:6:10: error:
hashable                         >      fatal error: 'ffitarget_arm64.h' file not found                                                     
hashable                         >   |                                                                                                      
hashable                         > 6 | #include "ffitarget_arm64.h"                                                                         
hashable                         >   |          ^                                                                                           
hashable                         > #include "ffitarget_arm64.h"                                                                             
hashable                         >          ^~~~~~~~~~~~~~~~~~~                                                                             
hashable                         > 1 error generated.                                                                                       
hashable                         > `gcc' failed in phase `C Compiler'. (Exit code: 1)   

The problem is that my clang was installed via home-brew. As such, there were a few missing paths. The solution was to run the following to find the location of this header:

find / -iname ffitarget_arm64.h

My output was:

find: /usr/sbin/authserver: Permission denied
find: /Library/Application Support/Apple/Screen Sharing/Keys: Permission denied
find: /Library/Application Support/Apple/ParentalControls/Users: Permission denied
find: /Library/Application Support/Apple/AssetCache/Data: Permission denied
find: /Library/Application Support/Apple/Remote Desktop/Task Server: Permission denied
find: /Library/Application Support/Apple/Remote Desktop/Client: Permission denied
find: /Library/Application Support/ApplePushService: Permission denied
find: /Library/Application Support/com.expressvpn.ExpressVPN/data: Permission denied
/Library/Developer/CommandLineTools/SDKs/MacOSX13.3.sdk/usr/include/ffi/ffitarget_arm64.h
/Library/Developer/CommandLineTools/SDKs/MacOSX14.0.sdk/usr/include/ffi/ffitarget_arm64.h
...

Now you need to pick the first one (second to last line in the previous output) as copy the path up to include (inclusive). Now use it in the following line, or add it to your ~/.zshrc.

C_INCLUDE_PATH=/Library/Developer/CommandLineTools/SDKs/MacOSX13.3.sdk/usr/include/

Thank your family with Backups

It’s thanksgiving time in the US and we’re nearing Xmas time in many other places. I would recommend setting up backups for your loved ones as a wonderful and thoughtful gift.

Last month I had to deal with the loss of a relative’s entire photo library. She copied all the loose photos in ~/Pictures to Apple Photo’s (and therefore to ~/Pictures/Photos.photolibrary/) and, once she was done, she selected everything inside the Pictures folder, and deleted them. And to be sure, she also emptied her trashcan.

Because I hadn’t set up a backup for this particular machine, I had to spend two weekends recovering all deleted photos among many trash lost in the remains of the HDD. For this purpose, I recommend Disk Drill.

As for Backups, I really recommend Backblaze, as it automatically backups all files to their cloud. You have optional encryption (which I don’t use, so I can access the file explorer on their website, to do selective backups). It’s a solution for non-technical folks that happens in the background with no need for user action. They might need your help to do a restore, but that’s very acceptable in my opinion. It has saved me several times for my own laptop.

If you are using this for a more technical use, you should be aware that by default, it does not back up large files like virtual machines or disk images. You should change this in the settings. Having said this, I also keep one time machine in an external disk, and another one in my NAS.

Aventura com as Giras

Apesar de adorar o conceito das Giras, — bicicletas partilhadas, com docas para não ficarem em qualquer sítio, com um passe anual tão baixo (25 euros) que incentiva as pessoas a usarem-nas em vez de transportes públicos como o metro ou autocarros — a implementação não tem sido nada feliz.

É certo que as demoras causadas pela falha de contrato de fornecedor não ajudaram, mas a minha principal queixa é a qualidade do software. Não que as bicicletas estejam sempre perfeitas, mas há forma de reportar e pedir que as arranjem.

Software pouco aperfeiçoado

Ora o primeiro problema (de muito menor importância) é que a aplicação em iOS não integra com o gestor de passwords nativo, pelo que é necessário autenticar-me sempre que abro a aplicação. É irritante, mas não me impede de usar o sistema. E mostra a pouca atenção que dão à experiência de utilização.

Assumem que tudo é perfeito.

Um problema bem mais significativo é o facto de quando se levanta uma bicicleta e se descobre que ela não funciona (desde coisas simples como o motor não funcionar, até não ter travões) e a devolvemos na própria estação ou na seguinte num prazo de 2 minutos, ele considera uma viagem completa e não deixa levantar outra num espaço de 5 minutos. A minha sugestão é não ter esse intervalo se a viagem tiver sido curta, e ter sido dado uma pontuação baixa indicando o problema com a bicicleta. Para ser usada como meio de transporte, não pode ter tantos entraves. E não é algo tão incomum que não deva ser levado a sério. Acontece-me uma vez a cada duas semanas.

Usam dark patterns.

Por falar em reportar problemas: no final de cada viagem é obrigatório pontuar a viagem. E no caso da pontuação ser baixa, é obrigatório escrever algo. Com cinco estrelas já não é. Ora, quando uma pessoa tem pressa vai dar 5 estrelas só para não escrever nada.

Para além de não dever ser obrigatório escrever algo, devia haver uma lista de problemas mais comuns por onde escolher. Torna-se a denúncia de problemas muito mais simples, e evita terem alguém no back-office a processar as mensagens dos vários utilizadores.

A história

Ora certo dia fui de bicicleta para o trabalho. Ao chegar, entreguei a bicicleta na doca e ela trancou a bicicleta. No entanto, na aplicação a viagem continuava a decorrer. Não a pude cancelar na aplicação, porque isso permitiria a qualquer um cancelar viagens sem devolver a bicicleta. Como infelizmente tem sido normal (e confirmado pelo suporte técnico), a doca não comunicou com sucesso com o servidor que a bicicleta tinha sido recebida. E como não foi a primeira vez, fiz o procedimento recomendado: ligar para o apoio técnico e esperar 5 minutos que o assunto ficasse resolvido.

Mas ninguém me atendeu a chamada. Segui para o meu trabalho e fiz o dia completo, esquecendo-me completamente do assunto. Quando quis regressar a casa, abri a aplicação e reparei que tinha tido a viagem a decorrer até ao momento. Reiniciei a aplicação e ele (finalmente) apercebeu-se que a viagem terminou. Mas tinha-me levado a saldo negativo. Tentei ligar para o suporte técnico outra vez, mas também ninguém me atendeu desta vez. Tive de carregar o saldo da aplicação, para poder ter saldo positivo e fazer a minha viagem de regresso.

Em casa escrevi um email muito zangado. Não era a primeira vez que me acontecia e já tinha oferecido os meus serviços para tentar solucionar o problema.
Responderam ao email num PDF em anexo (??).Devolveram-me o custo da viagem, afirmando que era uma excepção e que a culpa era minha. Isto depois do suporte técnico ao telefone em vezes anteriores me confirmar que eram as docas que nem sempre reportavam bem a actividade. Quanto ao dinheiro extra que tive de usar para carregar o saldo desnecessariamente fica do lado dele, desse não abrem a mão.

Conclusão: o software da Gira é mau, não tem sido melhorado, e o suporte técnico atira primeiro a culpa para os utilizadores, e demora imenso a admitir que o problema é deles, e não está disponível a devolver dinheiro introduzido na plataforma. Claramente as plataformas concorrentes privadas não sofrem destes problemas, porque têm concorrência, e pretendem que os utilizadores usufruam ao máximo da plataforma. Em justiça para a Gira, nenhuma plataforma privada faria preços tão baixos. Mas não é motivo para não se esforçarem na qualidade do software.

The First Lecture of My Compilers Class

Every year I have taught compilers, I start the first class by writing a compiler, even before going into the syllabus. Students are aware of what a compiler is, but not necessarily how it works. This introduction gives them this context, and allows students to make better sense of the syllabus and the contents for the rest of the semester.

As this is a very limited compiler (both in scope, and in quality), identifying those challenges also gives a perspective of why is it relevant to spend one or two classes on parsing.

Weekly Links

Academia

Development

Programming Languages