Alcides Fonseca

40.197958, -8.408312

European Union will stop funding Open-Source projects in Horizon Program

The European Union must keep funding free software

Since 2020, Next Generation Internet (NGI) programmes, part of European Commission’s Horizon programme, fund free software in Europe using a cascade funding mechanism (see for example NLnet’s calls). This year, according to the Horizon Europe working draft detailing funding programmes for 2025, we notice that Next Generation Internet is not mentioned any more as part of Cluster 4.

NGI programmes have shown their strength and importance to support the European software infrastructure, as a generic funding instrument to fund digital commons and ensure their long-term sustainability. We find this transformation incomprehensible, moreover when NGI has proven efficient and economical to support free software as a whole, from the smallest to the most established initiatives. This ecosystem diversity backs the strength of European technological innovation, and maintaining the NGI initiative to provide structural support to software projects at the heart of worldwide innovation is key to enforce the sovereignty of a European infrastructure.
Contrary to common perception, technical innovations often originate from European rather than North American programming communities, and are mostly initiated by small-scaled organizations.

The previous Cluster 4 allocated 27 million euros to:

“Human centric Internet aligned with values and principles commonly shared in Europe”; “A flourishing internet, based on common building blocks created within NGI, that enables better control of our digital life”; “A structured eco-system of talented contributors driving the creation of new internet commons and the evolution of existing internet commons”.

In the name of these challenges, more than 500 projects received NGI funding in the first 5 years, backed by 18 organisations managing these European funding consortia.

NGI contributes to a vast ecosystem, as most of its budget is allocated to fund third parties by the means of open calls, to structure commons that cover the whole Internet scope – from hardware to application, operating systems, digital identities or data traffic supervision. This third-party funding is not renewed in the current program, leaving many projects short on resources for research and innovation in Europe.

Moreover, NGI allows exchanges and collaborations across all the Euro zone countries as well as “widening countries”¹, currently both a success and an ongoing progress, like the Erasmus programme before us. NGI also contributes to opening and supporting longer relationships than strict project funding does. It encourages implementing projects funded as pilots, backing collaboration, identification and reuse of common elements across projects, interoperability in identification systems and beyond, and setting up development models that mix diverse scales and types of European funding schemes.

While the USA, China or Russia deploy huge public and private resources to develop software and infrastructure that massively capture private consumer data, the EU can’t afford this renunciation.
Free and open source software, as supported by NGI since 2020, is by design the opposite of potential vectors for foreign interference. It lets us keep our data local and favors a community-wide economy and know-how, while allowing an international collaboration.
This is all the more essential in the current geopolitical context: the challenge of technological sovereignty is central, and free software allows to address it while acting for peace and sovereignty in the digital world as a whole.

— OW2

The Register has more information on this issue.

4 ways to break out of a firewalled environment

There are four tricks in our arsenal that we’re going to use to jailbreak internal hosts behind a restrictive customer firewall:

Gabriella Gonzalez

I’ve lost count of how many times I’ve needed to open a socket between machines sitting behind VPNs or firewalls. This is a handy summary of the available techniques.

Uv as the sane Python packaging default

There are two main issues with Python’s packaging tooling: there is no sane way to distribute Python-written apps that guarantees all dependencies (including OS-level dependencies written in C and Fortran) are present, and there is no sane way to manage the dependencies of your Python apps.

In particular, in-the-wild requirements.pip files are neither OS-aware nor Python-version-aware. Maybe with Python 3.9 I want one set of dependencies, and with Python 3.13 I want another. This becomes relevant when dealing with multiple (API-breaking) versions of numpy and other scientific packages.

Astral released a new uv version that tries to solidify the ecosystem, replacing all the usual, incomplete suspects (poetry, pip-tools, etc.). Their newish goal with uv is to become Python’s cargo. With uv, you can now create and run projects, execute standalone Python tools on your OS (replacing pipx), and manage different Python versions.

But one of my favorite features is the support for PEP 723, which allows you to include dependencies in single-file python scripts:

#!/usr/bin/env -S uv run
# /// script
# requires-python = ">=3.12"
# dependencies = [
#     "flask==3.*",
# ]
# ///
import flask
# ...

(via Simon Willison)
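
To make the PEP 723 block less magical, here is a minimal sketch of how a tool could pull the embedded TOML out of a script. It is a simplified line-based reader, not the reference implementation from the PEP and not what uv does internally; the function name `extract_script_metadata` and the `EXAMPLE` string are hypothetical.

```python
from typing import Optional

# Hypothetical sample script carrying an inline metadata block.
EXAMPLE = '''# /// script
# requires-python = ">=3.12"
# dependencies = [
#     "flask==3.*",
# ]
# ///
import flask
'''


def extract_script_metadata(source: str) -> Optional[str]:
    """Return the TOML text of the first `# /// script` block, or None."""
    toml_lines = []
    inside = False
    for line in source.splitlines():
        if not inside:
            # Skip everything until the opening marker.
            if line == "# /// script":
                inside = True
            continue
        if line == "# ///":
            # Closing marker: the block is complete.
            return "\n".join(toml_lines)
        if line.startswith("# "):
            toml_lines.append(line[2:])  # strip the "# " comment prefix
        elif line == "#":
            toml_lines.append("")        # a bare "#" is an empty TOML line
        else:
            return None                  # non-comment line: block is malformed
    return None                          # no block, or it was never closed


metadata = extract_script_metadata(EXAMPLE)
print(metadata)
```

The returned text is plain TOML, so a real tool would hand it to a TOML parser to read `requires-python` and `dependencies`.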

Rust Low-level Concurrency in Practice

Mara Bos and O’REILLY have made Rust Atomics and Locks available online for free.

If the only thing you need is to create threads to handle different tasks that share no state, you won’t need this. But if you are working on data structures shared among threads, this book provides a full reference with examples that will help you step up to the next level in Rust. Highly recommended!

I might even use this book to revamp my “Parallel and Concurrent Programming” master-level course.

Scope of Generics in Python

Thanks to Continuous Integration, I have found a typing problem in our genetic engine program synthesis framework. It boiled down to me not defining a scope for a type variable.

I started with some code that looked like the following:

with the following error:

main.py:18: error: Argument 1 has incompatible type "P@consume"; expected "P@__init__"  [arg-type]
Found 1 error in 1 file (checked 1 source file)

You can load this example on the MyPy playground if you want to play around with it.

In this case, MyPy is inferring the type of data as dict[str, Callable[[P@__init__], bool]]: the P in that type is scoped to __init__, so it ends up being a different variable from the P used inside the consume function. This happens because type variables are, by default, bound to the function or method and not to the class. The first step is to introduce an explicit annotation for data, with the type dict[str, Callable[[P], bool]], inside Subclass. Now we get a different error:

main.py:17: error: Dict entry 0 has incompatible type "str": "Callable[[P@__init__], bool]"; expected "str": "Callable[[P], bool]" [dict-item]

Now the P type variable in the field annotation is different than the ones inside the method. To actually bind the type variable to the whole class, we need to extend Generic[P]:

Now, we have no typing errors, and we do not even need the explicit type declaration for data.
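
A minimal sketch of the class-scoped version, with hypothetical names (`Base`, the `"check"` key, the predicate argument) chosen only to match the identifiers in the error messages; this is not the original code from the framework:

```python
from typing import Callable, Generic, TypeVar

P = TypeVar("P")


class Base:
    """Hypothetical parent class, only here to mirror the class hierarchy."""


class Subclass(Base, Generic[P]):
    # Extending Generic[P] binds P to the whole class, so the P seen in
    # __init__ and the P seen in consume are the same type variable.
    def __init__(self, predicate: Callable[[P], bool]) -> None:
        self.data = {"check": predicate}

    def consume(self, value: P) -> bool:
        return self.data["check"](value)


positive = Subclass[int](lambda n: n > 0)
print(positive.consume(3), positive.consume(-1))
```

Without `Generic[P]` in the bases, the same body is where the two method-scoped variables, `P@__init__` and `P@consume`, would come from.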

Most of this issue was due to me not clearly understanding the default behaviors of type variables¹. Luckily, if you are able to only support Python 3.12 and upwards, you can use the new, saner syntax. And maybe someday I’ll finish the draft post where I explain why Python’s approach to typing is the best (for prototyping type systems and meta-programming techniques, like we do in GeneticEngine) and the worst (for real-world use).

¹ Who the hell creates a type variable through the definition of a variable??

Software Bill of Materials (in Rust)

Ferrous Systems has been hired to improve the state of SBOM (Software Bill of Materials) in Rust.

What is an SBOM and why is it important? A Software Bill of Materials (or SBOM) declares, among other things, the inventory of all components used to build the software artifacts, as part of the software supply chain. Using this information can help detect vulnerability / security issues with the software or determine all conflicts in used licenses. A major reason to provide SBOMs for software in Germany is that the Federal Office for Information Security highly recommends them as part of their technical guidelines for Cyber Resilience (see PDF for details).

In recent years a number of pieces of legislation have been passed to improve cybersecurity. For example the US issued an Executive Order on improving the Nation’s Cybersecurity. In Europe, the EU has proposed the Cyber Resilience Act to improve cybersecurity and cyber resilience. These efforts are in response to an increased number of cyber attacks in recent years.

Not that most companies care, but verifying the compatibility of the software licenses across all the dependencies of your project should be a one-command task. Furthermore, high-risk projects should vendor all their dependencies and keep track of upstream progress, which is rarely accounted for when budgeting a new software project.
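
As a rough illustration of what that one command could boil down to, here is a sketch that checks an inventory against an allow-list of licenses. The component names, the inventory format and the function `find_license_conflicts` are all hypothetical, and real license compatibility is subtler than set membership:

```python
from typing import Iterable, List, Set, Tuple

# Hypothetical SBOM-like inventory: (component name, SPDX license identifier).
INVENTORY = [
    ("libfoo", "MIT"),
    ("libbar", "Apache-2.0"),
    ("libbaz", "GPL-3.0-only"),
]


def find_license_conflicts(
    components: Iterable[Tuple[str, str]], allowed: Set[str]
) -> List[str]:
    """Return, sorted, the components whose license is not in the allow-list."""
    return sorted(name for name, license_id in components if license_id not in allowed)


conflicts = find_license_conflicts(INVENTORY, {"MIT", "Apache-2.0"})
print(conflicts)
```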

SPECIES Scholarships 2024

SPECIES is a society that aims to promote evolutionary algorithmic thinking, best known for organising Evo*. They also have a scholarship program that allows students (and recent PhD graduates) to spend 3 months doing an internship at selected host institutions.

It just happens that I was accepted as a host institution. So if you are interested in Program Synthesis, in particular in exploring type systems to make synthesis more efficient, or in using heuristic methods to scale type-driven synthesis (à la synquid or Idris), consider applying!

Also, feel free to reach out to me if you need more details. I’m quite flexible in regards to the logistics.

Trackmania and Machine Learning

I’m not that into video games (I can count on one hand the games I have actually played and appreciated), but one of those is Trackmania. You have a car, and you just use the four arrow keys to make it to the end of a Hot Wheels-esque, loop-filled, stadium-sized track.

Now, if you are into machine learning or self-driving cars, David and Felipe just posted a wonderful survey about the history of ML in Trackmania. While real-world self-driving cars take all the attention, there is much you can study with just a computer game.

Visualisation of Proofs in Lean

Awesome video explaining an interesting hack to visualize the different steps in Lean proofs with smooth animations.

Network being dropped on an ASUS Strix X670OE-E motherboard

I found logs showing that the network PCIe device was being dropped in Ubuntu 22.04.

igc (...) eno1: PCIe link lost, device now detached

After looking it up, I reached the conclusion that that particular chip overheats, which causes the kernel to drop the device.

Other than adding a heatsink, the solution is to change the OS configuration so that the chip runs slower and doesn’t overheat:

  • Adding these two kernel parameters: pcie_port_pm=off pcie_aspm.policy=performance
  • Disabling a bunch of TCP features: sudo ethtool --offload eno1 rx off tx off. I personally find this option a bit scary, so I ended up reversing it.

Functional Programming in Python 3.12

Oskar Wickström shows off recent Python features (generics + pattern matching) that make writing Python more similar to ML, Haskell, Rust or Scala. If you need to support old versions of Python, you will have to wait a couple years before you use proper generics syntax (although you can use the TypeVar class).


def print_tree[T](tree: RoseTree[T]):
    trees = [(tree, 0)]
    while trees:
        match trees.pop(0):
            case Branch(branches), level:
                print(" " * level * 2 + "*")
                trees = [(branch, level + 1) for branch in branches] + trees
            case Leaf(value), level:
                print(" " * level * 2 + "- " + repr(value))

Statically Typed Functional Programming with Python 3.12
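
For older Python versions, a sketch of the same traversal using the TypeVar class instead of the 3.12 generics syntax. The `Leaf` and `Branch` definitions are assumptions inferred from the patterns matched above, `render_tree` is a hypothetical name, and it uses `isinstance` instead of `match` and returns the lines instead of printing them:

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Generic, TypeVar, Union

T = TypeVar("T")


@dataclass
class Leaf(Generic[T]):
    value: T


@dataclass
class Branch(Generic[T]):
    branches: list[RoseTree[T]]


# Assumed shape of the tree type: either a branch of subtrees or a leaf.
RoseTree = Union[Branch[T], Leaf[T]]


def render_tree(tree: RoseTree[T]) -> list[str]:
    """Depth-first rendering with two spaces of indentation per level."""
    lines = []
    trees = [(tree, 0)]
    while trees:
        node, level = trees.pop(0)
        if isinstance(node, Branch):
            lines.append(" " * level * 2 + "*")
            # Children go to the front of the worklist to keep depth-first order.
            trees = [(branch, level + 1) for branch in node.branches] + trees
        else:
            lines.append(" " * level * 2 + "- " + repr(node.value))
    return lines


example = Branch([Leaf(1), Branch([Leaf("a")])])
print("\n".join(render_tree(example)))
```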

Improve file transfer speed between Macs and Synology

To improve the file transfer speed between macOS and Synology, you should do the following:

  • On Synology, go to Control Panel/File Services/SMB and under Advanced Settings upgrade the minimum SMB version from 1 to 2. This might prevent old devices from connecting, but can improve the speed of modern devices.
  • On macOS, disable packet signing.

The Alternative Implementation Problem

Hopefully, at this point, you see where I’m going with this. What I’ve concluded, based on experience, is that positioning your project as an alternative implementation of something is a losing proposition. It doesn’t matter how smart you are. It doesn’t matter how hard you work. The problem is, when you build an alternative implementation, you’ve made yourself subject to the whims of the canonical implementation. They have control over the direction of the project, and all you can do is try to keep up. In the case of JITted implementations of traditionally interpreted languages, there’s a bit of a weird dynamic, because it’s much faster to implement new features in an interpreter. The implementers of the canonical implementation may see you as competition they are trying to outrun. You may be stuck trying to ice skate uphill.

Maxime Chevalier-Boisvert

This is surely true in Python or Lua, but I believe it might not necessarily be the case for Java (where there is a specification, and enough effort by the industry to create alternative implementations). But I agree in general, unless you have something unique (like Android support, despite Dalvik being stuck on Java 8 for ages), which both IronPython and Jython didn’t have — I guess there is no general need for accessing the .NET and JVM runtimes from dynamic languages.

ChatGPT in Papers

A Google Scholar search for “certainly, here is” turns up a huge number of academic papers that include parts that were evidently written by ChatGPT; sections that start with “Certainly, here is a concise summary of the provided sections:” are a dead giveaway.

Simon Willison

Peer review isn’t built to handle the flood of AI content, especially as not all of it will be obvious, and not all will be malicious (lots of scholars pay editors to help make their writing better, now they will use chat).

Ethan Mollick

Misha Teplitskiy

The AI boom will soon crash

Put in the simplest way: Things have been too good for too long in InvestorWorld: low interest, high profits, the unending rocket rise of the Big-Tech sector, now with AI afterburners. Wile E. Coyote hasn’t actually run off the edge of the cliff yet, but there are just way more ways for things to go wrong than right in the immediate future.

Money Bubble by Tim Bray

Tim (correctly) points out that when investors throw money at things that are not well understood (.com, web2.0, blockchain, AI), those things eventually disappoint and crash the markets. Enjoy it while you can.

Mamba: The Easy Way

Today, basically any language model you can name is a Transformer model. OpenAI’s ChatGPT, Google’s Gemini, and GitHub’s Copilot are all powered by Transformers, to name a few. However, Transformers suffer from a fundamental flaw: they are powered by Attention, which scales quadratically with sequence length. Simply put, for quick exchanges (asking ChatGPT to tell a joke), this is fine. But for queries that require lots of words (asking ChatGPT to summarize a 100-page document), Transformers can become prohibitively slow. […] Mamba appears to outperform similarly-sized Transformers while scaling linearly with sequence length.

Mamba: The Easy Way, by Jack Cook

A wonderful explanation of the architectural differences in Mamba and why it is much faster than existing Transformer implementations. It may require some CNN/RNN background to fully understand.
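
To get a feel for the quadratic versus linear claim in the quote, here is a back-of-the-envelope sketch. It only counts token interactions and ignores constant factors, attention heads, layers and hardware effects; the function names are hypothetical.

```python
def attention_pair_count(seq_len: int) -> int:
    """Full self-attention scores every (query, key) pair: seq_len squared."""
    return seq_len * seq_len


def linear_step_count(seq_len: int) -> int:
    """A linearly scaling model does a fixed amount of work per token."""
    return seq_len


for n in (100, 10_000):
    ratio = attention_pair_count(n) // linear_step_count(n)
    print(f"{n} tokens: {attention_pair_count(n)} pairs, {ratio}x the linear cost")
```

Doubling the sequence length doubles the linear count but quadruples the number of attention pairs, which is why long documents hurt.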

Power Metal Data Analysis

Bands from Spain, Germany and Finland have an average of more than 1600 words vocabulary; in comparison native countries like UK, US and Scotland have an average of 925, 1383 and 1501 words respectively. The most metal words are deliverance, defender, honour, forevermore, realm and the least are shit, baby, fuck, girl, verse. The most negative song is Condemned To Hell by Gamma Ray and the most positive There’s Something In The Skies by Dark Moor.

Power Metal: is it really about dragons? by Matt D.

I really wish there was code available, because I think the 58 bands provide a quite limited dataset, and I’m curious about my own Power Metal collection.

Bloom Filters Explained

While this looks almost identical to a Set, there are some key differences. Bloom filters are what’s called a probabilistic data structure. Where a Set can give you a concrete “yes” or “no” answer when you call contains, a bloom filter can’t. Bloom filters can give definite “no”s, but they can’t be certain about “yes.”

Bloom Filters by Sam Rose
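
A minimal sketch of the idea, assuming SHA-256 with a seed prefix as the family of hash functions and one byte per bit for readability; the class and method names are hypothetical and this is not the implementation from Sam Rose’s post:

```python
import hashlib


class BloomFilter:
    def __init__(self, size_bits: int = 1024, num_hashes: int = 3) -> None:
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits)  # one byte per bit, for clarity

    def _positions(self, item: str):
        # Derive num_hashes positions by hashing the item with different seeds.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item: str) -> None:
        for position in self._positions(item):
            self.bits[position] = 1

    def might_contain(self, item: str) -> bool:
        # False is a definite "no"; True only means "possibly",
        # because other items may have set the same bits.
        return all(self.bits[position] for position in self._positions(item))


seen = BloomFilter()
seen.add("dragon")
print(seen.might_contain("dragon"))
```

An item that was added always answers True, and an empty filter always answers False; anything in between is where false positives can appear.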