Alcides Fonseca


Posts tagged as Artificial Intelligence

Overview of what has been happening to LLMs

It’s impossible to keep up with all the new developments in the LLM-era. However, one thing has been true: they never stopped improving.

Malte Skarupke explains How LLMs Keep on Getting Better, covering several of the visible and invisible aspects of LLMs that have been worked on over the past couple of years. It’s a really good overview for those who are not deep in the weeds.

The Guardian on Europe's dependency on US Big Tech

(via Antónia)

An excellent layman’s recap of the dependency (in terms of defense, but also the economy) that Europe has on US tech. What happens if we cannot have US-owned operating systems on our mobile phones? Or if we cannot buy American brands for our hospital computers and servers? Will you still receive emails or direct messages?

I will continue my quest to move away from Gmail to something European. Unfortunately, the Portuguese SAPO is no longer an alternative, so I will have to go for something German, Dutch or Swiss.

It's the end of anonymity in open-source as we know it.

There is no longer a curl bug-bounty program. It officially stops on January 31, 2026. […] Starting 2025, the confirmed-rate plummeted to below 5%. Not even one in twenty was real. The never-ending slop submissions take a serious mental toll to manage and sometimes also a long time to debunk. Time and energy that is completely wasted while also hampering our will to live.

The end of the curl bug bounty by Daniel Stenberg

Early last year I argued that the internet needed to stop being anonymous so that we could live among LLM-generated content. The end of the curl bug bounty program is another piece of evidence: if we cannot tie submissions to real people, track their reputation, and eventually block them from trying a second or third time, programs like this cannot survive.

PGP was probably a solution ahead of its time. On the other hand, maybe we were lucky with what we achieved with anonymous developers working together on the internet.

Simon Willison reinvents TDD

As software engineers we don’t just crank out code—in fact these days you could argue that’s what the LLMs are for. We need to deliver code that works—and we need to include proof that it works as well. Not doing that directly shifts the burden of the actual work to whoever is expected to review our code.

Simon argues that engineers should provide evidence that things work when pushing PRs onto other projects. I recently had random students from other countries pushing PRs onto my repos, and I spent too much time reviewing them and making sure they worked. I 100% agree with Simon on this, but I feel the blog post is a bit pessimistic in suggesting that software engineers might become mere verifiers of correctness.

Don’t be tempted to skip the manual test because you think the automated test has you covered already! Almost every time I’ve done this myself I’ve quickly regretted it.

This is my experience for user-facing software. But these days, I spend little time writing user-facing code other than compiler flags.

Needy programs

Notifications are the ultimate example of neediness: a program, a mechanical, lifeless thing, an inanimate object, is bothering its master about something the master didn’t ask for. Hey, who is more important here, a human or a machine?

Nikita Prokopov

Funny piece by Niki, describing how post-2010 software has become needy: subscriptions, notifications, what’s-new panels, accounts. I wonder how much of this is due to the Facebook-inspired, all-in, VC-backed software model. You need to collect statistics and pay server costs, even if your app could work perfectly offline.

VSCode started as an interesting alternative to IDEs. Now I can no longer use it in my classroom: notifications, status bars, sidebars and Copilot all get in the way of showing (and navigating) code. I really want to go back to TextMate, but it lacks LSP support. Zed is the new kid on the block, but the collaborative aspect of it kind of ruins it for me. I want a native editor that I pay for once and that doesn’t distract me. If I want to use AI, I want a second editor for that (Cursor 2.0 is moving in that direction, but it is still not there for me).

Rodney Brooks of iRobot fame

Om Malik interviews iRobot founder Rodney Brooks:

At MIT, I taught big classes with lots of students, so maybe that helped. I came here in an Uber this morning and asked the guy what street we were on. He had no clue. He said, “I just follow it.” (‘It’ being the GPS—Ed.) And that’s the issue—there’s human intervention, but people can’t figure out how to help when things go wrong.

Taxi drivers used to have to know every single street in the city to get their license issued. TVDE (ride-hailing) drivers don’t even need to know street names. If you ask for directions to a Portuguese-named hotel in Lisbon, they ask you to type it into their phones. Navigation apps have done us a disservice by not being designed to teach humans points of interest and navigation. Let’s hope you have a power bank near you when you get lost in your own city!

We’re trying to put technology in the manual warehouses, whether it’s DHL—our biggest customer—or Amazon. It’s about putting robots in places where there are no robots. And it’s not saying it’s a humanoid that’s going to do everything.
You’re right, it’s not sexy. And you know what that means for me? It’s hard to raise money. “Why aren’t you doing something sexy?” the VCs ask. But this is a $4 trillion market that will be there for decades.

Software companies (Microsoft, Google, Facebook) have shifted the mindset of VCs. Because software spreads so fast, it was much easier to obtain monopolies (and make a ton of money) than with previous life-changing inventions (phones, computers, cars). Everyone is looking for the next unicorn like it’s the Gold Rush.

I always say about a physical robot, the physical appearance makes a promise about what it can do. The Roomba was this little disc on the floor. It didn’t promise much—you saw it and thought, that’s not going to clean the windows. But you can imagine it cleaning the floor. But the human form sort of promises it can do anything a human can. And that’s why it’s so attractive to people—it’s selling a promise that is amazing.

This is what you get when you study interaction design: physical affordances and skeuomorphism. If you were an 18th-century time-traveler, you would be more likely to be able to use an early iPhone than the current Liquid Design ones.

I think we need multiple education approaches and not put everything in the same bucket. I see this in Australia—”What’s your bachelor’s degree?” “I’m doing a bachelor’s degree in tourism management.” That’s not an intellectual pursuit, that’s job training, and we should make that distinction. The German system has had this for a long time—job training being a very big part of their education, but it’s not the same as their elite universities.

In Portugal, the technical schools and universities are now offering the same courses (given in the same style), including PhDs, with no distinction. Diversity is healthy and should address the dichotomy of learning to get a job, and learning to change the world. Both need distinct methods and depths.

As 3D printing becomes more general, in the same way information technology and payment systems got adopted in the third world more quickly than in the US, 3D printing will become the engine of manufacturing.
Right now, the supply chain is the reason China is so effective. Chinese manufacturing companies realized they had to diversify and started building supply chains in places like Malaysia, Vietnam. But if 3D printing really gets to be effective, the supply chain becomes all about raw materials that get poured into the front of those 3D printers. It’ll be about certain chemicals, about raw materials, because then every item would ultimately be 3D printed. That completely breaks the dynamic of what made Chinese manufacturing so strong—the supply chain of components.

Brooks is the first person to offer me an optimistic viewpoint on manufacturing in the face of China’s upcoming world dominance.

Well worth the read!

WarGames (1983)

I’ve always wanted to recommend a few movies for my students to think about the societal impact of their work. This semester I am finally doing it, and I’m starting with WarGames, from 1983.

A very young Matthew Broderick plays a young “hacker” who learns how computers talk to each other, and his curious mind leads him to play a game against an ’80s-style AI. This AI is realistic in the sense that it learns from different executions (min-maxing strategies, pre-Reinforcement Learning) to estimate the best course of action.
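As a minimal illustration of the min-maxing idea (my own Python sketch, not anything from the film), exhaustively evaluating tic-tac-toe shows that perfect play always ends in a draw, which is essentially the lesson the machine ends up learning:

```python
from functools import lru_cache

# All winning lines on a 3x3 board, cells indexed 0..8.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

def winner(board):
    for a, b, c in LINES:
        if board[a] != "." and board[a] == board[b] == board[c]:
            return board[a]
    return None

@lru_cache(maxsize=None)
def value(board, player):
    """Game value for X (+1 win, 0 draw, -1 loss) with both sides playing optimally."""
    w = winner(board)
    if w == "X":
        return 1
    if w == "O":
        return -1
    if "." not in board:
        return 0
    children = [value(board[:i] + player + board[i + 1:], "O" if player == "X" else "X")
                for i, cell in enumerate(board) if cell == "."]
    return max(children) if player == "X" else min(children)

print(value("." * 9, "X"))  # 0: with perfect play, nobody ever wins
```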

Early in the movie you see him using early modems and connecting directly to any machine in the world via phone number. Later you see him trying some phreaking with a payphone. Too bad he didn’t have a Cap’n Crunch whistle with him.

It also features one of the first depictions of hallucination in AI, predating the term’s 1995 origin. I might be stretching it a little, but it makes complete sense to me.

Finally, there is a message in the movie that critical systems should have a human in the loop as a safeguard. I wonder how many companies and individuals today have the same urge to replace humans with machines in super-critical scenarios. Oh, but machines act immediately, without a second thought.

Reddit will block the Internet Archive

Reddit says that it has caught AI companies scraping its data from the Internet Archive’s Wayback Machine, so it’s going to start blocking the Internet Archive from indexing the vast majority of Reddit. The Wayback Machine will no longer be able to crawl post detail pages, comments, or profiles; instead, it will only be able to index the Reddit.com homepage, which effectively means Internet Archive will only be able to archive insights into which news headlines and posts were most popular on a given day.

Jay Peters for The Verge (via Simon Willison)

Next it will be Google. And most post-2025 forum knowledge on the web will be lost. Imagine a web browser that cannot access StackOverflow or Reddit. How useful would it be? LLMs will need new data to remain relevant, and a new data monetization strategy will change the internet forever.

Using AI to get an answer

AI is a Floor Raiser, not a Ceiling Raiser

Every day I become more convinced that Software Engineering should be taught without AI. AI can give you answers to easy problems, but you won’t build the mental models of how things work that you need to solve hard problems.

Next semester I am teaching a course on practices and tools in Software Engineering (my take will be inspired by MIT’s The Missing Semester). AI usage will be one of the topics, where we will explore MCP, IDE integrations and AI-assisted documentation.

But I have no idea how to write assignments for the other topics. It is very likely that an AI will be able to complete the assignments without any human intervention. If students opt to do that (they will, such is my faith in our grade-oriented system), they will not achieve the learning outcomes.

Students will ask: if AI can do these tasks, why should we learn to do them? Well, my school math teachers still assigned me problems that machines could already solve back then. Creating mental models of how things work is essential in education.

Now the real question is about incentives. We should assess whether students can use their mental models, not whether they can solve the task. That is especially hard with 100 students, where exams or take-home assignments are the norm.

Subliminal Learning in LLMs

Anthropic published a paper showing that if you ask an LLM to generate data on one topic, that data carries over its biases into (humanly) unrelated topics (via Simon Willison).

For me, this is important because it shows that the entanglement inside the neural network is actually being used. Neural networks do not aggregate concepts in the same way humans do, which, in turn, shows they will never be interpretable. Scary!

Weavers against the Machine

The Iconfactory is a boutique design studio that focuses on app icons. Their business is being replaced by AI.

This is the challenge of generative AI: it replaces (or automates) creative work. Design is one of those fields where the number of employed designers will decrease, and you will need to be either very good or very productive. If I were a design undergrad, I would try to learn the difficult stuff, not the run-of-the-mill stuff that will be easily automated.

LLMs have the same right as humans when it comes to copyright and learning

As I stated before, the boundary of what is copyright infringement when it comes to machine training is quite blurred.

If you see LLMs as their own entities (I don’t, but I’m afraid Nobel laureate Geoffrey Hinton does), they have the same right to learn as humans. They just happen to have photographic (literary?) memory. Is it their fault?

On the other hand, you can look at LLMs as a form of compression. Lossy, yes, but a compression algorithm nevertheless. In that case, if you zip a book and unzip it, even with a few faults, it’s still the same book you copied.

Legislation will have to decide on this sooner or later.

William Haskell Alsup, of Oracle v. Google fame, ruled that buying and scanning books to train LLMs was legal. He also decided that downloading ebooks pirated by a third party was not.

Regardless of my own position, I believe every government should create a task force to think about this, including experts from different fields. The last time something like this happened (peer-to-peer, Napster, The Pirate Bay), legislation took too long to arrive. Now, things are moving at an ever faster pace. And I’m afraid our legal systems are not flexible and agile enough to adapt.

No AI in Servo

Contributions must not include content generated by large language models or other probabilistic tools, including but not limited to Copilot or ChatGPT. This policy covers code, documentation, pull requests, issues, comments, and any other contributions to the Servo project.

A web browser engine is built to run in hostile execution environments, so all code must take into account potential security issues. Contributors play a large role in considering these issues when creating contributions, something that we cannot trust an AI tool to do.

Contributing to Servo (via Simon Willison)

Critical projects should be more explicit about their policies. If I maintained a critical piece of software, I would make the same choice, for safety. The advantages of LLMs are not big enough to be worth the risk.

Transformative and Compositional Work

To assess the productivity gained by LLMs, Iris Meredith distinguishes between transformative and compositional work:

While work is obviously a very, very complicated thing, a useful lens for the purpose of this essay is to draw a distinction between work which reshapes a raw material into a finished object and work that puts together multiple objects in a way that creates a certain effect in the world. For the sake of having a shorthand, I’ve chosen to call them transformative and compositional work, respectively.

While Iris does not take a particularly clear position on LLMs, Jeremy Keith does:

My own take on this is that transformative work is often the drudge work—take this data dump and convert it to some other format; take this mock-up and make a disposable prototype. I want my tools to help me with that.
But compositional work that relies on judgement, taste, and choice? Not only would I not use a large language model for that, it’s exactly the kind of work that I don’t want to automate away.
Transformative work is done with broad brushstrokes. Compositional work is done with a scalpel.

Personally, I think it depends much more on where you are in the economic value proposition. Are you selling quick-and-dirty cheap stuff? LLMs are great for you. Are you delivering high-assurance, high-quality work? Then you might/should be skeptical of LLMs.

The Great Scrape

These companies are racing to create the next big LLM, and in order to do that they need more and more novel data with which to train these models. This incentivises these companies to ruthlessly scrape every corner of the internet for any bit of new data to feed the machine. Unfortunately these scrapers are terrible netizens and have been taking down site-after-site in an unintentional wide-spread DDoS attack.

More tools are being released to combat this, one interesting tool from Cloudflare is the AI Labyrinth which traps AI scrapers that ignore robots.txt in a never-ending maze of no-follow links. This is how the arms race begins.

Herman Martinus
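As a toy sketch of the labyrinth idea (my own, nothing like Cloudflare’s actual implementation): robots.txt forbids everything, and every other request returns a page of rel="nofollow" links to freshly invented pages, so only scrapers that ignore robots.txt keep crawling, forever.

```python
import random
import string
from http.server import BaseHTTPRequestHandler, HTTPServer

def random_path():
    return "/" + "".join(random.choices(string.ascii_lowercase, k=12))

class Labyrinth(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/robots.txt":
            # Well-behaved crawlers are told to stay away entirely.
            body, ctype = "User-agent: *\nDisallow: /\n", "text/plain"
        else:
            # Everyone else gets 20 links into freshly invented pages.
            links = "\n".join(f'<a rel="nofollow" href="{random_path()}">more</a>'
                              for _ in range(20))
            body, ctype = f"<html><body>{links}</body></html>", "text/html"
        self.send_response(200)
        self.send_header("Content-Type", ctype)
        self.end_headers()
        self.wfile.write(body.encode("utf-8"))

if __name__ == "__main__":
    HTTPServer(("localhost", 8000), Labyrinth).serve_forever()
```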

This is a lost fight. As I’ve written before, the only solution I see is to encrypt content and monetize it, so that there is a trace to show to the courts.

How scientists learn computing and use LLMs to program

“scientists often use code generating models as an information retrieval tool for navigating unfamiliar programming languages and libraries.” Again, they are busy professionals who are trying to get their job done, not trying to learn a programming language.

How scientists learn computing and use LLMs to program: Computing education for scientists and for democracy

Very interesting read, especially since we teach programming to non-CS students, which is fundamentally different from teaching CS majors. Scientists are often multilingual (Python, R, bash) and use LLMs to get the job done. Their goal is not to write large, maintainable software, but rather scripts that achieve a goal.

Now I wonder how confident they are that their programs do what they are supposed to do. In my own research, I’ve found invisible bugs (in bash, in parameter setting, usually in parts of the code that are not algorithmic) that produce the wrong result. How many results in published articles are wrong because of bugs like these?
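To make “invisible bug” concrete, here is a hypothetical Python example of the kind of parameter-setting mistake I mean: the script runs cleanly, but the experiment silently uses the default value.

```python
# Hypothetical example: nothing here refers to real research code.
DEFAULTS = {"threshold": 0.5, "iterations": 100}

def configure(overrides):
    config = dict(DEFAULTS)
    for key, value in overrides.items():
        if key in config:          # unknown keys are silently dropped...
            config[key] = value
    return config

# The scientist believes they are running a stricter threshold,
# but the typo ("treshold") is ignored and no error is raised.
config = configure({"treshold": 0.9})
print(config)  # {'threshold': 0.5, 'iterations': 100} -- wrong experiment, clean run
```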

We might need to improve the quality of code written by non-computer scientists.

Copyright, AI and the Future of the Web

Gorillaz’s Damon Albarn and Kate Bush are among 1000 artists who launched a silent album (on Spotify no less) in protest against the UK government allowing AIs to be trained using copyright-protected work without permission.

This protest highlights the tension between creating valuable tools and devaluing human content.

The value of AI

ChatGPT (and even Apple Intelligence) is trained on information publicly available on the internet, data from (consenting) third parties, and information provided by its employees or contractors. Over the last year and a half, people have been amazed at what ChatGPT can do. Although the quality of its work fluctuates as data and methods are updated, ChatGPT and similar tools are being used to create value. But at what cost?

Unconsciously, The Algorithm has become more and more important in our lives. From Instagram and TikTok reels to X and Facebook timelines and Spotify, YouTube, or Netflix recommendations, the decision of what we see is no longer ours. And we are not delegating our choices to a human editor either (as was the case with the old boring telly or radio channels). Those decisions are being made by black-box algorithms hidden in the shadows.

The EU AI Act, which I blogged about before, only requires explainability for applications in high-risk domains. Entertainment can hardly be thought of as high-risk. However, I would argue that, given the importance of online content consumption in today’s society, it should be considered high-risk. One example is the perceived power of Twitter/X in political elections.

On the other hand, educational purposes are considered fair use in most countries (which is certainly true here in Portugal). What is the difference between fair use for human learning and for machine learning? As we become increasingly dependent on AI for our daily tasks (I use Siri and Reminders to augment my memory and recall), we become de facto cyborgs. Is there a difference between human and machine learning for education?

The devaluation of human content

In 2017, Spotify introduced the Perfect Fit Content program, encouraging editors to include in their playlists songs purposely designed to fit a given mood. Liz Pelly goes into all the details in her piece The Ghosts in the Machine. Several companies, some using humans and some using AI, have started producing music à la carte for Spotify.

According to The Dark Side of Spotify, Spotify’s investors are also investing in these companies (whose phantom artists appear on the platform under random names, with no online presence outside it) and promoting the use of AI to beat the algorithm. While this vertical integration might raise antitrust or monopoly concerns, the fact is that Netflix has been successful in expanding into content production (as Disney has been in expanding into content distribution).

AIs are much more productive at generating music than humans. Which is not necessarily the same as being successful at producing music a) that humans enjoy or b) that is commercially viable. The Musical Turing Test is almost solved, addressing a). Commercial viability is even easier to address. Because the cost of producing AI music is so low compared to the human equivalent, AI companies can flood the market with millions of songs and let the algorithm filter out the ones that do not work. In that scenario, human musicians are not just competing with each other for users’ attention; they are now unable to be showcased to users without an explicit search. Additionally, AI can cater to some audiences, based on data extracted from these networks (remember that Spotify’s investors also invest in AI music production companies?), better than humans can, at least at scale.

And I’m aware AI can be a tool for musicians, but if AI can perform end-to-end music generation passing the Musical Turing Test, it becomes much more interesting from a commercial standpoint.

The only chance for musicians is to promote their own content outside of these platforms, abandoning the initial goal of Web 2.0, where anyone can create content on the web. They can, but it just won’t be discoverable in the ocean of AI-generated content. But this is a symptom of a more significant problem for the web.

I feel like the people who try to be positive – well, I don’t know what they’re doing. I’m a music producer and also a writer who also happens to make art/design her own album art. Thankfully, I also dance, which is going to be the one thing that saves me I feel. — PrettyLittleHateMaschine on AI music.

The quality of AIs depends on humans

ChatGPT was primarily trained on internet-available content, so its quality depends on what was available at a given time. If we stop collecting new information, we can assume its quality will remain unchanged. Still, it will not be helpful for new information, such as news updates or scientific discoveries, and its usefulness will be reduced.

On the other hand, if the quality of AIs increases (it is already more and more difficult to tell the difference between human and GPT-generated text) and it passes the Turing test, the content available online will be increasingly AI-generated rather than human-generated, as it’s more economical to use AI to produce text, audio or even video.

Here, we consider what may happen to GPT-{n} once LLMs contribute much of the text found online. We find that indiscriminate use of model-generated content in training causes irreversible defects in the resulting models, in which tails of the original content distribution disappear.

AI models collapse when trained on recursively generated data

This recent Nature paper reports that LLMs perform worse when trained on LLM-generated content. Human content is now essential! LLM companies need high-quality human content to train their next-generation models, especially for novel knowledge. But the economics no longer work. Content is created once, consumed once, and used to generate millions of derivatives almost for free. An author might publish a book hoping to recover the time it took to write from the sum of all individual sales. However, AI companies will not buy the book at its production cost to train a model. The same goes for daily news. The human audience is still needed to make this work. And suppose everything is made available for free on the web. In that case, humans are repeating the mistake that let ChatGPT build a business without contributing to the original content sources.
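As a toy numerical sketch of the disappearing tails (my own illustration, not the paper’s experimental setup): fit a Gaussian to heavy-tailed “human” data, train each new generation only on samples from the previous model, and the extreme values are gone after the very first generation.

```python
import numpy as np

rng = np.random.default_rng(0)
human = rng.standard_t(df=3, size=100_000)   # heavy-tailed "human" data
print(f"original 99.9th percentile: {np.quantile(human, 0.999):.2f}")

mu, sigma = human.mean(), human.std()
for generation in range(1, 4):
    # Each generation is trained only on what the previous model generated.
    samples = rng.normal(mu, sigma, size=100_000)
    mu, sigma = samples.mean(), samples.std()
    print(f"generation {generation}: 99.9th percentile: {np.quantile(samples, 0.999):.2f}")
```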

The current Web is not enough.

Web 2.0 died and now the web happens more and more inside silos. Famously, Instagram does not allow links outside its app. “Link in the bio” will be listed as the cause of death in Tim Berners-Lee’s obituary. It goes against what the web was supposed to be. But today’s personal entertainment happens in silos (Instagram, Netflix, Disney+, etc.), not on the open web. Even Reddit communities have started blocking links to some websites, like X.

The web failed at microtransactions. Paying 10 cents to read a well-written article was the original goal. Even with PayPal and Apple Pay, the model was only successful for large purchases, not pay-per-view. Imagine that you give YouTube your credit card, and it takes 1 euro for each hour watched. Once people have had something for free, it is difficult for companies to make them pay for it.

As businesses that moved almost completely from analog to digital, most news outlets have failed to change their economics and are now struggling financially. As the price of online advertising has decreased over the past years, they have switched to a subscription model, putting up paywalls with dubious outcomes.

The future of the Web

I foresee a web where high-quality human content is behind paywalls. While most of the web can be AI-generated and free, it will be ignored as long as high-quality content is available from trusted sources. Content will be signed and (possibly) encrypted using personal keys. These keys can be provided by the government or by other parties; for instance, every Portuguese citizen already has keys inside their citizen card, sometimes with professional attributes.

If you want to read the news, you go to an online newspaper, where the content is signed by a recognized journalist or editor. The body of the text can be encrypted, but with a fast Apple Pay-like prompt you can pay a few cents to read it. Even if the journalist published AI-generated content, they are still liable for it.
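A minimal sketch of the signing part, assuming the third-party Python cryptography package (the key and the article text are made up for illustration): the newsroom signs the article with the journalist’s private key, and any reader, archive, or court can verify who vouched for it.

```python
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

# In the proposal this key would come from a citizen card or a professional
# attribute certificate; here we just generate one for illustration.
journalist_key = Ed25519PrivateKey.generate()
public_key = journalist_key.public_key()

article = "Today the parliament voted on the new budget.".encode("utf-8")
signature = journalist_key.sign(article)

try:
    public_key.verify(signature, article)   # raises if the text was tampered with
    print("article is authentic")
except InvalidSignature:
    print("article was altered or was not signed with this key")
```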

This proposal makes the web a more trustworthy place and somewhat addresses the economic problem of paying for content on the web. It requires payment processors to drop the minimum cost per transaction, which I believe is happening more and more. And as more and more garbage is published online, users will see the need to pay for high-quality content.

As for AI providers, they will now have to pay for content. And even if it is ridiculously cheap, there is a trace showing they bought that information, which is useful when you want to prove in court that your content was used to train LLMs.

We might not get to this Web, but I hope some of these ideas help the web survive the tsunami of garbage content that is starting to flood our dear World Wide Web.