Alcides Fonseca

40.197958, -8.408312

Posts tagged as Academia

ArXiv introduces penalties on slop

If generative AI tools generate inappropriate language, plagiarized content, biased content, errors, mistakes, incorrect references, or misleading content, and that output is included in scientific works, it is the responsibility of the author(s). We have recently clarified our penalties for this. If a submission contains incontrovertible evidence that the authors did not check the results of LLM generation, this means we can’t trust anything in the paper. The penalty is a 1-year ban from arXiv followed by the requirement that subsequent arXiv submissions must first be accepted at a reputable peer-reviewed venue. Examples of incontrovertible evidence: hallucinated references, meta-comments from the LLM (“here is a 200 word summary; would you like me to make any changes?”; “the data in this table is illustrative, fill it in with the real numbers from your experiments”)

Thomas G. Dietterich

Just like GitHub, arXiv is where anyone can upload their scientific outputs. There’s minimal verification to prevent spam, but arXiv was never about gatekeeping content. Until now.

As I mentioned before, reputation is more important than ever in an age where text (and voice) is being produced cheaply, financed by monopoly-inducing LLM factories at a loss.

If the goal of arXiv is to provide an open alternative to the gatekeeping of journals, what is the open alternative to the gatekeeping of arXiv?

What is going on in higher education

U. Lisboa is the largest Portuguese university, right in the country's capital. It is a good approximation of what is going on in Portuguese higher education.

How is it that fewer than half of the students in a bachelor's degree finish it in the expected time? I know it is not that the difficulty has increased. On the contrary: it has never been so easy to succeed in higher education.

My hypothesis is that students lack motivation. They study just because (without truly enjoying the topics), and they devote their time and attention to everything except what matters. I am talking about those outside the 45%. There are certainly many diligent students who work hard. But that should be nearly all of them.

It would be interesting to run a pilot study in which students were required to spend a gap year working. I think it would make a big difference in students' attitude towards the challenge that higher education is.

The new anti-inbreeding rules are a shot in the foot

The government did what it knows best to solve a problem (and I agree that academic inbreeding is a problem in Portugal), which was to legislate, in the new proposal for the Regime Jurídico das Instituições de Ensino Superior (the legal framework for higher education institutions). In particular,

Medical schools want an exemption from the rules against academic inbreeding
“Organizational units that do not have at least 40% of career teaching and research staff who obtained their bachelor's degree or doctorate at another higher education institution are barred from hiring as teaching or research staff, regardless of the type of contract and during the three years following the award of the doctoral degree, anyone who obtained all their academic degrees there.”

This rule is a big shot in the foot, hugely harming universities in the interior or in cities that have only one university. Consider: someone who finishes a PhD at the Universidade de Lisboa can go to Nova, to ISCTE or to any of the other universities Lisbon has. Someone who finishes a PhD at UTAD, however, is forced to move to another city. Imagine UTAD's problem, struggling to hire professors because the PhD holders who live there cannot work there.

Now, there are several valid reasons for someone not wanting to move to another city (supporting family, not wanting their children to change schools, the two-body problem, or holding another professional activity that is local). This measure seriously harms those in these situations.

The Conselho de Escolas Médicas Portuguesas (the council of Portuguese medical schools) agrees, since physicians are not on exclusive contracts and do not want to change hospital service or city. And it is not just physicians!

And yes, I know there is a lot of controlled inbreeding (including in Medicine), but the solution is not to block hiring. Nor is it to delegate hiring to external members, as happens with the panels of supposedly impartial experts who are hand-picked by the house to value what it wants valued, in a tit-for-tat tactic. We should give institutions the freedom to define their goals and the strategies and practices that lead to those goals. But we should also evaluate and audit the decisions made, with strong career implications. If an entity's hiring strategy does not work, we need to understand why and find out whether it was in bad faith or not.

That is what we should fight: inbreeding done in bad faith, as opposed to inbreeding that happens because of external, justifiable factors.

A New Age Software Engineering Degree

What may happen is that software development involves less coding than it has in the past because of AI. At least coding by humans. So BLS is probably right about a decline in the need for computer programmers. At the same time, if software developers spend less time doing actual coding they may have more time for higher level (if that is the right term) thinking and involvement in design. Unless AI starts doing more of that. So maybe we will not need more of them. Or perhaps AI will make it possible for more people to be software developers who wouldn’t be that now. We’ll see I guess.

Computer Programming or Software Development by Alfred Thompson

Alfred analyses the difference between a programmer and a software developer. AI is replacing programmers (those that implement features identified by software developers), but not Software Engineers.

On the other hand, we might not be preparing our SE students for the next decade. We have good, core CS and Programming courses. But advanced courses are not up to par with what the market needs. This aligns with the Barbell approach, which is the closest I have seen to a good path for our SE education. We need good, pen-and-paper, fundamental courses, and we need up-to-date advanced courses that make use of AI and whatever comes next.

The main problem is that technology is moving faster than Universities can adapt. Most professors are researchers in their own niche, and most are not doing Software Engineering, but they do teach it. We need more cutting-edge engineers to come back to universities to teach.

Here in Portugal, we have incentives not to hire professionals (I am fighting this locally, and got two real-world engineers to teach Functional Programming with me) and our degrees have to stay static for three to four years. This does not work in this day and age, when the development process changes so frequently and professors are too busy to actually get some hands-on experience. I am also fighting that, but that’s for some other post.

Foundations for hacking on OCaml

How do you acquire the fundamental computer skills to hack on a complex systems project like OCaml? What’s missing and how do you go about bridging the gap?

KC Sivaramakrishnan

KC gives several resources for students to get up to speed with contributing to OCaml.

One of the interesting resources is MIT’s The Missing Semester. This semester I created our own version of this course, covering git, docker, VMs, terminal/bash, testing, static analysis and LLMs for code.

While we cover how to do a Pull Request, I don’t believe students are ready to actually contribute. Reading large codebases is a skill that even our graduate MSc students don’t have. Courses are designed to be contained, with projects that need to be graded with little human effort, resulting in standard assignments for all the students.

I would love to run something like the Fix a real-world bug course Nuno Lopes runs. But being able to review so many PRs is a bottleneck in a regular course.

To understand, you have to invent

To really understand a concept, you have to “invent” it yourself in some capacity. Understanding doesn’t come from passive content consumption. It is always self-built. It is an active, high-agency, self-directed process of creating and debugging your own mental models.

François Chollet (via Simon Willison)

It’s a rephrasing of our “The best way to understand something is to teach it to someone else”. And that’s why I still love my job.

Peer Review is Dead

If ChatGPT can produce research papers that are indistinguishable from what most scientists can write, then maybe scientists can focus on actually advancing science—something that ChatGPT has thus far proven unable to do.

Beyond papers: rethinking science in the era of artificial intelligence by Daniel Lemire

Looking at the proceedings of our conferences over the past few years, I find that most of the papers are simply uninteresting. Moreover, it seems that every first-year PhD student is now required to write a systematic review on their topic — supposedly to learn about the field while producing a publication.

Let me be blunt: every systematic review I’ve read has felt like a waste of time. I want to read opinionated reviews written by experts — people who have seen enough to have perspective — not by PhD students who have just skimmed the past decade of papers on Google Scholar.

We need far fewer papers (I’m doing my best to contribute to that cause), and the ones we do publish should be bold, revolutionary, and even a little irreverent. We need innovation and the courage to break expectations. Incremental research has its place, but that doesn’t mean it always needs to be published.

To make this possible, evaluation committees — both nationally and within universities — must rethink their processes to move away from bean-counting metrics. Our current incentive system discourages genuine peer review, and even when proper reviews happen, they often waste effort on work that adds little value.

Otherwise, yes — the bean-counting-reinforcement-learning AIs will take our jobs.

Universities get around tuition caps with fees and more fees

At the Politécnico de Coimbra, the enrollment fee went up from 30 to 125 euros. Master's and PhD students pay up to 500 euros as a thesis submission fee.

For bachelor's degrees, these fees let public universities collect more money than the tuition set by law. At the doctoral level, they serve to keep the tuition at the amount FCT covers in its scholarships (2750 euros).

The truth is that students see an advertised price, and then find it impossible to finish the degree paying only that amount. It is literally false advertising.

We need two changes. First, Universities and Polytechnics should eliminate these fees and fold that cost into the tuition. A student who pays tuition should be able to attend classes, be assessed and obtain the diploma without any additional fee.

Second, the state needs to increase the funding of universities, which clearly have to resort to these ethically questionable actions to maintain the financial sustainability that the Tribunal de Contas (the Court of Auditors) demands of them.

Using AI to get an answer

AI is a Floor Raiser, not a Ceiling Raiser

Every day I am convinced that Software Engineering should be taught without AI. AI can give you answers to easy problems. But you won’t be able to create mental models of how things work, which will help you solve hard problems.

This next semester I am teaching a course on practices and tools in Software Engineering (my take will be inspired by MIT’s The Missing Semester). AI usage will be one of the topics, where we will explore MCP, IDE integrations and AI-assisted documentation.

But I have no idea how to write assignments for other topics. It is very likely that an AI will be able to complete the assignment without any human intervention. If students opt to do that (they will, that’s the faith I have in our grade-oriented system), they will not achieve the learning outcomes.

Students will ask: if AI can do these tasks, why should we learn them? Well, math teachers in school still assigned me problems that machines could already solve back then, because creating mental models of how things work is essential in education.

Now the real question is about the incentives. We should assess whether students can use their mental models, and not whether they can solve the task. That is especially hard with 100 students, where exams or take-home assignments are the norm.

Trust in Scientific Code

In 2010 Carmen Reinhart and Kenneth Rogoff published Growth in a Time of Debt. It’s arguably one of the most influential economics papers of the decade, convincing the IMF to push austerity measures in the European debt crisis. It was a very, very big deal.
In 2013 they shared their code with another team, who quickly found a bug. Once corrected, the results disappeared.
Greece took on austerity because of a software bug. That’s pretty fucked up.

How do we trust our science code? by Hillel Wayne

As more and more scientific publications depend on code, trusting that code becomes more and more necessary. Hillel asks for solutions; I propose tackling the problem on two fronts.

1 – More engineering resources

Writing production-quality software requires more resources (usually engineers, but also some tooling). Most scientific software is written once and read never. Some PhD or MSc student writes a prototype and shows the plots to their advisors, who write (some or most of) the paper. It’s rare for senior researchers to inspect other people’s code. In fact, I doubt any of them (unless they teach software engineering principles) has had any training in code inspection.

We need research labs to hire (and maintain) scientific software engineering teams. For that to happen, funding has to be more stable. We cannot rely on project funding that may or may not be awarded. We need stable funding for institutions so they can maintain this team and resources.

2 – More reproducibility

Artifact Evaluation Committees are a good addition to computer science conferences. Mostly comprised of students (who have the energy to debug!), they run the artifacts and verify whether the results of the run justify the results presented in the paper. Having done that myself in the past, I can say it is very tricky to find bugs in that process. Mostly we verify whether the artifact runs outside of the authors' machine, not whether it is correctly implemented.
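To make that limitation concrete, here is a minimal sketch of the kind of check an artifact evaluator can automate. The function name, the metric names and the 5% relative tolerance are all assumptions made for illustration, not part of any committee's actual process.

```python
import math


def compare_results(reported, reproduced, rel_tol=0.05):
    """Compare metrics reported in a paper with those from a re-run.

    Hypothetical helper: `reported` and `reproduced` map metric names to
    numbers, and `rel_tol` is an assumed acceptable relative difference.
    """
    verdicts = {}
    for name, expected in reported.items():
        if name not in reproduced:
            # The artifact never produced this number.
            verdicts[name] = "missing"
        elif math.isclose(reproduced[name], expected, rel_tol=rel_tol):
            verdicts[name] = "matches"
        else:
            verdicts[name] = "differs"
    return verdicts
```

Note what this does and does not tell you: a "matches" verdict only means the artifact reproduces the paper's number. If the original code has a bug, the re-run faithfully reproduces the bug too.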

What would help is to fund the reproduction of science. Set aside 50% of agency funding for reproducibility. Labs that get these projects should spend less than the original project to reproduce the results (most of the challenging decisions are already made). With this approach, we would have less new research, but more robust research.

Given how most CS papers are garbage (including mine), I welcome this change. We need more in-depth, strong papers that move the needle, and fewer bullshit papers that are just published for the brownie points.

Overall we need better scientific policies with the right incentives for trustworthy science. I wonder who will take this challenge on…

How to select your side project

Recommended audience: CS students

Austin Henley shares some properties of a good side project. Personally, I think a clear shippable objective is what most people lack, and that is what prevents their projects from ever being completed.

I remember getting side-project suggestions during my courses. Maybe that’s something I have to incorporate in mine.

Most of what I’ve learned during my degree was doing side-projects. From competing in hackathons, creating a junior company, organizing conferences, doing a couple of research internships, and doing some freelancing work, these projects all taught me something that was not in the syllabus. That’s what separates you from the average student, and what will get you a good job in a world where unemployed software engineers are aplenty.

Joshua Barretto shares a really interesting list of possible side projects:

  • Regex engine (5 days)
  • x86 kernel (2 months)
  • Gameboy emulator (3 weeks)
  • Gameboy advance game (2 weeks)
  • Chess engine (5 days)
  • Physics engine (1 week)
  • Voxel engine (2 weeks)
  • GUI Toolkit (3 weeks)
  • Posix shell (5 days)
  • Dynamic interpreter (2 weeks)
  • Compiler (3 months)
  • Threaded Virtual machine (1 week)
  • Text editor (4 weeks)

The last four will give you a head start in the programming language world. I might even have an internship for you.

Perhaps you’re a user of LLMs. But I might suggest resisting the temptation to use them for projects like this. Knowledge is not supposed to be fed to you on a plate. If you want that sort of learning, read a book – the joy in building toy projects like this comes from an exploration of the unknown, without polluting one’s mind with an existing solution.

Selling SaaS to universities

Recommended audience: Startups and large companies who intend to sell software to universities.

Most SaaS is sold on a per-seat basis. But this does not scale to universities: we have a large number of possible seats, and most of them (students, possibly from different scientific areas) do not use the software, at least not enough for it to be worthwhile.

On the other hand, unpredictable costs (when paying per activity) do not work either, as we need to budget yearly.

Chris Siebenmann has a really good write up on this issue, which I recommend if you manage or sell to universities.

Smart Donkey Factory

My first day of uni, I received these two t-shirts designed by the student group.

two blue t-shirts: the first one depicting the text biggest fucking noob of CS; the second features a factory that takes a donkey as input and outputs the same donkey, but with a diploma

While I completely forgot about the top one, I keep the bottom one near my heart. While I found it amusing, I did not find it to be true. I did learn a lot during these years, and I gained much more than the degree (which is only required for the Portuguese bureaucratic system where Simon Peyton Jones couldn’t even get a position as Assistant Professor).

Now 19 years later, I no longer find it to be funny. I feel the scholarly spirit is dying and young people do not care about learning or knowledge. They care only about grades and getting the degree. And GPT is the TLA that takes them from the donkey without the diploma to the donkey with the diploma.

I format my computer every semester

This tradition started back when I was a student. I installed random software for each of the 5 courses I took every semester. I ended up with wasted disk space, random OS configurations and always a complete mess in my $PATH.

So I started formatting my Macs at the end of every semester. And I continue doing that today. Being a professor, I also deal with the software baggage every semester; otherwise I would probably format it every year.

Most of the people I know think this is insane! Because they spend days in this chore, they avoid it as much as possible, often delaying it so much that they end up buying a new computer before considering formatting. And they also delay buying a new computer for the same reason.

My trick is simple: I automate the process as much as possible, such that it now takes ~20 minutes to format and install everything, and another 2 hours to copy all data and log in to the necessary accounts. And you can watch a TV show while doing it.

I keep a repository with all my dotfile configurations, which also contains scripts to soft link all my configurations (usually located at $HOME/Code/Support/applebin) to their expected location ($HOME). This process also includes a .bash_local or .zsh_local where I put machine- or instance-specific details that I don’t mind losing when I format it in 6 months. Long-lasting configurations go in the repo.
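As a rough illustration of the linking step, here is a minimal Python sketch. It is not my actual script: the function name and the flat repository layout (one file per configuration, directly inside the repo folder) are assumptions made for the example.

```python
from pathlib import Path


def link_dotfiles(repo_dir, home_dir):
    """Soft link every file in repo_dir into home_dir.

    Hypothetical helper: assumes a flat repo with one file per
    configuration. Returns the list of links that were created.
    """
    repo_dir, home_dir = Path(repo_dir), Path(home_dir)
    created = []
    for source in sorted(repo_dir.iterdir()):
        if not source.is_file():
            continue  # this sketch ignores sub-directories
        target = home_dir / source.name
        if target.is_symlink() or target.exists():
            # Never overwrite what is already there, such as a
            # machine-specific .zsh_local created by hand.
            continue
        target.symlink_to(source.resolve())
        created.append(target)
    return created
```

Calling `link_dotfiles(Path.home() / "Code/Support/applebin", Path.home())` would link everything in one go. Running it a second time creates nothing new, which makes it safe to re-run after adding a file to the repo.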

If the machine runs macOS, I also run a script that sets a bunch of defaults (dock preferences, Safari options, you name it), which saves me from going through all the settings windows and configuring everything the way I like it.

But the most useful file is my Brewfile, which contains all the apps and command-line utilities I use. I should write another usesthis, where I go through all the apps I have installed, and why.

My process starts with copying my home directory to an external hard drive (for faster restores). During this process I usually clean up my Downloads and Desktop folders, which act as more durable /tmp folders. When it’s done, I reset my MacBook to a new state. I then install Homebrew and the Xcode command line tools (for git), clone my repo and run the setup script. At the same time, I start copying all my documents from the external drive back to my Mac. Then it’s time to do something else.

Two hours later, I can open the newly installed apps, log in or enter registration keys, and make sure everything is working fine.

Now I’m ready for the next semester!

How scientists learn computing and use LLMs to program

“scientists often use code generating models as an information retrieval tool for navigating unfamiliar programming languages and libraries.” Again, they are busy professionals who are trying to get their job done, not trying to learn a programming language.

How scientists learn computing and use LLMs to program: Computing education for scientists and for democracy

Very interesting read, especially since we teach programming to non-CS students, which is fundamentally different. Scientists are often multilingual (Python, R, bash) and use LLMs to get the job done. Their goal is not to write maintainable large software, but rather scripts that achieve a goal.

Now I wonder how confident they are that their programs do what they are supposed to do. In my own research, I’ve found invisible bugs (in bash, setting parameters, usually in parts of the code that are not algorithmic) that produce the wrong result. How many of the results in published articles are wrong because of these bugs?

We might need to improve the quality of code that is written by non-computer scientists.

Do not take career advice from engineers with 5+ years of experience

Advice from people with long careers on what worked for them when they were getting started is unlikely to be advice that works today. The tech industry of 15 or 20 years ago was, again, dramatically different from tech today. I used to joke that if you knew which way was up on a keyboard, you could get a job in tech. That joke makes no sense today: breaking into the field is now very difficult, and getting harder every year.

Beware tech career advice from old heads, by Jacob Kaplan-Moss

The industry is undervaluing junior developers, by thinking LLMs can do their work. This is true at this instant, but junior developers have the potential to become senior developers.

I still remember years when my team did not have interns at Uber; and years when we did. During the time we did: energy levels were up, and excluding the intern I’d wager we actually did more. Or the same. But it was a lot more fun. All our interns later returned as fulltime devs. All of them are now sr or above engineers – at the same company still (staying longer than the average tenure)

Gergely Orosz

It is up to your faith whether LLMs can eventually be promoted to senior developers (or management). And if you believe it, you may need to reconsider your own job.

Programming for non-CS is different

[…] the top learning objective was for their students to understand that websites can be built from databases.

I’m pretty sure that the most popular programming language (in terms of number of people using it) on most campuses is R. All of Statistics is taught in R.

End-user programmers most often use systems where they do not write loops. Instead, they use vector-based operations — doing something to a whole dataset at once. […] Yet, we teach FOR and WHILE loops in every CS1, and rarely (ever?) teach vector operations first.

CS doesn’t have a monopoly on computing education: Programming is for everyone by Mark Guzdial

The main takeaway is that you do not teach Programming 101 to non-Software Engineering/Computer Science students the same way you teach it to SE/CS students. The learning outcomes are different, and so should the content be.

Funny how Functional Programming (via vectorized operations) is suggested to appear before imperative constructs like for or while. This even aligns with the GPU-powered parallelism that is needed when processing large datasets.
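To illustrate the contrast, here is a small sketch in plain Python. The function names and the centimeter-to-meter conversion are made up for the example, and `map` is only a standard-library stand-in for a true vector operation: in R, or with NumPy arrays, the second version would simply be written as `values / 100`.

```python
def to_meters_loop(values_cm):
    """CS1 style: walk the data one element at a time."""
    result = []
    for cm in values_cm:
        result.append(cm / 100)
    return result


def to_meters_whole_dataset(values_cm):
    """Whole-dataset style: describe the operation once and apply it
    to every element, with no explicit loop or accumulator."""
    return list(map(lambda cm: cm / 100, values_cm))
```

Both functions return the same list. The second one says nothing about the order in which elements are processed, which is what makes this style a natural fit for parallel hardware.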

Food for thought.

FCT's draft regulation on Scientific Employment in Non-Academic Contexts

FCT's draft regulation on Scientific Employment in Non-Academic Contexts (Projeto de Regulamento do Emprego Científico em Contexto Não Académico) is open for public consultation. I did my part and sent my feedback on the proposal:

1. The application is submitted by the prospective PhD-holding researchers. This makes no sense: the funding supports the entities (private or public), so they should be the applicants. Submitting the application would show a greater commitment from the entity than merely writing a letter of support, which is what the current proposal asks for. In fact, the entity should apply for funding for the position regardless of who will fill it (without even submitting a CV). These results take months to come out, and in the meantime the current top candidates end up finding other alternatives.

2. Candidates who already hold a permanent contract are excluded. So someone who secured a permanent technician position to fund their PhD finishes the PhD and cannot apply for this support. It is a completely unnecessary restriction.

On the new proposals for the Scientific Career Statute

The Government, PS and BE have proposed three very similar versions of a new statute for the scientific research career (Estatuto da Carreira Científica).

In general terms, the three proposals aim to align the researcher career with that of higher-education teaching staff:

  • Hiring is done through an international call (with a preferably foreign jury)
  • A tenure (probationary) period of 5 years for Assistant Researchers and 3 for Principal and Coordinating Researchers (for teaching staff it is only one year for these two levels).
  • Agregação is required for Coordinating Researchers (with the caveat that those coming from abroad can obtain it during the probationary period)
  • The position of Invited Researcher exists
  • Evaluation in 3-year periods (or between 3 and 5, according to BE)
  • Progression through pay steps identical to that of teaching staff (6 years of top evaluation)

And other new features:

  • The position of Doctoral Researcher (Investigador Doutorando) exists, which may eventually put an end to illegal research grants!
  • A teaching load of up to 4 hours (optional according to BE, decided by the institutions in the other two versions).
  • Some mobility between the teaching and research careers is allowed, keeping the original salary.

Analysis

Alignment with the teaching career

The alignment with the teaching career seems to me a positive point overall, since the two careers differ only in the weight of the teaching component. Honestly, creating a separate career seems like unnecessary effort when it would suffice to propose a single career in which the teaching component could vary from 0 to 100%, with evaluation proportional to that share. Simpler laws last longer.

Unfortunately, the serious problems of the teaching career are carried over to the research career:

  • The probationary period is too long. In the private sector and in other areas of public service, permanent contracts are awarded within the first 2-3 years or even when the contract is signed. Why can't universities and research centers have the freedom to offer immediate tenure to excellent candidates with an appropriate CV?
  • Too much importance is given to the agregação/habilitation. Although it is easier to hire researchers from foreign institutions where this title does not exist, it is still required of nationals who are in industry. We should drop the habilitation requirement for every position: the scientific CV is already evaluated in full by the jury. This requirement has no justification other than alignment with the teaching career (where I find no justification for it either).
  • Assistant Researchers do not manage projects. The separation between Assistant and Principal Researcher rests on the principle that Principal Researchers manage projects. In reality, Assistant Researchers do manage projects (from exploratory ones to 3-year FCT projects), creating an impossible situation. Being the principal investigator of a funded project should be enough to progress automatically to the Principal Researcher category.
  • Researchers without a PhD have to be PhD students. There is no framework for researchers who neither have nor want a PhD. I have already worked with some young people who wanted to be researchers for a few years without pursuing a PhD. They are satisfied with their master's training and are productive (with several articles published as first author). Why require everyone to have a PhD?
  • Evaluation every 3 (or 5) years is insufficient. As in the teaching career, with a probationary period of 5 years (or a 3-year limit for invited staff), an evaluation at the end of 3 years comes too late to change course. We should promote evaluations at the end of each semester, or at least annually, for invited teaching staff/researchers or those in the probationary period. That way, there is actually useful feedback for them to improve.

It turns out there already was a Portuguese LLM!

Recently the Prime Minister announced at WebSummit a direct award for the development of a Portuguese LLM. He mentioned IST and Nova, omitting the rest of the consortium on Responsible AI, which has been working on several relevant aspects (fairness, environmental sustainability and reliability).

Curiously, a research group in my department has already done work in this area, having released two models (Albertina and Gervásio) in European Portuguese. It has now given a very educational interview to Dinheiro Vivo:

For example, a bank just wants a virtual assistant for its customers that talks about deposits, withdrawals, etc. It will not want its chatbot to do machine translation, summarization, give the biography of Friedrich Nietzsche and tell jokes.

An LLM is not a chatbot. An LLM is, in an analogy that people understand, a kind of engine, and from one engine we can build different car models. The LLM is what different applications can be developed on top of: one of them is the chatbot, another is, for example, machine translation, or medical diagnosis, etc.

So our proposal in that opinion piece, which came out in Público in February 2023, is that what we need is an open AI and the development of LLMs with open source code, open licenses and open distribution, so that other actors and organizations, whether from research, public administration or the innovation sector, can build their own value propositions and take advantage of these LLMs without depending on big tech to supply those services. So the more varied the offering becomes, the lower the risk of depending on a small oligopoly that supplies us with these services.

António Branco @ Dinheiro Vivo