Alcides Fonseca

40.197958, -8.408312

Posts tagged as Software Engineering

False and True Positive Testing in Differential Testing

Alive2 is a translation validation tool: given two versions of a function in LLVM IR–usually these correspond to some code before and after an optimization has been performed on it–Alive2 tries to either prove that the optimization was correct, or prove that it was incorrect. Alive2 is used in practice by compiler engineers: more than 600 LLVM issues link to our online Alive2 instance.

John Regehr & Vsevolod Livinskii

Really interesting read on how Alive2 is used alongside the Minotaur superoptimizer and llvm-mca.

European Union will stop funding Open-Source projects in Horizon Program

The European Union must keep funding free software

Since 2020, Next Generation Internet (NGI) programmes, part of European Commission’s Horizon programme, fund free software in Europe using a cascade funding mechanism (see for example NLnet’s calls). This year, according to the Horizon Europe working draft detailing funding programmes for 2025, we notice that Next Generation Internet is not mentioned any more as part of Cluster 4.

NGI programmes have shown their strength and importance to support the European software infrastructure, as a generic funding instrument to fund digital commons and ensure their long-term sustainability. We find this transformation incomprehensible, moreover when NGI has proven efficient and economical to support free software as a whole, from the smallest to the most established initiatives. This ecosystem diversity backs the strength of European technological innovation, and maintaining the NGI initiative to provide structural support to software projects at the heart of worldwide innovation is key to enforce the sovereignty of a European infrastructure.
Contrary to common perception, technical innovations often originate from European rather than North American programming communities, and are mostly initiated by small-scaled organizations.

Previous Cluster 4 allocated 27 millions euros to:

“Human centric Internet aligned with values and principles commonly shared in Europe” ; “A flourishing internet, based on common building blocks created within NGI, that enables better control of our digital life” ; “A structured eco-system of talented contributors driving the creation of new internet commons and the evolution of existing internet commons” .

In the name of these challenges, more than 500 projects received NGI funding in the first 5 years, backed by 18 organisations managing these European funding consortia.

NGI contributes to a vast ecosystem, as most of its budget is allocated to fund third parties by the means of open calls, to structure commons that cover the whole Internet scope – from hardware to application, operating systems, digital identities or data traffic supervision. This third-party funding is not renewed in the current program, leaving many projects short on resources for research and innovation in Europe.

Moreover, NGI allows exchanges and collaborations across all the Euro zone countries as well as “widening countries“¹, currently both a success and an ongoing progress, likewise the Erasmus programme before us. NGI also contributes to opening and supporting longer relationships than strict project funding does. It encourages to implement projects funded as pilots, backing collaboration, identification and reuse of common elements across projects, interoperability in identification systems and beyond, and setting up development models that mix diverse scales and types of European funding schemes.

While the USA, China or Russia deploy huge public and private resources to develop software and infrastructure that massively capture private consumer data, the EU can’t afford this renunciation.
Free and open source software, as supported by NGI since 2020, is by design the opposite of potential vectors for foreign interference. It lets us keep our data local and favors a community-wide economy and know-how, while allowing an international collaboration.
This is all the more essential in the current geopolitical context: the challenge of technological sovereignty is central, and free software allows to address it while acting for peace and sovereignty in the digital world as a whole.

— OW2

The Register has more information on this issue.

Runtime, Run time and Run-time

During my PhD, my advisor and I disagreed about the spelling of runtime/run time. There was no definite answer, as different papers used different spellings for different usages. This week I found a rationale that makes sense to me, even if it means that we were both wrong.

There are three variants of the word “run time” in computer science:
run-time — adjective (“the run-time performance”)
run time — noun — a moment in time (“occurs at run time”), or an amount of time (“the run time is 8 hours”)
runtime — noun — a program that runs/supports another program (“the runtime handles memory allocation”); a synonym for “runtime system”

Catching the Java Train

If you are into Node.js, Elixir and all those new cool tools, you might not have heard of Java. Java is a bloated language and runtime from the 90s that is used mostly by your bank.

Java has been losing adoption because it has not kept up to date with the features that developers want from a language, especially compared with its evil twin, C#/.NET, which has been the recipient of several new and awesome features, driving innovation in the ecosystem (LINQ, Type Providers, etc…).

They have now announced that Java will follow the train model of Ubuntu: a new release every 6 months with whatever’s ready at the time, and an LTS (long-term support) release every three years for enterprise customers.

This is their response to having the same product for both enterprise and the new kids. My opinion is that they are hiding from the truth. Java8 was delayed until Lambda was ready. Java9 was delayed until Jigsaw (module system) was ready. I believe they took the right approach. Except I would have released Java7.1 and Java8.1 with some new stuff that was ready before the main incomplete feature was. The real problem here, the one they are avoiding, is that it took them ages to develop and mature both Lambda and Jigsaw (and at least on the Lambda part I think it was really incomplete).

The enterprise world was mostly OK with having neither Lambda nor Jigsaw at the predicted time: they do not care. They may have been annoyed at waiting so long for security updates that were held back by the main feature, something that my minor releases would have fixed easily. This is what happens with other Open-Source languages such as Python or Ruby.

The main problem is that the new kids were missing these features and switching to Scala, Groovy, Kotlin or whatever was trending at the time on Hacker News. So much so that Google accepted Kotlin as an official Android language, which IMHO sucks for Oracle. But Oracle has made it clear that they do not care about the success of Android. And changing to the train model does not fix this type of problem. If the main features are delayed, developers will not care about them if they have better usable alternatives.

So, dear Mark Reinhold, I believe this is more of a marketing stunt than actually solving the real problem: JDK development is slow as hell.

And I’ve experienced this first hand: I joined project Sumatra to bring GPGPU to Java, which I first did via the AeminiumGPU project. However, no one in the JDK team put any effort into it. AMD tried to bring in Aparapi, but the lack of effort from the remaining members led the project nowhere. Even the prototype source tree was abandoned.

I believe the main reason is that despite Java being Open-Source in theory, it is not in practice. It is an Oracle-controlled environment. Python has advanced much more in the same years while being completely open-source (despite some investments from Google and Dropbox, among many others). Python does not need the train model: it has a bleeding edge version in 3.6 (or whatever 3.X we are in right now), and still has 2.7 (or 2.6 if you are lazy like me) running fine on many servers.

The train model is not the solution: a better (and more open) development process is.

Self-taught developers

Source for first website – table based layout, a lot of view source, a lot of Notepad, a lot of IE 6. Used to work mostly in HTML and CSS. With the help from books like “HTML for the World Wide Web – Visual Quickstart Guide”, learned a lot as a tinkerer.

Two years in: good with HTML (table layouts) and moderate CSS (fairly new), basic PHP, could use FTP and do basic web config. Could get a site up and running from scratch. This was enough to get my first developer job. This was without any computer science background.

Now: front end developer with 10 years experience, not an engineer, or a code ninja. I don’t know Angular, React, WebPack. I don’t even know JavaScript inside out. I am valuable to my team. Need more: empathy, honesty, being able to see stuff from a user’s perspective.

Self taught developers today, via Tom Morris’ live blogging.

Back in my day, we learnt how to do things. Nowadays, kids learn how to use high-level APIs, without any idea how things work underneath. They might learn Meteor, but have no idea about HTTP or Sockets or how HTTP Sessions are implemented. Which is fine for developing tiny little apps, but they miss the I understand all this sh*t feeling.

Supposedly high-level frameworks allow developers to write more complex programs in the same timeframe. However, I don’t believe this is true for small projects, because the setup time is increasing exponentially. Let’s start a new single page app, what do we need? Node, npm, webpack, angular or react or any other trendy framework. Say what you will about PHP, but it was a single one-click WAMP install away from your fingertips.

If you were a 13 year old kid wanting to develop your own app, what would you use?

Every time you use -f, a kitten dies

I’ve only been using git for a little more than two years now, but having used it daily for every project (even those on subversion servers, via git-svn) I’ve learnt a few tricks and developed my own workflow.

During this semester, I have been working on a 13-person project and we are using git (and github) to manage the code. This means a large code-base with two different teams working on different parts of the software that depend on each other. And I’m the lucky poor bastard who has to keep up to date with the whole system and perform the merges of feature and bug branches.

Working in such an environment makes weird stuff happen to the repository: when you get to merge a branch, you discover that everything is now broken and some stuff has disappeared. Here are some things to avoid, learned from this and many other projects:

  • Developer A commits some stuff. He then pushes to master.
  • Developer B (almost at the same time) commits and pushes to master.
  • Developer A finds out he forgot to include one file, and amends his commit to add it. He then pushes with -f (because an amended commit requires it) and B’s changes are lost forever (not quite, but B may delete that code once pushed).
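
The scenario above can be sketched with a toy model in Python. This is not how git is implemented: the `Commit` and `Remote` classes and the `is_ancestor` helper are hypothetical names made up for illustration. It only shows the one rule that matters here: a normal push is accepted when the new tip descends from the current one (a fast-forward), and a forced push skips that check.

```python
# Toy model of a remote branch. NOT git's real implementation; all names
# here (Commit, Remote, is_ancestor) are made up for this illustration.

class Commit:
    def __init__(self, message, parent=None):
        self.message = message
        self.parent = parent

    def history(self):
        """Return commit messages from this commit back to the root."""
        node, messages = self, []
        while node is not None:
            messages.append(node.message)
            node = node.parent
        return messages


def is_ancestor(old, new):
    """True if `old` can be reached from `new` by following parents."""
    node = new
    while node is not None:
        if node is old:
            return True
        node = node.parent
    return False


class Remote:
    def __init__(self, tip):
        self.master = tip

    def push(self, new_tip, force=False):
        # Without force, only fast-forward updates are accepted.
        if not force and not is_ancestor(self.master, new_tip):
            return False
        self.master = new_tip
        return True


base = Commit("base")
remote = Remote(base)

# Developer A commits and pushes.
a1 = Commit("A: some stuff", parent=base)
pushed_a = remote.push(a1)

# Developer B commits on top of A's work and pushes.
b1 = Commit("B: other stuff", parent=a1)
pushed_b = remote.push(b1)

# Developer A amends: this creates a NEW commit that replaces a1.
a1_amended = Commit("A: some stuff + forgotten file", parent=base)

# The plain push is rejected because master (b1) is not an ancestor.
plain_push_accepted = remote.push(a1_amended)

# The forced push goes through and B's commit is no longer on master.
remote.push(a1_amended, force=True)
final_history = remote.master.history()
print(final_history)
```

In this model B's commit still exists as an object, just as it would still exist in B's local clone; it simply is not reachable from master any more, which is the "not quite lost" part of the story.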

Another interesting story is about a feature X that was accepted to be merged into master, but was based on a really old version, from before a total refactor of half of the code. Smart as I was, I decided to do a rebase instead of a regular merge, to resolve conflicts commit by commit. It turned out I needed to undo the rebase and start again on a branch without my conflict resolutions.

As a rule of thumb, avoid using -f at all costs because, as easy and attractive as it might seem, in the end it might corrupt your repository. Also, merges are a nice way of keeping your history clean and of not losing individual pieces of code.

Writing a compiler using Python, Lex, Yacc and LLVM

I found a good post on how to build your own toy compiler using Flex, Bison and LLVM. I saw one disadvantage right in the beginning: you had to use C++. If I were just prototyping a compiler, I wouldn’t use C++ but rather a dynamic language. And last semester for the Compilers course that’s what I did.

Students were assigned to build a Pascal compiler (actually a subset, but not that small) and the tools suggested were Lex, Yacc (using the C language) and compiling the code into C. I took a different approach and decided to do the project in Python (I actually tried ruby first, but the ruby-lex and ruby-yacc projects didn’t pass my basic tests).

I wrote the language grammar using PLY (the lex and yacc DSLs for python) and it was pretty simple. As for the AST generation, I had only a class Node that accepted a type and a list of arguments, while my colleagues using C had to make 1001 structs, one for each kind of node. Not that it was impossible using C, but dynamic languages make the code simpler and clearer.
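
To give an idea of what such a generic node can look like, here is a minimal sketch. It is not the code from my project: the node types ("assign", "binop", etc.) and the `dump` helper are made up for this example. In a PLY grammar, each rule function would typically build one of these with something like `p[0] = Node(...)`; here the tree is built by hand so the snippet runs without PLY installed.

```python
class Node:
    """Generic AST node: a type tag plus a list of arguments.

    Arguments can be other Node instances or plain values (names, numbers).
    """

    def __init__(self, type, args=None):
        self.type = type
        self.args = list(args) if args is not None else []

    def __repr__(self):
        return f"Node({self.type!r}, {self.args!r})"


def dump(node):
    """Render a tree as an s-expression, handy when debugging a parser."""
    if not isinstance(node, Node):
        return str(node)
    return "(" + " ".join([node.type] + [dump(a) for a in node.args]) + ")"


# Hand-built tree for a Pascal-like statement:  x := 1 + 2 * 3
tree = Node("assign", [
    Node("var", ["x"]),
    Node("binop", ["+",
                   Node("int", [1]),
                   Node("binop", ["*", Node("int", [2]), Node("int", [3])])]),
])

rendered = dump(tree)
print(rendered)
```

A single class like this covers every kind of node; the code generator then dispatches on `node.type` instead of on a separate struct per construct.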

For the code generation, I decided to go with LLVM. It is a very promising project. Just take a look at google’s unladen-swallow or macruby; even parrot is planning on using llvm for their JIT.

For writing the code in Python, I had to use llvm-py which, I may say, is in its early stages and lacks documentation. That was my major problem using it. I had only three resources: the official guide, a presentation in Japanese with some source code, and the actual source of the project (in C and C++).

Since every time I got an error in the llvm code generation it crashed the program, I had to dig into the source code of the project and find that error message and reverse engineer what was wrong with my code (usually I was giving values or pointers instead of references and vice-versa). So if you are doing something more complex, you actually need some C++ reading skills.

The project however worked, and I’m making it available so anyone may use the code as an example until better resources are published.

The Github Momentum

History

A long time ago there was this website called SourceForge that hosted the majority of opensource projects. It would offer a unix shell account, hosting, CVS (and later on, SVN) repos and CDN-powered downloads. Today a lot of Unix and Windows utilities live there.

Google got big and in 2006 they launched their own OpenSource Hosting with an SVN repository, wiki, issue tracking and downloads. Plain simple, à la Google.

Then a couple of ruby hackers started a side-project called Github that offered repository hosting for projects that used the Git Version Control System. But it wasn’t regular hosting like SourceForge, GoogleProjects, or even BitBucket or Launchpad [1]; it uses the Web2.0 success model:

Simple to use

If you have used Github, you have seen that the web interface is really simple. Basecamp-like simple. The only thing limiting this is git itself, which is not as straightforward to use as SVN or even Mercurial. But they even wrote some tutorials and provide some help about git itself, which works pretty well and is making some great opensource projects migrate to their service.

Social network

This is a small difference from the regular services. In Github you can follow [2] developers, or simply some project. It has an activity stream (think facebook) where you can keep up to date with commits, forks and pushes related to the projects you care about.

Freemium

It is free for opensource projects, but they run a business. If you want your company to use their features for your projects, you can buy one of their plans. I find them a bit expensive, especially the lower ones for small teams, but it’s not by chance that they won the Best Bootstrapped Startup Crunchie.

Github Rocks!

I love the decentralization of git and now more than ever I love the offline commits. So much so that I migrated everything I had in SVN, and I am now hosting everything as git repositories on my external hard-drive, on my VPS and, for the important ones, on Github.

I have tried to use simple git repositories in my VPS, and even Redmine to browse them, but the experience sucks compared to Github, where you can see the various branches and commits and even get some stats.

There are some nifty features like being able to host your webpage as a github repo, or the per-project wiki that’s very useful for storing the documentation of your opensource project. There is a small difference from google’s project wiki: it isn’t available in the repository, which I find weird for these guys who even have snippets in repositories. You can also edit a file and commit right there in the browser, which I use sometimes for quick fixes in my website. But my favorite feature is the commit comments, which Gaspar uses for code reviewing.

What I miss most is an issue tracker. Google has this, and while Github doesn’t include one, it allows you to integrate with 3rd party services like lighthouseapp. But there is always hope.

The Catch

As I said before, I love the fact that Github works with the opensource community. They even blog about cool projects they host. There is a general concern about a commercial company hosting most of the opensource projects around (be it Google, Github, or any of them). I agree that it would be safer to have non-profit entities, like the FSF and a non-freetard one, provide this service. However, I find the advantages of having an innovative company working on this service enough to accept the risk of having it host most of the opensource projects in the future.

[1] The latter two are a step ahead of the former and closer to Github.

[2] Or stalk…

Thoughts on PDC

Some may accuse me of being a Microsoft guy, but using a mac in the past or so, I can’t really say that about myself. Nevertheless, I keep an eye on Microsoft conferences (and I even got to attend one or two) because really cool stuff comes from them. I’m not kidding about this. Let’s see PDC 2008:

Windows 7

I’ve been following Engineering Windows 7 blog, so I was pretty up to date with this stuff, but seeing real screenshots was pretty impressive. I have mixed feelings about the taskbar redesign. While I really liked the old one, I understand that this way it’s more usable in smaller resolutions (say notebooks or even mobile phones, think Shift or Advantage). But in bigger displays, that are cheaper and cheaper each day, the old style was pretty cool.

The vista style of the windows was predictable, but I really hate it. I do! I hope they get a real theming engine, and not make us use some third party software to make them more macish.

One cool surprise was to see that they fixed the horrible wifi icon in the traybar. Linux and Mac did it right years ago, and in Windows up to Vista and even in Windows Mobile it’s a pain to connect to networks.

About the multi-touch? Well, they had it all along with Surface (and Surface SDK), so no big surprise. We’ll see MS release the iTablet before Apple does.

The Cloud Stuff

a.k.a. Windows Azure

Well, startups are going the Cloud way. Amazon Web Services and Google App Engine are just a first step. Microsoft wants enterprise customers to join this trend, and be able to have their business in the cloud. I don’t know if this is going to be as much of a success as they think. a) Really small businesses don’t want their data in the cloud. They want it on their small server in their intranet. b) Large companies that need a cloud server can probably afford their own infrastructure and do not have to rely on Microsoft. Maybe I’m mistaken, but we’ll see.

James Governor has written a really interesting post on this matter and even mentions OpenID in Azure Services.

More Cloud Stuff

a.k.a. Live Mesh

Live Mesh is the Mobile Me for the rest of us. It syncs files P2P or through the cloud and, for those like me with several computers, it rocks.

Since the Mac and Windows Mobile clients came out, I guess I’ll have to give it a try some day.

Dale Lane writes about the transition from USB syncing to Cloud syncing. It’s true Google doesn’t provide an offline sync out of the box on Android, but I like to have the oldschool method available when needed.

Yet More Cloud Stuff

a.k.a. Live Services

Angus got extra points for the shirt and for spreading the social word among the enterprise developers there.

It’s true that Microsoft has a different view from Google and Yahoo, which are embracing the OpenID+OAuth way, but this might change in the future. You can already see some little steps being made.

Dynamic Languages

Oddly, the first dynamic language I noticed in PDC was C#. Really! C# is now lightyears away from Java, and is evolving continuously. Version 4 brings a lot of new features and one of them is the ability to integrate dynamic languages directly in C# using the dynamic type. I believe C# is becoming more of a glue language (LINQ, Dynamic Languages, F#) that allows programmers to switch smoothly to other languages.

As usual, I loved John Lam’s talk on IronRuby which, besides the usual C#, Silverlight and Testing/Mocking stuff, demoed a Visual Studio Plugin in Ruby and Web Services using Sinatra. You should really take a look at it.

Oslo Modeling tools

DSLs are becoming popular in business software, and this is something Microsoft has been looking at for a while. While I’d say IronRuby was the way to go (see RSpec examples), they took it further and made their own toolkit, Oslo, to develop models both visually and textually. The language they created to achieve that purpose is called M, and right now it is supported through the IntelliPad editor.

In fact this editor was what got me interested in this area, since its codename was Emacs.NET, and since I’m on a quest for the perfect editor I wanted to take a look. Well, right now it supports the M language, but you can extend it using IronPython (http://www.masteringbiztalk.com/blogs/jon/PermaLink,guid,92ec6f1f-45e5-4b7d-b675-548be5131a07.aspx). I’ll wait to see the first plugins to support different languages in the IntelliPad.

In the meantime, take a look at the different Oslo sessions at PDC.

Mono

Yeah, Mono gets to be one of the main points of this post, as it should also be very important to Microsoft. The work Miguel and the team are doing gives much more value to .NET and Microsoft than any other technology they presented, in my opinion. Since the Mac and Linux worlds are increasing their share, it’s important to let developers target those platforms too. And they’re doing interesting new stuff too, like the C# compiler service, the C# interpreter and even running .NET apps on the iPhone!

So take a look at his talk, one of the best in the whole PDC.

Of course this wasn’t everything PDC was about, but it is the stuff that I really care about. And I really liked some of this stuff!

Now I can touch ASP.NET again

So after my first real project in ASP.NET 2.0, I’ve never touched ASP.NET again. It’s simply ugly. And coding for the web in a language like C# or Java is really a PITA. I just want my logic explained, and that’s one of the reasons for Ruby on Rails’ success.

But today Microsoft has made a small step that may make me experiment with their web technology again:

This afternoon we released a refresh of our DLR/IronPython support for ASP.NET, now called “ASP.NET Dynamic Language Support”, on our CodePlex site.

This means I will be able to do MVC web applications in Python (or Ruby). This is their response to the RoR success. Of course I like Django the most and I may even use it on the MS stack. This is because the Microsoft teams for IronRuby and IronPython are working to get Rails and Django running on their platforms, which is a really cool thing coming from the company that we all know well.