Alcides Fonseca

40.197958, -8.408312

Tim Bray on Sharding

Which is to say, run some tests. You might just find that you’re getting enough performance out of your database that you can random-spray across your shards, have a stateless front-end and auto-scaled back-end and sleep sound at night because your nice simple system pretty well takes care of itself.

Tim Bray – On Sharding

Tim (of Tim’s dumb affinity code fame) presented a conclusion that falls short. Besides running tests to evaluate several sharding alternatives, you also need to employ talented engineers in your company, so they can adapt the system to the changing needs of sharding.

And 50% of the traffic being generated by the top 10 clients is not much, when compared with the Pareto principle that these situations usually fall into.
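
To make the two routing options concrete, here is a minimal sketch of "random spray" versus client affinity. This is not Tim Bray's code; the function names and the choice of SHA-256 as the stable hash are assumptions made for illustration only.

```python
import hashlib
import random


def random_spray(num_shards, rng=random):
    # Stateless routing: any shard will do, so just pick one at random.
    return rng.randrange(num_shards)


def affinity(client_id, num_shards):
    # Affinity routing: a stable hash sends the same client to the same shard.
    # SHA-256 is an arbitrary choice here; any stable hash would work.
    digest = hashlib.sha256(client_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards
```

With affinity, a client that generates a large share of the traffic always hits the same shard, while random spray spreads that load over all of them. That is why the traffic distribution matters when choosing between the two.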

Cigarette butts on the ground and smoking in order to work

I find it disgusting to look at our beautiful Portuguese pavement (calçada portuguesa) and see cigarette butts scattered all over it. Unfortunately, this scene is common around cafés, restaurants and other leisure spots.

I welcome the bill presented by PAN and approved by parliament that prohibits throwing cigarette butts onto public roads and that, in its final wording, sets fines of between 25 and 250 euros. I support it even knowing that in practice little will change.

But what shocks me most is that all buildings where smoking is forbidden will have to provide ashtrays and proper equipment for disposing of general and recyclable waste, or face fines of 250 to 1500 euros. It is not that I hold a position as radical as the Japanese one (in Japan, recycling has been hugely successful because each citizen is responsible for their own rubbish, even having to carry it in their pocket), but I find it ridiculous that this is our MPs' priority.

That is because the tobacco law has already been updated several times, but a ban on smoking within 5 metres of the door of public places was never put in place. There have been attempts, but they focused on hospitals and schools and never considered applying fines.

I am not bothered by rubbish bins at the door of public places (especially workplaces). But ashtrays are an invitation to smoke at the door of every building (all the more so on rainy days), where everyone is forced to be a passive smoker in order to enter or leave their workplace.

At the end of the day, a person is forced to smoke in order to work, even if second-hand, with all the harmful health effects that implies.

I think it is right to fine those who throw cigarette butts on the ground. Just as I think it is a priority to fine smokers at the door of buildings (or on terraces, at bus stops, or at meeting places) before fining any public building for not having an ashtray.

Against Article 13

The new European copyright directive was approved by the legal affairs committee. At the end of March it goes to a vote in plenary.

This proposal was made by bureaucrats who understand nothing about the internet or how it works. I don't know which experts were consulted, but anyone who understands the subject would say this is colossally stupid. Or maybe they did say so.

Pick one of the MEPs who represent Portugal and call or email them asking them to vote against it. They are there to represent us! I have already sent mine.

The Monopoly in Browsers

I remember the dark old times when IE6 was the default browser everywhere. Because of that dominance, developers focused solely on the IE substandard of JS/ActiveX/Java applets, leaving Linux and Mac users behind. Not long ago, you were required to use IE to submit your taxes. That has changed: now you are required to use Chrome or Edge. As a Safari user, I find myself having to keep Chrome installed to access unsupported websites.

This week, MS announced they would be dropping their own renderer (EdgeHTML, a fork of Trident) from their current browser, Edge, and will be using Chrome’s Blink (itself a fork of Webkit) instead. So now the major browsers are:

  • Edge (Webkit/Blink)
  • Chrome (Webkit/Blink)
  • Firefox (Quantum Render, replacing Gecko)
  • Safari (Webkit)

Except for the tiny differences between vanilla Webkit and Blink, almost all the web uses the same renderer. This is the same monopoly Trident (IE5/6) had more than 10 years ago! And the sole fighter for a diverse web is the same browser and team that fought then: Mozilla Firefox! And this is not by chance: Mozilla’s Foundation is all about diversity and open standards. Just check their funded research projects to see that they put their money where their mouth is.

If it weren’t for Firefox, I believe Chrome would never have had the success it had. Now Chrome’s the one that’s monopolising the web, and we need Firefox to be an alternative that will allow the NextBrowser™ to replace Chrome in 10 years.

I’ve been using Firefox as my main browser for a while now, and I can heartily recommend it. You should try it (and maybe talk to your relatives about it at Christmas). At this point, which browser you use no longer feels like it’s just about personal choice—it feels part of something bigger; it’s about the shape of the web we want.

Jeremy Keith

There is no such thing as pixel-perfect!

If you’ve spent any time with graphic designers, you’ll know that they love spending your money on imperceptible tweaks to your image files. “It must be pixel-perfect!” they cry. When you query why they’ve generated the same icon in multiple sizes, each with subtle variations, they cryptically mention how everything must align with “the grid.”
This is hokum.

The Myth of the Pixel Perfect Grid by Terence Eden

I have a couple of design friends that will keep on doing things pixel-perfect because of their OCD.

The impact of Microsoft acquiring Github

Microsoft has bought Github. This solves two problems for MS and two problems for GitHub.

The less significant improvement for Microsoft is that they now have the git expertise to solve the Windows mega-repository nightmare. The more important change is that they now own a piece of the development ecosystem, something they had lost ever since Microsoft Visual Studio became a boring concept (the recent Xamarin acquisition was the prelude to this move, more on this later).

For Github, this move has given them financial stability, as well as a new CEO. Why is this important? Well, a couple of years ago there was a string of harassment complaints at Github, some involving the CEO’s wife. Tom has since resigned. The new ex-Xamarin, new-Microsoftie Nat comes from an open source background and has had quite a good run with Xamarin and its integration with MS tooling.

The hot question right now is about the impact of this move on the Open-Source community, especially after a surge of project migrations to Gitlab. Developers are afraid of what MS will do with Github, and are migrating to a more Open-Source alternative. While Github contributes to the community (libgit2, Atom, etc…), Gitlab develops its platform in the open (only the enterprise edition is closed source). This allows people (like me and my University) to host our own Gitlab instance, keeping all our (and our students’) source code in-house. If the company behind it ever closes, this is a safer bet than relying on Github to survive forever.

What concerns me more is the impact on the community of developers migrating to Gitlab. These migrations can go either smoothly (keeping a mirror on Github) or abruptly, by deleting the repository on Github. I believe most cases will be the former, but I will consider the latter for the sake of preparing for the worst-case scenario. Here’s what is going to break:

  • Github-hosted pages (within GitHub subdomains)
  • Other repositories that have the migrated repository as a sub-repo.
  • Homebrew formulae that build from source. I assume other source-based package managers will also suffer from this (emerge, Portage and pkgsrc).
  • Python Pipfiles that depend on Github repositories. I assume the same goes for other language-level package managers.

Breaking the "Cool URLs don’t die" rule has a large impact on the community nowadays. A similar problem arose when _why committed online suicide and deleted all his repositories. All the Ruby gems that depended on his were suddenly broken. In his case it was a choice, but you might be forced to migrate out of Github (it might close down one day if it goes out of business, or if MS decides to go in another direction).

So how can one protect themselves against being too dependent on Github/MS or GitLab or whatever new fancy service? Well, it requires some DNS work. You should always own a domain that you use for your own repositories. This requires special support from Github/Gitlab, similar to the custom-domain support in GitHub Pages, or to what Tumblr does. This way, you can always change git provider (assuming the new one has a compatible Issues API, which Gitlab does, and that you can migrate public keys and access) without breaking all the dependencies on your code. Of course, just like backups, this has to be planned ahead of time, before announcing your git repo.

Surprisingly, this is not a new problem. Back in 2009, when considering the dubious Twitter management moves, people were thinking about how to move their micro-blogging to a distributed platform, so they could have different providers. Standards were developed and, in the meantime, most Twitter users moved to Facebook, an even more proprietary platform.

In conclusion, just like in the past, I don’t expect anything to change. I can’t see Microsoft closing down Github or even screwing it up. I assume they will merge Atom and VSCode, which share a common Electron code-base, which is really good news for Github, because Atom was lagging behind VSCode lately. MS and Github were already collaborating on Git VS/Windows tooling, and I expect that to continue. Git hosting will stay the same for MS, probably with an EE edition on Azure. I can’t really see this going wrong over the next 7 years.

MS Excel for Mac 2016 without sorting

I cannot sort by columns in Excel for Mac 2016. It simply crashes. I always thought it was strange for such an important feature to be crashing.

Today I found out that Apple is patching the APIs that older apps from Microsoft (and many other developers) use. Sounds a lot like what Microsoft did in the past, to such an extent that it became impossible to maintain and test years later.

From the long and interesting post:

Microsoft Excel/PowerPoint/Word have a patch in _CFArraySortValues to change the sorting algorithm slightly. How do you break sorting?!

I wonder if it has something to do with that Sierra patch (I’m running High Sierra).

Overcoming the No Free Lunch Theorem in Cut-off Algorithms for Fork-Join programs

Editorial note: I will start blogging more about technical content. This one is about my latest research paper.

Have you heard about the No Free Lunch Theorem? Basically, it applies when, averaged over a whole class of problems, no solution is better or worse than the others.

A non-mathematical example: if you have a great pasta chef, a great sushi chef and a great chimichanga chef, it’s a case of the “No Free Lunch Theorem”, and not because they will be expensive, but because if you ask them to prepare meals from the three cuisines, the quality will be equal on average (each one with a great meal and two so-so ones).
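
The chef analogy can be written down as a tiny table. The scores below are made up purely for illustration (9 for a chef's own cuisine, 5 for the others); the point is only that the averages come out the same.

```python
# Hypothetical quality scores (0-10): each chef is great at one cuisine
# and so-so at the other two. The numbers are invented for illustration.
scores = {
    "pasta chef":       {"pasta": 9, "sushi": 5, "chimichanga": 5},
    "sushi chef":       {"pasta": 5, "sushi": 9, "chimichanga": 5},
    "chimichanga chef": {"pasta": 5, "sushi": 5, "chimichanga": 9},
}

# Average quality of each chef over the three cuisines.
averages = {
    chef: sum(dishes.values()) / len(dishes)
    for chef, dishes in scores.items()
}

# Best chef for each individual cuisine.
best_per_cuisine = {
    cuisine: max(scores, key=lambda chef: scores[chef][cuisine])
    for cuisine in ("pasta", "sushi", "chimichanga")
}
```

Every chef ends up with the same average, even though each cuisine has a clear winner.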

So this occurs in search and optimization algorithms: an algorithm can be fine-tuned for a given problem, but that extra performance will not show up in other types of search and optimization problems. Maybe the latest research on Transfer Learning will show that this is not the case.

Back to my paper: it shows evidence that the theorem applies to granularity control mechanisms for parallel programs. But what are those granularity control mechanisms?

When you write a divide-and-conquer parallel program, you split a program into two independent halves that you compute in parallel. But if you have more than 2 cores on your CPU, you can subdivide each half again, and you can now use 4 cores. And so on, until you can’t subdivide any further. But if you subdivide into 1000 tasks and you only have 16 CPU cores, your program will actually be slower! Dividing tasks and scheduling each micro-task takes time, and doing that 1000 times is stupid. So the solution is to create as many tasks as you have CPU cores [1], right?

If only it were so simple. If splitting a program in half resulted in two independent tasks that take the same time to conclude, then the answer might be yes [2]. But there are several unbalanced or asymmetrical task graphs like this one:

In these cases, you have to create a dynamic condition that stops creating parallel tasks, and begins solving the problem sequentially. Here is an example program, using the Fibonacci example [3]:
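
A minimal sketch of what such a program can look like. It assumes a fixed recursion-depth cut-off, which is just one possible criterion and not necessarily any of those studied in the paper. Plain Python threads stand in for a real fork-join framework, and `max_depth` is a hypothetical parameter name.

```python
import threading


def fib_seq(n):
    # Plain sequential version, used once the cut-off is reached.
    return n if n < 2 else fib_seq(n - 1) + fib_seq(n - 2)


def fib_par(n, depth=0, max_depth=3):
    # Cut-off criterion (an assumption for this sketch): stop forking once
    # the recursion is max_depth levels deep, or the input is trivial.
    if depth >= max_depth or n < 2:
        return fib_seq(n)

    result = {}

    def left_branch():
        result["left"] = fib_par(n - 1, depth + 1, max_depth)

    task = threading.Thread(target=left_branch)
    task.start()  # fork: the left half runs as a separate task
    right = fib_par(n - 2, depth + 1, max_depth)  # right half runs here
    task.join()   # join: wait for the forked half
    return result["left"] + right
```

Because of CPython's global interpreter lock, this version will not actually run faster than `fib_seq`; it only shows where the cut-off condition sits in a fork-join program. The two branches are visibly unbalanced: `fib_par(n - 1)` always does more work than `fib_par(n - 2)`.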

The major research question of my PhD was to find the ideal function for that cut-off criterion. I proposed two new approaches and I even tried to use Evolutionary Computation to find it for me, but I could never get to a function that would outperform all others in all of the 24 benchmark programs I used.

So the answer to my main PhD Research Question is that it is impossible, because this is a case of the No Free Lunch Theorem! There is no optimal cut-off criterion!

I could have given up on this, publishing a paper showing that there is no answer, in order to prevent future PhD students from wasting their time researching this topic like I did. But I ended up not giving up on the big-picture problem here: automatically optimizing parallel programs.

If there is no single cut-off criterion that is better than the rest, and typically there is one that is the best for each kind of problem, I took on the quest to automatically choose the best criterion for each problem. Using data from both my benchmarks and random synthetic benchmarks, I applied Machine Learning techniques to learn and predict the best criterion for future programs.

It’s 2018 and Machine Learning has saved the day once more [4]

1. Or hyper-threads, or CUDA cores, or OpenCL compute-units, or simply different machines.

2. But you still have to consider lock and memory contention.

3. Yes, this is a dumb implementation of Fibonacci. It could be better either by being sequential or by using memoization. However, this dumb version is very asymmetrical and it’s hard to predict when it’s useful to stop parallelizing: it makes a wonderful hello world for this kind of problem.

4. Back in 2012 I had already applied Machine Learning to a very similar problem: deciding if a data-parallel program would execute on the CPU or GPU.

Recovering exFAT partitions

Today my exFAT partition on my 1TB external hard drive died. I split that hard drive half-way for Time Machine (MacOS Extended (Journaled)) and exFAT for sharing large files with Windows machines with no stupid limits.

Disk Utility was not helpful, as it just hung when trying to repair the partition. And running any command-line utility gave me a “Resource Busy” error.

The solution:

ps -ax | grep disk2

with disk2 being the hard drive in question. Besides the grep process itself, you should kill all the processes listed. Afterwards run:

sudo fsck_exfat -d /dev/rdisk2s2

with disk2s2 being the second partition of disk2. It should print a list of all the files on the drive as it fixes the filesystem (-d stands for debug).

Problem solved.

Star Wars @ Geek Freak RUC

This time at João Cotrim's invitation, I went to Geek Freak to take part in a panel about Star Wars Ep 8 – The Last Jedi

EXPosure and Pixelscamp

If you aren’t aware of Pixelscamp, it’s a 3-day hackathon in Lisbon, Portugal. This time I’m living 3 metro stops away, so despite not taking days off, I’ll be participating.

One of the bigger changes from previous editions is that a new voting system will be used, based on EXP, a closed crypto-currency on top of Ethereum.

Basic rules:

  • If you are at the event, you get badges, each amounting to 100 EXP (I’m ignoring the variance here).
  • There are angels who own a lot of EXP (25.000).
  • There is one lucky guy who may win 50.000 EXP by solving a scavenger hunt.
  • Participants can create projects.
  • Everyone can invest in projects. The invested money is controlled by the organization.
  • At the end of the event (and presentations), all the money invested in all the projects is redistributed proportionally to the 10 “richest” projects. These are the main winners of the event.
  • After this, the EXP collected by each project is distributed among its investors proportionally to their investment.
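
The last two rules can be sketched in code. This is my reading of the rules, not the organization's actual implementation: the function name, the data layout and the tie-breaking between equally funded projects are all assumptions.

```python
def redistribute(investments, top_n=10):
    """Pay out the whole pot according to the last two rules.

    investments maps project -> {investor: amount invested}.
    Returns a mapping investor -> EXP received at the end.
    """
    totals = {p: sum(inv.values()) for p, inv in investments.items()}
    pot = sum(totals.values())  # everything invested, in all projects

    # The top_n "richest" projects share the pot proportionally.
    winners = sorted(totals, key=totals.get, reverse=True)[:top_n]
    winners_total = sum(totals[p] for p in winners)
    if winners_total == 0:
        return {}

    payouts = {}
    for project in winners:
        project_share = pot * totals[project] / winners_total
        # Each project's share goes back to its investors, proportionally.
        for investor, amount in investments[project].items():
            payout = project_share * amount / totals[project]
            payouts[investor] = payouts.get(investor, 0) + payout
    return payouts
```

Under this reading, anyone who only invested in a project outside the top ones does not appear in the result at all: their EXP went into the pot and was paid out to others.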

I believe this approach is flawed.

Problem 1: Money is worthless at the end.

Unless you are one of the top 10 (out of ~100) project owners or one of the X top investors, you get nothing. There is no actual interest in saving your money, or in betting on low-risk investments. If you are aiming for the investor prize, you want to hit the jackpot by investing in the number one project (I’m assuming the angels are out of the race).

Additionally, if you are the scavenger hunt winner, you should be extra careful with how you spend your money so as not to lose it (although that money is useless if you don't invest it).

So, you should invest in the best company. The question is when.

Problem 2: Project presentations are at the end of the event.

There are no “investment” rounds here, and no incentive for investing early. The best approach is to invest at the end, when you already have an idea of which projects have a chance of winning. What happens if everyone invests at the end? Maybe that’s the fairest scenario, but it is no different from a regular voting system without decentralized coins.

Problem 3: If you are competitive, you should invest in your own ideas.

If you really want to win this, it makes sense for you to invest early in your own project. If everybody does this, then the prize is decided only by the angels. Unless you follow the right strategy:

  • Get a small team of developers who are interested in doing something useful. Alternatively, create vaporware. Doesn’t matter as long as the angels are interested.
  • Get a large team of miners, participants who are not interested in building something, but rather want to collect badges and participate in activities.

This will be the best approach towards winning (unless you have the best project in the angels’ opinion).

Problem 4: There are no rewards in this Kickstarter.

The project presentation is supposed to work like Kickstarter. The most successful projects on Kickstarter had excellent rewards for investors. Here, unless you are rich, you have no real rewards.

If I end up creating a service as a project, I will give pre-access to the service to whoever backs me. If I am doing a prototype of something that will not last, there is no real reward I can give out (except free hugs).

Problem 5: Work vs Play

You should focus either on working (to get one of the 10 prizes) or playing (and getting badges to be a great investor). I don’t believe a half-way approach is very useful towards winning prizes (Although it might be the best fun experience).

Idea 1: VC-like groups.

Let’s say I invite everyone I know to a VC company. We decide amongst ourselves in which project we want to vote. Only the most voted project gets all the money from the group members. This might seem unfair for members who didn’t vote for it, but in the end it gives that project a better chance of winning, thus making VC members richer.

Btw, contact me in case you want to join Alcides&Friends Ventures, LLC. The larger the VC group, the larger the chances of winning a prize.

Idea 2: Betting

If you have been to previous editions (called Codebits at the time), you might remember the launch of a closed instance of Meo Wallet. It was the same idea, although projects didn’t run on it. Rui and I won the prize for most money transferred, and came a close second for the most money totaled (if only I hadn’t been so generous).

Idea 3: Ponzi scheme.

This was something we also worked on. I’ve known Ponzi schemes to last for months, and we only need it to last 3 days. So contact me as soon as you can to be one of the first to take advantage of the Alcides scheme™.

Don’t get me wrong, this is an interesting idea and we should use these events to play with ideas like this. But I have my doubts it will work in the end like it is intended to.

Catching the Java Train

If you are into Node.js, Elixir and all those new cool tools, you might not have heard of Java. Java is a bloated language and runtime from the 90s that is used mostly by your bank.

Java has been losing adoption because it has not kept up with the features that developers want from a language. Especially compared with its evil twin, C#/.NET, which has received several new and awesome features, driving innovation in the ecosystem (LINQ, Type Providers, etc…).

They have now announced that Java will follow the train model of Ubuntu: a new release every 6 months with whatever’s ready at the time, and an LTS (Long-term-support) release every three years for enterprise customers.

This is their response to having the same product for both enterprises and the new kids. My opinion is that they are hiding from the truth. Java8 was delayed until Lambda was ready. Java9 was delayed until Jigsaw (module system) was ready. I believe they took the right approach. Except I would have released Java7.1 and Java8.1 with some new stuff that was ready before the main incomplete feature was. The real problem here, the one they are avoiding, is that it took them ages to develop and mature both Lambda and Jigsaw (and at least on the Lambda part I think it was really incomplete).

The enterprise world was mostly OK with having neither Lambda nor Jigsaw at the predicted time: they do not care. They may have been annoyed at waiting so long for security updates that were held back by the main feature. That is something my minor releases would have fixed easily. This is what happens with other Open-Source languages such as Python or Ruby.

The main problem is that the new kids were missing these features and switching to Scala, Groovy, Kotlin or whatever was trending at the time on Hacker News. So much so that Google accepted Kotlin as an official Android language, which IMHO sucks for Oracle. But Oracle has made it clear that they do not care about the success of Android. And changing to the train model does not fix this type of problem. If the main features keep being delayed, developers will not care about them when they have better, usable alternatives.

So, dear Mark Reinhold, I believe this is more of a marketing stunt than actually solving the real problem: JDK development is slow as hell.

And I’ve experienced this first hand: I joined project Sumatra to bring GPGPU to Java, something I had first done via the AeminiumGPU project. However, no one on the JDK team put real effort into it. AMD tried to bring in Aparapi, but the lack of effort from the remaining members led the project nowhere. Even the prototype source tree was abandoned.

I believe the main reason is that despite Java being Open-Source in theory, it is not. It is an Oracle-controlled environment. Python has advanced much more in the same years while being completely open-source (despite some investments from Google and Dropbox, among many others). Python does not need the train model: it has a bleeding-edge version in 3.6 (or whatever 3.X we are on right now), and still has 2.7 (or 2.6 if you are lazy like me) running fine on many servers.

The train model is not the solution: a better (and more open) development process is.

Killing apps on iOS

The single biggest misconception about iOS is that it’s good digital hygiene to force quit apps that you aren’t using. The idea is that apps in the background are locking up unnecessary RAM and consuming unnecessary CPU cycles, thus hurting performance and wasting battery life.

That’s not how iOS works. The iOS system is designed so that none of the above justifications for force quitting are true. Apps in the background are effectively “frozen”, severely limiting what they can do in the background and freeing up the RAM they were using.

John Gruber at daring fireball

I really hate to see smart people making this mistake. But it’s understandable that Gruber, always owning the latest iPhone models, feels this way. Let me introduce you to the cheap iPhone owner.

I have owned a 3G, a 4 and a 5. Usually I buy a new iPhone every two years, like you’re supposed to, but I am cheap and I buy the iPhone that was launched two years before the current one (e.g., the 6 came out and I upgraded from the 4 to the 5, complaining about how much bigger the 5 was). Right now I am still rocking the 5, despite its slowness.

On today’s iPhone 5, I need to kill apps that I am not using. It does make my phone faster. I can’t really measure it, but at the end of the day I can end up with 16 open apps, and I don’t have that many apps because I am limited to 8GB and I also want my music there. Of course, if I am opening that app in the next hour, I won’t close it, but there are several apps you use once a day or even less often.

So Mr Gruber, if you have a really old iPhone (it’s more common outside the US and other countries with good plans) and it’s slow as hell, killing apps you won’t use in the next hours does make your iPhone snappier.

Recipe for balancing the Portuguese fiscal scales

The secret of this recipe lies in the first step, which is, brilliantly, the opposite of what you would expect. The second step postpones solving the problem until the next portugal2030.

1. Increase costs
2. Use European funds to pay for those costs (and others we already had) in the short term.
3. The scales are more balanced than they were. PROFIT!

Now, what happens when the European funds for these purposes run out? What happens when the funding for PhD holders (for example) ends and they are already on permanent staff? Who is going to pay their salaries? Clearly not the universities, which already have no money for anything…

Legacy Operating Systems are still alive

In response to yesterday’s cyber attack, Microsoft released a security update for expired OS versions.

Of course, several corporations are still on Windows XP/2003 server, and are not willing to pay for custom support. Nor are they willing to update their software.

Frequently, I see big institutions asking for budgets for software systems, but they do not care about maintenance or continuous development. And this is what happens in those cases.