OpenHPC on AArch64

My current effort is to drive ARM hardware into the data centre, through HPC deployments, for the exascale challenge. Current Top500 clusters start from 0.5 petaflops, but the top 10 are between 10 and 90 petaflops, using close to 20MW of combined power. Scaling up 10 to 100 times in performance towards exascale would require a whole power plant (coal or nuclear) for each cluster. So, getting to ExaFLOPS involves not only getting server-grade ARM chips into a rack mount, but building the whole ecosystem around them (PCB, PCIe, DRAM, firmware, drivers, kernel, compilers, libraries, simulations, deep learning, big data) so that it can run at least 10x faster than existing clusters at, hopefully, a fraction of the power budget.

The first step, then, is to find a system that can glue most of those things together, so we can focus on real global solutions rather than individual pieces. It also means we need to fix problems in the right places, rather than deferring them to external projects because they’re not our core business. At Linaro we tend to look at computing problems from a holistic point of view, so if things don’t work the way they should, we make it our job to go and fix them where it’s most meaningful, finding the appropriate upstream project and submitting pull requests there, then back-porting the fixes internally and gluing them together into a proper solution.

And that’s why we came to OpenHPC, a project that aims to facilitate HPC cluster deployment. Essentially, it’s a package repository that glues functional groups together using meta-packages, in addition to recipes and documentation on how to set up a cluster, and a lively community that deploys OpenHPC clusters across different architectures, operating systems and provisioning styles.

The latest version of OpenHPC, 1.3.2, has just been released with some additions proposed and tested by Linaro, such as Plasma, Scotch and SLEPc, as well as LLVM with Clang and Flang. While those packages work well on x86_64 clusters and have passed all tests on AArch64, installing a new AArch64 cluster with automatic provisioning still needs some manual work. That’s why AArch64 is still in Tech Preview.

Warewulf

For those who don’t know, warewulf is a cluster provisioning service. As such, it is installed on a master node, which keeps a database of all the compute nodes, resource files, operating system images and everything else that is necessary to get the compute nodes up and running. While you could install the nodes by hand and then install a job scheduler (like Slurm) on each one, warewulf makes that process simple: it creates an image of a node installation, produces an EFI image for PXE boot and lets the nodes be discovered as they come up alive.

The OpenHPC documentation explains step by step how to do this (by installing OpenHPC’s own meta-packages and running a few configuration tasks), and if all goes well, every compute node that boots in PXE mode will soon find the DHCP server, then the TFTP server, and will get its diskless live installation painlessly. That is, of course, if the PXE process works.
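To give a flavour of what the recipe boils down to on the master node, here is a rough sketch from memory of the OpenHPC 1.3 guide; the node names, addresses and chroot path below are made up, and the exact commands and flags vary between releases, so treat it as illustrative rather than authoritative:

    # Build the compute image (VNFS) from a chroot and the matching boot kernel
    wwvnfs --chroot /opt/ohpc/admin/images/centos7.3
    wwbootstrap `uname -r`

    # Register a compute node and tell warewulf what to provision on it
    wwsh node new c1 --ipaddr=192.168.1.1 --hwaddr=00:1e:67:xx:xx:xx -D eth0
    wwsh provision set c1 --vnfs=centos7.3 --bootstrap=`uname -r`

From that point on, booting c1 in PXE mode should pick up DHCP, TFTP and the image automatically.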

While running this on a number of different ARM clusters, I realised that I was getting an error:

ERROR:  Could not locate Warewulf's internal pxelinux.0! Things might be broken!

While that doesn’t sound good at all, I learnt that people in the ARM community knew about that all along and were using a Grub hack (to create a simple Grub EFI script to jump start the TFTP download and installation). It’s good that it works in some way, but it’s the kind of thing that should just work, or people won’t even try anything further. Turns out the PXELinux folks haven’t bothered much with AArch64 (examples here and here), so what to do?

Synergy

One of the great strengths of Linaro is that it employs a bunch of maintainers of the core open source projects most people use, so I was bound to find someone who knew what to do, or at least who to ask. As it turns out, two EFI gurus (Leif and Ard) work on my extended team, and we began discussing alternatives when Eric (Arm Research, also an OpenHPC collaborator) tipped us off that warewulf would be completely replacing PXELinux with iPXE, in view of his own efforts in getting that working.

After a few contributions and teething issues, I managed to get iPXE booting on ARM clusters as smoothly as I would have done for x86_64.

Tech Preview

Even though it works perfectly well, OpenHPC 1.3.2 was released without it.

That’s because it uses an older snapshot of warewulf, while I was using the bleeding edge development branch, which was necessary due to all the fixes we had to do while making it work on ARM.

This has prompted me to replicate their validation build strategy so that I could replace OpenHPC’s own versions of warewulf in-situ, ignoring dependencies, which is not a very user friendly process. And while I have validated OpenHPC (ran the test-suite, all green) with the development branch of warewulf (commit 74ad08d), it is not a documented or even recommended process.

So, despite working better than the official 1.3.2 packages, we’re still in Tech Preview state, and will be until we can pack the development branch of warewulf past that commit into OpenHPC. Given that it has been reported to work on both AArch64 and x86_64, I’m hoping it will be ready for 1.3.3 or 1.4, whatever comes next.

Open Source and Profit

I have written extensively about free, open source software as a way of life, and now, reading back my own articles of the past 7 years, I realize that I was wrong about some of the ideas, and about the state of open source culture within and around companies.

I’ll make a bold statement to start, trying to get you interested in reading past the introduction, and I hope to give you enough arguments to prove I’m right. Feel free to disagree in the comments section.

In the years to come, business and profit will only be possible if surrounded by free thoughts.

By free thoughts I mean free/open source software, open hardware, open standards, free knowledge (both free as in beer and as in speech), etc.

Past Ideas

I began my quest to understand the open source business model back in 2006, when I wrote that open source was not just software, but also speech. Having open source (free) software is not enough when the reasons why the software is free are not clear. This is because the synergy, which is greater than the sum of the individual parts, can only be achieved if people have the rights (and incentives) to reach out on every possible level, not just the source or the hardware. I made that clearer in 2009, when I exposed the problems of writing closed source software: there is no ecosystem to rely on, so progress is limited and the end result is always less efficient, since the costs of making it as efficient are too great and would drive the price of the software too high to be profitable.

In 2008 I saw both sides of the story, for and against Richard Stallman, on the legitimacy of proprietary control, be it via copyright licenses or proprietary software. I may have come a long way, but I was never against his idea of the perfect society, Richard Stallman’s utopia, or as some friends put it: the Star Trek Universe. The main difference between me and Stallman is that he believes we should fight to the last man to protect ourselves from the evil corporations and their software abuse, while I still believe that it’s impossible for them to sustain this empire for too long. His utopia will come, whether they like it or not.

Finally, in 2011 I wrote about how copying (and even stealing) is the only business model that makes sense (Microsoft, Apple, Oracle etc. are all thieves, in that sense), and the number of patent disputes and copyright infringement cases should serve to prove me right. Last year I think I finally had the epiphany, when I discussed all these ideas with a friend and came to the conclusion that I don’t want to live in a world where it’s not possible to copy, share, derive or distribute freely. Without the freedom to share, our hands will be tied when we need to defend against oppression, and it might just be a coincidence, but in the last decade we’ve seen the biggest growth of both disproportionate property protection and disproportionate governmental oppression that the free world has ever seen.

Can it be different?

Stallman’s argument is that we should fiercely protect ourselves against oppression, and I agree, but after being around business and free software for nearly 20 years, I have so far failed to see a business model in which starting everything from scratch, in a secret lab, and releasing the product ready for consumption makes any sense. My view is that society partakes in an evolutionary process that is ubiquitous and compulsory, in which it strives to reduce the cost of the whole process, towards stability (even if local), as much as any other biological, chemical or physical system we know.

So, to prove my argument that an open society is not just desirable but the only final solution, all I need to do is to show that this is the least-energy state of the social system. Open source software, open hardware and all systems where sharing is at the core should then be the least costly business models, so as to force virtually all companies in the world to follow suit, and create Stallman’s utopia as a result of natural stability, not a forced state.

This is crucial, because every forced state is non-natural by definition, and every non-natural state has to be maintained using resources that could otherwise be used to enhance the quality of the lives of the individuals in the system (be they human or not; let’s not narrow our point of view this early). To achieve balance in a social system we have to let things go awry for a while, so that the arguments against such a state are perfectly clear to everyone involved, and no one can still argue that the current state is optimal. If there isn’t discomfort, there isn’t the need for change. Without death, there is no life.

Profit

Of all the bad ideas we humans have had on how to build a social system, capitalism is probably one of the worst, but it’s also one of the most stable, and that’s because it’s the closest to the law of the jungle, survival of the fittest and all that. Regulations and governments never came about to actually protect the people, but to protect capitalism from itself, and to keep increasing the profit of the profitable. Socialism and anarchy rely too much on forced states, in which individuals have to be devoid of selfishness, a state that doesn’t exist in the current form of human beings. So, while they’re the product of amazing analyses of the social structure, they still need heavy genetic changes in the constituents of the system to work properly, in a stable, least-energy state.

Having fewer angry people on the streets is more profitable for the government (lower security costs, more international trust in the local currency, more investment, etc.), so panem et circenses will always be more profitable than any real change. However, with more educated societies, a result of the increase in the profits of the middle class, more real changes will have to be made by governments, even if wrapped in complete populist crap. One step at a time, the population will get more educated, and you’ll end up with more substance and less wrapping.

So, in the end, it’s all about profit. If not using open source/hardware means things will cost more, the tendency will be to use it. And the more everyone uses it, the less valuable the products that don’t use it become, because the ecosystem in which applications and devices are immersed becomes the biggest selling point of any product. Would you buy a Blackberry application or an Android application? Today, the answer is close to 80% for the latter, and that’s mostly because people hardly use the former at all.

It’s not just that it’s more expensive to build Blackberry applications, because the system is less open and the tools less advanced, but also that the profit margins are smaller, and the return on investment will never justify the effort. This is why Nokia died with its own app store: Symbian was not free, and there was a better, free and open ecosystem already in place. The battle had already been lost, even before it started.

But none of that was really due to moral standards, or Stallman’s bickering. It was only about profit. Microsoft dominated the desktop for a few years, long enough to make a stand and still be dominant after 15 years of irrelevance, but that was only because there was nothing better when they started, not by a long distance. However, when they tried to flood the server market, Linux was not only already relevant, but it was better, cheaper and freer. The LAMP stack was already good enough, and the ecosystem was so open, that it was impossible for anyone with a closed development cycle to even begin to compete on the same level.

Linux became so powerful that, when Apple redefined the concept of smartphones with the iPhone (beating Nokia’s earlier attempts by light-years of quality), the Android system was created, evolved and came to dominate in less than a decade. The power to share made it possible for Google, a non-device, non-mobile company, to completely outperform a hardware manufacturer in a matter of years. If Google had invented a new OS, not based on anything existing, or if they had closed the source, like Apple did with FreeBSD, they wouldn’t have been able to compete, and Apple would still be dominant.

Do we need profit?

So, the question is: is this really necessary? Do we really depend on Google (specifically) to free us from the hands of tyrant companies? Not really. If it wasn’t Google, it’d be someone else. Apple, for a long time, was the odd guy in the room, and they have created immense value for society: they gave us something to aim for, they educated the world on what we should strive for in mobile devices. But once that’s done, the shareable ecosystem learns, evolves and dominates. That’s not because Google is less evil than Apple, but because Android is more profitable than iOS.

Profit here is not just the return on investment that you plan on having over a specific number of years, but, added to that, the potential of what the evolving ecosystem will allow people to do when you’ve long since lost control over it. Shareable systems, including open hardware and software, allow people far down the planning, manufacturing and distribution chain to still have a profit, regardless of what your original intentions were. One such case is Maddog’s Project Cauã.

By using inexpensive Raspberry Pis, by fostering local development and production and by enabling the local community to use all that as a way of living, Maddog’s project is using the power of open source initiatives from completely unrelated people to empower the people of a country that much needs empowering. That new class of people, from this and other projects, is what is educating the population of the world, what is allowing people to fight for their rights, and the reason why so many civil uprisings are happening in Brazil, Turkey and Egypt.

Instability

All that creates instability, social unrest, whistle-blowing gone wrong (Assange, Snowden), and this is a good thing. We need more of it.

It’s only when people feel uncomfortable with how governments treat them that they’ll get up from their chairs and demand change. It’s only when people are educated that they realise that oppression is happening (since there is a force driving us away from the least-energy state, towards enriching the rich), and it’s only when these states are reached that real changes happen.

The more educated society is, the quicker people will rise up in arms against oppression, and the closer we’ll be to Stallman’s utopia. So, whether governments and the billionaire minority like it or not, society will move towards stability, and that stability will migrate to local minima. People will rest, and oppression will grow in an oscillatory manner until unrest happens again and throws us into yet another minimum state.

Since we don’t want to stay in a local minimum, we want to find the best solution, not just any solution. Getting it close to perfect on the first attempt is unlikely, but whether we get close the first time or not, the oscillatory nature of social unrest will not change, and nature will always find a way to get us closer to the global minimum.

Conclusion

Is it possible to stay in this unstable state for too long? I don’t think so. But it’s not going to be a quick transition, nor is it going to be easy, nor will we get it right on the first attempt.

But more importantly, reaching stability is not a matter of forcing us to move towards a better society, it’s a matter of how dynamic systems behave when there are clear energetic state functions. In physical and chemical systems, this is just energy, in biological systems this is the propagation ability, and in social systems, this is profit. As sad as it sounds…

Uno score keeper

With spring not coming any time soon, we had to improvise during the Easter break and play Uno every night. It’s a lot of fun, but it can take quite a while to find a piece of clean paper and a pen that works around the house, so I wondered if there was an app for that. It turns out, there wasn’t!

There were several apps to keep card game scores, but every one was specific to a game, they had ads and they wanted access to the Internet, so I decided it was worth writing one myself. Plus, that would finally teach me to write Android apps, something I had been putting off for years.

The App

[Screenshots: adding new players; card game scores]

The app is not just a Uno score keeper, it’s actually pretty generic. You just keep adding points until someone passes the threshold, at which point the poor soul will be declared a winner or a loser, depending on how you set up the game. Since we’re playing every night, even the 30 seconds I spent re-typing our names was adding up, so I made it save the last game in the Android tuple store, so you can retrieve it via the “Last Game” button.

It’s also surprisingly easy to use (I had no idea), but if you go back and forth inside the app, it clears the game and starts a new one with the same players, so you can play as many rounds as you want. I might add a button to restart (or leave the app) when there’s a winner, though.

I’m also thinking about printing the names in order at the end (from winner to loser), and some other small changes, but the way it is now is good enough to advertise and see what people think.

If you end up using it, please let me know!

Download and Source Code

The app is open source (GPL), so rest assured it has no tricks or money involved. Feel free to download it from here, and get the source code at GitHub.

Distributed Compilation on a Pandaboard Cluster

This week I was experimenting with distcc and Ninja on a Pandaboard cluster and it behaves exactly as I expected, which is a good thing, but it might not be what I was looking for, which is not. 😉

Long story short, our LLVM buildbots were running very slowly, taking from 3 to 4.5 hours to compile and test LLVM. If you consider that at peak time (PST hours) there are up to 10 commits in a single hour, the buildbot will end up testing 20-odd patches at the same time. If it breaks in unexpected ways, or if there is more than one patch in a given area, it might be hard to spot the culprit.

We ended up just avoiding the make clean step, which put us at around 15 minutes for build plus tests, with the odd chance of getting 1 or 2 hours tops, which is a great deal. But one of the alternatives I was investigating was a distributed build. Even more so given the availability of cluster nodes with dozens of ARM cores inside: we could use such a cluster to speed up our native testing, and even do benchmarking in a distributed way. If we do it often enough, the sample might be big enough to account for the differences.

The cluster

So, I got three Pandaboards ES (dual Cortex-A9, 1GB RAM each) and put the stock Ubuntu 12.04 on them and installed the bare minimum (vim, build-essential, python-dev, etc), upgraded to the latest packages and they were all set. Then, I needed to find the right tools to get a distributed build going.

It took a bit of searching, but I ended up with the following tool-set:

  • distcc: The distributed build dispatcher, which knows about the other machines in the cluster and how to send them jobs and get the results back
  • CMake: A Makefile generator which LLVM can use, and it’s much better than autoconf, but can also generate Ninja files!
  • Ninja: The new intelligent builder which not only is faster to resolve dependencies, but also has a very easy way to change the rules to use distcc, and also has a magical new feature called pools, which allow me to scale job types independently (compilers, linkers, etc).

All three tools had to be compiled from source. Distcc’s binary distribution for ARM is too old, CMake’s version on that Ubuntu couldn’t generate Ninja files and Ninja doesn’t have binary distributions, full stop. However, it was very simple to get them interoperating nicely (follow the instructions).

You don’t have to use CMake, there are other tools that generate Ninja files, but since LLVM uses CMake, I didn’t have to do anything. What you don’t want is to generate the Ninja files yourself, it’s just not worth it. Unlike Make, Ninja doesn’t try to search for patterns and possibilities (this is why it’s fast), so you have to be very specific in the Ninja file about what you want to accomplish. This is very easy for a program to do (like CMake), but very hard and error-prone for a human (like me).

Distcc

Using distcc is simple (there’s a minimal sketch of these steps after the list):

  1. Replace the compiler command with distcc compiler in your Ninja rules;
  2. Set the environment variable DISTCC_HOSTS to the list of IPs that will be the slaves (including localhost);
  3. Start the distcc daemon on all slaves (not on the master): distccd --daemon --allow <MasterIP>;
  4. Run ninja with the number of CPUs of all machines + 1 for each machine. Ex: ninja -j6 for 2 Pandaboards.
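A minimal sketch of those steps for two Pandaboards, assuming the master is 192.168.0.1 and the slave is 192.168.0.2 (the addresses are made up):

    # In the generated build.ninja / rules.ninja, prefix the compiler:
    #   command = distcc /usr/bin/c++ ...

    # On the slave board only:
    distccd --daemon --allow 192.168.0.1

    # On the master board:
    export DISTCC_HOSTS="localhost 192.168.0.2"
    ninja -j6    # 2 boards x (2 cores + 1) = 6 jobs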

A local build, on a single Pandaboard of just LLVM (no Clang, no check-all) takes about 63 minutes. With distcc and 2 Pandas it took 62 minutes!

That’s better, but not by as much as one would hope for, and the reason is a bit obvious, but no less damaging: the linker! It took 20 minutes to compile all of the code, and 40 minutes to link it into executables. That happened because, while we had 3 compilation jobs on each machine, we had 6 linking jobs on a single Panda!

See, distcc can spread the compilation jobs as long as it copies the objects back to the master, but because a linker needs all objects in memory to do the linking, it can’t do that over the network. What distcc could do, with Ninja’s help, is know which objects will be linked together and keep copies of them on different machines, so that you could link on separate machines, but that is not a trivial task, and it relies on a level of interoperation between the tools that they were not designed for.

Ninja Pools

And that’s where Ninja proved to be worth its name: Ninja pools! In Ninja, pools are named resources that group jobs together under a specific concurrency limit. You can say that compilation jobs can scale freely, but linkers can’t run more than a handful at a time. You simply need to create a pool called linker_pool (or anything you want), give it a depth of, say, 2, and annotate all linking jobs with that pool. See the manual for more details.
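For illustration, the relevant bits of a Ninja file look roughly like this (the rule name and command are made up; CMake generates the real ones for you):

    # At most 2 link jobs run at any time, no matter what -j says
    pool linker_pool
      depth = 2

    rule link
      command = c++ $in -o $out
      pool = linker_pool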

With the pools enabled, a distcc build on 2 Pandaboards took exactly 40 minutes. That’s 33% of gain with double the resources, not bad. But, how does that scale if we add more Pandas?

How does it scale?

To get a third point (and be able to apply a curve fit), I added another Panda and ran it again, with 9 jobs and the linker pool at 2, and it finished in 30 minutes. That’s less than half the time with three times the resources. As expected, it’s flattening out, but how many more can we add and still profit?

I don’t have an infinite number of Pandas (nor do I want to spend all my time on this), so I just cheated and got a curve-fitting program (xcrvfit, in case you’re wondering), cooked up an exponential that was close enough to the points and used the software’s ability to do a best fit. It came out with 86.806*exp(-0.58505*x) + 14.229, which, according to Lybniz, flattens out after 4 boards (about 20 minutes).
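Plugging a few values into that fitted curve makes the flattening obvious (rounded, and only as good as the three measured points behind it):

    t(x) = 86.806 * exp(-0.58505 * x) + 14.229
    t(2) ≈ 41.2 min   (measured: 40)
    t(3) ≈ 29.2 min   (measured: 30)
    t(4) ≈ 22.6 min
    t(6) ≈ 16.8 min
    t(∞) → 14.2 min

So every board past the fourth buys only a couple of minutes.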

Pump Mode

Distcc has a special mode called pump mode, in which it pushes, along with the C file, all the headers necessary to compile it entirely on the remote node. Normally, distcc pre-processes the source on the master node and sends the pre-processed result to the slaves, which compile it to object code. According to the manual, pump mode could improve performance 10-fold! Well, my results were a little less impressive: my 3-Panda cluster finished in about 34 minutes, 4 minutes more than without pump mode, which is puzzling.

I could clearly see that the files were being compiled on the slaves (distccmon-text would tell me that, whereas before there were a lot of “preprocessing” jobs on the master), but Ninja doesn’t print times on each output line for me to guess what could have slowed it down. I don’t think there was any effect on the linker jobs, which still ran with the pool enabled in this mode.

Conclusion

Simply put, both distcc and Ninja pools have shown to be worthy tools. On slow hardware, such as the Pandas, distributed builds can be an option, as long as you have a good balance between compilation and linking. Ninja could be improved to help distcc to link on remote nodes as well, but that’s a wish I would not press on the team.

However, scaling only to 4 boards takes away a lot of the value for me, since I was expecting to use 16/32 cores. The main problem is, again, the linker jobs running solely on the master node, and LLVM having lots and lots of libraries and binaries. Ninja’s pools can also work well when compiling LLVM+Clang in debug mode, since the objects are many times bigger, and even on an above-average machine you can start swapping or even freeze your machine if you’re using other GUI programs (browsers, editors, etc).

In a nutshell, the technology is great and works as advertised, but with LLVM it might not yet be the right fit. It’s still more profitable to get faster hardware, like the Chromebooks, which are 3x faster than the Pandas and cost only marginally more.

It would also be good to know why pump mode regressed in performance, but I have no more time to spend on this, so I leave it as an exercise to the reader. 😉

Barrelfish

Minix seems to be inspiring more operating systems nowadays. Microsoft Research is investing in a micro-kernel (they call it a multi-kernel, as there are slight differences) called Barrelfish.

Despite being Microsoft, it’s BSD licensed. The mailing list looks pretty empty, the last snapshot is from half a year ago and I couldn’t find an svn repository, but that’s still more than I would expect from Microsoft anyway.

Multi-kernel

The basic concept is actually very interesting. The idea is to take multi-core hybrid machines to the extreme and still be able to run a single OS on them. Pretty much the same way some cluster solutions do (OpenMPI, for instance), but on a single machine. The idea is far from revolutionary. It’s a natural evolution of the multi-core trend combined with current cluster solutions (available for years) and a fancy OS design (micro-kernel) that everyone learns in CS degrees.

What’s the difference, then? For one thing, the idea is to abstract everything away. CPUs will be just another piece of hardware, like the network or graphic cards. The OS will have the freedom to ask the GPU to do MP floating-point calculations, for instance, if it feels it’s going to benefit the total execution time. It’ll also be able to accept different CPUs in the same machine, Intel and ARM for instance (like the Dell Latitude z600), or have different GPUs, nVidia and ATI, and still use all the hardware.

With Windows, Linux and Mac today, you either use the nVidia driver or the ATI one. You also normally don’t have hybrid-core machines and absolutely can’t recover if one of the cores fails. This is not the case with cluster solutions, and Barrelfish’s idea is to simulate precisely that. In theory, you could do energy control (enabling and disabling cores), crash recovery when one of the cores fails but not the others, or plug and play graphics or network cards and even different CPUs.

Imagine you have an ARM netbook that is great for browsing, but you want to play a game on it. You get your nVidia card and a coreOcta 10GHz USB4 and plug them in. The OS recognizes the new hardware, loads the drivers and lets you play your game. Battery life goes down, so once you’re done with the game, you just unplug the cards and continue browsing.

Scalability

So, how is it possible for Barrelfish to be that malleable? The key is communication. Shared memory is great for single-process threaded code and acceptable for multi-process OSs with a small number of concurrent processes accessing the same region of memory. Most modern OSs can handle many concurrent processes, but those processes rarely access the same data at the same time.

Normally, processes are single-threaded or have a very small number of threads (dozens) running. More than that is so difficult to control that people usually fall back to other means, such as client/server designs, or they just go out and buy more hardware. In clusters, there is no way to use shared memory. For one, accessing memory on another computer over the network is just plain stupid, but even if you use shared memory within each node and client/server among different nodes, you’re bound to have trouble. This is why MPI solutions are so popular.

In Barrelfish there’s no shared memory at all. Processes communicate with each other via messages and duplicate content (rather than sharing it). There is an obvious associated cost (memory and bus traffic), but the lock-free semantics are worth it. It also gives Barrelfish another freedom: to choose a communication protocol generic enough that each piece of hardware is completely independent of all others, and plug’n’play becomes seamless.

Challenges

It all seems fantastic, but there’s a long road ahead. First, message passing scales much better than shared memory, but nowadays there aren’t enough cores in most machines to make it worth it. Message passing also introduces some other problems that are not easily solvable: bus traffic and storage requirements increase considerably, and messages are not that generic in nature.

Some companies are famous for not adhering to standards (Apple comes to mind), and a standard hardware IPC framework would be quite hard to achieve. Also, even if using pure software IPC APIs, different companies will still use slightly modified APIs to suit their specific needs and complexity will rise, exponentially.

Another problem is where the hypervisor will live. Having a distributed control centre is cool and scales amazingly well, but its complexity also scales. In a hybrid-core machine, you have to run different instructions, in different orders, with different optimizations and communication. Choosing one core to deal with the scheduling and administration of the system is much easier, but leaves you with a single point of failure.

Finally, going the multi-hybrid-independent route is way too complex, even for a multi-year project with lots of funding (M$) and clever people working on it. After all, if micro-kernels were really that useful, Tanenbaum would have won the argument with Linus. But the future holds what the future holds, and reality (as well as hardware and some selfish vendors) can change. Multi-kernels might be possible and even easier to implement in the future.

This seems to be what the Barrelfish team is betting on, and I’m with them on that bet. Even if it fails miserably (as did Minix), some concepts could still be used in real-world operating systems (as happened with Minix), whatever that will mean in 10 years. Being serious about parallelism is the only way forward; sticking with 40-year-old concepts is definitely not.

I’m still aspiring for non-deterministic computing, though, but that’s an even longer shot…

Linux is whatever you want it to be

Normally, Linux Magazine has great articles: impartial, informative and highly technical. Unfortunately, not always. In a recent article, some perfectionist zealot stated that Ubuntu makes Linux look bad. I couldn’t disagree more.

Ubuntu is a fast-paced, fast-adapting Linux. I was one of the early adopters and I have to say that most of the problems I had with the previous release have been fixed. Some bugs slipped through, of course, but they were reported and quickly fixed. Moreover, Ubuntu has support from hardware manufacturers, such as Dell, and that makes a big difference.

Linux is everything

Linux is excellent for embedded systems, great for network appliances, wonderful for desktops, irreplaceable as a development platform, marvellous on servers and the only choice for real clusters. It also sucks when you have to find the configuration manually, it’s horrible for newbies, it breaks whenever a new release is out, and it takes longer to get new software (such as Firefox), but it also helps a lot with package dependencies, something that neither Mac nor Windows has managed to do properly over the past decades.

Linux is as great as any piece of software could be, but as horrible as every operating system released since the beginning of time. Some Linux distributions are stable, others not so much. Debian takes 10 years to release, and when it does, the software it contains is already 10 years old. Ubuntu tries to be a bit faster, but that obviously breaks a few things. If you’re fast enough at fixing them, the early adopters will be pleased that they helped the community.

“Unfortunately what most often comes is a system full of bugs, pain, anguish, wailing and gnashing of teeth – as many “early” adopters of Karmic Koala have discovered.”

Like any piece of software, open or closed, free (as in beer) or paid, free (as in speech) or non-free, it takes time to mature. A real software engineer should know better: a system is only fully tested when it reaches the community, the user base. Google has been using its own users (your granny too!) as beta testers for years and everyone seems to understand it.

Debian zealots hate Red Hat zealots and both hate Ubuntu zealots, who probably hate other zealots somewhere else. It’s funny to see how opinions vary greatly from one zealot clan to another about what Linux really is. All of them have great knowledge of what Linux is comprised of, but few seem to understand what Linux really is. Linux, or rather GNU/Linux, is a big bunch of software tied together by so many different points of view that it’s impossible to state in less than a thousand words what it really is.

“Linux is meant to be stable, secure, reliable.”

NO, IT’S NOT! Linux is meant to be whatever you make of it, that’s the real beauty. If Canonical thought it was ready to launch, it’s because they thought that whatever bugs passed the safety net were safe enough for users to catch and report, which we did! If you’re not an expert, wait for the system to cool down. A non-expert will not be an “early adopter” anyway, that’s for sure.

Idiosyncrasies

Each Linux has its own idiosyncrasies, that’s what makes it powerful, and painful. The way Ubuntu updates/upgrades itself is particular to Ubuntu. Debian, Red Hat, Suse, all of them do it differently, and that’s life. Get over it.

“As usual, some things which were broken in the previous release are now fixed, but things which were working are now broken.”

One pleonasm after another. There is no new software without new bugs. There is no software without bugs. What was broken was known, what is new is unknown. How can someone fix something they don’t know about? Eventually a user tested it, found it broken and reported it, and they fixed it! Isn’t that simple?

“There’s gotta be a better way to do this.”

No, there isn’t. Ubuntu is like any other Linux: Like it? Use it. Don’t like it? Get another one. If you don’t like the way Ubuntu works, get over it, use another Linux and stop ranting.

Red Hat charges money, Debian has uber-stable decade-old releases, Gentoo is for those who have a lot of time on their hands, etc. Each has its own particularities, and each is good for a different set of people.

Why Ubuntu?

I use Ubuntu because it’s easy to install, use and update. The rate of bugs is lower than on most other distros I’ve used and the rate of updates is much faster and more stable than on some other distros. It’s a good balance for me. Is it perfect? Of course not! There are lots of things I don’t like about Ubuntu, but that won’t make me use Windows 7, that’s for sure!

I have friends who use Suse, Debian, Fedora, Gentoo and they’re all as happy as I am, not too much, but not too little. Each has its problems and solutions, you just have to choose the ones that are best for you.

Gtk example

Gtk, the graphical toolkit behind Gnome, is very simple to use. It doesn’t have an all-in-one IDE such as KDevelop, which is very powerful and complete, but it features a simple and functional interface designer called Glade. Once you have the widgets and signals done, filling in the blanks is easy.

As an example, I wrote a simple dice-throwing application, which took me about an hour from installing Glade to publishing it on the website. Basically, my route was to apt-get install glade, open it, create a few widgets, assign some callbacks (signals) and generate the C source code.

After that, the file src/callbacks.c contains all the signal handlers you have to implement. Adding just a bit of code and browsing this tutorial to get the function names was enough to get it running.

Glade generates all the autoconf/automake files, so it was extremely easy to compile and run the mock window right at the beginning. The rest of the code I added was less than I would have written for a console-based application doing the same thing. Also, because of the code generation, I was afraid it would replace my already-changed callbacks.c when I changed the layout. Luckily, I was really pleased to see that Glade was smart enough not to mess with my changes.
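For reference, the whole workflow fits in a handful of commands. This is roughly what it looked like on my side; the details (and whether code generation is available at all) depend on the Glade version, so treat it as a sketch:

    sudo apt-get install glade     # the interface designer
    glade                          # draw widgets, connect signals, generate the C project
    ./autogen.sh && make           # build the generated autoconf/automake project
    $EDITOR src/callbacks.c        # fill in the generated signal handlers, rebuild, run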

My example is not particularly good-looking (I’m terrible at design), but that wasn’t the intention anyway. It had been 7 years since I last built graphical interfaces myself and I had never done anything with Gtk before, so it shows how easy the library is to use.

Just bear in mind a few concepts of GUI design and you’ll have very few problems:

  1. Widget arrangement is not normally fixed by default (to allow window resizing). So either work out how tables, frames, boxes and panes work (which is a pain) or use fixed positions and disallow window resizing (as I did),
  2. Widgets don’t do anything by themselves, you need to assign them callbacks. Most signals have meaningful names (resize, toggle, set focus, etc), so it’s not difficult to find them and create callbacks for them,
  3. Side effects (numbers appearing at the press of a button, for instance) are not easily done without global variables, so don’t be picky about that from the start. Work your way towards a global context later on, when the interface is stable and working (I didn’t even bother)

If you’re looking for a much better dice rolling program for Linux, consider using rolldice, probably available via your package manager.

SLC 0.2.0

My pet compiler is now sufficiently stable for me to advertise it as a product. It should deal well with the most common cases if you follow the syntax, as there are some tests to assure minimum functionality.

The language, called “State Language”, is very simple. It has only global variables and static states. The first state is executed first; all the others only run if you call goto state;. If you exit a state without branching, the program quits. This behaviour is consistent with the State Pattern, and that’s why it’s implemented this way. You can solve any computer problem using state machines, therefore this new language should be able to tackle them all.

The expressions are very simple, only binary operations, no precedence. Grouping is done with brackets and only the four basic arithmetic operations can be used. This is intentional, as I don’t want the expression evaluator to be so complex that the code will be difficult to understand.

Like all the code I write in my spare time, this one has an educational purpose. I learn by writing and hopefully teach by providing the source, comments and posts, and by leaving it available on the internet so people can find it through search engines.

It should work on any platform you can compile it for (currently only Linux and Mac binaries are provided), but the binaries are still huge (several megabytes) because they contain all the LLVM libraries statically linked into them.

I’m still working on it and will update the status here at every .0 release. I hope to have binary operations for if statements, print string and all PHI calculated for the next release.

40 years and full of steam

Unix is turning 40 and BBC folks wrote a small article about it. What a joy to remember when I started using Unix (AIX on an IBM machine) around 1994 and was overwhelmed by it.

At that time, the only Unix that ran well on a PC was SCO, and that cost a fortune, but there were some others, not as mature, built on the same concepts. FreeBSD and Linux were the two that came into my sight, but I chose Linux because it was a bit more popular (and therefore I could get more help).

The first versions I installed didn’t even have an X server, and I have to say that I was happier than when using Windows. Partially because of all the open-source-free-software-good-for-mankind thing, but mostly because Unix has a power that is utterly ignored by other operating systems. So much so that Microsoft used good bits from FreeBSD (whose license allows it), and Apple rebuilt its graphical environment on top of FreeBSD and made OS X. The GNU folks certainly helped my mood, as I could find on Linux all the power tools I had on AIX, most of the time even more powerful ones.

The graphical interface was lame, I have to say. But in a way it was good: it reminded me of the interface I used on Irix (SGI’s Unix) and that was OK. With time, it got better and better, and by 1999 I was working with it and using it at home full time.

The funny thing is that now I can’t use other operating systems for too long, as I start missing some functionality and eventually get stuck, or at least extremely limited. Mac OS is said to be nice and tidy, with a full FreeBSD inside, but I still lacked agility on it, mainly when searching for and installing packages and configuring the system.

I suppose each OS is for a different type of person… Unix is for those who like to fine-tune their machines or need its power (servers as well), and Mac OS is for those who need something simple, where the biggest change they make is the background colour. As for the rest, I fail to see the point, really.

FSF Settles Suit Against Cisco

The long dispute with Cisco has finally come to an agreement. For me, that means two things: first, they’re not trolls sucking money from the big corps for stupid patent infringement, as some might fear. Second, they’re very patient, understanding and sometimes a bit too naive.

Why the fear?

When building embedded systems or when you’re too close to the hardware (as Cisco is), you may make the wise decision to use open source software, as it’s quite likely to be stable and taken care of by a good bunch of people. Even though there are several ways of doing it independently, so your software is not virally infected by the GPL, it’s not always possible and you may have to re-invent the wheel because of that.

It’s not only GPL, patents can also cause a whole lot of damage, and it seems that TomTom has decided to go head first with the Linux community.

So, although the fear is understandable, it’s more hysteria than something based on actual facts. The FSF hasn’t had much to show in court, and that adds to the uncertainty of the lawyers, but it’s in cases like Cisco’s that they show a much higher maturity than most companies have shown recently, even mature companies like Microsoft.

Richard Stallman

The FSF is not only Stallman. Even though he’s the boss, the organization is a long list of people, sponsors, advisers (and now interns). It’s one thing to fear what RMS will do when he finds you using GPL code in your kitchen scale; what the FSF does as an organization is a completely different matter.

The Cisco case had been going on for several years. They offered help, they asked politely, they warned about the potential dangers and so on. A lot was done before they actually filed the suit, and they settled it nicely. This shows that they’re not just waiting for the next infringement to take you down; they actually care about their (and your) freedom.

The day the FSF starts acting stupid is the day people will drift away. It’s not like with Microsoft, where you have no option; there are plenty of options out there: software, licences, partners, advisers, programmers, etc. GNU/Linux is not the only decent open source operating system; the BSDs are as good, sometimes better, especially in the embedded case.

The year of Linux

Every year since 1995 has been the year of Linux. For me it always was, but I can’t say the same for the rest of the world. Recently, Linux (and other open source software) has played an important role in defining the future of mankind, and more and more the Linux community feels that it’s their sweat and blood.

There is a great chance it’ll become the platform of all things in a very short time-frame. Cars, mobile phones, PDAs, netbooks, laptops, desktops, servers, clusters, spaceships. One platform to rule them all and in the darkness bind them, but if they play dumb, their glory might never see daylight.

Lots of people disagree with the new revisions of the GPL license, feeling it bites the hand that feeds it. Many companies feed back into open source regularly, and that kind of broke the synergy. I personally think it’s excellent for some cases, but not for all. For instance, development tools should not be restricted, especially when it comes to platforms they can’t reach. Opening the platform is an obvious way around it, but not everything can be exposed, and they can’t figure out every implementation detail.

Drivers might also have trouble with GPLv3 for the same reason. Again, there are ways around it, the FSF recently opened a backdoor to develop proprietary plug-ins if they’re blessed, but that might not be suitable for every case.

Solution?

Sorry, not today. Stick to FreeBSD if you can’t cope with GPLv3, find a way to co-exist with the GCC exception and provide the source code of what you have to. If it’s not your core business, you could donate your code to the community and make it GPL too and treat your program as enabling technology, of course, providing your code doesn’t expose any patent or trade secret.

So, well, yeah. Each case is a different case, that’s the problem of being in the long tail.