On using MLIR for Verona

Verona is a new language being developed by Microsoft Research Cambridge which explores the concept of concurrent ownership. Accessing shared memory in a thread-safe manner requires all sorts of atomic access controls. While modern CPUs implement cache locality and atomic instructions (in addition to the previous generation's inefficient catch-all barriers), such fine control is only meaningful for small objects (usually register-sized).

If you want to share large areas of (concurrently) mutable memory, you generally introduce locks, and with locks come deadlocks, livelocks and all sorts of memory corruption problems. Alternatives, like message passing, have existed for decades, but they're usually only efficient for (again) small objects. It's quite inefficient to send MB- or GB-sized mutable blobs in messages.

Verona aims to fix that by passing the ownership of a mutable region of memory as a message instead. The message queues are completely lock-free (using atomic-swap lock-free data structures) and only pass the entry point to a region (an isolated reference to the whole mutable memory blob) as the unique owner. This means only the thread that holds that isolated object can access any memory within the region. Each thread therefore has the strong guarantee that no one else is accessing the entire region, so there is no chance of concurrent mutation and no need for locks.
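The hand-off can be loosely pictured in ordinary Python (this is an analogy of mine, not Verona code or its runtime): a thread-safe queue carries the single reference to a region, and only the thread currently holding that reference touches the region's contents.

```python
# Loose Python analogy of region ownership transfer: the region itself is
# never locked; only the reference to it moves through thread-safe queues.
import queue
import threading

class Region:
    """A mutable blob; whoever holds the only reference owns it."""
    def __init__(self):
        self.data = []

def worker(inbox, outbox, tag):
    region = inbox.get()      # take ownership (blocks until it is sent)
    region.data.append(tag)   # safe to mutate: we are the sole owner
    outbox.put(region)        # give ownership away; stop using it

main_to_worker = queue.Queue()
worker_to_main = queue.Queue()
t = threading.Thread(target=worker,
                     args=(main_to_worker, worker_to_main, "worker-1"))
t.start()
main_to_worker.put(Region())  # main thread relinquishes the region...
result = worker_to_main.get() # ...and reacquires it later
t.join()
print(result.data)            # ['worker-1']
```

The discipline of "stop using it after you send it" is exactly what Verona's type system enforces statically, where Python can only rely on convention.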

MLIR is a multi-level intermediate representation developed within the TensorFlow project but later migrated under the LLVM umbrella. As a core part of the LLVM project, MLIR is being used for a lot more than just ML representations, and is gaining traction as the high-level representation of some language front-ends (Fortran, hardware description), with other experiments bringing Clang to use MLIR as its own intermediate language.

The main benefit of using MLIR for language lowering is that you can keep the language semantics as high level as you need, by constructing dialect operations that encode the logic you want, and then creating passes that use those operations to infer behaviour and types or to transform into a more optimal format, before lowering to the standard dialects and further down, LLVM IR.

This fixes the two big problems in front-ends: on the one hand, it’s a lot easier to work with flexible IR operations than AST (abstract syntax tree) nodes, and on the other hand, we only lower to LLVM IR (which is very low level) when we’re comfortable we can’t extract any more special behaviour from the language semantics. Other existing dialects add to the list of benefits, as they already have rich semantics and existing optimisation passes we can use for free.

Why Verona needs MLIR

Verona aims to be easy to program yet powerful enough to express concurrent and rich type semantics without effort. However, Verona's type system is far from simple. C-like languages usually have native types (integer, float, boolean) that can be directly represented in hardware, or are simple for a runtime library to operate on; and container types (lists, sets, queues, iterators), which offer different views and access patterns over native types. Some languages also support generics, which parametrise some types over other types, for example, a container of any type, to be defined later.

Verona has all of that, plus:

  1. Type capabilities (mutable, immutable, isolated). These control access to objects with regard to mutability (ex. immutable objects are stored in immutable memory outside of mutable regions), as well as region sentinels (isolated) that cannot be held by more than one reference from outside the region.
  2. Type unions (A | B). This feature allows users to create functionality that works with multiple types, allowing easy restriction on the types passed and matching specific types in the code (via keyword match) with the guarantee that the type will be one of those.
  3. Type intersections (A & B). This allows restricting types with capabilities, making it harder to have unexpected access, for example, returning immutable references or identifying isolated objects on creation. It can also help designing interfaces, creating requirements on objects (ex. to be both random-access and an ordered collection). But also as function arguments, controlling the access to received objects.
  4. Strong inferred types. The compiler will emit an error if types cannot be identified at compile time, but users don’t need to declare them everywhere, and sometimes they can’t even be known until the compiler runs its own type inference pass (ex. generics, unions, or lambdas).

Verona currently uses a PEG parser that produces the AST and is quite straightforward, but once the AST is constructed, working with it (more specifically creating new nodes, or replacing or updating existing ones) is quite involved and error-prone. MLIR has some nice properties (SSA form, dialect operations, regions) that make such changes much easier. But more importantly, the IR has explicit control flow (CFG), which is important for tracking where variables come from, which combinations of paths they pass through, and where they ultimately end up. To infer types and check safety, this is fundamental to making sure the code can't get to an unknown state through at least one of the possible paths.

So, the main reason why we chose MLIR to be our representation is so we can do our type inference more easily.

The second reason is that MLIR allows us to mix any number of dialects together. So we can lower the AST into a mix of the Verona dialect and other standard dialects, and passes that can only see Verona operations will ignore the others, and vice-versa. It also allows us to partially lower parts of the dialect into other dialects without having to convert the whole thing. This keeps the code clean (short, to-the-point passes) and allows us to slowly build up more information, without having to run a huge analysis pass followed by a huge transformation pass, only to lose information in the middle.

An unexpected benefit of MLIR was that it has native support for opaque operations, i.e. function-call-like operations that don't need to be defined anywhere. This allowed us to prototype the dialect even before it existed, and it was the precursor of some of our current dialect operations. We're still using opaque nodes where the dialect is not complete yet, allowing us to slowly build the dialect without having to rush through (and fail at) a long initial design phase.

Where are we at

Right now, we have a dialect, a simple and incomplete MLIR generator and a few examples. None of those examples can be lowered to LLVM IR yet, as we don’t have any partial conversion to other standard dialects. Once we have a bit more support for the core language, and we’re comfortable that the dialect is at the right level of abstraction, we’ll start working on the partial conversion.

But, like other examples in MLIR, we’ll have to hold on to strings and printing in our own dialect until it can be lowered directly to the LLVM dialect. This is because MLIR has no native string representation and there is no sane way of representing all types of strings in the existing types.

Other important pieces of missing functionality are:

  • when keyword, which controls access to the regions (requests ownership),
  • match keyword, which controls the types (ex. from a type union) in the following block,
  • lambda and literal objects, which will create their own capture context via anonymous structures and expose a function call.

We also need to expose some minimal runtime library written in Verona to operate on types (ex. integer/floating-point arithmetic and logic), and we need those classes compiled and exposed to user code as an MLIR module, so that we can look at the code as a whole and do more efficient optimisations (like inlining) as well as pattern-matching known operations, like addition, and lower them to native LLVM instructions (ex. add or fadd).

Here be dragons

While we’re always happy to accept issues and pull requests, the current status of the language is raw. We’re co-designing the language, the compiler and the runtime library, as well as its more advanced features such as its clang interface and process sandbox. All of which would need multiple additional blog posts to cover, and all in continuous discussions to define both syntax and semantics.

By the time we have some LLVM IR generated and hopefully some execution of a simple program, the parser, compiler and libraries will be a bit more stable, allowing external people to not only play with it but also contribute back.

What would be extremely helpful, though, are tough questions about the language’s behaviour, our own expectations and the API that is exposed to programmers by the runtime. There are a lot of unknowns and until we start writing some serious Verona code, we won’t know for sure what works better. If you’re feeling brave, and would like to create issues with examples of what you would like to see in the language, that’d be awesome.

Issues (and even pull requests) on the existing implementation would also be nice, with recommendations of better patterns in our usage of external libraries, for example MLIR, which is in constant evolution of its own; we can't keep up with every new shiny feature.

So, patches welcome, but bring your dragon-scale armour, shield and fire resistance spells.

Living Room Server

While playing games on an Arm box with an NVidia card and reasonable frame rates is cool, that’s not exactly what the developer box was intended to do. The idea is, as I said earlier, to give people the same experience of developing on an x86 box. Playing games is obviously part of it, but with an underpowered CPU, the bottleneck will move away from the GPU, even with nouveau drivers, pretty soon.

Other people report using it as a build machine, so I decided to use it as a local (private) cloud box, replacing our ageing media server (Celeron dual-core based).

The Hardware

Board with no GPU, two HDDs, in an empty rack-mount case

For this application, we don't need a GPU at all, so no need for any PCI devices. I've also decided to add back the 1TB hard-drive, as the media partition. Interestingly enough, the board + SSD use about 12W at idle without the spinning rust, but 17W with it: 50% more! With the prices of 1TB SSDs falling, I'll be investing in one pretty soon.

Everything else was still the same. Same main SSD disk, same 8GB of RAM. The only change was to flip the jumper to make it boot when powered, given that I won’t have access to the power button any more.

The Form Factor

For a few weeks, I left the original case in the living room. While it worked fine, it was big, clumsy and there really was no good place for it. So I decided to buy a 1U rack-mount case.

There are too many cases to choose from, most of them over £100 and with all the bells and whistles I don't need. But I remembered one that we used in the Linaro lab a while ago: the SuperMicro SC512-260B.

picoPSU adaptor cable is a tad too short

It has more than enough space for the (slightly too large) motherboard, two HDDs and the power supply, and you can get used ones for £40 on eBay.

So I got one, installed the board in it and was quickly reminded of the data centre experience. Unsurprisingly, the giant fan was a bit too much, given that the CPU has a passive heat-sink and the DIMM is right in front of it, stopping any decent air-flow. After removing it, I then realised the PSU was equally noisy, if not worse. By data centre standards, it was relatively silent, but in a living room, while people are trying to watch a film, it's unacceptable.

Back to the original case until I found a solution. None of the 1U PSUs are remotely silent, and people report having to buy sound-proof racks for their homes. I wasn't going to cash out that much, nor do I have space in the house for that, so I had to find another way. It was then that I remembered that the board is really, really low-powered (12W is nothing), and that other similar boards (for example, the Macchiato Bin) are powered by a 12V PSU.

Wall mounted system. 12V on the right, network on the bottom

That’s when I found picoPSU. For £25, you get up to 80W (4x more than I could ever need) and zero noise. The PSU is completely silent and, guess what, accepts 12V input.

With that in, I could power the board silently, and I managed to remove the PSU and everything else other than the outer case. I printed an adaptor for the 12V socket where the 1U PSU used to sit, but only later realised the motherboard is too big and the 20-pin connector sits too far from the end of the unit, so the cable is just too short.

Nothing that a ceramic drill couldn’t fix. I hacked the side of it (literally, the metal is all dented) and screwed the 12V input to it. Given this is not a rack mount anything, it will do just fine.

Finally, I printed some mounting brackets, fixed them 19″ apart and hung the server behind the router, in an invisible corner of the living room. Neither noise nor visual pollution.

The Software

For the cloud software, there isn't much choice. You either pay for the services (with money and your privacy), or you run ownCloud / Nextcloud. After asking folks who know a lot more about clouds than I do, it seems Nextcloud is the one to go with. It might not be as complete as ownCloud (given its history), but it's the one that has the highest chance of succeeding while still remaining mostly upstream.

Installation is pretty trivial via snap. After installing, you should follow the installation guidelines to enable the admin user, create users, enable HTTPS, etc. You can play a bit with the plugins and configurations on the website, but I realised Nextcloud is a bit on the slow side.

Of course, with an A53 @ 1GHz, I wasn’t expecting a lot. But I had a shared instance for a while on a cloud server and it was even worse. Looking at the CPU load, it’s mostly empty, with one or two CPUs maxing out. This means that the access is serialised into one process and everything is done inside it. It’s reasonable for large powerful servers with hundreds of users, but it’s inadequate for smaller (more scalable) servers or shared cloud servers, where standard Apache works very well.

I don’t know what the structure of Nextcloud is, and it seems to rely heavily on MySQL, which could explain why everything has to be serialised. Serving cached static objects would help a lot (especially for images and videos, which take many seconds to load and usually blow the PHP memory limit).

All in all, it’s better than building by hand, but honestly, for all the hype, I was expecting a much more professional product.

Conclusion

The machine is suitable for web workloads, and I believe the shortcomings of Nextcloud would be felt on an equally priced x86 box. The form factor, however, gains its silence from consuming under 20W at full load (less when I get a new SSD), not needing any active heat-sink on the CPU and not having a GPU at all. Even with the large motherboard, I could still halve the size of the rack-mount and fit everything perfectly. All of that at a fraction of the power budget of a Celeron-based NUC.

If you want additional network ports, you can get a 90-degree PCI riser (which fits into the x1 slot) or a flexible PCI cable (so you can fit it on the x8) and add up to four ports easily in the one slot available. I haven’t tested routing workloads on the dev box yet, but the CPU is more powerful than the majority of the routers available, so I doubt there will be any performance issues. If you have done that, feel free to add comments below.

Linaro’s Dev Box

10 years ago, when I joined Arm, I imagined that we’d all be using Arm desktops soon. After a while working there, I realised this wasn’t really in anyone’s plans (at least not transparent to us, mere developers), so I kind of accepted that truth.

But as time passed and the 64-bit architecture came along, phones really didn’t seem to benefit from the bump in address space or integer arithmetic (it was actually worse, power-consumption-wise), so I began to realise that my early hopes weren’t so unfounded.

But when I left Arm around 2011 for a high-performance group, I realised how complicated it would be to move all of the x86_64/PPC high-performance computing to Arm, and that planted a seed in my brain that led me to join the HPC SIG at Linaro last year.

But throughout that journey, I realised I still didn’t have what I wanted in the first place: an Arm desktop. I’m not alone in that feeling, by any means. Enthusiasts have been building Beagle/Panda/RaspberryPi “computers” for a long time, and we have had Arm Chromebooks for a while, and even used them in our LLVM CI for 3 good years. But they were either severely under-powered to the point of uselessness, or the OS was by far the restricting factor (looking at you, ChromeOS).

So, when Martin told me we were going to build a proper system, with PCIe, GB network, DRAM, SATA in a compatible form factor (MicroATX), I was all in. Better still, we had the dream team of Leif/Ard/Graeme looking at the specs and fixing the bugs, so I was fairly confident we would get something decent at the end. And indeed, we have.

In September 2016, Linus Torvalds told David Rusling:

“x86 is still the one I favour most and that is because of the PC. The infrastructure is there and it is open in a way no other architecture is.”

Well, the new Arm devbox is ATX format, with standard DIMMs, SATA disks (SSD and spinning), GB Ethernet port (and speed), PCIe (x8+x1+x1) and has open bootloaders, kernels and operating systems. I believe we have delivered on the request.

Synquacer Developer Box

Dev box with 1080p monitor, showing Youtube in a browser, 24 cores idling, cpuinfo and lspci outputs as well as some games…

The dev box itself is pretty standard (and that’s awesome!), and you can see the specs for yourself here. We got a few boxes to try out, and we had some other spare hardware to try with it, so after a week or so we had tried all combinations possible, and apart from a few bugs (that we fixed along the way), everything worked well enough. For more news on the box itself, have a look here and here. Also, here’s the guide on how to install it. Not unlike other desktops.

Even the look is not unlike other desktops, although as I’ll explain later, I’d prefer if I could buy the board on its own, rather than the whole box.

The good

Building LLVM on 20 or all cores doesn’t seem to push the power consumption that much… The GPU is active at idle; about 12W are spent on it and about 5W (est.) on the inefficient PSU

I tried four GPUs: NVidia GT210, GT710, GTX1050Ti and an old AMD (which didn’t work on UEFI for lack of any standard firmware). The box comes with the 710, which (obviously) works out-of-the-box. But so does the 210. The 1050Ti works well on UEFI and framebuffer, but (on Debian at least) you need to install firmware-misc-nonfree, which has to be done either with the 710 on a terminal or through serial first; then it works on the next boot.

We tried a large number of DIMMs, with and without ECC, and they all seem to work, up to 16GB. We are limited to 4GB per DIMM, but that’s a firmware issue and we’re fixing it. It will come in the next update. Also, on the subject of firmware updates, there’s no need to get your JTAG probes out. On Debian, just do it like on any other desktop:
$ sudo apt install fwupd
$ sudo fwupdmgr refresh
$ sudo fwupdmgr update
$ sudo reboot

Another nice thing is the HTTP installer. Of course, as expected from a desktop, downloading an ISO from your preferred distro and booting from it works out-of-the-box, but in case you’re lazy and don’t want to dd stuff into a USB stick, we bundled an HTTP install from an ISO “on the cloud”. This is an experimental feature, so salt, pepper and all, but the lesson here is simple: on boot, you’ll be pleasantly redirected to a BIOS screen, with options to boot from whatever device, including HTTP net-inst and USB stick.

Folks managed to run Debian (Stretch and Buster) and Fedora, and they all work without issues. Though, for the GTX1050Ti you’ll need Buster, because the nouveau driver that supports it is 1.0.15, which is not in Stretch. I did a dist-upgrade from Stretch and it worked without incident. A full install with a desktop environment (Cinnamon, Gnome or LXDE) has also worked out-of-the-box.

The box builds GCC, LLVM, Linux and a bunch of other software we put it to work on (much easier with more than 4GB of RAM), and it accepts multiple PCI NICs, so you can also run it as a home server, router or firewall. I haven’t tried 10GbE on that board, but I know those cards work on Arm (in our HPC Lab), so they should work just as well on the Synquacer box.

The not so bad

Inside my server, 8GB RAM, SSD, GT210 (no need for graphics) and a PCIe NIC.

While a lot works out of the box, and that’s a first in consumer Arm boards, not everything works perfectly well, and some things need a bit of fine tuning. Disregarding the need for more / better hardware in the box (you’ll eventually have to buy more RAM and an SSD), there are a few other things you may need to fiddle with.

For example, while Nouveau works out-of-the-box, it does need the following config in its module to get to full speed (seems specific to older cards):

$ echo 'options nouveau config=NvClkMode=auto' | sudo tee /etc/modprobe.d/nouveau.conf
$ sudo update-initramfs -u

Without this, GPU works perfectly well, but it’s not fast enough. With it, I could play Nexuiz at 30fps on “normal” specs, Armagetron at 40fps with all bells and whistles, and 30fps-capped on minetest, with all options set. SuperTuxKart gives me 40fps on the LEGO level, but only 15 on the “under the sea”, and that’s very likely because of its abuse of transparency.

This is not stellar, of course, but we’re talking about the nouveau driver, which is known to be less performant than the proprietary NVidia drivers, on a GT710. Those games are the ones we had packages for on Debian/Arm, and they’re not the most optimised, OpenGL-wise, so all in all, not bad numbers after all.

Then there’s the problem of too many CPUs for too little RAM. I keep coming back to this point because it’s really important. For a desktop, 4GB is enough. For a server, 8GB is enough. But for a build server (my case), it really isn’t. As compilers get more complicated and programs get larger, the amount of RAM used by the compiler and linker can easily pass 1GB per process. On laptops and desktops with 8 cores and 16GB of RAM, that was never a problem, but when the numbers flip to 24 cores by 4GB, it gets ridiculous.

Even 8GB gave me trouble trying to compile LLVM, because I wanted a sweet spot between using as many cores as possible and not swapping. This is a trial-and-error process, and with build times on the scale of hours, it’s a really boring one. And the result is only valid until the code grows or you update the compiler. With 8GB, -j20 seems to be that sweet spot, but I still got 3GB of swap anyway. Each DIMM like the one in the box goes for around £40, so there’s another £120 to make it into a better builder. I’d happily trade the GPU for more RAM.
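The trial-and-error can be sketched as a back-of-the-envelope calculation (the per-job figures below are my own rough assumptions, not measurements, and `sweet_spot` is a name I made up, not a build-system feature):

```python
# Rough sketch: pick a -j value so that ~1GB per compile job plus a few
# fat link jobs still fits in RAM without swapping.
def sweet_spot(ram_gb, cores, gb_per_compile=1.0, link_jobs=2, gb_per_link=3.0):
    budget = ram_gb - link_jobs * gb_per_link  # reserve RAM for the linkers
    jobs = int(budget / gb_per_compile)        # how many compiles fit
    return max(1, min(cores, jobs))            # never exceed the core count

# 24 cores with 8GB barely fits any parallel compiles under these
# assumptions, which is consistent with -j20 pushing the box into swap.
print(sweet_spot(8, 24))    # 2
print(sweet_spot(16, 24))   # 10
```

In practice the real sweet spot sits somewhere between this pessimistic estimate and the core count, because not every compile job peaks at the same time.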

LLVM builds in just over an hour with -j20 and 2 concurrent link jobs (low RAM), which is acceptable but not impressive. Most of my problems have been RAM shortage and swapping, so I’ll redo the test with 16GB and see how it goes, but I’m expecting no better than 40min, which is still twice as long as my old i7-4720HQ. It runs at 1/3 of the clock, in-order, and has 3x the number of threads, so I was expecting closer timings. I will update with more info when I’m done.

The ugly

The first thing that comes to mind is that I have to buy the whole package, for a salty price of $1210, with hardware that is, putting it mildly, outdated.

It has a case with a huge plastic cover meant for big Intel heat-sinks, of which the Synquacer needs none. It’s also too small for some GPUs and too full of loose parts to allow any easy maintenance. No clips, just screws and hard pushes.

The disk is 1TB, standard WD Blue, which is fine, but honestly, in 2018, I’d expect an SSD. A run-of-the-mill SanDisk 120GB SSD comes at the same price and despite being 1/8 of the total space, I’d have preferred it any day. For not much more you could get a 240GB, which is good enough for almost all desktop uses, especially one that won’t be your main box.

It can cope with 64GB of RAM (albeit, right now, firmware limits it to 16GB), but the box comes with 4GB only. This may seem fine when talking about Intel laptops with 4 cores, but the Synquacer has a whopping 24 of them. Even under-powered (1GHz A53), make -j will create a number of threads that will make the box crawl and die when the linking starts. 8GB would have been the minimum I’d recommend for that hardware.

Finally, the SoC. I have had zero trouble with it. It doesn’t overheat, it doesn’t crash, there are no kernel panics, no sudden errors or incompatibility. It reports all its features and Linux is quite happy with it. But it’s an A53. At 1GHz. I know, there are 24 of them, which is amazing when building software (provided you have enough RAM), but pretty useless as a standard desktop.

When I was using the spinning disk, starting Cinnamon was a pain: at least 15 seconds looking at the screen after login. Then I moved to the SSD and it got considerably faster, down to about 5 seconds. With 8GB of RAM barely used, I blame the CPU. It’s not bad as such, but it’s one of the things where even a slight overclock (say, to 1.5GHz) would have been an improvement.

I understand that power consumption and heat are an issue, and the whole board design would have to be re-examined to match, but it’s worth it, in my view. I’d have been happier with half the cores at twice the clock.

Finally, I’d really like to purchase the board alone, so I can put it in the case I already have, with the GPU/disk/RAM I want. To me, it doesn’t make much sense to ship a case and a spinning disk halfway across the world, so I can throw them away and buy new ones.

Conclusion

Given that Linux for Arm has been around for at least a decade, it’s no surprise that it works well on the Synquacer box. The surprise is PCIe x8 working with NVidia cards and running games on open source drivers without crashing. The surprise is that I could connect a large number of DIMMs and GPUs and disks and PCI network cards without a single glitch.

I have been following the developer team working on the problems I reported early on, and I found a very enthusiastic (and extremely competent) bunch of folks (Linaro, Socionext, Arm), who deserve all the credit for making this a product I would want to buy. Though, I’d actually buy the board, if it came on its own, not the entire box.

It works well as an actual desktop (browser, mail, youtube and what have you), as a build server for Arm (more RAM and there you go) and as a home server (NAS, router, firewall). So, I’m quite happy with the results. The setbacks were far fewer and far less severe than I was expecting, even hoping (and I’m a pessimist), so thumbs up!

Now, just get one of those to Linus Torvalds and Gabe Newell, and we have successfully started the “year of the Arm desktops”.

Security Camera without remote access

Ever since our garage doors were broken by burglars (they also broke into others, stole things, etc.), I wanted to get a security camera. However, it looks like there are only two choices: a full-sized CCTV system, costing many hundreds of pounds, with a lot of equipment and a complicated setup; or cheap stupid WiFi cameras that not only require cloud access (to a cloud I don’t trust), but also open up access to Mirai.

In 2016, the malware was already widespread across the Internet and caused a huge Internet blackout, delivering hundreds of Gb/s of bandwidth in multiple DDoS attacks, crippling many first-grade websites like PayPal, Twitter and Spotify. Here’s some good coverage (as usual) by Ars and Engadget.

After the storm, I really thought the IoT industry would shake up, grow up, and that a year or so later we’d have better devices. Alas, mid-2017 still had vulnerable cameras in massive botnets. None of the big manufacturers issued recalls (only one small company did, AFAIK), and the cameras you find on Amazon today are the same you would find at the time of Mirai.

So, I had two choices: either pay premium for a full CCTV kit and spend weeks installing it all around my house (and getting the ire of the local community for a massive spy-shop), or build my own camera. Of course, I opted for the latter…

CameraApp

I’m not a very creative person when it comes to names (our kids have all standard names, as do we), so CameraApp sounds like a name as good as any. What it is, basically, is a Python script that polls a PIR sensor for movement, and when it sees some, the camera takes a snapshot (optionally flashing an LED). That simple. The main loop is about 10 lines of Python on a RaspberryPi. The snapshots go into a directory watched by a simple PHP script, which generates thumbnails of new pictures and prints a simple gallery of the images in very crude HTML.
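A minimal sketch of what that main loop looks like (the function names are mine, and the real script talks to the GPIO PIR sensor and the Pi camera directly; here they are injected as callables so the logic can be exercised without hardware):

```python
# Sketch of the CameraApp main loop: poll the PIR sensor and, on
# movement, optionally flash the LED and take a snapshot.
import time

def run(read_pir, take_snapshot, flash_led=None, cycles=None, interval=0.0):
    """Poll the PIR sensor; collect a snapshot for every movement seen."""
    shots = []
    n = 0
    while cycles is None or n < cycles:  # cycles=None means run forever
        if read_pir():                   # PIR reports movement
            if flash_led:
                flash_led()
            shots.append(take_snapshot())  # e.g. save a JPEG under Images/
        if interval:
            time.sleep(interval)         # don't busy-spin the CPU
        n += 1
    return shots

# Dry run with a fake sensor/camera: movement on cycles 1 and 3 only.
readings = iter([False, True, False, True])
shots = run(lambda: next(readings), lambda: "snapshot.jpg", cycles=4)
print(shots)   # ['snapshot.jpg', 'snapshot.jpg']
```

On the Pi itself, `read_pir` would be a GPIO read and `take_snapshot` a camera capture writing into the gallery directory.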

If you set up a web server on the board, you have yourself a gallery. If you mount that directory onto a NAS and have a web server on another machine, you have a remote server and backup. All of the security is managed by two simple concepts:

  1. System security: no root passwords, a locked-down Linux, access only via SSH keys, only the necessary ports open, etc. Left as an exercise to the sysadmin.
  2. Isolation: No access to the Internet needed, in or out, so if you want to get to the images from outside you’ll need to either VPN in or DMZ an external server out. If you DMZ your camera, you get what you deserve.

While I could have used any small gadget board, I decided to go with the RaspberryPi. Not only did I have 4 of them lying around the house, but it’s the easiest one for which to find proper Linux distributions, compatible hardware on Amazon and instructions on the Internet.

Case Design

We also happen to have a 3D printer, and while it’s fun, it’s not easy to find uses for it that pay back its own cost. Honestly, anything useful I can print on it is cheaper on Amazon. So, if I get the chance to design something out of nothing, that’s a good use case for the printer.

On Thingiverse, I looked for RaspberryPi cases, and you can find a huge list, so I just picked the simplest-looking one (so I could mod it). Then, using TinkerCAD, I designed from scratch a case for every component (PIR, LED, camera) and joined them together into a face that would snap into the base. The project is here. That part took a lot of iterative print-try-mod cycles, and the final design went back to Thingiverse here.

Putting it all together

So, after printing and assembling the components, it looked like this:

The software was developed on the Pi itself, by connecting it to the TV (via HDMI) and using a wireless keyboard/mouse and the Python IDE that comes with Raspbian. This setup makes it a lot easier to develop than Arduino or mbed, as everything I do can be live-tested directly on the board, instead of having compile-flash-no-output problems all the time.

After the development period was over, I could remove everything from the Pi but the power cable (remember, 2A at least) and use the on-board WiFi module for connectivity. This makes it extremely simple to put the camera in random hard-to-reach places. You could, in theory, use a battery (if it is able to provide 2A), but even a big 20Ah one would only last a day or two, and the Pi would crash when the juice ran out (damaging the filesystem).

Uses

While this is great for indoor snooping and holiday reassurance, it needs more than what’s in the package to be actually useful (like most other cameras). The only use that comes for free as-is is the cookie-jar example, but you don’t want to teach your kids to only do the right thing because someone is watching, so scratch that.

Holiday reassurance

For this use case, you'd just connect the camera and place it facing a door or the whole room. But if you keep the pictures on the camera's filesystem, well, the burglar will take it too, and you're back to square one. If you keep the pictures on a server, the burglar can also take the server, and your backups, so you need external backup. Luckily, Linux has good support for cloud storage, including safe (and encrypted) options, so you don't need to trust the manufacturer.

You can either mount the Images directory directly on the cloud disk, or set up an rsync job in cron to push it once in a while. Either way, you can easily look at the pictures while still on holiday, contact the police, send them the pictures and get the case started even before you come back home.
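For the cron approach, here's a minimal sketch of what the push could look like, in Python to match the camera script; the source directory, the remote name and the cron schedule are all illustrative assumptions, not part of my actual setup:

```python
import subprocess

def rsync_command(src, dest):
    # -a preserves timestamps/permissions, -z compresses on the wire,
    # --partial lets an interrupted transfer resume on the next run.
    return ["rsync", "-az", "--partial", src.rstrip("/") + "/", dest]

def push_images(src="/home/pi/Images", dest="cloud:backup/Images"):
    # Called from cron, e.g. every 10 minutes:
    #   */10 * * * * python /home/pi/push_images.py
    subprocess.call(rsync_command(src, dest))
```

Mounting the cloud disk directly avoids the job entirely; the rsync route just gives you retries and on-the-wire compression for free.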

Package Monitoring

Another problem this could solve is package monitoring (e.g. taking pictures of a door to see if a package has arrived), but the current Python script won't work, because PIR sensors only detect movement of things that emit IR, and boxes usually don't. Extending to this usage would probably mean writing a slightly different script that takes a picture every minute and compares it with the previous one. If similar enough, replace the old one and try again.

You should replace the reference picture each time, so that light and shadows only play a small role between two consecutive pictures, rather than across the entire day. Also, comparing images will need additional software (like imagemagick), unless that's your cup of tea and you want to do it all from scratch.
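The "similar enough" check itself is just a per-pixel difference. Here's a toy sketch of the logic; the threshold value and the flat grayscale representation are illustrative assumptions, and in practice imagemagick or PIL would do the pixel work:

```python
def mean_abs_diff(img_a, img_b):
    # Images as flat lists of 0-255 grayscale values, same size.
    assert len(img_a) == len(img_b)
    return sum(abs(a - b) for a, b in zip(img_a, img_b)) / float(len(img_a))

def something_changed(previous, current, threshold=10.0):
    # Below the threshold: just light/shadow drift, so replace the
    # reference and try again. Above it: a package (or pet) showed up.
    return mean_abs_diff(previous, current) > threshold
```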

If the package you want to monitor is actually your pet, then the camera will work out of the box, provided you set the sensor to maximum sensitivity, to capture the lower heat that pets emit compared to humans.

External CCTV

This is the use case I had in mind, but unfortunately, PIR sensors don't work well through windows. That's because glass is a good IR insulator, so this camera would pick up a mob with pitchforks and torches, but not much else. Unless you're an orc afraid for your life, you'll need to place the camera outside, and, well, that comes with a lot of problems of its own.

First, the case will need to be waterproof (or at least resistant), and that's a challenge on its own. The Pi is a computer, and as such needs cooling, which is usually done by air passing over the CPU (or its heat sink). Heat it too much and the Pi dies, bricking the camera. Moreover, you will need to do some maintenance on the board at some point, so packing it as a one-off won't do any good.

Second, taking power outside is usually done with proper extensions, few of them with USB options, and not many USB adaptors are waterproof. So, the best option is to put the device inside a shed, provided the shed has power, of course. At least taking power to a shed is a lot simpler (the endpoints are inside the house/shed), and then regular adaptors and Blu-Tack would work for fixing the camera somewhere.

If your shed has solar power and a battery, given the low consumption of the camera, it could last the whole summer at the very least.

The one thing I'll have to add later is a way to configure the camera via a text file. As is, it doesn't support night vision (even if your camera does), as this is a PiCamera option, along with many others that one could easily add to the camera setup phase in the script. That's my next step, which will go into GitHub as soon as I'm done.
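Since the config file hasn't been written yet, the format below is purely hypothetical, but parsing a simple key = value file and feeding it to the camera setup phase could look something like this:

```python
def load_camera_config(text):
    # Hypothetical "key = value" format, one option per line,
    # blank lines and '#' comments ignored.
    opts = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if "=" in line:
            key, value = line.split("=", 1)
            opts[key.strip()] = value.strip()
    return opts

# Applied during camera setup, e.g. (assuming the picamera module):
#   camera = picamera.PiCamera()
#   opts = load_camera_config(open("/home/pi/camera.conf").read())
#   if opts.get("exposure_mode"):      # e.g. "night" for night vision
#       camera.exposure_mode = opts["exposure_mode"]
```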

Dash Cam

A bit of Blu-Tack would work to fix the case as-is on your dashboard, and the car's power socket usually has more than enough power to support a 5V 2A adaptor. But again, you would have to write a new Python script to take videos instead of snapshots. I'd also add a big button to start/stop recording and a large USB dongle to store the videos (they'll get big).
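A sketch of how that start/stop button could drive the recording; the pin number, mount point and the picamera/GPIO wiring are all assumptions, not tested code:

```python
class DashCam(object):
    # Wraps a picamera.PiCamera-like object; toggle() would be hooked
    # to the button via RPi.GPIO edge detection on, say, pin 17.
    def __init__(self, camera, directory="/mnt/usb"):
        self.camera = camera
        self.directory = directory
        self.recording = False
        self.clip = 0

    def toggle(self):
        if self.recording:
            self.camera.stop_recording()
        else:
            # New numbered clip on each press, straight to the USB stick.
            self.clip += 1
            self.camera.start_recording(
                "%s/dash%03d.h264" % (self.directory, self.clip))
        self.recording = not self.recording
```

Keeping the button logic in a small state machine like this also makes it trivial to test without the hardware attached.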

Educational Value

Getting a camera was the idea, but the most important takeaway was to show the kids how easy it is to build something functional. By using the right tools, things essentially build themselves.

A few of the takeaways:

  • By using the RaspberryPi instead of an Arduino, I could develop the app on the board itself and test it as I went, which really shortened the development cycle.
  • It also allowed me to download Raspbian, which comes with absolutely everything I needed (OS, GUI, IDE, Python, PiCamera, GPIO, browser for searching).
  • It also allowed me to purchase the right devices on Amazon (especially the camera!) and everything worked out of the box.
  • By doing it in the living room (on the only TV), I forced my kids to watch and sometimes help. Seeing only the final product makes it look like magic.
  • The 3D printer was really helpful, as initially I lost a lot of time looking for software bugs when actually the pins had got disconnected from handling the board.
  • And it turned out to be half of the fun, printing, fitting, trying again.
  • I hate to say this, but, by doing it in Python and PHP, I could really leverage the APIs and modules. Writing it in C++ would have been a nightmare and totally pointless.
  • And I could also find a dozen other projects that did similar things, and even steal a bit of code from them, refactor, change completely.

Closing Words

In the end, building stuff is always half the fun, so you have to plan accordingly. If you need a camera today, go buy one on Amazon, but if you have some time to spare on your holidays and are bored of looking at the cold rain outside, a project like this really shines.

If this project interests you in any way, feel free to collaborate (on GitHub, TinkerCAD, Thingiverse), and let me know (in the comments or as GitHub issues) of ideas you have and problems you find. Happy hacking!


Going ARM

Last week I attended the Arm Research Summit, and on Tuesday there was a panel session on using Arm hardware for performance, an effort known as GoingArm. You can watch the whole panel on YouTube, from 2:12 onward (~1h30). All videos are available on the website.

It was interesting to hear the opinions of people on the front lines of ARM HPC, as much as the types of questions that came up, which made me believe even more strongly that now is the time to re-think general computing. For a long time we have been coasting on the fact that cranking up the power and squeezing down the components were making programmers' lives easier. Scaling up to multiple cores was probably the only big change most people had to deal with, and even so, not many people can actually take real advantage of it.


OpenHPC on AArch64

My current effort is to drive ARM hardware into the data centre, through HPC deployments, for the exascale challenge. Current Top500 clusters start at 0.5 petaflops, but the top 10 range between 10 and 90 petaflops, using close to 20MW of combined power. Scaling up 10~100 times in performance towards exascale would require a whole power plant (coal or nuclear) for each cluster. So, getting to ExaFLOPS involves not only getting server-grade ARM chips into a rack mount, but baking the whole ecosystem (PCB, PCIe, DRAM, firmware, drivers, kernel, compilers, libraries, simulations, deep learning, big data) so that it can run at least at 10x the performance of existing clusters at, hopefully, a fraction of the power budget.

The first step, then, is to find a system that can glue most of those things together, so we can focus on real global solutions, not individual pieces. It also means we need to fix all problems in the right places, rather than deferring them to external projects because it's not our core business. At Linaro we tend to look at computing problems from a holistic point of view, so if things don't work the way they should, we make it our job to go and fix the problem where it's most meaningful, finding the appropriate upstream project and submitting pull requests there, then back-porting solutions internally and gluing them together into a proper solution.

And that's why we came to OpenHPC, a project that aims to facilitate HPC cluster deployment. Essentially, it's a package repository that glues functional groups together using meta-packages, in addition to recipes and documentation on how to set up a cluster, and a lively community that deploys OpenHPC clusters across different architectures, operating systems and provisioning styles.

The latest version of OpenHPC, 1.3.2, has just been released with some additions proposed and tested by Linaro, such as Plasma, Scotch and SLEPc, as well as LLVM with Clang and Flang. But while those work well on x86_64 clusters, and they have passed all tests on AArch64, installing a new cluster on AArch64 with automatic provisioning still needs some modification. That's why it's still in Tech Preview.

Warewulf

For those who don't know, Warewulf is a cluster provisioning service. As such, it is installed on a master node, which keeps a database of all the compute nodes, resource files, operating system images and everything else that is necessary to get the compute nodes up and running. While you can install the nodes by hand, then install a dispatcher (like Slurm) on each node, Warewulf makes that process simple: it creates an image of the node installation, produces an EFI image for PXE boot and lets the nodes discover themselves as they come up alive.

The OpenHPC documentation explains step by step how you can do this (by installing OpenHPC's own meta-packages and running a few configuration tasks), and if all goes well, every compute node that boots in PXE mode will soon find the DHCP server, then the TFTP server, and will get its disk-less live installation painlessly. That is, of course, if the PXE process works.

While running this on a number of different ARM clusters, I realised that I was getting an error:

ERROR:  Could not locate Warewulf's internal pxelinux.0! Things might be broken!

While that doesn't sound good at all, I learnt that people in the ARM community knew about it all along and were using a Grub hack (a simple Grub EFI script to jump-start the TFTP download and installation). It's good that it works in some way, but it's the kind of thing that should just work, or people won't even try anything further. Turns out the PXELinux folks haven't bothered much with AArch64 (examples here and here), so what to do?

Synergy

One of the great strengths of Linaro is that it employs a bunch of maintainers of the core open source projects most people use, so I was bound to find someone who knew what to do, or at least who to ask. As it turns out, two EFI gurus (Leif and Ard) work on my extended team, and we began discussing alternatives when Eric (Arm Research, also an OpenHPC collaborator) tipped us off that warewulf would be completely replacing PXELinux with iPXE, in view of his own efforts in getting that working.

After a few contributions and teething issues, I managed to get iPXE booting on ARM clusters as smoothly as I would have done for x86_64.

Tech Preview

Even though it works perfectly well, OpenHPC 1.3.2 was released without it.

That's because it uses an older snapshot of warewulf, while I was using the bleeding-edge development branch, which was necessary due to all the fixes we had to do while making it work on ARM.

This prompted me to replicate their validation build strategy so that I could replace OpenHPC's own version of warewulf in situ, ignoring dependencies, which is not a very user-friendly process. And while I have validated OpenHPC (ran the test-suite, all green) with the development branch of warewulf (commit 74ad08d), it is not a documented or even recommended process.

So, despite working better than the official 1.3.2 packages, we're still in Tech Preview state, and will be until we can pack the development branch of warewulf past that commit into OpenHPC. Given that it has been reported to work on both AArch64 and x86_64, I'm hoping it will be ready for 1.3.3 or 1.4, whichever comes next.

Trashing Chromebooks

At Linaro, we do lots of toolchain tests: GCC, LLVM, binutils, libraries and so on. Normally, you'd find a fast machine where you could build toolchains and run all the tests, integrated with some dispatch mechanism (like Jenkins). Normally, you'd have a vast choice of hardware to choose from for each form factor (workstation, server, rack mount), and you'd pick the fastest CPUs and a fast SSD with enough space for the huge temporary files that toolchain testing produces.


The only problem is, there aren't any ARM rack servers or workstations. In the ARM world, you either have many cheap development boards, or one very expensive (100x more) professional development board. Servers, workstations and desktops are still non-existent. Some have tried (Calxeda, for example) but failed. Others are trying with ARMv8 (the new 32/64-bit architecture), but all of them are under heavy development, so not even Alpha quality.

Meanwhile, we need to test the toolchain, and we have been doing it for years, so waiting for a stable ARM server was not an option and still isn't. A year ago I took on the task of finding the most stable development board that is fast enough for toolchain testing, and filling a rack with it. Easier said than done.

Choices

Amongst the choices I had, Panda, Beagle, Arndale and Odroid boards were the obvious candidates. After initial testing, it was clear that Beagles, with only 500MB of RAM, were not able to compile anything natively without some major refactoring of the build systems involved. So, while they're fine for running remote tests (SSH execution), they have very little use for anything else related to toolchain testing.


Pandas, on the other hand, have 1GB of RAM and can compile any toolchain product, but the timing is a bit on the wrong side. Taking 5+ hours to compile a full LLVM+Clang build, a full bootstrap with testing would take a whole day. For background testing of the architecture, that's fine, but for regression tracking and investigative work, they're useless.

With the Arndales, we had no such luck. They're either unstable or deprecated months after release, which makes it really hard to acquire them in any meaningful volume for contingency and scalability plans. We were left, then, with the Odroids.


HardKernel makes very decent boards, with fast quad-A9 and octa-A15 chips, 2GB of RAM and a big heat sink. Compilation times were in the right ballpark (40~80 min), so they're good for both regression catching and bootstrapping toolchains. But they had the same problem as every other board we tried: instability under heavy load.

Development boards are built for hobby projects and prototyping. They can normally run at fairly high frequencies (1~2 GHz), but are designed for low-power, stand-by usage most of the time. Toolchain testing, however, involves building the whole compiler and running the full test-suite on every commit, and that puts them at 100% CPU usage, 24/7. Since build times are around an hour or more, by the time a build finishes, other commits have gone through and need to be tested, making it a non-stop job.

CPUs are designed to scale down their frequency when they get too hot, so throughout normal testing they stay stable at their operating temperature (~60C), and adding a heat sink only lets them go further up in frequency while keeping the same temperature, so it won't solve the temperature problem.

The issue is that, after running for a while (a few hours, days, weeks), the compilation jobs start to fail randomly (the infamous "internal compiler error") in different places of different files every time. This is clearly not a software problem, but if it were the CPU's fault, it would have happened a lot earlier, since the CPU reaches its operating temperature seconds after the test starts, yet the failures only appear hours or days into full-time running. The same argument rules out any trouble in the power supply, since it should have failed at the beginning, not days later.

The problem the heat sink doesn't solve, however, is the board's overall temperature, which gets quite hot (40C~50C) and has negative effects on other components, like the SD reader and the card itself, or the USB port and the stick itself. Those boards can't boot from USB, so we must use SD cards for the system, and even using a USB external hard drive with a powered USB hub, we still see the failures, which hints that the SD card is failing under high load and high temperatures.

According to SanDisk, their SD cards should be fine in that temperature range, but other parties might be at play, like the kernel drivers (which aren't built for that kind of load). What pointed me to the SD card in the first place was that, when running solely on the SD card (for system and build directories), the failures appear sooner and more often than when running the builds on a USB stick or drive.

Finally, with the best failure rate at one per week, none of those boards is fit to be a build slave.

Chromebook

That’s when I found the Samsung Chromebook. I had one for personal testing and it was really stable, so amidst all that trouble with the development boards, I decided to give it a go as a buildbot slave, and after weeks running smoothly, I had found what I was looking for.

The main difference between development boards and the Chromebook is that the latter is a product. It was tested not just for its CPU or memory, but as a whole. Its design evolved with the results of those tests, and it became more stable as it progressed. Also, the Linux drivers and kernel were made to match, fine-tuned and crash-tested, so that it could be used by the worst kind of users. As a result, after one and a half years running Chromebooks as buildbots, I haven't been able to make them fail yet.

But that doesn't mean I have stopped looking for an alternative. Chromebooks are laptops, and as such they're built with a completely different mindset from a rack machine, and the number of modifications needed to make one fit the environment wasn't short. Rack machines need to boot when powered up, give 100% of their power to the job and distribute heat efficiently under 100% load for very long periods of time. Precisely the opposite of a laptop design.

Even though they don't fail the jobs, they did give me a lot of trouble, like having to boot manually, overheating the batteries and not having a Linux image easily deployable via network boot. The steps to fix those issues are listed below.

WARNING: Anything below will void your warranty. You have been warned.

System settings

To get your Chromebook to boot anything other than ChromeOS, you need to enter developer mode. With that, you'll be able not only to boot from SD or USB, but also to change your partitions and have sudo access on ChromeOS.

With that, you go to the console (CTRL+ALT+->), log in as user chronos (no password) and set the boot process as described in the link above. You'll also need to run sudo crossystem dev_boot_signed_only=0 to be able to boot anything you want.

The last step is to make your Linux image boot by default, so that when you power up your machine it boots Linux, not ChromeOS. Otherwise, you'll have to press CTRL+U on every boot, and remote booting via PDUs will be pointless. You do that via cgpt.

You need to find the partition your ChromeOS boots from by listing all of them and seeing which one booted successfully:

$ sudo cgpt show /dev/mmcblk0

The right partition will have the information below appended to the output:

Attr: priority=0 tries=5 successful=1

If it has tries left and was successful, this is probably your main partition. Move it back down the priority order (6th place) by running:

$ sudo cgpt add -i [part] -P 6 -S 1 /dev/mmcblk0

And you can also set the SD card's partition to priority 0 by doing the same thing on mmcblk1.

With this, installing Linux on an SD card might get you booting Linux by default on the next boot.

Linux installation

You can choose from a few distributions to run on the Chromebooks; I have tested both Ubuntu and Arch Linux, which work just fine.

Follow those steps, insert the SD card into the slot and boot. You should get the Developer Mode screen and, after waiting long enough, it should beep and boot directly into Linux. If it doesn't, it means your cgpt meddling was unsuccessful (been there, done that) and will need a bit more fiddling. You can press CTRL+U for now to boot from the SD card.

After that, you should have complete control of the Chromebook, and I recommend adding your daemons and settings to the boot process (init.d, systemd, etc). Turn on the network, start the SSH daemon and other services you require (like buildbots). It's also a good idea to change the governor to performance, but only if you're going to use it for full-time heavy load, and especially if you're going to run benchmarks. For the latter, you can do that on demand, and don't need to leave it on from boot time.

To change the governor:
$ echo [scale] | sudo tee /sys/bus/cpu/devices/cpu[N]/cpufreq/scaling_governor

scale above can be one of performance, conservative, ondemand (the default), or any other governor that your kernel supports. If you're about to run benchmarks, switch to performance and then back to ondemand afterwards. Use cpu[N] as the CPU number (starting at 0) and do it for all CPUs, not just one.
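To cover every CPU in one go, here's a small sketch of the same sysfs write as a script (needs root, and it's just the tee one-liner applied per CPU, nothing more):

```python
import glob

def governor_files():
    # One scaling_governor file per CPU that exposes cpufreq.
    return sorted(glob.glob(
        "/sys/bus/cpu/devices/cpu*/cpufreq/scaling_governor"))

def set_governor(scale):
    # scale: "performance", "conservative", "ondemand", ...
    for path in governor_files():
        with open(path, "w") as f:
            f.write(scale)
```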

Other handy scripts report the temperatures and frequencies of the CPUs:

$ cat thermal
#!/usr/bin/env bash
# Print the temperature of every thermal zone, in degrees Celsius.
ROOT=/sys/devices/virtual/thermal
for dir in $ROOT/*/temp; do
  temp=$(cat "$dir")
  temp=$(echo "$temp/1000" | bc -l | sed 's/0\+$/0/')
  device=$(basename "$(dirname "$dir")")
  echo "$device: $temp C"
done

$ cat freq
#!/usr/bin/env bash
# Print the current frequency of every CPU, in GHz.
ROOT=/sys/bus/cpu/devices
for dir in $ROOT/*; do
  if [ -e "$dir/cpufreq/cpuinfo_cur_freq" ]; then
    freq=$(sudo cat "$dir/cpufreq/cpuinfo_cur_freq")
    freq=$(echo "$freq/1000000" | bc -l | sed 's/0\+$/0/')
    echo "$(basename "$dir"): $freq GHz"
  fi
done

Hardware changes


As expected, the hardware was also not ready to behave like a rack server, so some modifications are needed.

The most important thing you have to do is to remove the battery. First, because you won't be able to boot it remotely with a PDU if you don't, but more importantly, because the heat from constant usage will destroy the battery. Not just make it stop working, which we wouldn't care about, but slowly release gases and bloat, which can be a fire hazard.

To remove the battery, follow the iFixit instructions here.

Another important change is to remove the lid magnet that tells the Chromebook not to boot when power is applied. The iFixit post above doesn't mention it, but it's as simple as prying the monitor bezel open with a sharp knife (no screws), locating the small magnet on the left side and removing it.

Stability

With all these changes, the Chromebook should be stable for years. It'll be possible to power cycle it remotely (if you have a PDU), boot directly into Linux and start all your services with no human intervention.

The only thing you won't have is serial access to re-flash it remotely if all else fails, as you can with most (all?) rack servers.

Contrary to common sense, the Chromebooks are a lot better as build slaves than any development board I have ever tested, and in my view that's mainly due to the amount of testing they have gone through as consumer products. Now I need to test the new Samsung Chromebook 2, since it's got the new Exynos Octa.

Conclusion

While I'd love to have more options, different CPUs and architectures to test, it seems that Chromebooks will be the go-to machine for the time being. And with all the glory going to ARMv8 servers, we may never see an ARMv7 board run stably in a rack.

Open Source and Profit

I have written extensively about free, open source software as a way of life, and now, reading back my own articles of the past 7 years, I realize that I was wrong about some of the ideas, or about the state of the open source culture within business and around companies.

I'll make a bold statement to start, trying to get you interested in reading past the introduction, and I hope to give you enough arguments to prove I'm right. Feel free to disagree in the comments section.

The future of business and profit, in years to come, can only come if surrounded by free thoughts.

By free thoughts I mean free/open source software, open hardware, open standards, free knowledge (both free as in beer and as in speech), etc.

Past Ideas

I began my quest to understand the open source business model back in 2006, when I wrote that open source was not just software, but also speech. Having open source (free) software is not enough when the reasons why the software is free are not clear. The reason is that the synergy, which is greater than the sum of the individual parts, can only be achieved if people have the rights (and incentives) to reach out on every possible level, not just the source or the hardware. I made that clear later on, in 2009, when I exposed the problems of writing closed source software: there is no ecosystem on which to rely, so progress is limited and the end result is always less efficient, since the costs to make it as efficient are too great and would drive the price of the software too high to be profitable.

In 2008 I saw both sides of the story, for and against Richard Stallman, on the legitimacy of proprietary control, be it via copyright licenses or proprietary software. I may have come a long way, but I was never against his idea of the perfect society, Richard Stallman's utopia, or as some friends put it: The Star Trek Universe. The main difference between me and Stallman is that he believes we should fight to the last man to protect ourselves from the evil corporations and their software abuse, while I still believe that it's impossible for them to sustain this empire for too long. His utopia will come, whether they like it or not.

Finally, in 2011 I wrote about how copying (and even stealing) is the only business model that makes sense (Microsoft, Apple, Oracle etc. are all thieves, in that sense), and the number of patent disputes and copyright infringements should serve to prove me right. Last year I think I finally hit the epiphany, when I discussed all these ideas with a friend and came to the conclusion that I don't want to live in a world where it's not possible to copy, share, derive or distribute freely. Without the freedom to share, our hands are tied against oppression, and it might just be a coincidence, but in the last decade we've seen the biggest growth of both disproportionate property protection and disproportionate governmental oppression that the free world has ever seen.

Can it be different?

Stallman's argument is that we should fiercely protect ourselves against oppression, and I agree, but after being around business and free software for nearly 20 years, I have so far failed to see a business model in which starting everything from scratch, in a secret lab, and releasing the product ready for consumption makes any sense. My view is that society partakes in an evolutionary process that is ubiquitous and compulsory, in which it strives to reduce the cost of the whole process towards stability (even if local), as much as any other biological, chemical or physical system we know.

So, to prove my argument that an open society is not just desirable, but the only final solution, all I need to do is show that this is the least-energy state of the social system. Open source software, open hardware and all systems where sharing is at the core should then be the least costly business models, so as to force virtually all companies in the world to follow suit, and create Stallman's utopia as a result of natural stability, not a forced state.

This is crucial, because every forced state is non-natural by definition, and every non-natural state has to be maintained by using resources that could otherwise be used to enhance the quality of the lives of the individuals in the system (be they human or not; let's not narrow our point of view this early). To achieve balance in a social system we have to let things go awry for a while, so that the arguments against such a state are perfectly clear to everyone involved, and there remains no argument that the current state is non-optimal. If there isn't discomfort, there isn't the need for change. Without death, there is no life.

Profit

Of all the bad ideas we humans have had on how to build a social system, capitalism is probably one of the worst, but it's also one of the most stable, and that's because it's the closest to the law of the jungle, survival of the fittest and all that. Regulations and governments never came to actually protect the people, but to protect capitalism from itself, and continue increasing the profit of the profitable. Socialism and anarchy rely too much on forced states, in which individuals have to be devoid of selfishness, a state that doesn't exist in the current form of human beings. So, while they're the product of amazing analysis of the social structure, they still need heavy genetic changes in the constituents of the system to work properly, in a stable, least-energy state.

Having fewer angry people on the streets is more profitable for the government (lower security costs, more international trust in the local currency, more investment, etc), so panis et circenses will always be more profitable than any real change. However, with more educated societies, resulting from the increase in profits of the middle class, more real changes will have to be made by governments, even if wrapped in complete populist crap. One step at a time, the population will get more educated, and you'll end up with more substance and less wrapping.

So, in the end, it's all about profit. If not using open source/hardware means things will cost more, the tendency will be to use it. And the more everyone uses it, the less valuable the products that don't use it become, because the ecosystem in which applications and devices are immersed becomes the biggest selling point of any product. Would you buy a Blackberry application, or an Android application? Today, the answer is close to 80% in favour of the latter, and that's only because they don't use the former at all.

It's not just more expensive to build Blackberry applications, because the system is less open and the tools less advanced; the profit margins are also smaller, and the return on investment will never justify it. This is why Nokia died with its own app store: Symbian was not free, and there was a better, free and open ecosystem already in place. The battle had already been lost, even before it started.

But none of that was really due to moral standards, or Stallman's bickering. It was only about profit. Microsoft dominated the desktop for a few years, long enough to make a stand and still be dominant after 15 years of irrelevance, but that was only because there was nothing better when they started, not by a long distance. However, when they tried to flood the server market, Linux was not only already relevant, but it was better, cheaper and freer. The LAMP stack was already good enough, and the ecosystem was so open that it was impossible for anyone with a closed development cycle to even begin to compete on the same level.

Linux became so powerful that, when Apple re-defined the concept of the smartphone with the iPhone (beating Nokia's earlier attempts by light-years of quality), the Android system was created, evolved and dominated in less than a decade. The power to share made it possible for Google, a non-device, non-mobile company, to completely outperform a hardware manufacturer in a matter of years. If Google had invented a new OS, not based on anything existing, or if they had closed the source, like Apple did with FreeBSD, they wouldn't have been able to compete, and Apple would still be dominant.

Do we need profit?

So, the question is: is this really necessary? Do we really depend on Google (specifically) to free us from the hands of tyrant companies? Not really. If it wasn’t Google, it would be someone else. Apple was, for a long time, the odd one out, and they created immense value for society: they gave us something to aim for and educated the world on what we should expect from mobile devices. But once that was done, the shareable ecosystem learned, evolved and came to dominate. Not because Google is less evil than Apple, but because Android is more profitable than iOS.

Profit here is not just the return on investment you plan to have over a specific number of years, but also the potential of what the evolving ecosystem will allow people to do once you’ve long lost control over it. Shareable systems, including open hardware and software, allow people far down the planning, manufacturing and distribution chain to still make a profit, regardless of your original intentions. One such case is Maddog’s Project Cauã.

By using inexpensive Raspberry Pis, fostering local development and production, and enabling the local community to make a living from all that, Maddog’s project is harnessing the open source work of completely unrelated people to empower the people of a country that much needs empowering. That new class of people, from this and other projects, is what is educating the world’s population, what is allowing people to fight for their rights, and the reason why so many civil uprisings are happening in Brazil, Turkey and Egypt.

Instability

All of that creates instability, social unrest and whistle-blowing gone wrong (Assange, Snowden), and this is a good thing. We need more of it.

It’s only when people feel uncomfortable with how their governments treat them that they get up from their chairs and demand change. It’s only when people are educated that they realise oppression is happening (since there is a force driving us away from the least-energy state, towards enriching the rich), and it’s only when those states are reached that real change happens.

The more educated a society is, the quicker people will rise up against oppression, and the closer we’ll be to Stallman’s utopia. So, whether governments and the billionaire minority like it or not, society will move towards stability, and that stability will settle into local minima. People will rest, and oppression will grow, oscillating until unrest happens again and throws us into yet another minimum state.

Since we don’t want to stay in a local minimum (we want the best solution, not just any solution), getting it close to perfect on the first attempt is not essential. Whether we get close the first time or not, the oscillatory nature of social unrest will not change, and nature will always find a way to bring us closer to the global minimum.

Conclusion

Is it possible to stay in this unstable state for too long? I don’t think so. But it’s not going to be a quick transition, nor an easy one, nor will we get it right on the first attempt.

But more importantly, reaching stability is not a matter of forcing ourselves to move towards a better society; it’s a matter of how dynamic systems behave when there are clear energetic state functions. In physical and chemical systems, that function is energy; in biological systems, it’s the ability to propagate; and in social systems, it’s profit. As sad as it sounds…

Uno score keeper

With spring not coming soon, we had to improvise during the Easter break and play Uno every night. It’s a lot of fun, but it can take quite a while to find a piece of clean paper and a pen that works around the house, so I wondered if there was an app for that. It turns out there wasn’t!

There were several apps to keep card game scores, but each was specific to one game, and they had ads and wanted access to the Internet, so I decided it was worth writing one myself. Plus, it would finally teach me to write Android apps, something I had been putting off for years.

The App

Adding new players
Card Game Scores

The app is not just a Uno score keeper; it’s actually pretty generic. You just keep adding points until someone crosses the threshold, at which point the poor soul is declared a winner or a loser, depending on how you set up the game. Since we’re playing every night, even the 30 seconds I spent re-typing our names was adding up, so I made the app save the last game in Android’s key-value store, so you can retrieve it via the “Last Game” button.
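The generic scoring logic described above can be sketched in a few lines. This is only an illustration of the idea, not the app’s actual code; the class and method names here are hypothetical:

```java
import java.util.HashMap;
import java.util.Map;

// Minimal sketch of a generic score keeper: points accumulate per player
// until someone crosses the threshold. Whether that player is the winner
// or the loser depends on how the game was set up (in Uno, points are bad).
public class ScoreKeeper {
    private final int threshold;
    private final Map<String, Integer> scores = new HashMap<>();

    public ScoreKeeper(int threshold) {
        this.threshold = threshold;
    }

    public void addPoints(String player, int points) {
        // Add to the player's running total, starting from zero.
        scores.merge(player, points, Integer::sum);
    }

    // Returns the first player at or above the threshold,
    // or null while the game is still running.
    public String finished() {
        for (Map.Entry<String, Integer> e : scores.entrySet()) {
            if (e.getValue() >= threshold) {
                return e.getKey();
            }
        }
        return null;
    }
}
```

For example, with a threshold of 100, a player sitting at 60 points keeps the game going, and a round worth 50 more points ends it.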

It’s also surprisingly easy to use (I had no idea), and if you go back and forth inside the app, it clears the game and starts a new one with the same players, so you can play as many rounds as you want. I might add a button to restart (or leave the app) when there’s a winner, though.

I’m also thinking about printing the names in order at the end (from winner to loser), and some other small changes, but the app is good enough as it is to advertise and see what people think.

If you end up using it, please let me know!

Download and Source Code

The app is open source (GPL), so rest assured there are no tricks or money involved. Feel free to download it from here, and get the source code on GitHub.