Computer Science vs Software Engineering

The difference between science and engineering is pretty obvious. Physics is science, mechanics is engineering. Mathematics is (ahem) science, and building bridges is engineering. Right?

Well, after several years in science and far too much time in software engineering (more than I'd care to admit to my kids when they grow up), it seems that people's beliefs about the difference, if there is any, are far more heated than their own logic would imply.

Beliefs

General beliefs that science is more abstract fall apart really quickly when you compare maths to physics. There are many areas of maths (statistics, for example) that are much more grounded in the real world than many parts of physics (like string theory and a good part of cosmology). Nevertheless, most scientists will turn their noses up at anything that resembles engineering.

From different points of view (biology, chemistry, physics and maths), I could see that there isn't a consensus on what people really consider a less elaborate task, not even within the same groups of scientists. But when faced with a rejection by one of their colleagues, the rest usually agree with it. I came to the conclusion that the psychology of belonging to a group was more important than personal beliefs or preferences. One would expect that from young school children, not from professors and graduate students. But regardless of the group behaviour, there is still that feeling that tasks such as engineering (whatever that is) are mundane, mechanical and of less benefit to the greater good than science.

Real World

On the other side of the table, the real world, there are people doing real work. It generally consists of less thinking, more acting and getting things done. You tend to use tables and calculators rather than whiteboards and dialogue, your decisions are based much more on gut feeling and experience than on over-zealously examining every single corner case, and the result of your work is generally more compact and useful to the everyday person.

From that perspective, (what we're calling) engineers have a good deal of prejudice towards (what we're calling) scientists. For instance, the book Real World Haskell is a great pun from people who have one foot on each side of this battle (but are leaning towards the more abstract end of it). In the commercial world, you don't have time to analyse every single detail: you have a deadline, so you do what you can within it and buy insurance for the rest.

Engineers also produce better results than scientists. Their programs are better structured, more robust and more efficient. Their bridges, rockets, gadgets and medicines are far more tested, bullet-proofed and safe than any scientist could ever hope to achieve. It is a misconception that software engineers have the same experience as an academic with the same time spent coding, just as it is a misconception that engineers could as easily develop prototypes that would revolutionise their industry.

But even within engineering, there are tasks and tasks. Even while loathing scientists, the engineers who perform the more elaborate tasks (massive bridges, ultra-resistant synthetic materials, operating systems) consider themselves above the mundane crowd of lesser engineers (building two-bed flats on the outskirts of Slough). So, even here, the more abstract, less fundamental jobs are held in higher regard than the ones more essential and critical to society.

Is it true, then, that the more abstract and less mundane a task is, the better?

Computing

Since the first thoughts on general-purpose computing, there has been this separation between the intangible, generic abstraction and the mundane, mechanical, real-world machine. Leibniz developed the binary numeral system, compared the human brain to a machine and even had some ideas on how to build one, someday, but he ended up creating some general-purpose multipliers (following Pascal's design for the adder).

Leibniz would have thrived in the 21st century. Lots of people in the 20th with the same mindset (such as Alan Turing) did so much more, mainly because of the availability of modern building techniques (perfected over centuries by engineers). Babbage is another example: he worked on his Difference Engine for years and, when he failed (more through arrogance than anything else), his Analytical Engine (far more elegant and abstract) consumed him for another decade. When he realised he couldn't build it in that century, he perfected his first design (reducing its size threefold) and made a great specialist machine… for engineers.

Mathematicians and physicists had to do horrible things (such as astrology and alchemy) to keep their pockets full and, in their spare time, do a bit of real science. But in this century this is less important. Nowadays, even if you’re not a climate scientist, you can get a good budget for very little real applicability (check NASA’s funded projects, for example). The number of people working in string theory or trying to prove the Riemann hypothesis is a clear demonstration of that.

But computing is still not there yet. We're still doing astrology and alchemy for a living and hoping to learn the more profound implications of computing in our spare time. Well, some of us, at least. And that brings me to my point…

There is no computer science… yet

The beginning of science was marked by philosophy and dialogue. Two thousand years later, mankind was still doing alchemy and trying to prove the Sun was the centre of the solar system (and failing). It was only 200 years after that that people really started doing real science, freeing themselves from private patronage and focusing on the science itself. But computer science is far from that point…

Most computer science courses I've seen teach a few algorithms, an object-oriented language (such as Java) and a few modules on current technologies (such as databases, web development and concurrency). Very few of them really teach about Turing machines, group theory, complex systems, other forms of formal logic or alternatives to the current models. Moreover, the number of people doing real science in computing (judging by what appears on arXiv or on news aggregation sites such as Ars Technica or Slashdot) is probably smaller than the number of people working on string theory or wanting a one-way trip to Mars.

So, what do PhDs do in computer science? Well, novel techniques on some old-school algorithms are always a good choice, but the recent favourites have been breaking the security of the banking system or re-writing the same application we all already have, but for the cloud. Even the more interesting dissertations, like memory models in concurrent systems or energy-efficient gate designs, are commercial applications at most.

After all, PhDs can earn a lot more money in industry than by remaining at a university, and doing your PhD on some commercial application can guarantee you a more senior starting position in such companies than something completely abstract. So now, to be honestly blunt, we are all doing alchemy.

Interesting engineering

Still, that's not to say that there aren't interesting jobs in software engineering. I'm lucky to be able to work with compilers (especially because it also involves the amazing LLVM), and there are other jobs in the industry that are as interesting as mine. But all of them are just the higher engineering, the less mundane rocket science (which has nothing of science in it). All in all, software engineering is a very boring job.

You cannot code freely, ignore the temporary bugs, or ask the user to be nice and stick to a controlled input pattern. You need a massive test infrastructure, quality control, standards (which are always tedious) and well-documented interfaces. All of that gets in the way of real innovation; it makes any attempt at innovation in a real company a mere exercise in futility and, at best, a mild source of fun.

This is not exclusive to the software industry, of course. In the pharmaceutical industry there is very little innovation. They do develop new drugs, but using the same old methods. They do need to get new, more powerful medicines out of the door quickly, but the massive amount of testing and regulation they have to follow is overwhelming (which is why they avoid doing it properly as much as they can, so don't trust them!). Nevertheless, there are very interesting positions in that industry as well.

When, then?

Good question. People are afraid of going outside their area of expertise; they feel exposed and ridiculed, and quickly retreat to their comfort zone. The best thing that can happen to a scientist, in my opinion, is to be proven wrong. For me, there is nothing worse than being wrong and not knowing it. Not many people are like that, and the fear of failure is what keeps industries (all of them) in the real world, with real concerns (which is good, actually).

So, as long as industry drives innovation in computing, there will be no computer science. As long as the most gifted software engineers are mere employees in the big corporations, they won't try, in order to avoid failure, as failure could cost them their jobs. I've been to a few companies, and heard about many others, that have a real innovation centre, computer laboratory or research department, and not a single one of them is actually bold enough to change computing at its core.

That is something IBM, Lucent and Bell Labs did in the past, but probably don't do any more these days. It is a nice twist of irony that the company that gets closest to software science today is Microsoft, at its campus in Cambridge. What happened to those great software teams of the 70s? Could those companies really afford real science, or were they just betting their petty cash in case someone got lucky?

I can't answer those questions, nor whether it'll ever be possible to have real science in the software industry. But I do plead with all software people to think about this when they teach at university. Please, teach those kids how to think, to defy the current models, to challenge the universality of the Turing machine, to create a new mathematics and prove Gödel wrong. I know you won't try (out of hubris and self-respect), but they will, and they will fail, and after so many failures something new can come up and make a difference.

There is nothing worse than being wrong and not knowing it…

Zero-cost exception handling in C++

C++ exception handling is a controversial subject. The intricate concepts are difficult to grasp, even for seasoned programmers, but in complexity nothing compares to its implementation. I can't remember any real-world (commercial or otherwise) product that makes heavy use of exception handling. Even the STL itself throws only a couple of exceptions, and only upon terminal failures.

Java exceptions, on the other hand, are much more common. Containers throw them at will (e.g. IndexOutOfBoundsException), creating your own exceptions is very easy and developers are encouraged to use them. Still, some C++ projects' coding standards hint against exceptions (like LLVM's) or specifically disallow them (like Google's). The Android NDK, for example, doesn't even support exceptions (nor does it plan to).

Some of the cons commonly listed are the encouragement of bad style (throwing at will, like in Java), compilation problems with old code and all the gotchas that exception handling can bring to your code, making it a nightmare to understand the real flow of your program and to figure out the critical DON'Ts of exception handling (like throwing from inside a destructor). For a deeper discussion of that subject, read Sutter's book; for the implementation details of exception handling from a compiler's point of view, keep reading.

Three distant worlds

Exception handling is not just a language construct, like if; it's an intricate interaction between three distant worlds. All the complications that exception handling brings us have to be implemented by a three-layer infrastructure: the user code, the run-time library and the compiler. The communication between them has to be absolutely perfect: one bit out of place and the control flow goes south, you start executing random code and your program blows up for good.

Not only are these three very different places, but the code is written by three different kinds of people, at different times. The only formal document that binds them together is the ABI (e.g. the Itanium C++ ABI), which is, to say the least, vague. Luckily, the people who write the run-time libraries are close enough to the people writing the compiler (and hopefully, though not necessarily, the same people writing the ABI).

In the end, if the compiler implementation is correct, the user only has to care about the rules imposed by the C++ standard. But that's not too comforting: the C++ standard is complex and frequently driven by previous implementations rather than by fixed, clear rules of thumb. Most users don't precisely understand why you can't throw from inside a destructor, or why throwing a newly created object ends up calling terminate in their code; they just avoid it at all costs.

Dirty and expensive

The easy exception handling implementation is called SetJump/LongJump (after the setjmp/longjmp calls it relies on). It literally sets up jumps between your code and the exception handling code (created by the compiler). If an exception is thrown, the flow jumps directly to the exception handling code. This is very easy, but extremely expensive. Usually, if you follow good practices, you'd expect your program to spend most of its time doing real computation and only occasionally handling exceptions. But if for every action you create a bunch of exception handling data, you're wasting a lot of resources to check for something that might not even happen during the whole run of your program.

When an exception happens, you need to walk back through the call stack and check whether any of those functions (including the one you've just thrown from) catches that exception. This is called unwinding the stack, and in a SetJump/LongJump implementation it is done by a series of SetJump checks that detect whether a LongJump was taken. The problem is that those checks, and the set-up behind them, happen even if an exception is never thrown at all.
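
As a rough illustration of the idea (a toy sketch, not a real C++ EH runtime; the function names are made up):

    #include <csetjmp>
    #include <cstdio>

    // Toy version of the SetJump/LongJump scheme: the "try" records a jump
    // point up front (paid on every protected region, thrown or not) and the
    // "throw" jumps straight back to it.
    static std::jmp_buf handler;

    static void might_fail(int x) {
        if (x < 0)
            std::longjmp(handler, 1);   // "throw": jump back to the handler
        std::printf("ok: %d\n", x);
    }

    int main() {
        if (setjmp(handler) == 0) {     // "try": save the context (always paid)
            might_fail(42);
            might_fail(-1);             // triggers the jump
            std::printf("never reached\n");
        } else {
            std::printf("caught\n");    // "catch"
        }
        return 0;
    }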

Zero-cost

So, how is it possible to only execute exception handling code when an exception really happens? Simple: let the compiler and the library do it for you. 😉

The whole idea of zero-cost exception handling is to create a set of functions (like __cxa_throw and __cxa_begin_catch) that deal with the exception handling flow by sending it down a different path, the EH path. Under normal circumstances, your code flow is identical with or without exceptions. Normal flow can never reach the EH blocks, and it's only because the compiler knows those blocks are on the EH path that they don't get discarded as unreachable code, since there is no explicit flow between user code and EH code.

Only the run-time library knows how to get there, helped by a table, normally in a read-only data section, specific to exception handling (e.g. the .eh_frame ELF section), created by the compiler from the user's exception flow. This is the core of the interaction between user, compiler and library code.

When an exception is thrown via __cxa_allocate_exception + __cxa_throw, the first function allocates space and prepares the object to be thrown, and the second starts reading the exception handling tables. Each table contains a list of the exceptions that a particular piece of code catches (often sorted by address, so a binary search is easy) and the code that will catch them.
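
To make that concrete, here is a trivial throw/catch and, in the comments, a simplified sketch of what an Itanium-ABI compiler roughly lowers it to (the exact calls and their order vary by compiler and version):

    #include <cstdio>
    #include <stdexcept>

    // throw std::runtime_error("boom");
    //   -> __cxa_allocate_exception(sizeof(std::runtime_error))
    //   -> construct the exception object in that storage
    //   -> __cxa_throw(object, type_info, destructor)   // never returns
    //
    // catch (const std::runtime_error& e) { ... }
    //   -> the unwinder finds the landing pad through the EH tables
    //   -> __cxa_begin_catch(object) ... user code ... __cxa_end_catch()
    int main() {
        try {
            throw std::runtime_error("boom");
        } catch (const std::runtime_error& e) {
            std::printf("caught: %s\n", e.what());
        }
        return 0;
    }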

The routine that decides whether a given frame catches the exception is called the personality routine. Making the personality routine a library call enables you to use multiple zero-cost EH mechanisms in the same program. Normally, the table entry contains a pointer to the routine and some data it'll use while unwinding, but some ABIs (like ARM's EHABI) can add specially crafted tables for their own purposes (like having the unwind code as 16-bit instructions embedded in the data section).

All in all, the general flow is the same: the personality routine finds the EH block (also called the landing pad, unreachable from normal code) and jumps there, where the catching happens. That code takes the exception allocated before the throw and makes it available to the user code (inside the catch block) to do whatever the user requested. If there is no entry in the EH table that catches that exception, the stack is unwound again and the library reads the caller's EH table, until it reaches a table that does catch it, or calls terminate if none does.

Clean-ups

So far so good, but what happens to the local variables and the temporaries (stack-based data)? If they have non-trivial destructors, those must be called, which can chain into a lot of cleaning up. Clean-up code lives partly in the unreachable blocks (for cleaning up during EH) and partly in the normal code, for automatic destruction at the end of a lexical block. Although they do similar things, their purposes are completely different.

Normal-flow clean-up blocks are always called and always clean up all objects. But EH clean-up code has to clean up only the objects that have been constructed so far, in the current lexical block and in every parent up to the root block of the function. If you can imagine deeply nested blocks, creating objects at will, with a throw inside the increment part of a for loop, where the object being incremented has a non-trivial destructor, you get the picture.
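
A toy example of that rule (the class and the output are mine, just to show which destructors must run while the exception unwinds):

    #include <cstdio>

    struct Noisy {
        const char* name;
        explicit Noisy(const char* n) : name(n) { std::printf("ctor %s\n", name); }
        ~Noisy() { std::printf("dtor %s\n", name); }
    };

    int main() {
        try {
            Noisy outer("outer");
            for (int i = 0; ; ++i) {
                Noisy inner("inner");
                if (i == 2)
                    throw i;        // only the objects constructed so far
            }                       // (inner, then outer) get cleaned up
        } catch (int i) {
            std::printf("caught %d\n", i);
        }
        return 0;
    }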

What happens if you throw an exception while another is being caught? Here is where complexity met reality and the boundaries were settled. If you are unwinding one exception, with the flow already in the hands of the run-time library, and another exception is thrown, you can't distinguish which exception you are handling. Since there is room for only one exception being unwound at a time, whenever a second one is thrown, you have to terminate.
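
A minimal sketch of that rule (note the explicit noexcept(false): since C++11, destructors are noexcept by default, so we have to opt out to even attempt this):

    #include <cstdio>
    #include <cstdlib>
    #include <exception>

    struct Evil {
        ~Evil() noexcept(false) {
            throw 1;                // thrown while another exception unwinds
        }
    };

    int main() {
        std::set_terminate([] {
            std::printf("terminate: two exceptions in flight\n");
            std::abort();
        });
        try {
            Evil e;
            throw 2;                // unwinding destroys e, whose destructor
        } catch (...) {             // throws again -> std::terminate
            std::printf("never reached\n");
        }
        return 0;
    }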

Of course, you could argue for creating multiple lanes of exception handling, where each exception is handled separately, so you would know that one exception was thrown while handling another. That might be easy to implement in Java (or other interpreted/bytecode languages), because the VM/interpreter has total control of the language, but as stated earlier, exception handling in C++ is just a contract between three parties: the compiler, the run-time library and the user, and often these three parties are completely different groups in completely different space-time realities.

Conclusion

This is just a very brief, shallow demonstration of how complex exception handling can be. Many other pitfalls appear, especially when dealing with specific user code running on a specific hardware platform with a specific binary configuration of the object file. It's impossible to test every single combination (including room temperature, or the annual rainfall forecast) to make an implementation final, and weird bugs creep up every once in a while.

Enhancing the ABI to allow that is possible, but the complexity of doing so, together with the very scarce use of exception handling in C++, makes it a very unlikely candidate for change in the near future. It would have to be done at so many levels (the C++ standard, ABIs, compilers, run-time libraries) that it's best left alone.

Probabilistic processor and the end of locks

Every now and then, when I’m bored with normal work, I tend to my personal projects. For some reason, that tends to be mid-year, every year. So, here we go…

Old ideas, new style

This time it was an old project: breaking the parallelisation barrier without the use of locks. I knew simple boolean logic on regular Turing machines wouldn't do, as they rely on the fact that the tape is only written by one head at a time. If you have more than one head writing to the same place, the behaviour is undefined. I've looked into several other technologies…

Some were just variations on Turing machines, like molecular, ternary and atomtronic computers. Others were simply interesting toys, like programmable water, but there were two promising classes: quantum computers (taking a bit too long to impress) and non-deterministic Turing machines (NTMs), which could get around some concurrency problems quite well.

Quantum computers are nice, but they're more like vector computing on steroids. You can operate on several qubits at once, maybe even use non-locality to speed up communications, but until all of that fits into something smaller than a small building, we need something other than yet another n-core frying-pan Intel design. That something may be non-deterministic machines…

Non-deterministic Turing machines are exactly like their deterministic counterparts, but the decision process can take different paths given the same input. So, instead of

if (a == 10) do_something;

you have

if (probability(a) > threshold) do_something;

The problem with that is that you still can’t ignore different processes writing to the same data, since if a process happens to read it (even if by chance), the data you get from it is still being concurrently changed, therefore, still undefined. In other words, even if the state change is probabilistic, the data is still deterministic.

So, how can you make sure that, no matter how many processes are writing to a piece of data at random times, you still have meaningful data?

Well, if the reader is not interested in one particular value of that data but, say, only in the probability of the data being a certain value, then it doesn't matter in which order the writes are performed, as long as they're all writing about the same thing. That can be viewed as a non-deterministic register rather than a non-deterministic Turing machine. You can still have a deterministic machine reading and writing non-deterministic registers, and you still won't need locks for those values.

Probabilistic registers

Suppose you want to calculate an integral. You can use several very good deterministic algorithms (like Romberg's method), but if you create producers to get the values and consumers to reduce them to a sum, you'll need a synchronised queue (i.e. locks) to make sure the consumer gets ALL the values. If you lose some values, you might get a completely wrong result.

If, instead, you create random generators that evaluate any (multi-dimensional) function at random points and accumulate the results into a non-deterministic register (say, by updating a running average of the last values, like recharging a capacitor that keeps discharging), then when the consumer process reads that register it gets an average of the previously written values. Because the writes are all random, if you keep only the average of the last N writes, you know roughly how much data has been filled in and what the average value for that block is.

In essence, you're not storing any particular value, but enough information to re-create the original data. Since specific values are not important in this case, the order in which they appear (or whether they appear at all) is completely irrelevant.
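
A minimal sketch of such a register in C++ (everything here is my own illustration: the function x², the decay factor and the thread count are arbitrary choices, and relaxed atomics stand in for the "don't care who wins the race" semantics):

    #include <atomic>
    #include <cstdio>
    #include <random>
    #include <thread>
    #include <vector>

    // "Probabilistic register": many writers fold samples into a running
    // average with no locks. Racing writers may overwrite each other's
    // updates, but since only the distribution matters, lost samples are fine.
    std::atomic<double> reg{0.0};

    void producer(unsigned seed, int samples) {
        std::mt19937 gen(seed);
        std::uniform_real_distribution<double> dist(0.0, 1.0);
        for (int i = 0; i < samples; ++i) {
            double x = dist(gen);
            double f = x * x;             // sample of f(x) = x^2 on [0, 1]
            double old = reg.load(std::memory_order_relaxed);
            // running average, like a capacitor that keeps discharging
            reg.store(old * 0.999 + f * 0.001, std::memory_order_relaxed);
        }
    }

    int main() {
        std::vector<std::thread> threads;
        for (unsigned t = 0; t < 4; ++t)
            threads.emplace_back(producer, t + 1, 1000000);
        for (auto& t : threads) t.join();
        // roughly the integral of x^2 over [0, 1], i.e. about 0.333
        std::printf("estimate: %f\n", reg.load());
        return 0;
    }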

In a way, it's similar to quantum mechanics, where you have the distribution of probabilities but not the values themselves, and with a few tricks you can extract all the information you need from those distributions. It's also similar to what Monte Carlo methods do: take the probabilistic side of the computation to get a rough picture of the problem more quickly than by brute force, when the complexity of the problem is non-polynomial and the problem size is big enough to need a few billion years to calculate exactly.

Here are some examples of a non-det register.

A new logic

Obviously, current deterministic algorithms cannot run on such a machine. All of them are based on the fact that specific data is stored in a specific place. If you compress a file and uncompress it again, it's just impossible to work with probable values: your original file would be completely undecipherable. Boolean logic cannot operate on such registers either: an AND between probabilities or a XOR of two averages doesn't make any sense. You need a different logic, something that can make sense of probable inputs and analogue streams. Fortunately, Bayesian logic is just the thing for such machines.

But the success of Bayesian logic in computing has not been enormous. For one thing, it's not simple to create algorithms using a network of probabilities, and very few mundane algorithms benefit from it enough to be worth the trouble (Optimor Labs is a good example). But with the advent of this new machine, a different logic has to be applied and new algorithms must be developed to make use of it.

Probably by mixing deterministic with non-deterministic hardware and creating a few instructions that use the few non-deterministic registers (or accumulators) available. Most likely they would be used for inter-process communication in the beginning, but as you increase the number of those registers you can start building complex Bayesian networks, or even neural networks, in the registers themselves. That would open the door to much more complex algorithms, probably most of them unpredictable or found by trial and error (a genetic approach?), but with time people would begin to think that way, and that would be the dawn of non-deterministic computing.

Eureka moment

The probabilistic register was a Eureka moment: when I finally implemented it, I saw that my simple Monte Carlo and my neuron model were really just three-line pieces of code. But that didn't last long…

A few days ago I read the news that DARPA has developed a fully featured probabilistic processor, taking random computation to the masses, even using Bayesian logic in place of boolean logic! The obvious place for a "public trial" was spam filtering (though I think the US military is not that interested in filtering spam…), but according to them, the sky is the limit. I do have to agree…

Anyway, I don't have five spare years nor $18m in the bank to implement that project like they did, so in a way I'm glad it turned out to be a good idea after all. I just hope they make good use of it (i.e. not use it for Carnivore 2.0), and that they realise that the military, or anti-spam filters for that matter, are not its only uses. We (they?) should develop a whole new set of algorithms to work with that logic and make good use of hardware that is not a simple progression of current technology (to keep Moore's law going), but could be a game changer in the coming decades.

I could re-focus on defining the basic blocks for that processor’s new logic, but I’m sure DARPA has much better personnel to do that. So, what’s next, then? I’ll tell you next year…

What I don’t miss about Java

Disclaimer: This is not a rant

I spent my last year working with Java, and it was not at all bad. But while Java has its moments and shines, I always felt a bit out of place when using it. In fact, when I moved back to C++, contrary to when I moved to Java, I felt that I actually wasn’t missing much…

Last year, while writing Java at work, I felt compelled more often than usual to write C++ programs at home. Even simple programs that would have done better in a scripting language all came out in C++.

Recently, working full time with C++, I noticed I’m doing very little home development and definitely not doing any Java. So, what did I miss about C++ that I don’t miss about Java?

Expressiveness: While functional languages are much more expressive than C++, there are few languages less expressive than Java. Java encourages child-like programming, such as forcing you to call everything through methods rather than operators. By not having explicit pointers, operator overloading and other dangerous things from C++, you end up repeating yourself quite a lot, and it's very hard to understand the logic afterwards, when all you have is bloatware.

While the Java designers tried to avoid pointers and operators, they couldn't. We still have null references (throwing null pointer exceptions) and the fake operators (like toString(), hashCode(), compareTo()) that can easily be overridden to change the expected behaviour in pretty much the same way as C++ operators, only in "method" notation.

In the end, you can do some bad things, but not all. So, they took away dangers by taking away functionality, without a proper redesign of C++.
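
A small C++ illustration of the point (the Money type is made up): what Java spells as methods, C++ spells as operators.

    #include <iostream>

    struct Money {
        long cents;
        Money operator+(Money o) const { return {cents + o.cents}; }   // Java: a.add(b)
        bool  operator==(Money o) const { return cents == o.cents; }   // Java: a.equals(b)
        bool  operator<(Money o) const { return cents < o.cents; }     // Java: a.compareTo(b) < 0
    };

    std::ostream& operator<<(std::ostream& os, Money m) {              // Java: toString()
        return os << m.cents / 100 << '.' << m.cents % 100;
    }

    int main() {
        Money a{150}, b{275};
        std::cout << (a + b) << " " << (a == b) << " " << (a < b) << "\n";
        return 0;
    }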

Abuse of Object Orientation: While in Ruby everything is an object, in Java almost everything can be. Every class derives from Object silently, but the primitive types do not. So you have the wrapper objects (Integer et al.), which get automatically converted to and from the primitive types in subtle ways that are hard to predict and have a huge performance impact (see auto-boxing).

Not just performance, but the language design is, again, incomplete.

Most OO programmers (mainly Java ones) complain a lot about Perl OO. They say Perl (or Python for that matter) has no proper OO, since everything is a hash and there is no concept of protection.

While Java objects and members are strongly typed, and you have the concept of protection, it’s way too easy to transform Java OO into Perl OO with reflection.

Of course, with C++ you can cast things to void pointers, mess with memory and so on, but fetching objects by name and stripping away private protection in a "safe" way is simply wrong. It's like giving loaded guns to children and telling them where the safety catch is.

Abuse of Design Patterns: Java developers are encouraged to use design patterns, to the point of stupidity. The first thing I learnt about design patterns is that their misuse is actually an anti-pattern.

Properties are important when the requirements change too often, not when they’re static. Factories are used when the objects created may differ or be customized, not for never-changing one-object construction. Still, most libraries (all?) will have Factories, Properties and so on, just for the sake of Design Patterns Compliance ™.

In fact, one of the strengths of Java development is that everyone is encouraged to do things the same way. No Larry Wall style; all factory workers, doing their share of the big picture. While this is good for big, quick projects at companies with high turnover (like consultancies), it's horrible for start-ups or more creative development.

Half-implemented features: Well, templates are an issue. There is no template mechanism in Java. With the so-called generics (like a cheap version of the real medicine), type checking happens only at compile time and the types are erased afterwards; underneath, it's just syntactic sugar over lists of Objects.

That generates a lot of misunderstandings, and a lot of bad code gets written from syntax that looks obviously correct (and would be, if the types were actually being checked all the way down).

Again, an incomplete design for the sake of backward compatibility with old code and old VMs.
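
For contrast, a tiny C++ template (my own example): each instantiation below is a distinct, fully type-checked and separately optimised piece of code, with nothing erased.

    #include <string>
    #include <vector>

    template <typename T>
    T sum(const std::vector<T>& v) {
        T total{};                       // T is known at compile time
        for (const auto& x : v) total = total + x;
        return total;
    }

    int main() {
        std::vector<int> ints{1, 2, 3};
        std::vector<std::string> words{"a", "b", "c"};
        int i = sum(ints);               // instantiates sum<int>
        std::string s = sum(words);      // instantiates sum<std::string>
        // sum(ints) + s;                // mixing them is rejected at compile time
        return i + static_cast<int>(s.size());
    }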

Performance: Running in a JVM is already a bad start for performance, but a good compiler and a well-done JIT environment can take most of that penalty away by intelligently removing unused code, re-optimising the hottest code at run time and using profiling results to improve branch prediction.

While the JVM does some of this, it also introduces several problems that take away the advantage and put it back at the bottom of the class. Auto-boxing and generics create a lot of useless casts, which can be a huge performance hit. Very few Java programmers really care about this, and the compiler doesn't do a good job of reducing the impact, or even of warning the programmer.

I often see Java developers scorn performance concerns. The phrase used most is "a programmer shouldn't care about memory footprint or performance, only about business logic". That, together with the fact that almost all universities now teach Java in undergraduate courses, kinda frightens me a bit.

Strong dependency on IDEs: Borland made quite a lot of money out of C++ IDEs in the 90s, but most C++ programmers I know still use Vim or Emacs. On the other hand, every Java programmer I know uses Eclipse, IntelliJ or something of the sort.

This is not just ease of use (code completion, syntax colouring, hints, navigation), it’s all about speeding up the development process by taking away boiler-plate code generation and refactoring.

IDEs are capable of writing complete pieces of code, refactoring and re-writing things (even behind your back). The programmers stop caring, and the code becomes bloated, unintelligible and forgotten. Not to mention the tendency of IDEs, and of people following the IDE style, to use certain patterns for everything, like using Properties where simple structures would suffice (see Abuse of Design Patterns above).

False Guarantees: The big selling point of Java, besides cheap cross-platform development, is its apparent safety and ease of use. But it isn't safe or easy on so many levels…

The abuses and problems related above are only part of the story. The garbage collector is another…

Some good garbage collection routines can help the initial development of programs, and they do take away from lazy programmers the job of managing their own memory, but the Java garbage collector became a beast, with incomprehensible command-line options, undefined behaviour and a total lack of control over it. You're rendered hostage to its desires.

Not to mention a memory management that won't adapt to the memory actually available. I mean, if you want to make memory management easy for programmers (having gone to all that trouble with a garbage collector), you could have gone a bit further, actually figured out the available memory and used it politely.

Combine those with the fact that pointers and operators are still effectively there, and you have a language that is not that much simpler than C++, at a huge price in performance and weirdness.

Undocumented APIs: Java claims to be platform independent, but it has quite a few available (though undocumented) APIs for platform-specific functionality (like signals). Still, Sun (now Oracle) reserves the right to change them whenever it wishes, and there's little you (or anyone else) can do about it.

And that takes us to the final point:

Standards (or lack thereof): Sun did a nice job on many things (mostly hardware and the OS), but they screwed up badly when it came to supporting the software. There was no standard; IBM and even Microsoft created their own JVMs (which were better than Sun's, by the way) without any final definition of the standard API. In the Java 1.1 days, it was possible to be platform-agnostic yet VM-specific on the same platform!

Conclusion: Java was meant to be an easy language, but it turns out that it’s deceitful enough to be just as bad as any other. And recent changes are making it worse.

Programmers are losing the ability to understand how the machine works, how their languages behave and, more importantly, the implications of their actions.

Why spend time understanding the fiddling some people did with Java when you could spend the same time understanding how machines actually work, and therefore be able to use any programming language you want?

Some argue that Java is the new Cobol and will disappear the same way… I tend to agree…

Barrelfish

Minix seems to be inspiring more operating systems nowadays. Microsoft Research is investing in a micro-kernel (they call it a multikernel, as there are slight differences) called Barrelfish.

Despite being Microsoft's, it's BSD-licensed. The mailing list looks pretty empty, the last snapshot is half a year old and I couldn't find an SVN repository, but it's still more than I would expect from Microsoft anyway.

Multi-kernel

The basic concept is actually very interesting. The idea is to take multi-core, hybrid machines to the extreme and still be able to run a single OS across them, pretty much the same way some cluster solutions do (OpenMPI, for instance), but on a single machine. The idea is far from revolutionary: it's a natural evolution of the multi-core trend combined with current cluster solutions (available for years) and a fancy OS design (the micro-kernel) that everyone learns in a CS degree.

What's the difference, then? For one thing, the idea is to abstract everything away. CPUs will be just another piece of hardware, like network or graphics cards. The OS will have the freedom to ask the GPU to do MP floating-point calculations, for instance, if it feels that will benefit the total execution time. It'll also be able to accept different CPUs in the same machine, Intel and ARM for instance (like the Dell Latitude Z600), or different GPUs, nVidia and ATI, and still use all the hardware.

With Windows, Linux and Mac today, you either use the nVidia driver or the ATI one. You also normally don't have hybrid-core machines, and you absolutely can't recover if one of the cores fails. This is not the case with cluster solutions, and Barrelfish's idea is to emulate precisely that. In theory, you could do energy control (enabling and disabling cores), crash recovery when one of the cores fails but the others don't, or plug-and-play graphics or network cards and even different CPUs.

Imagine you have an ARM netbook that is great for browsing, but you want to play a game on it. You get your nVidia card and a coreOcta 10GHz USB4 and plug them in. The OS recognises the new hardware, loads the drivers and lets you play your game. Battery life goes down, so once you're done with the game, you just unplug the cards and continue browsing.

Scalability

So, how is it possible for Barrelfish to be that malleable? The key is communication. Shared memory is great for single-process, threaded code and acceptable for multi-process OSs with a small number of concurrent processes accessing the same region of memory. Most modern OSs can handle many concurrent processes, but those rarely access the same data at the same time.

Normally, processes are single-threaded or have a very small number of threads (dozens) running. More than that is so difficult to control that people usually fall back on other means, such as client/server, or just go out and buy more hardware. In clusters, there is no way to use shared memory. For one thing, accessing another computer's memory over the network is just plain stupid, but even if you use shared memory within each node and client/server between nodes, you're bound to have trouble. This is why MPI solutions are so popular.

In Barrelfish there's no shared memory at all. Processes communicate with each other via messages and duplicate content (rather than sharing it). There is an obvious associated cost (memory and bus traffic), but the lock-free semantics are worth it. It also gives Barrelfish another freedom: to choose a communication protocol generic enough that each piece of hardware is completely independent of all the others, and plug-and-play becomes seamless.
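
As a rough user-space analogy (my own toy, nothing to do with Barrelfish's actual IPC interfaces), a lock-free channel where messages are copied rather than shared might look like this:

    #include <array>
    #include <atomic>
    #include <cstddef>
    #include <optional>

    // Single-producer/single-consumer channel: messages are copied in and
    // copied out; the two sides share nothing mutable except the indices.
    template <typename T, std::size_t N>
    class Channel {
        std::array<T, N> buf_;
        std::atomic<std::size_t> head_{0};   // next slot to read
        std::atomic<std::size_t> tail_{0};   // next slot to write

    public:
        bool send(const T& msg) {            // producer side only
            std::size_t t = tail_.load(std::memory_order_relaxed);
            std::size_t next = (t + 1) % N;
            if (next == head_.load(std::memory_order_acquire))
                return false;                // full: caller retries or drops
            buf_[t] = msg;                   // copy, don't share
            tail_.store(next, std::memory_order_release);
            return true;
        }

        std::optional<T> receive() {         // consumer side only
            std::size_t h = head_.load(std::memory_order_relaxed);
            if (h == tail_.load(std::memory_order_acquire))
                return std::nullopt;         // empty
            T msg = buf_[h];                 // copy out
            head_.store((h + 1) % N, std::memory_order_release);
            return msg;
        }
    };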

Challenges

It all seems fantastic, but there's a long road ahead. First, message passing scales much better than shared memory, but nowadays there aren't enough cores in most machines to make it worthwhile. Message passing also introduces other problems that are not easily solvable: bus traffic and storage requirements increase considerably, and messages are not that generic in nature.

Some companies are famous for not adhering to standards (Apple comes to mind), so a standard hardware IPC framework would be quite hard to achieve. And even with pure software IPC APIs, different companies will still use slightly modified APIs to suit their specific needs, and complexity will rise exponentially.

Another problem is where the hypervisor will live. Having a distributed control centre is cool and scales amazingly well, but its complexity also scales. In a hybrid-core machine, you have to run different instructions, in different orders, with different optimisations and communication. Choosing one core to deal with the scheduling and administration of the system is much easier, but it leaves you with a single point of failure.

Finally, going the multi-hybrid-independent route is way too complex, even for a several-year project with lots of funding (M$) and clever people working on it. After all, if the micro-kernel were really that useful, Tanenbaum would have won the argument with Linus. But the future holds what the future holds, and reality (as well as hardware and some selfish vendors) can change. The multikernel might be possible, and even easier to implement, in the future.

This seems to be what the Barrelfish team is betting on, and I'm with them on that bet. Even if it fails miserably (as Minix did), some of the concepts could still be used in real-world operating systems (as Minix's were), whatever that will mean in 10 years. Being serious about parallelism is the only way forward; sticking with 40-year-old concepts definitely is not.

I’m still aspiring for non-deterministic computing, though, but that’s an even longer shot…

Smart Grid Privacy

I have recently joined the IETF Smart Grid group to see what people were talking about and to put to rest my fears about security and privacy. What I saw was a bunch of experts discussing the plethora of standards that could be applied (very important), but few people seemed much interested in the privacy issue.

If you look at the IEEE page on Smart Grids, besides the smart generation / distribution / reception (very important), there is a paragraph on the interaction between the grid and the customers, which is very careful not to mention invasive techniques that would allow the grid to control customers' appliances:

“Intelligent appliances capable of deciding when to consume power based on pre-set customer preferences.”

Here, they focus on letting the appliances decide what will be done to save power, not the grid or the provider. Later on, in the same paragraph:

“Early tests with smart grids have shown that consumers can save up to 25% on their energy usage by simply providing them with information on that usage and the tools to manage it.”

Again, stressing that the providers will only "provide [the customer] with information". In other words, the grid is smart up to the smart meter (which is controlled by the provider); inside people's houses, it's the appliances that have to be smart. One pertinent comment from Hector Santos in the IETF group:

“Security (most privacy) issues, I believe, has been sedated over the years with the change in consumer mindset. Tomorrow (and to a large extent today) generation of consumers will not even give it a second thought. They will not even realize that it was once considered a social engineering taboo to conflict with user privacy issues.”

I hate to be a pessimist, but there is a very important truth in this. Not only are people allowing systems to store their data for completely different reasons, but they don't care whether the owner of the system will distribute their information or not. I myself, always paranoid, have signed contracts with providers knowing that they would use and sell my data to third parties; British Telecom is one good example. He continues:

“Just look how social networking and the drive to share more, not less has changed the consumer mindset. Tomorrow engineers will be part of all this new mindset.”

There is no social engineering any more, not like there used to be. Who needs to steal your information when it's already there, on your Facebook? People share willingly, and a lot of them know what problems it may cause, but the benefit, for them, is greater. Moreover, millions have bought music, games and films with DRM, letting a company control what they do, watch or listen to. How many Kindles have been bought? How many iPhones? People don't care what's going on as long as they have what they want.

That is the true meaning of sedated privacy concerns. It's a very distorted kind of selfishness, where you don't care about yourself, as long as you are happy. If that makes no sense to you, don't worry: it makes no sense to me either.

Recently, the Future of Privacy Forum published an excellent analysis (via Ars) of smart grid privacy. Several concepts whose dangers are easy to understand have become things nobody thinks about, or that are even considered silly worries, given that no one cares anyway.

An evil use of a similar technology is "Selectable Output Control". Just as with the Kindle, the media companies want to make sure you only watch what you pay for. It may seem fair, and even cheaper, as it allows "smart pricing", like some smart-grid technologies.

But we have all seen what Amazon did to Kindle users, or what Apple did with its App Store: taking down content without warning, removing things you paid for from your device, allowing or disallowing you to run applications or content on your device as if you hadn't paid enough money to own the device and its contents.

In the end, "smart pricing" is like a tax cut: they reduce tax A but introduce taxes B, C and D, which double the amount of tax you pay. Of course, you only knew about tax A and went happily about your life. All in all, nobody cares whom they pay or how much, as long as they can get the newest fart app.

Linux is whatever you want it to be

Normally Linux Magazine has great articles: impartial, informative and highly technical. Unfortunately, not always. In a recent article, some perfectionist zealot stated that Ubuntu makes Linux look bad. I couldn't disagree more.

Ubuntu is a fast-paced, fast-adapting Linux. I was one of the early adopters and I have to say that most of the problems I had with the previous release have been fixed. Some bugs slipped through, of course, but they were reported and quickly fixed. Moreover, Ubuntu has support from hardware manufacturers, such as Dell, and that makes a big difference.

Linux is everything

Linux is excellent for embedded systems, great for network appliances, wonderful for desktops, irreplaceable as a development platform, marvellous on servers and the only choice for real clusters. It also sucks when you have to hunt down configuration manually, it's horrible for newbies, it breaks whenever a new release is out and it takes longer to get new software (such as Firefox), but it also helps a lot with package dependencies, something that neither Mac nor Windows has managed to do properly over the past decades.

Linux is as great as any piece of software could be, and as horrible as every operating system released since the beginning of time. Some Linux distributions are stable, others less so. Debian takes 10 years to release and, when it does, the software it contains is already 10 years old. Ubuntu tries to be a bit faster, but that obviously breaks a few things. If you're fast enough at fixing them, the early adopters will be pleased that they helped the community.

“Unfortunately what most often comes is a system full of bugs, pain, anguish, wailing and gnashing of teeth – as many “early” adopters of Karmic Koala have discovered.”

Like any piece of software, open or closed, free or paid, it takes time to mature. A real software engineer should know better: a system is only fully tested when it reaches the community, the user base. Google has used its own users (your granny too!) as beta testers for years and everyone seems to understand it.

Debian zealots hate Red Hat zealots, and both hate Ubuntu zealots, who probably hate other zealots somewhere else. It's funny to see how greatly opinions on what Linux really is vary from one zealot clan to another. All of them have great knowledge of what Linux is made of, but few seem to understand what Linux really is. Linux, or rather GNU/Linux, is a big bunch of software tied together by so many different points of view that it's impossible to state in less than a thousand words what it really is.

“Linux is meant to be stable, secure, reliable.”

NO, IT'S NOT! Linux is meant to be whatever you make of it; that's the real beauty. If Canonical thought it was ready to launch, it's because they thought that whatever bugs passed the safety net were safe enough for users to catch and report, which we did! If you're not an expert, wait for the system to cool down. A non-expert won't be an "early adopter" anyway, that's for sure.

Idiosyncrasies

Each Linux has its own idiosyncrasies; that's what makes it powerful, and painful. The way Ubuntu updates/upgrades itself is particular to Ubuntu. Debian, Red Hat, SUSE: all of them do it differently, and that's life. Get over it.

“As usual, some things which were broken in the previous release are now fixed, but things which were working are now broken.”

One truism after another. There is no new software without new bugs. There is no software without bugs. What was broken was known; what is new is unknown. How can someone fix something they don't know about? When the users eventually tested it, found it broken and reported it, it got fixed! Isn't that simple?

“There’s gotta be a better way to do this.”

No, there isn’t. Ubuntu is like any other Linux: Like it? Use it. Don’t like it? Get another one. If you don’t like the way Ubuntu works, get over it, use another Linux and stop ranting.

Red Hat charges money, Debian has uber-stable decade-old releases, Gentoo is for those who have a lot of time on their hands, and so on. Each has its own particularities, and each is good for a different set of people.

Why Ubuntu?

I use Ubuntu because it's easy to install, use and update. The rate of bugs is lower than in most other distros I've used, and the rate of updates is faster and more reliable than in some others. It's a good balance for me. Is it perfect? Of course not! There are lots of things I don't like about Ubuntu, but that won't make me use Windows 7, that's for sure!

I have friends who use SUSE, Debian, Fedora and Gentoo, and they're all about as happy as I am: not too much, but not too little. Each has its problems and solutions; you just have to choose the ones that suit you best.

Gtk example

Gtk, the graphical toolkit behind Gnome, is very simple to use. It doesn't have an all-in-one IDE such as KDevelop, which is very powerful and complete, but it features a simple and functional interface designer called Glade. Once you have the widgets and signals in place, filling in the blanks is easy.

As an example, I wrote a simple dice-throwing application, which took me about an hour from installing Glade to publishing it on the website. Basically, my route was to apt-get install glade, open it, create a few widgets, assign some callbacks (signals) and generate the C source code.

After that, the file src/callbacks.c contains all the signal handlers you have to implement. Adding just a bit of code and browsing this tutorial to get the function names was enough to get it running.
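
To give an idea of what such a handler boils down to, here is a hand-written sketch of the same idea (my own, not the Glade-generated code, and assuming GTK+ 3): a button whose "clicked" signal updates a label with a dice roll.

    #include <gtk/gtk.h>
    #include <stdlib.h>

    /* "clicked" signal handler: roll a six-sided die and show the result */
    static void on_roll_clicked(GtkButton *button, gpointer user_data)
    {
        char text[32];
        g_snprintf(text, sizeof(text), "You rolled a %d", (rand() % 6) + 1);
        gtk_label_set_text(GTK_LABEL(user_data), text);
    }

    int main(int argc, char *argv[])
    {
        gtk_init(&argc, &argv);

        GtkWidget *window = gtk_window_new(GTK_WINDOW_TOPLEVEL);
        GtkWidget *box    = gtk_box_new(GTK_ORIENTATION_VERTICAL, 6);
        GtkWidget *label  = gtk_label_new("Press Roll");
        GtkWidget *button = gtk_button_new_with_label("Roll");

        gtk_box_pack_start(GTK_BOX(box), label, TRUE, TRUE, 0);
        gtk_box_pack_start(GTK_BOX(box), button, FALSE, FALSE, 0);
        gtk_container_add(GTK_CONTAINER(window), box);

        g_signal_connect(button, "clicked", G_CALLBACK(on_roll_clicked), label);
        g_signal_connect(window, "destroy", G_CALLBACK(gtk_main_quit), NULL);

        gtk_widget_show_all(window);
        gtk_main();
        return 0;
    }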

Glade generates all the autoconf/automake files, so it was extremely easy to compile and run the mock window right at the beginning. The code I added afterwards was less than I would have written for a console-based application doing the same thing. Also, because of the code generation, I was afraid it would overwrite my already-modified callbacks.c when I changed the layout, but I was really pleased to see that Glade was smart enough not to mess with my changes.

My example is not particularly good-looking (I'm terrible at design), but that wasn't the intention anyway. It's been seven years since I last built a graphical interface myself, and I had never done anything with Gtk before, so it shows how easy the library is to use.

Just bear in mind a few concepts of GUI design and you’ll have very little problems:

  1. Widget arrangement is not normally fixed by default (to allow window resizing). So work out how tables, frames, boxes and panes work (which is a pain) or use fixed positions and disallow window resizing (as I did),
  2. Widgets don’t do anything by themselves, you need to assign them callbacks. Most signals have meaningful names (resize, toggle, set focus, etc), so it’s not difficult to find them and create callbacks for them,
  3. Side effects (numbers appearing at the press of a button, for instance) are not easily done without global variables, so don't be picky about that from the start. Work your way towards a proper global context later, when the interface is stable and working (I didn't even bother).

If you’re looking for a much better dice rolling program for Linux, consider using rolldice, probably available via your package manager.

SLC 0.2.0

My pet compiler is now sufficiently stable for me to advertise it as a product. It should deal well with the most common cases if you follow the syntax, as there are some tests to assure minimum functionality.

The language is very simple, called "State Language". It has only global variables and static states. The first state is executed first; all the others run only when you call goto state;. If you exit a state without branching, the program quits. This behaviour is consistent with the State pattern, which is why it was implemented this way. You can solve any computational problem using state machines, so this new language should, in principle, be able to tackle them all.
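
As a rough C++ rendering of that execution model (not SLC syntax, just my own illustration of the semantics described above):

    #include <cstdio>

    // Global variables and a fixed set of states; "goto state" becomes
    // returning the next state, and returning Quit ends the program.
    enum class State { Init, Loop, Done, Quit };

    int counter = 0;                    // a "global variable"

    State run(State s) {
        switch (s) {
        case State::Init:
            counter = 0;
            return State::Loop;         // goto Loop;
        case State::Loop:
            counter = counter + 1;      // only simple binary operations
            if (counter < 5) return State::Loop;
            return State::Done;
        case State::Done:
            std::printf("counter = %d\n", counter);
            return State::Quit;         // no branch: the program quits
        default:
            return State::Quit;
        }
    }

    int main() {
        for (State s = State::Init; s != State::Quit; s = run(s)) {}
        return 0;
    }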

The expressions are very simple: only binary operations, with no precedence. Grouping is done with brackets, and only the four basic arithmetic operations can be used. This is intentional, as I don't want the expression evaluator to be so complex that the code becomes difficult to understand.

Like all the code I write in my spare time, this one has an educational purpose. I learn by writing and hopefully teach by providing the source, comments and posts, and by leaving it available on the internet so people can find it through search engines.

It should work on any platform you can compile it for (currently only Linux and Mac binaries are provided), but the binaries are still huge (several megabytes) because they have all the LLVM libraries statically linked in.

I'm still working on it and will update the status here at every .0 release. I hope to have binary operations in if statements, string printing and all PHIs calculated for the next release.

The LLVM compilation infrastructure

I’ve been playing with LLVM (Low-Level Virtual Machine) lately and have produced a simple compiler for a simple language.

The LLVM compilation infrastructure (much more than a simple compiler or virtual machine) is a collection of libraries, tools and programs that allows one to create simple, robust and very powerful compilers, virtual machines and run-time optimizations.

Like GCC, it's roughly separated into three layers: the front-end, which parses the files and produces the intermediate representation (IR); the optimization layer, which acts on the language-independent IR; and the back-end, which turns the IR into something executable.

The main difference is that, unlike GCC, LLVM is extremely generic. While GCC is struggling to fit broader languages into its strongly C-oriented IR, LLVM was created with a very extensible IR, carrying a lot of information on how to represent a plethora of languages (procedural, object-oriented, functional, et cetera). This IR also carries information about possible optimizations, like GCC's, but to a deeper level.
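
To give a feel for how a front-end talks to that IR, here is a minimal sketch using the C++ API (assuming a reasonably recent LLVM; the API has moved around over the years), building and printing the IR for a function that adds two integers:

    #include "llvm/IR/IRBuilder.h"
    #include "llvm/IR/LLVMContext.h"
    #include "llvm/IR/Module.h"
    #include "llvm/IR/Verifier.h"
    #include "llvm/Support/raw_ostream.h"

    // Emit IR equivalent to: int add(int a, int b) { return a + b; }
    int main() {
        llvm::LLVMContext ctx;
        llvm::Module mod("demo", ctx);
        llvm::IRBuilder<> builder(ctx);

        auto *i32 = builder.getInt32Ty();
        auto *fnType = llvm::FunctionType::get(i32, {i32, i32}, /*isVarArg=*/false);
        auto *fn = llvm::Function::Create(fnType, llvm::Function::ExternalLinkage,
                                          "add", mod);

        auto *entry = llvm::BasicBlock::Create(ctx, "entry", fn);
        builder.SetInsertPoint(entry);
        builder.CreateRet(builder.CreateAdd(fn->getArg(0), fn->getArg(1), "sum"));

        llvm::verifyFunction(*fn);
        mod.print(llvm::outs(), nullptr);   // dump the textual IR
        return 0;
    }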

Another very important difference is that, in the back-end, not only are code generators for many platforms available, but also just-in-time compilers (somewhat like JavaScript engines), so you can run, change, re-compile and run again, without even quitting your program.

The middle layer is where the generic optimizations are done on the IR, so it's language-independent (as all languages are converted to IR). But that doesn't mean optimizations are done only at that step; all first-class compilers optimize strongly from the time they open the file until they finish writing the binary.

Parser-level optimizations normally include dead-code removal and constant-expression folding, among others, while the most important optimizations in the back-end involve instruction replacement, aggressive register allocation and the exploitation of hardware features (such as special registers and caches).

But LLVM goes beyond that: it can optimize at run time, even after the program is installed on the user's machine. LLVM can keep its information (and the IR) together with the binary. When the program is executed, it is profiled automatically and, when the computer is idle, the code is re-optimized and re-compiled. This optimization is per user, and it means that two copies of the same software can end up quite different from each other, depending on how each user uses them. Chris Lattner's paper about it is very enlightening.

There are quite a few very important people and projects already using LLVM, and although there is still a lot of work to do, the project is mature enough to be used in production environments or even completely replace other solutions.

If you are interested in compilers, I suggest you take a look at their website… It is, at the very least, mind-opening.