Who’s afraid of the big bad code?

What would Bruce Schneier say about the magic list that the NSA is putting together with Microsoft and Symantec of the 25 biggest errors in code that normally lead to a security flaw.

Don’t get me wrong, putting out a list of bad practices is a fantastic job, that’s for sure. It makes programmers more aware of the dangers, and as the article says itself, newbies can learn from experience before getting into a new field.

But the way that (lay) people take it makes it so magical that the practical side of such list is greatly reduced.

Order and size of the list

I understand that the order must have some sense, but which? Is it ordered by number of attacks in the last 12 months? Or by the sum of all reported losses caused by them? Or by number of such errors found in common code (on those companies’ code, of course)? Or by any other subjective “importance” factor from a bunch of “Security Experts”?

Also, why 25? Why not 30? Who says that the 25th is so important to show up in the list and not the 26th?

Real-world

We programmers know about most of them, know the problems they pose and normally how to fix them. We often want to fix them, but that normally requires some refactoring and now it’s time to implement those features that our client needs for the demo, right? We can think about that later… can we? Will we?

Than, NSA decides to make this a priority for the country and claim it as a national security problem. Big companies like fancy terms, and would strive to adopt any new standard that shows up in the market.

Then, comes down the VP of engineering and say:

“We need to make sure every programmer knows how to write code that is free of the top 25 errors.”

Done, he can put the GIF image from the NSA saying his company’s software is secure against all odds, according to the NSA and DHS.

Now, coders and technicians, tell me: Would any editor, IDE or compiler ever be able to spot those errors with 100% accuracy?

“Then we need to make sure every programming team has processes in place to find and fix these problems [in existing code] and has the tools needed to verify their code is as free of these errors,”

Of course not, but they will try, and Microsoft will put a beta on Visual C++ and other companies will tell their clients that their software is being tested with the new product and the clients will buy, after all, who are them to say anything about that matter?

Protect against who?

Now, after so much time and effort, 30+ companies and government departments working hard to come up with a (quite good) list of the most common errors that lead to security flaws for what?

“The real dedicated serial attacker will probably find a way in even if all these errors were removed. But a high school hacker with malicious intent – ankle-biters if you will – would be deterred from breaking in.”

WHAT?!?! All that to stop script-kids? For heavens’ sake, I thought they were serious on that… Well, maybe I expected too much from the NSA… again…

(Note: quotes from original article, ipsis litteris)

What’s new on Windows 7?

After the buzz on Windows 7 I decided to take a look on a video posted (apparently by Microsoft itself) on youtube.

I was expecting to hear about the new Operating System, only to find out that everything that matters to MS is the Window Manager. No memory or CPU consumption reports, no filesystem or network configuration structure, nothing.

Anyway, I have to talk about the interface then… Now, is it only me or they’re copying what Gnome/Compiz is doing? Because, copying Apple it’s obvious, even Gnome/Compiz is at some extent.

First, window transparency and ALT-TAB with window thumbnails to select with your mouse. Done. Second, dragging icons to the taskbar, what’s new on THAT?!

But now, “something really cool we’re putting in Windows 7 is called ‘snap-to’ “?!?!? If I recall right, the graphical interface from the PARC team already had tiled/cascade window arrangement and undoubtedly Microsoft used that on Windows 3, so how is that cool in any way?

Well, they better have a much stabler environment and much lower footprint, otherwise they won’t have nothing really serious to show.

Vista is no more

It still hasn’t gone to meet it’s maker, but it was also not as bad as it could’ve been.

After Windows Vista was launched with more PR and DRM than any other, Microsoft hoped to continue its domination of the market. Maybe afraid of the steep Linux increase in desktops (Ubuntu has a great role in that) and other market pressures, they’ve rushed out Vista with so many bugs and security flaws, so slow and with such a big memory and CPU footprint that not many companies really wanted to change their whole infrastructure to see it drawn a little later.

China government ditched it for XP because it was not stable enough to run the Olympics, only to find out that the alternative didn’t help at all.

All that crap helped a lot Linux (especially Ubuntu) jump on the desktop world. Big companies shipping Linux on lots of desktops and laptops, all netbooks with Linux as primary option, lay people now using Linux as they would use any other desktop OS. So, is it just because Vista is so bad? No. Not at all. Linux got really user friendly over the last five to ten years and it’s now as easy as any other.

Vista is so bad that Microsoft had to keep supporting Windows XP, they’re rushing again with Windows 7 and probably (hopefully) they’ll make the same mistakes again. It’s got so bad that the Free Software Foundation’s BadVista campaign is officially is closing down for good. For good as in: Victory!

Yes, victory because in one year they could show the world how bad Vista really is and how good the other opportunities are. Of course, they were talking about Linux and all the free software around, including the new gNewSense platform they’re building, but the victory is greater than that. The biggest message is that Windows is not the only solution to desktops, and most of the time, it’s the worst.

In conjunction with the DefectiveByDesign guys, they also showed how Vista (together with Sony, Apple, Warner et al) can completely destroy your freedom, privacy and entertainment. They were so successful in their quest that they’re closing doors to spend time (and donors’ money) in more important (and pressing) issues.

Now, they’re closing down but that doesn’t mean that the problem is over. The idea is to stabilise the market. Converting all Windows and Mac users to Linux wouldn’t be right, after all, each person is different. But the big challenge is to have users that need (or want) a Mac, to use a Mac. Who needs Windows and can afford to pay all extra software to protect your computer (but not your privacy), can use it. For developers the real environment is Unix, they should be able to get a good desktop and good development tools as well. It’s, at least, fair.

But for the majority of users, what they really want is a computer to browse the web, print some documents, send emails and for that, any of the three is good enough. All three are easy to install (or come pre-installed), all three have all the software you need and most operations and configurations are easy or automatic. It’s becoming more a choice of style and design than anything else.

Now that Apple got rid of all DRM crap, Spore was a fiasco so EA is selling games without DRM, the word is getting out. It’s a matter of time it’ll be a minor problem, too. Would DefectiveByDesign retire too? I truly hope so.

As an exercise to the reader, go to Google home page and search for the terms: “windows vista“. You’ll see the BadVista website in the first page. If you search for “DRM” you’ll also see the DefectiveByDesign web page as well. This is big, it means that lots and lots of websites are pointing to those websites when they’re talking about those subjects!

If you care enough and you have a Google user and is using the personalised Google search, you could search for those terms and press the up arrow symbol on those sites to make them go even higher in the rank. Can we make both be the first? I did my part already.

Intel’s Game Demo Contest announce winners

…and our friend Mauro Persano won in two categories: 2nd on Intel graphics and 5th on best game on the go.

The game, Protozoa, is a retro Petri-dish style frenetic shooting-the-hell-out-of-the bacteria, virii and protozoa stuff that comes in your way. You can play with a PS2 (two-analogue sticks) control, one for the movements and other for the shooting, or just use the keyboard. The traditional timed-power-up and megalomaniac explosions raise even more the sense of nostalgia.

You can download the latest Windows version here but don’t worry, it also runs pretty fine with Wine.

Have fun!

Numerical methods package in C++

I still code in my spare time and for a while I’ve been gathering some numerical methods I did at university in an easy-to-understand generic C++ package. Despite being easy to understand, I also tried to implement the best method I knew for each problem.

The root finding algorithm is based on Brent’s method which is, in turn, based on the secant method. If everything fails, it throws an exception and you can use the safe bisection method.

The integration is using the Romberg’s method, which is a further extrapolations of Simpson’s method which is already much better than the trapezoid basic method.

For Interpolation I’m using the Natural Cubic Spline but would like to implement other types of splines (like complete, periodical, clamped etc). The interpolation is working, but I couldn’t managed to get the coefficients right yet.

Other codes I have are Monte Carlo, Runge-Kutta and Markov Chain (this one using boost graph library for C++) and will be integrated soon. I’ll let you know when it’s done.

Book: Flat and Curved Space Times

The first time I read this book was during my special relativity course at university. I couldn’t understand a thing the teacher was saying (probably because his explanations were always: “you won’t be able to understand that”) and I needed to replace a 35% grade I got in the first exam to complete the course.

Well, hopeless as I was, headed to the library in search of a magical book (other classmates were helpless as well) and found this one. The magic in it is that, instead of trying to force the Lorentz transformations down the throat first and then explain the basic principles of relativity, it does it by simply showing the topology of the space and assuming that the speed of light is constant (pretty much the same path Einstein took in the first place).

So, the first chapter has no equations whatsoever, only graphics with light waves going back and forth and he derives the light-cones automagically from it, what happens to the “world” at high speeds and how does it affect our senses of reality. It goes on for all kinematic principles only using Newton equations and gamma. Lorentz transformations only appear in the fourth chapter.

After that, not only I could understand relativity as a whole, but I also got 90% grade on the final exam! It’s an old (88) book but time has no meaning for a very good book, especially for a subject that hasn’t changed that much in the last decades.

I recommend it to physics-wannabe as well as lay people with little background in math, and if your teacher is as hopeless as mine was, ignore him and read this book.

Click here for the US version.

Silly projects of the week

Last week I did two silly but still quite funny projects: word search on protein sequences and chat bot using markov chains.

Word search

Searching for similar sequences among the known proteins to understand evolutionary paths and function similarities is a powerful algorithm called BLAST. Following the same lines I spent a few minutes to develop a similar (but not quite the same) algorithm to search for all dictionary words on all UniProt sequences.

The algorithm will look into a dictionary (say /usr/share/dict/words) and for every sequence it’ll find the words in it, allowing a few (no more than 2) amino-acids to be between the letters. Than it gives a weight to the match based on the size of the word and the “added length” due to additional amino-acids in between. You can also restrict the number of letters on the word list to avoid too short or long words.

The results for 7 to 9 characters words in English searched only on Swiss-Prot entries are:

STRANGE  1.4
CHARADE  1.75
SLEDGED  1.75
MEALIER  1.75
DEMEANS  2.33333
DETAINS  2.33333

Markov bot

I knew a markov chain could have true wisdom but I never tried to make it actually talk to anyone until a friend told me about MegaHAL (much better than my markov model, of course). I decided to give it a try and got the XMPP bot from the Python examples and plugged in my markov program on all phrases spoken to it.

In the beginning it was awful, of course, it knew nothing. But with time (and hard work from some friends to actually teach him about leisure, programming, physics etc) it could babble a few coherent words once in a while.

The results are complete rubbish, below is a sample of the complete non-sense conversation I had with it now:

rengolin: how are you doing? have you managed to find the answer to Life, The Universe and Everything?
oraculoids: with an amazing company makes the basic points of some operator
rengolin: operator or not, you must come to reality and sense the danger there is in talking like that after such a complex and meaningful question
oraculoids: dead as an interface to follow an animal
rengolin: whatever animal you prefer, I would say but it seems that you are now driving the conversation
oraculoids:
rengolin: aha! I know you didn’t have answers to everything!
oraculoids: how is in between and pray
rengolin: I don’t pray too much, really, I’m not that kind of a person anyway…
oraculoids: but at the conversation it’t fall into them!
rengolin: I would never say that, especially not with you my friend
oraculoids: A design pattern is a practical laywer
rengolin: lawyers have no purpose other than screwing up the poor
oraculoids: hi bot
rengolin: I am not a bot
oraculoids: field of some of life, love them!

Quite stupid, I might say, but still I had a lot of fun doing it and that’s what matters… 😉 In spite of that, though, megaHAL is quite an impressive program and BLAST is a very powerful search mechanism.

RDBMS, to rewrite or not to rewrite… I got confused…

Mike Stonebreaker (Ingres/Postgres) seems to be confused as well…

First he said Google’s Map/Reduce was “Missing most of the features that are routinely included in current DBMS”, but earlier he said to ditch RDBMS anyway because “modern use of computers renders many features of mainstream DBMS obsolete”.

So, what’s the catch? Should we still use RDBMS or not? Or should we still develop technologies based on relational databases while Mike develops himself the technology of the new era? Maybe that was the message anyway…

My opinion:

MapReduce is not a step backwards, there are sometimes when indexing is actually slower than brute-force. And I’m not saying that on insert time the indexes have to be updated and so on, I’m saying in the actual search for information, if the index is too complex (or too big) it might take more time to search through the index, compute the location of the data (which might be anywhere in a range of thousands of machines), retrieve the data and later on, sort, or search on the remaining fields.

MapReduce can effectively do everything in one step, while still in the machine and return less values per search (as opposed to primary key searches first) and therefore less data will be sent over the network and less time will be taken.

Of course, MapReduce (as any other brute-force methods) is hungry for resources. You need a very large cluster to make it really effective (1800 machines is enough :)) but that’s a step forward something different from RDBMS. In the distributed world, RDBMS won’t work at all, something have to be done and Google just gave the first step.

Did we wait for warp-speed to land on the moon?! No, we got a flying foil crap and landed on it anyway.

Next steps? Many… we can continue with brute-force and do a MapReduce on the index and use the index to retrieve in an even larger cluster, or use automata to iteratively search and store smaller “views” somewhere else, or do statistical indexes (quantum indexes) and get the best result we can get instead of all results… The possibilities are endless…

Lets wait and see how it goes, but yelling DO IT than later DON’T is just useless…

UPDATE:

This is not a rant against Stonebreaker, I share his ideas about the relational model being far too outdated and the need for something new. What I don’t agree, though, is that MapReduce is a step backwards, maybe not even a step forward, probably sideways.

The whole point is that the relational model is the thesis and there are lots of antithesis, we just didn’t come up with the synthesis yet.

DRM in external drives?

Western Digital thinks that bundling the external hard drive with a crappy software that won’t allow you to share your own videos, music and photos is security.

A friend of mine have this disk, he uses for everything, including legal music bought over the internet and other mp3 (like my band’s songs) without DRM. It works a charm on his Mac and on my Linux, I didn’t even know that drive had restrictions.

It’s quite easy for the newbie geek to avoid DRM (especially if he/she uses Linux or Mac which is common) but the non-geek consumer will probably give up the whole thing and by a new one. If only the hardware industry would just stop and think for a second…

I wonder if the same security consultant WD used is the one behind biometric passwords… or probably they just did the same security course at Microsoft…

UPDATE: a very good article by BBC.