Unix plumbing

Unix has some fantastic plumbing tools. It’s not easy to understand the power of pipes if you don’t use them every day, and Windows users usually think they’re no big deal at all. Let me give you some examples and see what you think…

Tools

With a small set of tools we can do very complex plumbing on Unix. The basic tools are:

  • Pipes (represented by the pipe symbol ‘|’) are interprocess communication devices. They’re similar to connectors in real life: they attach the output of one process to the input of another.
  • FIFOs are special files that do pretty much the same thing but have a representation on the file system. In real life they would be the pipes themselves (as they’re somewhat more visible).
  • Background execution (represented by the ampersand ‘&’) lets you run several programs at the same time from the same command line. This is important when you need a program running at each corner of the piping system; the sketch just below shows the three together.
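
Put together, even a tiny example exercises all three tools (just a sketch; the file names are made up):

mkfifo mypipe                        # a FIFO, visible on the file system
grep "error" mypipe > errors.txt &   # the reader runs in the background
cat logfile.txt > mypipe             # the writer feeds it through the FIFO
rm mypipe                            # clean up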

Simple example

Now you can understand what the grep below is doing:

cat file.txt | grep "foobar" > foobar.txt

It filters every line that contains “foobar” and saves them in a file called foobar.txt.
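
Strictly speaking the cat isn’t even needed here, since grep can read the file directly (the pipe just makes the data flow more obvious):

grep "foobar" file.txt > foobar.txt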


Multiple pipelining

With tee you can run two or more pipes at the same time. Imagine you want to create three files: one containing all the foo occurrences, another with all the bar occurrences, and a third with only the lines that contain both foo and bar. You can do this:

mkfifo pipe1; mkfifo pipe2
cat file | tee pipe1 | grep "foo" | tee pipe2 > foo.txt &
cat pipe1 | grep "bar" > bar.txt &
cat pipe2 | grep "bar" > foobar.txt

The tees copy the intermediate streams into the FIFOs, which hold them until another process reads them. Everything runs at the same time because the first two commands are in the background.
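
If your shell has process substitution (bash and zsh do; that’s an assumption about your setup, not something the original relies on), you can get the same result without naming the FIFOs yourself, since the shell creates and cleans them up for you:

cat file | tee >(grep "bar" > bar.txt) | grep "foo" | tee >(grep "bar" > foobar.txt) > foo.txt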


Full system mirror

Today there are many tools to replicate entire machines and rapidly build a cluster with a configuration identical to a given machine at a given point in time, but none of them is as simple as:

tar cfpsP - /usr /etc /lib /var (...) | ssh -C dest tar xvf -

With the dash as the archive name, tar writes its output to standard output, which the pipe feeds into the second command in line, ssh; ssh then connects to the destination machine and un-tars the stream from its standard input.

The pipe is very simple and at the same time very powerful. The information is carried from one machine to the other, encrypted by ssh, and you didn’t have to set up anything special. It works between most Unixes and even between different flavours of Unix.
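
The same idea works for a single directory tree; a common variant (a sketch, with made-up host and path names) unpacks the stream into a chosen directory on the other side:

tar cfp - /etc | ssh backuphost "cd /backup && tar xpf -"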

There is a wiki page explaining the hack in more detail here.

Simplicity, performance and compliance

Pipes, FIFOs and tees are universal tools, available on all Unixes and supported by all shells. Because everything is handled in memory, they’re much faster than creating temporary files, and even if a program is not prepared to read from standard input (and thus from a pipe) you can create a FIFO and get the same effect, cheating the program. It’s also much simpler to use pipes and FIFOs than to create temporary files with non-colliding names and remember to remove them later.
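
As an example of that “cheating”, suppose some_tool (a hypothetical program) insists on being given a file name and refuses to read from standard input; a FIFO lets you stream data into it anyway:

mkfifo /tmp/stream
zcat big.gz > /tmp/stream &      # the producer writes into the FIFO in the background
some_tool /tmp/stream            # the tool believes it is reading a regular file
rm /tmp/stream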

It can be compared with automatic (stack) vs. dynamic allocation in programming languages like C or C++. With stack allocation you can create new variables, use them locally and have them automatically thrown away when you don’t need them any more, but it can be quite tricky to deal with huge or changing data. Dynamic allocation, on the other hand, handles that quite easily, but the memory must be allocated, manipulated correctly and freed after use, otherwise you have a memory leak.

Using files on Unix requires the same kind of care, so you don’t fill up your quota or end up with too many files in a single directory, but files can easily be copied around and modified and re-modified, over and over. It really depends on what you need, but for most uses a simple pipe/FIFO/tee is much more than enough. People just don’t use them…

Object Orientation in C: Structure polymorphism

Listening to a podcast about the internals of GCC, I learnt that, in order to support object-oriented languages in a common AST (abstract syntax tree), GCC does polymorphism in quite an exquisite way.

There is a page that describes how to do function polymorphism in C, but not structure polymorphism as it happens in GCC, by means of a union, so I decided this was a good post to write…

Unions

Like structs, unions let you group a list of things together but, unlike structs, the members all share the same space. So, if you create a struct with an int and a double, the size of the structure is roughly the sum of both sizes (plus padding). In a union, the size is the size of the biggest member, and all members live in the same area of memory, starting at the first byte of the union.

Its usage is somewhat limited and can be quite dangerous, so you won’t find many C programs, and will rarely find any C++ programs, using it. One common use is to (unsafely) convert numbers (double, long, int) to their byte representation by accessing them as an array of chars. But the use we’ll see now is how to entangle several structures together to achieve real polymorphism.
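
A tiny illustration of both points (just a sketch; the type names are made up):

#include <stdio.h>

struct Both   { int i; double d; };  /* size: roughly the sum of the members (plus padding) */
union  Either { int i; double d; unsigned char bytes[sizeof(double)]; };  /* size of the biggest */

int main() {
    printf("struct: %zu bytes, union: %zu bytes\n",
           sizeof(struct Both), sizeof(union Either));

    union Either e;
    e.d = 1.5;                                   /* write the number as a double...  */
    for (size_t n = 0; n < sizeof e.bytes; n++)  /* ...and read its raw bytes back   */
        printf("%02x ", e.bytes[n]);
    printf("\n");
    return 0;
}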

Polymorphism

In object-oriented polymorphism, you can have a list of different objects sharing a common interface and access them all through that interface. But in C you don’t have classes and you can’t build structure inheritance, so to achieve the same effect you need to put them all in the same box while defining a generic interface to access their members.

So, if you define your structures like:

struct Interface {
    int foo;
    void (*bar)();
};
struct OneClass {
    int foo;
    void (*bar)();
};
struct TwoClass {
    int foo;
    void (*bar)();
};

and implement the methods (here represented by function pointers) like:

#include <stdio.h>   /* for printf below */

void one_class_bar () {
    printf("OneClass.Bar()\n");
}
void two_class_bar () {
    printf("TwoClass.Bar()\n");
}

and associate the functions created to the objects (you could use a Factory for that), you have three different classes, still not connected. The next step is to connect them via the union:

typedef union {
    struct Interface i;
    struct OneClass o;
    struct TwoClass t;
} Object;

and you have just created a generic object that can hold both OneClass and TwoClass and be accessed via the Interface. Later on, when reading from a list of Objects, you can access them through the interface (as long as you declare your classes with the members in the same order) and it’ll call the correct method (or use the correct variable):

/* Setting: one and two would normally come from a factory */
struct OneClass one = { 1, one_class_bar };
struct TwoClass two = { 2, two_class_bar };

Object list[2];
list[0] = (Object) one;   /* cast to a union type: a GCC extension */
list[1] = (Object) two;
/* Using: call through the common interface */
list[0].i.bar();
list[1].i.bar();

Note that when iterating over the list, we access the Object via the Interface (list[0].i) and not via OneClass or TwoClass. Although the result would be the same (they share the same portion of memory, so the same method would execute), this is conceptually correct and compatible with object-oriented polymorphism.

The code above produces the following output:

$ ./a.out
OneClass.Bar()
TwoClass.Bar()

You can get the whole example here. I haven’t checked the GCC code, but I believe they’ve done it in a much better and more stable way, of course; the idea, though, is probably the same.

Disclaimer: this is just a proof of concept. It’s not nice, the GCC programmers were not proud of it (at least in the podcast), and I wouldn’t recommend anyone use it in production.

gzip madness

Another normal day here at EBI, when I changed a variable called GZIP from local to global (via export in Bash) and got a very nice surprise: all my gzipped files had gzip itself as a header!!!

Let me explain… I have a makefile that, among other things, gzips some files. So I created a variable called GZIP that holds “gzip --best --stdout”, and in my rules I do:

%.bar : %.foo
        $(GZIP) < $< > $@

So far so good; it had always worked. But I had a few makefiles redefining the same command, so I thought: why not make an external include file with all the shared variables? I could use include for the makefiles, but I also needed some of those variables in shell scripts, so I decided to use “export VARIABLE” for all the make variables (otherwise the shell scripts don’t see them) and called it a day. That’s when everything started failing…

gzip environment

After a while digging into the problem (I was blaming the poor LSF for it) I found that when I didn’t have the GZIP variable defined everything went well, but the moment I defined GZIP=”/bin/gzip --best --stdout”, even a plain call to gzip produced corrupted output (i.e. it had the gzip binary as a header).

A quick look at gzip’s manual gave me the answer… GZIP is the environment variable in which gzip looks for default options. So, if you set GZIP=”--best --stdout”, every time you call gzip it’ll use those parameters by default.
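
In other words (a quick demonstration of this documented GNU gzip behaviour):

$ export GZIP="--best --stdout"
$ gzip a.foo > a.bar            # actually behaves as: gzip --best --stdout a.foo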

So, by putting “/bin/gzip” in that variable as well, I was effectively always running the following command:

$ /bin/gzip /bin/gzip --best --stdout < a.foo > a.bar

and putting a compressed copy of the gzip binary, together with the compressed a.foo, into a.bar.

What a mess a simple environment variable can make…
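
The obvious way out (my own sketch, not something the original post goes into) is simply to keep the names from clashing:

export GZIP_CMD="/bin/gzip --best --stdout"   # any name gzip doesn't own will do
$GZIP_CMD < a.foo > a.bar

# or strip the offending variable just for that call (env -u is a GNU coreutils option):
env -u GZIP /bin/gzip --best --stdout < a.foo > a.bar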

LSF, Make and NFS 2

Recently I posted this entry about how the NFS cache was playing tricks on me and how a sleep 1 kind of solved the issue.

The problem got worse, of course. I raised it to 5 seconds and in some cases that was still not enough; then I learnt from the admins that the NFS cache timeout was 60 seconds!! I couldn’t put a sleep 60 in all of them, so I had to come up with a script:


timeout=60; slept=0
while [ ! -s "$file" ] && (( slept < timeout )); do sleep 5; slept=$((slept+5)); done

In a way it’s not as ugly as it may seem… First, the alternative would be to change the configuration (either disabling the cache or reducing the timeout) for the whole filesystem, and that would affect others. Second, now I wait for (almost) the right amount of time and only when I need to (the first -s test passes straight away when there is no problem).

At least, sleep 60 on everything would be much worse! 😉

Multics back from the dead

Multics arose from the dead, in source code form! MIT has just released its source and now you can see with your own eyes what it was like back in ’64!

It’s not easy to retrieve the whole code (no tarballs), but it’s a good exercise to read parts of it, if you can understand the structure, of course. If you can’t, don’t worry, start here.

LSF, Make and NFS

I use LSF at work, a very good job scheduler. To parallelise my jobs I use Makefiles (with the -j option) and inside every rule I run the command through the job scheduler. Some commands call other Makefiles, cascading the spawning of jobs even further. Sometimes I reach 200+ jobs in parallel.

Our shared disk (a BlueArc) is also very good, with access times quite often faster than my local disk, and yet, for almost two years I had seen some odd behaviour when putting all of them together.

I had reported random failures in processes that had worked until then and, without any modification, worked ever after. Not long ago I finally figured out what the problem was… NFS refresh speed vs. LSF spawn speed when using Makefiles.

When your Makefile looks like this:

bar.gz:
    $(my_program) foo > bar
    gzip bar

There isn’t any problem here because, as soon as bar is created, gzip can run and create the gz file. Plain Makefile behaviour, nothing to worry about. But then, when I changed it to:

bar.gz:
    $(lsf_submit) $(my_program) foo > bar
    $(lsf_submit) gzip bar

Things started to go crazy. Once every few months, one of my hundreds of Makefiles would just finish saying:

bar: No such file or directory
make: *** [bar.gz] Error 1

And what’s even weirder, the file WAS there!

These magical problems happened during a period when, luckily, I was streamlining the Makefiles every day, so I could just restart the whole thing and it would then run as planned. Around the same time I had another problem, quite common when using NFS: stale NFS handles.

I have my CVS tree on the NFS filesystem and, when testing some Perl scripts between AMD Linux and Alpha OSF machines, I used to get these errors (the NFS cache was being updated) and had to wait a bit, or just try again, in most cases.

It was then that I figured out what the big random problem was: the same NFS caching that causes the stale handles! Because the Makefile rules were running on different computers, the NFS cache took a few milliseconds to update, and the LSF spawner, berserk for performance, started the new job way before NFS could reorganise itself. That’s why the file was there after all: it was on its way, and the Makefile had crashed before it arrived.

The solution? Quite stupid:

bar.gz:
    $(lsf_submit) "$(my_program) foo > bar" && sleep 1
    $(lsf_submit) gzip bar

I’ve put it on all rules that have more than one command being spawned by LSF and never had this problem again.

The smart reader will probably tell me that it’s not just ugly, it doesn’t cover all the cases at all, and you’re right, it doesn’t. The NFS cache can take more than one second to update, single-command rules can break on the next hop, etc., but because there is some processing between them (rule calculations are quite costly; run make -d and you’ll see what I’m talking about) the probability is too low for today’s computers… maybe in ten years I’ll have to put sleep 1 in all rules… 😉
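
As the follow-up post above shows, the more defensive version of the same idea is a timed wait. Wrapped as a small bash helper it would look roughly like this (a sketch; the function name and defaults are mine):

# wait, up to a limit, until NFS shows a non-empty file
wait_for() {
    local file=$1 timeout=${2:-60} slept=0
    while [ ! -s "$file" ] && [ "$slept" -lt "$timeout" ]; do
        sleep 5; slept=$((slept+5))
    done
    [ -s "$file" ]   # fail if the file still hasn't shown up
}

wait_for bar && gzip bar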

Yet another supercomputer

SiCortex is about to launch their cluster-in-a-(lunch)-box, with a promo video and everything. It seems pretty nice, but some things worry me a bit…

Of course a highly interconnected backplane and some smart shortest-path routing algorithms (probably not as good as Feynman’s) are much faster (and more reliable?) than gigabit Ethernet (Myrinet too?). Of course all-in-one-chip technology is faster, safer and more economical than any HP or IBM 1U node money can buy.

There is also some eye candy, like a pretty nice external case, dynamic resource partitioning (like VMS), a native parallel filesystem, MPI-optimised interconnects and so on… but do you remember the Cray-1? It was a wonderful vector machine, but in the end it was so complex and monolithic that everyone got stuck with it and eventually stopped using it.

Is assembling a 1024-node Linux cluster with PC nodes, gigabit, PVFS, MPI etc. hard? Of course it is, but the day Intel stops selling PCs you can use AMD (and vice versa), and you won’t have to stop using the old machines until you have a whole bunch of new ones up and running, transparently integrated with your old cluster. If you do it right you can have a single Beowulf cluster running Alphas, Intels, AMDs, Suns, etc.; just sort out the paths and the rest is done.

I’m not saying it’s easier, nor cheaper (costs for air conditioning, cabling and power can be huge), but being locked to a vendor is not my favourite state of mind… Maybe if they had smaller machines (say 128 nodes) that could be assembled into a cluster while still allowing external hardware to be connected, with intelligent algorithms to weigh the cost of migrating processes to external nodes (based on network bandwidth and latency), it would be better. It might even ease their entry into existing clusters…

VI: a love story

The first editor I used on Unix was VI. Since then I have used lots of different editors for both code and text files, but I still can’t find a replacement for VI.

VI, now called Vim, is the most powerful simple editor in existence (yes, Emacs users, it *is* simpler than Emacs). Of course there are simpler editors and more powerful editors around, but not both at once. At that time (the early 90s) VI wasn’t so complete and powerful, but it was simple and widely available in the Unix world, and that’s what made it famous.

But before using VI for coding I used Borland’s fantastic Turbo C (for DOS), and the need for a smarter IDE was something I always had in mind. So began the search for a TC-like IDE. Borland later made several great IDEs for Windows, but once you code on Unix it’s very hard to go back and code on Windows, so I had to find a good IDE for Linux.

Early tries

After coding for so long in VI, it felt like a natural choice to use VI every time I wanted to edit a file, whatever it was. I never bothered to try other text editors (such as joe or emacs), but I did use a bit of pico (later nano) and it was terrible.

When Gnome and KDE came to replace WindowMaker they brought lots of text editors, but they were, after all, notepad clones. Later they got a bit better, but still not as good as VI, so why bother changing?

Well, one good reason to change was that every time I needed to edit a file I had to go to the console and open VI. That was not such a bad thing, because I always have a console open somewhere and navigating the filesystem is easier there anyway, but a few times it was annoying and I used Kate (from KDE, my desktop of choice). Anyway, it was around that time that VI gained a nice brother, gvim: the graphical editor. One less reason not to use VI.

Kate was really good, in fact, but I found that I ended up with lots of “:wq” (the command to save and close VI) in my files when using any other editor. I also tried Quanta for HTML, but it was so cluttered, and I had so many “:wq” in my pages, that I just gave up.

Java?

When I started programming in Java I found the Eclipse IDE. A fantastic tool with thousands of features, an extremely user-friendly editor and all the gadgets a coder could want. It was faster than any other Java IDE available at the time. And it was free! Too good to be true?

Nah. For the Java community it was *that* good, but for the rest of us it was crap. The C++ plug-in was (and still is) crap, as is the Perl plug-in. It didn’t understand classes or inheritance and, most importantly, didn’t have all the nice refactoring and code-understanding features it has for Java.

So why use a gigantic (if fast) IDE that doesn’t speak your language? If it doesn’t speak my language, I very much prefer VI! So I went back, once again. Also, by that time, VI had gained a wonderful feature: tab completion (CTRL-N, in fact).

KDevelop

The most promising rival is KDevelop, and it’s almost as good as I wanted it to be, but not quite. It has CVS integration (not much easier than using the console), class structure information, an integrated debugger, etc. etc. But it’s still a bit heavy (as expected) and not useful for all development projects.

VI re-birth

For a while I only used VI at work and for text files at home, especially while I was busy trying all the possibilities of KDevelop, and that’s because I still missed one very important IDE feature that VI didn’t have: tabs.

Editing with tabs is so much simpler than switching buffers or splitting windows. That’s why I revisited Kate a few times after having abandoned it, and that’s why I didn’t use VI much in my personal projects for a long time.

But then Vim 7.0 came out, with lots of improvements and the long-awaited tab support. It was like one of those amazing sunsets in the country, with birds singing and all that. Also, the tab completion (still CTRL-N) is really smart: it understands includes, classes, defines, typedefs, everything, and has a very simple interface.

VI, or now Vim, is complete! And I’m happy! 😉

Thanks Bram Moolenaar for this amazing piece of software!

ACPI on Toshiba

I’ve been fighting for a week or two with my Toshiba because of Linux 2.6 ACPI and my broken ACPI system table (DSDT). It seems there are a lot of holes in it, and every time Linux finds a hole it sleeps for a couple of milliseconds and loses all interrupts!!

For most hardware that’s pretty normal, as few devices really trust the PCI bus (i.e. they don’t ask for confirmation of each packet), but some, like me and my keyboard, have to re-send (i.e. retype) all the missing keystrokes that were not captured!

So far, in this post, I’ve already retyped at least 30 characters…

There’s another problem: the patch with a hack for it (ignore AE_NOT_FOUND and create an empty entry in the table) has been available since last December but, the kernel guys being always so busy, it only went into version 2.6.16 (which isn’t released yet), and it will take ages until the Debian guys pick it up and apply their own patches and my much-desired madwifi module.

Before you start, let me warn you: I did compile several kernels, but the point is, it’s not that simple.

1. I downloaded the latest stable kernel, 2.6.15, and manually applied the fix (by editing some files in drivers/acpi); the ACPI got a little better, but was not fixed at all. I applied some other things and they did work fine, but then many other things stopped working, because many Debian kernel packages (obviously) assume a specific kernel version and, as 2.6.15 is “unstable” for Debian, it doesn’t have many of the modules packaged and I would have had to compile them all myself… no way!

2. The logical solution? I got the Debian sources for my own kernel image version, 2.6.12, applied my own configuration (from /boot) [1], then applied the ACPI patches (manually, because I now had a rather different kernel version) and compiled the Debian way (make-kpkg, roughly as sketched after this list), which worked fine. Still, madwifi didn’t work, and I probably hadn’t applied all the needed ACPI changes, because the problem persisted, though less intensely.

3. I had to keep to the Debian way, so I got the madwifi sources via apt-get and compiled them with make-kpkg again, which created a madwifi .deb package built against my own kernel source and thus should have worked nicely. Pity… it didn’t. Neither ACPI nor wifi.

4. Well, I had already given up on it, but I had to be sure I wasn’t doing something wrong. I recreated the whole tree (no patches, no config), re-read the config from /boot (a clean Debian config), changed ext3 to be built in, compiled the Debian way with make-kpkg and installed the package. Rebooted the system and what happened? Of course, it made no difference…
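
For the record, the “Debian way” in steps 2–4 was roughly this (a sketch from memory; the package names, paths and revision string are assumptions):

cd /usr/src/kernel-source-2.6.12             # Debian kernel source tree (name is approximate)
cp /boot/config-2.6.12-1-686 .config         # reuse the shipped configuration
# ...apply the ACPI patches by hand here...
make-kpkg clean
make-kpkg --initrd --revision=custom.1 kernel_image
dpkg -i ../kernel-image-2.6.12_custom.1_i386.deb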

Well, I can see some morals to this story:
– don’t buy a cheap Toshiba. If you’re going to buy a cheap notebook, buy a cheap brand; they are generic, like desktops
– even Debian can be quite unstable
– if you’re going to compile your own kernel, prepare yourself to compile all your modules (internal and external) as well
– don’t use automated programs (like make-kpkg) to compile your kernel; if you’re doing it, do it right
– always use Debian for bug-hunting, it’s great fun to blame Debian in front of Debian fans! 😉

[1] For some reason the config from Debian tells the kernel that the ext3 code should be compiled as a module! This panics the kernel at boot time, because it must mount the root partition (ext3) before it can start loading modules!