You have less worries
Yes, the sky is blue

Blast from the demoscene past

25 June 2009, by Ben    one comment

Do you like the demoscene? Or do you just like smaller, faster, or embedded code? Read on.

When I was 14, I started learning how to program, for at least two reasons:

Scratch that. I still like the demoscene. I mean, who else can make incredible 3D tube or lattice demos in a 256-byte executable? Try them — both still run fine under Windows XP.

Fire effect and starfieldSo I read diskmags and tutes to learn how to program the VGA hardware, push pixels to 0xA000:0000, and use Mode X. Oh, and I learnt about sin and cos before I learnt at school — for basic 2D and 3D rotation. Then there were effects: the fire effect, plasma, starfields, wormholes, etc, etc. (Click on the piccy to the right to download some of my old source.)

Anyway, back from Second Reality to the real thing …

As I’ve noted before, I’m not exactly in favour of bloatware. But in today’s “a GB here, a GB there” world, is small still beautiful? I think so, for two reasons:

Embedded programming

In the embedded world, size still matters a lot. Microcontrollers are getting bigger and faster, sure, but in electronic products there’s often a place for the small ones (say 64KB flash, 2KB RAM). Just the other day, I cut our code size by 900 bytes, which was a significant percentage of the total — less code to download, test, and maintain.

And it’s not only important for small micros, but also to limit download time and cost for in-field updates. If you want to update code for 1000 units over a fairly slow and costly radio link, small is good.

Binary diffs or deltas are really good for this. My brother Berwyn has developed a proof-of-concept binary diffing algorithm which is designed for tiny embedded systems — contact us if you’re keen to hear more.

Binary diffing isn’t new, of course — bsdiff already does something similar for Firefox’s updates, so you only need to download a small update. But bsdiff doesn’t work on small embedded systems, because it uses a compression program which requires a fair amount of RAM (bzip2).

To go fast, do less

Yep, as the guy said: To go fast, do less.

And KISS. Keeping it Short and Simple means less code to test, and if you’re using basically the right approach and algorithm, it usually also means faster code. And to follow my own advice, I’m keeping this section short.

Conclusion

In a word, if you’re a budding hacker, or the parent of a budding hacker, teach them that small is still beautiful. And get ‘em started with the demoscene. There is still a pretty active ’scene community, and here are some starting points:

Pilot ships through Google Earth

21 May 2009, by Ben    one comment

Paul van Dinther, one of the folks we work with, has released a ship simulation game that uses Google Earth for its “terrain” data:

PlanetInAction.com released a new simulation game called Ships. In Ships you take the helm from a choice of 3D ships. What is special about this game is that it makes use of the rich 3D data present in Google Earth. The entire world is your playground.

“Ships” is a graphically rich environment with intricate visual effects that runs right inside your web-browser. All you need is a small Google Earth plugin. Take control now of the majestic Queen Mary 2 and hit the authentic fog horn as you leave the port of Rotterdam in the Netherlands. If water is not your thing then why not climb aboard the airship Hindenburg and check-out the Swiss alps.

Have fun sailing around the (real) world!

Python’s Ellipsis explained

8 May 2009, by Ben    3 comments

As others note, using Python’s Ellipsis object in slices is rather obscure, and there are hardly any good examples out there. So I thought I’d do my bit with a nice, simple example, complete with comments:

# Portable way to get the You-Know-Which object without naming it
class __:
 def __getitem__(__, _):
  return _
___ = __()[...]

# An Ellipsobinary-to-ASCII convertor
class __:
 def __getitem__(__, _):
  return chr(sum(1<<i if _[-i-1] is ___ else 0 for i in range(len(_))))
_ = __()

# Finally, use the That-Which-Must-Not-Be-Named object
print (
 _[...,_,_,...,_,_,_] +
  _[...,...,_,_,...,_,...] +
   _[...,...,_,...,...,_,_] +
    _[...,...,_,...,...,_,_] +
     _[...,...,_,...,...,...,...] +
      _[...,_,...,...,_,_] +
       _[...,_,_,_,_,_] +
        _[...,...,...,_,...,...,...] +
         _[...,...,_,...,...,...,...] +
          _[...,...,...,_,_,...,_] +
           _[...,...,_,...,...,_,_] +
            _[...,...,_,_,...,_,_] +
             _[...,_,_,_,_,...])

And you thought Python code couldn’t be obfuscated?

Seriously, though … do any non-NumPy programmers actually use Ellipsis? I have yet to find other (non-NumPy) uses of it in the wild.

Knuth, goto, Python, and OOP

9 April 2009, by Ben    8 comments

Literate Programming by Donald KnuthThe year was 1973. Real programmers were still using punch-cards and Pascal. C had barely been invented, and object-oriented programming was hardly a twinkle even in the eyes of top computer scientists. Whether or not to use goto was the hot topic of the day.

Knuth

It was then that Donald Knuth wrote his famous essay Structured Programming with go to Statements. And some essay it is: he covers everything from the current trends on structured programming to premature optimization being the root of all evil. (I read it in Literate Programming, but it’s also available as a scanned PDF here.)

He’s responding to Edsger Dijkstra’s well-known letter Go To Statement Considered Harmful, but (in typical Knuth fashion) he covers so much related ground it’s not funny.

What was striking to me is the context of his discussion. It’s clear that structured programming — which we use every day and think is “common sense” — had to be invented, discussed, and refined. Like most inventions, it’s obvious … 36 years later.

goto

It’s also obvious that goto (or “go to” as Knuth calls it) was much more widely used and abused than it is today. This is probably partly because assembly language was so much more common, but also because “they” had to learn that goto isn’t usually the right abstraction — in fact, it isn’t much of an abstraction at all.

Now it’s 2009, and goto is pretty rare. It’s still used, of course, but I’ve usually seen it only in the cases Knuth is talking about: for efficiency, error exits, and for breaking out of certain kinds of loops.

In C you still occassionally need it for cleaning up before error exits, or for breaking out of efficient nested loops, or in generated code, but these days we also have other constructs and other languages that solve 1973’s problems most of the time.

In C, you have the invaluable break as well as the ability to return early. Knuth advocated the equivalent of C’s break, implying also that most languages at the time didn’t have it.

Compilers are also somewhat better at producing optimized code from non-gotoed source: for example, I can program my virtual machine’s opcode dispatcher as a bunch of case statements, knowing the compiler will probably optimize it into a jump table.

And in most modern high-level languages (C++ and up) you have exceptions, which eliminate the need for error-exit gotos, as well as solve several other problems in a really tidy way.

Python

Python is important in this discussion not only because Knuth is keen on beautiful code, but because Knuth “predicted” its arrival in several different ways. Here’s a quote from the last section of his essay:

It seems clear that languages somewhat different from those in existence today would enhance the preparation of structured programs. We will perhaps eventually be writing only small modules which are identified by name as they are used to build larger ones, so that devices like indentation, rather than delimiters, might become feasible for expressing local structure in the source language.

Of course, many languages now have “small modules which are identified by name as they are used to build larger ones”, but Python really took Knuth seriously about using indentation as a delimiter.

What’s more, you can always add goto to Python if you really need it. :-)

OOP

And it gets even more interesting when he goes on to say:

Although our examples don’t indicate this, it turns out that a given level of abstraction often involves several related routines and data definitions; for example, when we decide to represent a table in a certain way, we simultaneously want to specify the routines for storing and fetching information from that table. The next generation of languages will probably take into account such related routines.

Correct me if I’m wrong, but doesn’t that sound awfully like OOP? So in a single essay apparently about goto statements, Knuth predicted modules, Python’s use of indentation as delimiters, and object-oriented programming. :-)

A browser-agnostic plugin system?

13 March 2009, by Ben    16 comments

[Update: I've wrongly been calling extensions "plugins". So this post is actually talking about extensions, not plugins.]

Chrome: Where are my plugins?Lately I’ve been browsing with Google Chrome.
It looks and works great, and it’s quick. But it doesn’t have plugins.

One of the few plugins I can hardly do without is the Blank Canvas Gmail Signatures add-on. Gmail doesn’t allow a signature per email address, which is a pity, but this add-on handles that nicely. It also allows you to use HTML in your signatures.

Anyway, while cycling to work the other day I thought of an idea for a plugin system that would work in all browsers. My idea was a web proxy running on your local machine that injects HTML or JavaScript into pages from certain URLs, Greasemonkey-style. With some clever regex-matching and JavaScript injection you could do just about whatever you needed.

But of course – a deflated ego later – similar things have already been done. :-)

Privoxy

First, I searched for “browser-agnostic plugin manager” and discovered Privoxy. It’s geared for privacy and blocking ads, but it can do content replacement based on regexes. For instance, there’s a built-in filter called tiny-textforms that makes your <textarea>s bigger, a bit like this Firefox plugin.

Privoxy looks good and is pretty complete, but it’s not really geared for plugins (though one of the Privoxy developers mentioned you could probably do Greasemonkey-like stuff with it). Also, Privoxy’s got a pretty nerdy configuration interface that just wouldn’t cut it for normal plugin users.

Proxomitron, Proximodo and BFilter

Proxomitron, Proximodo and BFilter are three similar web proxy filters that might well do the job. Again, they’re mainly used for blocking ads, and their GUIs aren’t really set up as “plugin managers”, so I’m not sure they’d work for ordinary users.

Plugins in Chrome

Google is talking seriously about extension support in Chrome – perhaps it’ll happen in the next few months.

Stop press! Apparently Chromium already supports user scripts – again, Greasemonkey-style) – but only in the latest developer builds.

Greasemetal

In the meantime, there’s also Greasemetal, which is a Greasemonkey-compatible tool that uses Chrome’s inter-process AutomationProxy communications channel. So it’s not browser-agnostic, but good work Kazuho Oku! I’ll probably start using this if Chrome doesn’t hurry up with their extension interface.

But what about browser-agnostic?

But back to browser-agnostic. Imagine if all your favourite plugins or Greasemonkey scripts worked in IE, Firefox, Safari, Opera, Chrome, and Lynx.

I don’t think it would take too much hacking to package up Privoxy as a plugin manager with a decent UI, or perhaps turn it into a browser-agnostic version of Greasemonkey.

Any takers, hackers?

Or maybe I’ll get around to it someday. I’ve got a name for my plugin manager already: Proxymoron.

Some regular expression libraries

27 February 2009, by Ben    add a comment

While I’m on the lookout for small libraries anyway, I thought I’d also post a list of a few regular expression libraries I’ve found recently (again, for possible use in an embedded Linux system).

So if you need a small or medium sized regex library for one of your projects, here’s a start:

Being a bloat despiser myself, I love the idea of using just 15KB of source code to handle your regexen. If you need something more powerful, however, TRE does look very good. Somehow reminds me of TCC (Tiny C Compiler).

Cracking an INI file with a jackhammer

26 February 2009, by Ben    13 comments

INI File SnippetAll I was after was a simple .INI file reader in C or C++. You know, to parse [section] and name=value lines for config files. We needed it for an embedded Linux project, so it had to be small, portable and only dependent on the C and C++ standard libraries.

Maybe it’s my embedded background that makes me defensive about my mild case of Not-Invented-Here syndrome. Then again, perhaps I’m in good company.

But surely there are tons of INI parsing libraries around, right? Well, kind of.

I found no fewer than 15 in under an hour. But why is it that things like this are either way off the mark (bloated or non-portable) or they’re very close, but just not quite what you want?

Below is the list of INI file readers that I found, from bad to better. Some of the ones that weren’t for me might well suit your application, so treat this as something of a non-exhaustive list of INI file readers:

Not for me

Half way there

So very close

INI Not Invented Here (INIH)

Of course, in the time it took to investigate all these, I could have easily written my own. And, being unable to help myself, I did. :-) So I present you with my own offering: INI Not Invented Here, a.k.a. INIH or simply ini.h.

It contains a small C function, ini_parse(), that parses an INI file and executes the given callback function for each name=value pair parsed (think SAX).

The reason I used the callback style is so you don’t have to load the whole file into memory if you don’t need to — good for embedded systems. Plus, I wanted to be able to use the parser easily in C, but not implement a dictionary-like structure in C.

For a more user-friendly, lookup style API if you’re using C++, I’ve wrapped ini_parse() in a class called INIReader, which has Get() and GetInteger() methods (all I needed). And it’s easy to sub-class INIReader if you need fancier GetXYZ() functions.

Show us the code

For what it’s worth, I hereby put my code into the public domain.

The C code: ini.h, ini.c, ini_dump.c, test.ini

The C++ code: INIReader.h, INIReader.cpp, INIReaderTest.cpp

Or get it all in a single 5KB zip file: ini.zip

Have fun, and feel free to send any feedback my way!

bitchecker: Binary Irony

19 February 2009, by Ben    4 comments

BitsInspired by this article on a website called The Daily TLA, I wrote a fully-fledged bit verifier, which checks not only the ordinal values of all the bits on your hard drive, but also their Quantumness (TM).

At the heart of the Highly Patentable Algorithm are the following functions, which return true if there’s something wrong with either the value or the quantum nature of a given bit, but false if the bit is okay:

// Return true if bit's numericity is wrong (bit is neither zero nor one).
int invalid_numericity(int bit)
{
    return bit != 0 && bit != 1;
}

// Return true if bit's quantum-ness is wrong (i.e., bit is both zero and
// one at the same time).
int invalid_quantumness(int bit)
{
    return bit == 0 && bit == 1;
}

Download and run bitchecker.exe, or have a look at the full C source code. It will check your entire hard drive, displaying the files it’s checking, and showing you any bit errors as it finds them.

Note that I haven’t ported it to Linux, because open source software tends to have fewer problems with Quantum Bit Integrity than proprietary software.

If you have non-technical friends, point them to this handy utility — it will help them identify the worst bit errors on their computers in no time!

RAII, AC/DC, and the “with” statement

7 February 2009, by Ben    2 comments

Once upon a time there was a programming idiom called Resource Acquisition Is Initialization (RAII), and it almost deserved it. (Apologies to C. S. Lewis.)

First, a bit of context

Recently I wanted to figure out exactly what this RAII thing was that C++ and a few other languages have. With such a name, I figured it must be both extremely powerful and rather complicated.

What does “resource acquisition is initialization” even mean? Does it matter if you call it “initialization is resource acquisition”? And why initialization when everyone seems to think it’s mostly about what to do on destruction?

Well, it turns out it’s badly named. Even the C++ FAQ agrees:

However, if you dissect “RAII” as an acronym, and if you look (too?) closely at the words making up that acronym, you will realize that the words are not a perfect match for the concept. Who cares?!? The concept is what’s important; “RAII” is merely a moniker used as a handle for that concept.

Who cares? Maybe just me, but usually an acronym tells you what something is. What’s called RAII is actually a fairly simple concept, and perhaps more of us would use and understand it if it had a better name. So I hereby propose one:

Enter AC/DC

AC/DC — Acquire in Constructor, Destructor does Cleanup.

First AC/DC was two types of electrical current, then a heavy-metal band. And now it’s a C++ programming idiom. :-) The idea is to acquire or allocate all the resources you need in a class’s constructor, and clean them up or free them in the destructor.

My first thought when I’d figured out what RAII meant was, “Is that all? Isn’t that what everyone does anyway?” Well, yes and no.

For a start, this works in C++ because destructors are called in a deterministic way. The destructor is called when an object goes out of scope — and this is one of the key points — even when an exception occurs.

So when you use AC/DC, your resources are tied to objects, and when and however the objects die, the resources are freed too.

Well, how do you use it, and what’s it good for? Let’s see an example:

The old FILE-wrapper example

#include <cstdio>
#include <exception>

class CharReader {
public:
    class Exception : public std::exception {
    public:
        virtual const char* what() const throw() { return "Error!"; }
    };

    static const int EndOfFile = EOF;

    CharReader(const char* name) {
        f = fopen(name, "r");
        if (!f)
            throw Exception();
    }

    ~CharReader() {
        printf("Closing file\n");
        fclose(f);
    }

    int Read() {
        int c = fgetc(f);
        if (c == EOF && ferror(f))
            throw Exception();
        return c;
    }

private:
    FILE* f;
};

int main() {
    int c;

    try {
        CharReader reader("CharReader.cpp");

        while ((c = reader.Read()) != CharReader::EndOfFile) {
            if (c == 'z') {
                printf("'z' byte found in file!\n");
                return 2;
            }
            printf("%02X ", c);
        }
    } catch (CharReader::Exception& e) {
        printf("Error: %s\n", e.what());
        return 1;
    }

    return 0;
}

How it works

As soon as you instantiate the reader it opens the file. The object owns the file until it goes out of scope and the destructor is called. However the program exits (with an early return or via an exception) the reader’s destructor is called and the file closed.

In this case it’s not a big deal (the OS will close files for you anyway), but when you’re writing a file and haven’t flushed, or when you’re connecting to a database and need to commit, or when you’ve disabled interrupts and need to re-enable them regardless of where the function returns — these are great uses for the AC/DC idiom.

Python’s “with” statement

Python (and Java, I think) doesn’t really do AC/DC, because when an object goes out of scope, its __del__ method isn’t necessarily called. __del__ is only called when the object’s reference count goes down to zero or when the object is collected as garbage.

The old-school way of coping with this was try ... finally, for example:

def log(s):
    f = open('output.log', 'a+')
    try:
        f.write(s + '\n')
    finally:
        f.close()

That way, whether or not a write exception occurred, your file would be closed and the log flushed. But with the introduction of the with statement in Python 2.5 you can do what amounts to AC/DC using with:

def log(s):
    with open('output.log', 'a+') as f:
        f.write(s + '\n')

The difference is that it’s explicit, rather than implicit based on the scope of the object. (The with protocol uses the special functions __enter__ and __exit__ rather than the constructor and destructor.)

References

Some further reading if you’re interested:

microBlog now The Brush Blog

30 January 2009, by Ben    add a comment

Well, for quite some time it’s been all western on the quiet front around here. We hope to change that soon! We’ve also done a touch of re-branding — microBlog is now “The Brush Blog”.

The microPledge blog had slowed to a near standstill and because Brush Technology is the parent company, we decided it’d be better to run this as a Brush company blog from now on.

In short: watch this space.