Blast from the demoscene past
25 June 2009, by Ben one comment
Do you like the demoscene? Or do you just like smaller, faster, or embedded code? Read on.
When I was 14, I started learning how to program, for at least two reasons:
- My dad could code, and he made some neat stuff, like a code-generating pentomino puzzle solver and a really tiny pop-up editor.
- Because I really liked the demoscene.
Scratch that. I still like the demoscene. I mean, who else can make incredible 3D tube or lattice demos in a 256-byte executable? Try them — both still run fine under Windows XP.
So I read diskmags and tutes to learn how to program the VGA hardware, push pixels to 0xA000:0000, and use Mode X. Oh, and I learnt about sin and cos before I learnt at school — for basic 2D and 3D rotation. Then there were effects: the fire effect, plasma, starfields, wormholes, etc, etc. (Click on the piccy to the right to download some of my old source.)
Anyway, back from Second Reality to the real thing …
As I’ve noted before, I’m not exactly in favour of bloatware. But in today’s “a GB here, a GB there” world, is small still beautiful? I think so, for two reasons:
Embedded programming
In the embedded world, size still matters a lot. Microcontrollers are getting bigger and faster, sure, but in electronic products there’s often a place for the small ones (say 64KB flash, 2KB RAM). Just the other day, I cut our code size by 900 bytes, which was a significant percentage of the total — less code to download, test, and maintain.
And it’s not only important for small micros, but also to limit download time and cost for in-field updates. If you want to update code for 1000 units over a fairly slow and costly radio link, small is good.
Binary diffs or deltas are really good for this. My brother Berwyn has developed a proof-of-concept binary diffing algorithm which is designed for tiny embedded systems — contact us if you’re keen to hear more.
Binary diffing isn’t new, of course — bsdiff already does something similar for Firefox’s updates, so you only need to download a small update. But bsdiff doesn’t work on small embedded systems, because it uses a compression program which requires a fair amount of RAM (bzip2).
To go fast, do less
Yep, as the guy said: To go fast, do less.
And KISS. Keeping it Short and Simple means less code to test, and if you’re using basically the right approach and algorithm, it usually also means faster code. And to follow my own advice, I’m keeping this section short.
Conclusion
In a word, if you’re a budding hacker, or the parent of a budding hacker, teach them that small is still beautiful. And get ‘em started with the demoscene. There is still a pretty active ’scene community, and here are some starting points:
- Wikipedia article on the demoscene
- Pouet.net, a great place to download popular demos
- Scene.org, another good ’scene resource
Pilot ships through Google Earth
21 May 2009, by Ben one comment
Paul van Dinther, one of the folks we work with, has released a ship simulation game that uses Google Earth for its “terrain” data:
PlanetInAction.com released a new simulation game called Ships. In Ships you take the helm from a choice of 3D ships. What is special about this game is that it makes use of the rich 3D data present in Google Earth. The entire world is your playground.
“Ships” is a graphically rich environment with intricate visual effects that runs right inside your web-browser. All you need is a small Google Earth plugin. Take control now of the majestic Queen Mary 2 and hit the authentic fog horn as you leave the port of Rotterdam in the Netherlands. If water is not your thing then why not climb aboard the airship Hindenburg and check-out the Swiss alps.
Have fun sailing around the (real) world!
Python’s Ellipsis explained
8 May 2009, by Ben 3 comments
As others note, using Python’s Ellipsis object in slices is rather obscure, and there are hardly any good examples out there. So I thought I’d do my bit with a nice, simple example, complete with comments:
# Portable way to get the You-Know-Which object without naming it
class __:
def __getitem__(__, _):
return _
___ = __()[...]
# An Ellipsobinary-to-ASCII convertor
class __:
def __getitem__(__, _):
return chr(sum(1<<i if _[-i-1] is ___ else 0 for i in range(len(_))))
_ = __()
# Finally, use the That-Which-Must-Not-Be-Named object
print (
_[...,_,_,...,_,_,_] +
_[...,...,_,_,...,_,...] +
_[...,...,_,...,...,_,_] +
_[...,...,_,...,...,_,_] +
_[...,...,_,...,...,...,...] +
_[...,_,...,...,_,_] +
_[...,_,_,_,_,_] +
_[...,...,...,_,...,...,...] +
_[...,...,_,...,...,...,...] +
_[...,...,...,_,_,...,_] +
_[...,...,_,...,...,_,_] +
_[...,...,_,_,...,_,_] +
_[...,_,_,_,_,...])
And you thought Python code couldn’t be obfuscated?
Seriously, though … do any non-NumPy programmers actually use Ellipsis? I have yet to find other (non-NumPy) uses of it in the wild.
Knuth, goto, Python, and OOP
9 April 2009, by Ben 8 comments
The year was 1973. Real programmers were still using punch-cards and Pascal. C had barely been invented, and object-oriented programming was hardly a twinkle even in the eyes of top computer scientists. Whether or not to use goto was the hot topic of the day.
Knuth
It was then that Donald Knuth wrote his famous essay Structured Programming with go to Statements. And some essay it is: he covers everything from the current trends on structured programming to premature optimization being the root of all evil. (I read it in Literate Programming, but it’s also available as a scanned PDF here.)
He’s responding to Edsger Dijkstra’s well-known letter Go To Statement Considered Harmful, but (in typical Knuth fashion) he covers so much related ground it’s not funny.
What was striking to me is the context of his discussion. It’s clear that structured programming — which we use every day and think is “common sense” — had to be invented, discussed, and refined. Like most inventions, it’s obvious … 36 years later.
goto
It’s also obvious that goto (or “go to” as Knuth calls it) was much more widely used and abused than it is today. This is probably partly because assembly language was so much more common, but also because “they” had to learn that goto isn’t usually the right abstraction — in fact, it isn’t much of an abstraction at all.
Now it’s 2009, and goto is pretty rare. It’s still used, of course, but I’ve usually seen it only in the cases Knuth is talking about: for efficiency, error exits, and for breaking out of certain kinds of loops.
In C you still occassionally need it for cleaning up before error exits, or for breaking out of efficient nested loops, or in generated code, but these days we also have other constructs and other languages that solve 1973’s problems most of the time.
In C, you have the invaluable break as well as the ability to return early. Knuth advocated the equivalent of C’s break, implying also that most languages at the time didn’t have it.
Compilers are also somewhat better at producing optimized code from non-gotoed source: for example, I can program my virtual machine’s opcode dispatcher as a bunch of case statements, knowing the compiler will probably optimize it into a jump table.
And in most modern high-level languages (C++ and up) you have exceptions, which eliminate the need for error-exit gotos, as well as solve several other problems in a really tidy way.
Python
Python is important in this discussion not only because Knuth is keen on beautiful code, but because Knuth “predicted” its arrival in several different ways. Here’s a quote from the last section of his essay:
It seems clear that languages somewhat different from those in existence today would enhance the preparation of structured programs. We will perhaps eventually be writing only small modules which are identified by name as they are used to build larger ones, so that devices like indentation, rather than delimiters, might become feasible for expressing local structure in the source language.
Of course, many languages now have “small modules which are identified by name as they are used to build larger ones”, but Python really took Knuth seriously about using indentation as a delimiter.
What’s more, you can always add goto to Python if you really need it. :-)
OOP
And it gets even more interesting when he goes on to say:
Although our examples don’t indicate this, it turns out that a given level of abstraction often involves several related routines and data definitions; for example, when we decide to represent a table in a certain way, we simultaneously want to specify the routines for storing and fetching information from that table. The next generation of languages will probably take into account such related routines.
Correct me if I’m wrong, but doesn’t that sound awfully like OOP? So in a single essay apparently about goto statements, Knuth predicted modules, Python’s use of indentation as delimiters, and object-oriented programming. :-)
A browser-agnostic plugin system?
13 March 2009, by Ben 16 comments
[Update: I've wrongly been calling extensions "plugins". So this post is actually talking about extensions, not plugins.]
Lately I’ve been browsing with Google Chrome.
It looks and works great, and it’s quick. But it doesn’t have plugins.
One of the few plugins I can hardly do without is the Blank Canvas Gmail Signatures add-on. Gmail doesn’t allow a signature per email address, which is a pity, but this add-on handles that nicely. It also allows you to use HTML in your signatures.
Anyway, while cycling to work the other day I thought of an idea for a plugin system that would work in all browsers. My idea was a web proxy running on your local machine that injects HTML or JavaScript into pages from certain URLs, Greasemonkey-style. With some clever regex-matching and JavaScript injection you could do just about whatever you needed.
But of course – a deflated ego later – similar things have already been done. :-)
Privoxy
First, I searched for “browser-agnostic plugin manager” and discovered Privoxy. It’s geared for privacy and blocking ads, but it can do content replacement based on regexes. For instance, there’s a built-in filter called tiny-textforms that makes your <textarea>s bigger, a bit like this Firefox plugin.
Privoxy looks good and is pretty complete, but it’s not really geared for plugins (though one of the Privoxy developers mentioned you could probably do Greasemonkey-like stuff with it). Also, Privoxy’s got a pretty nerdy configuration interface that just wouldn’t cut it for normal plugin users.
Proxomitron, Proximodo and BFilter
Proxomitron, Proximodo and BFilter are three similar web proxy filters that might well do the job. Again, they’re mainly used for blocking ads, and their GUIs aren’t really set up as “plugin managers”, so I’m not sure they’d work for ordinary users.
Plugins in Chrome
Google is talking seriously about extension support in Chrome – perhaps it’ll happen in the next few months.
Stop press! Apparently Chromium already supports user scripts – again, Greasemonkey-style) – but only in the latest developer builds.
Greasemetal
In the meantime, there’s also Greasemetal, which is a Greasemonkey-compatible tool that uses Chrome’s inter-process AutomationProxy communications channel. So it’s not browser-agnostic, but good work Kazuho Oku! I’ll probably start using this if Chrome doesn’t hurry up with their extension interface.
But what about browser-agnostic?
But back to browser-agnostic. Imagine if all your favourite plugins or Greasemonkey scripts worked in IE, Firefox, Safari, Opera, Chrome, and Lynx.
I don’t think it would take too much hacking to package up Privoxy as a plugin manager with a decent UI, or perhaps turn it into a browser-agnostic version of Greasemonkey.
Any takers, hackers?
Or maybe I’ll get around to it someday. I’ve got a name for my plugin manager already: Proxymoron.
Some regular expression libraries
27 February 2009, by Ben add a comment
While I’m on the lookout for small libraries anyway, I thought I’d also post a list of a few regular expression libraries I’ve found recently (again, for possible use in an embedded Linux system).
So if you need a small or medium sized regex library for one of your projects, here’s a start:
- SLRE (Super Light Regular Expression library): 15KB of C, but still a decent subset of Perl regexes. Simple API, doesn’t use heap.
- T-Rex: Again, very light at 17KB of C, but good subset of Perl regexes. Uses heap, has a C++ wrapper class.
- TRE: C, lightweight but powerful and POSIX-compliant, looks very good.
- PCRE, C (with C++ wrapper), heavyish, fully-fledged, Unicode support, industry-standard (used in Apache etc).
- Boost.Xpressive: C++ header only, heavyish, API okay, quite hard to extract from Boost.
- Boost.Regex: C++, heavyish, API looks good, hard to extract from Boost.
Being a bloat despiser myself, I love the idea of using just 15KB of source code to handle your regexen. If you need something more powerful, however, TRE does look very good. Somehow reminds me of TCC (Tiny C Compiler).
Cracking an INI file with a jackhammer
26 February 2009, by Ben 13 comments
All I was after was a simple .INI file reader in C or C++. You know, to parse [section] and name=value lines for config files. We needed it for an embedded Linux project, so it had to be small, portable and only dependent on the C and C++ standard libraries.
Maybe it’s my embedded background that makes me defensive about my mild case of Not-Invented-Here syndrome. Then again, perhaps I’m in good company.
But surely there are tons of INI parsing libraries around, right? Well, kind of.
I found no fewer than 15 in under an hour. But why is it that things like this are either way off the mark (bloated or non-portable) or they’re very close, but just not quite what you want?
Below is the list of INI file readers that I found, from bad to better. Some of the ones that weren’t for me might well suit your application, so treat this as something of a non-exhaustive list of INI file readers:
Not for me
- SDL_Config: Rather bloated … somehow 246KB of source code just to read an INI file doesn’t do it for me. I also admit to being a bit icked out by five levels of pointer indirection, for example:
file->internal_file-> global_group.multi_value_iterator = ... - ini.c by Carey Bloodworth: Small and in C, but has a funny API and is ripe with buffer overrun goodness.
- GetPrivateProfileString API: Fine API, but of course Win32 only.
- CIniReader by Xiangxiong Jian: C++ class, nice and simple, but Win32 only (uses
GetPrivateProfileString). - CIniFile by Ludvik Jerabek: C++ class, but uses loads of non-portable MFC helper classes. Update: Ludvik’s now added a portable version to this page.
Half way there
- SimpleIni by Brodie Thiesfield: C++, cross platform, and not really bad, but it just shows that “simple” is in the eye of the beholder. Does one really want 110KB of heavy duty, template-driven C++ code just to read an INI file?
- Config by freejack: Tantalising … a C++ class with a nice API, only 5KB of code (as it should be), and optional environment variable expansion. The problem? What it parses are not quite INI files, but his own special “structured config files”. Which is all very nice, but not exactly INI-compatible. (Admittedly this could easily be hacked into an INI reader.)
- Boost.Program_options: C++, and not bad code, but it’s more of a fully-fledged “program configuration library” than just an INI file reader. Plus, we weren’t yet sure we wanted a dependency on Boost.
- CIniFile by Silviu Simen: C++ class, a little better, but relies on the Boost.Spirit parsing library.
- M’s INI parser: C++, not bad, small, portable and uses the STL, but relies on the re2c scanner generator, giving it another dependency and making the code harder to read (and modify).
So very close
- minIni by CompuPhase: So very close, does just what you want (also writes INI files), and in only 17KB of portable C source, great for embedded systems … but re-reading the INI file for every
name=valuepair you need somehow just doesn’t sit right. - CDataFile by Gary McNickle: This one looks decent — small, C++, uses the STL instead of MFC or its own fancy dictionary type … I probably would have used this had I seen it sooner.
- libinifile by Anders Lövgren: This one’s quite good too. Minimal, portable, low memory footprint, but a slightly unusual API (partly to give it the low memory footprint). I might well have used this one too, had I found it sooner.
- libini by Simon White: Plain C, SWIGgable, fairly small, though has a bit of an odd API (and I’m struggling to see why a simple INI file reader needs a 665KB configuration script :-).
- iniParser by Nicolas Devillard: This is the one we ended up using. Small (about 32KB of source) and fast. The only minor drawback is that it implements its own dictionary type in C, and we’re using the STL which already has one. (Still, in C, what else can you do?) Also, it looks like it’s been around forever and is well tested.
INI Not Invented Here (INIH)
Of course, in the time it took to investigate all these, I could have easily written my own. And, being unable to help myself, I did. :-) So I present you with my own offering: INI Not Invented Here, a.k.a. INIH or simply ini.h.
It contains a small C function, ini_parse(), that parses an INI file and executes the given callback function for each name=value pair parsed (think SAX).
The reason I used the callback style is so you don’t have to load the whole file into memory if you don’t need to — good for embedded systems. Plus, I wanted to be able to use the parser easily in C, but not implement a dictionary-like structure in C.
For a more user-friendly, lookup style API if you’re using C++, I’ve wrapped ini_parse() in a class called INIReader, which has Get() and GetInteger() methods (all I needed). And it’s easy to sub-class INIReader if you need fancier GetXYZ() functions.
Show us the code
For what it’s worth, I hereby put my code into the public domain.
The C code: ini.h, ini.c, ini_dump.c, test.ini
The C++ code: INIReader.h, INIReader.cpp, INIReaderTest.cpp
Or get it all in a single 5KB zip file: ini.zip
Have fun, and feel free to send any feedback my way!
bitchecker: Binary Irony
19 February 2009, by Ben 4 comments
Inspired by this article on a website called The Daily TLA, I wrote a fully-fledged bit verifier, which checks not only the ordinal values of all the bits on your hard drive, but also their Quantumness (TM).
At the heart of the Highly Patentable Algorithm are the following functions, which return true if there’s something wrong with either the value or the quantum nature of a given bit, but false if the bit is okay:
// Return true if bit's numericity is wrong (bit is neither zero nor one).
int invalid_numericity(int bit)
{
return bit != 0 && bit != 1;
}
// Return true if bit's quantum-ness is wrong (i.e., bit is both zero and
// one at the same time).
int invalid_quantumness(int bit)
{
return bit == 0 && bit == 1;
}
Download and run bitchecker.exe, or have a look at the full C source code. It will check your entire hard drive, displaying the files it’s checking, and showing you any bit errors as it finds them.
Note that I haven’t ported it to Linux, because open source software tends to have fewer problems with Quantum Bit Integrity than proprietary software.
If you have non-technical friends, point them to this handy utility — it will help them identify the worst bit errors on their computers in no time!
RAII, AC/DC, and the “with” statement
7 February 2009, by Ben 2 comments
Once upon a time there was a programming idiom called Resource Acquisition Is Initialization (RAII), and it almost deserved it. (Apologies to C. S. Lewis.)
First, a bit of context
Recently I wanted to figure out exactly what this RAII thing was that C++ and a few other languages have. With such a name, I figured it must be both extremely powerful and rather complicated.
What does “resource acquisition is initialization” even mean? Does it matter if you call it “initialization is resource acquisition”? And why initialization when everyone seems to think it’s mostly about what to do on destruction?
Well, it turns out it’s badly named. Even the C++ FAQ agrees:
However, if you dissect “RAII” as an acronym, and if you look (too?) closely at the words making up that acronym, you will realize that the words are not a perfect match for the concept. Who cares?!? The concept is what’s important; “RAII” is merely a moniker used as a handle for that concept.
Who cares? Maybe just me, but usually an acronym tells you what something is. What’s called RAII is actually a fairly simple concept, and perhaps more of us would use and understand it if it had a better name. So I hereby propose one:
Enter AC/DC
AC/DC — Acquire in Constructor, Destructor does Cleanup.
First AC/DC was two types of electrical current, then a heavy-metal band. And now it’s a C++ programming idiom. :-) The idea is to acquire or allocate all the resources you need in a class’s constructor, and clean them up or free them in the destructor.
My first thought when I’d figured out what RAII meant was, “Is that all? Isn’t that what everyone does anyway?” Well, yes and no.
For a start, this works in C++ because destructors are called in a deterministic way. The destructor is called when an object goes out of scope — and this is one of the key points — even when an exception occurs.
So when you use AC/DC, your resources are tied to objects, and when and however the objects die, the resources are freed too.
Well, how do you use it, and what’s it good for? Let’s see an example:
The old FILE-wrapper example
#include <cstdio>
#include <exception>
class CharReader {
public:
class Exception : public std::exception {
public:
virtual const char* what() const throw() { return "Error!"; }
};
static const int EndOfFile = EOF;
CharReader(const char* name) {
f = fopen(name, "r");
if (!f)
throw Exception();
}
~CharReader() {
printf("Closing file\n");
fclose(f);
}
int Read() {
int c = fgetc(f);
if (c == EOF && ferror(f))
throw Exception();
return c;
}
private:
FILE* f;
};
int main() {
int c;
try {
CharReader reader("CharReader.cpp");
while ((c = reader.Read()) != CharReader::EndOfFile) {
if (c == 'z') {
printf("'z' byte found in file!\n");
return 2;
}
printf("%02X ", c);
}
} catch (CharReader::Exception& e) {
printf("Error: %s\n", e.what());
return 1;
}
return 0;
}
How it works
As soon as you instantiate the reader it opens the file. The object owns the file until it goes out of scope and the destructor is called. However the program exits (with an early return or via an exception) the reader’s destructor is called and the file closed.
In this case it’s not a big deal (the OS will close files for you anyway), but when you’re writing a file and haven’t flushed, or when you’re connecting to a database and need to commit, or when you’ve disabled interrupts and need to re-enable them regardless of where the function returns — these are great uses for the AC/DC idiom.
Python’s “with” statement
Python (and Java, I think) doesn’t really do AC/DC, because when an object goes out of scope, its __del__ method isn’t necessarily called. __del__ is only called when the object’s reference count goes down to zero or when the object is collected as garbage.
The old-school way of coping with this was try ... finally, for example:
def log(s):
f = open('output.log', 'a+')
try:
f.write(s + '\n')
finally:
f.close()
That way, whether or not a write exception occurred, your file would be closed and the log flushed. But with the introduction of the with statement in Python 2.5 you can do what amounts to AC/DC using with:
def log(s):
with open('output.log', 'a+') as f:
f.write(s + '\n')
The difference is that it’s explicit, rather than implicit based on the scope of the object. (The with protocol uses the special functions __enter__ and __exit__ rather than the constructor and destructor.)
References
Some further reading if you’re interested:
- An explanation of RAII from the Firebird docs
- Exception Safety, a paper by Stroustrup that talks about RAII [pdf]
- Effective C++, a book by Scott Meyers that discusses RAII in depth
microBlog now The Brush Blog
30 January 2009, by Ben add a comment
Well, for quite some time it’s been all western on the quiet front around here. We hope to change that soon! We’ve also done a touch of re-branding — microBlog is now “The Brush Blog”.
The microPledge blog had slowed to a near standstill and because Brush Technology is the parent company, we decided it’d be better to run this as a Brush company blog from now on.
In short: watch this space.







