
DecentURL now requires an account

15 January 2010, by Ben    2 comments

So I finally gave up: DecentURL — my URL nicifier inspired by a reddit comment — now requires you to have an account to create URLs. It was just used too much for phishing, spam, and porn.

I tried various measures to reduce spam: deleting bad URLs by hand, bot detection, IP blacklisting, etc. None of them were very effective — most of the spammers were probably human, and they always seemed to find ways around my (admittedly simple) protection schemes.

Plus, DecentURL has always been just a neat little side project, and I just don’t have the incentive or time to keep deleting spam, not to mention actioning emails from PayPal asking me to delete nasty phishing URLs.

I also realise that URL redirection services have their problems: they add another link in the chain, which slows things down and makes an outage more likely.

Most URL redirectors also obscure the destination. I like to think DecentURL does a bit better here, because it keeps (at least part of) the original domain in the final URL. So http://xxxyucksite.com/blahblah will turn into something like http://xxxyucksite.decenturl.com/z instead of http://tinyurl.com/z.

And I’d like to think DecentURL is still useful for turning URLs like

http://maps.google.com/maps?f=q&source=s_q&hl=en&geocode=&q=hagley+park,+christchurch&sll=-43.537552,172.617488&sspn=0.012802,0.01826&ie=UTF8&hq=&hnear=Hagley+Park&z=15&iwloc=A

into

http://maps.google.decenturl.com/hagley-park
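To make the domain-keeping idea concrete, here is a rough sketch in Python 3. It is not DecentURL's actual code: the function name decent_url, the slug argument, and the naive "drop www and the last label" rule are all assumptions for illustration.

```python
from urllib.parse import urlsplit

def decent_url(long_url, slug, service="decenturl.com"):
    """Build a redirect URL that keeps part of the original domain.

    Hypothetical helper: drops a leading "www." and the final label
    (the TLD), then appends the redirect service's domain and a slug.
    """
    host = urlsplit(long_url).hostname or ""
    labels = host.split(".")
    if labels and labels[0] == "www":
        labels = labels[1:]
    if len(labels) > 1:
        labels = labels[:-1]  # naive: treats only the last label as the TLD
    return "http://%s.%s/%s" % (".".join(labels), service, slug)

print(decent_url("http://xxxyucksite.com/blahblah", "z"))
print(decent_url("http://maps.google.com/maps?f=q&q=hagley+park", "hagley-park"))
```

A real service would also need to handle multi-part TLDs such as .co.nz, which this sketch ignores.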

So I’ve kept DecentURL running. All existing URLs will continue to redirect fine, but you’ll need a DecentURL username/password to make new ones. Just contact me if you’d like one.

MRO: Map Rows to Objects with web.py

6 January 2010, by Ben    2 comments

MRO is not an ORM. It’s not even the reverse of an ORM. Very simply, MRO maps rows to objects. It’s a thin layer on top of web.py’s equally thin database wrapper.

Why? Well, for a minimalist framework, I do like web.py’s close-to-the-SQL approach. But as soon as you have more than a couple of database operations, you find you’ve got code that repeats itself repeats itself. In the case of Gifty, I was repeating column names.

But I didn’t want a fancy ORM that gave me a new domain-specific language to worry about, or that supported every left, right, inner and outer join under the sun. I’ve found that for simple web apps, all I want is a bit of object with my row: a class that has attributes for columns, and lets you select and save rows.

I ended up with something that looks quite similar to Django’s database layer, albeit much simplified. (In hindsight, I might well use Django for this project if I were doing it again.)

So here’s what MRO looks like:

# define a User object and its columns (SQL table name is "users")
class User(Table):
    _table = 'users'
    id = Serial(primary_key=True)
    username = String(secondary_key=True)
    hash = String()
    time = Timestamp(not_null=True, default='now()')

# create the users table with its columns and indexes
User.create()

# insert a new user into the database (defaults used for id and time)
bob = User(username='bob', hash='1234')
bob.save()

# fetch an existing user and update its hash column
bob = User('bob')
bob.hash = '4321'
bob.save()

# fetch an existing user (this time by primary key) and delete it
bob = User(42)
bob.delete()

# fetch an existing user (or None if no user called 'bill')
bill = User.get('bill')
if not bill:
    print 'Old Bill seems not to exist'

# get list of Users whose usernames start with 'ab' (also shows interpolation)
abusers = User.select(where='username LIKE $u', vars={'u': 'ab%'})
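
If you're wondering how little it takes to get "a bit of object with my row", here is a self-contained sketch of the general idea. To be clear, this is not MRO's code: it uses Python 3 and the standard sqlite3 module instead of web.py and PostgreSQL, and the _columns and _key attributes are invented for the example.

```python
import sqlite3

class Table:
    """Tiny row-to-object mapper: subclasses set _table, _columns and _key."""
    _db = sqlite3.connect(":memory:")
    _table = None
    _columns = ()   # column names, listed once per table
    _key = "id"     # primary key column

    def __init__(self, **values):
        # one attribute per column; missing columns default to None
        for col in self._columns:
            setattr(self, col, values.get(col))

    @classmethod
    def get(cls, key):
        """Return the object whose primary key equals key, or None."""
        sql = "SELECT %s FROM %s WHERE %s = ?" % (
            ", ".join(cls._columns), cls._table, cls._key)
        row = cls._db.execute(sql, (key,)).fetchone()
        return cls(**dict(zip(cls._columns, row))) if row else None

    def save(self):
        """Insert the row, or replace the existing row with the same key."""
        sql = "INSERT OR REPLACE INTO %s (%s) VALUES (%s)" % (
            self._table, ", ".join(self._columns),
            ", ".join("?" for _ in self._columns))
        self._db.execute(sql, [getattr(self, c) for c in self._columns])

class User(Table):
    _table = "users"
    _columns = ("id", "username", "hash")

Table._db.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT, hash TEXT)")
User(id=1, username="bob", hash="1234").save()
bob = User.get(1)
bob.hash = "4321"
bob.save()
print(User.get(1).hash)
```

The column names live in one place (_columns), which is the repetition the real thing is meant to remove.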

So, if you use web.py for a small web app, but you want a touch of class (ahem) with your database operations, go ahead and use MRO. Be aware that it’s an in-house tool (for instance, it only supports PostgreSQL at the moment).

Get MRO’s source code: mro.py

Go Forth and WikiReadit

4 December 2009, by Ben    4 comments

The WikiReader is a little $99 gizmo that lets you read Wikipedia. Yep, that’s all it does. No mobile phone, no movie player, no Webkit-enabled browser.

There’s something about a product that does one simple thing well. But what really sets the WikiReader apart is that it lasts a year on 2 AAA batteries with no charging. How? The low-power LCD screen, and the tiny microprocessor.

But what’s even cooler, at least for someone who learned to program by dabbling in Forth, is that the device has a built-in Forth interpreter for testing the hardware and running small programs.

I was pleasantly surprised – I know that Forth is good for embedded work on tiny micros, but since the main WikiReader app is written in C, I was curious why they chose Forth for testing and apps. So I asked Christopher Hall, one of the main firmware developers. His reply was very informative, and he’s kindly allowed me to copy it here:

I have written testing programs in several languages, but compiled programs always have the problem of the edit, cross compile, load, and try to debug. Sometimes the platform can run BSD or Linux, and then you can have the full suite of tools on the platform. This is okay if the person doing the initial testing can write programs, but often the test is how to toggle a particular I/O line on/off and see the effect on the rest of the circuit. Then having some kind of scripting on the platform seems the best way to achieve this.
For the initial testing, just start the interpreter REPL and you can start the initial tests. Initially I looked at TCL and Python which I have used before, but they would take far too long to port since they need a lot of Posix system calls which do not exist for this platform.
I also considered Hedgehog, Pico Lisp or perhaps some simple Scheme interpreter but the syntax would probably be too difficult for the hardware engineers to use. Forth is pretty simple syntax and RPN was probably not too difficult for them to learn. Also it was easy to build the Forth interpreter, incrementally adding features until it is now an almost ANSI standard Forth.
Since I added all the device registers the hardware engineers can use commands like the following (I used the same register names as the datasheet):
P0_P0D p?    \ display value of port
1 P0_P0D p!  \ set port to 1
While waiting for the main application development I could build tests for items like the LCD and CTP with just a serial connection on the device itself – using cut/paste from Emacs to picocom to upload Forth words. This is much quicker than cross-compiling and swapping SD cards.
The Forth is rather slow in compiling, the dictionary search is quite slow for example, and the indirect threading adds run-time overhead so in its present form it is probably not fast enough for the main reader application, but for quick applications to try things out I find it very convenient.
Also, the first version was hand translated from a version of EForth for Linux before I migrated it to the ANSI standard. (I kept copies in samo-lib/forth/EForthOriginals subdirectory.)

Very neat. If Lisp is the secret weapon for developing web apps, maybe Forth is it for embedded apps. Both are extensible at the language level and both have real macros, but Lisp is high level and Forth is low level.

Well, you know what to buy me for Christmas:

feeling-nice? if  WikiReader buy  then

catdoc ported to Windows

15 September 2009, by Ben    2 comments

Recently I had to automatically extract text from a bunch of Word documents under Windows. I liked the looks of catdoc, but didn’t see a native Win32 port around. The source code looked so very close to compiling under MinGW, so I made the few minor changes necessary and got it working (catdoc, catppt, and xls2csv). Native Win32 executables, support for long filenames, etc.

Basically all I did was:

Nothing special, and it’s not perfect. But here is a zip of the compiled binaries and (GPL-licensed) source code, just for you:

catdoc-0.94.2-win32.zip

Code generation with X-Macros in C

21 August 2009, by Ben    6 comments

C and C++ are relatively non-dynamic languages, and one thing this means is that not repeating yourself (aka DRY) is often harder than in a language like Python.

For instance, when you’ve got a config file, a config structure, config defaults, and a config printer, you want all those things to come from a single spec. One good way around this problem is code generation — for example, using an XML spec with Python and Cheetah templates to generate C code.

But for simple C projects this can be overkill. And it turns out the age-old C preprocessor contains a few goodies that help with DRY programming. As the Wikipedia article says, one little-known usage pattern of the C preprocessor is known as “X-Macros”.

So what are X-Macros?

An X-Macro is a standard preprocessor macro (or just a header file) that contains a list of calls to a sub-macro. For example, here’s the config.def file for the INI-parsing code we’ll be looking at (uses my simple INI parser library):

/* CFG(section, name, default) */
CFG(protocol, version, "0")
CFG(user, name, "Fatty Lumpkin")
CFG(user, email, "fatty@lumpkin.com")
#undef CFG

That’s an X-Macro that defines a config file with a protocol version and user name and email fields. If we weren’t following DRY, our main code would specify the field names in the struct definition, repeat them for setting the default values, and repeat them again for loading and printing the structure.

To do this in X-Macro style, we just #include "config.def" repeatedly, but #define CFG to what we need each time we include it. Sticking with show-me-the-code, here’s a program that loads, stores, and prints our config:

#include <stdio.h>
#include <string.h>
#include "../ini.h"

/* define the config struct type */
typedef struct {
    #define CFG(s, n, default) char *s##_##n;
    #include "config.def"
} config;

/* create one and fill in its default values */
config Config = {
    #define CFG(s, n, default) default,
    #include "config.def"
};

/* process a line of the INI file, storing valid values into config struct */
int handler(void *user, const char *section, const char *name,
            const char *value)
{
    config *cfg = (config *)user;

    if (0) ;
    #define CFG(s, n, default) else if (stricmp(section, #s)==0 && \
        stricmp(name, #n)==0) cfg->s##_##n = strdup(value);
    #include "config.def"

    return 1;
}

/* print all the variables in the config, one per line */
void dump_config(config *cfg)
{
    #define CFG(s, n, default) printf("%s_%s = %s\n", #s, #n, cfg->s##_##n);
    #include "config.def"
}

int main(int argc, char* argv[])
{
    if (ini_parse("test.ini", handler, &Config) < 0)
        printf("Can't load 'test.ini', using defaults\n");
    dump_config(&Config);
    return 0;
}

Note that config.def is included 4 times, so you’d have to repeat yourself 3 times with no X-Macros. I admit it’s not beautiful artwork. But it’s not too ugly either — and it gets the job done with nothing but C’s built-in code generator.
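
For comparison, here is roughly how the same single-spec idea looks in a dynamic language. This is only a sketch in Python 3, using the standard configparser module rather than the INI library above; the names CONFIG_SPEC, load_config and dump_config are made up for the example.

```python
import configparser

# (section, name, default): the single spec, playing the role of config.def
CONFIG_SPEC = [
    ("protocol", "version", "0"),
    ("user", "name", "Fatty Lumpkin"),
    ("user", "email", "fatty@lumpkin.com"),
]

def load_config(ini_text=""):
    """Return a dict of section_name -> value, starting from the defaults."""
    config = {"%s_%s" % (s, n): default for s, n, default in CONFIG_SPEC}
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    for s, n, _ in CONFIG_SPEC:
        if parser.has_option(s, n):
            config["%s_%s" % (s, n)] = parser.get(s, n)
    return config

def dump_config(config):
    """Print all the variables in the config, one per line."""
    for s, n, _ in CONFIG_SPEC:
        key = "%s_%s" % (s, n)
        print("%s = %s" % (key, config[key]))

dump_config(load_config("[user]\nname = Bob Smith\n"))
```

Here the spec is ordinary data that gets looped over at run time, whereas the X-Macro version does the equivalent looping at compile time in the preprocessor.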

Site Doublers: Website optimization

7 August 2009, by Ben

Recently we’ve been running some Google AdWords and doing some SEO (Search Engine Optimization), and I must say it helps to know what you don’t know.

John Hyde of Site Doublers has been a great help on this score. He’s a consultant who helps your business optimize traffic to your website, via search engines and advertisements, and convert visitors to sales once they arrive.

John’s been very professional to work with: he knows what he’s on about, he asks the right questions, and he does his homework. All of which is to say, if you run a website, talk to John on +64 3 942 3799 or visit his website:

SiteDoublers logo

P.S. And no, John didn’t pay me to write this. :-)

fabricate: The better build tool

28 July 2009, by Ben

We’ve been using Bill McCloskey’s memoize to build projects for a while now, and it works nicely, but only on Linux.

So enter fabricate. It was developed by us guys at Brush Technology for in-house use, but we thought it was cool enough to release into the wild.

From the project page:

fabricate is a build tool that finds dependencies automatically for any language. It’s small and just works. No hidden stuff behind your back. It was inspired by Bill McCloskey’s make replacement, memoize, but fabricate works on Windows as well as Linux.

Easy IP-to-country lookup in Python

10 July 2009, by Ben    5 comments

We’re branching into the U.S. market with our wedding registry website, Gifty. So first we grabbed a .com domain name, but we also had to make sure the price shows correctly in USD or NZD depending on where you’re from.

There are a number of tools available for geo-locating someone based on their IP address, including some free ones. MaxMind is pretty popular and nice to use, and their free GeoLite Country database did the trick for me.

Gifty runs on Python, so I wanted something I could just use in pure Python. It turns out that pygeoip is a nice Python replacement for MaxMind’s C-based API.

However, I was only interested in the country-code lookup, so I decided to strip it down and release the two-pages-of-Python version I’m using. Just grab MaxMind’s database and put the code in Python’s Lib/site-packages directory:

get geoip.py

And then to use it, simply type:

>>> import geoip
>>> geoip.country('202.21.128.102')
'NZ'
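
From there, picking a currency is a one-liner. The sketch below (Python 3) is illustrative only: the function name currency_for_ip and the "NZ gets NZD, everyone else gets USD" rule are assumptions, and it uses a stand-in lookup so it runs without the GeoLite database.

```python
def currency_for_ip(ip, lookup):
    """Pick a display currency from an IP address.

    lookup is any callable mapping an IP string to a two-letter country
    code (or None), such as geoip.country above. The NZ-or-else-USD rule
    is an assumption for illustration.
    """
    return "NZD" if lookup(ip) == "NZ" else "USD"

# Stand-in lookup table so the example runs without the real database
fake_lookup = {"202.21.128.102": "NZ"}.get
print(currency_for_ip("202.21.128.102", fake_lookup))  # NZD
print(currency_for_ip("192.0.2.1", fake_lookup))       # USD
```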

Blast from the demoscene past

25 June 2009, by Ben    one comment

Do you like the demoscene? Or do you just like smaller, faster, or embedded code? Read on.

When I was 14, I started learning how to program, for at least two reasons:

Scratch that. I still like the demoscene. I mean, who else can make incredible 3D tube or lattice demos in a 256-byte executable? Try them — both still run fine under Windows XP.

So I read diskmags and tutes to learn how to program the VGA hardware, push pixels to 0xA000:0000, and use Mode X. Oh, and I learnt about sin and cos (for basic 2D and 3D rotation) before I learnt them at school. Then there were effects: the fire effect, plasma, starfields, wormholes, etc, etc. (Click on the piccy to the right to download some of my old source.)
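
For the curious, the 2D rotation in question is just the textbook formula. Here it is as a small Python 3 sketch (not taken from my old source; the function name rotate_2d is made up):

```python
import math

def rotate_2d(x, y, angle):
    """Rotate the point (x, y) anticlockwise about the origin by angle radians."""
    c, s = math.cos(angle), math.sin(angle)
    return (x * c - y * s, x * s + y * c)

# A quarter turn takes the point (1, 0) to (0, 1), give or take rounding
x, y = rotate_2d(1.0, 0.0, math.pi / 2)
print(round(x, 6), round(y, 6))
```

3D rotation is the same thing applied to two coordinates at a time, once per axis.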

Anyway, back from Second Reality to the real thing …

As I’ve noted before, I’m not exactly in favour of bloatware. But in today’s “a GB here, a GB there” world, is small still beautiful? I think so, for two reasons:

Embedded programming

In the embedded world, size still matters a lot. Microcontrollers are getting bigger and faster, sure, but in electronic products there’s often a place for the small ones (say 64KB flash, 2KB RAM). Just the other day, I cut our code size by 900 bytes, which was a significant percentage of the total — less code to download, test, and maintain.

And it’s not only important for small micros, but also to limit download time and cost for in-field updates. If you want to update code for 1000 units over a fairly slow and costly radio link, small is good.

Binary diffs or deltas are really good for this. My brother Berwyn has developed a proof-of-concept binary diffing algorithm which is designed for tiny embedded systems — contact us if you’re keen to hear more.

Binary diffing isn’t new, of course — bsdiff already does something similar for Firefox’s updates, so you only need to download a small update. But bsdiff doesn’t work on small embedded systems, because it uses a compression program which requires a fair amount of RAM (bzip2).
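
To show the basic idea of a binary delta, here is a naive sketch in Python 3. It is neither Berwyn's algorithm nor bsdiff: it leans on the standard difflib module to find matching runs (something a tiny micro could not afford to do itself), and the copy/insert operation format is invented for the example.

```python
from difflib import SequenceMatcher

def make_delta(old, new):
    """Describe new as a list of copy-from-old and insert-literal operations."""
    delta = []
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            delta.append(("copy", i1, i2))        # reuse bytes already on the device
        elif j2 > j1:
            delta.append(("insert", new[j1:j2]))  # bytes that must be transmitted
        # a pure "delete" needs no operation: those old bytes are simply skipped
    return delta

def apply_delta(old, delta):
    """Rebuild the new image from the old image plus the delta."""
    out = bytearray()
    for op in delta:
        if op[0] == "copy":
            out += old[op[1]:op[2]]
        else:
            out += op[1]
    return bytes(out)

old = b"firmware v1: blink LED every 500 ms"
new = b"firmware v2: blink LED every 250 ms"
delta = make_delta(old, new)
assert apply_delta(old, delta) == new
sent = sum(len(op[1]) for op in delta if op[0] == "insert")
print("literal bytes to send:", sent, "of", len(new))
```

Only the insert operations carry new data over the link; the copy operations are just offsets into code the unit already has.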

To go fast, do less

Yep, as the guy said: To go fast, do less.

And KISS. Keeping it Short and Simple means less code to test, and if you’re using basically the right approach and algorithm, it usually also means faster code. And to follow my own advice, I’m keeping this section short.

Conclusion

In a word, if you’re a budding hacker, or the parent of a budding hacker, teach them that small is still beautiful. And get ‘em started with the demoscene. There is still a pretty active ’scene community, and here are some starting points:

Pilot ships through Google Earth

21 May 2009, by Ben    one comment

Paul van Dinther, one of the folks we work with, has released a ship simulation game that uses Google Earth for its “terrain” data:

PlanetInAction.com released a new simulation game called Ships. In Ships you take the helm from a choice of 3D ships. What is special about this game is that it makes use of the rich 3D data present in Google Earth. The entire world is your playground.

“Ships” is a graphically rich environment with intricate visual effects that runs right inside your web-browser. All you need is a small Google Earth plugin. Take control now of the majestic Queen Mary 2 and hit the authentic fog horn as you leave the port of Rotterdam in the Netherlands. If water is not your thing then why not climb aboard the airship Hindenburg and check-out the Swiss alps.

Have fun sailing around the (real) world!