This page is a collection of news feeds from friends of mine and myself. Its not, as the name might imply, a planet of Horms, or for the most part even stuff written by me. For that, please go here.

Nearby Planets: Planet SLUG, Planet Linux Australia, Planet Russell
Subscriptions: Alan Robertson Alexander Reeder Anand Kumria Andrew Cowie Andy Fitzsimon Benno Chizuko Horman Chris DiBona Chris Yeoh Craige McWhirter Dave Miller Dave Ruys David Luyer Erik de Castro Lopo Horms Hugh Blemmings James Morris Jan Schmidt Jaq Jeff Waugh Jeremy Kerr John Ferlito Joseph Arruda Kfish Mark Greenaway Martin Pool Mary Gardiner Nick Jenkins Ozone Pete Ryland Peter Hardy Peter Nixon Pia Waugh Raster Real World Haskell Robert Collins Roberto Nibali Roger Barnes Russell Coker Rusty Sam Johnston Silvia Pfeiffer Ted T'so Tong Master Tractorgen - Commits Wichert Akkerman fusion94

-----

January 07, 2009

Rusty

Fun with cpumasks

I've been meaning for a while to write up what's happening with cpumasks in the kernel. Several people have asked, and it's not obvious so it's worth explaining in detail. Thanks to Oleg Nesterov for the latest reminder.

The Problems

The two obvious problems are

  1. Putting cpumasks on the stack limits us to NR_CPUS around 128, and
  2. The whack-a-mole attempts to fix the worst offenders is a losing game.

For better or worse, people want NR_CPUS 4096 in stock kernels today, and that number is only going to go up.

Unfortunately, our merge-surge development model makes whack-a-mole the obvious thing to do, but the results (creeping in largely unnoticed) have been between awkward and horrible. Here's some samples across that spectrum:

  1. cpumask_t instead of struct cpumask. I gather that this is a relic from when cpus were represented by an unsigned long, even though now it's always a struct.
  2. cpu_set/cpu_clear etc. are magic non-C creatures which modify their arguments through macro magic:
    #define cpu_set(cpu, dst) __cpu_set((cpu), &(dst))
    
  3. cpumask_of_cpu(cpu) looked like this:
    #define cpumask_of_cpu(cpu)
    (*({
            typeof(_unused_cpumask_arg_) m;
            if (sizeof(m) == sizeof(unsigned long)) {
                    m.bits[0] = 1UL(cpu);
            } else {
                    cpus_clear(m);
                    cpu_set((cpu), m);
            }
            &m
    }))
    
    Ignoring that this code has a silly optimization and could be better written, it's illegal since we hand the address of a going-out-of-scope local var. This is the code which got me looking at this mess to start with.
  4. New "_nr" iterators and operators have been introducted to only go to up to nr_cpu_ids bits instead of all the way to NR_CPUS, and used where it seems necessary. (nr_cpu_ids is the actual cap of possible cpu numbers, calculated at boot).
  5. Several macros contain implicit declarations in them, eg:
    #define CPUMASK_ALLOC(m)        struct m _m, *m = &_m
    ...
    #define node_to_cpumask_ptr(v, node)                                    \
                    cpumask_t _##v = node_to_cpumask(node);                 \
                    const cpumask_t *v = &_##v
    
    #define node_to_cpumask_ptr_next(v, node)                               \
                              _##v = node_to_cpumask(node)
    

But eternal vigilance is required to ensure that someone doesn't add another cpumask to the stack, somewhere. This isn't going to happen.

The Goals

The Solution

These days we avoid Big Bang changes where possible. So we need to introduce a parallel cpumask API and convert everything across, then get rid of the old one.

Conclusion

At this point, we will have code that doesn't suck, rules which can be enforced by the compiler, and the possibility of setting CONFIG_NR_CPUS to 16384 as the SGI guys really want.

Importantly, we are forced to audit all kinds of code. As always, some few were buggy, but more were unnecessarily ugly. With less review happening these days before code goes in, it's important that we strive to leave code we touch neater than when we found it.


January 07, 2009 09:49 AM

James Morris

Kernel Security Wiki

This is to announce a kernel security subsystem wiki, supported by the kind folk at kernel.org. It's intended for use by community developers and users of kernel security projects. (Note that it's not related to security response: the security contact for the kernel per the MAINTAINERS file is security @ kernel.org).

So far, there are sections on working with the security-testing git repo, a listing of various kernel security projects, and an events page.

If there's something you'd like to see or change on the wiki (particularly if it relates to your own project), create an account and make it so.


January 07, 2009 09:46 AM

-----

January 06, 2009

fusion94

Command Line Port Scanner OS X

Who knew that OS X has a command line port scanner? I for one didn't. I've always just used the Network Utility version. Here's how to access it.

As the superuser run this:

ln /Applications/Utilities/Network\ Utility.app/Contents/Resources/stroke /bin/stroke
chmod uo+x /bin/stroke

Then as any user you can call it from the terminal as such:

stroke address startPort endPort

So to check your local machine do:

stroke 127.0.0.1 0 1024

You should see output that looks similar to this:

Port Scanning host: 127.0.0.1

Open TCP Port: 80 http
Open TCP Port: 631 ipp


January 06, 2009 05:34 PM

James Morris

Security changes in the 2.6.28 kernel

Version 2.6.28 of the Linux kernel was released during Christmas, so I thought it'd be worthwhile waiting until after typical vacation days to post a summary of changes to the security subsystem. As always, thanks to the Kernel Newbies folk who track major kernel changes.


This was not a terribly exciting release for the security subsystem.

Thus far for the 2.6.29 kernel, the main change is the massive credentials API change from David Howells. This has caused a couple of regressions, which were picked up by subsystem testing of Linus' tree. Fixes have been developed and are currently partially merged upstream. It seems we need to get more testing done in linux-next to avoid such breakage during future merge windows.

Also noteworthy is the merge of the pathname security hooks for LSM, which should pave the way for TOMOYO and AppArmor in 2.6.30, subject to the general patch submission review process. TOMOYO is only a couple of acks from approval, has been baking in -mm, and is pretty much self-contained. It may even appear in 2.6.29 if the merge window is open for features long enough.


January 06, 2009 11:19 AM

Martin Pool

journalling vs blogging

Alex Payne writes of keeping a private journal, separate from his blogs.

The primary and obvious difference between journaling and blogging is that a journal is private. A journal exists only for the author’s personal consumption, or possibly as a posthumous record of a life. Without the modest audience my blog has accrued, I have no incentive to filter what I write for content or style. I can be as dry or as flowery as I like and nobody need suffer.


I've been doing this too, and went through some kind of dry spell in blogging in the process of working it out.

It seems like by blogging in 2009 you speak to a much larger audience, or a different kind of audience, to that you might have reached in say 2001. I remember one friend then writing very personal content in a blog, hidden in html comments. Was it to filter it to more technical readers, or to people looking closely rather than just skimming? Or was it, I speculate, a kind of desire to work out the boundary between the public and personal.

Beyond keeping a private diary, sometimes online and sometimes on paper, I've found it useful to keep special journals, append-only and dated, just in text files, on particular topics or projects. If I come back to a project after a gap of a few weeks or even a few minutes it reminds me where I was, and seems to help in reestablishing flow. And if it's the kind of issue that takes long-term contemplation it's quite enlightening to see what I thought of it six months ago. One can easily believe one always thought what one thinks now.


January 06, 2009 09:46 AM

trying out dopplr

I'm heading to South America later this month, which should be great. It's on my dopplr page, which looks like an interesting little service.


January 06, 2009 09:13 AM

On coding for fun

On coding for fun -- stay fresh by living a rounded life, and doing small fun technical projects. You know it makes sense.


January 06, 2009 09:11 AM

Dave Ruys

hilker: radicalrevolution: thedailywhat: Duly...



hilker:

radicalrevolution:

thedailywhat:

Duly noted.

[via.]

 All Australians, please also take note.

You’re not actually watching the “footy”, you are in fact watching the “throwy”, or the “barge into peopley”, or the “tedious outdoor pub-punch-on….y”


January 06, 2009 02:40 AM

-----

January 05, 2009

Benno

urllib2 and web applications

For a little personal project I’ve been working on recenly, I needed to create some unit tests for a CGI script, sorry, that should be, a cloud computing application. Of course, I turned to my trusty hammer, Python, and was a little disappointed to discover it didn’t quite have the batteries I needed included.

I kind of thought that urllib2 would more or less do what I wanted out of the box, but unfortunately it didn’t and (shock, horror!) I actually needed to write some code myself! The first problem I ran into is that urllib2 only supports GET and POST out of the box. HTTP is constrained enough in the verbs it provides, so I really do want access to things like PUT, DELETE and HEAD.

The other problem I ran into, is that I did’t want things automatically redirecting (although clearly this would be the normal use-case), because I wanted to check I got a redirect in certain cases.

The final problem that I had is that only status code 200 is treated as success, and other 2xx codes raise exceptions. This is generally not what you want, since 201, is a perfectly valid return code, indicating that new resource was created.

So, urllib2 is billed as An extensible library for opening URLs using a variety of protocols, surely I can just extend it to do what I want? Well, it turns out that I can, but it seemed to be harder than I was expecting. Not because the final code I needed to write was difficult or involved, but because it was quite difficult to work out what the right code to write is. I want to explore for a little bit why (I think) this might be the case.

urllib2 is quite nice, because simple things (fetch a URL, follow redirects, POST data), are very easy to do:

ret = urllib2.urlopen("http://some_url/") data = ret.read()

And, it is definitely possible to do more complex things, but (at least for me), there is a sharp discontinuity in the API which means that learning the easy API doesn’t help you learn the more complex API, and also the documentation (at least as far as I read it), doesn’t make it apparent that there are kind of two modes of operation.

The completely frustrating thing is that the documentation in the source file is much better than the online documentation! Since it talks about some of the things that happen in the module, which are otherwise “magic”.

For example, the build_opener function is pretty magical, since it does a bunch of introspection, and ends up either adding a handler or replacing a handler depending on the class. This is explained in the code as: if one of the argument is a subclass of the default handler, the argument will be installed instead of the default. , which to me makes a lot of sense, where-as the online documentation describes it as: Instances of the following classes will be in front of the handlers, unless the handlers contain them, instances of them or subclasses of them: ....<list of default handlers>. For me the former is much clearer than the latter!

Anyway, here is the code I ended up with:

opener = urllib2.OpenerDirector()
opener.add_handler(urllib2.HTTPHandler())

def restopen(url, method=None, headers=None, data=None):
    if headers is None:
        headers = {}
    if method is None:
        if data is None:
            method = "GET"
        else:
            method = "POST"
    return opener.open(HttpRestRequest(url, method=method, 
                                       headers=headers, data=data))

So, conclusions, if the dos don’t make sense, read the code. If you are writing an API, try to make it easier to get from the easy case to the more complex case. (This is quite difficult to do, and I have definitely been guilty in the past of falling into the same trap in APIs I have provided!). If you can’t get the API to easily transition from easy to hard, make sure you document it well. Finally, Python is great language for accessing services over HTTP, even if it does require a little bit of work to get the framework in place.


January 05, 2009 11:17 PM

Ozone

Objective-C 2.0 Accessors & Memory Management

Quite often, you may have simple setter methods that need to do a just a tiny bit of work before or after setting an instance variable. For example, maybe you need to redraw something after setting the property of an object. So, instead of writing this:

    [self setBackgroundColor:[NSColor blueColor]];
    [view setBackgroundColor:[NSColor blueColor]];

You’d probably want to move the relevant code to your -setBackgroundColor: accessor instead:

    - (void)setBackgroundColor:(NSColor*)color
    {
        // Assuming that _backgroundColor is the ivar you want to set
        if(_backgroundColor != color)
        {
            [_backgroundColor release];
            _backgroundColor = [color retain];
    
            // Update the view's background color to reflect the change
            [view setBackgroundColor:_backgroundColor];
        }
    }

Then you can simply call -setBackgroundColor: and expect it all to work nicely:

    // -setBackgroundColor: updates the view's background color
    // automatically now
    [self setBackgroundColor:[NSColor blueColor]];

(You could use Key-Value Observing to do this, but I generally avoid KVO for simple intra-class property dependencies like this. I don’t think the overhead of maintaining all the KVC dependencies and KVO-related methods is worth the cost.)

Of course, the above method requires that you write all that stupid boilerplate memory management code in the accessor. Instead of doing that, I tend to declare a private _backgroundColor property in the class, @synthesize a method for the private property, and then use the private property’s generated accessors instead:

    @interface MyClass ()
    
    // Declare a _private_ _backgroundColor property (thus the underscore
    // in front, and why it's declared in a class continuation rather than
    // in the public header)
    @property (copy, setter=_setBackgroundColor:) NSColor* _backgroundColor;
    
    @end
    
    //
    
    @implementation MyClass
    
    @synthesize _backgroundColor;
    
    - (NSColor*)backgroundColor
    {
        return [self _backgroundColor];
    }
    
    - (void)setBackgroundColor:(NSColor*)color
    {
        // Use the private property to set the background colour, so it
        // handles the memory management bollocks
        [self _setBackgroundColor:color];
    
        [view setBackgroundColor:[self _backgroundColor]];
    }
    
    ...
    
    @end

With that technique, it’s possible to completely directly setting ivars, and thus avoid -retain and -release altogether. (You’ll still need to use -autorelease at various times, of course, but that’s reasonably rare.) We have some source code files that are well over 2000 lines of code without a single explicit [_ivar retain]; or [_ivar release]; call thanks to this technique. (Yeah, 2000 lines is also large and the class needs refactoring, but that’s another story.)

Of course, you could just use garbage collection which avoids 99% of the need for this bollocks:

    - (void)setBackgroundColor:(NSColor*)color
    {
        // Yay GC!
        self->_backgroundColor = color;
    
        [view setBackgroundColor:self->_backgroundColor];
    }

But plenty of us don’t have that luxury yet. (iPhone, ahem.)


January 05, 2009 10:33 PM

git & less

For the UNIX users out there who use the git revision control system with the oldskool less pager, try adding the following to your ~/.gitconfig file:

[core]
    # search for core.pager in
    # <http://www.kernel.org/pub/software/scm/git/docs/git-config.html>
    # to see why we use this convoluted syntax
    pager = less -$LESS -SFRX -SR +'/^---'

That’ll launch less with three options set:

The last one’s the handy tip. I browse commits using git diff before committing them, and like being able to jump quickly back and forth between files. Alas, since less is a dumb pager and doesn’t understand the semantics of diff patches, we simply set the find regex to ^---, which does what we want.

Of course, feel free to change the options to your heart’s content. See the less(1) manpage for the gory details.

As the comment in the configuration file says, you’ll need to use the convoluted less -$LESS -SFRX prefix due to interesting git behaviour with the LESS environment variable:

Note that git sets the LESS environment variable to FRSX if it is unset when it runs the pager. One can change these settings by setting the LESS variable to some other value. Alternately, these settings can be overridden on a project or global basis by setting the core.pager option. Setting core.pager has no affect on the LESS environment variable behaviour above, so if you want to override git’s default settings this way, you need to be explicit. For example, to disable the S option in a backward compatible manner, set core.pager to "less -+$LESS -FRX". This will be passed to the shell by git, which will translate the final command to "LESS=FRSX less -+FRSX -FRX".

(And sure, I could switch to using a different pager, but I’ve been using less for more than a decade. Yep, I know all about Emacs & Vim’s diff-mode and Changes.app. It’s hard to break old habits.)


January 05, 2009 02:34 PM

Mark Greenaway

My vacation scholarship started today. I'm sitting in one of the honours offices slowly summarising from the textbook I've been assigned to understand. It's the sequel to the analysis textbook I used in second year, so I'm familiar with the author's writing style and standard methods of proof. I loved that course not only from the front and from behind, but from every sequence converging to it in R3, so I'm enjoying it so far. For those that way inclined, I'm looking at derivatives of measures and absolute continuity.
The office you occupy during a vacation scholarship is usually the one you stay in for your honours year, so I'll spend some time making the place comfortable. It seems like a good place to work.


January 05, 2009 05:39 AM

Dave Ruys

Two thousand and None

I decided that this year would be no alcohol for me, and no coffee on weekdays, in an effort to give my body some detox and to cut back on the disgraceful amounts of junk food that I eat when I drink either type of beverage. I thought 12 months should be a decent stint.  People keep asking whether moderation would be a better approach, but frankly I suck at it. Also, I find it a lot easier to work towards moderation from a point of total abstinence, than to try to work backwards from a point of over-indulgence (Silly Season….yeesh!).

You’d better thank me for this, heart….


January 05, 2009 04:13 AM

-----

January 04, 2009

Benno

IMAP passwords in Android

I’ve been setting up my new Android phone and found out that the password handling code in the IMAP client doesn’t work so well with my Dovecot IMAP server.

Now, I really can’t be bothered working out which side is doing it wrong, but Dovecot expects your password to be contained in double-quotes if it contains special character. (No, I don’t precisely know what those characters are!). And, of course, if you are double-quoing the string, you then need to escape and double-quotes in the password itself. Now, like I said, no idea of this is the correct (according to the standard) behaviour, but it is the behaviour that I have to deal with, since I can’t mess with the server. Of course, the Android IMAP client doesn’t do this escaping, and so you get an error message indicating your username / password is incorrect. Frustratingly, the error message doesn’t pass on the information from the server giving detail on exactly what the problem is so you are left guessing.

Anyway, it turns out if you manually escape the password that you give to the Android email client things work fine. Of course, the SMTP server doesn’t need this escaping, and fails if you do have it, so you need to type in a different, unescaped, password for SMTP. Fun and games!

Looking at the lastest source code for Android, it looks like this has been properly fixed, so hopefully and upgrade to the Cupcake branch in the near future will solve the problem.


January 04, 2009 01:47 PM

Silvia Pfeiffer

Top 10 commercials for 2008 on YouTube

I spent the last few days doing some nice research for Vquence, where I was able to watch lots of videos on YouTube. Fun job this is! :-) The full article is on the Vquence metrics blog.

One of the key things that I’ve put together is a list of top 10 commercials for 2008:

Rank Video Views Added
1 Cadbury - Gorilla 3,338,011 August 31, 2007
2 Nike - Take it to the NEXT LEVEL 3,184,329 April 28, 2008
3 Macbook Air 2,648,717 January 15, 2008
4 Centraal Beheer Insurance - Gay Adam 2,512,425 May 30, 2008
5 Vodafone - Beatbox 2,380,237 March 17, 2008
6 E*Trade - Trading Baby 2,061,818 February 01, 2008
7 Guitar Hero - Heidi Klum 1,068,055 November 03, 2008
8 Bridgestone - Scream 980,406 January 30, 2008
9 Bud Light- Will Ferrell 966,177 February 04, 2008
10 Careerbuilder - Follow your heart 605,465 February 03, 2008
Favorable mention OLPC - John Lennon 527,953 December 25, 2008
Favorable mention Blendtec - iPhone 3G 2,711,195 July 11, 2008
Favorable mention Stide Gum - Where the hell is Matt? 15,859,204 June 20, 2008

Enjoy! And let me know in the comments if you know of any other video ad released in 2008 in the same ballpark number of views that is an actual tv-style commercial.


January 04, 2009 11:42 AM

Chizuko Horman

旅に出ま〜す


クリスマス~お正月のお祭りウィークがようやく終わり、

明日から、Milduraというところへ。

シドニーから車で10時間西へ行ったところ。

マリーリバーのほとりに住む、義母の妹さんのところへ行くのです。

道中はファームヒツジおうし座ブタヒヨコあり、砂漠ビックリマークありのなかなかワクワクな旅になりそうにひひ

をたっぷり用意したし、食料も用意したし、ウチのプランツカマキリのドロシーごと預けたし、
まあまあ準備万端かな。

それではいってきますロケット


January 04, 2009 10:07 AM

-----

January 03, 2009

Mary Gardiner

RAID is not a backup solution, times one million

Via slashdot.org (yes really, I still pull in the headlines, although the miracle of feed readers has allowed me to confirm that yes, Ars Technica is a better read), a site called Journal Space, which hosted weblogs, lost all their data. They only had a RAID setup as backup, that is, a system that mirrors content between two disks and is designed to protect against disk failure. If you've heard of RAID, you hopefully already know that it is not the same as a backup: if software error or an accident or a malicious act deletes data from one disk, the RAID setup faithfully mirrors it to the other disk. If not, imagine that you have two magical whiteboards. One is copied exactly to the other. If one magical whiteboard totally breaks down, excellent, you have a full copy of your meeting notes and doodles on the other. (Note for accuracy, not all RAID configurations produce a full mirror and sometimes the mirror is spread over more than one spare disk. But you get the idea.) However, if someone rubs something off the whiteboard, or falls over while holding a can of solvent and splashes it on the first whiteboard, everything on it is immediately deleted from the other.

Instead, for home machines you want, most likely, an incremental backup, that is, a separate disk/machine with several copies of your data going back in time. Your data as it was an hour ago. Your data as it was a day ago. Your data as it was a month ago. And so on. I have snapshots of my data for every three hours over the last two months. (Sensible backup programs will notice when data is the same across two or more time periods and only store it once, so your backup disk does not need to be so very much larger than your normal disk.)

For business systems you want both: the quick recovery from disk failure that mirroring systems such as RAID offer, and incremental backups. (I don't maintain business grade systems, ask someone else for best practices if you need them. Internally consistent database backups are something you want to pay particular attention to.)

I note this because in November I gave a talk on home backups for Linux at SLUG and there is one other point of interest: do not trust third party providers to have good backups. It is getting increasingly common to have a lot of your most interesting data on someone else's servers: your email on Google's, your blog over at wordpress.com, contact details for all your friends on Facebook, and so on. But your provider can make both their own catastrophically bad decisions, like Journal Space, and have their creditors suddenly sell their hard disks off in a fire sale, as happened to Digital Railroad.

Which is a big problem, because a lot of third party providers do not provide an easy way to get your data ('easy' would be both a documented API accessible from common programming languages and an installable application), and lots don't provide any way at all. (There's also a whole batch of interesting issues to do with your comments or Wall postings or whatever: you don't necessarily have the right to reproduce them and there would be privacy implications when allowing you to back them up and reproduce them on some other side. LiveJournal, for one, solves this problem by not allowing easy backups of comments left on your journal.)

If your email host, blog host, calendar host, documents host or social networking host failed or deleted your account, how would you fare?


January 03, 2009 01:28 AM

-----

January 02, 2009

Benno

Debugging embedded code Using GDB and the Skyeye simulator

This post is basically a crash course in using gdb to run and debug low-level code. Learning to us a debugger effectively can be much more powerful than ghetto printf() and putc() debugging. I should point out that I am far from a power-gdb user, and am usually much more comfortable with printf() and putc(), so this is very much a beginners guide, written by a newbie. With those caveats in mind, lets get started.

So the first thing to do is to get our target up and running. For this our target will be a virtual device running with Skyeye. When you start up Skyeye and pass it the -d flag, e.g: $ skeye -c config.cfg -d. This will halt the virtual processor and provide an opportunity to attach the debugger. The debugger will be available on a UNIX socket. It defaults to port 12345. Of course a decent JTAG adapter should be able to give you the same type of thing with real hardware.

Now, you run GDB: $ arm-elf-gdb. Once gdb is running you need to attach to the target. To do this we use: (gdb) target remote :12345. Now you can start the code running with (gdb) continue.

Now, just running the code isn’t very useful, you can do that already. If you are debugging you probably want to step through the code. You do this with the step command. You can step through code line at-a-time, or instruction at-a-time. At the earliest stages you probably want to use the si command to step through instruction at-a-time.

To see what you code is doing you probably want to be able to display information. For low-level start up code, being able to inspect the register and memory state is import. You can look at the register using the info registers command, which prints out all the general-purpose registers as well as the program counter and status registers.

For examing memory the x command is invaluable. The examine command takes a memory address as an argument (actually, it can be a general expression that quates to a memory address). The command has some optional arguments. You can choose the number of units to display, the format to display memory in (hex (x), decimal (d), binary (t), character (c), instruction (i), string(s), etc), and also the unit size (byte (b), halfword (h), word (w)). So, for example to display the first five words in memory as hex we can do: (gdb) x /5x 0x0. If we want to see the values of individual bytes as decimal we could do: (gdb) x /20bd 0x0. Another common example is to display the next 5 instructions, which can be done with (gdb) x /5i $pc. The $pc expression returns the value in the pc register.

Poking at bits and bytes and stepping instruction at a time is great for low-level code, but gdb can end up being a lot more useful if it knows a little bit more about the source code you are debugging. If you have compiled the source code with the -g option, your ELF file should have the debugging information you need embedded in it. You can let gdb know about this file by using the (gdb) symbol program.elf. Now that you actually have symbols and debugging information, you can do things like normal then step command, and it will step through lines of source code (rather than instructions).

The other nice thing you have is that you can easily set breakpoint and watchpoints. (You don’t have to have source debugging enabled for this, but it makes things a lot easier!). Seting a breakpoint is easy, you can set it on a line e.g: (gdb) break file.c:37, or on a particular function e.g: (gdb) break schedule. Breakpoints are neat, but watchpoints are even cooler, since you can test for a specific conditions e.g: (gdb) watch mask 2000.

Now that you have these nice watchpoints and breakpoints, you probably find that most of the time, you just end up printing out some variables each time you hit the point. To avoid this repetitive typing you can use the display command. Each expression you install with the display command will be printed each time program execution stops (e.g: you hit a break-point or watch-point). This avoids a lot of tedious typing!

So, this is of course just scratching the surface. One final thing to consider that will likely make your time using gdb more useful and less painful (i.e: less repetitive typing), is the define command which lets you create simple little command scripts. The other is that when you start gdb you can pass a command script with the -x. So, you might want to consider, instead of littering your code with printf() statements everywhere you might want to write some gdb commands that enable a breakpoint and display some the relevant data.

Good luck, and happy debugging!


January 02, 2009 01:55 PM

Raster

Performance testing of several embedded systems

So I thought I'd do some tests since I got my new Beagleboard (thanks Thomas!). What is this? well you can go follow the link - but it's an OMAP3530 based devel board with pretty much nothing much on it other than the CPU. You need to plug in serial to get the console - there is no input or screen on it... but it does have DVI out and a usb OTG port (which so far has been utterly useless to me as my supposed USB-A to USB-Mini-A doesn't work for OTG :( ). Anyway. This little CPU (SoC - System on a Chip) is not too shabby. it's not even running at the 600Mhz rated that the 3530 can do (mind you that's overdrive - but I'd expect to use that normally as the other power case is 0 clock so while on and doing things - do them fast, otherwise sleep as much as you can). Also the RAM could be clocked up much higher (it's at 133Mhz on this board).

So that covers the hardware. Software wise I'm using expedite below. This is the Software-X11 engine. Output is 16bpp RGB565 (except the Desktop which is RGB32). The Desktop of course will have multiple threads rendering. It also gets MMX and SSE assembly. The ARM systems get the plain C routines. So thus far I have yet to put in any ASM for ARM in the software routines - but I intend to throw in some preload (pld) calls as well... but... what I really need to do is put in some NEON (OMAP3'x equivalent of MMX/SSE that actually doesn't suck). I have NOT done any of this - so the below numbers are RAW as-is. the PXA270 might have some room for optimising with Wireless MMX. The s3x2442 I think is pretty much not going to go anywhere as it's ARMv4, so a lost cause. So take the numbers for the PXA270 as "could be a little better". From what I have read of the 3530's NEON - it could get a fair bit better if not 50% or more on the FPS numbers below (yes the numbers are in frames per second, rendering to a QVGA sized window in all cases).


Treo650 Openmoko GTA02 BB (RevB) Desktop x86

PXA270 @ 312Mhz S3c2442 @ 400Mhz OMAP3530 @ 500Mhz Core2D @ 3Ghz

32M RAM 128M RAM 128M RAM 4096M RAM

FB X11 Glamo X11 FB X11 NV8600GT X11





Image Blend Unscaled 2.83 3.43 13.73 274.99
Image Blend Solid Unscaled 19.69 15.86 63.37 1054.19
Image Blend Nearest Scaled 1.97 2.11 6.41 144.23
Image Blend Nearest Solid Scaled 20.98 15.01 66.84 1171.3
Image Blend Smooth Scaled 0.65 0.47 1.33 26.28
Image Blend Smooth Solid Scaled 13.74 9.28 33.82 552.79
Image Blend Nearest Same Scaled 4.97 5.34 20.29 378.42
Image Blend Nearest Solid Same Scaled 26.56 21.91 82.72 1203.65
Image Blend Smooth Same Scaled 1.56 1.14 4.1 60.03
Image Blend Smooth Solid Same Scaled 14.78 10.65 38.1 436.62
Image Blend Border 0.78 0.56 1.49 33.1
Image Blend Solid Border 15.33 10.76 38.85 482.57
Image Blend Border Recolor 0.68 0.47 1.36 42.68
Image Quality Scale 21.14 12.93 47 1155.77
Image Data ARGB 27.41 17.72 89.24 1966.35
Image Data ARGB Alpha 15.22 2.55 24.96 341.48
Image Data YCbCr 601 Pointer List 16.92 13.1 49.99 1228.43
Image Data YCbCr 601 Pointer List Wide Stride 18.96 13.14 50.94 1107
Image Crossfade 21.75 15.78 66.57 1705.74
Text Basic 13.83 11.33 39.12 477.44
Text Styles 1.63 1.42 3.58 36.94
Text Styles Different Strings 2.12 1.75 4.79 49.94
Text Change 12.02 10.3 33.37 409.65
Textblock Basic 14.77 12.29 38.72 327.28
Textblock Intl 10.86 18.63 67.42 424.31
Rect Blend 2.52 2.35 8.25 411.9
Rect Solid 24.21 16.33 76.55 1220.57
Rect Blend Few 116.42 106.76 448.06 6609.17
Rect Solid Few 135.7 124.04 556.87 7809.12
Image Blend Occlude 1 Few 44.19 35.23 144.58 2749.2
Image Blend Occlude 2 Few 21.7 21.48 91.69 1743.54
Image Blend Occlude 3 Few 21.3 22.37 91.73 1839.74
Image Blend Occlude 1 14.78 12.18 50.21 760.61
Image Blend Occlude 2 10.39 9.58 39.77 680.6
Image Blend Occlude 3 5.71 6.2 25.11 457.63
Image Blend Occlude 1 Many 5.67 4.89 20.44 273.9
Image Blend Occlude 2 Many 4.71 4.52 18.42 270.16
Image Blend Occlude 3 Many 2.4 2.7 10.68 197.47
Image Blend Occlude 1 Very Many 0.5 0.51 2.43 20.84
Image Blend Occlude 2 Very Many 0.47 0.48 2.35 21.42
Image Blend Occlude 3 Very Many 0.38 0.44 1.92 25.46
Polygon Blend 4.51 4.14 14.81 293.42
AVERAGE 17.06 14.34 59.33 963.71


January 02, 2009 12:00 PM

Benno

The trouble with literal pools

Yesterday we saw how with some careful programming, and the right compiler flags we could get GCC to generate optimally small code for our particular function. In this post we take a look at one of the ways in which GCC fails to produce optimal code. Now there are actually many ways, but the I want to concentrate on in this post is the use of literal pools.

So, what is a literal pool? I’m glad you asked. The literal pool is an area of memory (in the text segment), which is used to store constants. These constants could be plain numerical constants, but their main use is to store the address of variables in the system. These addresses are needed because the ARM instruction does not have any instructions for directly loading (or storing) an address in memory. Instead ldr and str can only store at a ±12-bit offset from a register. Now there are lots of ways you could generate code with this restriction, for example, you could ensure your data section is less than 8KiB in size, and reserve a register to be used as a base for all data lookups. But such approach only works if you have a limited data section size. The standard approach that is taken is that when a variable is used its address is written out into a literal pool. The compiler then generates two instructions, first to read the address from this literal pool, and the second is the instruction to access the variable.

So, how exactly does this literal pool work? Well, so that a special register is not needed to point at the literal pool, the compiler uses the program counter (PC) register as the base register. The generated codes looks something like: ldr r3, [pc, #28]. That codes loads a value at a 28-byte offset from the current value of PC into the register r3. r3 then contains the address of the variable we want to access, and can be used like: ldr r1, [r3, #0], which loads the value of the variable (rather than the address) into r1. Now, as the PC is used as the base for the literal pool access, it should be clear that the literal pool is stored close enough to the code that needs to use it.

To ensure that the literal pool is close enough to the code using it, the compiler stores a literal pool at the end of each function. This approach works pretty well (unless you have a 4KiB+ function, which would be silly anyway), but can be a bit of a waste.

To illustrate the problem, consider this (contrived) example code:

static unsigned int value;

unsigned int
get_value(void)
{
    return value;
}

void
set_value(unsigned int x)
{
    value = x;
}

Now, while this example is contrived, the pattern involved exhibits itself in a lot of real-world code. You have some private data in a compilation unit (value), and then you have a set of accessor (get_value) and mutator (set_value) functions that operate on the private data. Usually the functions would be more complex than in our example, and usually there would be more than two. So lets have a look at the generated code:

00000000 <get_value>:
get_value():
   0:	4b01      	ldr	r3, [pc, #4]	(8 <get_value+0x8>)
   2:	6818      	ldr	r0, [r3, #0]
   4:	4770      	bx	lr
   6:	46c0      	nop			(mov r8, r8)
   8:	00000000 	.word	0x00000000
			8: R_ARM_ABS32	.bss

0000000c <set_value>:
set_value():
   c:	4b01      	ldr	r3, [pc, #4]	(14 <set_value+0x8>)
   e:	6018      	str	r0, [r3, #0]
  10:	4770      	bx	lr
  12:	46c0      	nop			(mov r8, r8)
  14:	00000000 	.word	0x00000000
			14: R_ARM_ABS32	.bss

You can see that each function has a literal pool (at address 0x8 and 0x14). You can also see that there is a relocation associated with each of these addresses (R_ARM_ABS32 .bss). This relocation means that at link time the address of value will be stored at locations 0x8 and 0x14. So, what is the big deal here? Well, there are two problems. First, we have two literal pools containing duplicate data, by storing the address of value twice, we are wasting 4 bytes (remember from yesterday, we have a very tight memory budget and we care where every byte goes). The second problem, is that we need to insert a nop in the code (at address 0x6 and 0x12), because the literal pool must be aligned.

So, how could the compiler be smarter? Well, if instead of generating a literal pool for each individual function it did it for the whole compilation unit, then instead of having lots of little literal pools with duplicated data through-out, we would have a single literal pool for the whole file. As a bonus, you would only need alignment once as well! Obviously if the compilation unit ends up being larger than 4KiB then you have a problem, but in this case you could still save up producing the literal pool until after 4KiB worth of code. As it turns out the commercial compiler from ARM, RVCT, does exactly this. So lets have a look at the code it generates:

00000000 <get_value>:
get_value():
   0:	4802      	ldr	r0, [pc, #8]	(c <set_value+0x6>)
   2:	6800      	ldr	r0, [r0, #0]
   4:	4770      	bx	lr

00000006 <set_value>:
set_value():
   6:	4901      	ldr	r1, [pc, #4]	(c <set_value+0x6>)
   8:	6008      	str	r0, [r1, #0]
   a:	4770      	bx	lr
   c:	00000000 	.word	0x00000000
			c: R_ARM_ABS32	.data$0

You see that the code is more or less the same, but there is just one literal pool right at the end of the file, and no extra nops are needed for alignment. Without merging literal pools we have a .text size of 24 bytes, with the merging we slash that down to 16 bytes.

So merging literal pools is pretty good, but the frustrating thing is that in this example, we don’t even need the literal pool!. If we examine the final compiled image for this program:

Disassembly of section ER_RO:

00008000 <get_value>:
get_value():
    8000:	4802      	ldr	r0, [pc, #8]	(800c <set_value+0x6>)
    8002:	6800      	ldr	r0, [r0, #0]
    8004:	4770      	bx	lr

00008006 <set_value>:
set_value():
    8006:	4901      	ldr	r1, [pc, #4]	(800c <set_value+0x6>)
    8008:	6008      	str	r0, [r1, #0]
    800a:	4770      	bx	lr
    800c:	00008010 	.word	0x00008010
Disassembly of section ER_RW:

00008010 <value>:
    8010:	00000000 	.word	0x00000000

You should notice that the actual location of the value variable is 0x8010. At address 0x800c we have the literal pool storing the address of a variable which is in the very next word! If we optimised this by hand, we would end up with something like (need to verify the offset):

Disassembly of section ER_RO:

00008000 <get_value>:
get_value():
    8000:	4802      	ldr	r0, [pc, #4]	(8008 <set_value+0x8>)
    8002:	4770      	bx	lr

00008004 <set_value>:
set_value():
    8004:	6008      	str	r0, [pc, #0]
    8006:	4770      	bx	lr
Disassembly of section ER_RW:

00008008 <value>:
    8008:	00000000 	.word	0x00000000

If we get rid of the literal pool entirely, we save the memory of the literal pool itself (4 bytes), plus the two instructions need to load values out of the literal pool (4 bytes). This cuts our text size down to a total of only 8 bytes! This is a factor 3 improvement over the GCC generated code. Granted, you are not always going to be able to perform this type of optimisation, but when you care about size, it is quite important. It would be nice if gcc supported a small data section concept so that you could specify variables that essentially resised within the literal pool instead of needing an expensive (in terms of space and time) indirection.

For this project, it looks like the code will have to be hand tweaked assembler, which is frustrating, because when you use a person as your compiler iterations become really expensive, and you want to make sure you get your design right up front.


January 02, 2009 11:43 AM

-----

January 01, 2009

Mary Gardiner

New Years' Encouragements

From RavenBlack:

New Year's Encouragements. Instead of making pressurey resolutions for yourself, make positive uplifting recommendations for other people. No negativity allowed, and try not to even imply something negative (eg. "eat better" implies you were eating poorly, but "make delicious home-cooked meals at least once a week" is pretty cleanly positive, and "make more delicious home-cooked meals because your cooking is great" is better still.)

Anyone with encouragements of this positive type may contact me via my preferred method or my LiveJournal, if you have access. (I am beginning, finally, to think about allowing comments on puzzling.org directly, but it's not likely to happen very soon.)

You are hereby invited to do this in your own weblog.


January 01, 2009 10:35 PM

Benno

Small code

A project that I’m working on at the moment calls for a very small footprint. This post is about how to make code really small for the ARM architecture.

As you can probably guess, I’m interested in operating system code, so as an example, I’ve taken a very simple piece of operating system code, and stripped it right back to demonstrate some of the techniques to use when optimising for space. So here is the snippet of code we are going to optimise:

01 struct thread {
02     unsigned int notify_bits;
03 };
04 
05 unsigned int
06 poll(struct thread *thread, unsigned int mask)
07 {
08     unsigned int result;
09     result = thread->notify_bits & mask;
10     if (result) {
11         thread->notify_bits &= ~result;
12     }
13     return result;
14 }

In this very simple operating system we have threads (thread data is stored in struct thread). Each thread has a set of 32 signals (encoded in a single word notify_bits). This poll is used by a thread to determine if it has been sent any signals. The mask parameter is the set of signals that the thread is interested in checking. So, a thread can check if a single signal has been thread, or if any signal has been set, or if a specific subset of signals has been set. The function returns the signals that are available (which is simply the bit-wise and of notify_bits and mask). It is important that the function clears any signals that have been returned. This makes sure that if poll is called twice the same signals are not returned. This is achieved in lines 10—12.

So, our goal is to try and get this code as small as possible. And every byte counts! First off, we just try and compile this with the standard ARM gcc compiler. I’m using version 4.2.2. So we start with: $ arm-elf-gcc -c poll.c -o poll.o. We can then use object dump to work out what the compiler did: $ arm-elf-objdump -dl poll.o.

00000000 <poll&ht:
poll():
   0:	e1a0c00d 	mov	ip, sp
   4:	e92dd800 	push	{fp, ip, lr, pc}
   8:	e24cb004 	sub	fp, ip, #4	; 0x4
   c:	e24dd00c 	sub	sp, sp, #12	; 0xc
  10:	e50b0014 	str	r0, [fp, #-20]
  14:	e50b1018 	str	r1, [fp, #-24]
  18:	e51b3014 	ldr	r3, [fp, #-20]
  1c:	e5932000 	ldr	r2, [r3]
  20:	e51b3018 	ldr	r3, [fp, #-24]
  24:	e0023003 	and	r3, r2, r3
  28:	e50b3010 	str	r3, [fp, #-16]
  2c:	e51b3010 	ldr	r3, [fp, #-16]
  30:	e3530000 	cmp	r3, #0	; 0x0
  34:	0a000006 	beq	54 <poll+0x54>
  38:	e51b3014 	ldr	r3, [fp, #-20]
  3c:	e5932000 	ldr	r2, [r3]
  40:	e51b3010 	ldr	r3, [fp, #-16]
  44:	e1e03003 	mvn	r3, r3
  48:	e0022003 	and	r2, r2, r3
  4c:	e51b3014 	ldr	r3, [fp, #-20]
  50:	e5832000 	str	r2, [r3]
  54:	e51b3010 	ldr	r3, [fp, #-16]
  58:	e1a00003 	mov	r0, r3
  5c:	e24bd00c 	sub	sp, fp, #12	; 0xc
  60:	e89da800 	ldm	sp, {fp, sp, pc}

So, our first go gets us 100 bytes of code. Which is 10 bytes (or 2.5 instructions) for each of our lines of code. We should be able to do better. Well, the first thing is we should try is to use optimisation: $ arm-elf-gcc -c poll.c -o poll.o -O2. This gives us a much better code generation output:

00000000 <poll>:
poll():
   0:	e5902000 	ldr	r2, [r0]
   4:	e1a0c000 	mov	ip, r0
   8:	e0110002 	ands	r0, r1, r2
   c:	11e03000 	mvnne	r3, r0
  10:	10033002 	andne	r3, r3, r2
  14:	158c3000 	strne	r3, [ip]
  18:	e12fff1e 	bx	lr

So this got us down to 28 bytes. A factor 4 improvement for one compiler flag, not bad. Now, -O2 does some standard omptimisations, but -Os, will do optimisations specifically for reducing the amount of code. So trying: $ arm-elf-gcc -c poll.c -o poll.o -Os. This gives a little bit better code-gen:

00000000 <poll>:
poll():
   0:	e5903000 	ldr	r3, [r0]
   4:	e1a02000 	mov	r2, r0
   8:	e0110003 	ands	r0, r1, r3
   c:	11c33000 	bicne	r3, r3, r0
  10:	15823000 	strne	r3, [r2]
  14:	e12fff1e 	bx	lr

Down to 24 bytes (6 instructions), is pretty good. Now, as you can see the generated code has 32-bits per instruction. The some of the ARM architectures have two distinct instruction sets, ARM and Thumb. The Thumb instruction set uses 16-bit per instruction, instead of 32-bit. This denser instruction set can enable much smaller code sizes. Of course there is a trade-off here. The functionality of the 16-bit instructions is going to be less than the 32-bit instructions. But lets give it a try. At the same time, we will tell the compiler the exact CPU we want to compile for (which is the ARM7TDMI-S) in our case. The compiler line is: $ arm-elf-gcc -c poll.c -o poll.o -Os -mcpu=arm7tdmi -mthumb. Which produces code like:

00000000 <poll>:
poll():
   0:	6803      	ldr	r3, [r0, #0]
   2:	1c02      	adds	r2, r0, #0
   4:	1c08      	adds	r0, r1, #0
   6:	4018      	ands	r0, r3
   8:	d001      	beq.n	e <poll+0xe>
   a:	4383      	bics	r3, r0
   c:	6013      	str	r3, [r2, #0]
   e:	4770      	bx	lr

So, now we are down to 16 bytes, so in Thumb we need 8 instructions (2 more than ARM), but each is only 2 bytes, not 4, so we end up with a 1/3 improvement. To get any further, we need to start looking at our code again, and see if there are ways of improving the code. Looking at the code again:

00 unsigned int
01 poll(struct thread *thread, unsigned int mask)
02 {
03     unsigned int result;
04     result = thread->notify_bits & mask;
05     if (result) {
06         thread->notify_bits &= ~result;
07     }
08     return result;
09 }

You may notice that the branch instruction on line 5, you may notice that this is actually redundant. If result is zero, then ~result well be 0xffffffff. Given this thread->notify_bits &= 0xffffffff will not change the value of thread->notify_bits. So, we can reduce this to:

00 unsigned int
01 poll(struct thread *thread, unsigned int mask)
02 {
03     unsigned int result;
04     result = thread->notify_bits & mask;
05     thread->notify_bits &= ~result;
06     return result;
07 }

When we compile this we get down to:

00000000 <poll>:
poll():
   0:	6803      	ldr	r3, [r0, #0]
   2:	4019      	ands	r1, r3
   4:	438b      	bics	r3, r1
   6:	6003      	str	r3, [r0, #0]
   8:	1c08      	adds	r0, r1, #0
   a:	4770      	bx	lr

This gets us down to 6 instructions (12 bytes). Pretty good since we started at 100 bytes. Now lets look at the object code in a little bit more detail. If you look at address 0x8, the instruction simply moves register r1 into register r0 so that it in the right place for return. (Note: The ARM ABI has the return value stored in r0). This seems like a bit of a waste, it would be good if there was a way we could have the value stored in r0 and not waste an instruction just moving values between registers. Now, to get a better understanding of what the instructions are doing, I’m going to slightly rewrite the code, and then compile with debugging, so we can see how the generated code matches up with the source code. So first, lets rewrite the code a little bit:

00 unsigned int
01 poll(struct thread *thread, unsigned int mask)
02 {
03     unsigned int tmp_notify_bits = thread->notify_bits;
04     mask &= tmp_notify_bits;
05     tmp_notify_bits &= ~mask;
06     thread->notify_bits = tmp_notify_bits;
07     return mask;
08 }

You should convince yourself that this code is equivalent to the existing code. (The code generated is identical). So, we can now line up the source code with the generated code. On line 03 (unsigned int tmp_notify_bits = thread->notify_bits), this matches up with address 0x0 (ldr r3, [r0, #0]). So, register r3 is used to store variable tmp_notify_bits. The parameter thread is stored in register r0. Now, line 04 (mask &= tmp_notify_bits) matches directly with address 0x2 (ands r1, r3). Register r1 matches directly with the mask parameter. The important part of restructuring the code as we have done, is that it becomes obvious that we can directly using the mask parameter instead of needing an extra variable like in the previous code. As we continue line 05 (tmp_notify_bits &= ~mask), matches directly to 0x4 (bics r3, r1). The bics instruction is quite neat in that it can essentially do the &= ~ in one 16-bit instruction. Line 06 (thread->notify_bits = tmp_notify_bits) stores the result back to memory, matches directly to 0x6. Now the final line of code (return mask;), needs two instructions 0x8 and 0xa (adds r0, r1, #0 and bx lr). Now the reason we need to instructions is because mask is stored in register r1, and the return value needs to be in register r0. So how can we get mask to be stored in r0 all along? Well, if we switch the parameters around poll(unsigned int mask, struct thread *thread), the mask will instead be stored in r0 instead of r1. (Note: We don’t necessarily have to change the interface of the function. If we want to support keep the interface we can use a simple macro to textually swap the parameters.) If we compile this, we get the following generated code:

00000000 <poll>:
poll():
   0:	680b      	ldr	r3, [r1, #0]
   2:	4018      	ands	r0, r3
   4:	4383      	bics	r3, r0
   6:	600b      	str	r3, [r1, #0]
   8:	4770      	bx	lr

So, we have got this down to 5 instructions, 10 bytes. This is a factor 10 improvement, not bad!

So a bit of a review:


January 01, 2009 09:05 PM

Chizuko Horman

Happy New Year 花火!


あけましておめでとうございます。

日本よりも2時間早く、年が明けました~晴れ

友人宅でのパーティーに参加してたんだけど、バルコニーからハーバーブリッヂ、大きな木を挟んで見えてたから、こりゃ花火も観れるわ~アップと、楽しみに呑み騒ぎながら待ってた。

そうそう、2008年て、いつもよりも1秒多かったんだって。知ってるかい?

カウントダウンは51秒からスタートして、60秒まで数えてその後0になったんだよ~。(通常は50秒からスタートして59秒の次が0だよね。)


地球の回転がちょっとだけだったから、調整するためだったようだけど、いままでもこんなことってあったのかねえ~。

そんで0になった瞬間、ドドーンと一斉に打ち上がった花火クラッカー
友人宅からはハーバーブリッヂ含めて3カ所分の花火が一気に見えた~音譜

体感したい方はコチラ ごらんあれ。

http://www.smh.com.au/photogallery/2009/01/01/1230681584600.html




January 01, 2009 03:35 AM

Russell Coker

Tidal River

Tim (a member of my local LUG) writes about some observations he has made of a nearby river and speculates on a tidal bore-like phenomenon [1]. One thing that surprised me was how short the list was on the Tidal Bore Wikipedia page [2], and the fact that is it missing ...


January 01, 2009 12:47 AM

Mary Gardiner

2009 plans

One year I'd like to do the same project Skud is doing for 2009, that is: a resolution a week. But this year is a finishing year for me, not a starting year.

A quick wrap-up on 2008 resolutions:

The first half of 2008 was pretty difficult for me. In retaliation, Andrew and I screamed around the south island of New Zealand in August. I do recommend its recuperative powers.

Major goals for 2009:

Other plans in 2009:

Also, given the PhD submission thing, I will probably be looking for a job towards the end of 2009. I do not know yet if I am going to apply for postdoctoral positions, this will probably depend on achieving a couple of major conference acceptances in 2009. And on deciding whether I want to live in the northern hemisphere. Even if I am I may be looking for programming or similar work in the short-term to wait out my examination process. So, keep in touch, if you want to offer me work, or come to my submission party. Or both.


January 01, 2009 12:34 AM

-----

December 31, 2008

Russell Coker

Links December 2008

A teacher in Arizona steals Linux CDs from a student and then accuses a Linux distributor of being a criminal [1]. Even though she had used Linux in the past she didn't believe that software was free. Of course that implies that in the past she had performed ...


December 31, 2008 12:47 PM

Chizuko Horman

New Years Eve 2008


今日は朝から大快晴で夏日和なシドニー晴れ


いよいよ今年最後の日

とはやっぱり感じにくいけど。。。

クリスマスのあとは、近所の友達の実家で庭でパラソルの下でランチを食べてる最中、シドニーの夏らしく通り雨がやってきた。パラソルは雨用のパラソルではなく、ただの日よけ用だったので、霧のような雨が食卓に降り注ぐ中、ウチ以外だれも気にせず庭でのランチを楽しんでいた。う~ん、何事に置いても日本での感覚からみてかなりニブイ、もといおおらかなオーストラリアの人々である。

ここ2、3日は、特に夏らしくなってきて、汗を垂れ流しながらアイロンをかけたりドライヤーで髪を乾かしてる尻から汗が噴き出してきて結局乾かないとか、夏を満喫キラキラしている。

クリスマス・ボクシングデーのあとは、普通にみんな31日まで仕事して、新年2日から仕事再開というスケジュールなので、私も仕事をしようかと試みるが、連日の飲み倒しによる二日酔い暑さと日本では休みだという気持ちから、全くやる気がおきない~。。。

そして今日はどんちゃん騒ぎなニューイヤーズイブ。シドニーシティでは、大半の道路が通行止めになり、夜のハーバー五カ所で開催される花火打ち上げキラキラに備えるよう。
メインはやっぱりハーバーブリッヂ。ハーバーブリッヂの上から打ち上げるブリッヂ型の花火シャワーは、今年も目玉になるのだろう。私も観に行きたかったが、人ごみキライやし、今晩は友達の家でパーティーなので、メインの花火は観ないと思う。その友達の家も、ハーバーから近いので、別な場所から打ち上げる花火は見えるかもしれない音譜

さて、今から今晩のパーティーに持参するしょっぱいパウンドケーキを作ろう~音譜

みなさま、来年もどうぞ宜しくお願いいたします~。




December 31, 2008 01:32 AM

-----

December 30, 2008

Dave Ruys

davereed: kristinmick: NASA Image of the Day - Pipsqueak Star...



davereed:

kristinmick:

NASA Image of the Day - Pipsqueak Star Unleashes Monster Flare


December 30, 2008 10:57 PM

Erik de Castro Lopo

Parsec and Expression Parsing.

Haskell's Parsec parsing library (distributed with the GHC compiler since at least version 6.8.2) contains a very powerful expression parsing module Text.ParserCombinators.Parsec.Expr. This module makes it easy to correctly parse arithmetic expressions like the following according to the usual programming language precedence rules.


  2 + a * 4 - b

Once parsed, the abstract syntax tree for that expression should look like this:


hackfest living room

The original Parsec paper by Daan Leijen even gives a small example of using the expression parser. As can be seen below, the operators are defined in a table (ie a list of lists) where the outer list is a set precedence levels (from highest precedence to lowest) and each inner list specifies all the operators for that precedence level.


  module Parser
  where
  
  import Text.ParserCombinators.Parsec
  import Text.ParserCombinators.Parsec.Expr
  
  expr :: Parser Integer
  expr = buildExpressionParser table factor <?> "expression"

  table :: [[ Operator Char st Integer ]]
  table = [
      [ op "*" (*) AssocLeft, op "/" div AssocLeft ],
      [ op "+" (+) AssocLeft, op "-" (-) AssocLeft ]
      ]
    where
      op s f assoc = Infix (do { string s ; return f }) assoc
  
  factor = do { char '(' ; x - expr ; char ')' ; return x }
     <|> number
     <?> "simple expression"
  
  number :: Parser Integer
  number = do { ds <- many1 digit; return (read ds) } <?> "number"

The above simple example works fine but there is a potential for a rather subtle problem if the expression parser is used in conjunction with the Text.ParserCombinators.Parsec.Token module.

The problem arises when trying to parse expressions in C-like languages which have bitwise OR (|) as well as logical OR (||) and where the bitwise operation has higher precedence than the logical. The symptom was that code containing the logical OR would fail because the expression parser was finding two bitwise ORs. After banging my head against this problem for a considerable time, I posted a problem description with a request for clues to the Haskell Cafe mailing list with the following code snippet:


  import qualified Text.ParserCombinators.Parsec.Expr as E

  opTable :: [[ E.Operator Char st Expression ]]
  opTable = [
      -- Operators listed from highest precedence to lowest precedence.

      {- snip, snip -}

      [    binaryOp "&" BinOpBinAnd E.AssocLeft ],
      [    binaryOp "^"  BinOpBinXor E.AssocLeft ],
      [    binaryOp "|"  BinOpBinOr E.AssocLeft ],

      [    binaryOp "&&" BinOpLogAnd E.AssocLeft ],
      [    binaryOp "||" BinOpLogOr E.AssocLeft ]
      ]

  binaryOp :: String -> (SourcePos -> a -> a -> a)
                     -> E.Assoc -> E.Operator Char st a
  binaryOp name con assoc =
      E.Infix (reservedOp name >>
          getPosition >>=
          return . con) assoc

Unfortunately, no real answer was forthcoming and while I still held out hope that an answer might eventuate, I continued to work on the problem.

Eventually, after reasoning about the problem and looking at the source code to the Token module I realised that the problem was with the behaviour of the reservedOp combinator. This combinator was simply matching the given string at the first precedence level it found so that even if the code contained a logical OR (||) the higher precedence bitwise OR would match leaving the second vertical bar character un-parsed.

My first attempt at a solution to this problem was to define my own combinator reservedOpNf using Parsec's notFollowedBy combinator and use that in place of the problematic reservedOp.


  opChar :: String
  opChar = "+-/%*=!<>|&^~"

  reservedOpNf :: String -> CharParser st ()
  reservedOpNf name = try (reservedOp name >> notFollowedBy (oneOf opChar))

This solved the immediate problem of distinguishing between bitwise and logical OR, but I soon ran into another problem parsing expressions like this:


  if (whatever == -1)
     .....

which were failing at the unary minus.

A bit of experimenting suggested that again, the problem was with the behaviour of the reservedOp combinator. In this case it seemed the combinator was matching the given string, consuming any trailing whitespace and then the notFollowedBy was failing on the unary minus.

Once the problem was understood, the solution was easy, replace the reservedOp combinator with the string combinator (which matches the string exactly and doesn't consume trailing whitespace), followed by the notFollowedBy and then finally use the whiteSpace combinator to chew up trailing white space.


    reservedOpNf :: String -> CharParser st ()
    reservedOpNf name =
        try (string name >> notFollowedBy (oneOf opChar) >> whiteSpace)

Despite minor problems like the one above, I am still incredibly impressed with the ease of use and power of Parsec. Over the years I have written numerous parsers, in multiple host languages, using a number of parser generators and I have never before used anything that comes close to matching Haskell and Parsec.


December 30, 2008 09:47 PM

-----

December 29, 2008

Erik de Castro Lopo

Hackfest in the House that Hack Bought.

Just before Newtonmas (also known as xmas to some) I had a small hackfest in the new house that my wife and I are calling "The House that Hack Bought". We had horms, AfC, Scott, raster, kfish, pmiller and myself hacking and chatting from about 10am.

There were 5 of us in the living room:


hackfest living room

while kfish and pmiller hacked in the sun room:


hackfest sun room

and Scott even managed to get a bit of hardware hacking done in the kitchen:


hackfest sun room

It was a great day. Thanks to all who participated.


December 29, 2008 03:47 PM

Mark Greenaway

Sharpening knives

Having used Paige's nice sharp Global knife, I knew going back to my not so sharp knife was going to be unbearable. I always want to be able to get the best out of what I have.

I've got a Japanese cook's knife, and a French style paring knife. You might ask why I chose such an odd combination. Of course, there's a simple explanation - I had no idea what I was doing when I bought those, I just wanted something better than what I was using.
My Mum bought me a diamond steel last year for Christmas, which I've been using to sharpen and hone both my knives. Anyone who knows anything about knives is by now recoiling in horror, because they'll know that Japanese and French knives are quite different, and you shouldn't be using French sharpening technique on a Japanese knife. French knives have much thicker blades, and are usually sharpened on an angle of 20 to 30 degrees. A stone is best, but a few quick swipes with a diamond steel will keep your French knife in good condition for months at a time until it needs sharpening on a stone or machine again.
Japanese knives, with their much thinner blades, need to be sharpened differently - the industrial diamond on a diamond steel is too coarse. They're meant to be sharpened so that the edge has an angle of 10 to 15 degrees. Everything I've read, and been told by people in knife shops, says that you should use a stone. I resisted buying a stone for a long long time, because I'm a miser and hate spending money when I think I've got something that should already do the job. My brother, who's been sharpening his knives on a stone for a long time, also told me that the technique is difficult to learn.
After I gave up on the steel and bought yet another tool for sharpening, I'd have to say the result was worth it, and the while I certainly haven't mastered the technique, it's not that hard to get started. There's a pretty good explanation here that I was able to follow without too much trouble.
You might wonder why the French and Japanese knives are so different. French cuisine is very different from Japanese, with the former having emphasis on complicated technique, while the latter seems to emphasise careful preparation and presentation of ingredients, often raw e.g. sushi. So one possible explanation is that the Japanese knives are designed for the more precise and intricate cutting required by that cuisine, while the French just want to cleave through the vegetables so they can get on with making the demi-glace.

Disclaimer: I am not a chef. I haven't mastered any cuisine. This is pure speculation. Get off my lawn.


December 29, 2008 08:36 AM

Guitars cost too much AND Chords

I went to Allans Music today hoping to get a cheap jazz guitar, or at least a cheaper jazz guitar. No such luck, even though some of the guitars were substantially reduced, they were still way out of my price range. Guitars are getting really expensive, although good guitars always have been. I think unless I'm willing to put down some serious cash, I'm going to be rocking the Strat for a few years yet. One of these days, I might just throw caution and good financial sense to the wind and buy the instrument I really want.
While I couldn't afford a new guitar on a student budget, I could afford to get a book. After having a look around, I found a nice book called the Chord Factory, from Berklee Press. Guitar players in jazz groups spend most of their time accompanying soloists, and the chords can get pretty advanced. Most of the chord books I've seen give you page after page of chord forms in every possible key, and then the implication is that you should memorise them, or maybe look them up as needed. I've never found this approach very useful - I'd much prefer to understand how the chords are built and how I can build voicings to suit my needs in a particular musical situation. I'd also really like to know how these chords are meant to be used. The closest I'd ever seen to what I was looking for was Chord Chemistry by Ted Greene, but that wasn't quite what I wanted either.
I worked out how to build voicings for major and minor chords with root notes on any string, got most of the way with dominant, minor and major seventh chords, and have no idea what to do with ninths and further extensions. I don't know how to voice those chords on guitar either, and didn't have much idea how to started. As the name suggests, the Chord Factory book is all about learning approaches to building chord voicings, rather than learning a couple of chord voicings by rote and hoping they're always going to be enough, no matter what your sadistic band leader puts on your music stand and expects you to play.
Just as important is that the book also begins to explain how to use these chords in a progression, which I've never been able to get any understanding of before. Finally ...


December 29, 2008 07:26 AM

Ted T'so

Debian, Philosophy, and People

Given the recent brouhaha in Debian, and General Resolution regarding Lenny’s Release policy as it relates to Firmware and Debian’s Social Contract, which has led to the resignation of Manoj Srivastava from the position of Secretary for the Debian Project, I’m reminded of the following passage from Gordon Dickson’s Tactics of Mistakes (part of Dickson’s Childe Cycle, in which he tells the story of the rise of the Dorsai):

“No,” said Cletus. “I’m trying to explain to you why I’d never make an Exotic. In your calmness in the face of possible torture and the need to kill yourself, you were showing a particular form of ruthlessness. It was ruthlessness toward yourself—but that’s only the back side of the coin. You Exotics are essentially ruthless toward all men, because you’re philosophers, and by and large, philosophers are ruthless people.”

“Cletus!” Mondar shook his head. “Do you realize what you’re saying?”

“Of course,” said Cletus, quietly. “And you realize it as well as I do. The immediate teaching of philosophers may be gentle, but the theory behind their teaching is without compunction—and that’s why so much bloodshed and misery has always attended the paths of their followers, who claim to live by those teachings. More blood’s been spilled by the militant adherents of prophets of change than by any other group of people down through the history of man.”

The conflict between idealism and pragmatism is a very old one in the Free and Open Source Software Movement. At one end of the spectrum stands Richard Stallman, who has never compromised on issues regarding his vision of Software Freedom. Standing at various distances from this idealistic pole are various members of the Open Source Community. For example, in the mid-1990’s, I used to give presentations about Linux using Microsoft Powerpoint. There were those in the audience that would give me grief about using a non-free program such as MS Powerpoint, but my response was that I saw no difference between driving a car which had non-free firmware and using a non-free slide presentation program. I would prefer to use free office suite, but at the time, nothing approached the usability of Powerpoint, and while dual-booting into Windows was a pain, I could do a better job using Powerpoint than other tools, and I refused to handcap myself just to salve the sensibilities of those who felt very strongly about Free Software and who viewed the use of all non-Free Software as an ultimate evil that must be stamped out at all costs.

It is the notion of Free Software as a philosophy, with no compromises, which has been the source of many of the disputes inside Debian. Consider, if you will, the first clause of the Debian Social Contract:

Debian will remain 100% free

We provide the guidelines that we use to determine if a work is free in the document entitled The Debian Free Software Guidelines. We promise that the Debian system and all its components will be free according to these guidelines. We will support people who create or use both free and non-free works on Debian. We will never make the system require the use of a non-free component.

This clause has in it no room for compromise. Note the use of words such as “100% free” and “never make the system require the use of a non-free component” (emphasis mine). In addition, the Debian Social Contract tends to be interpreted by Computer Programmers, who view such imperatives as constraints that must never be violated, under any circumstances.

Unfortunately, the real world is rarely so cut-and-dried. Even the most basic injunctions, such as “Thou shalt not kill” have exceptions. Few people might agree with claims made by the U.S. Republican Party that the war in Iraq qualified as a Just War as defined by Thomas Aquinas, but rather more people might agree that the July 20, 1944 plot to assassinate Hitler would be considered justifiable. And most people would probably agree most of the actions undertaken by the Allied Soldiers on World War II battlefields that involved killing other soldiers would be considered a valid exception to the moral (and for those in the Judeo-Christian tradition, biblical) injunction, “Thou shalt not kill“.

As another example, consider the novel and musical Les Misérables, by Victor Hugo. One of the key themes of this story is whether or not “Thou shalt not steal” is an absolute or not. Ultimately, the police inspector Javert, who lived his whole life asserting that law (untempered by mercy, or any other human considerations) was more important than all else, drowns himself in the Seine when he realizes that his life’s fundamental organizing principle was at odds with what was ultimately the Right Thing To Do.

So if even the sixth and eighth commandments admit to exceptions, why is it that some Debian developers approach the first clause of the Debian Social Contract with a take-no-prisoners, no-exceptions policy? Especially given the fourth clause of the Debian Social contract:

Our priorities are our users and free software

We will be guided by the needs of our users and the free software community. We will place their interests first in our priorities. We will support the needs of our users for operation in many different kinds of computing environments. We will not object to non-free works that are intended to be used on Debian systems, or attempt to charge a fee to people who create or use such works. We will allow others to create distributions containing both the Debian system and other works, without any fee from us. In furtherance of these goals, we will provide an integrated system of high-quality materials with no legal restrictions that would prevent such uses of the system.

This clause does not have the same sort of absolutist words as the first clause, so many Debian Developers have held that the “needs of the users” is defined by “100% free software”.   Others have not agreed with this interpretation — but regardless of how “needs of the users” should be interpreted, the fact of the matter is, injuctions such as “Thou shalt not kill” are just as absolute — and yet in the real world, we recognize that there are exceptions to such absolutes, apparently unyielding claims on our behavior.

I personally believe that “100% free software” is a wonderful aspirational goal, but in particular with regards to standards documents and firmware, there are other considerations that should be taken into account.   People of good will may disagree about what those exceptions should be, but I think one thing that we should consider as even higher priority and with a greater claim on how we behave is the needs of our users and fellow developers as people.   For those who claim Christianity as their religious tradition, Jesus once stated,

Thou shalt love the Lord thy God with all thy heart, and with all thy soul, and with all thy mind.  This is the first and great commandment.  And the second is like unto it: Thou shalt love thy neighbour as thyself.   On these two commandments hang all the law and the prophets.

Even for those who do not claim Christianity as their religious tradition, most moral and ethical frameworks have some variant on the Golden Rule: “Do unto others as you would have them do unto you”.  I would consider, for example, that the Golden Rule is at least a high priority claim on my behavior as the notion of free speech, and in many cases, it would be a higher priority claim.  The recent controversy surrounding Josselin Mouette was started precisely because Joss has taken a something which is a good thing, namely Free Speech, and relegated it to a principle more important than all else, and claiming that any restraint on such a notion was equivalent to censorship.

I think the same thing is true for free software, although it is a subtler trap.  Philosophical claims than “100% free software” as most important consideration is dangerously close to treating Free Software as the Object of Ultimate Concern — or in religious terms, idolotry.  For those who are religious, it’s clear why this is a bad thing; for those who aren’t — if you are unwilling to worship a supernatural being, you may want to very carefully consider whether you are willing to take a philosophical construct and raise it to a position of commanding your highest allegiance to all else, including how you treat other people.

Ultimately, I consider people to be more important than computers, hardware or software.  So over time, while I may have had some disagreements with how Mark Shuttleworth has run Canonical Software and Ubuntu (but hey, he’s the multimillionaire, and I’m not), I have to give him props for Ubuntu’s Code of Conduct.  If Debian Developer took the some kind of Code of Conduct at least as seriously as the Social Contract, I think interactions between Debian Developers would be far more efficient, and in the end the project would be far more successful.   This may, however, require lessening the importance of philosophical constructs such as Free Speech and Free Software, and perhaps becoming more pragmatic and more considerate towards one another.


December 29, 2008 12:59 AM

Erik de Castro Lopo

learnHaskell :: Problem -> IO Solution

Over the last week or so I've been learning Haskell. Unlike a previous attempt to learn Erlang I think this one will actually end up bearing fruit. The main thing that slowed down my attempt to learning Erlang was that I didn't have a task to apply it to that exploited the strengths of the language.

For Haskell I have a task that is ideally suited to the language and plays to its strengths; writing a parser for an Javascript-like language. Just like when I learned Ocaml I am jumping in at the deep end with a difficult problem and learning the language on the way (ie as a side-effect of tackling the task itself).

Normally my tool of choice for this task would be Ocaml and I did in fact start out writing a parser in Ocaml using the Menhir parser generator. Menhir creates a parser from an LR(1) grammar specification where the LR(1) means that the grammar is left recursive and must need no more than one token look-ahead to resolve ambiguities. Over two days of intense hacking I made quite a lot of progress on the grammar and then hit a wall; numerous rather incomprehensible shift/reduce and reduce/reduce warnings plus a part of the grammar (raw XML embedded directly into the source code) which was not context free and for which I could not see any possible way of parsing with only one token look-ahead.

Since I already knew Ocaml, I looked around at some of the other parser generators for Ocaml like Aurochs and PCL. Aurochs uses Parsing Expression Grammar (or Packrat parsing) and seemed interesting, but was rejected because it seemed to gave very little control over the Abstract Syntax Tree it generated. PCL on the other hand was a monadic parser combinator library which is basically a port of Haskell's Parsec library to Ocaml. The main problem I saw with PCL was that it used Monads and lazy evaluation, neither of which are very common in standard Ocaml code. PCL was also rather new and I wasn't sure if it was stable and mature enough for quite an advanced parsing task.

Of course I had