This page is a collection of news feeds from friends of mine and myself. Its not, as the name might imply, a planet of Horms, or for the most part even stuff written by me. For that, please go here.
Nearby Planets:
Planet SLUG,
Planet Linux Australia,
Planet Russell
Subscriptions:
Alan Robertson
Alexander Reeder
Anand Kumria
Andrew Cowie
Andy Fitzsimon
Benno
Chizuko Horman
Chris DiBona
Chris Yeoh
Craige McWhirter
Dave Miller
Dave Ruys
David Luyer
Erik de Castro Lopo
Horms
Hugh Blemmings
James Morris
Jan Schmidt
Jaq
Jeff Waugh
Jeremy Kerr
John Ferlito
Joseph Arruda
Kfish
Mark Greenaway
Martin Pool
Mary Gardiner
Nick Jenkins
Ozone
Pete Ryland
Peter Hardy
Peter Nixon
Pia Waugh
Raster
Real World Haskell
Robert Collins
Roberto Nibali
Roger Barnes
Russell Coker
Rusty
Sam Johnston
Silvia Pfeiffer
Ted T'so
Tong Master
Tractorgen - Commits
Wichert Akkerman
fusion94
![]()
I've been meaning for a while to write up what's happening with cpumasks in the kernel. Several people have asked, and it's not obvious so it's worth explaining in detail. Thanks to Oleg Nesterov for the latest reminder.
The two obvious problems are
For better or worse, people want NR_CPUS 4096 in stock kernels today, and that number is only going to go up.
Unfortunately, our merge-surge development model makes whack-a-mole the obvious thing to do, but the results (creeping in largely unnoticed) have been between awkward and horrible. Here's some samples across that spectrum:
#define cpu_set(cpu, dst) __cpu_set((cpu), &(dst))
#define cpumask_of_cpu(cpu)
(*({
typeof(_unused_cpumask_arg_) m;
if (sizeof(m) == sizeof(unsigned long)) {
m.bits[0] = 1UL(cpu);
} else {
cpus_clear(m);
cpu_set((cpu), m);
}
&m
}))
Ignoring that this code has a silly optimization and could be better
written, it's illegal since we hand the address of a
going-out-of-scope local var. This is the code which got me looking
at this mess to start with.
#define CPUMASK_ALLOC(m) struct m _m, *m = &_m
...
#define node_to_cpumask_ptr(v, node) \
cpumask_t _##v = node_to_cpumask(node); \
const cpumask_t *v = &_##v
#define node_to_cpumask_ptr_next(v, node) \
_##v = node_to_cpumask(node)
But eternal vigilance is required to ensure that someone doesn't add another cpumask to the stack, somewhere. This isn't going to happen.
These days we avoid Big Bang changes where possible. So we need to introduce a parallel cpumask API and convert everything across, then get rid of the old one.
for_each_cpu_mask(i, my_cpumask) ... if (i == NR_CPUS)That final test should be "(i >= nr_cpu_ids)" to be safe now:
for_each_cpu(i, &my_cpumask) ... if (i >= nr_cpu_ids)
At this point, we will have code that doesn't suck, rules which can be enforced by the compiler, and the possibility of setting CONFIG_NR_CPUS to 16384 as the SGI guys really want.
Importantly, we are forced to audit all kinds of code. As always, some few were buggy, but more were unnecessarily ugly. With less review happening these days before code goes in, it's important that we strive to leave code we touch neater than when we found it.
This is to announce a kernel security subsystem wiki, supported by the kind folk at kernel.org. It's intended for use by community developers and users of kernel security projects. (Note that it's not related to security response: the security contact for the kernel per the MAINTAINERS file is security @ kernel.org).
So far, there are sections on working with the security-testing git repo, a listing of various kernel security projects, and an events page.
If there's something you'd like to see or change on the wiki (particularly if it relates to your own project), create an account and make it so.
![]()
Who knew that OS X has a command line port scanner? I for one didn't. I've always just used the Network Utility version. Here's how to access it.
As the superuser run this:
ln /Applications/Utilities/Network\ Utility.app/Contents/Resources/stroke /bin/stroke
chmod uo+x /bin/stroke
Then as any user you can call it from the terminal as such:
stroke address startPort endPort
So to check your local machine do:
stroke 127.0.0.1 0 1024
You should see output that looks similar to this:
Port Scanning host: 127.0.0.1
Open TCP Port: 80 http
Open TCP Port: 631 ipp
Version 2.6.28 of the Linux kernel was released during Christmas, so I thought it'd be worthwhile waiting until after typical vacation days to post a summary of changes to the security subsystem. As always, thanks to the Kernel Newbies folk who track major kernel changes.
/smack/onlycap), or for all processes if the onlycap label is specified as '*'.
Alex Payne writes of keeping a private journal, separate from his blogs.
The primary and obvious difference between journaling and blogging is that a journal is private. A journal exists only for the author’s personal consumption, or possibly as a posthumous record of a life. Without the modest audience my blog has accrued, I have no incentive to filter what I write for content or style. I can be as dry or as flowery as I like and nobody need suffer.
I'm heading to South America later this month, which should be great. It's on my dopplr page, which looks like an interesting little service.
On coding for fun -- stay fresh by living a rounded life, and doing small fun technical projects. You know it makes sense.

Duly noted.
[via.]
All Australians, please also take note.
You’re not actually watching the “footy”, you are in fact watching the “throwy”, or the “barge into peopley”, or the “tedious outdoor pub-punch-on….y”
![]()
For a little personal project I’ve been working on recenly, I needed to create some unit tests for a CGI script, sorry, that should be, a cloud computing application. Of course, I turned to my trusty hammer, Python, and was a little disappointed to discover it didn’t quite have the batteries I needed included.
I kind of thought that urllib2 would more or less do what I wanted out of the box, but unfortunately it didn’t and (shock, horror!) I actually needed to write some code myself! The first problem I ran into is that urllib2 only supports GET and POST out of the box. HTTP is constrained enough in the verbs it provides, so I really do want access to things like PUT, DELETE and HEAD.
The other problem I ran into, is that I did’t want things automatically redirecting (although clearly this would be the normal use-case), because I wanted to check I got a redirect in certain cases.
The final problem that I had is that only status code 200 is treated as success, and other 2xx codes raise exceptions. This is generally not what you want, since 201, is a perfectly valid return code, indicating that new resource was created.
So, urllib2 is billed as An extensible library for opening URLs
using a variety of protocols
, surely I can just extend it to do
what I want? Well, it turns out that I can, but it seemed to be harder
than I was expecting. Not because the final code I needed to write was
difficult or involved, but because it was quite difficult to work out
what the right code to write is. I want to explore for a little bit
why (I think) this might be the case.
urllib2 is quite nice, because simple things (fetch a URL, follow redirects, POST data), are very easy to do:
ret = urllib2.urlopen("http://some_url/")
data = ret.read()
And, it is definitely possible to do more complex things, but (at least for me), there is a sharp discontinuity in the API which means that learning the easy API doesn’t help you learn the more complex API, and also the documentation (at least as far as I read it), doesn’t make it apparent that there are kind of two modes of operation.
The completely frustrating thing is that the documentation in the source file is much better than the online documentation! Since it talks about some of the things that happen in the module, which are otherwise “magic”.
For example, the build_opener function is pretty
magical, since it does a bunch of introspection, and ends up either
adding a handler or replacing a handler depending on
the class. This is explained in the code as: if one of the
argument is a subclass of the default handler, the argument will be
installed instead of the default.
, which to me makes a lot of
sense, where-as the online documentation describes it as:
Instances of the following classes will be in front of the handlers,
unless the handlers contain them, instances of them or subclasses of
them: ....<list of default handlers>
. For me the former is
much clearer than the latter!
Anyway, here is the code I ended up with:
opener = urllib2.OpenerDirector()
opener.add_handler(urllib2.HTTPHandler())
def restopen(url, method=None, headers=None, data=None):
if headers is None:
headers = {}
if method is None:
if data is None:
method = "GET"
else:
method = "POST"
return opener.open(HttpRestRequest(url, method=method,
headers=headers, data=data))
So, conclusions, if the dos don’t make sense, read the code. If you are writing an API, try to make it easier to get from the easy case to the more complex case. (This is quite difficult to do, and I have definitely been guilty in the past of falling into the same trap in APIs I have provided!). If you can’t get the API to easily transition from easy to hard, make sure you document it well. Finally, Python is great language for accessing services over HTTP, even if it does require a little bit of work to get the framework in place.
Quite often, you may have simple setter methods that need to do a just a tiny bit of work before or after setting an instance variable. For example, maybe you need to redraw something after setting the property of an object. So, instead of writing this:
[self setBackgroundColor:[NSColor blueColor]];
[view setBackgroundColor:[NSColor blueColor]];
You’d probably want to move the relevant code to your -setBackgroundColor: accessor instead:
- (void)setBackgroundColor:(NSColor*)color
{
// Assuming that _backgroundColor is the ivar you want to set
if(_backgroundColor != color)
{
[_backgroundColor release];
_backgroundColor = [color retain];
// Update the view's background color to reflect the change
[view setBackgroundColor:_backgroundColor];
}
}
Then you can simply call -setBackgroundColor: and expect it all to work nicely:
// -setBackgroundColor: updates the view's background color
// automatically now
[self setBackgroundColor:[NSColor blueColor]];
(You could use Key-Value Observing to do this, but I generally avoid KVO for simple intra-class property dependencies like this. I don’t think the overhead of maintaining all the KVC dependencies and KVO-related methods is worth the cost.)
Of course, the above method requires that you write all that stupid boilerplate memory management code in the accessor. Instead of doing that, I tend to declare a private _backgroundColor property in the class, @synthesize a method for the private property, and then use the private property’s generated accessors instead:
@interface MyClass ()
// Declare a _private_ _backgroundColor property (thus the underscore
// in front, and why it's declared in a class continuation rather than
// in the public header)
@property (copy, setter=_setBackgroundColor:) NSColor* _backgroundColor;
@end
//
@implementation MyClass
@synthesize _backgroundColor;
- (NSColor*)backgroundColor
{
return [self _backgroundColor];
}
- (void)setBackgroundColor:(NSColor*)color
{
// Use the private property to set the background colour, so it
// handles the memory management bollocks
[self _setBackgroundColor:color];
[view setBackgroundColor:[self _backgroundColor]];
}
...
@end
With that technique, it’s possible to completely directly setting ivars, and thus avoid -retain and -release altogether. (You’ll still need to use -autorelease at various times, of course, but that’s reasonably rare.) We have some source code files that are well over 2000 lines of code without a single explicit [_ivar retain]; or [_ivar release]; call thanks to this technique. (Yeah, 2000 lines is also large and the class needs refactoring, but that’s another story.)
Of course, you could just use garbage collection which avoids 99% of the need for this bollocks:
- (void)setBackgroundColor:(NSColor*)color
{
// Yay GC!
self->_backgroundColor = color;
[view setBackgroundColor:self->_backgroundColor];
}
But plenty of us don’t have that luxury yet. (iPhone, ahem.)
For the UNIX users out there who use the git revision control system with the oldskool less pager, try adding the following to your ~/.gitconfig file:
[core]
# search for core.pager in
# <http://www.kernel.org/pub/software/scm/git/docs/git-config.html>
# to see why we use this convoluted syntax
pager = less -$LESS -SFRX -SR +'/^---'
That’ll launch less with three options set:
-S: chops long lines rather than folding them (personal preference),-R: permits ANSI colour escape sequences so that git’s diff colouring still works, and+'/^---': sets the default search regex to ^--- (find --- at the beginning of the line), so that you can easily skip to the next file in your pager with the n key.The last one’s the handy tip. I browse commits using git diff before committing them, and like being able to jump quickly back and forth between files. Alas, since less is a dumb pager and doesn’t understand the semantics of diff patches, we simply set the find regex to ^---, which does what we want.
Of course, feel free to change the options to your heart’s content. See the less(1) manpage for the gory details.
As the comment in the configuration file says, you’ll need to use the convoluted less -$LESS -SFRX prefix due to interesting git behaviour with the LESS environment variable:
Note that git sets the LESS environment variable to
FRSXif it is unset when it runs the pager. One can change these settings by setting theLESSvariable to some other value. Alternately, these settings can be overridden on a project or global basis by setting thecore.pageroption. Settingcore.pagerhas no affect on theLESSenvironment variable behaviour above, so if you want to override git’s default settings this way, you need to be explicit. For example, to disable theSoption in a backward compatible manner, setcore.pagerto"less -+$LESS -FRX". This will be passed to the shell by git, which will translate the final command to"LESS=FRSX less -+FRSX -FRX".
(And sure, I could switch to using a different pager, but I’ve been using less for more than a decade. Yep, I know all about Emacs & Vim’s diff-mode and Changes.app. It’s hard to break old habits.)
My vacation scholarship started today. I'm sitting in one of the honours offices slowly summarising from the textbook I've been assigned to understand. It's the sequel to the analysis textbook I used in second year, so I'm familiar with the author's writing style and standard methods of proof. I loved that course not only from the front and from behind, but from every sequence converging to it in R3, so I'm enjoying it so far. For those that way inclined, I'm looking at derivatives of measures and absolute continuity.
The office you occupy during a vacation scholarship is usually the one you stay in for your honours year, so I'll spend some time making the place comfortable. It seems like a good place to work.
I decided that this year would be no alcohol for me, and no coffee on weekdays, in an effort to give my body some detox and to cut back on the disgraceful amounts of junk food that I eat when I drink either type of beverage. I thought 12 months should be a decent stint. People keep asking whether moderation would be a better approach, but frankly I suck at it. Also, I find it a lot easier to work towards moderation from a point of total abstinence, than to try to work backwards from a point of over-indulgence (Silly Season….yeesh!).
You’d better thank me for this, heart….
![]()
I’ve been setting up my new Android phone and found out that the password handling code in the IMAP client doesn’t work so well with my Dovecot IMAP server.
Now, I really can’t be bothered working out which side is doing it wrong, but Dovecot expects your password to be contained in double-quotes if it contains special character. (No, I don’t precisely know what those characters are!). And, of course, if you are double-quoing the string, you then need to escape and double-quotes in the password itself. Now, like I said, no idea of this is the correct (according to the standard) behaviour, but it is the behaviour that I have to deal with, since I can’t mess with the server. Of course, the Android IMAP client doesn’t do this escaping, and so you get an error message indicating your username / password is incorrect. Frustratingly, the error message doesn’t pass on the information from the server giving detail on exactly what the problem is so you are left guessing.
Anyway, it turns out if you manually escape the password that you give to the Android email client things work fine. Of course, the SMTP server doesn’t need this escaping, and fails if you do have it, so you need to type in a different, unescaped, password for SMTP. Fun and games!
Looking at the lastest source code for Android, it looks like this has been properly fixed, so hopefully and upgrade to the Cupcake branch in the near future will solve the problem.
I spent the last few days doing some nice research for Vquence, where I was able to watch lots of videos on YouTube. Fun job this is!
The full article is on the Vquence metrics blog.
One of the key things that I’ve put together is a list of top 10 commercials for 2008:
| Rank | Video | Views | Added |
| 1 | Cadbury - Gorilla | 3,338,011 | August 31, 2007 |
| 2 | Nike - Take it to the NEXT LEVEL | 3,184,329 | April 28, 2008 |
| 3 | Macbook Air | 2,648,717 | January 15, 2008 |
| 4 | Centraal Beheer Insurance - Gay Adam | 2,512,425 | May 30, 2008 |
| 5 | Vodafone - Beatbox | 2,380,237 | March 17, 2008 |
| 6 | E*Trade - Trading Baby | 2,061,818 | February 01, 2008 |
| 7 | Guitar Hero - Heidi Klum | 1,068,055 | November 03, 2008 |
| 8 | Bridgestone - Scream | 980,406 | January 30, 2008 |
| 9 | Bud Light- Will Ferrell | 966,177 | February 04, 2008 |
| 10 | Careerbuilder - Follow your heart | 605,465 | February 03, 2008 |
| Favorable mention | OLPC - John Lennon | 527,953 | December 25, 2008 |
| Favorable mention | Blendtec - iPhone 3G | 2,711,195 | July 11, 2008 |
| Favorable mention | Stide Gum - Where the hell is Matt? | 15,859,204 | June 20, 2008 |
Enjoy! And let me know in the comments if you know of any other video ad released in 2008 in the same ballpark number of views that is an actual tv-style commercial.
クリスマス~お正月のお祭りウィークがようやく終わり、
明日から、Milduraというところへ。
シドニーから車で10時間西へ行ったところ。
マリーリバーのほとりに住む、義母の妹さんのところへ行くのです。
道中はファーム


あり、砂漠
ありのなかなかワクワクな旅になりそう
水をたっぷり用意したし、食料も用意したし、ウチのプランツはカマキリのドロシーごと預けたし、
まあまあ準備万端かな。
それではいってきます
![]()
Via slashdot.org (yes really, I still pull in the headlines, although the miracle of feed readers has allowed me to confirm that yes, Ars Technica is a better read), a site called Journal Space, which hosted weblogs, lost all their data. They only had a RAID setup as backup, that is, a system that mirrors content between two disks and is designed to protect against disk failure. If you've heard of RAID, you hopefully already know that it is not the same as a backup: if software error or an accident or a malicious act deletes data from one disk, the RAID setup faithfully mirrors it to the other disk. If not, imagine that you have two magical whiteboards. One is copied exactly to the other. If one magical whiteboard totally breaks down, excellent, you have a full copy of your meeting notes and doodles on the other. (Note for accuracy, not all RAID configurations produce a full mirror and sometimes the mirror is spread over more than one spare disk. But you get the idea.) However, if someone rubs something off the whiteboard, or falls over while holding a can of solvent and splashes it on the first whiteboard, everything on it is immediately deleted from the other.
Instead, for home machines you want, most likely, an incremental backup, that is, a separate disk/machine with several copies of your data going back in time. Your data as it was an hour ago. Your data as it was a day ago. Your data as it was a month ago. And so on. I have snapshots of my data for every three hours over the last two months. (Sensible backup programs will notice when data is the same across two or more time periods and only store it once, so your backup disk does not need to be so very much larger than your normal disk.)
For business systems you want both: the quick recovery from disk failure that mirroring systems such as RAID offer, and incremental backups. (I don't maintain business grade systems, ask someone else for best practices if you need them. Internally consistent database backups are something you want to pay particular attention to.)
I note this because in November I gave a talk on home backups for Linux at SLUG and there is one other point of interest: do not trust third party providers to have good backups. It is getting increasingly common to have a lot of your most interesting data on someone else's servers: your email on Google's, your blog over at wordpress.com, contact details for all your friends on Facebook, and so on. But your provider can make both their own catastrophically bad decisions, like Journal Space, and have their creditors suddenly sell their hard disks off in a fire sale, as happened to Digital Railroad.
Which is a big problem, because a lot of third party providers do not provide an easy way to get your data ('easy' would be both a documented API accessible from common programming languages and an installable application), and lots don't provide any way at all. (There's also a whole batch of interesting issues to do with your comments or Wall postings or whatever: you don't necessarily have the right to reproduce them and there would be privacy implications when allowing you to back them up and reproduce them on some other side. LiveJournal, for one, solves this problem by not allowing easy backups of comments left on your journal.)
If your email host, blog host, calendar host, documents host or social networking host failed or deleted your account, how would you fare?
![]()
This post is basically a crash course in using gdb to run and debug
low-level code. Learning to us a debugger effectively can
be much more powerful than ghetto printf() and putc()
debugging. I should point out that I am far from a power-gdb user, and
am usually much more comfortable with printf() and putc(),
so this is very much a beginners guide, written by a newbie. With those
caveats in mind, lets get started.
So the first thing to do is to get our target up and running. For
this our target will be a virtual device running with Skyeye. When you start up
Skyeye and pass it the -d flag, e.g: $ skeye -c
config.cfg -d. This will halt the virtual processor and provide
an opportunity to attach the debugger. The debugger will be available
on a UNIX socket. It defaults to port 12345. Of course
a decent JTAG adapter should be able to give you the same type of
thing with real hardware.
Now, you run GDB: $ arm-elf-gdb. Once gdb is running
you need to attach to the target. To do this we use:
(gdb) target remote :12345. Now you can start the
code running with (gdb) continue.
Now, just running the code isn’t very useful, you can do that already.
If you are debugging you probably want to step through the code.
You do this with the step command. You can step through
code line at-a-time, or instruction at-a-time. At the earliest stages
you probably want to use the si command to step through
instruction at-a-time.
To see what you code is doing you probably want to be able to display
information. For low-level start up code, being able to inspect the register
and memory state is import. You can look at the register using the
info registers command, which prints out all the general-purpose
registers as well as the program counter and status registers.
For examing memory the x command is invaluable. The
examine command takes a memory address as an argument (actually, it
can be a general expression that quates to a memory address). The
command has some optional arguments. You can choose the number of
units to display, the format to display memory in (hex (x), decimal
(d), binary (t), character (c), instruction (i), string(s), etc), and
also the unit size (byte (b), halfword (h), word (w)). So, for
example to display the first five words in memory as hex we can do:
(gdb) x /5x 0x0. If we want to see the values of
individual bytes as decimal we could do: (gdb) x /20bd 0x0.
Another common example is to display the next 5 instructions, which can
be done with (gdb) x /5i $pc. The $pc expression
returns the value in the pc register.
Poking at bits and bytes and stepping instruction at a time is
great for low-level code, but gdb can end up being a lot more useful
if it knows a little bit more about the source code you are
debugging. If you have compiled the source code with the
-g option, your ELF file should have the debugging
information you need embedded in it. You can let gdb know about this
file by using the (gdb) symbol program.elf. Now
that you actually have symbols and debugging information, you can
do things like normal then step command, and it will
step through lines of source code (rather than instructions).
The other nice thing you have is that you can easily set
breakpoint and watchpoints. (You don’t have to have
source debugging enabled for this, but it makes things a lot easier!).
Seting a breakpoint is easy, you can set it on a line e.g: (gdb) break file.c:37,
or on a particular function e.g: (gdb) break schedule.
Breakpoints are neat, but watchpoints are even cooler, since you can
test for a specific conditions e.g: (gdb) watch mask 2000.
Now that you have these nice watchpoints and breakpoints, you
probably find that most of the time, you just end up printing out some
variables each time you hit the point. To avoid this repetitive typing
you can use the display command. Each expression you
install with the display command will be printed each time program execution
stops (e.g: you hit a break-point or watch-point). This avoids a lot
of tedious typing!
So, this is of course just scratching the surface. One final thing
to consider that will likely make your time using gdb more useful and
less painful (i.e: less repetitive typing), is the define
command which lets you create simple little command scripts. The other
is that when you start gdb you can pass a command script with the
-x. So, you might want to consider, instead of littering
your code with printf() statements everywhere you might want to write
some gdb commands that enable a breakpoint and display some the relevant
data.
Good luck, and happy debugging!
So I thought I'd do some tests since I got my new Beagleboard (thanks Thomas!). What is this? well you can go follow the link - but it's an OMAP3530 based devel board with pretty much nothing much on it other than the CPU. You need to plug in serial to get the console - there is no input or screen on it... but it does have DVI out and a usb OTG port (which so far has been utterly useless to me as my supposed USB-A to USB-Mini-A doesn't work for OTG :( ). Anyway. This little CPU (SoC - System on a Chip) is not too shabby. it's not even running at the 600Mhz rated that the 3530 can do (mind you that's overdrive - but I'd expect to use that normally as the other power case is 0 clock so while on and doing things - do them fast, otherwise sleep as much as you can). Also the RAM could be clocked up much higher (it's at 133Mhz on this board).
So that covers the hardware. Software wise I'm using expedite below. This is the Software-X11 engine. Output is 16bpp RGB565 (except the Desktop which is RGB32). The Desktop of course will have multiple threads rendering. It also gets MMX and SSE assembly. The ARM systems get the plain C routines. So thus far I have yet to put in any ASM for ARM in the software routines - but I intend to throw in some preload (pld) calls as well... but... what I really need to do is put in some NEON (OMAP3'x equivalent of MMX/SSE that actually doesn't suck). I have NOT done any of this - so the below numbers are RAW as-is. the PXA270 might have some room for optimising with Wireless MMX. The s3x2442 I think is pretty much not going to go anywhere as it's ARMv4, so a lost cause. So take the numbers for the PXA270 as "could be a little better". From what I have read of the 3530's NEON - it could get a fair bit better if not 50% or more on the FPS numbers below (yes the numbers are in frames per second, rendering to a QVGA sized window in all cases).
| Treo650 | Openmoko GTA02 | BB (RevB) | Desktop x86 | |
| PXA270 @ 312Mhz | S3c2442 @ 400Mhz | OMAP3530 @ 500Mhz | Core2D @ 3Ghz | |
| 32M RAM | 128M RAM | 128M RAM | 4096M RAM | |
| FB X11 | Glamo X11 | FB X11 | NV8600GT X11 | |
| Image Blend Unscaled | 2.83 | 3.43 | 13.73 | 274.99 |
| Image Blend Solid Unscaled | 19.69 | 15.86 | 63.37 | 1054.19 |
| Image Blend Nearest Scaled | 1.97 | 2.11 | 6.41 | 144.23 |
| Image Blend Nearest Solid Scaled | 20.98 | 15.01 | 66.84 | 1171.3 |
| Image Blend Smooth Scaled | 0.65 | 0.47 | 1.33 | 26.28 |
| Image Blend Smooth Solid Scaled | 13.74 | 9.28 | 33.82 | 552.79 |
| Image Blend Nearest Same Scaled | 4.97 | 5.34 | 20.29 | 378.42 |
| Image Blend Nearest Solid Same Scaled | 26.56 | 21.91 | 82.72 | 1203.65 |
| Image Blend Smooth Same Scaled | 1.56 | 1.14 | 4.1 | 60.03 |
| Image Blend Smooth Solid Same Scaled | 14.78 | 10.65 | 38.1 | 436.62 |
| Image Blend Border | 0.78 | 0.56 | 1.49 | 33.1 |
| Image Blend Solid Border | 15.33 | 10.76 | 38.85 | 482.57 |
| Image Blend Border Recolor | 0.68 | 0.47 | 1.36 | 42.68 |
| Image Quality Scale | 21.14 | 12.93 | 47 | 1155.77 |
| Image Data ARGB | 27.41 | 17.72 | 89.24 | 1966.35 |
| Image Data ARGB Alpha | 15.22 | 2.55 | 24.96 | 341.48 |
| Image Data YCbCr 601 Pointer List | 16.92 | 13.1 | 49.99 | 1228.43 |
| Image Data YCbCr 601 Pointer List Wide Stride | 18.96 | 13.14 | 50.94 | 1107 |
| Image Crossfade | 21.75 | 15.78 | 66.57 | 1705.74 |
| Text Basic | 13.83 | 11.33 | 39.12 | 477.44 |
| Text Styles | 1.63 | 1.42 | 3.58 | 36.94 |
| Text Styles Different Strings | 2.12 | 1.75 | 4.79 | 49.94 |
| Text Change | 12.02 | 10.3 | 33.37 | 409.65 |
| Textblock Basic | 14.77 | 12.29 | 38.72 | 327.28 |
| Textblock Intl | 10.86 | 18.63 | 67.42 | 424.31 |
| Rect Blend | 2.52 | 2.35 | 8.25 | 411.9 |
| Rect Solid | 24.21 | 16.33 | 76.55 | 1220.57 |
| Rect Blend Few | 116.42 | 106.76 | 448.06 | 6609.17 |
| Rect Solid Few | 135.7 | 124.04 | 556.87 | 7809.12 |
| Image Blend Occlude 1 Few | 44.19 | 35.23 | 144.58 | 2749.2 |
| Image Blend Occlude 2 Few | 21.7 | 21.48 | 91.69 | 1743.54 |
| Image Blend Occlude 3 Few | 21.3 | 22.37 | 91.73 | 1839.74 |
| Image Blend Occlude 1 | 14.78 | 12.18 | 50.21 | 760.61 |
| Image Blend Occlude 2 | 10.39 | 9.58 | 39.77 | 680.6 |
| Image Blend Occlude 3 | 5.71 | 6.2 | 25.11 | 457.63 |
| Image Blend Occlude 1 Many | 5.67 | 4.89 | 20.44 | 273.9 |
| Image Blend Occlude 2 Many | 4.71 | 4.52 | 18.42 | 270.16 |
| Image Blend Occlude 3 Many | 2.4 | 2.7 | 10.68 | 197.47 |
| Image Blend Occlude 1 Very Many | 0.5 | 0.51 | 2.43 | 20.84 |
| Image Blend Occlude 2 Very Many | 0.47 | 0.48 | 2.35 | 21.42 |
| Image Blend Occlude 3 Very Many | 0.38 | 0.44 | 1.92 | 25.46 |
| Polygon Blend | 4.51 | 4.14 | 14.81 | 293.42 |
| AVERAGE | 17.06 | 14.34 | 59.33 | 963.71 |
Yesterday we saw how with some careful programming, and the right compiler flags we could get GCC to generate optimally small code for our particular function. In this post we take a look at one of the ways in which GCC fails to produce optimal code. Now there are actually many ways, but the I want to concentrate on in this post is the use of literal pools.
So, what is a literal pool? I’m glad you asked. The
literal pool is an area of memory (in the text segment), which is used
to store constants. These constants could be plain numerical
constants, but their main use is to store the address of variables
in the system. These addresses are needed because the ARM instruction
does not have any instructions for directly loading (or storing) an
address in memory. Instead ldr and str can only
store at a ±12-bit offset from a register. Now there are lots of ways
you could generate code with this restriction, for example, you could ensure
your data section is less than 8KiB in size, and reserve a register
to be used as a base for all data lookups. But such approach only works
if you have a limited data section size. The standard approach that is
taken is that when a variable is used its address is written out
into a literal pool. The compiler then generates two instructions, first
to read the address from this literal pool, and the second is the
instruction to access the variable.
So, how exactly does this literal pool work? Well, so that a
special register is not needed to point at the literal pool, the
compiler uses the program counter (PC) register as the
base register. The generated codes looks something like: ldr r3,
[pc, #28]. That codes loads a value at a 28-byte offset from
the current value of PC into the register
r3. r3 then contains the address of the
variable we want to access, and can be used like: ldr r1, [r3,
#0], which loads the value of the variable (rather than the
address) into r1. Now, as the PC is used as
the base for the literal pool access, it should be clear that the
literal pool is stored close enough to the code that needs to use it.
To ensure that the literal pool is close enough to the code using it, the compiler stores a literal pool at the end of each function. This approach works pretty well (unless you have a 4KiB+ function, which would be silly anyway), but can be a bit of a waste.
To illustrate the problem, consider this (contrived) example code:
static unsigned int value;
unsigned int
get_value(void)
{
return value;
}
void
set_value(unsigned int x)
{
value = x;
}
Now, while this example is contrived, the pattern involved exhibits
itself in a lot of real-world code. You have some private data in a
compilation unit (value), and then you have a set of
accessor (get_value) and mutator (set_value)
functions that operate on the private data. Usually the functions would
be more complex than in our example, and usually there would be more
than two. So lets have a look at the generated code:
00000000 <get_value>: get_value(): 0: 4b01 ldr r3, [pc, #4] (8 <get_value+0x8>) 2: 6818 ldr r0, [r3, #0] 4: 4770 bx lr 6: 46c0 nop (mov r8, r8) 8: 00000000 .word 0x00000000 8: R_ARM_ABS32 .bss 0000000c <set_value>: set_value(): c: 4b01 ldr r3, [pc, #4] (14 <set_value+0x8>) e: 6018 str r0, [r3, #0] 10: 4770 bx lr 12: 46c0 nop (mov r8, r8) 14: 00000000 .word 0x00000000 14: R_ARM_ABS32 .bss
You can see that each function has a literal pool (at address 0x8
and 0x14). You can also see that there is a relocation associated with
each of these addresses (R_ARM_ABS32 .bss). This
relocation means that at link time the address of value
will be stored at locations 0x8 and 0x14. So, what is the big deal
here? Well, there are two problems. First, we have two literal pools
containing duplicate data, by storing the address of
value twice, we are wasting 4 bytes (remember from
yesterday, we have a very tight memory budget and we care where every
byte goes). The second problem, is that we need to insert a nop
in the code (at address 0x6 and 0x12), because the literal pool must
be aligned.
So, how could the compiler be smarter? Well, if instead of generating a literal pool for each individual function it did it for the whole compilation unit, then instead of having lots of little literal pools with duplicated data through-out, we would have a single literal pool for the whole file. As a bonus, you would only need alignment once as well! Obviously if the compilation unit ends up being larger than 4KiB then you have a problem, but in this case you could still save up producing the literal pool until after 4KiB worth of code. As it turns out the commercial compiler from ARM, RVCT, does exactly this. So lets have a look at the code it generates:
00000000 <get_value>: get_value(): 0: 4802 ldr r0, [pc, #8] (c <set_value+0x6>) 2: 6800 ldr r0, [r0, #0] 4: 4770 bx lr 00000006 <set_value>: set_value(): 6: 4901 ldr r1, [pc, #4] (c <set_value+0x6>) 8: 6008 str r0, [r1, #0] a: 4770 bx lr c: 00000000 .word 0x00000000 c: R_ARM_ABS32 .data$0
You see that the code is more or less the same, but there is just
one literal pool right at the end of the file, and no extra
nops are needed for alignment. Without merging literal
pools we have a .text size of 24 bytes, with the merging we slash
that down to 16 bytes.
So merging literal pools is pretty good, but the frustrating thing is that in this example, we don’t even need the literal pool!. If we examine the final compiled image for this program:
Disassembly of section ER_RO:
00008000 <get_value>:
get_value():
8000: 4802 ldr r0, [pc, #8] (800c <set_value+0x6>)
8002: 6800 ldr r0, [r0, #0]
8004: 4770 bx lr
00008006 <set_value>:
set_value():
8006: 4901 ldr r1, [pc, #4] (800c <set_value+0x6>)
8008: 6008 str r0, [r1, #0]
800a: 4770 bx lr
800c: 00008010 .word 0x00008010
Disassembly of section ER_RW:
00008010 <value>:
8010: 00000000 .word 0x00000000
You should notice that the actual location of the value variable is
0x8010. At address 0x800c we have the literal pool storing
the address of a variable which is in the very next word! If we optimised this
by hand, we would end up with something like (need to verify the offset):
Disassembly of section ER_RO:
00008000 <get_value>:
get_value():
8000: 4802 ldr r0, [pc, #4] (8008 <set_value+0x8>)
8002: 4770 bx lr
00008004 <set_value>:
set_value():
8004: 6008 str r0, [pc, #0]
8006: 4770 bx lr
Disassembly of section ER_RW:
00008008 <value>:
8008: 00000000 .word 0x00000000
If we get rid of the literal pool entirely, we save the memory of the literal pool itself (4 bytes), plus the two instructions need to load values out of the literal pool (4 bytes). This cuts our text size down to a total of only 8 bytes! This is a factor 3 improvement over the GCC generated code. Granted, you are not always going to be able to perform this type of optimisation, but when you care about size, it is quite important. It would be nice if gcc supported a small data section concept so that you could specify variables that essentially resised within the literal pool instead of needing an expensive (in terms of space and time) indirection.
For this project, it looks like the code will have to be hand tweaked assembler, which is frustrating, because when you use a person as your compiler iterations become really expensive, and you want to make sure you get your design right up front.
![]()
From RavenBlack:
New Year's Encouragements. Instead of making pressurey resolutions for yourself, make positive uplifting recommendations for other people. No negativity allowed, and try not to even imply something negative (eg. "eat better" implies you were eating poorly, but "make delicious home-cooked meals at least once a week" is pretty cleanly positive, and "make more delicious home-cooked meals because your cooking is great" is better still.)
Anyone with encouragements of this positive type may contact me via my preferred method or my LiveJournal, if you have access. (I am beginning, finally, to think about allowing comments on puzzling.org directly, but it's not likely to happen very soon.)
You are hereby invited to do this in your own weblog.
A project that I’m working on at the moment calls for a very small footprint. This post is about how to make code really small for the ARM architecture.
As you can probably guess, I’m interested in operating system code, so as an example, I’ve taken a very simple piece of operating system code, and stripped it right back to demonstrate some of the techniques to use when optimising for space. So here is the snippet of code we are going to optimise:
01 struct thread {
02 unsigned int notify_bits;
03 };
04
05 unsigned int
06 poll(struct thread *thread, unsigned int mask)
07 {
08 unsigned int result;
09 result = thread->notify_bits & mask;
10 if (result) {
11 thread->notify_bits &= ~result;
12 }
13 return result;
14 }
In this very simple operating system we have threads (thread data
is stored in struct thread). Each thread has a set of 32
signals (encoded in a single word notify_bits). This poll
is used by a thread to determine if it has been sent any signals. The
mask parameter is the set of signals that the thread is
interested in checking. So, a thread can check if a single signal has
been thread, or if any signal has been set, or if a specific subset of
signals has been set. The function returns the signals that are
available (which is simply the bit-wise and of
notify_bits and mask). It is important that
the function clears any signals that have been returned. This makes
sure that if poll is called twice the same signals are
not returned. This is achieved in lines 10—12.
So, our goal is to try and get this code as small as possible.
And every byte counts! First off, we just try and compile
this with the standard ARM gcc compiler. I’m using version 4.2.2.
So we start with: $ arm-elf-gcc -c poll.c -o poll.o.
We can then use object dump to work out what the compiler did:
$ arm-elf-objdump -dl poll.o.
00000000 <poll&ht:
poll():
0: e1a0c00d mov ip, sp
4: e92dd800 push {fp, ip, lr, pc}
8: e24cb004 sub fp, ip, #4 ; 0x4
c: e24dd00c sub sp, sp, #12 ; 0xc
10: e50b0014 str r0, [fp, #-20]
14: e50b1018 str r1, [fp, #-24]
18: e51b3014 ldr r3, [fp, #-20]
1c: e5932000 ldr r2, [r3]
20: e51b3018 ldr r3, [fp, #-24]
24: e0023003 and r3, r2, r3
28: e50b3010 str r3, [fp, #-16]
2c: e51b3010 ldr r3, [fp, #-16]
30: e3530000 cmp r3, #0 ; 0x0
34: 0a000006 beq 54 <poll+0x54>
38: e51b3014 ldr r3, [fp, #-20]
3c: e5932000 ldr r2, [r3]
40: e51b3010 ldr r3, [fp, #-16]
44: e1e03003 mvn r3, r3
48: e0022003 and r2, r2, r3
4c: e51b3014 ldr r3, [fp, #-20]
50: e5832000 str r2, [r3]
54: e51b3010 ldr r3, [fp, #-16]
58: e1a00003 mov r0, r3
5c: e24bd00c sub sp, fp, #12 ; 0xc
60: e89da800 ldm sp, {fp, sp, pc}
So, our first go gets us 100 bytes of code. Which is 10 bytes
(or 2.5 instructions) for each of our lines of code. We should be able
to do better. Well, the first thing is we should try is to use
optimisation: $ arm-elf-gcc -c poll.c -o poll.o -O2. This
gives us a much better code generation output:
00000000 <poll>: poll(): 0: e5902000 ldr r2, [r0] 4: e1a0c000 mov ip, r0 8: e0110002 ands r0, r1, r2 c: 11e03000 mvnne r3, r0 10: 10033002 andne r3, r3, r2 14: 158c3000 strne r3, [ip] 18: e12fff1e bx lr
So this got us down to 28 bytes. A factor 4 improvement for one
compiler flag, not bad. Now, -O2 does some standard
omptimisations, but -Os, will do optimisations
specifically for reducing the amount of code. So trying:
$ arm-elf-gcc -c poll.c -o poll.o -Os. This gives
a little bit better code-gen:
00000000 <poll>: poll(): 0: e5903000 ldr r3, [r0] 4: e1a02000 mov r2, r0 8: e0110003 ands r0, r1, r3 c: 11c33000 bicne r3, r3, r0 10: 15823000 strne r3, [r2] 14: e12fff1e bx lr
Down to 24 bytes (6 instructions), is pretty good. Now, as you can
see the generated code has 32-bits per instruction. The some of the
ARM architectures have two distinct instruction sets,
ARM and Thumb. The Thumb instruction
set uses 16-bit per instruction, instead of 32-bit. This denser
instruction set can enable much smaller code sizes. Of course there is
a trade-off here. The functionality of the 16-bit instructions is
going to be less than the 32-bit instructions. But lets give it a
try. At the same time, we will tell the compiler the exact CPU we want
to compile for (which is the ARM7TDMI-S) in our case. The compiler
line is: $ arm-elf-gcc -c poll.c -o poll.o -Os -mcpu=arm7tdmi
-mthumb.
Which produces code like:
00000000 <poll>: poll(): 0: 6803 ldr r3, [r0, #0] 2: 1c02 adds r2, r0, #0 4: 1c08 adds r0, r1, #0 6: 4018 ands r0, r3 8: d001 beq.n e <poll+0xe> a: 4383 bics r3, r0 c: 6013 str r3, [r2, #0] e: 4770 bx lr
So, now we are down to 16 bytes, so in Thumb we need 8 instructions (2 more than ARM), but each is only 2 bytes, not 4, so we end up with a 1/3 improvement. To get any further, we need to start looking at our code again, and see if there are ways of improving the code. Looking at the code again:
00 unsigned int
01 poll(struct thread *thread, unsigned int mask)
02 {
03 unsigned int result;
04 result = thread->notify_bits & mask;
05 if (result) {
06 thread->notify_bits &= ~result;
07 }
08 return result;
09 }
You may notice that the branch instruction on line 5, you may
notice that this is actually redundant. If result is
zero, then ~result well be 0xffffffff. Given
this thread->notify_bits &= 0xffffffff will not change
the value of thread->notify_bits. So, we can reduce this
to:
00 unsigned int
01 poll(struct thread *thread, unsigned int mask)
02 {
03 unsigned int result;
04 result = thread->notify_bits & mask;
05 thread->notify_bits &= ~result;
06 return result;
07 }
When we compile this we get down to:
00000000 <poll>: poll(): 0: 6803 ldr r3, [r0, #0] 2: 4019 ands r1, r3 4: 438b bics r3, r1 6: 6003 str r3, [r0, #0] 8: 1c08 adds r0, r1, #0 a: 4770 bx lr
This gets us down to 6 instructions (12 bytes). Pretty good since
we started at 100 bytes. Now lets look at the object code in a little
bit more detail. If you look at address 0x8, the instruction simply
moves register r1 into register r0 so that
it in the right place for return. (Note: The ARM ABI has the
return value stored in r0). This seems like a bit of a waste, it
would be good if there was a way we could have the value stored
in r0 and not waste an instruction just moving values
between registers. Now, to get a better understanding of what the
instructions are doing, I’m going to slightly rewrite the code, and
then compile with debugging, so we can see how the generated code
matches up with the source code. So first, lets rewrite the code a little
bit:
00 unsigned int
01 poll(struct thread *thread, unsigned int mask)
02 {
03 unsigned int tmp_notify_bits = thread->notify_bits;
04 mask &= tmp_notify_bits;
05 tmp_notify_bits &= ~mask;
06 thread->notify_bits = tmp_notify_bits;
07 return mask;
08 }
You should convince yourself that this code is equivalent to the
existing code. (The code generated is identical). So, we can now line
up the source code with the generated code. On line 03 (unsigned
int tmp_notify_bits = thread->notify_bits), this matches up
with address 0x0 (ldr r3, [r0, #0]). So, register
r3 is used to store variable
tmp_notify_bits. The parameter thread is stored in
register r0. Now, line 04 (mask &=
tmp_notify_bits) matches directly with address 0x2 (ands
r1, r3). Register r1 matches directly with the
mask parameter. The important part of restructuring the
code as we have done, is that it becomes obvious that we can directly
using the mask parameter instead of needing an extra variable like in
the previous code. As we continue line 05 (tmp_notify_bits &=
~mask), matches directly to 0x4 (bics r3, r1). The bics
instruction is quite neat in that it can essentially do the &=
~ in one 16-bit instruction. Line 06
(thread->notify_bits = tmp_notify_bits) stores the
result back to memory, matches directly to 0x6. Now the final line of
code (return mask;), needs two instructions 0x8 and 0xa
(adds r0, r1, #0 and bx lr). Now the reason we need to instructions
is because mask is stored in register r1, and the return
value needs to be in register r0. So how can we get mask
to be stored in r0 all along? Well, if we switch the
parameters around poll(unsigned int mask, struct thread
*thread), the mask will instead be stored in
r0 instead of r1. (Note: We don’t
necessarily have to change the interface of the function. If we want
to support keep the interface we can use a simple macro to textually
swap the parameters.) If we compile this, we get the following
generated code:
00000000 <poll>: poll(): 0: 680b ldr r3, [r1, #0] 2: 4018 ands r0, r3 4: 4383 bics r3, r0 6: 600b str r3, [r1, #0] 8: 4770 bx lr
So, we have got this down to 5 instructions, 10 bytes. This is a factor 10 improvement, not bad!
So a bit of a review:
-Os to optimise for space.
あけましておめでとうございます。
日本よりも2時間早く、年が明けました~
友人宅でのパーティーに参加してたんだけど、バルコニーからハーバーブリッヂ、大きな木を挟んで見えてたから、こりゃ花火も観れるわ~
と、楽しみに呑み騒ぎながら待ってた。
そうそう、2008年て、いつもよりも1秒多かったんだって。知ってるかい?
カウントダウンは51秒からスタートして、60秒まで数えてその後0になったんだよ~。(通常は50秒からスタートして59秒の次が0だよね。)
地球の回転がちょっとだけ変だったから、調整するためだったようだけど、いままでもこんなことってあったのかねえ~。
そんで0になった瞬間、ドドーンと一斉に打ち上がった花火
友人宅からはハーバーブリッヂ含めて3カ所分の花火が一気に見えた~
体感したい方はコチラ
ごらんあれ。
http://www.smh.com.au/photogallery/2009/01/01/1230681584600.html
Tim (a member of my local LUG) writes about some observations he has made of a nearby river and speculates on a tidal bore-like phenomenon [1]. One thing that surprised me was how short the list was on the Tidal Bore Wikipedia page [2], and the fact that is it missing ...
One year I'd like to do the same project Skud is doing for 2009, that is: a resolution a week. But this year is a finishing year for me, not a starting year.
A quick wrap-up on 2008 resolutions:
The first half of 2008 was pretty difficult for me. In retaliation, Andrew and I screamed around the south island of New Zealand in August. I do recommend its recuperative powers.
Major goals for 2009:
Other plans in 2009:
Also, given the PhD submission thing, I will probably be looking for a job towards the end of 2009. I do not know yet if I am going to apply for postdoctoral positions, this will probably depend on achieving a couple of major conference acceptances in 2009. And on deciding whether I want to live in the northern hemisphere. Even if I am I may be looking for programming or similar work in the short-term to wait out my examination process. So, keep in touch, if you want to offer me work, or come to my submission party. Or both.
![]()
A teacher in Arizona steals Linux CDs from a student and then accuses a Linux distributor of being a criminal [1]. Even though she had used Linux in the past she didn't believe that software was free. Of course that implies that in the past she had performed ...
今日は朝から大快晴で夏日和なシドニー
。
いよいよ今年最後の日
とはやっぱり感じにくいけど。。。
クリスマスのあとは、近所の友達の実家で庭でパラソルの下でランチを食べてる最中、シドニーの夏らしく通り雨がやってきた。パラソルは雨用のパラソルではなく、ただの日よけ用だったので、霧のような雨が食卓に降り注ぐ中、ウチ以外だれも気にせず庭でのランチを楽しんでいた。う~ん、何事に置いても日本での感覚からみてかなりニブイ、もといおおらかなオーストラリアの人々である。
ここ2、3日は、特に夏らしくなってきて、汗を垂れ流しながらアイロンをかけたり、ドライヤーで髪を乾かしてる尻から汗が噴き出してきて結局乾かないとか、夏を満喫
している。
クリスマス・ボクシングデーのあとは、普通にみんな31日まで仕事して、新年2日から仕事再開というスケジュールなので、私も仕事をしようかと試みるが、連日の飲み倒しによる二日酔いと暑さと日本では休みだという気持ちから、全くやる気がおきない~。。。
そして今日はどんちゃん騒ぎなニューイヤーズイブ。シドニーシティでは、大半の道路が通行止めになり、夜のハーバー五カ所で開催される花火打ち上げ
に備えるよう。
メインはやっぱりハーバーブリッヂ。ハーバーブリッヂの上から打ち上げるブリッヂ型の花火シャワーは、今年も目玉になるのだろう。私も観に行きたかったが、人ごみキライやし、今晩は友達の家でパーティーなので、メインの花火は観ないと思う。その友達の家も、ハーバーから近いので、別な場所から打ち上げる花火は見えるかもしれない
。
さて、今から今晩のパーティーに持参するしょっぱいパウンドケーキを作ろう~
みなさま、来年もどうぞ宜しくお願いいたします~。
![]()

NASA Image of the Day - Pipsqueak Star Unleashes Monster Flare
Haskell's Parsec parsing library (distributed with the GHC compiler since at least version 6.8.2) contains a very powerful expression parsing module Text.ParserCombinators.Parsec.Expr. This module makes it easy to correctly parse arithmetic expressions like the following according to the usual programming language precedence rules.
2 + a * 4 - b
Once parsed, the abstract syntax tree for that expression should look like this:
The original Parsec paper by Daan Leijen even gives a small example of using the expression parser. As can be seen below, the operators are defined in a table (ie a list of lists) where the outer list is a set precedence levels (from highest precedence to lowest) and each inner list specifies all the operators for that precedence level.
module Parser
where
import Text.ParserCombinators.Parsec
import Text.ParserCombinators.Parsec.Expr
expr :: Parser Integer
expr = buildExpressionParser table factor <?> "expression"
table :: [[ Operator Char st Integer ]]
table = [
[ op "*" (*) AssocLeft, op "/" div AssocLeft ],
[ op "+" (+) AssocLeft, op "-" (-) AssocLeft ]
]
where
op s f assoc = Infix (do { string s ; return f }) assoc
factor = do { char '(' ; x - expr ; char ')' ; return x }
<|> number
<?> "simple expression"
number :: Parser Integer
number = do { ds <- many1 digit; return (read ds) } <?> "number"
The above simple example works fine but there is a potential for a rather subtle problem if the expression parser is used in conjunction with the Text.ParserCombinators.Parsec.Token module.
The problem arises when trying to parse expressions in C-like languages which have bitwise OR (|) as well as logical OR (||) and where the bitwise operation has higher precedence than the logical. The symptom was that code containing the logical OR would fail because the expression parser was finding two bitwise ORs. After banging my head against this problem for a considerable time, I posted a problem description with a request for clues to the Haskell Cafe mailing list with the following code snippet:
import qualified Text.ParserCombinators.Parsec.Expr as E
opTable :: [[ E.Operator Char st Expression ]]
opTable = [
-- Operators listed from highest precedence to lowest precedence.
{- snip, snip -}
[ binaryOp "&" BinOpBinAnd E.AssocLeft ],
[ binaryOp "^" BinOpBinXor E.AssocLeft ],
[ binaryOp "|" BinOpBinOr E.AssocLeft ],
[ binaryOp "&&" BinOpLogAnd E.AssocLeft ],
[ binaryOp "||" BinOpLogOr E.AssocLeft ]
]
binaryOp :: String -> (SourcePos -> a -> a -> a)
-> E.Assoc -> E.Operator Char st a
binaryOp name con assoc =
E.Infix (reservedOp name >>
getPosition >>=
return . con) assoc
Unfortunately, no real answer was forthcoming and while I still held out hope that an answer might eventuate, I continued to work on the problem.
Eventually, after reasoning about the problem and looking at the source code to the Token module I realised that the problem was with the behaviour of the reservedOp combinator. This combinator was simply matching the given string at the first precedence level it found so that even if the code contained a logical OR (||) the higher precedence bitwise OR would match leaving the second vertical bar character un-parsed.
My first attempt at a solution to this problem was to define my own combinator reservedOpNf using Parsec's notFollowedBy combinator and use that in place of the problematic reservedOp.
opChar :: String opChar = "+-/%*=!<>|&^~" reservedOpNf :: String -> CharParser st () reservedOpNf name = try (reservedOp name >> notFollowedBy (oneOf opChar))
This solved the immediate problem of distinguishing between bitwise and logical OR, but I soon ran into another problem parsing expressions like this:
if (whatever == -1)
.....
which were failing at the unary minus.
A bit of experimenting suggested that again, the problem was with the behaviour of the reservedOp combinator. In this case it seemed the combinator was matching the given string, consuming any trailing whitespace and then the notFollowedBy was failing on the unary minus.
Once the problem was understood, the solution was easy, replace the reservedOp combinator with the string combinator (which matches the string exactly and doesn't consume trailing whitespace), followed by the notFollowedBy and then finally use the whiteSpace combinator to chew up trailing white space.
reservedOpNf :: String -> CharParser st ()
reservedOpNf name =
try (string name >> notFollowedBy (oneOf opChar) >> whiteSpace)
Despite minor problems like the one above, I am still incredibly impressed with the ease of use and power of Parsec. Over the years I have written numerous parsers, in multiple host languages, using a number of parser generators and I have never before used anything that comes close to matching Haskell and Parsec.
![]()
Just before Newtonmas (also known as xmas to some) I had a small hackfest in the new house that my wife and I are calling "The House that Hack Bought". We had horms, AfC, Scott, raster, kfish, pmiller and myself hacking and chatting from about 10am.
There were 5 of us in the living room:
while kfish and pmiller hacked in the sun room:
and Scott even managed to get a bit of hardware hacking done in the kitchen:
It was a great day. Thanks to all who participated.
Having used Paige's nice sharp Global knife, I knew going back to my not so sharp knife was going to be unbearable. I always want to be able to get the best out of what I have.
I've got a Japanese cook's knife, and a French style paring knife. You might ask why I chose such an odd combination. Of course, there's a simple explanation - I had no idea what I was doing when I bought those, I just wanted something better than what I was using.
My Mum bought me a diamond steel last year for Christmas, which I've been using to sharpen and hone both my knives. Anyone who knows anything about knives is by now recoiling in horror, because they'll know that Japanese and French knives are quite different, and you shouldn't be using French sharpening technique on a Japanese knife. French knives have much thicker blades, and are usually sharpened on an angle of 20 to 30 degrees. A stone is best, but a few quick swipes with a diamond steel will keep your French knife in good condition for months at a time until it needs sharpening on a stone or machine again.
Japanese knives, with their much thinner blades, need to be sharpened differently - the industrial diamond on a diamond steel is too coarse. They're meant to be sharpened so that the edge has an angle of 10 to 15 degrees. Everything I've read, and been told by people in knife shops, says that you should use a stone. I resisted buying a stone for a long long time, because I'm a miser and hate spending money when I think I've got something that should already do the job. My brother, who's been sharpening his knives on a stone for a long time, also told me that the technique is difficult to learn.
After I gave up on the steel and bought yet another tool for sharpening, I'd have to say the result was worth it, and the while I certainly haven't mastered the technique, it's not that hard to get started. There's a pretty good explanation here that I was able to follow without too much trouble.
You might wonder why the French and Japanese knives are so different. French cuisine is very different from Japanese, with the former having emphasis on complicated technique, while the latter seems to emphasise careful preparation and presentation of ingredients, often raw e.g. sushi. So one possible explanation is that the Japanese knives are designed for the more precise and intricate cutting required by that cuisine, while the French just want to cleave through the vegetables so they can get on with making the demi-glace.
Disclaimer: I am not a chef. I haven't mastered any cuisine. This is pure speculation. Get off my lawn.
I went to Allans Music today hoping to get a cheap jazz guitar, or at least a cheaper jazz guitar. No such luck, even though some of the guitars were substantially reduced, they were still way out of my price range. Guitars are getting really expensive, although good guitars always have been. I think unless I'm willing to put down some serious cash, I'm going to be rocking the Strat for a few years yet. One of these days, I might just throw caution and good financial sense to the wind and buy the instrument I really want.
While I couldn't afford a new guitar on a student budget, I could afford to get a book. After having a look around, I found a nice book called the Chord Factory, from Berklee Press. Guitar players in jazz groups spend most of their time accompanying soloists, and the chords can get pretty advanced. Most of the chord books I've seen give you page after page of chord forms in every possible key, and then the implication is that you should memorise them, or maybe look them up as needed. I've never found this approach very useful - I'd much prefer to understand how the chords are built and how I can build voicings to suit my needs in a particular musical situation. I'd also really like to know how these chords are meant to be used. The closest I'd ever seen to what I was looking for was Chord Chemistry by Ted Greene, but that wasn't quite what I wanted either.
I worked out how to build voicings for major and minor chords with root notes on any string, got most of the way with dominant, minor and major seventh chords, and have no idea what to do with ninths and further extensions. I don't know how to voice those chords on guitar either, and didn't have much idea how to started. As the name suggests, the Chord Factory book is all about learning approaches to building chord voicings, rather than learning a couple of chord voicings by rote and hoping they're always going to be enough, no matter what your sadistic band leader puts on your music stand and expects you to play.
Just as important is that the book also begins to explain how to use these chords in a progression, which I've never been able to get any understanding of before. Finally ...
Given the recent brouhaha in Debian, and General Resolution regarding Lenny’s Release policy as it relates to Firmware and Debian’s Social Contract, which has led to the resignation of Manoj Srivastava from the position of Secretary for the Debian Project, I’m reminded of the following passage from Gordon Dickson’s Tactics of Mistakes (part of Dickson’s Childe Cycle, in which he tells the story of the rise of the Dorsai):
“No,” said Cletus. “I’m trying to explain to you why I’d never make an Exotic. In your calmness in the face of possible torture and the need to kill yourself, you were showing a particular form of ruthlessness. It was ruthlessness toward yourself—but that’s only the back side of the coin. You Exotics are essentially ruthless toward all men, because you’re philosophers, and by and large, philosophers are ruthless people.”
“Cletus!” Mondar shook his head. “Do you realize what you’re saying?”
“Of course,” said Cletus, quietly. “And you realize it as well as I do. The immediate teaching of philosophers may be gentle, but the theory behind their teaching is without compunction—and that’s why so much bloodshed and misery has always attended the paths of their followers, who claim to live by those teachings. More blood’s been spilled by the militant adherents of prophets of change than by any other group of people down through the history of man.”
The conflict between idealism and pragmatism is a very old one in the Free and Open Source Software Movement. At one end of the spectrum stands Richard Stallman, who has never compromised on issues regarding his vision of Software Freedom. Standing at various distances from this idealistic pole are various members of the Open Source Community. For example, in the mid-1990’s, I used to give presentations about Linux using Microsoft Powerpoint. There were those in the audience that would give me grief about using a non-free program such as MS Powerpoint, but my response was that I saw no difference between driving a car which had non-free firmware and using a non-free slide presentation program. I would prefer to use free office suite, but at the time, nothing approached the usability of Powerpoint, and while dual-booting into Windows was a pain, I could do a better job using Powerpoint than other tools, and I refused to handcap myself just to salve the sensibilities of those who felt very strongly about Free Software and who viewed the use of all non-Free Software as an ultimate evil that must be stamped out at all costs.
It is the notion of Free Software as a philosophy, with no compromises, which has been the source of many of the disputes inside Debian. Consider, if you will, the first clause of the Debian Social Contract:
Debian will remain 100% free
We provide the guidelines that we use to determine if a work is
freein the document entitledThe Debian Free Software Guidelines. We promise that the Debian system and all its components will be free according to these guidelines. We will support people who create or use both free and non-free works on Debian. We will never make the system require the use of a non-free component.
This clause has in it no room for compromise. Note the use of words such as “100% free” and “never make the system require the use of a non-free component” (emphasis mine). In addition, the Debian Social Contract tends to be interpreted by Computer Programmers, who view such imperatives as constraints that must never be violated, under any circumstances.
Unfortunately, the real world is rarely so cut-and-dried. Even the most basic injunctions, such as “Thou shalt not kill” have exceptions. Few people might agree with claims made by the U.S. Republican Party that the war in Iraq qualified as a Just War as defined by Thomas Aquinas, but rather more people might agree that the July 20, 1944 plot to assassinate Hitler would be considered justifiable. And most people would probably agree most of the actions undertaken by the Allied Soldiers on World War II battlefields that involved killing other soldiers would be considered a valid exception to the moral (and for those in the Judeo-Christian tradition, biblical) injunction, “Thou shalt not kill“.
As another example, consider the novel and musical Les Misérables, by Victor Hugo. One of the key themes of this story is whether or not “Thou shalt not steal” is an absolute or not. Ultimately, the police inspector Javert, who lived his whole life asserting that law (untempered by mercy, or any other human considerations) was more important than all else, drowns himself in the Seine when he realizes that his life’s fundamental organizing principle was at odds with what was ultimately the Right Thing To Do.
So if even the sixth and eighth commandments admit to exceptions, why is it that some Debian developers approach the first clause of the Debian Social Contract with a take-no-prisoners, no-exceptions policy? Especially given the fourth clause of the Debian Social contract:
Our priorities are our users and free software
We will be guided by the needs of our users and the free software community. We will place their interests first in our priorities. We will support the needs of our users for operation in many different kinds of computing environments. We will not object to non-free works that are intended to be used on Debian systems, or attempt to charge a fee to people who create or use such works. We will allow others to create distributions containing both the Debian system and other works, without any fee from us. In furtherance of these goals, we will provide an integrated system of high-quality materials with no legal restrictions that would prevent such uses of the system.
This clause does not have the same sort of absolutist words as the first clause, so many Debian Developers have held that the “needs of the users” is defined by “100% free software”. Others have not agreed with this interpretation — but regardless of how “needs of the users” should be interpreted, the fact of the matter is, injuctions such as “Thou shalt not kill” are just as absolute — and yet in the real world, we recognize that there are exceptions to such absolutes, apparently unyielding claims on our behavior.
I personally believe that “100% free software” is a wonderful aspirational goal, but in particular with regards to standards documents and firmware, there are other considerations that should be taken into account. People of good will may disagree about what those exceptions should be, but I think one thing that we should consider as even higher priority and with a greater claim on how we behave is the needs of our users and fellow developers as people. For those who claim Christianity as their religious tradition, Jesus once stated,
Thou shalt love the Lord thy God with all thy heart, and with all thy soul, and with all thy mind. This is the first and great commandment. And the second is like unto it: Thou shalt love thy neighbour as thyself. On these two commandments hang all the law and the prophets.
Even for those who do not claim Christianity as their religious tradition, most moral and ethical frameworks have some variant on the Golden Rule: “Do unto others as you would have them do unto you”. I would consider, for example, that the Golden Rule is at least a high priority claim on my behavior as the notion of free speech, and in many cases, it would be a higher priority claim. The recent controversy surrounding Josselin Mouette was started precisely because Joss has taken a something which is a good thing, namely Free Speech, and relegated it to a principle more important than all else, and claiming that any restraint on such a notion was equivalent to censorship.
I think the same thing is true for free software, although it is a subtler trap. Philosophical claims than “100% free software” as most important consideration is dangerously close to treating Free Software as the Object of Ultimate Concern — or in religious terms, idolotry. For those who are religious, it’s clear why this is a bad thing; for those who aren’t — if you are unwilling to worship a supernatural being, you may want to very carefully consider whether you are willing to take a philosophical construct and raise it to a position of commanding your highest allegiance to all else, including how you treat other people.
Ultimately, I consider people to be more important than computers, hardware or software. So over time, while I may have had some disagreements with how Mark Shuttleworth has run Canonical Software and Ubuntu (but hey, he’s the multimillionaire, and I’m not), I have to give him props for Ubuntu’s Code of Conduct. If Debian Developer took the some kind of Code of Conduct at least as seriously as the Social Contract, I think interactions between Debian Developers would be far more efficient, and in the end the project would be far more successful. This may, however, require lessening the importance of philosophical constructs such as Free Speech and Free Software, and perhaps becoming more pragmatic and more considerate towards one another.
Over the last week or so I've been learning Haskell. Unlike a previous attempt to learn Erlang I think this one will actually end up bearing fruit. The main thing that slowed down my attempt to learning Erlang was that I didn't have a task to apply it to that exploited the strengths of the language.
For Haskell I have a task that is ideally suited to the language and plays to its strengths; writing a parser for an Javascript-like language. Just like when I learned Ocaml I am jumping in at the deep end with a difficult problem and learning the language on the way (ie as a side-effect of tackling the task itself).
Normally my tool of choice for this task would be Ocaml and I did in fact start out writing a parser in Ocaml using the Menhir parser generator. Menhir creates a parser from an LR(1) grammar specification where the LR(1) means that the grammar is left recursive and must need no more than one token look-ahead to resolve ambiguities. Over two days of intense hacking I made quite a lot of progress on the grammar and then hit a wall; numerous rather incomprehensible shift/reduce and reduce/reduce warnings plus a part of the grammar (raw XML embedded directly into the source code) which was not context free and for which I could not see any possible way of parsing with only one token look-ahead.
Since I already knew Ocaml, I looked around at some of the other parser generators for Ocaml like Aurochs and PCL. Aurochs uses Parsing Expression Grammar (or Packrat parsing) and seemed interesting, but was rejected because it seemed to gave very little control over the Abstract Syntax Tree it generated. PCL on the other hand was a monadic parser combinator library which is basically a port of Haskell's Parsec library to Ocaml. The main problem I saw with PCL was that it used Monads and lazy evaluation, neither of which are very common in standard Ocaml code. PCL was also rather new and I wasn't sure if it was stable and mature enough for quite an advanced parsing task.
Of course I had