Using Free and Open Source Software Principles to Create an Integrated Multilingual Desktop Environment

Horms, Tokyo, June 2005
Revised February 2006

Abstract

Free and Open Source software relies on the ability of users to create their own solutions, and on collaboration, to produce high quality software at a rapid rate. This discussion looks at how similar techniques could be used to improve multilingual desktop environments.

Being a programmer, and thus spending a lot of my life in front of a computer screen, when I initially started learning Japanese one of the first things I did was to set up a Japanese language environment on my Linux desktop. I did this by cobbling together a configuration based on information available on the internet. From time to time friends of mine who are interested in learning Japanese ask me for assistance in getting this working, and I occasionally repeat the process when I get new equipment.

Recently a friend asked me to help him out with this task on his laptop running Ubuntu, which otherwise worked out of the box. Again, hand editing of configuration files was required to get things running. A few weeks later I was talking about Linux as a desktop with my Japanese teacher, whose computer skills don't extend much further than being able to locate the power button. She can only speak Japanese and thus any solution that did not present Japanese out of the box was not going to work for her. Ubuntu was out of the question, and I got to thinking about how to improve things.

At first I thought about some fairly basic integration issues, that is, how to get a Linux desktop working out of the box. In a nutshell this means having the text of the interface translated to Japanese, and providing Japanese fonts and a Japanese input environment. I soon realised that Asian focused Linux distributions, such as Sun Wah Linux and Miracle Linux, already provide Asian language support in a form that works from install, and that with the growing prevalence of UTF-8 it was only really a matter of time before more English/European language centred distributions such as Ubuntu were able to do this too.

I then thought about some more implementation-oriented issues, basically making input-systems work out of the box. I also thought briefly about the problem of fonts, specifically that there are very few under a free licence that support Japanese. It was at this point that an idea struck.

In free and open source software, a key factor that drives innovation and rapid development of high quality software is the freedom for individuals and groups to improve software so that it meets their needs. This tends to lead to collaboration between groups of people with common goals. A good example of this is the Linux Kernel itself, which includes code contributed by thousands of people. When the kernel doesn't behave correctly, people are free to modify it and make their changes available. In many cases these changes make it back into the main code and thus are made available in the next release. In many other cases they are maintained as a set of patches - customisations that users can add to their system if they wish. This leads to a system where software is developed very rapidly to meet the needs of users. So if my needs from a Japanese language input system aren't being met, perhaps something can be learnt from the customisation and collaboration that occurs during the development cycle of free and open source software projects.

A Japanese language environment makes heavy use of dictionary-like lookups of information. A non-native speaker will typically want a dictionary available to translate words to and from their native language. Native and non-native speakers alike make use of an engine to convert kana into kanji. This process boils down to parsing the input text into individual words and then converting the words into kanji based on a dictionary lookup. There are also online dictionaries such as Goo Jisho, translation engines such as Babel Fish, and there is Pop Jisyo, a proxy server of sorts which presents an enhanced version of a given web page, popping up a window that translates the word that your mouse pointer is over. These are all to a greater or lesser extent dictionary lookups.
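The kana to kanji step described above can be sketched roughly as follows. This is only an illustration, assuming a greedy longest-match split of the input and a tiny hypothetical dictionary; the function name and the dictionary contents are made up for the example, and real conversion engines use far more sophisticated parsing and offer the user a choice between candidates.

```python
def kana_to_kanji(kana, dictionary):
    """Greedy longest-match conversion (illustrative only).

    `dictionary` maps a kana reading to a single kanji spelling.
    Text that matches no reading is passed through unchanged.
    """
    longest = max(map(len, dictionary), default=1)
    output = []
    i = 0
    while i < len(kana):
        # Try the longest possible reading first, then shorter ones.
        for length in range(min(longest, len(kana) - i), 0, -1):
            chunk = kana[i:i + length]
            if chunk in dictionary:
                output.append(dictionary[chunk])
                i += length
                break
        else:
            # No reading starts here: keep the kana as it is.
            output.append(kana[i])
            i += 1
    return "".join(output)


# Hypothetical dictionary, for illustration only.
READINGS = {
    "にほん": "日本",
    "にほんご": "日本語",
    "べんきょう": "勉強",
}

converted = kana_to_kanji("にほんごをべんきょうする", READINGS)
```

Here `converted` is "日本語を勉強する": the longer reading "にほんご" wins over "にほん", and the particle and verb ending are left as kana.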

Although these systems are all dictionary lookups of sorts, they do not share a common interface for accessing them or for updating them. By access, I mean an API that allows applications to call on them and a protocol that allows them to be transferred over a network. And by update I mean the user's ability to annotate entries, and to create their own entries.

This led me to think of a system where a Pop Jisyo-like environment is embedded into the desktop, in the same way that accessibility extensions are available in GTK and in turn Gnome for physically less able people. Underneath the hood, applications, such as a Japanese dictionary, or the kana to kanji conversion engine, could have access to the same database of information using a standardised API. The same API could provide translations of text used in the user interface, and even font data. The local database could be built from multiple sources, at the end-user's discretion: perhaps a local copy of edict, a commercial dictionary in pdict format that was loaded onto the system from a CD-ROM, and access to online dictionaries, either free of charge or through a subscription service.
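To make the idea of a shared API a little more concrete, here is a minimal sketch. Every name in it (Entry, DictionarySource, InMemorySource, LanguageDatabase) is hypothetical and invented for this example; none of them is an existing library. The point is only that a dictionary application and a kana to kanji engine could sit on top of the same lookup call, with the user deciding which sources are consulted and in what order.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Entry:
    headword: str  # written form, e.g. kanji
    reading: str   # kana reading
    gloss: str     # translation or annotation
    source: str    # which source supplied the entry


class DictionarySource(Protocol):
    """Anything that can answer a lookup: a local file, a CD-ROM, a network service."""
    name: str

    def lookup(self, key: str) -> list[Entry]: ...


class InMemorySource:
    """Stand-in for a real source; indexes entries by headword and by reading."""

    def __init__(self, name, entries):
        self.name = name
        self._index = {}
        for headword, reading, gloss in entries:
            entry = Entry(headword, reading, gloss, name)
            for key in {headword, reading}:
                self._index.setdefault(key, []).append(entry)

    def lookup(self, key):
        return list(self._index.get(key, []))


class LanguageDatabase:
    """Consults the user's chosen sources in order of preference."""

    def __init__(self, sources):
        self.sources = list(sources)

    def lookup(self, key):
        results = []
        for source in self.sources:
            results.extend(source.lookup(key))
        return results

    def kanji_candidates(self, reading):
        """Candidates for a kana to kanji engine, drawn from the same data."""
        candidates = []
        for entry in self.lookup(reading):
            if entry.reading == reading and entry.headword not in candidates:
                candidates.append(entry.headword)
        return candidates


# The user's own entries are listed first, so they take priority.
personal = InMemorySource("personal", [("橋", "はし", "bridge")])
general = InMemorySource("general", [("箸", "はし", "chopsticks"),
                                     ("橋", "はし", "bridge")])
db = LanguageDatabase([personal, general])
```

With this data, `db.kanji_candidates("はし")` gives the candidates 橋 and then 箸, while `db.lookup("橋")` returns the entry from the personal source followed by the one from the general source.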

I envisage a system where users can create knowledge that the system uses to provide them with a multilingual environment. This may be new translations of words that are not in the dictionary, it may be colloquial terms that a user wants to make their own dictionary of, or it may be teaching a Pop Jisyo-like engine and at the same time a kana to kanji conversion engine about some new language constructs or alternate kanji for a word.

I also envisage a system where this knowledge is collaboratively created. This collaboration could be at a personal level, where it is only shared between the different systems that a user has access to. In this case I imagine a situation where a user sends the information that they have created to an online service from work, and they then synchronise this information to their PC at home. Synchronisation could also occur using portable media such as a USB memory stick or a floppy disk.
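One simple way that such personal synchronisation might work is sketched below. It assumes that each user-created entry carries a modification time and that the newer copy wins when two systems disagree; both the record layout and the function name are assumptions made for this example, and a real protocol would also need to deal with deletions and clock differences.

```python
def merge_user_entries(local, remote):
    """Merge two maps of {word: (modified, annotation)}.

    When both sides hold an entry for the same word, the one with the
    later modification time is kept. Neither input is changed.
    """
    merged = dict(local)
    for word, record in remote.items():
        if word not in merged or record[0] > merged[word][0]:
            merged[word] = record
    return merged


# Hypothetical entries: modification times are plain integers here.
work = {"飲み会": (10, "drinking party"), "残業": (12, "overtime")}
home = {"飲み会": (15, "drinking party (with colleagues)")}

synchronised = merge_user_entries(home, work)
```

After the merge, the home system has gained the entry for 残業 and has kept its own, newer annotation for 飲み会.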

The collaboration could be at a community level, where users share their dictionaries between their friends - friends that have an interest in having a desktop environment that allows them to readily input the colloquial language of that group of friends. Online services built around individuals and communities, such as Orkut and Mixi, already exist, and one can easily imagine this sharing being provided to users as part of such communities.

And collaboration could be done at a global level, where users make their dictionaries and language data available publicly, much in the manner of publishing free software, or contributing an entry to Wikipedia (http://www.wikipedia.com). This presents some challenges: addressing licensing of the material, avoiding poisoning of the collective work by material that was not authorised for release by the copyright holder - specifically people copying commercially available dictionaries - and protecting against spam type attacks. In all of these cases solutions can be sought from existing collaborative projects such as Wikipedia, and the open source and free software development community at large.

Collaboration on translations is already being done to some extent by the Rosetta component of Launchpad, which provides a translator's interface to supported projects through the web. This project came about in part from a desire to free would-be translators from the burden of having to learn the development cycle and tools of a given project in order to assist with translations.

In all, I am thinking about a system where users can contribute to the knowledge-base that is needed for a multilingual desktop environment. Such an environment should implicitly grow to meet the needs of users. And by sharing the knowledge that users create, a collaborative system that mirrors the power of language itself can be created.

I think that the key to this is to put in place the building blocks to describe the language information that is being stored and allow it to be easily accessed, stored and transferred: to define schemas for storing information, to provide protocols for transferring information, and to provide APIs that make this information available to applications. The reference implementations of these should be open source and the protocols made available under an open licence, reflecting the spirit of the project itself.

I have mainly spoken as a native English speaker who wants a desktop environment that also supports Japanese, and makes Japanese/English translation information more readily available. However I believe it is equally applicable to native Japanese speakers. I also believe that English and Japanese are sufficiently different that a system that can encapsulate both English and Japanese, and make this available through a consistent API, should also be able to handle many other language combinations, and in particular languages with ideographic alphabets such as Chinese, languages with phonetic alphabets such as Thai, languages with both, such as Korean and Japanese, and both Eastern and Western European languages.


Appendix A: Fonts - Separating Function and Style

With assistance from Carsten Haitzler aka Raster
February 2006

One of the greatest challenges in creating a font is consistency. This often means that a font is designed by an individual or team that manually ensures style consistency. For fonts designed to cover European languages this is a reasonably manageable task, as the number of characters required is fairly small. For example the English alphabet has 26 letters, and even allowing for upper and lower case, numbers, and commonly used symbols, the number of characters is in the order of a hundred.

Enter Asian languages. For example, the Japanese language makes use of four different alphabets and around 2,000 characters are needed for day to day use. This excludes characters which are only used in names, which themselves number in the thousands. Clearly creating such a large number of characters is a problem in itself. And manually enforcing style is a very difficult task.

Although there are many, many kanji characters, they are formed from a relatively small number of sub-characters called radicals. In Japanese there are only about 200 radicals. Although many of them have multiple forms, and there are special variations used in some characters, they are still much fewer than the kanji that they compose. So a system that allows characters to be composed of their radicals should greatly ease the burden of style consistency for Asian fonts.

In essence font designers could provide radicals that had a consistent style. And these radicals could be used in templates that describe a character. Fine tuning could be done by hand. But the important point is that for the most part describing what a character looks like, and describing the style of a font, have been separated.

This should allow contributions of new characters to be made, without the contributor needing to be a font designer familiar with the style of the prevailing font. It should also allow variants of a font to be made by changing the radicals.
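The separation of function and style described above could be represented along the following lines. This is a rough sketch under invented assumptions: a template places each radical in a box inside a unit square, and a style supplies the strokes for each radical as simple polylines. The names (Placement, TEMPLATES, compose) and the box coordinates are made up for the example, and real glyph outlines are of course far richer than a handful of line segments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Placement:
    """Where one radical sits inside the unit square of a character."""
    radical: str
    x: float
    y: float
    width: float
    height: float


# Function: what each character is made of. Independent of any style.
TEMPLATES = {
    "林": [Placement("木", 0.0, 0.0, 0.5, 1.0),
           Placement("木", 0.5, 0.0, 0.5, 1.0)],
    "明": [Placement("日", 0.0, 0.0, 0.5, 1.0),
           Placement("月", 0.5, 0.0, 0.5, 1.0)],
}


def compose(char, templates, style):
    """Build a character's strokes by scaling the style's radicals into place.

    `style` maps a radical to a list of strokes, each stroke being a list
    of (x, y) points within the unit square.
    """
    strokes = []
    for place in templates[char]:
        for stroke in style[place.radical]:
            strokes.append([(place.x + px * place.width,
                             place.y + py * place.height)
                            for px, py in stroke])
    return strokes


# Style: a deliberately crude drawing of 木 as a single vertical stroke.
crude_style = {"木": [[(0.5, 0.0), (0.5, 1.0)]]}

forest = compose("林", TEMPLATES, crude_style)
```

Swapping `crude_style` for a different set of radical drawings changes the look of every character built from those radicals, while a contributor adding a new character only has to add an entry to the templates.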

The idea of building kanji fonts from radicals isn't new. I believe that display devices with extremely limited memory - such as LED displays at train stations - employed this, and perhaps such devices still do. And perhaps it's not new in the field of font creation either. It seems a logical enough way to build up a font, even if the end user just gets a generated ttf at the end. But it does provide an excellent way to collaborate on font development.

I saw Andy Fitzsimon give a presentation at Linux.Conf.Au 2006. Part of what he talked about was separating style from form when producing icons. His argument was that to do a desktop properly a very large number of icons are needed. His idea is to try and build up a mechanism where icons can be provided, and then the style for the prevailing desktop can be added on top, thus avoiding the prohibitive task of redrawing all the icons. On reflection I realise that the icons problem and the fonts problem are exactly the same.


Appendix B: Fonts - Helping Font Sets

With assistance from Carsten Haitzler aka Raster
February 2006

Usually a font only covers a subset of the code points defined by Unicode. This is partly because there is an extremely large number of code points, and thus it is a very large amount of work to create glyphs for them all. And it's partly because most users only need a subset of the code points. For example, if you only ever read English text, then you only need the characters used in English.

As more and more information is made available over the internet, users' font needs become slightly more sophisticated. So while someone searching the internet may only read English, it's nice that search results in other languages come up with the correct characters. And of course, there are many people who do deal with multiple languages on a day to day basis.

Font sets are virtual fonts composed from multiple actual fonts. It's a reasonably simple scheme, where a list of fonts is given, and when a character is needed, each font is checked in turn, and the character is taken from the first font that provides it. This allows a character to be displayed, as long as at least one of the fonts provides it. And style issues aside, it works reasonably well. For an example of font sets, see Fontconfig.

However, the control over the ordering is quite coarse. And this is made worse by the fact that often fonts cover a wide range of characters, crossing over natural language boundaries.

As an example, let's say that someone wants to be able to read English, Japanese and Korean text. This is actually a reasonably common thing to want to do given the closeness of Japan and Korea to each other, and the prevalence of English in international communication. Three fonts are selected, which have English, Japanese and Korean respectively. It happens that the Japanese and Korean fonts also include English characters, but the user prefers the ones in the English font, so it is placed at the head of the list. The English font doesn't contain any Japanese or Korean characters, so it doesn't interfere with the Korean or Japanese fonts, which are placed second and third. However, the Korean font provides the Japanese phonetic alphabets, hiragana and katakana, and the Japanese font includes the Korean hangul characters. So regardless of how the Korean and Japanese fonts are ordered, the user is unable to select which characters use which font in the way that they would like.

What is really needed is to give the user finer grained control over the way that font sets are composed, while allowing the font set implementation to have font-ordering on a per character basis. To this end, it seems that breaking up fonts a bit more might be a good solution. The English font in the example above would likely be left as is under such a scheme. But the Korean and Japanese fonts could be broken up, providing variants that each contain only the English, only the Korean, or only the Japanese characters of the original font. Of course any character which is common, such as kanji that are present in both Japanese and Korean, could simply appear multiple times, or perhaps be broken out into their own font file.
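As a rough sketch of what breaking up a font might look like, the example below splits a font's coverage by script and then builds a font set from the pieces. It makes several simplifying assumptions: a font is modelled only as a name and a set of characters, scripts are identified by a few Unicode block ranges, and the names (split_font, first_font_with, the "JP" and "KR" fonts) are invented. A real tool would have to work on actual font files.

```python
# A few Unicode block ranges, enough for the example.
SCRIPT_RANGES = {
    "latin": [(0x0020, 0x007E)],
    "kana": [(0x3040, 0x309F), (0x30A0, 0x30FF)],  # hiragana, katakana
    "hangul": [(0xAC00, 0xD7A3)],
    "han": [(0x4E00, 0x9FFF)],
}


def script_of(char):
    """Classify a character by the block ranges above."""
    code_point = ord(char)
    for script, ranges in SCRIPT_RANGES.items():
        if any(low <= code_point <= high for low, high in ranges):
            return script
    return "other"


def split_font(name, coverage):
    """Break one font's coverage into per-script pieces, e.g. 'JP-kana'."""
    pieces = {}
    for char in coverage:
        pieces.setdefault(f"{name}-{script_of(char)}", set()).add(char)
    return pieces


def first_font_with(char, font_set):
    """Ordinary font set lookup: first font in the list that has `char`."""
    for name, coverage in font_set:
        if char in coverage:
            return name
    return None


# Both fonts cover Latin, kana, hangul and kanji, as in the example above.
japanese = split_font("JP", set("Aあア한日"))
korean = split_font("KR", set("Aあア한日"))

# The user now orders pieces rather than whole fonts.
font_set = [
    ("EN", set("A")),
    ("JP-kana", japanese["JP-kana"]),
    ("KR-hangul", korean["KR-hangul"]),
    ("JP-han", japanese["JP-han"]),
]

chosen = {c: first_font_with(c, font_set) for c in "Aあ한日"}
```

With the pieces ordered this way, kana come from the Japanese font and hangul from the Korean font, which no ordering of the two whole fonts could achieve.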

What I am really thinking about is a tool to allow fonts to be broken up. On general purpose operating systems, it might make sense to make fairly fine-grained font files - the packaging overhead is small and the needs of users vary wildly. And on other systems, it might make sense to try and compose a single font file that contains as many characters as possible - perhaps because they won't have font sets available at run time.

Copyright © 1995-2008 Horms
Last Modified: Sat, 04 Mar 2006 02:34:06 -0500