The Unicode Technical Committee is working on encoding emoji (絵文字) in the Unicode Standard and ISO10646. It has spurred loads of discussions on the Unicode mailing list with more than a handful of forked threads, leading to fundamental questions like whether we should even encode them, and what constitutes a character.
Not to worry, the Unicode consortium is a veteran when it comes to dealing with the hairy issues of creating standards that work across languages, cultures and geographic regions. They simply can’t please everyone.
To me, the motivation for this is clear — interoperability. The current state of affairs in the Japanese mobile industry leaves a lot to be desired: across the carriers, there exist different sets of supported emoji’s, different private-use characters, substitution mappings, and code pages (user-defined characters in Shift_JIS, really). As one can imagine, the results is chaos, and as I software engineer, I really don’t want to imagine what those poor software engineers have to do to make it “just work” when a message cross the carrier boundaries.
To illustrate my point, let’s look at what Google does when you send a message with some emoji characters from GMail to each of DoCoMo, Softbank, and au.

(more…)