[Air-L] Information wants to be ASCII or Unicode? Tibetan-written information cannot be ASCII anyway.

Mike Stanger mstanger at sfu.ca
Thu Jul 16 12:14:02 PDT 2009


> Just as a bit of evidence of how difficult it can be to grok  
> character issues: Unicode is not "an encoding" itself, but a  
> repertoire of characters, their names, and (abstract) code points  
> (i.e., UCS), plus a set of encodings (i.e., UTF-8, UTF-16), extra  
> properties, and algorithms. And I'm sure a Unicode geek could pick  
> some holes in what I've said!

True enough :-)  Part of the problem in discussing Unicode (and other
things) is that one can speak about it at a 'standards' level or at an
'in practice' level, meaning whatever level of practice a person
actually encounters Unicode at.  By 'encoding' I didn't mean to imply
that it is like dealing with a codepage equivalent, but that using
Unicode carries assumptions that may not be visible to the people
using it.
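
As a rough illustration of the code point versus encoding distinction
in the quoted text, here is a minimal sketch, assuming Python 3. The
choice of TIBETAN LETTER KA (U+0F40) and the variable names are mine,
picked only for the example; the point is that one abstract code point
maps to different byte sequences under different encodings, and to
none at all under ASCII.

```python
import unicodedata

# One abstract character from the Unicode repertoire.
# U+0F40 is an illustrative choice, not something from the thread.
ch = "\u0f40"

code_point = ord(ch)            # the abstract code point (an integer)
name = unicodedata.name(ch)     # the character's standard name

# The same code point serialized under two different encodings.
utf8_bytes = ch.encode("utf-8")
utf16_bytes = ch.encode("utf-16-be")

# ASCII has no representation for this character at all.
try:
    ch.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

print(f"U+{code_point:04X} {name}")
print("UTF-8 bytes:    ", utf8_bytes.hex())
print("UTF-16BE bytes: ", utf16_bytes.hex())
print("fits in ASCII:  ", ascii_ok)
```

A program that says it "uses Unicode" still has to pick one of these
byte forms at every file or network boundary, which is one of the
assumptions that tends to stay invisible.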

I'm thinking that when a programmer, say in an open source project,
states that the project uses Unicode in order to be 'politically
friendly' and interoperable, the statement does more than declare an
intent: it also encourages people to help guide the programmer(s) in
actually achieving that goal, with those who have a deeper
understanding of the issues informing those who are after the
practical goal of interoperability.

Mike


