[Air-L] Information wants to be ASCII or Unicode? Tibetan-written information cannot be ASCII anyway.

Tue Jul 14 23:37:02 PDT 2009

At the risk of sounding like an apologist for a particular linguistic
centrism, English or otherwise: from a programmer's standpoint there
are challenges beyond simply choosing between Unicode and some
language-specific codepage.

Just using Unicode doesn't guarantee that the application viewing the
content will have the appropriate fonts, for one, even if the proper
Unicode character sequences are sent (much as marking your pages as
GB2312 doesn't automatically give the end-user's machine the ability to
display the content). So it's questionable whether end-user usability
will actually improve just by using Unicode. I would expect that at
some levels it makes interoperability harder to guarantee, when the
incoming stream is arbitrary content in an arbitrary language or set
of languages.
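
To make the font point concrete, here is a minimal Python sketch of how
a program might at least find out which scripts a decoded Unicode
string draws on, so it can guess which fonts the viewer would need.
The function name rough_scripts is hypothetical, and taking the first
word of each character's Unicode name is only a rough stand-in for the
real Unicode Script property, which the standard library doesn't expose.

```python
import unicodedata
from collections import Counter

def rough_scripts(text):
    """Count characters per rough script label.

    Assumption: the first word of a letter's Unicode name (LATIN,
    TIBETAN, CJK, ...) is used as an approximate script label.
    Non-letters (symbols, digits, punctuation) are lumped under OTHER.
    """
    counts = Counter()
    for ch in text:
        if ch.isspace():
            continue  # whitespace needs no glyph coverage
        name = unicodedata.name(ch, "")
        is_letter = unicodedata.category(ch).startswith("L")
        if is_letter and name:
            counts[name.split(" ")[0]] += 1
        else:
            counts["OTHER"] += 1
    return counts

# One Latin letter, one Tibetan letter, one CJK ideograph, the snowman.
sample = "a\u0f40\u4e2d\u2603"
print(rough_scripts(sample))
```

Nothing here says whether the end-user's machine can actually render
those scripts; it only shows that the answer has to be worked out from
the content itself, since "it's Unicode" doesn't tell you.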

When I'm coding, I'm actually much more comfortable knowing that I
have a specific codepage to address rather than just knowing that I'll
have a Unicode stream, because then I know exactly what my
application should support. Unicode really tells me nothing other than
that the content could be any known character, including the famous
"snowman" symbol  :-)  If I'm trying to mash up a site and my code
sees that it's in GB2312, I can take appropriate steps to support it
or report back that the feed is incompatible.  If I get a Unicode
source, I have to be constantly aware that the feed might at some point
have requirements that I haven't yet addressed.
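
As a rough illustration of the "support it or report back" approach,
here is a minimal Python sketch. The names SUPPORTED and read_feed are
hypothetical, and the allow-list is just an assumption about what a
given mash-up has been written to handle; only the standard codecs
module is used.

```python
import codecs

def canonical(charset):
    """Return Python's canonical codec name, or None if it is unknown."""
    try:
        return codecs.lookup(charset).name
    except LookupError:
        return None

# Hypothetical allow-list: the encodings this application claims to handle.
SUPPORTED = {canonical(n) for n in ("gb2312", "utf-8", "ascii")}

def read_feed(raw, declared_charset):
    """Decode a feed if its declared charset is supported.

    Returns ("ok", text) on success, or ("incompatible", None) when the
    charset is unknown, unsupported, or the bytes don't match it.
    """
    name = canonical(declared_charset)
    if name is None or name not in SUPPORTED:
        return ("incompatible", None)
    try:
        return ("ok", raw.decode(name))
    except UnicodeDecodeError:
        return ("incompatible", None)

status, text = read_feed("\u4e2d\u6587".encode("gb2312"), "GB2312")
print(status)
```

With a declared codepage the check is a single lookup. With a Unicode
source the decode step succeeds for nearly anything, so the question of
what the application really supports moves further downstream.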

Rather than restricting the phrase to linguistic elements and
suggesting that "Unicode" is a superior term to "ASCII" in this case,
I'd broaden it out and say "Information wants to be Digital". I think
that's closer to the heart of the matter, though the term ASCII
conveys more meaning about language, etc., and likely helps make the
implication of the argument more direct.

YMMV - There are of course libraries of routines to address such
issues in code, but I think that actually points to the fact that
sometimes Unicode is not as simple and direct an answer to a problem
as people might expect it to be.

Mike

On 14-Jul-09, at 12:21 AM, Han-Teng Liao (OII) wrote:

> Dear all,
>
>  Running the risk of trolling and misrepresenting the famous motto  
> "Information wants to be ASCII", I want to raise the question of the  
> difference between "Information wants to be ASCII" versus  
> "Information wants to be Unicode" from a multilingual perspective.
>
>  It should be pointed out that when Lev Manovich declared
> "Information wants to be ASCII" in talking about remix and the
> remixability of information, it was 2005, when Unicode was still in
> its early adoption period globally.  So I do not intend to raise the
> question as a lazy criticism of the America-centric implication of
> ASCII, but rather to raise the question of remix and remixability
> across linguistic boundaries.
>   Why is Unicode not universally deployed yet?  How can we measure
> remixability across linguistic boundaries when the information is
> not encoded in Unicode?  Why are so many user-generated content
> websites in China using only their simplified-Chinese-only "national
> standard" (GB2312), even when Hong Kong (which uses traditional
> Chinese, not included in GB2312) is part of China and Beijing claims
> Taiwan is part of China?  What about Tibetan-written information:
> does it want to be Unicode or GB 18030-2000?  Tibetan-written
> information cannot be ASCII anyway.
>
>      I would really like to hear from you.
>
> Best regards,
>
> -- 
> Han-Teng Liao
> PhD Candidate
> Oxford Internet Institute
> http://www.oii.ox.ac.uk/people/students.cfm?id=123