[Air-l] personal homepages

Frank Schaap architext at fragment.nl
Thu Nov 27 03:53:09 PST 2003

Rowin A Cross wrote:
> By content, do you mean subject matter rather than linguistic factors - 
> as discussed briefly at 
> http://www.nature.com/nsu/nsu_pf/030714/030714-13.html?  It would be 
> interesting to compare 'apparent' gender as implied by content with 
> linguistic gender (for want of a better expression), given the 
> discussions earlier on this list about hiding or creating identities 
> online.

This research has been implemented in the online Gender Genie, at:


You can feed it text and it will tell you the gender of the author with
about 70% accuracy. I fed it three of my own texts and it misassigned one,
while correctly guessing my gender from the other two.

However, it only does this with English texts. I reckon the general gist of
the algorithm could be translated, but you would probably need to do a lot of
work to actually make it work for other languages, as language is such an
intricate instrument. I'm not an expert on this algorithm, but I guess it's
out for the moment for analyzing Dutch websites.

So, I actually relied on my own cultural and linguistic understandings of
what makes Dutch masculinity and femininity, "reading" both subject matter
and linguistic patterns, although it must be said, I did so very loosely and


PS this reply is given 189 feminine and 191 masculine points by the Gender 
Genie, barely giving me a pass on my gender ;)
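
PPS for anyone curious how a point tally like that could come about, here is
a minimal sketch. It assumes the tool simply adds up weighted keywords for
each gender and picks the higher total. The word lists and weights below are
made up for illustration and are NOT the Gender Genie's real values; the
function names are hypothetical too.

```python
import re

# Hypothetical keyword weights, for illustration only.
# These are NOT the values used by the Gender Genie.
FEMININE_WEIGHTS = {"with": 5, "if": 4, "not": 3, "she": 6}
MASCULINE_WEIGHTS = {"the": 2, "a": 1, "around": 4, "it": 1}


def score_text(text):
    """Return (feminine_points, masculine_points) for an English text."""
    # Lowercase the text and split it into simple word tokens.
    words = re.findall(r"[a-z']+", text.lower())
    feminine = sum(FEMININE_WEIGHTS.get(word, 0) for word in words)
    masculine = sum(MASCULINE_WEIGHTS.get(word, 0) for word in words)
    return feminine, masculine


def guess_gender(text):
    """Guess the author's gender from whichever point total is higher."""
    feminine, masculine = score_text(text)
    if feminine > masculine:
        return "female"
    if masculine > feminine:
        return "male"
    return "undecided"


if __name__ == "__main__":
    sample = "She is not with the dog"
    print(score_text(sample), guess_gender(sample))
```

A near tie like my 189 versus 191 shows how little such a tally can say about
a single text, and the English-only word lists show why the approach does not
carry over to Dutch without building new lists from scratch.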
