[Air-L] [Correction] RE: Sampling strategies for classification tasks

Wed Apr 29 02:17:31 PDT 2020

Sorry to spam,

To clarify: I couldn't find literature on sampling strategies for generating training data for classification tasks. I wasn't looking for a training dataset. Any literature recommendations on possible sampling strategies would be much appreciated.

Best,
Sina

>-----Original Message-----
>From: Air-L <air-l-bounces at listserv.aoir.org> On Behalf Of Sina Furkan Özdemir
>Sent: Wednesday, April 29, 2020 11:08 AM
>To: air-l at listserv.aoir.org
>Subject: [Air-L] Sampling strategies for classification tasks
>
>Dear all,
>
>I have been following some 800 Twitter accounts for my Ph.D. dissertation
>over the last four months. I have ended up with 400,000 tweets that I need to
>classify into four mutually exclusive categories.
>
>I looked up some previous works with similar tasks, and it seems that the best
>way is to use a combination of word embeddings and recurrent neural
>networks with LSTM structure.
>
>The problem I am having right now is that I couldn't find training data for the
>classification. Can anyone recommend some literature on sampling
>strategies for short-text classification tasks?
>
>Best,
>Sina Özdemir
>Ph.D. Candidate
>NTNU, Trondheim
>M.A. Comparative and International Studies, ETH Zurich & University of Zurich, Switzerland
>B.A. Political Science and International Relations, Middle East Technical University, Turkey
>
>_______________________________________________
>The Air-L at listserv.aoir.org mailing list is provided by the Association of
>Internet Researchers http://aoir.org Subscribe, change options or unsubscribe
>at: http://listserv.aoir.org/listinfo.cgi/air-l-aoir.org
>
>Join the Association of Internet Researchers:
>http://www.aoir.org/