[Air-L] Sampling strategies for classification tasks
Sina Furkan Özdemir
sina.ozdemir at ntnu.no
Wed Apr 29 02:08:23 PDT 2020
Dear all,
I have been following some 800 Twitter accounts for my Ph.D. dissertation over the last four months. I have ended up with 400,000 tweets that I need to classify into four mutually exclusive categories.
I looked at some previous work on similar tasks, and it seems that the best approach is to combine word embeddings with a recurrent neural network using an LSTM architecture.
The problem I am having right now is that I have not been able to find training data for the classification. Can anyone recommend some literature on sampling strategies for short-text classification tasks?
Best,
Sina Özdemir
Ph.D. Candidate
NTNU, Trondheim
M.A. Comparative and International Studies
ETH Zurich & University of Zurich, Switzerland
B.A. Political Science and International Relations
Middle East Technical University, Turkey