[Air-l] Question re Size of Data Set...
paul.teusner at rmit.edu.au
Wed Jan 24 15:47:53 PST 2007
I'm a PhD student too and I'm going through exactly the same problem.
For me the question isn't so much the amount of data as how you are
going to use it. How deep will your analysis of the text go, and
how much discussion do you want to generate out of it?
From those two questions I'd perhaps want to ask if you're doing any
discourse analysis in your ethnographic research (and I'm only assuming
you're doing an ethnography of the message board because you mentioned
Hine). If you are, then the analysis could get quite deep, and all those
"chat" type posts will be useful. You might want to check out Ron Scollon
and Suzie Wong Scollon's book, Nexus Analysis, to help you out there.
From the looks of it, in my small postgrad opinion, you could do a lot with
the sample you have. I would just go with what you have.
fishers, surfers and casters
From: air-l-bounces at listserv.aoir.org
[mailto:air-l-bounces at listserv.aoir.org] On Behalf Of Matthew Pearson
Sent: Thursday, 25 January 2007 10:32
To: air-l at listserv.aoir.org
Subject: [Air-l] Question re Size of Data Set...
This is my debut post to this excellent list after a long time spent
reading lots of good stuff from others.
I have a question re the size of my data set for my dissertation
project: Is my data set too large, too small, or just right?
I'd very much appreciate any insight/ideas/feedback anyone has about
this. My advisor and I aren't that sure about this issue, and I
haven't been able to discern much about this issue from a lot of the
studies I've read.
I do realize my question is thus far meaningless without knowing
anything about my project, so here's some more information/background:
I'm doing a close look at one message board community--one devoted to
discussion of a particular college basketball team. I've got all
sorts of things I'm interested in, but my central research question
has to do with the ways that people teach each other and learn from
one another the conventions for discourse in/on the message board.
(I'm also interested in potential emerging genres of writing, the
influence of sports fandom on online literacy practices, and perhaps
even examining issues related to gender (which I realize is a pretty
general thing to say, but I'll keep it at that for now).)
I've got two main sources of data: (1) archived threads/posts
from the message board, and (2) online questionnaires that
participants/members filled out. My question concerns source (1)--
the archival data.
I have tons of data archived. I used one of those "site-sucker"
programs to grab all the discussions on the message board over about
an 8-month period of time. Given that this message board is a pretty
busy one and that I'm using a grounded-theory approach to the data
analysis, I chose to sample a smaller set of the overall data. I
used an "event sampling" method and, with input from posters on the
message board, chose 5 "big" events around which to sample
discussion. I then also chose 5 other events that occurred during
the months I archived discussion that were not listed as "big" events
by anyone who offered their sense of the "big" events. I didn't,
though, choose just those threads of discussion related to those
"big" and non-big events, but rather used those as anchoring moments
in time, and then sampled ALL discussion that occurred on those
dates, and one day prior and one day later. This resulted in such a
large data set that I ended up using only 3 "big" events and 3 non-big
ones, and then sampling for those dates, and the days immediately
before and after.
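For anyone who wants to reproduce the date-window step, here is a minimal Python sketch of it. The anchor dates, the function names (`sampling_dates`, `posts_in_sample`), and the `(posted_on, text)` shape of a post are all made up for illustration; none of them come from the actual project or from the archiving tool.

```python
from datetime import date, timedelta


def sampling_dates(event_dates, days_before=1, days_after=1):
    """Return every date within the window around each anchor event, sorted."""
    dates = set()
    for event in event_dates:
        for offset in range(-days_before, days_after + 1):
            dates.add(event + timedelta(days=offset))
    return sorted(dates)


def posts_in_sample(posts, event_dates):
    """Keep only the (posted_on, text) pairs whose date falls in the sample."""
    wanted = set(sampling_dates(event_dates))
    return [post for post in posts if post[0] in wanted]


# Hypothetical anchor dates: 3 "big" events and 3 non-big ones.
big_events = [date(2006, 3, 4), date(2006, 3, 17), date(2006, 4, 1)]
other_events = [date(2006, 2, 8), date(2006, 5, 20), date(2006, 6, 14)]

sample_days = sampling_dates(big_events + other_events)
print(len(sample_days))  # 6 events x 3 days each = 18 when no windows overlap
```

Because the dates are collected in a set, two events that fall close together share their overlapping days rather than counting them twice, so the total can come out below three days per event.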
What I'm left with now is about 4000 individual .html pages, some of
which have fairly detailed threads of discussion, with sizable
individual posts, and also, of course, many that have
cursory, short sentences that perhaps look more like "chat." This is
a lot of stuff to wade through, yet it does represent only 18 days of
life on this message board. Thus far I've been going through the
data in separate "passes," looking for answers to particular aspects
of my research question, and it's a daunting thing. I know research
takes a lot of work and time, but I thought it wise to get feedback
to see if I'm going overboard here.
So does my sample sound reasonable? I'm well aware that the way I
sample will directly impact the kinds of conclusions I can draw and
the level of rigor folks see in my work.
Any thoughts? Good sources re this kind of methodology? I've got
Virtual Methods Ed. by Hine, among other sources, and haven't seen
anything yet re sample size. Maybe I missed it somehow?
mdpearson at wisc.edu
PhD Candidate, University of Wisconsin Department of English--
Composition and Rhetoric;
Research Assistant, UC-Irvine Writing Project;
& Man on the Street
The air-l at listserv.aoir.org mailing list
is provided by the Association of Internet Researchers http://aoir.org