[Air-L] on the Wayback Machine (was public/private [part 1 of 2])

Mon Aug 13 12:07:42 PDT 2007

I'm thinking more about less formal expressions than newspapers, etc. -
things people had little intent to have archived for posterity, e.g.,
the occasional posting to a Usenet board (I used Usenet in the early
90s having no clue that my posts would be archived, let alone indexed
and searchable by my name over a decade later)...

Is there a pre-Internet analog to having such informal utterances  
automatically - and often unknowingly - archived by 3rd parties who  
were not part of (nor mediated) the exchange?  I'm hoping someone can  
point to work critically exploring the ethics of the Internet Archive  
itself.

-mz

On Aug 13, 2007, at 12:29 PM, Jeremy Hunsinger wrote:

>
> On Aug 13, 2007, at 11:06 AM, Michael Zimmer wrote:
>
>> This has been an interesting discussion, and mention of IA's Wayback
>> Machine prompts interesting questions which I'm sure others on the
>> list can help answer:
>>
>> (a) Are there other media forms (current or historical) where
>> publishing content means that it is automatically scanned and
>> archived by external aggregators (search spiders, Internet Archive,
>> etc)? [If I posted a note on "The Wall" at Yale Law School, no one
>> routinely takes a snapshot of the wall to keep a permanent record of
>> it, right?]
>
> Journals, newspapers, and magazines come to mind as archived externally
> and internally.
>>
>> (b) If examples for (a) exist, are typical publishers of said content
>> aware that their works are being aggregated and archived in such a
>> way?
>
> yes, and they try to get as much profit out of the arrangement as
> they can, I think, but alas... it isn't always such an arrangement.
>
>> Would a new user know this? Are they notified?
>
> In the case of newspapers, there were a few court cases a few years
> ago dealing with the NYT archiving and distributing itself online,
> but I don't recall anyone complaining about third-party distribution
> such as through FirstSearch or similar tools.  I think that there is
> now a standard contract in place for much of this in the publishing
> industry.
>
>> [My concern here
>> is that while many realize that search engines might crawl their
>> content, few realize they keep a cached copy, and even fewer realize
>> that even deleted content is archived by Wayback Machine]
>
>>
>> (c) Also, if examples of (a) exist, what means are provided to
>> prevent such automatic archiving? Is it opt-in or opt-out? How
>> technically proficient must one be? [Concern here is that even if you
>> know about Internet Archive, you have to be proficient with
>> robots.txt standards in order to keep them out]
>
> dunno, most organizations seem to want to participate, but only under
> the best terms they can get
>>
>> (d) Given (a), how can someone remove past items from such archives?
>> [Wayback Machine will remove all domain-specific content already in
>> its archive if you place a robots.txt file to block it going forward]
>>
>> I guess what I'm wondering is why there seems to be a presumption
>> that just because I posted something on a website in 1999 I want it
>> to always be accessible. Just because bits don't degrade like paper
>> doesn't mean they -must- persist, does it?
>
> no, but shouldn't we preserve as much as we can?   I appreciate the
> will to destroy, that's fine.  But for the people who do not care,
> the content that they have contributed constitutes evidence of many
> things.
>>
>> Keep up the good discussion,
>> michael
>>
>
> _______________________________________________
> The Air-L at listserv.aoir.org mailing list
> is provided by the Association of Internet Researchers http://aoir.org
> Subscribe, change options or unsubscribe at:
> http://listserv.aoir.org/listinfo.cgi/air-l-aoir.org
>
> Join the Association of Internet Researchers:
> http://www.aoir.org/