[Air-L] on the Wayback Machine (was public/private [part 1 of 2])

Lois Ann Scheidt lscheidt at indiana.edu
Mon Aug 13 15:49:48 PDT 2007


Do remember that the Wayback Machine is not the only archive...possibly
just the best known, at least to the online research set.  Several
libraries are archiving blogs they view as significant; the National
Library of Australia comes to mind here.

To my knowledge, selection for archiving there is opt-out...not that I
know of anyone who has done so...nor do I know what the library would
do if someone said no.  More info here, please, from anyone in the
know.  I do know that those who have been selected have been informed
of the selection...not asked if it was ok for their work to be included.

Lois Ann Scheidt

Doctoral Student - School of Library and Information Science, Indiana
University, Bloomington IN USA

Adjunct Instructor - School of Informatics, IUPUI, Indianapolis IN USA and
IUPUC, Columbus IN USA

Webpage:  http://www.loisscheidt.com
Blog:  http://www.professional-lurker.com


Quoting Michael Zimmer <michael.zimmer at nyu.edu>:

> I'm thinking more about less formal expressions than newspapers, etc.
> - things people had little intent to have archived for posterity, e.g.,
> the occasional posting to a usenet board (I used usenet in the early
> 90s having no clue that they would be archived, let alone indexed and
> searchable by my name over a decade later)...
>
> Is there a pre-Internet analog to having such informal utterances
> automatically - and often unknowingly - archived by 3rd parties who
> were not part of (nor mediated) the exchange?  I'm hoping someone can
> point to work critically exploring the ethics of the Internet Archive
> itself.
>
> -mz
>
> On Aug 13, 2007, at 12:29 PM, Jeremy Hunsinger wrote:
>
>>
>> On Aug 13, 2007, at 11:06 AM, Michael Zimmer wrote:
>>
>>> This has been an interesting discussion, and mention of IA's Wayback
>>> Machine prompts interesting questions which I'm sure others on the
>>> list can help answer:
>>>
>>> (a) Are there other media forms (current or historical) where
>>> publishing content means that it is automatically scanned and
>>> archived by external aggregators (search spiders, Internet Archive,
>>> etc)? [If I posted a note on "The Wall" at Yale Law School, no one
>>> routinely takes a snapshot of the wall to keep a permanent record of
>>> it, right?]
>>
>> Journals, newspapers, magazines come to mind as archived externally
>> and internally.
>>>
>>> (b) If examples for (a) exist, are typical publishers of said content
>>> aware that their works are being aggregated and archived in such a
>>> way?
>>
>> yes, and they try to get as much profit out of the arrangement as
>> they can, I think, but alas... it isn't always such an arrangement.
>>
>>> Would a new user know this? Are they notified?
>>
>> In the case of Newspapers, there were a few court cases a few years
>> ago dealing with the NYT archiving and distributing itself online,
>> but I don't recall anyone complaining about third-party distribution
>> such as through firstsearch or similar tools.  I think that there is
>> now a standard contract in place for much of this in the publishing
>> industry.
>>
>>> [My concern here
>>> is that while many realize that search engines might crawl their
>>> content, few realize they keep a cached copy, and even fewer realize
>>> that even deleted content is archived by Wayback Machine]
>>
>>>
>>> (c) Also, if examples of (a) exist, what means are provided to
>>> prevent such automatic archiving? Is it opt-in or opt-out? How
>>> technically proficient must one be? [Concern here is that even if you
>>> know about Internet Archive, you have to be proficient with
>>> robots.txt standards in order to keep them out]
>>
>> dunno, most organizations seem to want to participate, but only under
>> the best terms they can get
>>>
>>> (d) Given (a), how can someone remove past items from such archives?
>>> [Wayback Machine will remove all domain-specific content already in
>>> its archive if you place a robots.txt file to block it going forward]
>>>
>>> I guess what I'm wondering is why there seems to be a presumption
>>> that just because I posted something on a website in 1999 I want it
>>> to always be accessible. Just because bits don't degrade like paper
>>> doesn't mean they -must- persist, does it?
>>
>> no, but shouldn't we preserve as much as we can?   I appreciate the
>> will to destroy, that's fine.  But for the people who do not care,
>> the content that they have contributed constitutes evidence of many
>> things.
>>>
>>> Keep up the good discussion,
>>> michael
>>>
>>
>> _______________________________________________
>> The Air-L at listserv.aoir.org mailing list
>> is provided by the Association of Internet Researchers http://aoir.org
>> Subscribe, change options or unsubscribe at:
>> http://listserv.aoir.org/listinfo.cgi/air-l-aoir.org
>>
>> Join the Association of Internet Researchers:
>> http://www.aoir.org/
>