[Air-L] on the Wayback Machine (was public/private [part 1 of 2])

Vidar Falkenberg vidar at imv.au.dk
Tue Aug 14 00:50:37 PDT 2007


In Denmark, the archiving and preservation of the Danish portion of the  
internet is done by The State and University Library and the Royal Library.

From www.netarchive.dk:

"A new legal deposit law came into force on the 1st of July 2005. As a  
result, The State and University Library and the Royal Library are now  
collecting and preserving the Danish portion of the internet.
In practical terms, this means that the two institutions collect  
internet-material using so-called "harvesters" (web crawlers).

The archive is not publicly accessible and initially can only be used for  
research purposes and with prior permission from the Danish Data  
Protection Agency.
A set of guidelines regarding the new law has been developed. It outlines  
the implications of the new law for the national libraries and for  
websites and content-providers."


The guidelines are in Danish only, but the law itself states that setting  
up a password might not be sufficient to avoid harvesting:

"§ 10. The person under the legal deposit obligation must upon demand  
inform the legal deposit institution about access codes and provide other  
information etc. necessary for gaining access to the material, produce  
copies of the material and make the material available to the general  
public."

http://www.bs.dk/content.aspx?itemguid=%7B332484E6-A5B1-4CEE-B953-059843182050%7D

The archive is meant to collect everything made publicly available.  
Examples of material not considered public are closed intranets with a  
limited set of users like company employees or research groups, or emails  
and the like aimed at a limited group of persons. The decisive factor  
between public and private seems to be whether a password is obtainable  
for all (= public) or by invitation only (= private).

Opt-out is not possible, and robots.txt is ignored.

Vidar Falkenberg
Ph.D. student
Institute of Information and Media Studies
University of Aarhus
Denmark


On 14.08.2007 at 00:50:10, <air-l at listserv.aoir.org> wrote:

> Do remember that The Wayback Machine is not the only archive...possibly
> just the most well known at least to the online research set.  Several
> libraries are archiving blogs they view as significant, The National
> Library of Australia comes to mind here.
>
> To my knowledge, selection for archiving there is opt-out...not that I
> know of anyone who has done so...nor do I know what the library would
> do if someone said no.  Please more info here from anyone in the
> know.  I do know that those who have been selected have been informed
> of the selection...not asked if it was ok for their work to be included.
>
> Lois Ann Scheidt
>
> Doctoral Student - School of Library and Information Science, Indiana
> University, Bloomington IN USA
>
> Adjunct Instructor - School of Informatics, IUPUI, Indianapolis IN USA  
> and
> IUPUC, Columbus IN USA
>
> Webpage:  http://www.loisscheidt.com
> Blog:  http://www.professional-lurker.com
>


