Archive Site

In web archiving, an archive site is a website that stores information on webpages from the past for anyone to view.

Common techniques
Two common techniques for archiving web sites are using a web crawler or soliciting user submissions:

  1. Using a web crawler: With a web crawler (e.g., the Internet Archive), the service does not depend on an active community for its content and can therefore build a larger database faster. However, web crawlers can only index and archive information that the public has chosen to post to the Internet, and only what is available to be crawled, since web site developers and system administrators can block web crawlers from accessing certain web pages (using a robots.txt file).
  2. User submissions: While user-submission services can be difficult to start because submission rates may initially be low, this approach can yield some of the best results. A crawler obtains only the information the public has chosen to post online, and potential content providers may not bother to post certain information because they assume no one would be interested in it, because they lack a proper venue in which to post it, or because of copyright concerns. Users who see that someone wants their information may be more apt to submit it.
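
As a rough illustration of how a robots.txt file can keep a crawler away from certain pages, the sketch below uses Python's standard urllib.robotparser module. The robots.txt content, the user-agent name "example-archiver", the example.org URLs, and the may_archive helper are all hypothetical and chosen only for illustration; a real archiving crawler would fetch the file from the site and would likely apply further policies of its own.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; not taken from any real site.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def may_archive(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules let user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    candidates = (
        "https://example.org/index.html",
        "https://example.org/private/notes.html",
    )
    for url in candidates:
        allowed = may_archive(SAMPLE_ROBOTS_TXT, "example-archiver", url)
        print(url, "->", "archive" if allowed else "skip")
```

Under these assumed rules, the page under /private/ would be skipped while the index page could be archived, which is the kind of gap in crawler-built archives described above.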


Google Groups
On February 12, 2001, Google acquired the Usenet discussion group archives from Deja.com and turned them into its Google Groups service. The service lets users search old discussions with Google's search technology, while still allowing them to post to the discussion groups.

Internet Archive
The Internet Archive is building a compendium of websites and digital media. Since 1996, the Archive has employed a web crawler to build up its database. It is one of the best-known archive sites.

NBCUniversal Archives
NBCUniversal Archives offers access to exclusive content from NBCUniversal and its subsidiaries. The NBCUniversal Archives website provides easy viewing of past and recent news clips, and it is a prime example of a news archive.

Nextpoint
Nextpoint offers an automated, cloud-based SaaS for marketing, compliance, and litigation-related needs, including electronic discovery.

PANDORA
PANDORA, founded in 1996 by the National Library of Australia, stands for Preserving and Accessing Networked Documentary Resources of Australia, which encapsulates its mission. It provides a long-term catalog of select online publications and web sites that are authored by Australians or that cover an Australian topic. The catalog is built with PANDAS (PANDORA Digital Archiving System).

textfiles.com
textfiles.com is a large library of old text files maintained by Jason Scott Sadofsky. Its mission is to archive the old documents that had floated around the bulletin board systems (BBS) of his youth and to document other people's experiences on the bulletin board systems.
