History & Motivation
Web Archeology project at SRC
- Large-scale mining of web content
- Empirical studies and measurements of the Web
- Improving Web clients
Our work often involves processing large numbers of pages retrieved from the Web
- Would like a rapid prototyping environment for doing so
- Need a tool that automates Web computations
- To simplify tasks, the tool must have some understanding of the Web’s protocols and data structures