To foster the study of the structure and dynamics of Web traffic networks, we are making available to the research community a large Click Dataset of roughly 13 billion HTTP requests collected at Indiana University. Over about seven months of collection in 2006-2007, our system recorded about 60 million requests per day, or roughly 30 GB/day of raw data. We hope that this data will help researchers better understand user behavior online and build more realistic models of Web traffic. Potential applications include improved designs for networks, sites, and server software; more accurate forecasting of traffic trends; classification of sites based on the patterns of activity they inspire; and improved ranking algorithms for search results.
Via Complexity Digest, Spaceweaver