Our work makes three primary contributions. First, this is the largest and most detailed measurement study to date of the file hosting ecosystem, with a focus on five popular hosting services, as observed from a large edge network.
Second, we use detailed HTTP transaction logs that allow us to study how clients identify and select the content they download. For example, we identify signatures of user clickstreams in the transaction logs to separate free and premium user instances. This client behaviour has not been previously characterized, and studying it provides a deeper understanding of how these services are used, as well as of the dynamics of new-age content sharing and distribution.
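As an illustration of how a clickstream signature might separate free from premium download instances, the following minimal sketch labels each download in a transaction log. The concrete signatures are not listed here, so the rule used below is an assumption made purely for illustration: a free download is taken to be preceded by a request for an intermediate wait page from the same client, while a premium download is not. The `Transaction` fields, the `/wait` and `/download/` path markers, and the 300-second window are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class Transaction:
    client: str       # anonymized client identifier (hypothetical field)
    timestamp: float  # seconds since the start of the trace
    url_path: str     # request path from the HTTP transaction log


# Hypothetical markers; real services would need their own signatures.
WAIT_PAGE_MARKER = "/wait"
DOWNLOAD_MARKER = "/download/"
WINDOW_SECONDS = 300.0  # assumed maximum gap between wait page and download


def label_downloads(transactions: Iterable[Transaction]) -> List[Tuple[str, float, str]]:
    """Label each download request as 'free' or 'premium'.

    A download counts as 'free' if the same client requested a wait page
    within WINDOW_SECONDS before it; otherwise it counts as 'premium'.
    """
    last_wait = {}  # client -> timestamp of the most recent wait-page request
    labels = []
    for t in sorted(transactions, key=lambda t: t.timestamp):
        if WAIT_PAGE_MARKER in t.url_path:
            last_wait[t.client] = t.timestamp
        elif DOWNLOAD_MARKER in t.url_path:
            # Consume the wait-page record so it is matched to one download only.
            seen = last_wait.pop(t.client, None)
            is_free = seen is not None and t.timestamp - seen <= WINDOW_SECONDS
            labels.append((t.client, t.timestamp, "free" if is_free else "premium"))
    return labels


sample_log = [
    Transaction("a", 0.0, "/files/123"),
    Transaction("a", 1.0, "/wait?id=123"),
    Transaction("a", 46.0, "/download/123"),
    Transaction("b", 50.0, "/download/456"),
]
labels = label_downloads(sample_log)
```

In this sample, client `a` passes through the assumed wait page before downloading and is labelled free, while client `b` goes straight to the download and is labelled premium. A real classifier would need per-service signatures and would have to handle clients that share an address.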
Third, we compare and contrast these services with each other as well as with P2P file sharing and video sharing services. Our results have implications for caching, network management, content placement, and data centre provisioning, and are likely to be relevant to both network administrators and researchers.
Our study concentrates on the top five file hosting services in the campus network, which together generate over 60% of the file hosting traffic volume: RapidShare, Megaupload, zSHARE, MediaFire, and Hotfile. Table 5 presents some high-level characteristics of the five services. Over 90,000 files were downloaded using the top five services. In comparison, around 150,000 downloads (61 TB of P2P traffic volume) were made using BitTorrent in the campus network. The top five services were used almost every day of the year.
The file hosting ecosystem appears to be flourishing. Hundreds of file hosting services are available, giving users ample choice to select a service of their liking. Our results indicate that there is a significant number of premium users, suggesting that the economic model based on advertisement and subscription revenue is sustainable.
One driver of file hosting service growth is the set of incentive schemes that the services have instituted to attract content publishers. As more content is uploaded, more consumers download it, which in turn increases traffic. These incentive schemes have recently become controversial.
This is the authors’ version of a work that has been accepted for publication at the IFIP Performance 2011 conference, to be held in Amsterdam from 18 October 2011 until 20 October 2011. The final version will appear in a special issue of the Performance Evaluation journal (Elsevier). Changes resulting from the publishing process, such as editing, corrections, structural formatting, and other quality control mechanisms, may not be reflected in this document. Changes may have been made to this work since it was submitted for publication.