Large Data Collection and HCI
Why is it that economists have so much shared and common data to work with, while we in HCI do not? There is so much raw data out there for economists about the stock market, GNP, GDP, exchange rates, option prices, oil prices, car crashes, sumo wrestling, and so on. Imagine what HCI could be like if we could have that much rich data.
Just off the top of my head, some data sources that I'd love to be able to use:
- Google search terms
 
- Orkut and Friendster social network connections
 
- Microsoft Windows crash data (you know, those popups that appear after a program crashes, asking if you want to send a report to MSFT. What programs crash most often? What trends are there over time?)
 
- Yahoo IM, AIM, and MSN Messenger usage trends
 
- ISP usage data (how much traffic is file sharing, web, IM, etc.?)
 
- Yahoo web page usage trends (What happened when a change was made? What changes have been most popular? Least popular? Which parts of the navigation do people use most, i.e., nav bar, pictures, text links, etc.? Is there a correlation between web page size and traffic?)
 
- eBay usage trends (What factors lead to the most popular sales? What indicators are there of fraud? How have sales changed over time? Is eBay now dominated by power sellers? What product trends are there?)
Unfortunately, these are all corporate secrets, and also have privacy issues involved. But think how much good we could do for everyone if this kind of data were available.