Big data is a broad term for data sets so large or complex that traditional data processing applications are inadequate. Challenges include analysis, capture, data curation, search, sharing, storage, transfer, visualization, and information privacy. (Wikipedia)
2006: AWS EC2 (cloud-based computing clusters)
Google article on MapReduce by Dean and Ghemawat, 2004
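Combining systems with algorithms: designing large-scale distributed machine learning algorithms and systems so that machine learning algorithms can run on multi-processor, multi-machine clusters and handle larger volumes of data. Well-known systems in this area include: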
imMens: Real-time Visual Querying of Big Data (Stanford Visualization Group, video on Vimeo)
Bin-Summarize-Smooth: A Framework for Visualizing Large Data (Hadley Wickham)
"Why Exploring Big Data is Hard and What We Can Do About It", Danyel Fisher's talk at OpenVisConf 2015
d. boyd and K. Crawford, "Critical Questions for Big Data", Information, Communication & Society, Volume 15, Issue 5, 2012. http://www.tandfonline.com/doi/abs/10.1080/1369118X.2012.678878
Google Flu Trends: The Limits of Big Data (NYT)
Lazer, David, Ryan Kennedy, Gary King, and Alessandro Vespignani. 2014. “The Parable of Google Flu: Traps in Big Data Analysis.” Science 343 (14 March): 1203-1205.
Halevy, Norvig, Pereira
Viktor Mayer-Schönberger, Big Data: A Revolution That Will Transform How We Live, Work, and Think. Chinese edition: 大数据时代:生活、工作与思维的大变革, translated by Zhou Tao (周涛), Zhejiang People's Publishing House (浙江人民出版社), December 2012, 261 pages.
http://ghostweather.slides.com/lynncherny/what-is-big-data-anyway