Ying-Ting Lin

Arts & Sciences: Economics | MA


Cohort 2014


Graduated 2016


Career: Data Analyst | Taikoo Motors Group | Taipei, Taiwan

Scholar Highlights

Economics in the Age of the Data Revolution

In the past few years, the world has witnessed exponential growth in data volume. Companies have become the main creators of data as people’s daily activities rely more and more on computers, and these computer-mediated activities generate unprecedented amounts of data. By recent estimates, 2.5 quintillion bytes of data are created every day. To put this in perspective, storing a single day of that data would take roughly 78 million 32-gigabyte iPads.
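As a rough check on the scale of that storage comparison, the daily volume can be divided by the capacity of one device. This is only a back-of-the-envelope sketch: it assumes decimal units (1 gigabyte = 10^9 bytes), and the function name is made up for illustration.

```python
# Back-of-the-envelope check of the storage comparison.
# Assumption: decimal units, i.e. 1 GB = 10**9 bytes.
BYTES_PER_DAY = 25 * 10**17          # 2.5 quintillion bytes
DEVICE_CAPACITY_BYTES = 32 * 10**9   # one 32 GB device


def devices_needed(total_bytes: int, capacity_bytes: int) -> int:
    """Return how many devices are needed to hold total_bytes, rounded up."""
    return -(-total_bytes // capacity_bytes)  # ceiling division on integers


devices_per_day = devices_needed(BYTES_PER_DAY, DEVICE_CAPACITY_BYTES)
print(f"{devices_per_day:,} devices per day")
```

Using binary units (1 GiB = 2^30 bytes) instead would lower the count somewhat, but the order of magnitude stays the same.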

In 2013, the information technology consulting company Gartner published a survey on industry investment in big-data technology. Its results showed that more than half of the organizations across all industries had such investment plans, and that a growing number of companies were joining the trend. In this context, the economics profession has been active in exploiting the new data technology.

Until the mid-1980s, the majority of economics papers were theoretical; the remainder relied mainly on government statistics or surveys. With the rise of big data, a number of pioneers in economics have tried to advance academia’s understanding of this rapid change.

Hal Varian was a professor at the University of California, Berkeley, and is now the Chief Economist at Google. He has published many works at the intersection of traditional economic research and information technology. Given his substantial influence in academic economics, his work has shaped many economists’ understanding of the new data and of the tools that can be used to manipulate and analyze them. His standard advice to graduate students these days is “go to the computer science department and take a class in machine learning.”

Other pioneers are Jonathan Levin and Liran Einav, both professors at Stanford University. They observe that computer scientists and statisticians have collaborated extensively in recent years, whereas collaboration between computer scientists and economists is much less common. Levin and Einav argue that the economics profession could reap considerable gains from trade with other disciplines.

Two recent research projects that have taken advantage of new data sets and yielded fruitful results are also worth mentioning. The first is the scraped price data project led by Alberto Cavallo at the Massachusetts Institute of Technology (MIT). His group uses programs to automate the extraction of price data from the Web. In the U.S., the scraped data can closely replicate the government’s consumer price index (CPI) statistics. The advantage is two-fold. First, the data are higher in frequency, available in real time, and more detailed. Second, in countries where the government may be suspected of manipulating public statistics, the data serve as a validation device.

Another interesting research project is by Choi and Varian (2011). They used Google query data (Google Trends) to forecast economic indicators, including unemployment claims, auto sales, consumer confidence, and more. They showed that search engine queries have predictive power in situations where consumers plan their purchases in advance.
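The idea can be sketched in a few lines: compare a baseline forecast that uses only last month's value of an indicator with one that also uses a same-month search index. The sketch below is a simplified illustration, not the authors' procedure. The data are synthetic and generated in the code, and the coefficients, the fixed train/test split, and all function names are assumptions made for illustration.

```python
import random

random.seed(0)

# Synthetic data only: a search-interest index, and a sales series that
# depends on its own previous value and on the same-month search index.
n = 120
search = [random.gauss(0, 1) for _ in range(n)]
sales = [0.0]
for t in range(1, n):
    sales.append(0.5 * sales[t - 1] + 0.8 * search[t] + random.gauss(0, 0.3))


def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    k = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(k):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * pc for a, pc in zip(M[r], M[c])]
    return [M[i][k] / M[i][i] for i in range(k)]


def fit_ols(X, y):
    """Ordinary least squares via the normal equations (X includes a constant)."""
    k = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve(XtX, Xty)


def out_of_sample_mae(use_search, n_train=80):
    """Fit on the first n_train months; return mean absolute error on the rest."""
    rows = []
    for t in range(1, n):
        row = [1.0, sales[t - 1]]          # constant and last month's sales
        if use_search:
            row.append(search[t])          # same-month search index
        rows.append(row)
    y = sales[1:]
    coef = fit_ols(rows[:n_train], y[:n_train])
    errors = [
        abs(yi - sum(c * x for c, x in zip(coef, row)))
        for row, yi in zip(rows[n_train:], y[n_train:])
    ]
    return sum(errors) / len(errors)


mae_baseline = out_of_sample_mae(use_search=False)
mae_with_search = out_of_sample_mae(use_search=True)
print(f"baseline MAE: {mae_baseline:.3f}, with search index: {mae_with_search:.3f}")
```

Because the synthetic sales series is built to depend on the search index, the second model has the smaller error by construction. With real data, whether the search index helps is an empirical question.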

I believe these two examples mark just the beginning of a new chapter of inquiry, in which new private-sector statistics gain momentum in economics research. More and more companies now offer data products that were not available before. For example, ADP, a company that sells human resource management systems, now publishes monthly employment statistics. MasterCard also offers retail sales data products based on credit card transactions. These data are released earlier than those from government or other traditional sources.

One common feature of these sources is that the data are available in real time. In contrast, traditional government data are released with a lag of months or even years and hence are inherently retrospective. In some cases, such data may not be ideal for immediate forecasts, and Internet data may provide considerable value in guiding policy or business actions. As data grow exponentially, the challenge lies in extracting useful information from such a large volume. In the future, I look forward to seeing researchers collaborate across disciplines and produce fruitful research by exploiting the new opportunities of the digital age.