Econometrics and Data Science

Hua Shi
4 min read · Jul 28, 2020


In 2009, I learned my first data analysis tool, EViews, which is mainly used for time-series econometric analysis. I published my first academic paper in a well-known magazine using econometric methods and EViews, and that gave me the motivation to learn more about data science.

When I was learning data science and machine learning algorithms, I realized that econometrics is extremely powerful and useful for data scientists. I am glad to have a solid foundation in advanced econometrics, which gives me a better understanding of data science algorithms and statistical analysis.

Of course, data scientists work in many different fields, but if you are a big fan of machine learning or statistical analysis, you may want a strong foundation in econometrics so that you can better interpret results and causality.

Why is econometrics important for data scientists?

Machine learning and econometrics share many topics, such as linear regression, logistic regression, ARIMA and VAR models for time series, panel data, null hypothesis testing, maximum likelihood, and the central limit theorem.

For example, linear regression is a basic model in both econometrics and machine learning. Although the model is a classic, it is still widely used in many fields. As we know, OLS (Ordinary Least Squares) chooses the intercept and coefficients that minimize the sum of squared residuals (RSS; some texts write ESS, for error sum of squares). It does this by taking the first derivatives of that sum with respect to the intercept and the coefficients and setting them to zero. When I learned linear regression with Python and sklearn, I already had the whole picture of the OLS process and all of its assumptions in mind. I could understand the mathematics behind the machine learning algorithm and interpret the results with confidence. That is how powerful econometrics is.

Image from the CodeEmporium video "Linear Regression"
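To make the OLS minimization concrete, here is a minimal sketch using only NumPy. The data are synthetic and invented for illustration (a true intercept of 2 and slope of 3 are assumptions of the example, not results from this article). It solves the first-order conditions directly and checks that the derivative of the RSS is zero at the solution.

```python
import numpy as np

# Synthetic data, made up for illustration: y = 2 + 3x + noise
rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)
y = 2.0 + 3.0 * x + rng.normal(scale=0.5, size=n)

# Design matrix with a column of ones for the intercept
X = np.column_stack([np.ones(n), x])

# Setting the derivative of RSS(b) = (y - Xb)'(y - Xb) to zero gives
# the normal equations X'X b = X'y, which are solved directly here.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

residuals = y - X @ beta_hat
rss = float(residuals @ residuals)

# Gradient of the RSS at the solution; it should be numerically zero
gradient = -2.0 * X.T @ residuals

# Cross-check against NumPy's built-in least-squares solver
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

print("intercept, slope:", beta_hat)
print("RSS:", rss)
```

Fitting the same data with sklearn's LinearRegression should give matching coefficients, since it solves the same least-squares problem.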

With an econometrics background, you have a stronger understanding of causal relationships, which allows you to think beyond the numbers and extract actionable insights. When an econometrics or data science problem is presented, you usually have several different approaches in mind.

As the picture below shows, the data science lifecycle contains roughly seven parts, from business understanding to data visualization. Similarly, econometric models are used routinely for tasks ranging from data collection and data cleaning to data analysis and, ultimately, interpreting the model results to help decision makers. An econometrics background is therefore very helpful to anyone who wants to become a data scientist.

What is the difference between Data Science and Econometrics?

Econometrics is used constantly in business, finance, economics, government, policy organizations, and many other fields. It analyzes the relationships between variables, with more emphasis on prediction and causal relations, and the results from its models are interpretable.
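As a rough illustration of what "interpretable" means here, the sketch below computes classical OLS standard errors and t-statistics with NumPy. The variable names (ad_spend, unrelated, sales) and the data are hypothetical, and the formulas assume homoskedastic errors.

```python
import numpy as np

# Hypothetical data: sales depend on ad_spend but not on `unrelated`
rng = np.random.default_rng(42)
n = 500
ad_spend = rng.normal(size=n)
unrelated = rng.normal(size=n)
sales = 1.0 + 0.8 * ad_spend + rng.normal(size=n)

X = np.column_stack([np.ones(n), ad_spend, unrelated])
k = X.shape[1]

# OLS coefficients
XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ sales
resid = sales - X @ beta_hat

# Classical (homoskedastic) variance estimate and standard errors
sigma2 = resid @ resid / (n - k)
std_err = np.sqrt(np.diag(sigma2 * XtX_inv))
t_stats = beta_hat / std_err

for name, b, se, t in zip(["const", "ad_spend", "unrelated"],
                          beta_hat, std_err, t_stats):
    print(f"{name:>10}: coef={b: .3f}  se={se:.3f}  t={t: .2f}")
```

Each coefficient reads as the expected change in the outcome for a one-unit change in that regressor, holding the others fixed, and the t-statistic indicates how precisely it is estimated.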

There are several econometrics software tools, such as EViews, R, and Stata. EViews and Stata offer advanced environments for time series and panel data, respectively. R is very popular for statistical and graphical data analysis. If you are interested in econometrics, here is the link to relevant materials, or you can read Fumio Hayashi's Econometrics (my favorite econometrics book).

On the other hand, data science is an emerging branch of statistics. It focuses more on developing optimal algorithms and achieving higher accuracy through parameter tuning and cross-validation. Sometimes the results of its models are very difficult to interpret. Python is the main language used in data science, and it has very useful and powerful libraries and built-in functions. I also recommend the book “Hands-On Machine Learning with Scikit-Learn & TensorFlow”.
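The idea of tuning a parameter through cross-validation can be sketched in a few lines. The example below is an assumption-laden illustration, not a recipe from this article: it uses ridge regression as the model, synthetic data, a hand-picked grid of penalties, and helper functions (ridge_fit, cv_mse) written just for this sketch. In practice, sklearn provides ready-made tools for the same job.

```python
import numpy as np

def ridge_fit(X, y, alpha):
    """Closed-form ridge coefficients (no intercept term in this sketch)."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(p), X.T @ y)

def cv_mse(X, y, alpha, k=5, seed=0):
    """Average out-of-fold mean squared error over k folds."""
    idx = np.random.default_rng(seed).permutation(len(y))
    folds = np.array_split(idx, k)
    errors = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        beta = ridge_fit(X[train], y[train], alpha)
        errors.append(np.mean((y[test] - X[test] @ beta) ** 2))
    return float(np.mean(errors))

# Synthetic data, invented for illustration: only 3 of 10 features matter
rng = np.random.default_rng(1)
X = rng.normal(size=(120, 10))
true_beta = np.zeros(10)
true_beta[:3] = [1.5, -2.0, 0.5]
y = X @ true_beta + rng.normal(size=120)

# Try each candidate penalty and keep the one with the lowest CV error
alphas = [0.01, 0.1, 1.0, 10.0, 100.0]
scores = {a: cv_mse(X, y, a) for a in alphas}
best_alpha = min(scores, key=scores.get)

print("CV error by alpha:", scores)
print("selected alpha:", best_alpha)
```

The selected penalty is chosen purely for out-of-sample accuracy, which is the emphasis described above. The resulting coefficients are shrunk toward zero, so they are harder to read as plain OLS effects.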

Conclusion

Econometrics is central to the work of a wide variety of governments, policy organizations, central banks, financial services firms, and economic consulting firms. In addition, most for-profit companies use econometrics for strategic planning tasks such as investment, pricing, advertising, and revenue budgeting. Data scientists with an econometrics background can have a strong grasp of the intuition behind machine learning models.

Written by Hua Shi

Data Engineer / Data Analyst / Machine Learning / MS in Economics
