Software and Packages for Empirical Research: Statistical Tests, Econometrics, and Machine Learning

About

In this instruction document, we introduce 11 tools that are useful for economics: Neural Network Playground, TensorFlow, PyTorch, ELI5, SciPy, Statsmodels, Pingouin, SKlearn (scikit-learn), Keras, FinTA, and Kaggle Kernel. For every tool we provide basic information: a short introduction, the license, and the required citation. We also provide examples for three of the tools: TensorFlow, SKlearn, and Kaggle Kernel. We hope this project will be helpful for those who want to conduct machine learning and statistical analysis in the field of economics.

Neural Network Playground

1. Introduction

[Neural Network Playground] is a web app written in JavaScript that runs in your browser. It lets you play with a real neural network and visualize it (Sato, 2016). Check its [GitHub repo] for more details.

2. License

Neural Network Playground is free and open-source software, released under the Apache-2.0 License.

3. Required Citation

M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G. S. Corrado, A. Davis, J. Dean, M. Devin, S. Ghemawat, I. Goodfellow, A. Harp, G. Irving, M. Isard, Y. Jia, R. Jozefowicz, L. Kaiser, M. Kudlur, J. Levenberg, D. Mané, R. Monga, S. Moore, D. Murray, C. Olah, M. Schuster, J. Shlens, B. Steiner, I. Sutskever, K. Talwar, P. Tucker, V. Vanhoucke, V. Vasudevan, F. Viégas, O. Vinyals, P. Warden, M. Wattenberg, M. Wicke, Y. Yu, and X. Zheng. TensorFlow: Large-scale machine learning on heterogeneous systems, 2015. Software available from tensorflow.org.

TensorFlow

1. Introduction

[TensorFlow] is a free and open-source software library for machine learning. It can be applied to a wide range of tasks but focuses on training deep neural networks.

2. License

TensorFlow is free and open-source software, released under the terms of the Apache License 2.0.

3. Required Citation

M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G. S. Corrado, A. Davis, J. Dean, M. Devin, S. Ghemawat, I. Goodfellow, A. Harp, G. Irving, M. Isard, Y. Jia, R. Jozefowicz, L. Kaiser, M. Kudlur, J. Levenberg, D. Mané, R. Monga, S. Moore, D. Murray, C. Olah, M. Schuster, J. Shlens, B. Steiner, I. Sutskever, K. Talwar, P. Tucker, V. Vanhoucke, V. Vasudevan, F. Viégas, O. Vinyals, P. Warden, M. Wattenberg, M. Wicke, Y. Yu, and X. Zheng. TensorFlow: Large-scale machine learning on heterogeneous systems, 2015. Software available from tensorflow.org.

4. Example

Figure 2 shows how to classify images of clothing with TensorFlow. The example is provided by the [TensorFlow official website].

Figure 2: TensorFlow Example: Classify images of clothing

PyTorch

1. Introduction

[PyTorch] is an open-source machine learning library based on the Torch library. It can be used in fields such as computer vision and natural language processing (NLP).

2. License

PyTorch is free and open-source software, released under the BSD License.

ELI5

1. Introduction

[ELI5] is a Python library that lets users visualize and debug various machine learning models through a unified API. It has built-in support for several ML frameworks and provides a way to explain black-box models (ELI5, 2017).

2. License

ELI5 is free and open-source software, released under the MIT License.

3. Required Citation

Angela Fan, Yacine Jernite, Ethan Perez, David Grangier, Jason Weston, Michael Auli. ELI5: Long Form Question Answering, Proceedings of ACL 2019.

SciPy

1. Introduction

[SciPy] is a free and open-source Python library and math toolkit. It can be used for scientific and technical computing.
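As a small illustration of the kind of statistical test SciPy supports, the sketch below runs a two-sample Welch t-test with `scipy.stats.ttest_ind`. The helper name `compare_group_means`, the synthetic data, and the 5% significance level are assumptions made for this example only.

```python
import numpy as np
from scipy import stats


def compare_group_means(group_a, group_b, alpha=0.05):
    """Welch two-sample t-test (hypothetical helper for this sketch).

    Returns (t statistic, p-value, whether the null of equal means is rejected).
    """
    result = stats.ttest_ind(group_a, group_b, equal_var=False)
    return float(result.statistic), float(result.pvalue), bool(result.pvalue < alpha)


# Synthetic data standing in for an outcome measured in two groups.
rng = np.random.default_rng(seed=0)
treated = rng.normal(loc=52.0, scale=5.0, size=200)
control = rng.normal(loc=50.0, scale=5.0, size=200)

t_stat, p_value, reject = compare_group_means(treated, control)
print(f"t = {t_stat:.3f}, p = {p_value:.4f}, reject H0 at 5%: {reject}")
```

Setting `equal_var=False` selects Welch's variant, which does not assume the two groups share a variance.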

2. License

SciPy is free and open-source software, released under the BSD License.

3. Required Citation

Pauli Virtanen, Ralf Gommers, Travis E. Oliphant, Matt Haberland, Tyler Reddy, David Cournapeau, Evgeni Burovski, Pearu Peterson, Warren Weckesser, Jonathan Bright, Stéfan J. van der Walt, Matthew Brett, Joshua Wilson, K. Jarrod Millman, Nikolay Mayorov, Andrew R. J. Nelson, Eric Jones, Robert Kern, Eric Larson, CJ Carey, İlhan Polat, Yu Feng, Eric W. Moore, Jake VanderPlas, Denis Laxalde, Josef Perktold, Robert Cimrman, Ian Henriksen, E.A. Quintero, Charles R Harris, Anne M. Archibald, Antônio H. Ribeiro, Fabian Pedregosa, Paul van Mulbregt, and SciPy 1.0 Contributors. (2020) SciPy 1.0: Fundamental Algorithms for Scientific Computing in Python. Nature Methods, 17(3), 261-272.

Statsmodels

1. Introduction

[Statsmodels] is a Python package that allows users to explore data, estimate statistical models, and perform statistical tests.

2. License

Statsmodels is free and open-source software, released under the Modified BSD (3-clause) license.

3. Required Citation

Seabold, S., & Perktold, J. (2010). statsmodels: Econometric and statistical modeling with Python. In 9th Python in Science Conference.

Pingouin

1. Introduction

[Pingouin] is an open-source statistical package written in Python 3 and based mostly on Pandas and NumPy (Pingouin, 2021).

2. License

Pingouin is free and open-source software, released under the GNU General Public License v3.0.

3. Required Citation

Vallat, R. (2018). Pingouin: statistics in Python. Journal of Open Source Software, 3(31), 1026, https://doi.org/10.21105/joss.01026

SKlearn (scikit-learn)

1. Introduction

[Scikit-learn] is a free software machine learning library for the Python programming language.
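A typical scikit-learn workflow is to load data, split it, fit an estimator, and score its predictions. The sketch below applies that workflow to the hand-written digits dataset bundled with the library. The choice of a support vector classifier, `gamma=0.001`, and the 50/50 unshuffled split are assumptions for this illustration.

```python
from sklearn.datasets import load_digits
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# 1,797 images of 8x8 pixels, already flattened to 64 features each.
digits = load_digits()

# Train on the first half of the samples and test on the second half.
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.5, shuffle=False
)

classifier = SVC(gamma=0.001)
classifier.fit(X_train, y_train)

predictions = classifier.predict(X_test)
test_accuracy = accuracy_score(y_test, predictions)
print(f"Test accuracy: {test_accuracy:.3f}")
```

Other estimators expose the same `fit` and `predict` methods, so swapping the classifier only changes the line that constructs it.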

FinTA

2. License

FinTA is open source and free to use under the LGPL-3.0 license.

Kaggle Kernel

1. Introduction

[Kaggle] Kernels are essentially Jupyter notebooks that run in the browser, so users can skip the hassle of setting up a local environment (Yufeng, 2017).

2. Example

Figure 5 shows an example of using a Kaggle Kernel for data processing and simple machine learning on a Kaggle dataset. Check [this example] for more details.

In this section, we provide three case studies for TensorFlow, Scikit-Learn, and Kaggle Kernel. [The case of TensorFlow], provided by the TensorFlow official website, trains a neural network model to classify images of clothing. [The case of Scikit-Learn] shows how scikit-learn can be used to recognize images of hand-written digits from 0 to 9. [The case of Kaggle Kernel] shows how to do data processing and machine learning on a Kaggle dataset of Bitcoin prices. With a time series algorithm, we can predict Bitcoin prices from previous ones.
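The idea of predicting a price from previous ones can be sketched with a simple autoregressive (AR) model fitted by least squares. This is not the method used in the linked Kaggle Kernel; it is a generic illustration on a synthetic series, and the helpers `fit_ar` and `predict_next` are hypothetical names introduced here.

```python
import numpy as np


def fit_ar(series, n_lags):
    """Fit y_t = c + phi_1*y_{t-1} + ... + phi_p*y_{t-p} by least squares.

    Returns the coefficient vector [c, phi_1, ..., phi_p].
    """
    series = np.asarray(series, dtype=float)
    # Each row holds the n_lags previous values, most recent first.
    lagged = np.array(
        [series[t - n_lags:t][::-1] for t in range(n_lags, len(series))]
    )
    design = np.column_stack([np.ones(len(lagged)), lagged])
    targets = series[n_lags:]
    coefs, *_ = np.linalg.lstsq(design, targets, rcond=None)
    return coefs


def predict_next(series, coefs):
    """One-step-ahead forecast from the most recent observations."""
    n_lags = len(coefs) - 1
    recent = np.asarray(series[-n_lags:], dtype=float)[::-1]
    return float(coefs[0] + recent @ coefs[1:])


# Synthetic AR(1) series standing in for a price history (not real Bitcoin data).
rng = np.random.default_rng(seed=1)
prices = [10.0]
for _ in range(1999):
    prices.append(3.0 + 0.7 * prices[-1] + rng.normal(0.0, 1.0))

ar_coefs = fit_ar(prices, n_lags=1)
forecast = predict_next(prices, ar_coefs)
print(f"Estimated coefficients: {ar_coefs}, next-step forecast: {forecast:.2f}")
```

Real price series are usually non-stationary, so in practice such a model would more often be fitted to returns or differenced prices and evaluated on held-out data.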
