Quantitative Methods

WMS

Last Updated: 2024-04-12

Content

1. Goal
2. Calendar
3. Lectures
   1. The history of innovation
   2. Getting started with R
   3. Importing Data in R
   4. Data Wrangling in R
   5. Building Models in R
   6. Introduction to Companies
   7. Automated Reporting in R
   8. Bigger Data and Faster Code
   9. Ethics
4. Exam

Goal

In this program we focus on a selection of the material presented in the book "The big R-book: from data science to learning machines and big data." We start by introducing the statistical programming language R and then use it to wrangle data, build models, verify models and build reports.
The homepage of the book is here.

Calendar

#	Date	Time	Where	Content
1	2023-10-02	9:15 – 12:00	1093	Introduction and agreements
2	2023-10-09	9:15 – 10:4	1093	Introduction: history of innovation and starting with R
3	2023-10-06	9:15 – 12:00	1093	Starting with R

Lectures and Content

#	Lecture	Description	Downloads	Other Resources
1	The history of innovation	A historical view of banking and capitalism, the importance of exponential growth, innovations, and the great waves of capitalism. We explore the different waves and conclude that the latest wave is based on artificial intelligence, while some other promising technologies such as quantum computing, biotech and nanotech are just around the corner. This is in line with the introduction of the book "The big R-book: from data science to learning machines and big data." The homepage of the book is here.	slides handouts
2	Getting started with R	In this module we get started with R and RStudio and introduce the R language.	slides R-code exercises	This material corresponds to part II of the book "The big R-book: from data science to learning machines and big data." The homepage of the book is here.
3	Importing Data in R	In this module we learn the basics of databases in general and relational databases in particular. Then we learn how to import data from SQL databases directly into R.	slides R-code exercises	This material corresponds to part III of the book "The big R-book: from data science to learning machines and big data." The homepage of the book is here.
4	Data Wrangling in R	When data is pulled from a database, it is seldom in a format that allows us to build a model right away. In this module we learn how to manipulate data to prepare it for modelling. This includes adding columns, calculations, insertions, normalising, working with strings, understanding dates, data binning, dealing with missing data, etc.	slides R-code exercises	This material corresponds to part IV of the book "The big R-book: from data science to learning machines and big data." The homepage of the book is here.
5	Building Models in R	This is where the rubber hits the road: the long preparations of importing data and preparing it for models now come to fruition, and we can start building models. We look into linear regressions, generalised linear regressions (e.g. logistic regressions) and also machine learning techniques such as decision trees, random forests, support vector machines, neural networks, clustering with k-means, etc.	slides R-code exercises	This material corresponds to part V of the book "The big R-book: from data science to learning machines and big data." The homepage of the book is here.
6	Introduction to Companies	To be effective in a private enterprise it is useful to understand the basics of wealth creation and how it is reflected in the balance sheet and the profit and loss statement. This value creation chain leads to wealth creation in companies and hence is a good hook to talk about company valuation. Company valuation is an entry point to financial markets with their many financial instruments such as bonds, equities, options, futures, etc.	slides R-code exercises	This material corresponds to part VI of the book "The big R-book: from data science to learning machines and big data." The homepage of the book is here.
7	Automated Reporting in R	Even the most fantastic model or data analysis is useless if one cannot convince other people to take action. R and R Markdown provide all the tools to build an automated chain of importing data, manipulating data, building models and reporting. We learn how to integrate code, text and layout in one document that can be compiled to slides or static websites. We even find out how to build an interactive application with R and {shiny}.	slides R-code	This material corresponds to part VII of the book "The big R-book: from data science to learning machines and big data." The homepage of the book is here.
8	Bigger Data and Faster Code	Even the fastest PC cannot deal with the huge amounts of data that humanity collects. We see how one can gradually use different techniques to scale up the data processing capacity of the computer: using more cores in the CPU, using the GPU, and using faster computers, all the way up to the parallelism of big data solutions such as Spark. Efficient programming techniques remain paramount too, and here we share many tips ranging from clean and efficient code to using compiled code and C++ from within R.	slides R-code exercises	This material corresponds to part IX of the book "The big R-book: from data science to learning machines and big data." The homepage of the book is here.
9	Ethics	An introduction to ethics. What is it? What is ethical and what is not? How does the reference point of view influence our judgement?	slides handouts	See references in the slides.
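
To give a flavour of the wrangling steps listed for lecture 4 (working with strings, understanding dates, dealing with missing data, normalising and data binning), here is a minimal sketch in base R. The toy data frame, its column names and the bin boundaries are invented for illustration and do not come from the course material or the book.

```r
# Hypothetical toy data: all names and values are invented for illustration.
df <- data.frame(
  name   = c(" alice", "Bob ", "carol", "dave"),
  joined = c("2023-10-02", "2023-10-09", "2023-10-16", "2023-10-23"),
  income = c(2500, NA, 4100, 3300),
  stringsAsFactors = FALSE
)

# Strings: trim whitespace and capitalise the first letter
df$name <- trimws(df$name)
df$name <- paste0(toupper(substring(df$name, 1, 1)), substring(df$name, 2))

# Dates: convert text to Date and derive the ISO weekday (1 = Monday)
df$joined  <- as.Date(df$joined)
df$weekday <- as.integer(format(df$joined, "%u"))

# Missing data: replace NA by the median of the observed values
df$income[is.na(df$income)] <- median(df$income, na.rm = TRUE)

# Normalising: add a column with the standardised income (z-score)
df$income_z <- (df$income - mean(df$income)) / sd(df$income)

# Binning: cut the income into three bands (boundaries are assumptions)
df$income_band <- cut(df$income,
                      breaks = c(0, 3000, 4000, Inf),
                      labels = c("low", "mid", "high"))

print(df)
```

Median imputation and fixed bin boundaries are only one possible choice; the appropriate treatment depends on the data and on the model to be built.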

Exam

Students form groups of 3 to 5 people and present a group project. The group project consists of the following steps:

find a problem to be solved with a model (e.g. build an acceptance model for car insurance)
find an appropriate dataset
build the best possible models and compare them
prepare a report about the work
present the work in a short presentation
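
The step "build the best possible models and compare them" could look roughly like the sketch below. It is only an illustration under assumptions: it uses the built-in `mtcars` data set rather than a data set chosen by a group, two arbitrary linear models, and the root mean squared error on a hold-out sample as the comparison metric. A real project would choose its own data, candidate models and metric.

```r
# Illustrative sketch: compare two candidate models on held-out data.
# The data set (mtcars), the formulas and the metric are assumptions.
set.seed(42)                                   # make the split reproducible
n        <- nrow(mtcars)
test_idx <- sample(n, size = round(0.3 * n))   # hold out about 30% for testing
train    <- mtcars[-test_idx, ]
test     <- mtcars[test_idx, ]

# Two candidate models for fuel consumption (mpg)
m1 <- lm(mpg ~ wt, data = train)
m2 <- lm(mpg ~ wt + hp, data = train)

# Root mean squared error of a model on a given data set
rmse <- function(model, data) {
  sqrt(mean((data$mpg - predict(model, newdata = data))^2))
}

# Collect the out-of-sample errors in one table for the report
comparison <- data.frame(
  model = c("mpg ~ wt", "mpg ~ wt + hp"),
  rmse  = c(rmse(m1, test), rmse(m2, test))
)
print(comparison)
```

Evaluating on data that was not used for fitting keeps the comparison fair between simpler and more complex models.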