Quantitative Methods

WMS
Last Updated: 2025-10-28
Content
- 1. Goal
- 2. Calendar
- 3. Lectures
  - 1. Agreements and Introduction program
  - 2. The history of innovation
  - 3. Getting started with R
  - 4. Importing Data in R
  - 5. Data Wrangling in R
  - 6. Building Models in R
  - 7. Introduction to Companies
  - 8. Automated Reporting in R
  - 9. Bigger Data and Faster Code
  - 10. Ethics
  - 11. Bias in data
  - 12. Ideas for the end-projects
- 4. Exam
Goal
In this program we focus on a selection of the material presented in the book "The big R-book: from data science to learning machines and big data." We start by introducing the statistical programming language R and then use it to wrangle data, build models, verify models, and build reports.
The homepage of the book is here.
Calendar
| # | Date | Time | Where | Content |
|---|---|---|---|---|
| 1 | 2025-10-03 | 9:45–11:15 | C7/2.11 | introduction and agreements |
| 2 | 2025-10-10 | 9:45–11:15 | C7/2.11 | NO CLASSES |
| 3 | 2025-10-17 | | | |
| 4 | 2025-10-24 | 9:45–11:15 | C7/2.11 | Introduction: history of innovation and starting with R + [3] Starting with R |
| 4 | 2025-11-07 | 9:45–11:15 | C7/2.11 | [4 + 5] Tidyverse, data manipulation and databases and [8] automated reporting with RMarkdown |
| 5 | 2025-11-14 | 9:45–11:15 | C7/2.11 | |
| 6 | 2025-11-21 | 9:45–11:15 | C7/2.11 | [8] automated reporting with RMarkdown and [6] Linear regressions in R |
| 7 | 2025-11-28 | 9:45–11:15 | C7/2.11 | [6] Cross validation |
| 8 | 2025-12-05 | 9:45–11:15 | C7/2.11 | [6] Logistic regression in R |
| 9 | 2025-12-12 | 9:45–11:15 | C7/2.11 | [6] Performance of binary classification models |
| 10 | 2025-12-19 | 9:45–11:15 | C7/2.11 | [6] AI: decision tree and random forest |
| 11 | 2026-01-09 | 9:45–11:15 | C7/2.11 | [6] AI: Neural networks and deep learning |
| 11 | 2026-01-16 | 9:45–11:15 | C7/2.11 | [6] SVM and k-means |
| 11 | 2026-01-23 | 9:45–11:15 | C7/2.11 | questions or elective topic |
| 12 | 2026-01-30 | 9:45–11:15 | HSBC, Ul. Kapelanka 42A, 30-347 Krakow | EXAM |
Lectures and Content
| # | Lecture | Description | Downloads | Other Resources |
|---|---|---|---|---|
| 1 | Agreements and Introduction program | Explains how the course works, how we work together toward the final presentations, how the scores are determined, etc. | | |
| 2 | The history of innovation | Explore capitalism’s evolution through history, exponential growth, and innovation waves—from banking’s roots to today’s AI-driven era. Dive into emerging frontiers like quantum computing, biotech, and nanotech. Inspired by 'The Big R-Book: From Data Science to Learning Machines and Big Data'. | ||
| 3 | Getting started with R | In this module we get started using R and RStudio. This module introduces you to the language R. | This material corresponds to part II of the book "The big R-book: from data science to learning machines and big data." The homepage of the book is here. | |
| 4 | Importing Data in R | In this module we learn the basics of databases in general and relational databases in particular. Then we learn how to import data from SQL databases directly into R. | This material corresponds to part III of the book "The big R-book: from data science to learning machines and big data." The homepage of the book is here. | |
| 5 | Data Wrangling in R | Raw database data is rarely analysis-ready. Master essential preprocessing skills—feature engineering (adding columns, calculations), data cleaning (missing values, normalization), and transforming dates, strings, and bins—to transform raw data into a goldmine for modeling. | This material corresponds to part IV of the book "The big R-book: from data science to learning machines and big data." The homepage of the book is here. | |
| 6 | Building Models in R | This course transitions from data preparation to practical model-building, covering foundational statistical methods (linear regression, generalized linear models like logistic regression) and key machine learning techniques (decision trees, random forests, support vector machines, neural networks, and k-means clustering). Focused on implementation, it bridges theory with real-world application. | This material corresponds to part V of the book "The big R-book: from data science to learning machines and big data." The homepage of the book is here. | |
| 7 | Introduction to Companies | This course explores the fundamentals of wealth creation in private enterprises, linking financial statements (balance sheets, profit/loss statements) to company valuation. It introduces core concepts of valuing businesses and connects these principles to financial markets, covering instruments like equities, bonds, options, and futures. Focused on practical insights, it bridges corporate finance theory with real-world market applications. | This material corresponds to part VI of the book "The big R-book: from data science to learning machines and big data." The homepage of the book is here. | |
| 8 | Automated Reporting in R | This course focuses on transforming data insights into actionable outcomes through effective communication. Using R, RMarkdown, and Shiny, you’ll master automated workflows—from data import and analysis to generating dynamic reports, slides, static websites, and interactive dashboards. Learn to seamlessly integrate code, text, and visuals in reproducible documents, ensuring your findings drive informed decisions. | This material corresponds to part VII of the book "The big R-book: from data science to learning machines and big data." The homepage of the book is here. | |
| 9 | Bigger Data and Faster Code | This course addresses the computational challenges of large-scale data by teaching scalable processing techniques. Learn to optimize performance through multi-core CPU utilization, GPU acceleration, distributed systems (e.g., Apache Spark), and efficient coding practices—including clean code design and integrating compiled languages like C++ into R workflows. Balance hardware scalability with software optimization to tackle real-world big data demands. | This material corresponds to part IX of the book "The big R-book: from data science to learning machines and big data." The homepage of the book is here. | |
| 10 | Ethics | An introduction to ethics. What is it? What is ethical and what is not? How does the reference point of view influence our judgement? | See references in the slides | |
| 11 | Bias in data | Recognising bias in data and models and building robust, unbiased models. | ||
| 12 | Ideas for the end-projects | The end-project consists of building a model, cross-validating it, and reporting the results. To do that, you will need data. Feel free to bring your own data to the party, but in case you struggle to find good sources, here are some ideas. | | |
Exam
Students form groups of 3 to 5 people and present a group project. The group project consists of the following steps:
- find a problem to be solved with a model (e.g. build an acceptance model for car insurance)
- find an appropriate dataset
- build the best possible models and compare them
- prepare a report about the work
- present the work in a short presentation
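To make the "build the best possible models and compare them" step more concrete, here is a minimal sketch in base R of how two candidate models could be compared with k-fold cross-validation. It is only an illustration under assumptions that are not part of the course material: the built-in `mtcars` data set, the two formulas, 5 folds, RMSE as the comparison metric, and the helper name `cv_rmse` are all chosen for the example. Your own project will use its own data, models, and performance measure.

```r
# Minimal sketch: compare two candidate models with k-fold cross-validation.
# Assumptions (illustrative only): the built-in mtcars data, the two formulas
# below, k = 5 folds, and RMSE as the comparison metric.

set.seed(42)  # make the random fold assignment reproducible

data(mtcars)
k <- 5
# Assign every row to one of k folds at random
folds <- sample(rep(seq_len(k), length.out = nrow(mtcars)))

# Hypothetical helper: cross-validated RMSE of a linear model
cv_rmse <- function(formula, data, folds) {
  response <- all.vars(formula)[1]  # name of the dependent variable
  fold_errors <- sapply(sort(unique(folds)), function(i) {
    train <- data[folds != i, ]            # fit on all folds except i
    test  <- data[folds == i, ]            # evaluate on fold i
    fit   <- lm(formula, data = train)
    pred  <- predict(fit, newdata = test)
    sqrt(mean((test[[response]] - pred)^2))
  })
  mean(fold_errors)  # average out-of-sample error over the folds
}

# Two candidate models for fuel consumption
candidates <- list(
  simple = mpg ~ wt,
  larger = mpg ~ wt + hp + cyl
)

# One cross-validated RMSE per candidate; lower is better
results <- sapply(candidates, cv_rmse, data = mtcars, folds = folds)
print(round(results, 2))
```

Using the same fold assignment for every candidate keeps the comparison fair, since each model is then judged on exactly the same held-out rows. The same pattern extends to other model types (for example `glm` for logistic regression) by swapping the fitting call and the error measure.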