Data Science Topics

Year
1
Academic year
2025-2026
Code
02038778
Subject Area
Optional
Language of Instruction
Portuguese
Other Languages of Instruction
English
Mode of Delivery
Face-to-face
Duration
Semester
ECTS Credits
6.0
Type
Elective
Level
2nd Cycle Studies - Master's

Recommended Prerequisites

Calculus, Linear Algebra, Programming.

Teaching Methods

T classes (lectures): presentation and discussion of concepts, techniques and algorithms. PL classes (practical laboratory): students work at the computer, applying algorithms to data science problems of moderate complexity and running simulations with suitable tools and frameworks. This work is carried out in groups during the PL classes, under the teacher's supervision, and counts for 20% of the final grade. A project carried out outside class, with a written report and a public defense, counts for 40% of the final grade. The written exam counts for the remaining 40%.

Learning Outcomes

This course unit introduces the field of data science, giving students an overview of the area, its methodological principles, its challenges and its main applications. It also introduces the basic algorithms of a data analysis pipeline, with particular emphasis on data preparation, attribute extraction and dimensionality reduction, and on machine learning and validation. By the end, students should be able to design pipelines and to identify and validate, both experimentally and formally, the best algorithmic solution for a given task. The course unit also aims to foster autonomous learning, group work, interpersonal skills, and oral and written communication.

Work Placement(s)

No

Syllabus

Chapter 1: Introduction
- Big Data, Computational Learning and Data Science
- Lifecycle and the pipeline
- Typical problems and applications

Chapter 2: Attributes and Cleaning
- Preliminary data analysis
- Attribute types and attribute conversion
- Binning (fixed and adaptive)
- Normalization
- Data cleaning
- Treatment of unbalanced data
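
As an illustration of the binning and normalization topics in Chapter 2, the following is a minimal sketch in plain Python. It assumes that "fixed" binning means equal-width intervals and "adaptive" binning means equal-frequency (quantile-style) intervals; the function names and the `ages` data are hypothetical and are not taken from the course materials.

```python
def equal_width_bins(values, n_bins):
    """Fixed binning: split [min, max] into n_bins intervals of equal width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    if width == 0:
        return [0 for _ in values]
    # The maximum value would land in bin n_bins, so clamp it into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def equal_frequency_bins(values, n_bins):
    """Adaptive binning: each bin receives roughly the same number of samples."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = rank * n_bins // len(values)
    return bins


def min_max_normalize(values):
    """Rescale values linearly to the interval [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical attribute with one outlier (60).
ages = [18, 19, 20, 21, 22, 60]
fixed = equal_width_bins(ages, 3)
adaptive = equal_frequency_bins(ages, 3)
scaled = min_max_normalize(ages)
```

With this data the outlier leaves the middle equal-width bin empty, while the equal-frequency bins stay balanced, which is one reason to prefer adaptive binning for skewed attributes.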

Chapter 3: Attribute Engineering
- Types of attributes
- Dimensionality reduction (PCA, FDA and LDA)
- Selection of attributes (Filters, Wrappers and Embedded Methods)
- Time-frequency and space-frequency analysis (FFT, STFT, wavelets)
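
As an illustration of the frequency analysis topic in Chapter 3, the sketch below computes a naive discrete Fourier transform and uses it to extract one attribute, the dominant frequency of a signal. It is a direct O(n^2) implementation for teaching purposes only; an FFT produces the same spectrum more efficiently. The function names and the 5 Hz test signal are hypothetical.

```python
import cmath
import math


def dft(signal):
    """Naive O(n^2) discrete Fourier transform of a real or complex sequence."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def dominant_frequency(signal, sample_rate):
    """Frequency in Hz of the largest non-DC component up to the Nyquist bin."""
    n = len(signal)
    spectrum = dft(signal)
    k_best = max(range(1, n // 2 + 1), key=lambda k: abs(spectrum[k]))
    return k_best * sample_rate / n


# One second of a 5 Hz sine wave sampled at 64 Hz (hypothetical signal).
sample_rate = 64
signal = [math.sin(2 * math.pi * 5 * t / sample_rate) for t in range(sample_rate)]
peak_hz = dominant_frequency(signal, sample_rate)
```

Applying the same transform to short overlapping windows of a longer signal is the basic idea behind the STFT.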

Chapter 4: Computational Learning
- Computational Learning Taxonomies
- The Computational Learning process (data partitioning, model training and evaluation, metrics, model tuning)
- Models and Algorithms (feedforward neural networks, SVMs and decision trees, and their applications to classification and regression problems; Bayesian algorithms)
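
The learning process listed in Chapter 4 (partition the data, train a model, evaluate it with a metric) can be sketched as follows. This is a minimal example under stated assumptions: the model is a decision stump (a depth-1 decision tree on a single attribute), the partition is a deterministic hold-out rather than a random one, class 1 is assumed to correspond to larger attribute values, and the data and function names are hypothetical.

```python
def holdout_split(samples, test_every=4):
    """Deterministic partition: every `test_every`-th sample goes to the test set."""
    train = [s for i, s in enumerate(samples) if i % test_every != test_every - 1]
    test = [s for i, s in enumerate(samples) if i % test_every == test_every - 1]
    return train, test


def fit_stump(train):
    """Pick the threshold (midpoint between sorted values) with the fewest training errors."""
    xs = sorted({x for x, _ in train})
    candidates = [(a + b) / 2 for a, b in zip(xs, xs[1:])]

    def errors(threshold):
        # Predict class 1 when the attribute exceeds the threshold.
        return sum((x > threshold) != bool(y) for x, y in train)

    return min(candidates, key=errors)


def accuracy(threshold, data):
    """Fraction of samples whose predicted class matches the label."""
    return sum((x > threshold) == bool(y) for x, y in data) / len(data)


# Hypothetical (attribute, label) pairs with two well-separated classes.
samples = [(1.0, 0), (1.5, 0), (2.0, 0), (2.5, 0),
           (6.0, 1), (6.5, 1), (7.0, 1), (7.5, 1)]
train, test = holdout_split(samples)
threshold = fit_stump(train)
test_accuracy = accuracy(threshold, test)
```

Evaluating on the held-out samples rather than on the training set is what makes the accuracy figure an estimate of performance on unseen data.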

Head Lecturer(s)

Paulo Fernando Pereira de Carvalho

Assessment Methods

Assessment
Problem Solving: 20.0%
Project: 40.0%
Exam: 40.0%

Bibliography

- P. Flach, Machine Learning: The Art and Science of Algorithms that Make Sense of Data, Cambridge University Press, 2012.
- I. Ilyas and X. Chu, Data Cleaning, ACM, 2019.
- P. Duboue, The Art of Feature Engineering: Essentials for Machine Learning.
- C. Bishop, Pattern Recognition and Machine Learning, Springer-Verlag, 2016.
- A. Géron, Hands-On Machine Learning with Scikit-Learn, Keras & TensorFlow, O'Reilly.
- D. Chopra and R. Khurana, Introduction to Machine Learning with Python, Bentham Books, 2023.
- M. Kuhn and K. Johnson, Feature Engineering and Selection, Chapman & Hall/CRC Data Science Series, 2021.

- A. C. Müller and S. Guido, Introduction to Machine Learning with Python, O'Reilly, 2017.
- S. García, J. Luengo and F. Herrera, Data Preprocessing in Data Mining, Springer, 2015.
- M. Nixon and A. Aguado, Feature Extraction & Image Processing, Academic Press, 2008.