Data Science Topics
1
2025-2026
02038778
Optional
Portuguese
English
Face-to-face
Semester
6.0
Elective
2nd Cycle Studies - Mestrado
Recommended Prerequisites
Calculus, Linear Algebra, Programming.
Teaching Methods
Theoretical (T) classes: presentation and discussion of concepts, techniques and algorithms. In practical laboratory (PL) classes, students work at the computer, applying algorithms to data science problems of moderate complexity and running simulations with tools and frameworks. This work is done in groups during the PL class, under the teacher's supervision, and counts for 20% of the final grade. A project carried out outside class, with a report and public defense, counts for 40% of the final grade. A written exam counts for the remaining 40%.
Learning Outcomes
The course unit introduces the area of data science, giving students an overview of the field, its methodological principles, its challenges and its main applications. It also introduces the basic algorithms of a data analysis pipeline, with particular emphasis on data preparation, attribute extraction and dimensionality reduction, and on machine learning and validation. By the end, students should be able to design pipelines and to identify and validate, experimentally and formally, the best algorithmic solution for a particular task. The course unit also aims to foster autonomous learning, group work, interpersonal relationships, and oral and written communication.
Work Placement(s)
No
Syllabus
Chapter 1: Introduction
- Big Data, Computational Learning and Data Science
- Lifecycle and the pipeline
- Typical problems and applications
Chapter 2: Attributes and Cleaning
- Preliminary data analysis
- Attribute types and attribute conversion
- Binning (fixed and adaptive)
- Normalization
- Data cleaning
- Treatment of unbalanced data
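As a rough illustration of two Chapter 2 topics, the sketch below contrasts fixed (equal-width) binning with adaptive (equal-frequency) binning and shows min-max normalization. It is not course material: the function names and the sample `ages` list are hypothetical, and only the Python standard library is used.

```python
# Illustrative sketch only: all names and the sample data are hypothetical.

def min_max_normalize(values):
    """Rescale values linearly to the [0, 1] range."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def fixed_width_bins(values, n_bins):
    """Fixed binning: split the value range into n_bins equal-width intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    if width == 0:
        return [0 for _ in values]
    # The maximum value is clamped into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def equal_frequency_bins(values, n_bins):
    """Adaptive binning: each bin receives roughly the same number of samples.

    Ties are broken by position, so equal values can land in different bins.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = rank * n_bins // len(values)
    return bins


ages = [18, 19, 20, 21, 22, 23, 40, 90]
print(min_max_normalize(ages))
print(fixed_width_bins(ages, 4))      # skewed data crowds into the first bin
print(equal_frequency_bins(ages, 4))  # two samples per bin
```

With skewed data such as this, equal-width bins leave most samples in a single interval, while equal-frequency bins adapt their boundaries to the data.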
Chapter 3: Attribute Engineering
- Types of attributes
- Dimensionality reduction (PCA, FDA and LDA)
- Selection of attributes (Filters, Wrappers and Embedded Methods)
- Time-frequency and space-frequency analysis (FFT, STFT, wavelets)
Chapter 4: Computational Learning
- Computational Learning Taxonomies
- The Computational Learning process (data partitioning, model training and evaluation, metrics, model tuning)
- Models and Algorithms (feedforward neural networks, SVM and decision trees and their applications in classification and regression problems; Bayes algorithms)
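The data partitioning and metrics steps of the learning process in Chapter 4 could be sketched as follows. This is a minimal, assumption-laden example rather than the course's own code: `train_test_split` and `binary_metrics` are hypothetical helpers written with the standard library only, and the labels are made up for illustration.

```python
import random

# Illustrative sketch only: helper names and labels are hypothetical.

def train_test_split(samples, test_fraction=0.25, seed=0):
    """Shuffle a copy of the samples and hold out a fraction for testing."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision and recall for binary predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    return {
        "accuracy": correct / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Partition eight sample indices into training and held-out test sets.
train, test = train_test_split(range(8))
print(len(train), len(test))

# Made-up ground truth and predictions, standing in for a trained model.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
print(binary_metrics(y_true, y_pred))
```

In practice the metrics would be computed only on the held-out test set, so that the evaluation reflects data the model has not seen during training.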
Head Lecturer(s)
Paulo Fernando Pereira de Carvalho
Assessment Methods
Assessment
Problem Solving: 20.0%
Project: 40.0%
Exam: 40.0%
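The weighting above amounts to a simple weighted average. The sketch below shows the arithmetic under two assumptions that the source does not state: that each component is graded on a 0-20 scale, and that no minimum grade per component applies. The dictionary keys and the example scores are hypothetical.

```python
# Illustrative sketch only: the 0-20 scale and the example scores are assumptions.

WEIGHTS_PERCENT = {"problem_solving": 20, "project": 40, "exam": 40}


def final_grade(scores):
    """Weighted average of the component scores, using percentage weights."""
    total = sum(WEIGHTS_PERCENT[name] * scores[name] for name in WEIGHTS_PERCENT)
    return total / 100


example = {"problem_solving": 16, "project": 15, "exam": 12}
print(final_grade(example))  # (20*16 + 40*15 + 40*12) / 100 = 14.0
```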
Bibliography
- P. Flach, Machine Learning: The Art and Science of Algorithms that Make Sense of Data, Cambridge University Press, 2012.
- I. Ilyas and X. Chu, Data Cleaning, ACM, 2019.
- P. Duboue, The Art of Feature Engineering: Essentials for Machine Learning.
- C. Bishop, Pattern Recognition and Machine Learning, Springer-Verlag, 2016.
- A. Géron, Hands-On Machine Learning with Scikit-Learn, Keras & TensorFlow, O'Reilly.
- D. Chopra and R. Khurana, Introduction to Machine Learning with Python, Bentham Books, 2023.
- M. Kuhn and K. Johnson, Feature Engineering and Selection, Chapman & Hall/CRC Data Science Series, 2021.
- A. C. Müller and S. Guido, Introduction to Machine Learning with Python, O'Reilly, 2017.
- García, Luengo and Herrera, Data Preprocessing in Data Mining, Springer, 2015.
- Nixon and Aguado, Feature Extraction & Image Processing, Academic Press, 2008.