Data Science Topics
1
2020-2021
02038778
Optional
Portuguese
English
Face-to-face
SEMESTRIAL
6.0
Elective
2nd Cycle Studies - Mestrado
Recommended Prerequisites
Calculus, Linear Algebra, Programming
Teaching Methods
Theoretical (T) classes: presentation and discussion of concepts, techniques and algorithms. In the practical lab (PL) classes, students apply algorithms on the computer to solve data science problems of moderate complexity, running simulations with suitable tools and frameworks. This work is done in groups during the PL class, under the teacher's supervision, and counts for 20% of the final grade. A project carried out outside class, with a report and a public defense, counts for 40% of the final grade. The written exam counts for the remaining 40%.
Learning Outcomes
The course unit (UC) introduces the field of data science, giving students an overview of the field, its methodological principles, its challenges and its main applications. It also introduces the basic algorithms of a data analysis pipeline, with particular emphasis on data preparation, attribute extraction and dimensionality reduction, and on machine learning and validation. By the end, students should be able to design pipelines, identify the best algorithmic solution for a particular task, and validate it experimentally and formally. The course also aims to foster autonomous learning, group work, interpersonal skills, and oral and written communication.
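As an illustration of the kind of pipeline described above, the following is a minimal sketch in standard-library Python, not course material. It chains data preparation (median imputation and z-score normalization), attribute reduction (a simple filter), learning (a nearest-centroid classifier) and validation (k-fold cross-validation). The synthetic data, the function names and the choice of each technique are assumptions made only for this example.

```python
import math
import random
import statistics


def make_toy_data(n=100, seed=1):
    """Hypothetical two-class data: 2 informative columns, 2 noise columns."""
    rng = random.Random(seed)
    rows, labels = [], []
    for i in range(n):
        label = i % 2
        row = [rng.gauss(3.0 * label, 1.0),   # informative
               rng.gauss(-2.0 * label, 1.0),  # informative
               rng.gauss(0.0, 1.0),           # noise
               rng.gauss(0.0, 1.0)]           # noise
        if rng.random() < 0.1:
            row[rng.randrange(4)] = None      # simulate a missing value
        rows.append(row)
        labels.append(label)
    return rows, labels


def fit_preprocessor(rows):
    """Learn per-column median (imputation), mean and std (z-scoring)."""
    stats = []
    for col in zip(*rows):
        med = statistics.median(v for v in col if v is not None)
        filled = [med if v is None else v for v in col]
        std = statistics.pstdev(filled) or 1.0  # guard against constant columns
        stats.append((med, statistics.fmean(filled), std))
    return stats


def transform(rows, stats):
    """Impute missing values, then standardize each column."""
    return [[((med if v is None else v) - mean) / std
             for v, (med, mean, std) in zip(row, stats)] for row in rows]


def select_features(X, y, k):
    """Filter method: keep the k columns whose class means differ the most."""
    scores = []
    for j in range(len(X[0])):
        m0 = statistics.fmean(x[j] for x, t in zip(X, y) if t == 0)
        m1 = statistics.fmean(x[j] for x, t in zip(X, y) if t == 1)
        scores.append((abs(m1 - m0), j))
    return sorted(j for _, j in sorted(scores, reverse=True)[:k])


def fit_centroids(X, y):
    """Nearest-centroid classifier: one mean vector per class."""
    return {c: [statistics.fmean(col)
                for col in zip(*[x for x, t in zip(X, y) if t == c])]
            for c in set(y)}


def predict(centroids, x):
    return min(centroids, key=lambda c: math.dist(centroids[c], x))


def cross_validate(rows, y, k_features=2, folds=5, seed=0):
    """k-fold CV; every step is fitted on the training folds only."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    accuracies = []
    for f in range(folds):
        test = idx[f::folds]
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        stats = fit_preprocessor([rows[i] for i in train])
        X_train = transform([rows[i] for i in train], stats)
        X_test = transform([rows[i] for i in test], stats)
        y_train = [y[i] for i in train]
        keep = select_features(X_train, y_train, k_features)
        model = fit_centroids([[x[j] for j in keep] for x in X_train], y_train)
        hits = sum(predict(model, [x[j] for j in keep]) == y[i]
                   for x, i in zip(X_test, test))
        accuracies.append(hits / len(test))
    return statistics.fmean(accuracies)


if __name__ == "__main__":
    rows, labels = make_toy_data()
    print(f"mean CV accuracy: {cross_validate(rows, labels):.2f}")
```

Fitting the imputation, normalization and filter inside each fold, rather than once on the full data set, keeps information from the held-out fold out of the training steps, which is what makes the reported accuracy an honest validation estimate.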
Work Placement(s)
No
Syllabus
Chapter 1: Introduction
- Big Data and Data Science
- Current situation and prospects
- Required skills
Chapter 2: Problems and Applications
- Life cycle and the pipeline
- Typical problems and applications
Chapter 3: Data processing
- Evaluation of signal-to-noise ratio
- Time series filtering
- Detection and treatment of outliers
- Detection and treatment of missing values
- Time-frequency transformations: extraction of non-stationary attributes
Chapter 4: Attribute Handling
- Discretization of continuous variables, conversion of categorical variables
- Normalization
- Treatment of unbalanced data
Chapter 5: Selection and reduction of attributes
- Classifier / regressor independent methods: Filters
- Methods based on classification / regression performance: "Wrappers"
- Embedded Methods
- Unsupervised reduction
- Supervised reduction
Chapter 6: Computational Learning
- Supervised and unsupervised learning
Chapter 7: Validation
Head Lecturer(s)
Paulo Fernando Pereira de Carvalho
Assessment Methods
Assessment
Problem solving: 20.0%
Exam: 40.0%
Project: 40.0%
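The three weights above combine into the final grade as a weighted sum. A minimal sketch of that arithmetic follows, assuming each component is marked on the same scale (0 to 20 is used here only as an assumption; the source does not state the scale). The component names and the function are hypothetical.

```python
# Assessment weights as listed above (fractions of the final grade).
WEIGHTS = {"problem_solving": 0.20, "exam": 0.40, "project": 0.40}


def final_grade(scores, weights=WEIGHTS):
    """Weighted sum of component scores; all scores share one scale."""
    if set(scores) != set(weights):
        raise ValueError("scores must cover exactly the weighted components")
    return sum(weights[name] * scores[name] for name in weights)


if __name__ == "__main__":
    # Assumed example marks on a 0-20 scale.
    example = {"problem_solving": 15, "exam": 12, "project": 16}
    print(round(final_grade(example), 2))
```

With the assumed marks of 15, 12 and 16, the result is 0.20 × 15 + 0.40 × 12 + 0.40 × 16 = 14.2.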
Bibliography
- Peter Flach, Machine Learning: The Art and Science of Algorithms that Make Sense of Data, Cambridge University Press, 2012.
- Andreas C. Müller and Sarah Guido, Introduction to Machine Learning with Python, O'Reilly, 2017.
- García, Luengo & Herrera (2015). "Data Preprocessing in Data Mining". Springer.
- Nixon & Aguado (2008). "Feature Extraction & Image Processing". Academic Press.