Feature Engineering

Year
3
Academic year
2023-2024
Code
01016642
Subject Area
Informatics
Language of Instruction
Portuguese
Mode of Delivery
Face-to-face
Duration
Semester
ECTS Credits
6.0
Type
Compulsory
Level
1st Cycle Studies

Recommended Prerequisites

Mathematical Analysis I, Mathematical Analysis II, Numerical Linear Algebra and Scientific Calculus, Statistics

Teaching Methods

Teaching methodologies:
- Theoretical classes (2 hours weekly) for presenting and discussing the subject matter and solving problems, linked to the practical laboratory classes and supported by slides and computer demonstrations.
- Practical laboratory classes (2 hours weekly) to support work on the Worksheet exercises and the Mini-project.

Adopted resources:
- Slides to support theoretical classes
- Miscellaneous Bibliography (books on the cov

Learning Outcomes

The aim of this course is to provide students with the theoretical knowledge and tools to extract, select and transform information so that it can be used efficiently by analysis and machine learning algorithms.
The course addresses data from two perspectives: first at the level of the raw data, and then at the level of the features extracted from that raw data.
At the end of the course, the student should be able to:
-analyze the quality of the information (e.g. evaluation of the signal-to-noise ratio);
-filter time series;
-extract features;
-develop strategies for data annotation;
-develop techniques for the detection and treatment of outliers;
-handle missing values;
-convert data (e.g. normalization, discretization);
-treat imbalance in data; and
-design and implement techniques for feature selection and reduction.

Work Placement(s)

No

Syllabus

Chapter 1: Introduction to Feature Engineering
Chapter 2: Raw Data Processing
-Data acquisition
-Data annotation strategies
-Data quality assessment: signal-to-noise ratio
-Time-series filtering: finite and infinite impulse response filters
-Detection and treatment of outliers
-Detection and treatment of missing values
-Time-frequency transforms: feature extraction from non-stationary signals
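
As an illustration of two Chapter 2 topics (signal-to-noise ratio and finite impulse response filtering), the following is a minimal sketch and is not taken from the course materials. It assumes the clean reference signal is known, which is rarely true in practice, and it uses a 9-point moving average as the FIR filter. The names `snr_db` and `fir_filter`, the test signal and the noise level are all assumptions chosen for the example.

```python
import math
import random

def snr_db(clean, observed):
    """SNR in decibels: 10*log10(signal power / error power).

    Assumes the clean reference is available (illustration only).
    """
    signal_power = sum(s * s for s in clean) / len(clean)
    error_power = sum((o - s) ** 2 for s, o in zip(clean, observed)) / len(clean)
    return 10.0 * math.log10(signal_power / error_power)

def fir_filter(x, taps):
    """Causal FIR filter: y[n] = sum_k taps[k] * x[n-k], with x[n] = 0 for n < 0."""
    y = []
    for n in range(len(x)):
        acc = 0.0
        for k, h in enumerate(taps):
            if n - k >= 0:
                acc += h * x[n - k]
        y.append(acc)
    return y

random.seed(0)
N = 500
# Slow sinusoid (2 cycles over the record) plus white Gaussian noise.
clean = [math.sin(2 * math.pi * 2 * n / N) for n in range(N)]
noisy = [s + random.gauss(0.0, 0.5) for s in clean]

# 9-point moving average: the simplest low-pass FIR filter.
# It delays the signal by 4 samples, which is small next to the 250-sample period.
taps = [1.0 / 9] * 9
filtered = fir_filter(noisy, taps)

print(f"SNR before filtering: {snr_db(clean, noisy):.1f} dB")
print(f"SNR after filtering:  {snr_db(clean, filtered):.1f} dB")
```

An infinite impulse response filter would additionally feed past outputs back into the sum; the moving average is used here only because it keeps the example short.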
Chapter 3: Feature Handling
-Data quality assessment
-Detection and treatment of outliers
-Detection and treatment of missing values
-Data conversion: discretization of continuous variables, conversion of categorical variables, ...
-Normalization
-Unbalanced data handling
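
Three of the Chapter 3 topics (missing values, outliers and normalization) can be sketched on a single feature column as follows. This is a hedged example, not the course's prescribed method: it assumes median imputation, the common 1.5 x IQR rule for outliers, and z-score normalization, and the helper names and sample values are invented for illustration.

```python
import statistics

def impute_median(values):
    """Replace missing entries (None) with the median of the observed values."""
    observed = [v for v in values if v is not None]
    med = statistics.median(observed)
    return [med if v is None else v for v in values]

def iqr_outliers(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v < low or v > high for v in values]

def zscore(values):
    """Rescale to zero mean and unit (population) standard deviation."""
    mu = statistics.mean(values)
    sigma = statistics.pstdev(values)
    return [(v - mu) / sigma for v in values]

# Hypothetical sensor readings with two missing entries and one gross outlier.
raw = [4.8, 5.1, None, 5.0, 4.9, 25.0, 5.2, None, 5.0]

filled = impute_median(raw)          # the median is robust to the 25.0 reading
flags = iqr_outliers(filled)         # True where a value is flagged
kept = [v for v, bad in zip(filled, flags) if not bad]
scaled = zscore(kept)

print("flagged as outliers:", [v for v, bad in zip(filled, flags) if bad])
print("normalized values:", [round(v, 2) for v in scaled])
```

Dropping flagged values is only one possible treatment; clipping them to the fence values or keeping them with a separate indicator feature are alternatives.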
Chapter 4: Feature Selection and Reduction
-Classifier/regressor-independent methods: Filters
-Methods based on classification/regression performance: Wrappers
-Embedded Methods
-Unsupervised Reduction: PCA, MDS,...
-Supervised Reduction: LDA
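
To make the idea of a filter method from Chapter 4 concrete, the sketch below ranks features by the absolute Pearson correlation with the target, without training any classifier or regressor. It is one simple filter criterion among many and is only an assumption about what such a method might look like; the function names, the synthetic data and the feature names are hypothetical.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def rank_features(columns, target):
    """Filter-style ranking: order feature names by |correlation with target|."""
    scores = {name: abs(pearson(col, target)) for name, col in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)

random.seed(1)
n = 200
target = [random.gauss(0.0, 1.0) for _ in range(n)]

# Synthetic features with decreasing dependence on the target.
columns = {
    "informative": [t + random.gauss(0.0, 0.3) for t in target],
    "weak": [0.6 * t + random.gauss(0.0, 1.0) for t in target],
    "noise": [random.gauss(0.0, 1.0) for _ in range(n)],
}

ranking = rank_features(columns, target)
print("features, most to least relevant:", ranking)
```

A wrapper method would instead score candidate feature subsets by the performance of a trained model, which is usually more expensive but accounts for interactions between features.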

Head Lecturer(s)

Paulo Fernando Pereira de Carvalho

Assessment Methods

Project: 20.0%
Mini Tests: 25.0%
Exam: 55.0%

Bibliography

1. García, Luengo & Herrera (2015). "Data Preprocessing in Data Mining". Springer.

2. Nixon & Aguado (2008). "Feature Extraction & Image Processing". Academic Press.

3. Giannakopoulos & Pikrakis (2015). "Introduction to Audio Analysis". Academic Press.