Introduction to Data Science and Engineering

Year: 1
Academic year: 2023-2024
Code: 01016710
Subject Area: Informatics
Language of Instruction: Portuguese
Mode of Delivery: Face-to-face
Duration: Semester
ECTS Credits: 6.0
Type: Compulsory
Level: 1st Cycle Studies

Recommended Prerequisites

Programming.

Teaching Methods

The teaching model consists of one theoretical class and one practical class, each two hours long. Theoretical classes present concepts, principles, examples and good practices. In practical classes, students solve problems that consolidate the key concepts of the curricular unit, using and integrating a set of tools and modules.

Learning Outcomes

The curricular unit introduces the areas of data science and data engineering, giving students an overview of the field, its methodological principles, its challenges and its main applications. The goal is for students to become aware of the technical, scientific and methodological challenges that a data science engineer faces in professional practice, so that they can choose appropriate methodologies when analysing and designing solutions and when creating value in data science. The curricular unit thus serves as a link to the disciplinary curricular units that make up the curricular plan, enabling students to integrate disciplinary knowledge into a more comprehensive perspective of Engineering and Data Science. It also aims to foster autonomous learning, group work, interpersonal relationships, and oral and written communication.

Work Placement(s)

No

Syllabus

Introduction: What is Data Science and Data Engineering?
- Big Data and Data Science
- Why now? – Datafication
- Current landscape of perspectives
- Skill sets needed

Problems and Applications
- Essential Concepts of Data
- The data science life cycle and pipeline
- Typical problems: regression, classification, clustering and association rules
- Popular Data Analytics Applications

Data Engineering landscape
- Common challenges in data engineering
- Data, memory and storage
- Operating Systems
- Database Systems
- Networking and the Internet
- Software Engineering
- HPC and Cloud Computing

Data Science landscape
- Why do we need different methods?
- Common challenges in data science
- Exploratory Data Science
- Data preparation and cleaning
- Feature engineering and the curse of dimensionality
- Regression, classification, clustering
- Fusion
- Validation

Basic Machine Learning Algorithms
- Linear Regression
- k-NN

Head Lecturer(s)

Alberto Jorge Lebre Cardoso

Assessment Methods

Assessment
Exam: 50.0%
Problem Solving: 50.0%

Bibliography

João Moreira, André Carvalho, Tomáš Horváth, A General Introduction to Data Analytics, 1st Edition, Wiley (2019)

Wes McKinney, Python for Data Analysis, O'Reilly Media, Inc., USA (2018)