Data Science and Data Engineering for Architects
Overview
This course covers the theory and practice of Data Science and Data Engineering. Students are introduced to the concepts, terminology, and tools used in the field. Hands-on exercises throughout the course help attendees reinforce the material.
Who Should Take This Course
Audience
This course is suitable for software developers, technical IT managers, data engineers, and data scientists.
Prerequisites
Participants should have a working knowledge of Python or strong programming experience in another language. Familiarity with core statistical concepts such as variance and correlation is helpful.
Course Outline
Data Science and Data Engineering for Architects
- What is Data Science?
- What is Data Engineering?
- Distributed Computing Concepts
- Data Processing Phases
- Introduction to NumPy
- Introduction to pandas
- Data Grouping and Aggregation with pandas
- Computing Descriptive Statistics in Python
- Repairing and Normalizing Data
- Data Visualization with matplotlib
- Data Science and ML Algorithms
- Parallel Data Processing with PySpark
- Operational Data Analytics with Splunk
- Python as a Cloud Scripting Language
- Amazon SageMaker
- Introduction to AWS Glue