Data Science and Data Engineering for Architects

Overview

This course covers the theory and practice of Data Science and Data Engineering. Students are introduced to the field's core concepts, terminology, and tools. Hands-on exercises throughout the course help attendees reinforce the material.

Who Should Take This Course

Audience

This course is suitable for Software Developers, Technical IT Managers, Data Engineers, and Data Scientists.

Prerequisites

Participants should have a working knowledge of Python or strong programming experience in another language. Familiarity with core statistical concepts such as variance and correlation is helpful.

Course Outline

  1. What is Data Science?
  2. What is Data Engineering?
  3. Distributed Computing Concepts
  4. Data Processing Phases
  5. Introduction to NumPy
  6. Introduction to pandas
  7. Data Grouping and Aggregation with pandas
  8. Descriptive Statistics Computing Features in Python
  9. Repairing and Normalizing Data
  10. Data Visualization with matplotlib
  11. Data Science and ML Algorithms
  12. Parallel Data Processing with PySpark
  13. Operational Data Analytics with Splunk
  14. Python as a Cloud Scripting Language
  15. Amazon SageMaker
  16. Introduction to AWS Glue