Data Science

Introduction to Data Analytics

Overview

This course is a survey of processes and tools commonly used in applications that rely heavily on data analysis. The course will describe data pipelines deployed by data engineers and data scientists to ingest data for use in an application and to manipulate that data for use by analysts. Hands-on activities will include a combination of instructor demos and several instructor-guided labs to promote understanding of select topics.

Who Should Take This Course

Audience

This course is suitable for analysts, team leads, project managers, data analysts, data engineers, and developers.

Prerequisites

Students should have basic proficiency in navigating websites with a browser and in using a spreadsheet application such as Excel. Basic proficiency in a programming language is helpful but not required.

Why You Should Take This Course

  • Understand the different components of a modern data ecosystem.
  • Describe the typical roles and responsibilities of Data Engineers, Data Analysts, Data Scientists, and Business Intelligence Analysts.
  • Describe common data structures, file formats, and data sources, including CSV, TSV, XML, JSON, Parquet, and logging formats.
  • Gain an understanding of the languages and tools that data professionals use to manipulate data.
  • Explain ETL processes used to extract, transform, and load data into data repositories.
  • Gain experience with several tools for acquiring, importing, wrangling, and cleaning data, along with their characteristics, strengths, limitations, and applications.
  • Become aware of typical use cases for different types of data repositories such as Databases, Data Warehouses, Data Marts, and Data Lakes.
  • Understand tools available for Data Visualization.
  • Understand platforms and tools for Big Data processing.
  • Understand Machine Learning and available tools.

Course Outline

Introduction to Data Analytics

  1. Data Science Ecosystems
  2. Data Sources and Formats
  3. Extract, Transform, and Load (ETL)
  4. Data Repositories
  5. Acquiring and Wrangling Data
  6. Using pandas for Data Analysis
  7. Data Visualization
  8. Big Data Processing
  9. Machine Learning