Data Science

Course Overviews

UMBC Training Centers’ Data Science Training Program is designed to help individuals and organizations gain the technical and managerial skills required to conduct data analytics studies; evaluate, plan, manage, and complete data analytics projects; develop custom data analytics software; and administer and maintain analytics systems at scale.

General Audience

Introduction to Data Analytics

This course is a survey of processes and tools commonly used in applications that rely heavily on data analysis. The course will describe data pipelines deployed by data engineers and data scientists to ingest data for use in an application and to manipulate that data for use by analysts. Hands-on activities will include a combination of instructor demos and several instructor-guided labs to promote understanding of select topics. Click for more information
CompTIA Data+

As the importance of data analytics grows, more job roles are required to set context and better communicate vital business intelligence. Collecting, analyzing, and reporting on data can drive your organization’s priorities and lead business decision-making. Click for more information
CompTIA DataSys+

The CompTIA DataSys+ certification validates that those managing databases can properly collect, store, organize, protect and maintain data throughout its lifecycle to ensure its availability, accuracy, consistency and security. CompTIA DataSys+ is designed for those with 2-3 years of hands-on experience working in a database administrator (DBA) role. Click for more information

Data Analysts / Data Scientists

Introduction to Machine Learning

This course introduces participants to both supervised and unsupervised learning algorithms and discusses which kinds of datasets lend themselves to each ML technique. Hands-on labs, all done in Jupyter Notebooks, are designed to help the learner understand the concepts. Where necessary, background material in Linear Algebra, Probability, and Python will be presented. Click for more information
Introduction to Data Visualization

We are constantly faced with a vast amount of complex information – often more than we can handle. Well-designed visual interpretations of data improve comprehension, communication, and decision making. This workshop introduces data methods and techniques that increase the understanding of complex data. The focus is on conveying ideas effectively with visually appealing charts, graphs and maps. Click for more information
Introduction to SQL

Through hands-on labs, students will explore the SQL standard and gain an understanding of the SQL language. The course also introduces the concepts of database design and data modeling. Click for more information
Data Analysis with Excel

The objective of this course is to fully explore the uses of Microsoft Excel as a data analysis tool. Most business professionals are familiar with the core functionality of Excel. This course explores some of the additional capabilities and advanced features of Excel for analyzing, manipulating and visualizing data. Click for more information
Applied Data Science with Python

This course covers the theoretical and practical aspects of applying Python to Data Science, Business Analytics, and Data Logistics. Emphasis is on a survey of core concepts, terminology, and theory. A variety of hands-on labs help participants reinforce their theoretical knowledge of the material. Click for more information
Big Data Overview

This course provides an in-depth overview of the choices you have in processing Big Data. It introduces Big Data, the types of data you might have, approaches to working on and processing the data, and the capabilities, strengths, and weaknesses of those approaches. Click for more information
Introduction to Cloud Technology

This course provides an overview of cloud computing and a vendor-agnostic survey of the cloud services on offer. The focus is on introducing and comparing services provided by the three major cloud vendors (Amazon Web Services, Microsoft Azure, and Google Cloud Platform). Click for more information
SQL For Data Analytics

This course provides you with an overview of Structured Query Language (SQL) so that you can quickly begin working with and analyzing data with other data science tools. Before you can analyze data, you need to have the correct data. Many organizations store their data in structured databases and SQL is the language of choice to extract, manipulate, filter, and generally wrangle that data. Click for more information
Data Visualization with Tableau

Data Visualization is the graphical representation of large datasets using graphs and charts such as bar charts, line graphs, and scatterplots. Learn how to present datasets elegantly so that your audience can quickly digest and understand the data, derive insights, and see trends. This course teaches students how to work with Tableau to create effective visualizations of datasets and to build Dashboards within Tableau. Click for more information
Python for Data Science

This course introduces the Python language to students who want to use Python as a tool for their data science initiatives. The goal is to become proficient enough with the Python language to leverage powerful Data Science packages such as Pandas and matplotlib. Click for more information
Data Visualization with Matplotlib & Seaborn

Matplotlib is a data visualization library for Python. As part of the SciPy ecosystem of scientific Python libraries, it is widely used to create data graphics. However, Matplotlib is older than pandas, the most common Python library for data frame manipulation. Plotting data from pandas data frames with Matplotlib requires some extra steps that sometimes make it more cumbersome to use. Seaborn was created to address some of those issues: it provides more natural default settings and works with pandas data frames directly. Both libraries belong in a data analyst’s toolbox. Click for more information
Machine Learning & Data Science with Python

In recent years, industry, not just academia, has found that building powerful data models provides the next level of value beyond traditional business intelligence. This course combines state-of-the-art machine learning techniques with a practical approach designed to teach you to process your data and build models using Python’s scikit-learn. In this class you will learn to load and analyze your data with Pandas (a data analysis library), build visualizations with pyplot, and create predictive models using scikit-learn. Click for more information
Hadoop With Spark

Hadoop is a mature Big Data environment, and Hive is the de facto standard for its SQL interface. Today, computations in Hadoop are usually done with Spark, which offers an optimized compute engine that supports batch processing, real-time streaming, and machine learning. Click for more information
AWS – Practical Data Science with Amazon SageMaker

In this intermediate-level course, individuals learn how to solve a real-world use case with Machine Learning (ML) and produce actionable results using Amazon SageMaker. This course walks through the stages of a typical data science process for Machine Learning, from analyzing and visualizing a dataset to preparing the data and engineering features. Individuals will also learn practical aspects of model building, training, tuning, and deployment with Amazon SageMaker. Real-life use cases include customer retention analysis to inform customer loyalty programs. Click for more information
Data Warehousing on Amazon Web Services (AWS)

Data Warehousing on AWS introduces you to concepts, strategies, and best practices for designing a cloud-based data warehousing solution using Amazon Redshift, the petabyte-scale data warehouse in AWS. This course demonstrates how to collect, store, and prepare data for the data warehouse by using other AWS services such as Amazon DynamoDB, Amazon EMR, Amazon Kinesis Firehose, and Amazon S3. Additionally, this course demonstrates how to use business intelligence tools to perform analysis on your data. Click for more information
Planning and Designing Databases on AWS

In this course, you will learn about the process of planning and designing both relational and nonrelational databases. You will learn the design considerations for hosting databases on Amazon Elastic Compute Cloud (Amazon EC2). You will also learn about AWS relational database services, including Amazon Relational Database Service (Amazon RDS), Amazon Aurora, and Amazon Redshift. Click for more information
Advanced Data Analytics with PySpark

This class introduces participants to the Apache Spark platform, the Spark Shell and Spark SQL for big data processing applications. In addition to the Spark platform, participants will learn fundamental tools in the pandas library and gain experience with data visualization using seaborn. Click for more information
Data Engineering with PySpark

Data Engineering has become an important discipline in the Data Science space. For Data Analysts to do productive work, they need consistent datasets to analyze. A Data Engineer provides this consistency by accessing data in a variety of formats, using a variety of tools. This class will introduce programmers to tools for ETL applications as well as big data applications using Apache Spark. Participants will gain experience with PySpark, the Spark SQL module, and DataFrames. Click for more information
Practical Machine Learning With Apache Spark

This intensive hands-on training introduces the audience to the core aspects of scalable data processing using Python on the Apache Spark platform. The students will learn the essentials of Python with the primary focus being on the capabilities of the Apache Spark platform and its Machine Learning module. The students will be introduced to the terminology, concepts, and algorithms used in Machine Learning. Click for more information
Apache Spark

The success of many organizations depends on their ability to derive business insights from massive amounts of raw data coming from various sources. Apache Spark offers many engineering improvements over the traditional MapReduce programming model as implemented in Hadoop. It provides multi-pass, in-memory processing of data, which boosts the overall performance of your ETL and machine-learning algorithms. Click for more information
Data Science and Data Engineering for Architects

Click for more information
R Programming

This course teaches many concepts and capabilities of the R programming language. Some of the topics include importing data, data visualization using ggplot2, built-in R datatypes & structures, and general R syntax. Upon completion of the course students will be able to import, analyze, and summarize large, complex data sets using R. Click for more information
