Hadoop With Spark
Hadoop is a mature Big Data environment, and Hive is the de facto standard for the SQL interface. Today, computations in Hadoop are usually done with Spark. Spark offers an optimized compute engine that supports batch processing, real-time streaming, and machine learning. This course covers Hadoop 3, Hive 3, and Spark 3.
Machine Learning & Data Science with Python
In recent years, industry, not just academia, has found that creating powerful data models provides the next level of value past traditional business intelligence. This course focuses on state-of-the-art machine learning techniques combined with a practical approach designed to teach you to process your data and build models using Python’s scikit-learn. In […]
Data Visualization with Tableau
Data Visualization is the graphical representation of large datasets using graphs and charts such as bar charts, line graphs, and scatterplots. Learn how to present datasets elegantly so that your audience can quickly digest, understand, and derive insights or see trends from the data. This course teaches students how to work with Tableau to create […]
SQL for Data Analytics
This course provides you with an overview of Structured Query Language (SQL) so that you can quickly begin working with and analyzing data with other data science tools. Before you can analyze data, you need to have the correct data. Many organizations store their data in structured databases, and SQL is the language of choice to […]
Data Science Overview
This course provides an in-depth overview of the choices you have in processing Big Data. It introduces Data Science, the types of data you might have, approaches to working on and processing the data, and the capabilities, strengths, and weaknesses of those approaches. Topics covered include: NewSQL Databases, NoSQL Overview, Hadoop and MapReduce, Apache Pig […]
Introduction to Machine Learning
This course introduces participants to both supervised and unsupervised learning algorithms, with a discussion of which datasets lend themselves to solutions with the various ML techniques. Hands-on labs are designed to assist the learner in understanding the concepts and are all done using Jupyter Notebooks. Where necessary, background material in Linear Algebra, Probability, and Python will […]
Introduction to Data Visualization
We are constantly faced with a vast amount of complex information – often more than we can handle. Well-designed visual interpretations of data improve comprehension, communication, and decision making. This workshop introduces data methods and techniques that increase the understanding of complex data. The focus is on conveying ideas effectively with visually appealing charts, graphs and […]
R Programming
This course teaches many concepts and capabilities of the R programming language. Topics include importing data, data visualization using ggplot2, built-in R data types and structures, and general R syntax. Upon completion of the course, students will be able to import, analyze, and summarize large, complex datasets using R.
Data Warehousing on Amazon Web Services (AWS)
Data Warehousing on AWS introduces you to concepts, strategies, and best practices for designing a cloud-based data warehousing solution using Amazon Redshift, the petabyte-scale data warehouse in AWS. This course demonstrates how to collect, store, and prepare data for the data warehouse by using other AWS services such as Amazon DynamoDB, Amazon EMR, Amazon Kinesis […]