Hortonworks HDP Developer: Apache Pig and Hive

Overview

This course is designed for developers who need to create applications that analyze Big Data stored in Apache Hadoop using Pig and Hive. Topics include Hadoop, YARN, HDFS, MapReduce, data ingestion, workflow definition, data analytics on Big Data with Pig and Hive, and an introduction to Spark Core and Spark SQL.

 

Outcomes

Upon completion of this course, students will be able to:
  • Describe Hadoop, YARN and use cases for Hadoop
  • Describe Hadoop ecosystem tools and frameworks
  • Describe the HDFS architecture
  • Use the Hadoop client to input data into HDFS
  • Transfer data between Hadoop and a relational database
  • Explain YARN and MapReduce architectures
  • Run a MapReduce job on YARN
  • Use Pig to explore and transform data in HDFS
  • Understand how Hive tables are defined and implemented
  • Use Hive to explore and analyze data sets
  • Use the new Hive windowing functions
  • Explain and use the various Hive file formats
  • Create and populate a Hive table that uses the ORC file format
  • Use Hive to run SQL-like queries to perform data analysis
  • Use Hive to join datasets using a variety of techniques
  • Write efficient Hive queries
  • Create ngrams and context ngrams using Hive
  • Perform data analytics using the DataFu Pig library
  • Explain the uses and purpose of HCatalog
  • Use HCatalog with Pig and Hive
  • Define and schedule an Oozie workflow
  • Present the Spark ecosystem and high-level architecture
  • Perform data analysis with Spark's Resilient Distributed Dataset API
  • Explore Spark SQL and the DataFrame API

Audience

This class is for software developers who need to understand and develop applications for Hadoop.

Prerequisites

Students should be familiar with programming principles and have experience in software development. SQL knowledge is also helpful. No prior Hadoop knowledge is required.

HD-ILT

HD-ILT™ (High Definition Instructor Led Training) is a patented, state-of-the-art video conferencing/remote lab training modality that allows students to study from Reston, VA; Columbia, MD; or any of our other 30+ locations across North America, or remotely from a home or office. Students in the HD-ILT lab receive live instruction in a virtual environment, including the same hands-on labs used in classroom-based courses.

System Requirements

For students choosing the live online class format, a complete list of system requirements can be found here.

Duration

4 Days


Save with Early Registration!

Register 21 days before class start date and save $250!
Enter Discount Code EARLY250 during registration.

Session Dates                            Session Time                 Location        Price
8/20/18 - 8/23/18 (Monday - Thursday)    10:00 a.m. - 6:00 p.m. EST   HD-ILT          $2800.00
10/29/18 - 11/1/18 (Monday - Thursday)   8:30 a.m. - 4:30 p.m.        Washington, DC  $2800.00

Group Training Available

UMBC Training Centers can deliver any of our courses in a group training environment at our facilities or yours. Group training can be an effective and economical method to quickly assure competency and consistency of knowledge and skills within an organization or department.