Software Development Foundations with Python
This bootcamp-style course introduces participants to key programming concepts using the Python language. It takes a practical approach to the core Python constructs that every Python programmer needs to know.
Because this is an introductory class, coding is done in Google Colab (Jupyter Notebooks) and in repl.it, an online command-line-like tool. All fundamental Python topics are covered, including datatypes, control flow, looping constructs, functions, modules, and collections (lists, dictionaries). The theme of “thinking like a programmer” runs throughout the course.
Following detailed exposure to Python, the course moves to the fundamentals of Data Science. You will learn about data pipelines and build Extract, Transform, and Load (ETL) pipelines. Data engineers and data scientists often deploy ETL processes to ingest data for use in an application and to manipulate that data for use by analysts. Participants will ingest JSON and CSV files using Internet APIs, Python, and pandas. They will also gain experience building a full-pipeline data application using tools in Google Cloud Platform (GCP), including Google Cloud Storage and Google BigQuery.
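As a rough illustration of the extract, transform, and load steps, the sketch below uses only Python's standard library rather than pandas or GCP. The sample data, field names, and function names are hypothetical and are not taken from the course materials.

```python
import csv
import io
import json

# Hypothetical sample data standing in for a CSV file downloaded from an API.
RAW_CSV = """name,species,age
Rex,dog,4
Whiskers,cat,
Bubbles,fish,2
"""


def extract(csv_text):
    """Extract: parse CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def transform(rows):
    """Transform: drop rows with a missing age and convert age to an integer."""
    cleaned = []
    for row in rows:
        if row["age"].strip() == "":
            continue  # filter out incomplete records
        cleaned.append(
            {"name": row["name"], "species": row["species"], "age": int(row["age"])}
        )
    return cleaned


def load(rows):
    """Load: serialize the cleaned rows to JSON (a stand-in for a real data store)."""
    return json.dumps(rows, indent=2)


if __name__ == "__main__":
    print(load(transform(extract(RAW_CSV))))
```

In a real pipeline, the extract step would typically read from a file or a web API, and the load step would write to a database or data warehouse instead of returning a string.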
Hands-on activities will include a combination of instructor demos, short exercises and challenging multi-hour projects that participants can display in their GitHub accounts.
This course prepares students to:
● Understand core concepts of computer programming
● Learn how to program using the Python Programming Language
● Gain a strong working knowledge of fundamental Python programming constructs such as built-in datatypes, variables, conditional logic, control flow, loops, and user-defined functions
● Gain significant experience manipulating Python strings, lists, and dictionaries to solve programming problems
● Code using Google Colaboratory and repl.it
● Use pip to install Python modules
● Learn to debug code and describe the difference between syntax and logic errors in your code
● Understand the different components of modern data pipelines
● Gain exposure to the Google Cloud Platform (GCP)
● Work with data formats commonly used in Data Science by data engineers and developers: JSON, CSV
● Understand the Extract, Transform, and Load (ETL) pipeline for processing data
● Use HTTP and the Python requests module to access data made available by Internet APIs, such as movie data, pet data, and archived newspaper articles
● Use the Python Data Science library, pandas, to ingest CSV data into a data pipeline
● Create Python classes and populate objects with data ingested from public sources
● Understand the data engineering concepts of filtering and cleaning data in an ETL pipeline
● Gain basic data analytics skills using the pandas library by working with large datasets
● Create a full pipeline ETL process using Python, Google Cloud Storage and Google BigQuery Data Warehouse
● Progress from writing basic Python programs to creating substantial programs that expose participants to technologies widely used in the software industry today
● Host your code on GitHub
This course is suitable for anyone new to programming who wants to learn how to code using Python and apply what they learn to Data Science applications. Participants should be comfortable using a browser, a filesystem, and applications such as Slack and Zoom.
Students should have basic proficiency navigating websites using a browser, creating and saving files to a filesystem, and using common applications such as Zoom and Slack. Prior exposure to a programming language is helpful but not required. Prior experience using a spreadsheet application such as Excel is also helpful.
● What is Google Colab?
● Running Python from Google Colab
● Basic I/O
Running Python from the Command Line
● Python Interactive Shell
● Booleans: True, False
● if, if-else, if-elif-else Statements
● Logical Operators
● for, while
● Working with Strings
● Built-in Python String Functions
● List Comprehensions
● User-Defined Functions
● Python Built-in Functions
Object-oriented Concepts in Python
● Classes and Objects
Python Libraries and Modules
● Built-in Libraries and Modules
● Installing Modules with pip
Errors and Debugging
● Types of Errors
● Planning your Program Design
● Iterative Design – Write, Test, Repeat
● Common Pitfalls for New Programmers
Data Pipelines and Data Formats
● What is a Data Pipeline?
● Structured, Unstructured, Semi-Structured Data
● Important Data Formats: JSON, CSV, XML
● The Structure of JSON Data
The Data Collection Phase: Accessing Datasets Available on the Web
● Hypertext Transfer Protocol (HTTP)
● Application Programming Interfaces (APIs)
● Finding Publicly Available Datasets
● URL Structures of APIs
● Downloading CSV and JSON Data
Leveraging Python to Ingest Data Sources via the Web
● Programmatically Accessing API Data via the Web
● Python requests Library
● Building Python Objects via API Data
Working with the pandas Data Analysis Library
● Why pandas?
● Populating DataFrames
● Importing CSV, Excel Data
● DataFrame Columns and Cells
● DataFrame Manipulation
● ETL with pandas
The Extract, Transform, and Load (ETL) Pipeline
● What is an ETL Pipeline?
● Data Ingest
● Data Ingest via Internet APIs
● Transforming Data in an ETL Pipeline
● ETL with Apache NiFi
Data Analysis with pandas
● Functions on DataFrames
● Merging and Concatenating DataFrames
● Data Cleaning
● Data Analysis
● Aggregate Functions
ETL Pipelines Using Google Cloud Platform
● What is Google Cloud Platform?
● Google Cloud Storage
● What is a Data Warehouse?
● Google BigQuery
● Demo: Full pipeline ETL using Python, Google Cloud Storage and Google BigQuery
Is there a discount available for current students?
UMBC students and alumni, as well as students who have previously taken a public training course with UMBC Training Centers, are eligible for a 10% discount, capped at $250. Please provide a copy of your UMBC student ID, an unofficial transcript, or the name of the UMBC Training Centers course you have completed. Asynchronous courses are excluded from this offer.
What is the cancellation and refund policy?
Students will receive a refund of paid registration fees only if UMBC Training Centers receives a notice of cancellation at least 10 business days prior to the class start date (for classes) or the exam date (for exams).