Data Engineering and Architecting Pipelines Using Snowflake & AWS Cloud
Snowflake is the next big thing, and it is becoming a full-blown data ecosystem. Given its scalability and efficiency in handling massive volumes of data, and the number of new concepts it introduces, this is the right time to wrap your head around Snowflake and add it to your toolkit. This course not only covers the core features of Snowflake but also teaches you how to deploy Python/PySpark jobs in AWS Glue and Airflow that communicate with Snowflake, which is one of the most important aspects of building pipelines.
What you’ll learn
- Use Snowflake as a data warehouse, along with its other important features.
- Build automated pipelines with Snowflake.
- Use AWS Cloud with Snowflake as a data warehouse.
- Integrate real-time streaming data and data orchestration with Airflow and Snowflake.
Course Content
- Introduction –> 4 lectures • 7min.
- Introduction to Snowflake and AWS –> 9 lectures • 47min.
- Snowflake – Tables –> 6 lectures • 30min.
- Snowflake – Partitioning, Clustering and Performance Optimization –> 9 lectures • 1hr 12min.
- Snowflake – Data Loading/Ingestion and Extraction –> 9 lectures • 54min.
- Snowflake – Tasks and Query Scheduling –> 4 lectures • 15min.
- Snowflake – Streams and Change Data Capture –> 11 lectures • 58min.
- Snowflake – User Defined Functions –> 7 lectures • 36min.
- Snowflake – External Functions –> 6 lectures • 28min.
- Snowflake with Python, Spark and Airflow on AWS –> 12 lectures • 52min.
- Real Time Streaming with Kafka and Snowflake –> 6 lectures • 32min.
- Snowflake – Data Protection and Governance –> 7 lectures • 24min.
Requirements
Anyone who has a basic understanding of the cloud and belongs to one of the backgrounds below can benefit from this course:
– Data Scientists / Analysts
– Data Engineers / Software Developers
– SQL Programmers or DBAs
– Aspiring data analysts and scientists who are learning SQL and Python
This course covers:
- What Snowflake is
- The most crucial aspects of Snowflake, taught in a practical manner
- Writing Python/Spark jobs in AWS Glue for data transformation
- Real-time streaming using Kafka and Snowflake
- Interacting with external functions and their use cases
- Security features in Snowflake
Prerequisites for this course are:
- Knowledge of SQL, or at least some prior experience writing queries
- Scripting experience in Python (or any other language)
- Willingness to explore, learn, and put in the extra effort to succeed
- An active AWS account and knowledge of basic cloud fundamentals
Important note: You need an active AWS account to complete the sections related to Python and PySpark. For the rest of the course, a free-trial Snowflake account should suffice.
Some tips:
- Try watching the videos at 1.2x speed
- Read the reference links and the official Snowflake documentation as much as possible