This repository contains coursework and projects for the Big Data Analytics class at Seattle University. The course focuses on big data processing using Hadoop, MapReduce, Hive, and Spark.

- Hadoop Ecosystem
  - Hadoop Architecture
  - Hadoop Distributed File System (HDFS)
- MapReduce
  - Programming Model
  - Common Algorithms
  - Implementing MapReduce in Java
- Apache Spark
  - Spark Basics
  - Spark SQL
- Hive
  - Hive Basics
  - Hive Optimization