  1. Introduction
    1.1. Goal of this lab
    1.2. Code repositories
    1.3. Groups
    1.4. AWS
    1.5. Grading
  2. Getting Started
    2.1. Docker
    2.2. Scala
    2.3. Apache Spark
      2.3.1. Resilient Distributed Datasets
      2.3.2. Dataframe and Dataset
      2.3.3. Packaging your application using SBT
    2.4. Amazon Web Services
    2.5. Apache Kafka
    2.6. OpenStreetMap
    2.7. ALOS Global Digital Surface Model
  3. Lab 1
    3.1. Before you start
    3.2. Assignment
    3.3. Deliverables
    3.4. Rubric
  4. Lab 2
    4.1. Before you start
    4.2. Assignment
    4.3. Deliverables
    4.4. Rubric
  5. Lab 3
    5.1. Before you start
    5.2. Assignment
    5.3. Deliverables
    5.4. Rubric
  6. FAQ
  7. Quiz example
  8. Useful links

Supercomputing for Big Data - Lab Manual

Useful links

General references:

  • Git cheatsheet

Often-used API docs:

  • Spark all APIs
  • Spark Dataset API