Apache Spark | Getting started

Apache Spark is a lightning-fast cluster computing designed for fast computation. It was built on top of Hadoop MapReduce and it extends the MapReduce model to efficiently use more types of computations which includes Interactive Queries and Stream Processing.

This is an extract from this brief tutorial that explains the basics of Spark Core programming.

Inhaltsverzeichnis

Environment / Requirements

Installation on Mac OS X

Check or install java

$ java -version
java version "12.0.1" 2019-04-16
Java(TM) SE Runtime Environment (build 12.0.1+12)
Java HotSpot(TM) 64-Bit Server VM (build 12.0.1+12, mixed mode, sharing)

Check or install Scala

$ brew install scala

$ scala -version
Scala code runner version 2.13.0 -- Copyright 2002-2019, LAMP/EPFL and Lightbend, Inc.

Check or install Apache Spark

Setup environment in .bashrc

export PATH="$PATH:$SPARK_HOME/bin"

Installation on Ubuntu

Prepate Upuntu

apt update
apt upgrade

 apt-get install openjdk-8-jdk
 java -version

Links and Resources

Apache Spark in Python: Beginner’s Guide

Cookie	Duration	Description
cookielawinfo-checkbox-analytics	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Analytics".
cookielawinfo-checkbox-functional	11 months	The cookie is set by GDPR cookie consent to record the user consent for the cookies in the category "Functional".
cookielawinfo-checkbox-necessary	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookies is used to store the user consent for the cookies in the category "Necessary".
cookielawinfo-checkbox-others	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Other.
cookielawinfo-checkbox-performance	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Performance".
viewed_cookie_policy	11 months	The cookie is set by the GDPR Cookie Consent plugin and is used to store whether or not user has consented to the use of cookies. It does not store any personal data.

Developer Blog

Apache Spark | Getting started

Environment / Requirements

Installation on Mac OS X

Check or install java

Check or install Scala

Check or install Apache Spark

Setup environment in .bashrc

Installation on Ubuntu

Links and Resources

Leave a Reply
Cancel reply

Leave a Reply

Developer Blog

Environment / Requirements

Installation on Mac OS X

Check or install java

Check or install Scala

Check or install Apache Spark

Setup environment in .bashrc

Installation on Ubuntu

Links and Resources

Leave a Reply Cancel reply

Leave a Reply

Leave a Reply
Cancel reply