Apache Zeppelin | Getting Started

First Steps with Zeppelin

Zeppelin and MySQL

Create a new Interpreter

Create a new interpreter, or configure the existing MySQL interpreter.

Configure MySQL Interpreter

Under artifact, add the absolute path to mysql-connector-java-8.0.19.jar.

Add/modify properties for



Prepare MySQL Database

Create a database user spark with password spark

Create a database spark and grant all permissions on it to user spark

Add demo values
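
The three steps above can be sketched as a small shell script that writes the SQL to a file. This is a minimal sketch, not necessarily the setup used here: the file name spark_demo_setup.sql, the 'localhost' host part, and the columns of the demo table are assumptions. Only the user, password, database, and table names come from this page.

```shell
# Sketch: write the database setup statements to a file for the mysql client.
SQL_FILE="${SQL_FILE:-spark_demo_setup.sql}"   # hypothetical file name

cat > "$SQL_FILE" <<'SQL'
-- User "spark" with password "spark" (the host part 'localhost' is an assumption)
CREATE USER IF NOT EXISTS 'spark'@'localhost' IDENTIFIED BY 'spark';

-- Database "spark" with all permissions granted to user "spark"
CREATE DATABASE IF NOT EXISTS spark;
GRANT ALL PRIVILEGES ON spark.* TO 'spark'@'localhost';

-- Demo table and values; the column layout is hypothetical
CREATE TABLE IF NOT EXISTS spark.demo (
  id   INT PRIMARY KEY,
  name VARCHAR(50) NOT NULL
);
INSERT IGNORE INTO spark.demo (id, name) VALUES (1, 'first'), (2, 'second');
SQL

echo "Wrote $SQL_FILE; load it as a MySQL admin user, for example:"
echo "  mysql -u root -p < $SQL_FILE"
```

If Zeppelin runs in a container or on another machine, the 'localhost' host part would probably need to change to match where the connection comes from.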

Test MySQL Connection

Create a new notebook with the MySQL interpreter

Write sample code

select * from spark.demo;


Install with Docker

docker run -p 8080:8080 --rm --name zeppelin apache/zeppelin:0.8.1

Set Docker volume options to persist notebooks and logs, for example:

docker run -p 8080:8080 --rm -v $PWD/logs:/logs -v $PWD/notebook:/notebook -e ZEPPELIN_LOG_DIR='/logs' -e ZEPPELIN_NOTEBOOK_DIR='/notebook' --name zeppelin apache/zeppelin:0.8.1
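
One detail worth noting: when a bind-mounted host directory does not exist, Docker creates it itself, typically owned by root. A possible wrapper, sketched below, creates both directories first and then issues the same command. The run helper and the DRY_RUN switch are hypothetical names introduced for illustration. By default the sketch only prints the command so it can be reviewed.

```shell
# Sketch: prepare the host directories, then start Zeppelin with both mounted.
DRY_RUN="${DRY_RUN:-1}"   # hypothetical switch: 1 prints the command, 0 runs it

# Print the command in dry-run mode, execute it otherwise.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Create the bind-mount sources up front so they belong to the current user.
mkdir -p "$PWD/logs" "$PWD/notebook"

run docker run -p 8080:8080 --rm \
  -v "$PWD/logs:/logs" \
  -v "$PWD/notebook:/notebook" \
  -e ZEPPELIN_LOG_DIR=/logs \
  -e ZEPPELIN_NOTEBOOK_DIR=/notebook \
  --name zeppelin apache/zeppelin:0.8.1
```

Running it with DRY_RUN=0 starts the container; the published port 8080 is the same as in the commands above.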

Install in a Vagrant box

Setup base Vagrant Box

vagrant init ubuntu/trusty64
vagrant up
vagrant ssh

Update Operating System

sudo apt-get update -y
sudo apt-get upgrade -y

Install the Vagrant Key

Vagrant commands on the host machine communicate with the guest server over SSH, and this only works if the guest has this “insecure vagrant key” installed. It is called “insecure” because the key pair is publicly available: anyone who can reach a box that trusts this key can log in to it.

mkdir -p /home/vagrant/.ssh
chmod 0700 /home/vagrant/.ssh
wget --no-check-certificate \
    https://raw.github.com/mitchellh/vagrant/master/keys/vagrant.pub \
    -O /home/vagrant/.ssh/authorized_keys
chmod 0600 /home/vagrant/.ssh/authorized_keys
chown -R vagrant /home/vagrant/.ssh

Install Zeppelin and required Software

Detailed description can be found here.

sudo apt-get install -y gcc build-essential linux-headers-server
sudo apt-get install -y git
sudo apt-get install -y openjdk-7-jdk
sudo apt-get install -y npm
sudo apt-get install -y libfontconfig
sudo apt-get install -y r-base-dev
sudo apt-get install -y r-cran-evaluate
git clone https://github.com/apache/zeppelin.git
sudo apt-get -y install maven
cd zeppelin
mvn clean package -DskipTests -Pspark-2.0 -Phadoop-2.4 -Pr -Pscala-2.11

Configure Zeppelin

Apache Spark | Getting started

Apache Spark is a cluster computing framework designed for fast computation. It builds on the ideas of Hadoop MapReduce and extends the MapReduce model to efficiently support more types of computation, including interactive queries and stream processing.

This is an extract from a brief tutorial that explains the basics of Spark Core programming.

Environment / Requirements

Installation on Mac OS X

Check or install Java

$ java -version
java version "12.0.1" 2019-04-16
Java(TM) SE Runtime Environment (build 12.0.1+12)
Java HotSpot(TM) 64-Bit Server VM (build 12.0.1+12, mixed mode, sharing)

Check or install Scala

$ brew install scala
$ scala -version
Scala code runner version 2.13.0 -- Copyright 2002-2019, LAMP/EPFL and Lightbend, Inc.

Check or install Apache Spark

Set up the environment in .bashrc

export PATH="$PATH:$SPARK_HOME/bin"
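
The export above only has an effect if SPARK_HOME is already defined. A minimal sketch of a fuller .bashrc fragment follows. The location /opt/spark is a hypothetical placeholder, not a path taken from this page; it should point to wherever Spark was unpacked.

```shell
# ~/.bashrc sketch: define SPARK_HOME, then extend PATH once.
# /opt/spark is a hypothetical location; replace it with the real Spark directory.
export SPARK_HOME="${SPARK_HOME:-/opt/spark}"

# Append $SPARK_HOME/bin only if it is not already on PATH,
# so re-sourcing .bashrc does not add duplicate entries.
case ":$PATH:" in
  *":$SPARK_HOME/bin:"*) ;;
  *) export PATH="$PATH:$SPARK_HOME/bin" ;;
esac
```

The case guard is a design choice rather than a requirement: a plain export works too, but PATH then grows each time the file is sourced.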

Installation on Ubuntu

Prepare Ubuntu

sudo apt update
sudo apt upgrade
sudo apt-get install openjdk-8-jdk
java -version
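
Since each Spark release documents which Java versions it supports, it can help to extract the major version from the java -version output instead of reading it by eye. The helper below is a sketch; java_major is a hypothetical name, and it assumes the version string appears in double quotes on the first output line, as in the Java output shown earlier on this page.

```shell
# Print the major version from a Java version string such as "12.0.1" or "1.8.0_252".
java_major() {
  v=$1
  case "$v" in
    1.*) v=${v#1.} ;;    # legacy scheme: 1.8.0_252 means Java 8
  esac
  echo "${v%%[!0-9]*}"   # keep only the leading digits
}

# `java -version` writes to stderr; take the quoted string from its first line.
if command -v java >/dev/null 2>&1; then
  raw=$(java -version 2>&1 | head -n 1 | sed 's/^[^"]*"\([^"]*\)".*/\1/')
  echo "Java major version: $(java_major "$raw")"
else
  echo "java not found; install a JDK first"
fi
```

For example, java_major 1.8.0_252 prints 8 and java_major 12.0.1 prints 12.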

Links and Resources