
Installing Spark

In this post we record the simplest way to install Spark, using a pre-built package rather than compiling from source.

Download the package

Go to the download page of the official Apache Spark site: http://spark.apache.org/downloads.html

Choose Spark 2.2.3 with the package type "Pre-built for Apache Hadoop 2.6", since the Hadoop version in the current environment is hadoop-2.6.0-cdh5.7.0.

wget https://archive.apache.org/dist/spark/spark-2.2.3/spark-2.2.3-bin-hadoop2.6.tgz
tar -zxvf spark-2.2.3-bin-hadoop2.6.tgz

Add environment variables

vi ~/.bash_profile

Add the following content:

export SPARK_HOME=/usr/local/spark/spark-2.2.3-bin-hadoop2.6

Run the source command to make the configuration take effect:

source ~/.bash_profile

Start Spark

cd $SPARK_HOME

bin/spark-shell --help

Usage: ./bin/spark-shell [options]

Options:
--master MASTER_URL spark://host:port, mesos://host:port, yarn, or local.
...

From the output above we can see that Spark can be started with a master of spark://host:port, mesos://host:port, yarn, or local.

We'll start with local mode first, where local[2] means running locally with 2 worker threads.

bin/spark-shell --master local[2]

Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
19/01/22 22:56:54 WARN Utils: Your hostname, localhost resolves to a loopback address: 127.0.0.1; using 192.168.1.6 instead (on interface en0)
19/01/22 22:56:54 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
19/01/22 22:56:54 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Spark context Web UI available at http://192.168.1.6:4040
Spark context available as 'sc' (master = local[2], app id = local-1548169015800).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.2.3
      /_/

Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_181)
Type in expressions to have them evaluated.
Type :help for more information.

scala>

The shell started successfully. We can also take a look at the Spark context Web UI at http://localhost:4040 (or the address printed in the log above, http://192.168.1.6:4040).
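
To confirm that the shell really works, we can run a small computation using the 'sc' SparkContext and the 'spark' SparkSession that spark-shell created for us (both are announced in the startup log above). The snippet below is only a minimal sanity-check sketch; the numbers and variable names are illustrative.

println(sc.master)                      // should print local[2], the master we started with

// Build a small RDD from a local collection and sum it with the pre-created SparkContext 'sc'
val nums = sc.parallelize(1 to 100)
val total = nums.reduce(_ + _)          // expected result: 5050
println(s"sum = $total")

// The pre-created SparkSession 'spark' works as well, e.g. for a tiny DataFrame
spark.range(5).show()

When you are done, you can leave the shell with :quit.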