In this article, we document the simplest way to install Spark, rather than building from source.

Download the package

Go to the download page on the Apache Spark website: http://spark.apache.org/downloads.html

Select Spark 2.2.3 with the package type "Pre-built for Apache Hadoop 2.6", since the Hadoop version in the current environment is hadoop-2.6.0-cdh5.7.0.

wget https://archive.apache.org/dist/spark/spark-2.2.3/spark-2.2.3-bin-hadoop2.6.tgz
tar -zxvf spark-2.2.3-bin-hadoop2.6.tgz

Add environment variables

vi ~/.bash_profile

Add the following line:

export SPARK_HOME=/usr/local/spark/spark-2.2.3-bin-hadoop2.6
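
Optionally, Spark's bin directory can also be appended to PATH so that commands like spark-shell can be run from anywhere; a minimal sketch, assuming the SPARK_HOME above:

export PATH=$SPARK_HOME/bin:$PATH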

Run the source command to make the configuration take effect:

source ~/.bash_profile
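
To verify that the variable is set (the output assumes the installation path used above):

echo $SPARK_HOME
/usr/local/spark/spark-2.2.3-bin-hadoop2.6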

Start Spark

cd $SPARK_HOME

bin/spark-shell --help

Usage: ./bin/spark-shell [options]

Options:
  --master MASTER_URL         spark://host:port, mesos://host:port, yarn, or local.
  ...

As the help output shows, Spark can be started with a master URL such as spark://host:port, mesos://host:port, yarn, or local.

Let's start in local mode first; local[2] below means running Spark locally with two worker threads.

bin/spark-shell --master local[2]

Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
19/01/22 22:56:54 WARN Utils: Your hostname, localhost resolves to a loopback address: 127.0.0.1; using 192.168.1.6 instead (on interface en0)
19/01/22 22:56:54 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
19/01/22 22:56:54 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Spark context Web UI available at http://192.168.1.6:4040
Spark context available as 'sc' (master = local[2], app id = local-1548169015800).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/ '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.2.3
      /_/

Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_181)
Type in expressions to have them evaluated.
Type :help for more information.

scala>

The startup succeeded. We can also take a look at the Spark context Web UI at http://localhost:4040.
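
As a quick sanity check, we can evaluate a small computation in the shell using the sc (SparkContext) handle created at startup; a minimal sketch, where the res index may differ:

scala> sc.parallelize(1 to 100).reduce(_ + _)
res0: Int = 5050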

This article walks through the installation of HBase in detail.

Download the package

Go to the CDH Archive page: http://archive.cloudera.com/cdh5/cdh/5/

Select the HBase version to install; here we choose hbase-1.2.0-cdh5.7.0.

Download link: http://archive.cloudera.com/cdh5/cdh/5/hbase-1.2.0-cdh5.7.0.tar.gz

cd /usr/local/hbase
wget http://archive.cloudera.com/cdh5/cdh/5/hbase-1.2.0-cdh5.7.0.tar.gz

Extract the package

tar -zxvf hbase-1.2.0-cdh5.7.0.tar.gz

Add environment variables

vi ~/.bash_profile

Add the following line to the file:

export HBASE_HOME=/usr/local/hbase/hbase-1.2.0-cdh5.7.0/

Then run the source command to make the configuration take effect:

source ~/.bash_profile

echo $HBASE_HOME
/usr/local/hbase/hbase-1.2.0-cdh5.7.0/

Modify the configuration files

Go to the $HBASE_HOME directory and edit the conf/hbase-env.sh configuration file:

cd $HBASE_HOME
vi conf/hbase-env.sh

Set the JAVA_HOME parameter to point to your JDK installation:

export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.8.0_181.jdk/Contents/Home
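
On macOS, which the path above suggests, the JDK location can be discovered with the built-in java_home utility instead of guessing the path:

/usr/libexec/java_home -v 1.8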

Set HBASE_MANAGES_ZK to false so that HBase does not manage its own ZooKeeper instance, since we install and run ZooKeeper ourselves:

export HBASE_MANAGES_ZK=false

Edit the hbase-site.xml file:

<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://localhost:8020/hbase</value>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>localhost:2181</value>
  </property>
</configuration>
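
Note that the host and port in hbase.rootdir must match the fs.defaultFS setting in Hadoop's core-site.xml; assuming the HDFS setup used in this environment, that entry would look like this:

<property>
  <name>fs.defaultFS</name>
  <value>hdfs://localhost:8020</value>
</property>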

The conf/regionservers file does not need to be changed in a single-node pseudo-distributed setup; it defaults to localhost.
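
We can confirm the default content, assuming a fresh extraction:

cat conf/regionservers
localhost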

Start ZooKeeper

Before starting HBase, we need to start ZooKeeper first:

cd $ZK_HOME
sbin/zkServer.sh start

JMX enabled by default
Using config: /usr/local/zookeeper/zookeeper-3.4.5-cdh5.7.0/sbin/../conf/zoo.cfg
Starting zookeeper ... STARTED

jps

50469 QuorumPeerMain

The ZooKeeper instance is now running.
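
Optionally, we can also check its mode; for a single-node setup like this one, the status command should report standalone (the path assumes the same layout as above):

sbin/zkServer.sh status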

Start HBase

cd $HBASE_HOME
bin/start-hbase.sh
starting master, logging to /usr/local/hbase/hbase-1.2.0-cdh5.7.0//logs/hbase-simon-master-localhost.out
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
starting regionserver, logging to /usr/local/hbase/hbase-1.2.0-cdh5.7.0//logs/hbase-simon-1-regionserver-localhost.out

We can see starting master and starting regionserver in the output.

jps

50607 HMaster
50703 HRegionServer

This shows that HBase has started successfully.

cd $HADOOP_HOME

bin/hadoop fs -ls /
19/01/22 13:42:14 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 4 items
drwxr-xr-x - simon supergroup 0 2019-01-22 13:34 /hbase

We can see that the hdfs://localhost:8020/hbase directory configured earlier has been created successfully.

View HBase information through the Web UI

Enter http://localhost:60010 in the browser address bar.

HBase shell

bin/hbase shell

2019-01-22 13:41:26,984 INFO [main] Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
2019-01-22 13:41:28,153 WARN [main] util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/hbase/hbase-1.2.0-cdh5.7.0/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hadoop/hadoop-2.6.0-cdh5.7.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 1.2.0-cdh5.7.0, rUnknown, Wed Mar 23 11:46:29 PDT 2016

hbase(main):001:0>

Check the version

hbase(main):001:0> version
1.2.0-cdh5.7.0, rUnknown, Wed Mar 23 11:46:29 PDT 2016

Check the status

hbase(main):002:0> status
1 active master, 0 backup masters, 1 servers, 0 dead, 2.0000 average load

Create a table member with two column families, info and address:

hbase(main):003:0> create 'member', 'info', 'address'
0 row(s) in 2.3700 seconds

=> Hbase::Table - member

List the tables

hbase(main):004:0> list
TABLE
member
1 row(s) in 0.0380 seconds

=> ["member"]

Describe the member table

hbase(main):005:0> describe 'member'
Table member is ENABLED
member
COLUMN FAMILIES DESCRIPTION
{NAME => 'address', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING =
> 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE
=> '0'}
{NAME => 'info', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => '
NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE =>
'0'}
2 row(s) in 0.1090 seconds

If we check the HBase Web UI again, we can see a new entry under the Tables section.

Truncate the table

hbase(main):009:0> truncate 'member'
Truncating 'member' table (it may take a while):
- Disabling table...
- Truncating table...
0 row(s) in 4.0020 seconds
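
With the member table in place, we can also try basic writes and reads; a minimal sketch, where the row key row1 and the value simon are made-up example data (output omitted):

hbase(main):010:0> put 'member', 'row1', 'info:name', 'simon'
hbase(main):011:0> get 'member', 'row1'
hbase(main):012:0> scan 'member'
hbase(main):013:0> count 'member'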

After setting up a blog with Hexo, I wanted to do some simple SEO so that search engines would index the blog's pages, so I installed two Hexo plugins.

Install the sitemap plugins

npm install hexo-generator-sitemap
npm install hexo-generator-baidu-sitemap

One generates a standard sitemap; the other generates a Baidu-specific sitemap.
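
Both plugins read their output path from the site's _config.yml; the paths below are the defaults, so setting them is optional, but the top-level url field should point to your site's public address, since the generated sitemap entries are built from it. A sketch, where https://example.com is a placeholder:

url: https://example.com
sitemap:
  path: sitemap.xml
baidusitemap:
  path: baidusitemap.xml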

After the installation succeeds, start the local server:

hexo s

The sitemap pages can then be accessed at the following URLs:

http://localhost:4000/sitemap.xml
http://localhost:4000/baidusitemap.xml

Register with the Baidu Search Resource Platform

It was previously called the Baidu Webmaster Platform and has since been renamed the Search Resource Platform.

After signing in, add your website to the platform. Bind a CNAME record as required so that Baidu can verify the site belongs to you.

Submit the sitemaps

Submit the public URLs of the two sitemap files mentioned above to the Baidu platform. Once the submission succeeds, it looks as shown in the figure.