Hadoop Cluster Setup (7): Hive

Hive: Don't Mistake Me for HBase

Hive Installation and Configuration

Installing Hive

On Master and Client
  • cd /mnt/hgfs/Hadoop
  • cp apache-hive-2.3.4-bin.tar.gz /usr/local/src/
  • cd /usr/local/src/
  • tar zxvf apache-hive-2.3.4-bin.tar.gz
  • rm -rf apache-hive-2.3.4-bin.tar.gz

Configure the Hive environment variables:

  • vim ~/.bashrc
# Add the following lines
# Set Hive PATH
export HIVE_HOME=/usr/local/src/apache-hive-2.3.4-bin
export PATH=$PATH:$HIVE_HOME/bin
  • source ~/.bashrc
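
After sourcing ~/.bashrc, it can be worth confirming that the Hive bin directory really landed on PATH. The following is a minimal sketch; the helper name path_contains is hypothetical and not part of Hive.

```shell
# Hypothetical helper: return 0 if directory $1 is an entry of the
# colon-separated list $2 (for example "$PATH").
path_contains() {
    case ":$2:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Possible usage after `source ~/.bashrc`:
#   path_contains "$HIVE_HOME/bin" "$PATH" && echo "Hive is on PATH"
```

Wrapping both sides in colons lets the match cover the first and last PATH entries without special cases.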

Installing the MySQL Driver

Master only
  • cd /mnt/hgfs/Hadoop
  • cp mysql-connector-java-5.1.47.tar.gz /usr/local/src/
  • cd /usr/local/src/
  • tar zxvf mysql-connector-java-5.1.47.tar.gz
  • rm -rf mysql-connector-java-5.1.47.tar.gz
  • cp mysql-connector-java-5.1.47/mysql-connector-java-5.1.47-bin.jar /usr/local/src/apache-hive-2.3.4-bin/lib/

Replace the jline jar (Hive and Hadoop ship different versions):

  • cp apache-hive-2.3.4-bin/lib/jline-2.12.jar /usr/local/src/hadoop-2.8.5/share/hadoop/yarn/lib/
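
A quick check that both jars ended up where the steps above put them can save a confusing startup failure later. This is only a sketch; has_jar is a hypothetical helper, and the directories in the usage comment are the ones used in this post.

```shell
# Hypothetical helper: return 0 if directory $1 holds at least one
# file whose name matches $2*.jar.
has_jar() {
    for f in "$1"/"$2"*.jar; do
        # An unmatched glob stays literal, so test that the file exists.
        [ -e "$f" ] && return 0
    done
    return 1
}

# Possible usage:
#   has_jar /usr/local/src/apache-hive-2.3.4-bin/lib mysql-connector-java \
#       || echo "MySQL driver jar is missing from Hive lib"
#   has_jar /usr/local/src/hadoop-2.8.5/share/hadoop/yarn/lib jline-2.12 \
#       || echo "jline 2.12 is missing from the YARN lib directory"
```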

Configuring Hive

On Master and Client
  • cd apache-hive-2.3.4-bin

Create the temp, log, and warehouse directories:

  • mkdir -p data/hive/log
  • mkdir -p data/hive/tmp
  • mkdir -p data/hive/warehouse
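
The three mkdir calls can also be folded into one loop that takes the Hive install directory as an argument, so it works from any working directory. A small sketch; make_hive_dirs is a hypothetical name.

```shell
# Hypothetical helper: create data/hive/{log,tmp,warehouse} under the
# Hive install directory given as $1.
make_hive_dirs() {
    for sub in log tmp warehouse; do
        mkdir -p "$1/data/hive/$sub" || return 1
    done
}

# Possible usage:
#   make_hive_dirs /usr/local/src/apache-hive-2.3.4-bin
```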

Configuration files:

  • cd conf
  • cp hive-env.sh.template hive-env.sh
  • vim hive-env.sh
# Add the following lines
export JAVA_HOME=/usr/local/src/jdk1.8.0_212
export HADOOP_HOME=/usr/local/src/hadoop-2.8.5
export HIVE_HOME=/usr/local/src/apache-hive-2.3.4-bin
export HIVE_CONF_DIR=/usr/local/src/apache-hive-2.3.4-bin/conf
export HIVE_AUX_JARS_PATH=/usr/local/src/apache-hive-2.3.4-bin/lib
  • cp hive-default.xml.template hive-site.xml
  • vim hive-site.xml
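# Add the following (the copied template already contains these property names, so edit the existing entries if you prefer to avoid duplicates)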

<configuration>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://master:3306/hive?createDatabaseIfNotExist=true&amp;useSSL=false</value>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>alessa0</value>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>{password}</value>
</property>
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/usr/local/src/apache-hive-2.3.4-bin/data/hive/warehouse</value>
</property>
<property>
<name>hive.exec.scratchdir</name>
<value>/usr/local/src/apache-hive-2.3.4-bin/data/hive/tmp</value>
</property>
<property>
<name>hive.querylog.location</name>
<value>/usr/local/src/apache-hive-2.3.4-bin/data/hive/log</value>
</property>
</configuration>
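
Because hive-site.xml is long, reading a single property back out is a handy way to confirm an edit took effect. The sketch below assumes each <name> and <value> sits on its own line, as in the listing above; get_hive_prop is a hypothetical helper, not a Hive tool.

```shell
# Hypothetical helper: print the <value> that follows <name>$2</name>
# in file $1. Assumes name and value are on separate lines.
get_hive_prop() {
    awk -v want="<name>$2</name>" '
        index($0, want) { found = 1; next }
        found && /<value>/ {
            sub(/.*<value>/, "")
            sub(/<\/value>.*/, "")
            print
            exit
        }
    ' "$1"
}

# Possible usage:
#   get_hive_prop "$HIVE_HOME/conf/hive-site.xml" javax.jdo.option.ConnectionURL
```

If the same property name appears more than once, only the first occurrence is printed.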

In vim, replace ${system:java.io.tmpdir} with /usr/local/src/apache-hive-2.3.4-bin/data/hive/tmp:

:%s/${system:java.io.tmpdir}/\/usr\/local\/src\/apache-hive-2.3.4-bin\/data\/hive\/tmp/g 

Replace ${system:user.name} with the actual user name (alessa0 here):

:%s/${system:user.name}/alessa0/g
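
The two vim substitutions can also be done non-interactively with sed, which helps when the same edit has to be repeated on several nodes. This is a sketch under the assumption that neither replacement string contains "#" or "&"; fix_placeholders is a hypothetical name.

```shell
# Hypothetical helper: replace the two ${system:...} placeholders in file $1.
#   $2 = temp directory, $3 = user name
# Assumes $2 and $3 contain neither '#' (the sed delimiter) nor '&'.
fix_placeholders() {
    sed -e "s#\${system:java\.io\.tmpdir}#$2#g" \
        -e "s#\${system:user\.name}#$3#g" \
        "$1" > "$1.new" && mv "$1.new" "$1"
}

# Possible usage:
#   fix_placeholders hive-site.xml \
#       /usr/local/src/apache-hive-2.3.4-bin/data/hive/tmp alessa0
```

Writing to a temporary file and renaming it avoids relying on `sed -i`, whose syntax differs between sed implementations.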

Initializing Hive (MySQL)

  • schematool -dbType mysql -initSchema
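
Once the schema has been initialized, schematool's -info option reports the schema version stored in the metastore database, which makes a reasonable sanity check. A minimal sketch; the wrapper name check_schema is hypothetical.

```shell
# Hypothetical wrapper: show metastore schema information if schematool
# is available, otherwise report the problem and return 127.
check_schema() {
    if command -v schematool >/dev/null 2>&1; then
        schematool -dbType mysql -info
    else
        echo "schematool not found on PATH" >&2
        return 127
    fi
}

# Possible usage:
#   check_schema || echo "metastore schema check failed"
```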

Configuring HiveServer2

Master only
  • vim hive-site.xml
# Add the following inside <configuration>

<property>
<name>hive.server2.thrift.port</name>
<value>10000</value>
</property>
<property>
<name>hive.server2.thrift.bind.host</name>
<value>master</value>
</property>
<property>
<name>hive.server2.enable.doAs</name>
<value>false</value>
</property>
<property>
<name>hive.metastore.uris</name>
<value>thrift://master:9083</value>
</property>
<property>
<name>hive.support.concurrency</name>
<value>true</value>
</property>
<property>
<name>hive.zookeeper.quorum</name>
<value>master:2181,slave1:2181,slave2:2181</value>
</property>
<property>
<name>hive.server2.webui.host</name>
<value>master</value>
</property>
<property>
<name>hive.server2.webui.port</name>
<value>10002</value>
</property>

Starting the Server

Master only

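Start the metastore service (the logs directory is not created by the earlier steps, so create it first):

  • mkdir -p /usr/local/src/apache-hive-2.3.4-bin/logs
  • nohup hive --service metastore >> /usr/local/src/apache-hive-2.3.4-bin/logs/hivelog.log 2>&1 &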

Start the hiveserver2 service

  • nohup hiveserver2 1>/usr/local/src/apache-hive-2.3.4-bin/logs/hiveserver.log 2>/usr/local/src/apache-hive-2.3.4-bin/logs/hiveserver.err &
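
Both services start in the background and can take a while before they accept connections, so polling their ports (9083 for the metastore and 10000 for hiveserver2, per the configuration above) is one way to know when they are ready. The sketch below relies on bash's /dev/tcp redirection and the coreutils timeout command; port_open and wait_for_port are hypothetical helpers.

```shell
# Hypothetical helper: return 0 if a TCP connection to host $1, port $2
# succeeds within 2 seconds. Needs bash and coreutils `timeout`.
port_open() {
    timeout 2 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Hypothetical helper: poll host $1, port $2 until it opens or $3
# attempts (one second apart) are used up.
wait_for_port() {
    tries=$3
    while [ "$tries" -gt 0 ]; do
        port_open "$1" "$2" && return 0
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}

# Possible usage:
#   wait_for_port master 9083 60  && echo "metastore is up"
#   wait_for_port master 10000 60 && echo "hiveserver2 is up"
```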

Testing

  • Web UI: http://master:10002/

Connecting from the Client

Client only

Start beeline:

  • beeline -u "jdbc:hive2://master:10000" -n alessa0 -p 1008

Exit beeline:

  • !q
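
For scripted checks, beeline can also run a single statement and exit by passing it with -e instead of opening an interactive session. A small sketch; hive2_url is a hypothetical helper, and the password in the usage comment is a placeholder.

```shell
# Hypothetical helper: build a HiveServer2 JDBC URL from host $1 and port $2.
hive2_url() {
    printf 'jdbc:hive2://%s:%s\n' "$1" "$2"
}

# Possible usage (-u, -n, -p and -e are standard beeline options):
#   beeline -u "$(hive2_url master 10000)" -n alessa0 -p '{password}' \
#       -e 'SHOW DATABASES;'
```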