A Foolproof Guide to Deploying a CDH Cluster

CDH Environment Setup



Each machine needs at least 8 GB of memory.

Cluster Environment

Operating System

CentOS-7-x86_64-Minimal-1810.iso

  • cdh-master 192.168.69.111 4 cores, 16 GB
  • cdh-slave1 192.168.69.112 4 cores, 8 GB
  • cdh-slave2 192.168.69.113 4 cores, 8 GB

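Disable the Firewall and Edit hosts

Permanently disable SELinux

  • vim /etc/selinux/config
# Change the following setting
SELINUX=disabled

The change takes effect after a reboot; run setenforce 0 to switch to permissive mode right away.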

Disable the system firewall

Stop firewalld

  • systemctl stop firewalld.service

Prevent firewalld from starting at boot

  • systemctl disable firewalld.service

Edit the hosts file

  • vim /etc/hosts
# Add the following entries
192.168.69.111 cdh-master
192.168.69.112 cdh-slave1
192.168.69.113 cdh-slave2
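Since the same three entries have to go into /etc/hosts on every node, the edit can also be scripted. Below is a minimal sketch, assuming a POSIX-style shell with awk available; the function name add_host_entry is hypothetical and not part of any tool used in this guide. It appends an entry only when the host name is not already mapped, so running it twice does no harm.

```shell
# Hypothetical helper: append "ip name" to a hosts-style file unless
# the name already appears as a host name on a non-comment line.
add_host_entry() {
    local file="$1" ip="$2" name="$3"
    if awk -v n="$name" '
        !/^[[:space:]]*#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
        END { exit !found }' "$file" 2>/dev/null; then
        echo "skip: $name already present"
    else
        printf '%s %s\n' "$ip" "$name" >> "$file"
        echo "added: $ip $name"
    fi
}

# Usage on a real node (as root), left commented out on purpose:
#   add_host_entry /etc/hosts 192.168.69.111 cdh-master
#   add_host_entry /etc/hosts 192.168.69.112 cdh-slave1
#   add_host_entry /etc/hosts 192.168.69.113 cdh-slave2
```

Trying it against a scratch copy of the file first is a cheap way to confirm the behavior before touching /etc/hosts.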

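Passwordless SSH Between Nodes

Generate a key pair (public and private key); run this on every node

  • ssh-keygen -t rsa -P ''

Append to authorized_keys

# Append the public key to authorized_keys
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

# Fix the permissions
chmod g-w ~
chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys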

Append the slaves' public keys on the master

  • ssh cdh-slave1 cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
  • ssh cdh-slave2 cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

Copy the combined authorized_keys to the slave nodes

  • scp ~/.ssh/authorized_keys cdh-slave1:~/.ssh/authorized_keys
  • scp ~/.ssh/authorized_keys cdh-slave2:~/.ssh/authorized_keys
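Before moving on, it helps to confirm that key-based login really works in every direction. Here is a rough sketch, assuming OpenSSH; the function name check_ssh_trust and the SSH_CMD override are made up for illustration. BatchMode=yes makes ssh fail instead of falling back to a password prompt, so a broken trust setup shows up as FAILED rather than as a hanging prompt.

```shell
# Hypothetical helper: report whether key-based login works for each host.
# SSH_CMD can be overridden (for example with a stub) when testing.
SSH_CMD="${SSH_CMD:-ssh}"

check_ssh_trust() {
    local host failed=0
    for host in "$@"; do
        # BatchMode=yes disables password prompts; ConnectTimeout bounds the wait.
        if "$SSH_CMD" -o BatchMode=yes -o ConnectTimeout=5 "$host" true \
                </dev/null 2>/dev/null; then
            echo "$host: ok"
        else
            echo "$host: FAILED"
            failed=1
        fi
    done
    return "$failed"
}

# Usage from any node, left commented out on purpose:
#   check_ssh_trust cdh-master cdh-slave1 cdh-slave2
```

The function returns a non-zero status if any host fails, which makes it easy to chain in front of later steps.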

NTP Time Synchronization

Install the NTP packages on all nodes

  • yum -y install ntp

Set the time zone on all nodes

  • timedatectl set-timezone Asia/Shanghai

Start ntpd and enable it at boot

# Start ntpd
systemctl start ntpd

# Enable it at boot
systemctl enable ntpd

Configure the NTP server (master node)

  • vim /etc/ntp.conf
# Comment out the following lines
#server 0.centos.pool.ntp.org iburst
#server 1.centos.pool.ntp.org iburst
#server 2.centos.pool.ntp.org iburst
#server 3.centos.pool.ntp.org iburst

# Add the following lines (the restrict line covers the cluster subnet)
restrict 192.168.69.0 mask 255.255.255.0 nomodify notrap
server 127.127.1.0
fudge 127.127.1.0 stratum 10

Configure NTP (slave nodes)

  • vim /etc/ntp.conf
# Comment out the following lines
#server 0.centos.pool.ntp.org iburst
#server 1.centos.pool.ntp.org iburst
#server 2.centos.pool.ntp.org iburst
#server 3.centos.pool.ntp.org iburst

# Add the following lines (the server line points at the master node)
restrict 192.168.69.0 mask 255.255.255.0 nomodify notrap
server 192.168.69.111

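Restart the NTP service

  • systemctl restart ntpd

Scheduled sync on the master node (every 10 minutes)

  • crontab -e
0-59/10 * * * * /usr/sbin/ntpdate -u asia.pool.ntp.org

Manually sync with the master's clock

  • ntpdate -u 192.168.69.111

Check the synchronization status

  • ntpstat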
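ntpstat also reports its result through the exit status (0 for synchronised, 1 for not synchronised, 2 when the state cannot be determined), which is handy in scripts. A small sketch follows; the function name describe_ntpstat_status is hypothetical.

```shell
# Hypothetical helper: turn an ntpstat exit status into a short message.
describe_ntpstat_status() {
    case "$1" in
        0) echo "synchronised" ;;
        1) echo "not synchronised" ;;
        2) echo "indeterminate (is ntpd running?)" ;;
        *) echo "unexpected status $1" ;;
    esac
}

# Usage on a node (needs ntpstat installed, so it is left commented out):
#   ntpstat >/dev/null 2>&1; describe_ntpstat_status "$?"
```

Right after a restart, ntpd usually needs a few minutes before the status changes from "not synchronised", so an early failure is not necessarily a configuration problem.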

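Configure the Cloudera RPM Repository

Download the repo file

Import the repository GPG key

Download the parcel files

Install Java (64-bit)

Remove the bundled OpenJDK

List any installed Java packages, then remove each one that is reported.

  • rpm -qa | grep java

Install Java from the Cloudera Manager repository

  • sudo yum -y install oracle-j2sdk1.8

Set the environment variables

  • vim /etc/profile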
# SET JAVA PATH 
export JAVA_HOME=/usr/java/jdk1.8.0_181-cloudera/
export CLASSPATH=.:$CLASSPATH:$JAVA_HOME/lib
export PATH=$PATH:$JAVA_HOME/bin
  • source /etc/profile
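A quick sanity check after sourcing the profile can catch a mistyped JAVA_HOME early. This is only a sketch; check_java_home is a hypothetical name, and the path in the usage comment is the one set above.

```shell
# Hypothetical helper: verify that a JDK directory contains an executable bin/java.
# Falls back to $JAVA_HOME when no argument is given.
check_java_home() {
    local home="${1:-${JAVA_HOME:-}}"
    home="${home%/}"    # tolerate a trailing slash, as in the profile above
    if [ -n "$home" ] && [ -x "$home/bin/java" ]; then
        echo "ok: $home/bin/java"
    else
        echo "missing: no executable java under '$home'"
        return 1
    fi
}

# Usage, left commented out on purpose:
#   check_java_home /usr/java/jdk1.8.0_181-cloudera/
```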

If yum update fails with "Error: Delta RPMs disabled because /usr/bin/applydeltarpm not installed."

# Find the package that provides applydeltarpm, then install deltarpm
yum provides '*/applydeltarpm'
yum install deltarpm

Install Cloudera Manager Server

Install the Cloudera Manager packages

  • sudo yum -y install cloudera-manager-daemons cloudera-manager-agent cloudera-manager-server

Note: you can download the RPM packages from https://archive.cloudera.com/cm6/6.2.0/redhat7/yum/RPMS/x86_64/ in advance and install them locally.

Enable Auto-TLS

To be continued.

Install and Configure MariaDB for Cloudera Software (master node)

Install MariaDB

  • sudo yum -y install mariadb-server

Configure MariaDB

  • vim /etc/my.cnf
[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
transaction-isolation = READ-COMMITTED
# Disabling symbolic-links is recommended to prevent assorted security risks;
# to do so, uncomment this line:
symbolic-links = 0
# Settings user and group are ignored when systemd is used.
# If you need to run mysqld under a different user or group,
# customize your systemd unit file for mariadb according to the
# instructions in http://fedoraproject.org/wiki/Systemd

key_buffer = 16M
key_buffer_size = 32M
max_allowed_packet = 32M
thread_stack = 256K
thread_cache_size = 64
query_cache_limit = 8M
query_cache_size = 64M
query_cache_type = 1

max_connections = 550
#expire_logs_days = 10
#max_binlog_size = 100M

#log_bin should be on a disk with enough free space.
#Replace '/var/lib/mysql/mysql_binary_log' with an appropriate path for your
#system and chown the specified folder to the mysql user.
log_bin=/var/lib/mysql/mysql_binary_log

#In later versions of MariaDB, if you enable the binary log and do not set
#a server_id, MariaDB will not start. The server_id must be unique within
#the replicating group.
server_id=1

binlog_format = mixed

read_buffer_size = 2M
read_rnd_buffer_size = 16M
sort_buffer_size = 8M
join_buffer_size = 8M

# InnoDB settings
innodb_file_per_table = 1
innodb_flush_log_at_trx_commit = 2
innodb_log_buffer_size = 64M
innodb_buffer_pool_size = 4G
innodb_thread_concurrency = 8
innodb_flush_method = O_DIRECT
innodb_log_file_size = 512M

[mysqld_safe]
log-error=/var/log/mariadb/mariadb.log
pid-file=/var/run/mariadb/mariadb.pid

#
# include all files from the config directory
#
!includedir /etc/my.cnf.d

Start MariaDB

Enable MariaDB at boot

  • sudo systemctl enable mariadb

Start the MariaDB service

  • sudo systemctl start mariadb

Secure the installation

  • sudo /usr/bin/mysql_secure_installation
[...]
Enter current password for root (enter for none):
OK, successfully used password, moving on...
[...]
Set root password? [Y/n] Y
New password: 123
Re-enter new password: 123
[...]
Remove anonymous users? [Y/n] Y
[...]
Disallow root login remotely? [Y/n] Y
[...]
Remove test database and access to it [Y/n] Y
[...]
Reload privilege tables now? [Y/n] Y
[...]
All done! If you've completed all of the above steps, your MariaDB
installation should now be secure.

Thanks for using MariaDB!

Install the MySQL JDBC Driver for MariaDB

Download the driver package

Extract the driver package

  • tar zxvf mysql-connector-java-5.1.46.tar.gz
  • rm -rf mysql-connector-java-5.1.46.tar.gz

Copy the driver JAR

  • sudo mkdir -p /usr/share/java/
  • cd mysql-connector-java-5.1.46
  • sudo cp mysql-connector-java-5.1.46-bin.jar /usr/share/java/mysql-connector-java.jar

Create the Databases

Log in as root

  • mysql -uroot -p

Create the databases for Hive (metastore), Cloudera Manager (scm), Oozie, and Hue

  • CREATE DATABASE metastore DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
  • CREATE DATABASE scm DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
  • CREATE DATABASE oozie DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
  • CREATE DATABASE hue DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;

Create the accounts

  • GRANT ALL ON metastore.* TO 'hive'@'%' IDENTIFIED BY '123';
  • GRANT ALL ON scm.* TO 'scm'@'%' IDENTIFIED BY '123';
  • GRANT ALL ON oozie.* TO 'oozie'@'%' IDENTIFIED BY '123';
  • GRANT ALL ON hue.* TO 'hue'@'%' IDENTIFIED BY '123';

Reload the privileges

  • FLUSH PRIVILEGES;

List the databases

  • SHOW DATABASES;

Show the grants

  • SHOW GRANTS FOR 'hive'@'%';
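The CREATE DATABASE and GRANT statements follow one pattern per service, so they can be generated instead of typed. Below is a minimal sketch; the function names service_db_sql and all_service_db_sql are hypothetical, and the database and user names are the ones used in this guide.

```shell
# Hypothetical helper: print the CREATE DATABASE and GRANT statements for one service.
# Arguments: database name, user name, password.
service_db_sql() {
    printf "CREATE DATABASE %s DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;\n" "$1"
    printf "GRANT ALL ON %s.* TO '%s'@'%%' IDENTIFIED BY '%s';\n" "$1" "$2" "$3"
}

# Print the statements for the four databases used in this guide.
# Argument: the password shared by all four accounts.
all_service_db_sql() {
    local password="$1"
    service_db_sql metastore hive "$password"
    service_db_sql scm scm "$password"
    service_db_sql oozie oozie "$password"
    service_db_sql hue hue "$password"
    echo "FLUSH PRIVILEGES;"
}

# Usage on the master node, left commented out on purpose:
#   all_service_db_sql 123 | mysql -uroot -p
```

Printing the SQL first and reading it over before piping it into mysql keeps the step easy to review.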

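Set Up the Cloudera Manager Database (master node)

Command syntax: database type, database name, database user, then optionally the password

  • sudo /opt/cloudera/cm/schema/scm_prepare_database.sh mysql scm scm 123

Prepare the database (the script prompts for the password when it is omitted)

  • sudo /opt/cloudera/cm/schema/scm_prepare_database.sh mysql scm scm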

Install CDH and Other Software

[Strongly recommended] Take a snapshot of every node before continuing

Start Cloudera Manager Server (master node)

  • sudo systemctl start cloudera-scm-server

Watch the log (master node)

  • sudo tail -f /var/log/cloudera-scm-server/cloudera-scm-server.log

Wait until the following line appears: INFO WebServerImpl:com.cloudera.server.cmf.WebServerImpl: Started Jetty server.
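Instead of watching tail -f by hand, the wait can be scripted. The following is a sketch only; wait_for_log_line is a hypothetical helper that polls the file once per second with a fixed-string grep and gives up after a timeout.

```shell
# Hypothetical helper: poll a log file until a fixed string appears or a timeout expires.
# Arguments: log file, string to look for, timeout in seconds (default 600).
wait_for_log_line() {
    local log_file="$1" pattern="$2" remaining="${3:-600}"
    while :; do
        # -F treats the pattern as a literal string, -q suppresses output.
        if grep -qF -- "$pattern" "$log_file" 2>/dev/null; then
            echo "found: $pattern"
            return 0
        fi
        if [ "$remaining" -le 0 ]; then
            echo "timed out waiting for: $pattern"
            return 1
        fi
        sleep 1
        remaining=$((remaining - 1))
    done
}

# Usage on the master node (run as a user who can read the log), commented out on purpose:
#   wait_for_log_line /var/log/cloudera-scm-server/cloudera-scm-server.log \
#       "Started Jetty server." 900
```

The first start of Cloudera Manager Server can take several minutes, so a generous timeout is a reasonable default.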

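Log In to the Web UI

Cloudera Manager Server listens on port 7180 by default (here, http://cdh-master:7180).

Username: admin
Password: admin

Install the Components

- [x] Accept the End User License Terms and Conditions

Select the free Cloudera Express edition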

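Cluster Installation

Cluster Basics

  • CDH

Specify Hosts

  • Enter cdh-master, cdh-slave1, cdh-slave2 and click "Search"

Select Repository

  • Choose Public Cloudera Repository
  • Choose Parcels
  • Leave everything else at the defaults

JDK Installation Options

  • Check "Install Oracle Java SE Development Kit (JDK)"

Provide Login Credentials

  • Choose root
  • Enter the password

Install Agents

Install Parcels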

Inspect Hosts

  • Swappiness setting

    sysctl -w vm.swappiness=10
    echo 'vm.swappiness = 10' >> /etc/sysctl.conf
  • Transparent huge pages setting

    # Temporary (lasts until the next reboot)
    echo never > /sys/kernel/mm/transparent_hugepage/defrag
    echo never > /sys/kernel/mm/transparent_hugepage/enabled

    # Permanent
    vim /etc/rc.local
    # Add the following lines
    echo never > /sys/kernel/mm/transparent_hugepage/defrag
    echo never > /sys/kernel/mm/transparent_hugepage/enabled
    # Make the script executable
    chmod +x /etc/rc.d/rc.local
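Two small helpers can make these host-inspector fixes repeatable. This is a sketch under assumptions: set_sysctl_conf and thp_active_mode are hypothetical names. The first rewrites a key in a sysctl-style file instead of appending a duplicate line on every run; the second prints the active transparent-huge-pages mode, which the kernel marks with square brackets.

```shell
# Hypothetical helper: set "key = value" in a sysctl-style file, replacing any
# existing assignment of the key rather than appending a duplicate.
set_sysctl_conf() {
    local file="$1" key="$2" value="$3" tmp escaped
    tmp="$(mktemp)" || return 1
    # Escape dots so "vm.swappiness" is matched literally by the regex.
    escaped="$(printf '%s' "$key" | sed 's/\./\\./g')"
    if [ -f "$file" ]; then
        # Keep every line that does not assign this key.
        grep -Ev "^[[:space:]]*${escaped}[[:space:]]*=" "$file" > "$tmp" || true
    fi
    echo "$key = $value" >> "$tmp"
    cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Hypothetical helper: print the bracketed (active) value from a
# transparent_hugepage file such as "always madvise [never]".
thp_active_mode() {
    sed -n 's/.*\[\([a-z+]*\)\].*/\1/p' "$1"
}

# Usage on a node (as root), left commented out on purpose:
#   set_sysctl_conf /etc/sysctl.conf vm.swappiness 10
#   thp_active_mode /sys/kernel/mm/transparent_hugepage/enabled
```

After the changes above, thp_active_mode should print never for both the enabled and defrag files.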

Component List

Component                             Version
Supervisord                           3.0
Cloudera Manager Agent                6.2.0
Cloudera Manager Management Daemon    6.2.0
Flume NG                              1.9.0+cdh6.2.0
Hadoop                                3.0.0+cdh6.2.0
HDFS                                  3.0.0+cdh6.2.0
HttpFS                                3.0.0+cdh6.2.0
hadoop-kms                            3.0.0+cdh6.2.0
MapReduce 2                           3.0.0+cdh6.2.0
YARN                                  3.0.0+cdh6.2.0
HBase                                 2.1.0+cdh6.2.0
Lily HBase Indexer                    1.5+cdh6.2.0
Hive                                  2.1.1+cdh6.2.0
HCatalog                              2.1.1+cdh6.2.0
Hue                                   4.2.0+cdh6.2.0
Impala                                3.2.0+cdh6.2.0
Java 8                                1.8.0_181
Kafka                                 2.1.0+cdh6.2.0
Kite (CDH 5 only)                     1.0.0+cdh6.2.0
kudu                                  1.9.0+cdh6.2.0
Oozie                                 5.1.0+cdh6.2.0
Parquet                               1.9.0+cdh6.2.0
Pig                                   0.17.0+cdh6.2.0
sentry                                2.1.0+cdh6.2.0
Solr                                  7.4.0+cdh6.2.0
spark                                 2.4.0+cdh6.2.0
Sqoop                                 1.4.7+cdh6.2.0
ZooKeeper                             3.4.5+cdh6.2.0

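Set Up the Cluster with the Wizard

Select Services

  • Select all services
  • Select HBase, HDFS, Hive, Hue, Kafka, Oozie, YARN (MR2 Included), ZooKeeper

Customize Role Assignments

  • Defaults

Database Setup

Hive

  • MySQL > No > cdh-master > metastore > hive > 123

Oozie

  • MySQL > cdh-master > oozie > oozie > 123

Hue

  • MySQL > cdh-master > hue > hue > 123

Review Changes

  • Defaults

Command Details

Note: if you retry the installation several times during setup, delete the files under /dfs/nn first to avoid HDFS service errors.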

Summary

Troubleshooting

Recommended reading: https://blog.csdn.net/zzq900503/article/details/53393721




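Title: A Foolproof Guide to Deploying a CDH Cluster

Author: Alessa0

Published: May 17, 2019, 14:05

Last updated: August 5, 2019, 15:08

Original link: https://alessa0.cn/posts/5578f504/

License: CC BY-NC-ND 4.0. Please keep the original link and author when reposting.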
