Flink E-commerce Group-Buying Project (Part 0): Environment Setup

Flink Project Environment Setup



Chapter 1: Big Data Cluster Setup

Installing Cluster Components

Install the JDK

  • cd /mnt/hgfs/aboutyun
  • cp jdk-8u212-linux-x64.tar.gz /usr/aboutyun
  • cd /usr/aboutyun
  • tar zxvf jdk-8u212-linux-x64.tar.gz
  • rm -rf jdk-8u212-linux-x64.tar.gz
  • mv jdk1.8.0_212 jdk

#Configure JDK environment variables

  • vim /etc/profile
# SET JAVA PATH 
export JAVA_HOME=/usr/aboutyun/jdk
export CLASSPATH=.:$CLASSPATH:$JAVA_HOME/lib
export PATH=$PATH:$JAVA_HOME/bin
  • source /etc/profile
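
After `source /etc/profile`, it is worth sanity-checking that the exports resolved; a minimal sketch that recreates the variables above and verifies the JDK bin directory landed on PATH:

```shell
# Recreate the exports from /etc/profile and check PATH composition
JAVA_HOME=/usr/aboutyun/jdk
CLASSPATH=.:$CLASSPATH:$JAVA_HOME/lib
PATH=$PATH:$JAVA_HOME/bin
case ":$PATH:" in
  *":/usr/aboutyun/jdk/bin:"*) echo "JDK on PATH" ;;
  *)                           echo "JDK missing from PATH" ;;
esac
# prints: JDK on PATH
```

On the real machine, `java -version` should now report 1.8.0_212.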

Install Scala

  • cd /mnt/hgfs/aboutyun
  • cp scala-2.11.12.tgz /usr/aboutyun
  • cd /usr/aboutyun
  • tar zxvf scala-2.11.12.tgz
  • rm -rf scala-2.11.12.tgz
  • mv scala-2.11.12 scala

#Configure Scala environment variables

  • vim /etc/profile
# SET SCALA PATH 
export SCALA_HOME=/usr/aboutyun/scala
export PATH=$PATH:$SCALA_HOME/bin
  • source /etc/profile

Install Miniconda3

  • cd /mnt/hgfs/aboutyun
  • cp Miniconda3-latest-Linux-x86_64.sh /usr/local/src/
  • cd /usr/local/src
  • sudo yum -y install bzip2
  • sh Miniconda3-latest-Linux-x86_64.sh
  • rm -rf Miniconda3-latest-Linux-x86_64.sh

#Apply the environment variables:

  • source ~/.bashrc

#Update the conda environment:

  • conda update --all

Install ZooKeeper

Slave nodes only
  • cd /mnt/hgfs/aboutyun
  • cp zookeeper-3.4.9.tar.gz /usr/aboutyun/
  • cd /usr/aboutyun/
  • tar zxvf zookeeper-3.4.9.tar.gz
  • rm -rf zookeeper-3.4.9.tar.gz
  • mv zookeeper-3.4.9 zookeeper
  • cd zookeeper

#Configure ZooKeeper environment variables

  • vim /etc/profile
# SET ZOOKEEPER PATH 
export ZOOKEEPER_HOME=/usr/aboutyun/zookeeper
export PATH=$PATH:$ZOOKEEPER_HOME/bin
  • source /etc/profile

#Edit the ZooKeeper configuration

  • mkdir data
  • mkdir logs
  • cd conf
  • cp zoo_sample.cfg zoo.cfg
  • vim zoo.cfg
dataDir=/usr/aboutyun/zookeeper/data 
dataLogDir=/usr/aboutyun/zookeeper/logs
server.1=flink-slave1:2888:3888
server.2=flink-slave2:2888:3888
server.3=flink-slave3:2888:3888

#Set each node's ID

#Slave1 
echo "1" > /usr/aboutyun/zookeeper/data/myid
#Slave2
echo "2" > /usr/aboutyun/zookeeper/data/myid
#Slave3
echo "3" > /usr/aboutyun/zookeeper/data/myid
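
The three `echo` commands above differ only in the hostname suffix; a small loop can derive each slave's `myid` from its hostname (a sketch, assuming the `flink-slaveN` naming used throughout this post):

```shell
# Derive each ZooKeeper myid from the numeric hostname suffix
for host in flink-slave1 flink-slave2 flink-slave3; do
  id=${host#flink-slave}     # strip the prefix, leaving 1, 2, 3
  echo "$host -> myid=$id"   # on each node: echo "$id" > /usr/aboutyun/zookeeper/data/myid
done
# prints:
# flink-slave1 -> myid=1
# flink-slave2 -> myid=2
# flink-slave3 -> myid=3
```

The id must match the `server.N` entry for that host in zoo.cfg.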

#Start the ZooKeeper service

  • zkServer.sh start

#Check the running status

  • zkServer.sh status

  • jps

#Stop the ZooKeeper service

  • zkServer.sh stop

Install Hadoop

  • cd /mnt/hgfs/aboutyun
  • cp hadoop-2.7.0.tar.gz /usr/aboutyun/
  • cd /usr/aboutyun/
  • tar zxvf hadoop-2.7.0.tar.gz
  • rm -rf hadoop-2.7.0.tar.gz
  • mv hadoop-2.7.0 hadoop
  • cd hadoop

#Configure Hadoop environment variables

  • vim /etc/profile
# SET HADOOP PATH 
export HADOOP_HOME=/usr/aboutyun/hadoop
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
  • source /etc/profile

#Create the temporary and data directories

  • mkdir -p /usr/aboutyun/hadoop/dfs/name
  • mkdir -p /usr/aboutyun/hadoop/dfs/data
  • mkdir -p /usr/aboutyun/hadoop/tmp/dfs
  • mkdir -p /usr/aboutyun/hadoop/journal
  • mkdir -p /usr/aboutyun/hadoop/yarn/logs

#Edit the Hadoop configuration files

  • cd etc/hadoop
  • vim hadoop-env.sh
export JAVA_HOME=/usr/aboutyun/jdk
  • vim yarn-env.sh
export JAVA_HOME=/usr/aboutyun/jdk
  • vim slaves
flink-slave1 
flink-slave2
flink-slave3
  • vim core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://mycluster</value>
<description>The name of the default file system. A URI whose scheme and authority determine the FileSystem implementation.</description>
</property>
<property>
<name>ha.zookeeper.quorum</name>
<value>flink-slave1:2181,flink-slave2:2181,flink-slave3:2181</value>
<description>A comma-separated list of ZooKeeper server addresses, used by the ZKFailoverController for automatic failover.</description>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/usr/aboutyun/hadoop/tmp</value>
<description>Base directory for temporary data.</description>
</property>
<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence
shell(/bin/true)</value>
<description>A list of fencing methods to use for service fencing. May contain built-in methods (e.g. shell and sshfence) or user-defined ones.</description>
</property>
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/home/aboutyun/.ssh/id_rsa</value>
<description>The SSH private key files to use with the built-in sshfence fencer.</description>
</property>
<property>
<name>io.file.buffer.size</name>
<value>131072</value>
<description>The size of the read/write buffer used in SequenceFiles.</description>
</property>
<property>
<name>ipc.client.connect.max.retries</name>
<value>100</value>
<description>The number of times a client retries to establish a server connection.</description>
</property>
<property>
<name>ipc.client.connect.retry.interval</name>
<value>10000</value>
<description>The number of milliseconds a client waits before retrying to establish a server connection.</description>
</property>
</configuration>
  • vim hdfs-site.xml
<configuration>
<property>
<name>dfs.nameservices</name>
<value>mycluster</value>
</property>
<property>
<name>dfs.ha.namenodes.mycluster</name>
<value>flink-master1,flink-master2</value>
<description>Comma-separated list of NameNodes for the given nameservice.</description>
</property>
<property>
<name>dfs.namenode.rpc-address.mycluster.flink-master1</name>
<value>flink-master1:8020</value>
</property>
<property>
<name>dfs.namenode.rpc-address.mycluster.flink-master2</name>
<value>flink-master2:8020</value>
</property>
<property>
<name>dfs.namenode.http-address.mycluster.flink-master1</name>
<value>flink-master1:50070</value>
</property>
<property>
<name>dfs.namenode.http-address.mycluster.flink-master2</name>
<value>flink-master2:50070</value>
</property>
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://flink-slave1:8485;flink-slave2:8485;flink-slave3:8485/mycluster</value>
<description>The directory on shared storage between the NameNodes in an HA cluster. It is written by the active NameNode and read by the standby to keep the namespaces synchronized.</description>
</property>
<property>
<name>dfs.client.failover.proxy.provider.mycluster</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
<description>The Java class that DFS clients use to determine which NameNode is currently active and serving client requests.</description>
</property>
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
<description>Whether automatic failover is enabled.</description>
</property>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>dfs.permissions.enabled</name>
<value>false</value>
<description>If "true", enable permission checking in HDFS. If "false", permission checking is turned off, but all other behavior is unchanged.</description>
</property>
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/usr/aboutyun/hadoop/journal</value>
<description>The path on local disk where the JournalNode stores its data.</description>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:///usr/aboutyun/hadoop/dfs/name</value>
<description>Local storage path for the NameNode.</description>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:///usr/aboutyun/hadoop/dfs/data</value>
<description>Local storage path for the DataNode.</description>
</property>
<property>
<name>dfs.blocksize</name>
<value>268435456</value>
<description>HDFS block size of 256MB for large file systems.</description>
</property>
<property>
<name>dfs.namenode.handler.count</name>
<value>100</value>
<description>The number of NameNode server threads.</description>
</property>
</configuration>
  • mv mapred-site.xml.template mapred-site.xml
  • vim mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
<description>Run MapReduce on YARN.</description>
</property>
<property>
<name>mapreduce.map.memory.mb</name>
<value>512</value>
<description>Physical memory limit for each Map task.</description>
</property>
<property>
<name>mapreduce.reduce.memory.mb</name>
<value>512</value>
<description>Physical memory limit for each Reduce task.</description>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>0.0.0.0:10020</value>
<description>MapReduce JobHistory server IPC host:port.</description>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>0.0.0.0:19888</value>
<description>MapReduce JobHistory server web UI host:port.</description>
</property>
<property>
<name>mapreduce.application.classpath</name>
<value>
/usr/aboutyun/hadoop/etc/hadoop,
/usr/aboutyun/hadoop/share/hadoop/common/*,
/usr/aboutyun/hadoop/share/hadoop/common/lib/*,
/usr/aboutyun/hadoop/share/hadoop/hdfs/*,
/usr/aboutyun/hadoop/share/hadoop/hdfs/lib/*,
/usr/aboutyun/hadoop/share/hadoop/mapreduce/*,
/usr/aboutyun/hadoop/share/hadoop/mapreduce/lib/*,
/usr/aboutyun/hadoop/share/hadoop/yarn/*,
/usr/aboutyun/hadoop/share/hadoop/yarn/lib/*
</value>
</property>
</configuration>
  • vim yarn-site.xml
<configuration>
<!-- Site specific YARN configuration properties-->
<property>
<name>yarn.resourcemanager.recovery.enabled</name>
<value>true</value>
<description>Enable the RM to recover state after starting. If true, yarn.resourcemanager.store.class must be specified.</description>
</property>
<property>
    <name>yarn.resourcemanager.ha.automatic-failover.enabled</name>
    <value>true</value>
</property>
<property>
<name>yarn.resourcemanager.store.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
<description>The class used as the persistent store.</description>
</property>
<property>
<name>yarn.resourcemanager.zk-address</name>
<value>flink-slave1:2181,flink-slave2:2181,flink-slave3:2181</value>
<description>Address of the ZooKeeper service; separate multiple addresses with commas.</description>
</property>
<property>
<name>yarn.resourcemanager.ha.enabled</name>
<value>true</value>
<description>Enable RM high availability. When enabled, (1) by default the RM starts in standby mode and transitions to active when prompted; (2) the nodes in the RM ensemble are listed in yarn.resourcemanager.ha.rm-ids; (3) each RM's id comes from yarn.resourcemanager.ha.id if it is explicitly specified, or can be determined by matching yarn.resourcemanager.address.</description>
</property>
<property>
<name>yarn.resourcemanager.ha.rm-ids</name>
<value>rm1,rm2</value>
<description>List of RM nodes in the cluster when HA is enabled. Minimum of 2.</description>
</property>
<property>
<name>yarn.resourcemanager.webapp.address.rm1</name>
<value>flink-master1:8088</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address.rm2</name>
<value>flink-master2:8088</value>
</property>
<property>
<name>yarn.resourcemanager.cluster-id</name>
<value>mycluster-yarn-ha</value>
<description>Id of the HA cluster, used to create znodes on ZooKeeper and to distinguish Hadoop clusters that share the same ZooKeeper ensemble.</description>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm1</name>
<value>flink-master1</value>
<description>Hostname.</description>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm2</name>
<value>flink-master2</value>
<description>Hostname.</description>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
<description>Reducers fetch data via mapreduce_shuffle.</description>
</property>
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>2048</value>
<description>Memory available on each node, in MB.</description>
</property>
<property>
<name>yarn.nodemanager.resource.cpu-vcores</name>
<value>2</value>
<description>CPU cores available on each node.</description>
</property>
<property>
<name>yarn.scheduler.minimum-allocation-mb</name>
<value>512</value>
<description>Minimum memory a single task can request; default 1024MB.</description>
</property>
<property>
<name>yarn.scheduler.maximum-allocation-mb</name>
<value>1024</value>
<description>Maximum memory a single task can request; default 8192MB.</description>
</property>
<property>
<name>yarn.scheduler.minimum-allocation-vcores</name>
<value>1</value>
<description>Minimum of 1 core per allocation (the default).</description>
</property>
<property>
<name>yarn.scheduler.maximum-allocation-vcores</name>
<value>2</value>
<description>Maximum of 2 cores per allocation.</description>
</property>
<property>
<name>yarn.nodemanager.pmem-check-enabled</name>
<value>false</value>
</property>
<property>
<name>yarn.nodemanager.vmem-check-enabled</name>
<value>false</value>
</property>
<property>
<name>yarn.log-aggregation-enable</name>
<value>true</value>
<description>Whether to enable log aggregation.</description>
</property>
<property>
<name>yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds</name>
<value>-1</value>
<description>Defines how often the NM wakes up to upload log files. The default is -1: logs are uploaded when the application finishes. Setting this uploads logs periodically while the application runs. The minimum rolling interval is 3600 seconds.</description>
</property>
<property>
<name>yarn.log.server.url</name>
<value>http://flink-master1:19888/jobhistory/logs</value>
<description>Address of the log server.</description>
</property>
<property>
<name>yarn.log-aggregation.retain-seconds</name>
<value>-1</value>
<description>How long to keep aggregated logs before deleting them. -1 disables deletion. Unit: seconds.</description>
</property>
<property>
<name>yarn.nodemanager.log-dirs</name>
<value>/usr/aboutyun/hadoop/yarn/logs/</value>
<description>Local path where the NodeManager stores container logs.</description>
</property>
<property>
<name>yarn.nodemanager.remote-app-log-dir</name>
<value>/tmp/logs</value>
<description>Remote (HDFS) directory to which container logs are aggregated.</description>
</property>
<property>
<name>yarn.resourcemanager.am.max-attempts</name>
<value>4</value>
<description>The maximum number of application master execution attempts.</description>
</property>
</configuration>
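
The memory settings above bound how many containers each NodeManager can host; a quick arithmetic check using the configured values:

```shell
# yarn.nodemanager.resource.memory-mb=2048, allocations bounded to 512..1024 MB
node_mem=2048
min_alloc=512
max_alloc=1024
echo "containers per node at minimum size: $((node_mem / min_alloc))"   # 4
echo "containers per node at maximum size: $((node_mem / max_alloc))"   # 2
```

So each node runs between 2 and 4 containers, which is sized for a small VM-based test cluster.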
  • cd /usr/aboutyun/hadoop/sbin

#Edit the start-dfs.sh and stop-dfs.sh scripts

# Below the opening #!/usr/bin/env bash line, add the following:
HDFS_DATANODE_USER=root
HDFS_DATANODE_SECURE_USER=hdfs
HDFS_ZKFC_USER=root
HDFS_JOURNALNODE_USER=root
HDFS_NAMENODE_USER=root
HDFS_SECONDARYNAMENODE_USER=root

#Edit the start-yarn.sh and stop-yarn.sh scripts

# Below the opening #!/usr/bin/env bash line, add the following:
YARN_RESOURCEMANAGER_USER=root
HADOOP_SECURE_DN_USER=yarn
YARN_NODEMANAGER_USER=root

#Start the JournalNode cluster

  • hadoop-daemon.sh start journalnode

#Format the NameNode (master1 only)

  • hadoop namenode -format

#Format ZooKeeper (master1 only)

  • hdfs zkfc -formatZK

#Start the NameNode (master1 only)

  • hadoop-daemon.sh start namenode

#Copy the NameNode metadata to the standby NameNode (master2 only)

  • hdfs namenode -bootstrapStandby
  • hadoop-daemon.sh start namenode

#Start the Hadoop cluster

  • start-dfs.sh (master1 only)
  • start-yarn.sh (master1 only)
  • yarn-daemon.sh start resourcemanager (master2 only)
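
The full HA bring-up order (JournalNodes, then one-time formatting, then NameNodes, then DFS/YARN) can be collected into one script; a dry-run sketch that only prints the sequence (the `run` wrapper and per-host comments are mine, not from the original post; remove the `echo` to actually execute on the cluster):

```shell
#!/usr/bin/env bash
# Dry-run of the HA startup sequence; 'run' only prints, it does not execute
run() { echo "+ $*"; }

run hadoop-daemon.sh start journalnode     # all slave nodes
run hadoop namenode -format                # master1 only, first boot only
run hdfs zkfc -formatZK                    # master1 only, first boot only
run hadoop-daemon.sh start namenode        # master1
run hdfs namenode -bootstrapStandby        # master2
run hadoop-daemon.sh start namenode        # master2
run start-dfs.sh                           # master1
run start-yarn.sh                          # master1
run yarn-daemon.sh start resourcemanager   # master2
```

The two `-format` steps are destructive and must not be repeated on an existing cluster.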

#Web UIs

Install MySQL

Master nodes only

#Install MySQL (via MariaDB)

  • yum -y install mariadb-server mariadb
  • rpm -q mariadb mariadb-server

#Enable MySQL at boot

  • systemctl enable mariadb
  • systemctl daemon-reload

#Start MySQL

  • systemctl start mariadb

#Stop MySQL

  • systemctl stop mariadb

#Restart MySQL

  • systemctl restart mariadb

#Check MySQL status

  • systemctl status mariadb

#Secure the database with the built-in security script

  • mysql_secure_installation

#Log in as root

  • mysql -uroot -p

#Create the database and account

  • CREATE DATABASE metastore DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
  • GRANT ALL ON metastore.* TO 'hive'@'%' IDENTIFIED BY '123';

#Flush privileges

  • flush privileges;

Install Hive

Master nodes only
  • cd /mnt/hgfs/aboutyun
  • cp apache-hive-2.2.0-bin.tar.gz /usr/aboutyun/
  • cd /usr/aboutyun/
  • tar zxvf apache-hive-2.2.0-bin.tar.gz
  • rm -rf apache-hive-2.2.0-bin.tar.gz
  • mv apache-hive-2.2.0-bin hive

#Configure Hive environment variables

  • vim /etc/profile
# SET Hive PATH 
export HIVE_HOME=/usr/aboutyun/hive
export PATH=$PATH:$HIVE_HOME/bin
  • source /etc/profile

#Install the MySQL driver jar

  • cd /mnt/hgfs/aboutyun
  • cp mysql-connector-java-5.1.47.tar.gz /usr/aboutyun/
  • cd /usr/aboutyun/
  • tar zxvf mysql-connector-java-5.1.47.tar.gz
  • rm -rf mysql-connector-java-5.1.47.tar.gz
  • cp mysql-connector-java-5.1.47-bin.jar /usr/aboutyun/hive/lib/

#Replace the jline jar (version mismatch)

  • cp hive/lib/jline-2.12.jar /usr/aboutyun/hadoop/share/hadoop/yarn/lib/

#Configure Hive

  • cd hive
  • mkdir -p data/hive/log
  • mkdir -p data/hive/tmp
  • mkdir -p data/hive/warehouse
  • cd conf
  • cp hive-env.sh.template hive-env.sh
  • vim hive-env.sh
export JAVA_HOME=/usr/aboutyun/jdk 
export HADOOP_HOME=/usr/aboutyun/hadoop
export HIVE_HOME=/usr/aboutyun/hive
export HIVE_CONF_DIR=/usr/aboutyun/hive/conf
export HIVE_AUX_JARS=/usr/aboutyun/hive/lib
  • cp hive-default.xml.template hive-site.xml
  • vim hive-site.xml
<configuration> 
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://localhost:3306/metastore?createDatabaseIfNotExist=true&amp;useSSL=false</value>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>hive</value>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>123</value>
</property>
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/usr/aboutyun/hive/data/hive/warehouse</value>
</property>
<property>
<name>hive.exec.scratchdir</name>
<value>/usr/aboutyun/hive/data/hive/tmp</value>
</property>
<property>
<name>hive.querylog.location</name>
<value>/usr/aboutyun/hive/data/hive/log</value>
</property>
</configuration>

Change ${system:java.io.tmpdir} to /usr/aboutyun/hive/data/hive/tmp:

:%s/${system:java.io.tmpdir}/\/usr\/aboutyun\/hive\/data\/hive\/tmp/g

Change ${system:user.name} to aboutyun:

:%s/${system:user.name}/aboutyun/g
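
The two interactive vim substitutions above can also be applied non-interactively with sed; a sketch on a sample line (the sample `<value>` is illustrative, not taken from hive-site.xml):

```shell
# Replace the ${system:...} placeholders exactly as the :%s commands do
printf '%s\n' '<value>${system:java.io.tmpdir}/${system:user.name}</value>' |
  sed -e 's|${system:java.io.tmpdir}|/usr/aboutyun/hive/data/hive/tmp|g' \
      -e 's|${system:user.name}|aboutyun|g'
# prints: <value>/usr/aboutyun/hive/data/hive/tmp/aboutyun</value>
```

To edit the real file in place, run the same two `-e` expressions with `sed -i` on hive-site.xml.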

#Initialize Hive (MySQL edition)

  • schematool -dbType mysql -initSchema

Install Flume

  • cd /mnt/hgfs/aboutyun
  • cp apache-flume-1.9.0-bin.tar.gz /usr/aboutyun/
  • cd /usr/aboutyun/
  • tar zxvf apache-flume-1.9.0-bin.tar.gz
  • rm -rf apache-flume-1.9.0-bin.tar.gz
  • mv apache-flume-1.9.0-bin flume
  • cd flume

#Flume environment variables

  • vim /etc/profile
# SET FLUME PATH 
export FLUME_HOME=/usr/aboutyun/flume
export PATH=$PATH:$FLUME_HOME/bin
  • source /etc/profile

#Edit the Flume configuration

  • cd conf
  • cp flume-env.sh.template flume-env.sh
  • vim flume-env.sh
export JAVA_HOME=/usr/aboutyun/jdk

#Verify

Server

# NetCat 
flume-ng agent --conf conf --conf-file conf/flume-netcat.conf --name=agent -Dflume.root.logger=INFO,console

# Exec
flume-ng agent --conf conf --conf-file conf/flume-exec.conf --name=agent -Dflume.root.logger=INFO,console

# Avro
flume-ng agent --conf conf --conf-file conf/flume-avro.conf --name=agent -Dflume.root.logger=DEBUG,console

Client

# NetCat 
telnet master 44444

# Exec
while true; do echo `date` >> /data/hadoop/flume/test.txt; sleep 1; done

# Avro
flume-ng avro-client --conf conf --host master --port 44444 --filename /data/hadoop/flume/test.txt
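
The agent commands above reference `conf/flume-netcat.conf`, which is not shown in this post. A minimal agent definition consistent with `--name=agent` and port 44444 might look like this (an assumed sketch, not the author's actual file):

```properties
# NetCat source -> memory channel -> logger sink
agent.sources = r1
agent.channels = c1
agent.sinks = k1

agent.sources.r1.type = netcat
agent.sources.r1.bind = 0.0.0.0
agent.sources.r1.port = 44444
agent.sources.r1.channels = c1

agent.channels.c1.type = memory
agent.channels.c1.capacity = 1000

agent.sinks.k1.type = logger
agent.sinks.k1.channel = c1
```

With this config running, lines typed into the telnet session should appear as INFO events in the agent's console.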

Install Kafka

  • cd /mnt/hgfs/aboutyun
  • cp kafka_2.11-0.11.0.3.tgz /usr/aboutyun
  • cd /usr/aboutyun
  • tar zxvf kafka_2.11-0.11.0.3.tgz
  • rm -rf kafka_2.11-0.11.0.3.tgz
  • mv kafka_2.11-0.11.0.3 kafka
  • cd kafka

#Kafka environment variables

  • vim /etc/profile
# SET KAFKA PATH 
export KAFKA_HOME=/usr/aboutyun/kafka
export PATH=$PATH:$KAFKA_HOME/bin
  • source /etc/profile

#Edit the Kafka configuration

  • mkdir logs
  • cd config
  • vim server.properties
log.dirs=/usr/aboutyun/kafka/logs 
zookeeper.connect=flink-slave1:2181,flink-slave2:2181,flink-slave3:2181

#Master1
broker.id=0
#Master2
broker.id=1
#Slave1
broker.id=2
#Slave2
broker.id=3
#Slave3
broker.id=4
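
Like the ZooKeeper `myid`s, the `broker.id`s above simply enumerate the hosts; a sketch that prints the mapping used above:

```shell
# Assign sequential broker.ids in the order the hosts are listed above
i=0
for host in flink-master1 flink-master2 flink-slave1 flink-slave2 flink-slave3; do
  echo "$host: broker.id=$i"   # set this value in that host's server.properties
  i=$((i + 1))
done
# prints flink-master1: broker.id=0 ... flink-slave3: broker.id=4
```

Each broker.id must be unique across the cluster; the numbering scheme itself is arbitrary.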

Start the brokers:

  • kafka-server-start.sh -daemon /usr/aboutyun/kafka/config/server.properties

Stop the cluster:

  • kafka-server-stop.sh
Install Flink

  • cd /mnt/hgfs/aboutyun
  • cp flink-1.7.2-bin-hadoop27-scala_2.11.tgz /usr/aboutyun
  • cd /usr/aboutyun
  • tar zxvf flink-1.7.2-bin-hadoop27-scala_2.11.tgz
  • rm -rf flink-1.7.2-bin-hadoop27-scala_2.11.tgz
  • mv flink-1.7.2 flink
  • cd flink

#Flink environment variables

  • vim /etc/profile
# SET FLINK PATH 
export FLINK_HOME=/usr/aboutyun/flink
export PATH=$PATH:$FLINK_HOME/bin
  • source /etc/profile

#Edit the Flink configuration

  • cd conf
  • vim flink-conf.yaml
# Modify the following entries
jobmanager.rpc.address: flink-master1
high-availability: zookeeper
high-availability.zookeeper.path.root: /flink
high-availability.cluster-id: flink
high-availability.storageDir: hdfs:///flink/ha/
high-availability.zookeeper.quorum: flink-slave1:2181,flink-slave2:2181,flink-slave3:2181
yarn.application-attempts: 10
  • vim masters
flink-master1:8081
flink-master2:8081
  • vim slaves
flink-slave1 
flink-slave2
flink-slave3

#Start the Standalone HA cluster

  • start-cluster.sh

#Stop the Standalone cluster

  • stop-cluster.sh

#Start the YARN HA cluster

  • yarn-session.sh

#Web UI

Flink(master1)




Title: Flink E-commerce Group-Buying Project (Part 0): Environment Setup

Author: Alessa0

Published: May 17, 2019 - 16:05

Last updated: August 5, 2019 - 15:08

Original link: https://alessa0.cn/posts/8d61cd2a/

License: CC BY-NC-ND 4.0. Please keep the original link and attribution when reposting.
