Impala Load Balancing

Author: 一拳超疼 | Published: 2020-07-25 14:40

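Impala has three main components: statestore, catalog, and impalad. Every impalad node can accept client query requests, and the impalad that a client connects to also acts as the Coordinator for that query, which costs extra memory and CPU. To keep resource usage balanced across nodes, the impalad nodes in the cluster should be placed behind a load balancer.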

This article uses HAProxy, the proxy officially recommended by Cloudera.
In production, install HAProxy on a node that does not run impalad.

Steps

  • Install HAProxy
yum install haproxy -y
  • Configuration file
    vim /etc/haproxy/haproxy.cfg
    Configuration contents:
#---------------------------------------------------------------------
# Example configuration for a possible web application. See the
# full configuration options online.
#
# http://haproxy.1wt.eu/download/1.4/doc/configuration.txt
#
#---------------------------------------------------------------------
#---------------------------------------------------------------------
# Global settings
#---------------------------------------------------------------------
global
 log 127.0.0.1 local2
 chroot /var/lib/haproxy
 pidfile /var/run/haproxy.pid
 maxconn 4000
 user haproxy
 group haproxy
 daemon
 # turn on stats unix socket
 stats socket /var/lib/haproxy/stats
#---------------------------------------------------------------------
# common defaults that all the 'listen' and 'backend' sections will
# use if not designated in their block
#---------------------------------------------------------------------
defaults
 mode http  # mode { tcp|http|health }: tcp is layer 4, http is layer 7, health is for health checks only
 log global
 option httplog
 option dontlognull
 #option http-server-close
 #option forwardfor except 127.0.0.0/8
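 #option abortonclose  # under heavy load, abort queued requests whose client has already closed the connection
 option redispatch  # if a server fails, redispatch the session to another server
 retries 3  # number of times to retry a failed connection to a server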
 timeout http-request 10s
 timeout queue 1m
 #timeout connect 10s
 #timeout client 1m
 #timeout server 1m
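 timeout connect 1d  # connection timeout; important, it ensures long-running queries can still return results
 timeout client 1d  # same as above
 timeout server 1d  # same as above
 timeout http-keep-alive 10s
 timeout check 10s  # health check timeout
 maxconn 3000  # maximum number of connections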
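listen status  # stats (admin) page
 bind 0.0.0.0:1080  # IP and port of the stats page
 mode http  # protocol used by the stats page
 option httplog
 maxconn 5000  # maximum number of connections
 stats refresh 30s  # auto-refresh every 30 seconds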
 stats uri /stats
listen impalashell
 bind 0.0.0.0:25003  # IP and port HAProxy binds to as the proxy
 mode tcp  # proxy at layer 4; important
 option tcplog
 balance roundrobin  # scheduling algorithm: 'leastconn' (fewest connections first) or 'roundrobin' (take turns)
 server impalashell_1 linux121:21000 check
 server impalashell_2 linux122:21000 check
 server impalashell_3 linux123:21000 check
listen impalajdbc
 bind 0.0.0.0:25004  # IP and port HAProxy binds to as the proxy
 mode tcp  # proxy at layer 4; important
 option tcplog
 balance roundrobin  # scheduling algorithm: 'leastconn' (fewest connections first) or 'roundrobin' (take turns)
 server impalajdbc_1 linux121:21050 check
 server impalajdbc_2 linux122:21050 check
 server impalajdbc_3 linux123:21050 check
#---------------------------------------------------------------------
# main frontend which proxys to the backends
#---------------------------------------------------------------------
frontend main *:5000
 acl url_static path_beg -i /static /images /javascript /stylesheets
 acl url_static path_end -i .jpg .gif .png .css .js
 use_backend static if url_static
 default_backend app
#---------------------------------------------------------------------
# static backend for serving up images, stylesheets and such
#---------------------------------------------------------------------
backend static
 balance roundrobin
 server static 127.0.0.1:4331 check
#---------------------------------------------------------------------
# round robin balancing between the various backends
#---------------------------------------------------------------------
backend app
 balance roundrobin
 server app1 127.0.0.1:5001 check
 server app2 127.0.0.1:5002 check
 server app3 127.0.0.1:5003 check
 server app4 127.0.0.1:5004 check
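
Before starting the service, it can help to sanity-check the file. HAProxy can parse a configuration without starting the proxy (haproxy -c -f /etc/haproxy/haproxy.cfg). The sketch below is a complementary check that is not part of the original setup: a hypothetical helper, find_duplicate_servers, that reports any host:port listed more than once within the same section, a common copy-paste slip in server lists. It assumes a POSIX shell and awk.

```shell
# Hypothetical helper (not an HAProxy tool): print every host:port that is
# listed more than once inside the same section of an HAProxy config file.
find_duplicate_servers() {
  awk '
    # Strip trailing comments so "name#note" is read as "name".
    { sub(/#.*/, "") }
    # Remember which section we are in.
    $1 == "listen" || $1 == "backend" || $1 == "frontend" ||
    $1 == "defaults" || $1 == "global" {
      section = $2
      next
    }
    # "server <name> <host:port> ..." lines: field 3 is the target address.
    $1 == "server" && NF >= 3 {
      key = section " " $3
      if (seen[key]++) print section ": " $3 " is listed more than once"
    }
  ' "$1"
}

# Usage (on the HAProxy host):
#   find_duplicate_servers /etc/haproxy/haproxy.cfg
#   haproxy -c -f /etc/haproxy/haproxy.cfg   # syntax check only, does not start the proxy
```

An empty result from find_duplicate_servers means each section points at distinct targets; it says nothing about whether those hosts are reachable.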

  • Start the service

 Start: service haproxy start
 Stop: service haproxy stop
 Restart: service haproxy restart
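
After a start or restart, one way to confirm that HAProxy is up is to check that it is listening on the ports bound in the configuration (1080 for the stats page, 25003 and 25004 for the proxies). A minimal sketch, assuming the ss utility is available on the host; is_listening is a hypothetical helper that reads `ss -ltn` output from standard input.

```shell
# Hypothetical helper: succeed if the `ss -ltn` output on stdin shows a
# listening socket on the given port.
is_listening() {
  awk -v port="$1" '
    # Skip the header; column 4 of `ss -ltn` is "Local Address:Port".
    NR > 1 && $4 ~ (":" port "$") { found = 1 }
    END { exit !found }
  '
}

# Usage (on the HAProxy host):
#   for p in 1080 25003 25004; do
#     ss -ltn | is_listening "$p" && echo "port $p: listening" || echo "port $p: NOT listening"
#   done
```

The stats page defined in the configuration (port 1080, URI /stats) gives the same information per backend server in a browser.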

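HAProxy may fail to start with the error "Starting proxy status: cannot bind socket [0.0.0.0:1080]".
This typically happens when SELinux does not allow HAProxy to bind to that port.
To fix it, run: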

setsebool -P haproxy_connect_any=1
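
The current state of the boolean can be checked first, so the change is only made when it is actually needed. A minimal sketch, assuming the SELinux utilities getsebool and setsebool are installed; sebool_is_on is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if a `getsebool` output line such as
# "haproxy_connect_any --> on" reports the boolean as enabled.
sebool_is_on() {
  case "$1" in
    *"--> on") return 0 ;;
    *) return 1 ;;
  esac
}

# Usage (as root on the HAProxy host):
#   sebool_is_on "$(getsebool haproxy_connect_any)" \
#     || setsebool -P haproxy_connect_any=1
```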
  • Usage
    Point impala-shell at the HAProxy host and port instead of an individual impalad:
impala-shell -i hadoop3:25003
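
To confirm that connections are really being spread across the impalad nodes, the per-server session totals can be read from the HAProxy stats page, which also offers a CSV export when ";csv" is appended to the stats URI. A minimal sketch, assuming curl is installed and that the stats page is reachable on the same host used above (hadoop3) at port 1080; sessions_per_server is a hypothetical helper that looks up columns by their header names.

```shell
# Hypothetical helper: read HAProxy stats CSV on stdin and print
# "<server> <total sessions>" for each server of the given proxy.
sessions_per_server() {
  awk -F, -v proxy="$1" '
    NR == 1 {
      sub(/^# /, "", $1)                      # header starts with "# pxname"
      for (i = 1; i <= NF; i++) col[$i] = i   # map column name -> index
      next
    }
    # Skip the FRONTEND/BACKEND summary rows; keep only real servers.
    $(col["pxname"]) == proxy &&
    $(col["svname"]) != "FRONTEND" && $(col["svname"]) != "BACKEND" {
      print $(col["svname"]), $(col["stot"])
    }
  '
}

# Usage (host and port are assumptions taken from the configuration above):
#   curl -s 'http://hadoop3:1080/stats;csv' | sessions_per_server impalashell
```

With roundrobin balancing, opening several impala-shell sessions through the proxy should make the totals grow roughly evenly across impalashell_1, impalashell_2 and impalashell_3.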
