<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html;charset=utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1" />
<title>Alibaba HBase Application Design in Practice</title>
<meta name="generator" content="Org-mode" />
<meta name="author" content="dirtysalt" />
<link rel="shortcut icon" href="http://dirtysalt.info/css/favicon.ico" />
<link rel="stylesheet" type="text/css" href="./css/site.css" />
</head>
<body>
<div id="content">
<h1 class="title">Alibaba HBase Application Design in Practice</h1>
<p>
Recommendations for product lines and client usage
</p>
<ul class="org-ul">
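<li>For massive data sets whose rowkey range and distribution are known in advance, pre-split the regions</li>
<li>Keep rowkeys as short as possible (e.g., represent times as integer timestamps; use encoding or compression)</li>
<li>CF design: as few column families as possible; 1-2 is recommended</li>
<li>Rowkey design: spread writes out; e.g., for historical trade orders, reverse biz_order_id and use the result as the rowkey</li>
<li>Set autoflush to true; otherwise data can be lost in extreme cases, because puts still sitting in the client-side write buffer disappear if the client dies before flushing
<ul class="org-ul">
<li>Set the HBase client retry count to at least 3. Otherwise a region that is temporarily not online during a split can cause writes to fail (this has happened on the udc cluster).</li>
<li>hbase.rpc.timeout: timeout for a single RPC; default 60 seconds</li>
<li>hbase.client.pause: how long the client waits between a failed operation and the next retry</li>
<li>hbase.client.retries.number: number of client retries</li>
<li>hbase.regionserver.lease.period: client lease timeout threshold; consider raising it for large scans, otherwise you may see "Lease Exception: lease -70000000000000001 does not exist"</li>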
</ul></li>
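<li>Notes on ZK connections and HTable objects
<ul class="org-ul">
<li>The Configuration object must be static or a singleton</li>
<li>By default, each machine may hold no more than 30 connections to ZK</li>
<li>Using HTable
<ul class="org-ul">
<li>Not thread-safe</li>
<li>Use HTableV2</li>
<li>HTablePool (the recommended approach)</li>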
</ul></li>
</ul></li>
</ul>
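<p>
The rowkey advice above (reverse biz_order_id, pre-split when the key distribution is known) can be sketched in plain Java. This is a minimal illustration, not the original authors' code: the class name RowKeySketch, both method names, and the assumption that order IDs are non-negative decimal numbers are all hypothetical. Reversing puts the fast-changing low digit first, so consecutive IDs land under different leading digits, and the nine split points "1" to "9" give ten regions, one per leading digit.
</p>

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative helper; the names here are hypothetical, not part of the HBase API. */
final class RowKeySketch {

    /** Reverses a decimal order ID so that its fast-changing low digits lead the rowkey. */
    static String reversedKey(long bizOrderId) {
        if (bizOrderId < 0) {
            throw new IllegalArgumentException("order IDs are assumed to be non-negative");
        }
        return new StringBuilder(Long.toString(bizOrderId)).reverse().toString();
    }

    /** Split points "1".."9": ten regions, one per possible leading digit of a reversed key. */
    static List<String> decimalSplitPoints() {
        List<String> points = new ArrayList<>();
        for (char c = '1'; c <= '9'; c++) {
            points.add(String.valueOf(c));
        }
        return points;
    }

    public static void main(String[] args) {
        // Consecutive IDs end up with different leading characters after reversal.
        for (long id = 1000200L; id < 1000205L; id++) {
            System.out.println(id + " -> " + reversedKey(id));
        }
        System.out.println("split points: " + decimalSplitPoints());
    }
}
```

<p>
In a real client the split points would be converted to byte arrays and supplied when the table is created, for example through the createTable overload that accepts split keys.
</p>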
<p>
Summary of performance factors
</p>
<ol class="org-ol">
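<li>For write throughput, the factors rank by impact as: HLog writes &gt; split &gt; compaction;</li>
<li>For fluctuation in write throughput, eliminating it entirely is impossible; the factors rank by impact as: split &gt; HLog writes &gt; compaction;</li>
<li>For write-heavy applications, a single region server should not host too many regions (hbase.hregion.max.filesize = 64G);</li>
<li>Pre-sharding is optional but recommended;</li>
<li>For logging applications, consider disabling compaction and splitting
<ul class="org-ul">
<li>hbase.regionserver.regionSplitLimit = 1 disables splitting</li>
<li>hbase.hstore.compactionThreshold = Integer.MAX_VALUE disables compaction</li>
<li>hbase.hstore.blockingStoreFiles = Integer.MAX_VALUE keeps writes from blocking on the number of store files</li>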
</ul></li>
</ol>
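<p>
The three settings listed for logging applications can be collected in one place. The sketch below is an assumption-laden illustration: it keeps the values in a plain map and renders them as hbase-site.xml style property elements, without calling any HBase class. The class name LogTableSettings and its methods are hypothetical; only the property names and values come from the list above.
</p>

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative holder for the log-application settings; not an HBase class. */
final class LogTableSettings {

    /** Returns the three settings named in the notes, in the order they are listed. */
    static Map<String, String> settings() {
        Map<String, String> m = new LinkedHashMap<>();
        m.put("hbase.regionserver.regionSplitLimit", "1");                            // disable splitting
        m.put("hbase.hstore.compactionThreshold", String.valueOf(Integer.MAX_VALUE)); // disable compaction
        m.put("hbase.hstore.blockingStoreFiles", String.valueOf(Integer.MAX_VALUE));  // never block on store file count
        return m;
    }

    /** Renders the settings as property elements, one per line. */
    static String toSiteXml(Map<String, String> settings) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : settings.entrySet()) {
            sb.append("<property><name>").append(e.getKey())
              .append("</name><value>").append(e.getValue())
              .append("</value></property>\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(toSiteXml(settings()));
    }
}
```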
<p>
Risks: cluster stability and disaster recovery
</p>
<ul class="org-ul">
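<li>Each region server is a single point of failure
<ul class="org-ul">
<li>A failure leaves part of the data briefly unavailable</li>
</ul></li>
<li>Cross-datacenter disaster recovery
<ul class="org-ul">
<li>Currently deployed in a single datacenter only</li>
<li>Performance degrades across datacenters</li>
</ul></li>
<li>Implementation options:
<ul class="org-ul">
<li>Dual writes in the application</li>
<li>Replication testing (push-based replication is already in production; pull-based is under development)</li>
<li>Message middleware (asynchronous messages)</li>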
</ul></li>
</ul>
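<p>
Of the options above, application-level dual writing is the simplest to sketch. The notes do not say how it was implemented, so everything below is an assumption: the Store interface stands in for a per-datacenter table client, MemoryStore is an in-memory fake, and the policy (fail the call if the primary write fails, record the rowkey for later repair if only the secondary write fails) is one possible choice among several.
</p>

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative dual-write sketch; every type here is hypothetical. */
final class DualWriteSketch {

    /** Stand-in for a table client bound to one datacenter. */
    interface Store {
        void put(String rowKey, String value);
    }

    /** In-memory fake that can be switched into a failing state. */
    static final class MemoryStore implements Store {
        final Map<String, String> rows = new HashMap<>();
        boolean failing = false;

        @Override
        public void put(String rowKey, String value) {
            if (failing) {
                throw new IllegalStateException("store unavailable");
            }
            rows.put(rowKey, value);
        }
    }

    /** Writes to the primary first, then to the secondary on a best-effort basis. */
    static final class DualWriter {
        private final Store primary;
        private final Store secondary;
        final List<String> pendingRepair = new ArrayList<>();

        DualWriter(Store primary, Store secondary) {
            this.primary = primary;
            this.secondary = secondary;
        }

        void put(String rowKey, String value) {
            primary.put(rowKey, value); // a primary failure propagates to the caller
            try {
                secondary.put(rowKey, value);
            } catch (RuntimeException e) {
                pendingRepair.add(rowKey); // remember the key so it can be re-synced later
            }
        }
    }

    public static void main(String[] args) {
        MemoryStore dcA = new MemoryStore();
        MemoryStore dcB = new MemoryStore();
        DualWriter writer = new DualWriter(dcA, dcB);
        writer.put("k1", "v1");
        dcB.failing = true;
        writer.put("k2", "v2");
        System.out.println("pending repair: " + writer.pendingRepair);
    }
}
```

<p>
The trade-off in this sketch is that the two copies can diverge until the recorded keys are replayed, which is one reason the notes also list replication and asynchronous messaging as alternatives.
</p>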
</div>
<!-- DISQUS BEGIN --><div id="disqus_thread"></div><script type="text/javascript">/* * * CONFIGURATION VARIABLES: EDIT BEFORE PASTING INTO YOUR WEBPAGE * * *//* required: replace example with your forum shortname */var disqus_shortname = 'dirlt';var disqus_identifier = 'alibaba-hbase-practice.html';var disqus_title = 'alibaba-hbase-practice.html';var disqus_url = 'http://dirtysalt.github.io/alibaba-hbase-practice.html';/* * * DON'T EDIT BELOW THIS LINE * * */(function() {var dsq = document.createElement('script'); dsq.type = 'text/javascript'; dsq.async = true;dsq.src = '//' + disqus_shortname + '.disqus.com/embed.js';(document.getElementsByTagName('head')[0] || document.getElementsByTagName('body')[0]).appendChild(dsq);})();</script><noscript>Please enable JavaScript to view the <a href="http://disqus.com/?ref_noscript">comments powered by Disqus.</a></noscript><a href="http://disqus.com" class="dsq-brlink">comments powered by <span class="logo-disqus">Disqus</span></a><!-- DISQUS END --></body>
</html>