Consistent Hashing
Consistent Hashing (Horizontal Sharding秘密武器)
1. 为什么要Consistent Hashing
不一致哈希:假设有n台机器,最简单的做法是定义hash function为 f(x) = x % n, 然后根据f(x)的值将x放在对应的机器上。但这个做法的缺点是,当我们新增一台机器,n变为n + 1的时候, 每个key % n 和 % (n + 1)基本不一致
2. 原理
(1)一致哈希为将hash区间看成一个环,这个环范围是0~2^64 - 1
(2)将数据和机器看作是环上的点
(3)每增加一台机器,就在换上随机撒1000个点作为virtual node
(4)计算每个key所在的服务器时,首先计算该key的hash value,得到0~2^64-1的一个数,对应环上一个点,然后顺时针找到第一个离该key的hash value最近的一个virtual node, 该virtual node所在的机器就是该key所在的服务器
(5) 新加一台机器做数据迁移时,virtual node各自以顺时针方向向下一个virtual node要数据
3. Implementation
public class Solution {
// Number of range
private int n;
// Number of machines
private int k;
private Set<Integer> ids;
private Map<Integer, List<Integer>> machines;
// @param n a positive integer
// @param k a positive integer
// @return a Solution object
public static Solution create(int n, int k) {
// Write your code here
Solution solution = new Solution();
solution.n = n;
solution.k = k;
solution.ids = new HashSet<>();
solution.machines = new HashMap<>();
return solution;
}
// @param machine_id an integer
// @return a list of shard ids
public List<Integer> addMachine(int machine_id) {
Random rand = new Random();
List<Integer> virtualNodes = new ArrayList<>();
for (int i = 0; i < k; i++) {
int index = rand.nextInt(n);
while (ids.contains(index)) {
index = rand.nextInt(n);
}
ids.add(index);
virtualNodes.add(index);
}
Collections.sort(virtualNodes);
machines.put(machine_id, virtualNodes);
return virtualNodes;
}
// @param hashcode an integer
// @return a machine id
public int getMachineIdByHashCode(int hashcode) {
int distance = Integer.MAX_VALUE;
int diff = 0;
int machineId = 0;
for (Map.Entry<Integer, List<Integer>> entry : machines.entrySet()) {
int key = entry.getKey();
List<Integer> virtualNodes = entry.getValue();
for (int node : virtualNodes) {
diff = node - hashcode;
if (diff < 0) {
diff += n;
}
if (diff < distance) {
distance = diff;
machineId = key;
}
}
}
return machineId;
}
}
Last updated
Was this helpful?