System Design
  • Introduction
  • Design Twitter
  • Design Tiny Url
  • Design Instagram
  • Consistent Hashing
  • NOSQL
    • Dynamo
    • Big Table
  • 在浏览器输入URL
  • RESTful API Design
  • 索引
    • Hash Indexes --- Bitcask
    • B-tree
    • LSM
  • Design Instagram
  • Design KV Store
  • Design Message System
  • Design Localization System
  • Design RSS Reader
  • Design Ticket Master
Powered by GitBook
On this page
  • Consistent Hashing (Horizontal Sharding秘密武器)
  • 1. 为什么要Consistent Hashing
  • 2. 原理
  • 3. Implementation

Was this helpful?

Consistent Hashing

PreviousDesign InstagramNextNOSQL

Last updated 5 years ago

Was this helpful?

Consistent Hashing (Horizontal Sharding秘密武器)

1. 为什么要Consistent Hashing

不一致哈希:假设有n台机器,最简单的做法是定义hash function为 f(x) = x % n, 然后根据f(x)的值将x放在对应的机器上。但这个做法的缺点是,当我们新增一台机器,n变为n + 1的时候, 每个key % n 和 % (n + 1)基本不一致

2. 原理

(1)一致哈希为将hash区间看成一个环,这个环范围是0~2^64 - 1

(2)将数据和机器看作是环上的点

(3)每增加一台机器,就在换上随机撒1000个点作为virtual node

(4)计算每个key所在的服务器时,首先计算该key的hash value,得到0~2^64-1的一个数,对应环上一个点,然后顺时针找到第一个离该key的hash value最近的一个virtual node, 该virtual node所在的机器就是该key所在的服务器

(5) 新加一台机器做数据迁移时,virtual node各自以顺时针方向向下一个virtual node要数据

3. Implementation

public class Solution {
    // Number of range
    private int n;
    // Number of machines
    private int k;
    private Set<Integer> ids;
    private Map<Integer, List<Integer>> machines;

    // @param n a positive integer
    // @param k a positive integer
    // @return a Solution object
    public static Solution create(int n, int k) {
        // Write your code here
        Solution solution = new Solution();
        solution.n = n;
        solution.k = k;
        solution.ids = new HashSet<>();
        solution.machines = new HashMap<>();
        return solution;
    }

    // @param machine_id an integer
    // @return a list of shard ids
    public List<Integer> addMachine(int machine_id) {
        Random rand = new Random();
        List<Integer> virtualNodes = new ArrayList<>();

        for (int i = 0; i < k; i++) {
            int index = rand.nextInt(n);
            while (ids.contains(index)) {
                index = rand.nextInt(n);
            }
            ids.add(index);
            virtualNodes.add(index);
        }

        Collections.sort(virtualNodes);
        machines.put(machine_id, virtualNodes);
        return virtualNodes;
    }

    // @param hashcode an integer
    // @return a machine id
    public int getMachineIdByHashCode(int hashcode) {
        int distance = Integer.MAX_VALUE;
        int diff = 0;
        int machineId = 0;

        for (Map.Entry<Integer, List<Integer>> entry : machines.entrySet()) {
            int key = entry.getKey();
            List<Integer> virtualNodes = entry.getValue();

            for (int node : virtualNodes) {
                diff = node - hashcode;

                if (diff < 0) {
                    diff += n;
                }

                if (diff < distance) {
                    distance = diff;
                    machineId = key;
                }
            }
        }
        return machineId;
    }
}
Consistent Hashing II