Skip to content

问题反馈:hash_map_open_addressing.js,线性探测解决哈希冲突,会产生无限循环。 #1728

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
yaomk opened this issue Apr 4, 2025 · 1 comment

Comments

@yaomk
Copy link

yaomk commented Apr 4, 2025

问题描述

  • 先向哈希表尽可能多的添加数据,不触发扩容机制,然后删除这些数据,使得 TOMBSTONE 空桶尽可能的占满 buckets,然后执行添加操作。
  • 进入 findBucket(key) 方法,while (this.#buckets[index] !== null) 此时没有桶为 null,并且没有已存在 key 与添加的 key 相同,无法终止循环。

其他语言应该都能复现,已在 js,java 中复现。

可能的解决方法:

在扩容时,将 TOMBSTONE 的数量也算进去,这样的话,必定会有 null 的空桶。不过会导致空间浪费。😢

复现代码:

let hashmap = new HashMapOpenAddressing()
hashmap.put(0,'0')
hashmap.put(1,'0')
hashmap.put(2,'0')

// 此时,hashmap.#buckets为 [{key:0,val: '0'}, {key:1,val: '0'}, {key:2,val: '0'}, null]
hashmap.remove(0)
hashmap.remove(1)
hashmap.remove(2)
// 此时,hashmap.#buckets为 [{key:-1,val: '-1'}, {key:-1,val: '-1'}, {key:-1,val: '-1'}, null]

hashmap.put(3,'0')
// [{key:-1,val: '-1'}, {key:-1,val: '-1'}, {key:-1,val: '-1'}, {key:3,val: '0'}]
hashmap.put(7, '0') // 此时添加不触发扩容操作,继续执行 findBucket(key) 方法,进入无限循环。
// 理想情况下应该为 [{key:7,val: '0'}, {key:-1,val: '-1'}, {key:-1,val: '-1'}, {key:3,val: '0'}]
@Chieko-Seren
Copy link
Contributor

Chieko-Seren commented May 9, 2025

我认为您可以在插入操作中允许复用TOMBSTONE桶,而不是仅寻找 null 桶。具体来说,修改 findBucket 方法,在探测时记录第一个遇到的TOMBSTONE位置。如果没有找到匹配的 key 且无 null 桶,则使用记录的TOMBSTONE位置插入新键值对。

class HashMapOpenAddressing {
  #buckets = new Array(4).fill(null);
  #size = 0;
  #loadFactor = 0.75;

  put(key, value) {
    if (this.#size / this.#buckets.length >= this.#loadFactor) {
      this.#resize();
    }
    const { index, tombstoneIndex } = this.#findBucket(key);
    let insertIndex = index;
    if (this.#buckets[index] === null || this.#buckets[index].key === -1) {
      this.#size++;
      // 如果有 TOMBSTONE 且未找到匹配 key,优先使用 TOMBSTONE
      if (tombstoneIndex !== -1 && this.#buckets[index]?.key !== key) {
        insertIndex = tombstoneIndex;
      }
    }
    this.#buckets[insertIndex] = { key, value };
  }

  remove(key) {
    const { index } = this.#findBucket(key);
    if (this.#buckets[index] && this.#buckets[index].key === key) {
      this.#buckets[index] = { key: -1, value: '-1' };
      this.#size--;
    }
  }

  #findBucket(key) {
    let index = key % this.#buckets.length;
    let tombstoneIndex = -1;
    let first = true;
    while (first || index !== key % this.#buckets.length) {
      first = false;
      if (this.#buckets[index] === null) break;
      if (this.#buckets[index].key === key) break;
      if (this.#buckets[index].key === -1 && tombstoneIndex === -1) {
        tombstoneIndex = index; // 记录第一个 TOMBSTONE
      }
      index = (index + 1) % this.#buckets.length;
    }
    return { index, tombstoneIndex };
  }

  #resize() {
    const oldBuckets = this.#buckets;
    this.#buckets = new Array(oldBuckets.length * 2).fill(null);
    this.#size = 0;
    for (const bucket of oldBuckets) {
      if (bucket && bucket.key !== -1) {
        this.put(bucket.key, bucket.value);
      }
    }
  }
}

如果您需要更详细的代码调试、性能分析,或针对特定语言的进一步优化,请告诉我! 👀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants