自定义C++ Hash表设计思路

发布时间：2024-11-20 10:15:39 作者：小樊
来源：亿速云阅读：102

设计一个自定义的C++哈希表需要考虑以下几个关键点：

哈希函数：一个好的哈希函数应该能够将输入的键均匀地分布在整个哈希表中，以减少冲突的概率。
冲突解决策略：当两个不同的键映射到同一个哈希桶时，需要有一种策略来解决冲突。常见的冲突解决策略有开放寻址法（线性探测、二次探测、双散列）和链地址法。
动态扩容：为了保持哈希表的性能，当其负载因子超过一定阈值时，应该进行动态扩容。
内存管理：需要考虑如何高效地分配和释放内存，避免内存泄漏。
线程安全：如果需要在多线程环境中使用哈希表，还需要考虑线程安全问题。

下面是一个简单的自定义C++哈希表设计示例，使用了链地址法来解决冲突，并且支持动态扩容：

#include <iostream>
#include <list>
#include <vector>
#include <algorithm>

template <typename Key, typename Value>
class HashTable {
private:
    struct Entry {
        Key key;
        Value value;
        bool occupied;
        Entry() : occupied(false) {}
    };

    std::vector<std::list<Entry>> table;
    size_t size;
    size_t capacity;
    float loadFactorThreshold;

    size_t hash(const Key& key) const {
        return std::hash<Key>{}(key) % capacity;
    }

    void rehash(size_t newCapacity) {
        std::vector<std::list<Entry>> newTable(newCapacity);
        for (const auto& bucket : table) {
            for (const auto& entry : bucket) {
                if (entry.occupied) {
                    size_t newIndex = std::hash<Key>{}(entry.key) % newCapacity;
                    newTable[newIndex].push_back(entry);
                }
            }
        }
        table = std::move(newTable);
        capacity = newCapacity;
    }

public:
    HashTable(size_t initialCapacity = 10, float loadFactorThreshold = 0.75)
        : size(0), capacity(initialCapacity), loadFactorThreshold(loadFactorThreshold) {
        table.resize(capacity);
    }

    void insert(const Key& key, const Value& value) {
        if (static_cast<float>(size) / capacity >= loadFactorThreshold) {
            rehash(capacity * 2);
        }

        size_t index = hash(key);
        for (auto& entry : table[index]) {
            if (entry.key == key) {
                entry.value = value;
                return;
            }
        }

        table[index].push_back({key, value, true});
        ++size;
    }

    bool remove(const Key& key) {
        size_t index = hash(key);
        for (auto it = table[index].begin(); it != table[index].end(); ++it) {
            if (it->key == key) {
                table[index].erase(it);
                --size;
                return true;
            }
        }
        return false;
    }

    bool find(const Key& key, Value& value) const {
        size_t index = hash(key);
        for (const auto& entry : table[index]) {
            if (entry.key == key) {
                value = entry.value;
                return true;
            }
        }
        return false;
    }

    size_t getSize() const {
        return size;
    }

    size_t getCapacity() const {
        return capacity;
    }
};

int main() {
    HashTable<int, std::string> ht;
    ht.insert(1, "one");
    ht.insert(11, "eleven");
    ht.insert(21, "twenty-one");

    std::string value;
    if (ht.find(11, value)) {
        std::cout << "Found: " << value << std::endl;
    } else {
        std::cout << "Not found" << std::endl;
    }

    ht.remove(11);

    if (ht.find(11, value)) {
        std::cout << "Found: " << value << std::endl;
    } else {
        std::cout << "Not found" << std::endl;
    }

    return 0;
}

关键点解释：

哈希函数：使用std::hash<Key>来生成键的哈希值。
冲突解决策略：使用链地址法，每个桶是一个std::list<Entry>。
动态扩容：当负载因子超过loadFactorThreshold时，将容量加倍。
内存管理：使用std::vector来管理桶，自动处理内存分配和释放。
线程安全：这个示例没有考虑线程安全，如果需要在多线程环境中使用，可以考虑使用互斥锁或其他同步机制。

这个示例是一个基础的哈希表实现，可以根据具体需求进行扩展和优化。

自定义C++ Hash表设计思路

关键点解释：

相关阅读