Hashing in DBMS: An introduction and Types of Hashing

Introduction

In the complex structure of a database, data searching can be a hustle if one has to go through multiple index values to reach the required data. To tackle this problem hashing technique is used that calculates the direct location of a data record on the disk without using index structure. The data is stored in a memory location known as data blocks or data buckets. The address of these blocks is generated using the hashing function.

A hash function is a simple mathematical function to any complex mathematical function. Usually it uses the primary key to generate the address of the data block however this is not a restriction and the function can choose any of the column value to generate the address. The primary key itself can be considered as the address of the data block meaning each row with the same address as a primary key contained in the data block.

The main difference between indexing and hashing is that indexing uses B-tree like data structure for ordered and range queries and stores data in and ordered fashion providing logarithmic time in most operations. Contrary to that, hashing may provide constant time access on performing any operation but it can have collisions.

Types of Hashing:

Hashing has two types:

Static Hashing
dynamic Hashing

Static Hashing

In static hashing when a search-key value is provided, the hash algorithm always returns the same address. For instance, only 5 values will be generated given that the mod-4 hash function is used. The output address must always be same for that function. The number of buckets provided remains static or unchanged at all times.

A number of operations can be done when using static hashing.

Insert a record: To insert a record using static hash, an address will be generated for a new record based on the hash key and record is stored in that location. The hash function h computes the bucket address for search key K, where the record will be stored. For such case, Bucket address = h(K)
Searching a record: To search a record, the same hash function h(k) retrieves the address of the bucket where the data is stored.
Update a Record: To update a record, it will first be searched using a hash function and then the data record is updated.
Delete a Record: This is simply a search followed by a deletion operation where the record to be deleted will first be fetched and then deleted for that address in memory.

The problem of Bucket Overflow

Suppose we want to insert some new record into the file but the address of a data bucket generated by the hash function is not empty or data already exists in that address. This situation in the static hashing is known as bucket overflow or Collision. This is a fatal state in this method. To overcome this problem, two methods can be used:

Linear Probing / Open Hashing: When a hash function generates an address at which data is already stored, the next free bucket is allocated to it.
Overflow Chaining / Closed Hashing: When buckets are full, a new bucket is allocated for the same hash result and is linked after the previous one.

Dynamic Hashing

The dynamic hashing also known as extended hashing method is used to tackle the static hashing problems such as bucket overflow. One of the major issues with static hashing is that it does not expand or shrink dynamically in accordance with the size of the database. Dynamic hashing provides a mechanism in which data buckets are added and removed dynamically and on-demand.

The hash function in dynamic hashing generates a large number of values but only a few are used initially. The prefix of an entire hash value is considered as a hash index however to compute bucket addresses only a portion of the hash value is taken into account. Each hash index has a depth value to signify the number of bits that are being used for computing a hash function. These bits can address 2n buckets. When all these bits are consumed meaning all the buckets are full, the depth value is increased linearly and twice the buckets are allocated.

The list of operations when using dynamic hashing is as follows:

Query: Look at the depth value of the hash index and use those bits to compute the bucket address.
Insert a record: Compute the address of the bucket. If the bucket is already full then add more buckets, add additional bits to the hash value and Re-compute the hash function. Otherwise add data to the bucket.If all the buckets are full, perform the remedies of static hashing.
Update a record: Perform a query operation mentioned above and update the data.
Delete a record: Perform a query operation to locate the desired data and delete the same.

Hashing in DBMS: An introduction and Types of Hashing

Introduction

Types of Hashing:

Static Hashing

The problem of Bucket Overflow

Dynamic Hashing

By admin

Related Post

Leave a Reply Cancel reply

Hashing in DBMS: An introduction and Types of Hashing

Introduction

Types of Hashing:

Static Hashing

The problem of Bucket Overflow

Dynamic Hashing

By admin

Related Post

A Glance At React Hooks – useState, useEffect

Ongoing Web Development Trends

Leave a Reply Cancel reply