Sorted Array: Interpolation Search

Interpolation search is a modification of binary search, where if the keys are in a uniform distribution then we get a much better time complexity.

To better understand what interpolation search is, let us take an example: 

Suppose you wanted to search for the word "Algorithm" in the dictionary. You know that the word 'Algorithm' starts with alphabet 'A' hence you won't be going on the middle page of the dictionary in the 'M' section but would open it a little closer to the start of the dictionary. This will help you search the word faster.

Interpolation search is similar to what you do while searching in the dictionary. It uses the fact that the keys in array are uniformly distributed and calculates the index (mid index in case of binary search) to be searched efficiently.

The value of the index ('mid') is calculated by the below formula:


# mid -> index to search on
# low -> index of left most element in search space
# high -> index of right most element in search space
# arr[low], arr[high] -> values at index low and high in the array

mid = low + ((key - arr[low]) * (high - low) / (arr[high] - arr[low]))


The rest of the algorithm is same as binary search.

Code in Python:

def interpolation_search(arr, key):
    low = 0
    high = len(arr) - 1

    while arr[high] != arr[low] and key >= arr[low] and key <= arr[high]:
        mid = int(low + ((key - arr[low]) * (high - low) / (arr[high] - arr[low])))
        
        if arr[mid] == key:
            return mid
        elif arr[mid] < key:
            low = mid + 1
        else:
            high = mid - 1

    return -1
    
# input arr
arr = [2, 4, 6, 8, 10, 12, 14, 16]

# interpolation_search call to search 3 in arr
print('6 is at index: ', interpolation_search(arr, 6))

# Output: 6 is at index:  2

Example:

Ex 1. Let the input array be: [2, 4, 6, 8, 10, 12, 14, 16] 
    and key = 6

    low = 0, high = 7 
    mid = low + ((key - arr[low]) * (high - low) / (arr[high] - arr[low]))
    mid = 0 + ((6 - 2)  * (7 - 0) / (16 - 2)) = 2 
    arr[mid] = 6
    arr[mid] = key -> element found -> return mid

Ex 2. Let the input array be: [1, 2, 6, 10, 13, 20, 25, 26, 27, 30]
    and key = 20

    low = 0, high = 9
    mid = low ((key - arr[low]) * (high - low) / (arr[high] - arr[low]))
    mid = 0 + ((20 - 2)  * (9 - 0) / (30 - 1)) = 5
    arr[mid] = 20
    arr[mid] = key -> element found -> return mid

(Ex. 2 in our binary search implementation took 3 iterations whereas interpolation search found the element in the 1st iteration itself!!)

Try it yourself on  the input arr = [1, 2, 4, 8, 16, 32, 64, 128] to search for 4. It should take 3 iterations to find 4.

Time Complexity:

If we make no assumption about the distribution of keys in the array, interpolation search is O(N) where N is the size of the array, that is, in the worst case the algorithm is linear in time complexity.
Worst case example: keys are distributed exponentially.
arr = [1, 2, 4, 8, 16, 32, 64,...]

However, if the keys are distributed uniformly, interpolation search is O(log log N).

--
Have any questions or suggestions? Post in the comments below!!

By,
Anubhav Shrimal

Comments

  1. Why is the best case completixity O ( log log n )? Can't it be O( log log log ... n ) as the size of the problem grows?

    ReplyDelete

Post a Comment

Popular posts from this blog

Segregate Even Odd value nodes in a linked list

Tree data structure

Reverse alternate k nodes in a linked list