Sorted Array: Interpolation Search
Interpolation search is a modification of binary search, where if the keys are in a uniform distribution then we get a much better time complexity.
To better understand what interpolation search is, let us take an example:
Suppose you wanted to search for the word "Algorithm" in the dictionary. You know that the word 'Algorithm' starts with alphabet 'A' hence you won't be going on the middle page of the dictionary in the 'M' section but would open it a little closer to the start of the dictionary. This will help you search the word faster.
Interpolation search is similar to what you do while searching in the dictionary. It uses the fact that the keys in array are uniformly distributed and calculates the index (mid index in case of binary search) to be searched efficiently.
The value of the index ('mid') is calculated by the below formula:
The rest of the algorithm is same as binary search.
To better understand what interpolation search is, let us take an example:
![]() |
Interpolation search is similar to what you do while searching in the dictionary. It uses the fact that the keys in array are uniformly distributed and calculates the index (mid index in case of binary search) to be searched efficiently.
The value of the index ('mid') is calculated by the below formula:
# mid -> index to search on # low -> index of left most element in search space # high -> index of right most element in search space # arr[low], arr[high] -> values at index low and high in the array mid = low + ((key - arr[low]) * (high - low) / (arr[high] - arr[low]))
The rest of the algorithm is same as binary search.
Code in Python:
def interpolation_search(arr, key): low = 0 high = len(arr) - 1 while arr[high] != arr[low] and key >= arr[low] and key <= arr[high]: mid = int(low + ((key - arr[low]) * (high - low) / (arr[high] - arr[low]))) if arr[mid] == key: return mid elif arr[mid] < key: low = mid + 1 else: high = mid - 1 return -1 # input arr arr = [2, 4, 6, 8, 10, 12, 14, 16] # interpolation_search call to search 3 in arr print('6 is at index: ', interpolation_search(arr, 6)) # Output: 6 is at index: 2
Example:
Ex 1. Let the input array be: [2, 4, 6, 8, 10, 12, 14, 16]
and key = 6
low = 0, high = 7
mid = low + ((key - arr[low]) * (high - low) / (arr[high] - arr[low]))
mid = 0 + ((6 - 2) * (7 - 0) / (16 - 2)) = 2
arr[mid] = 6
arr[mid] = key -> element found -> return mid
Ex 2. Let the input array be: [1, 2, 6, 10, 13, 20, 25, 26, 27, 30]
and key = 20
low = 0, high = 9
mid = low + ((key - arr[low]) * (high - low) / (arr[high] - arr[low]))
mid = 0 + ((20 - 2) * (9 - 0) / (30 - 1)) = 5
arr[mid] = 20
arr[mid] = key -> element found -> return mid
(Ex. 2 in our binary search implementation took 3 iterations whereas interpolation search found the element in the 1st iteration itself!!)
Try it yourself on the input arr = [1, 2, 4, 8, 16, 32, 64, 128] to search for 4. It should take 3 iterations to find 4.
and key = 6
low = 0, high = 7
mid = low + ((key - arr[low]) * (high - low) / (arr[high] - arr[low]))
mid = 0 + ((6 - 2) * (7 - 0) / (16 - 2)) = 2
arr[mid] = 6
arr[mid] = key -> element found -> return mid
Ex 2. Let the input array be: [1, 2, 6, 10, 13, 20, 25, 26, 27, 30]
and key = 20
low = 0, high = 9
mid = low + ((key - arr[low]) * (high - low) / (arr[high] - arr[low]))
mid = 0 + ((20 - 2) * (9 - 0) / (30 - 1)) = 5
arr[mid] = 20
arr[mid] = key -> element found -> return mid
(Ex. 2 in our binary search implementation took 3 iterations whereas interpolation search found the element in the 1st iteration itself!!)
Try it yourself on the input arr = [1, 2, 4, 8, 16, 32, 64, 128] to search for 4. It should take 3 iterations to find 4.
Time Complexity:
If we make no assumption about the distribution of keys in the array, interpolation search is O(N) where N is the size of the array, that is, in the worst case the algorithm is linear in time complexity.
Worst case example: keys are distributed exponentially.
arr = [1, 2, 4, 8, 16, 32, 64,...]
However, if the keys are distributed uniformly, interpolation search is O(log log N).
--
Have any questions or suggestions? Post in the comments below!!
By,
Anubhav Shrimal
Have any questions or suggestions? Post in the comments below!!
By,
Anubhav Shrimal
Why is the best case completixity O ( log log n )? Can't it be O( log log log ... n ) as the size of the problem grows?
ReplyDelete