Sunday, September 22, 2013

Binary Search Algorithm


Binary search also known as half-interval search, finds the index of a given integer (key) from a sorted array. 
 If the given integer is found, it returns the index of the integer inside array (positive value between 0 and n-1)
 Else it returns -1 (negative number that suggests that value don't exist in the array).

For example, suppose the given sorted array is:

                     

                                                                   Figure:  1

Simulation:
Input    : The integer that needs to be searched in the index
Output : Index of the integer found in array. (-1 means value not found)

Example 1:
Input:  5.
Output: 2.

Example 2:
Input:  1.
Output: 0.

Example 3:
Input:  25.
Output: -1.

Basic:  

In order to apply binary search on a given array of integers, the given array should be sorted (either in ascending or descending order).  Suppose the given sorted array  is arr[] whose length is n. We will be using the following terms during the explanation of binary search:

arr     -  given sorted array.
n       -  size of the array.
key  -   the number that needs to be searched in the array.
start   - index of first element of array.
end    - index of the last element of array + 1.
mid    - denotes the middle index of the array that we are sending in the algorithm.

Algorithm:

 binarySearch(arr, start, end, key)  {

     if (start < end) {
         mid = (start + end) / 2;   // get middle index of array.
         if(arr[mid] == key)   // check if middle element matches key
             return mid;      
         else if (arr[mid] < key)  // check if value exists in left sub array
   return binarySearch(arr, start, mid, key);  // perform binarySearch on left sub array
else
return binarySearch(arr, mid + 1, end, key);  // perform binarySearch on right sub array
     }
     return -1;  // value not found.
}

Explanation:

(Note: For current explanation, we consider that the array is sorted in ascending order)

The binarySearch function takes the following parameters as input:
arr - given array
start - start of array
end - size of array.
key - value to be found.

The above pseudocode is recursive approach to binary search.  For each iteration it performs the following steps: 

For our example, the values we are going to use are:

arr = {1, 3, 5, 10, 19, 23}
start = 0
end = 5
key = 23

Initially we check whether start < end.  This is done to make sure that our current index is within the range of the array.   

1) Find the middle element: 

In binary search, the first step is to find the middle index of the array.  We do this by taking average of start and end index of the array.

         mid = (start + end) / 2;



                                                                    Figure:  2

 In current example, start = 0, end  = 5.

mid = (start + end) / 2
      = (0 + 5) / 2
      = 3 

2) Check the middle element: 

Now, we have divided array into 3 parts:
          1) Middle element
          2) Left Sub array (Elements of array on left of the middle element)
          3) Right Sub array (Elements of array on right of the middle element)

First we check whether the middle element of array matches key or not.  If it matches return the index of the middle element.

         if(arr[mid] == key)
             return mid;

Here mid = 3, if the key we were searching for was 10, then arr[mid] would be equal to key and it would return the mid = 3.  But in our example key = 23, so it will bypass this condition.

3) Check Left Sub array: 

If the middle element does not match with given value, we check whether the element exists in the left sub array or not. To do this we simply check whether the middle element value is greater than key or not. If middle element is greater than key, then it follows that all elements on right sub array are greater than key (as the array is sorted in ascending order). So in this step we perform recursive call to perform binary search on left sub array. To do this we send start as starting index and mid as ending index. 

         else if (arr[mid] < key)
   return binarySearch(arr, start, mid, key);

 If key was less than arr[mid],  then key exists in the left sub array. As shown in figure 2, for left sub array:
start = 0
end  = ( index of last element of left sub array +  1 )
        = 2 + 1 
        = 3 (which is same as mid).

Thus while calling binary search recursively, we send start and mid to denote left sub array.

4) Check Right Sub array:

In above steps we already checked whether that key does not match middle element and it does not exist in the left sub array.  So the only possibility of existence of key on array is on the right sub array. So in this step we perform recursive call to perform binary search on right sub array.

else
return binarySearch(arr, mid + 1, end, key);


From above steps we can say that key may possibly exist in the right sub array. As shown in figure 2, for right sub array:
start = starting index of the right sub array (index of element after mid)
        = mid + 1
        = 4
end   =  5.

Thus while calling binary search recursively, we send (mid+1) and end to denote right sub array.

We perform this recursively until the key is found, or till the condition start < end fails.  If the condition fails, it means we have searched the whole array and the value is not found. So we return -1 to denote that value was not found in array.


Run time:  

If we divide the array recursively from mid and form a tree where root is middle element, left sub-tree is formed from left sub array and right sub-tree is formed from right sub array, we get a tree as shown below from the given array.


  

If we look closely to the newly obtained tree from the array, we can see that the tree is actually a binary search tree.  Now when we perform binary search on an array, we simply perform search on the binary search tree from root to leaves.  

The height of the tree = log(size of the array) = log(n). 

For our example size of array is 6.
Height of tree = log(6) = 3.

Thus, every time we perform binary search on an array, the run time will never exceed the height of tree, which is log(n). 
Thus the worst case run time of binary search is O( log(n) ).

If the key is same as middle element of the array, then we obtain the result in 1 step.  Thus the best case run-time of binary search is O(1)

The average case run-time of binary search is O( log(n) ).

Recursive Binary Search Implementation in Java:

public class BinarySearch {

private int[] arr;
int key;

void setValues(int[] arr, int key) {
this.arr = arr;
this.key = key;
}

int binarySearch(int start, int end) {

if (start < end) {

int mid = (start + end) / 2;

   // checks if value is found or not.
if (arr[mid] == key)
return mid;  // returns index if value is found.
else if (arr[mid] > key)
return binarySearch(start, mid);
else
return binarySearch(mid + 1, end);

}

return -1;  // value not found
}

public static void main(String[] args) {

int[] arr = { 1, 2, 3, 5, 6, 7, 9, 10 };
BinarySearch obj = new BinarySearch();
obj.setValues(arr, 2);
int keyIndex = obj.binarySearch(0, arr.length);

if (keyIndex != -1)
System.out.println("Value found at Index : "+keyIndex);
else
System.out.println("Value not found");
}
}

2 comments:

Anonymous said...

Sir can you tell me what obj.setValues does?

Unknown said...

The setValues() is a setter to set the values of array and the key. Once we set the values of array and key, we don't have to pass them for every recursive call in binarySearch().