Binary Search For Duplicate Data

GOAL

The goal of this article is to describe an algorithm that uses binary search to find the first or last occurrence of a target item in a sorted list containing duplicates, and to show how to implement it in C++.

What is binary search?

Binary search is a search algorithm that finds the position of a target value in a sorted array, or determines that the value is absent.

Binary search compares the middle element of the current search range with the target value. If the two are equal, the search ends. If the middle element is less than the target, the range is narrowed to the part after the middle; otherwise it is narrowed to the part before the middle. Each comparison halves the range, so the search needs only O(log n) comparisons.

What if the array contains duplicate values?

If the target value appears more than once, the element found may be the first occurrence, the last one, or any one in between. How can we change the algorithm so that it always finds the first or the last occurrence?

Find first index

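When the current value equals the target value, do not return immediately; keep the current index in the next range as the new max index. When the current value is less than the target, move the min index just past the current index; when it is greater, move the max index just before it. Stop once the range has shrunk to at most one element, then check whether the value at the min index equals the target value.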

Find last index

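When the current value equals the target value, keep the current index in the next range as the new min index. Because a match leaves the min index on the current index, the midpoint has to be rounded up in this variant; rounded down, a two-element range would never shrink. Stop once the range has shrunk to at most one element, then check whether the value at the max index equals the target value.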

Implementation

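#include <iostream>
using namespace std;

// Returns the index of any element equal to target, or -1 if it is not found.
int binary_search(int nums[], int length, int target){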
    int min_index = 0;
    int max_index = length-1;
    int current_index;
    while(min_index<=max_index){
        // Same midpoint as (min_index+max_index)/2, but the sum cannot overflow.
        current_index = min_index+(max_index-min_index)/2;
        int current_value = nums[current_index];
        if(current_value == target){
            return current_index;
        }
        if(current_value < target){
            min_index = current_index+1;
        }
        else{
            max_index = current_index-1;
        }
    }
    return -1;
}
// Returns the index of the first element equal to target, or -1 if it is not found.
int binary_search_lower(int nums[], int length, int target){
    int min_index = 0;
    int max_index = length-1;
    int current_index;
    while(min_index<max_index){
        // Midpoint rounded down; written this way so the sum cannot overflow.
        current_index = min_index+(max_index-min_index)/2;
        int current_value = nums[current_index];
        if(current_value == target){
            max_index = current_index;
        }
        else if(current_value < target){
            min_index = current_index+1;
        }else{
            max_index = current_index-1;
        }
    }
    // The length check avoids reading nums[0] when the array is empty.
    if(length > 0 && nums[min_index] == target){
        return min_index;
    }
    return -1;
}
// Returns the index of the last element equal to target, or -1 if it is not found.
int binary_search_upper(int nums[], int length, int target){
    int min_index = 0;
    int max_index = length-1;
    int current_index;
    while(min_index<max_index){
        // Midpoint rounded up, so that min_index = current_index always shrinks the range.
        current_index = min_index+(max_index-min_index+1)/2;
        int current_value = nums[current_index];
        if(current_value == target){
            min_index = current_index;
        }
        else if(current_value < target){
            min_index = current_index+1;
        }else{
            max_index = current_index-1;
        }
    }
    // The length check avoids reading nums[-1] when the array is empty.
    if(length > 0 && nums[max_index] == target){
        return max_index;
    }
    return -1;
}
int main(void){
    int nums[10] = {1,4,4,5,8,9,10,10,10,15};
    int length = 10;
    int target = 10;
    cout << "binary_search: " << binary_search(nums, length, target) << endl;
    cout << "binary_search_lower: " << binary_search_lower(nums, length, target) << endl;
    cout << "binary_search_upper: " << binary_search_upper(nums, length, target) << endl;
    return 0;
}

lower_bound() and upper_bound()

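The C++ standard library provides two functions that perform a similar search, std::lower_bound() and std::upper_bound(), both declared in the <algorithm> header. Note that they return iterators rather than indices, and that they do not report whether the value is actually present.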

lower_bound()

Returns an iterator pointing to the first element in the range [first, last) that is not less than (i.e. greater or equal to) value, or last if no such element is found.

from std::lower_bound in cppreference.com

upper_bound()

Returns an iterator pointing to the first element in the range [first, last) that is greater than value, or last if no such element is found.

from std::upper_bound in cppreference.com