Quick Info

Category: Sorting
Time Complexity: Best: Ω(n+k)
Average: Θ(n+k)
Worst: O(n²)
Space Complexity: O(n+k)
Stable: Yes
In-Place: No

Bucket Sort

Description

Bucket Sort is a distribution-based sorting algorithm that works by distributing elements into a number of buckets, sorting these buckets individually, and then concatenating them to get the final sorted array. It's particularly efficient when input is uniformly distributed over a range.

History

Bucket Sort Concept

Bucket Sort emerged from the need to sort large datasets with uniformly distributed values. Its origins can be traced back to the early days of computer science when researchers were exploring ways to sort data more efficiently than comparison-based methods. The algorithm gained prominence in the 1950s alongside other distribution-based sorting methods. Its ability to achieve linear time complexity under certain conditions made it particularly valuable for specific applications, such as sorting floating-point numbers between 0 and 1. The concept of bucket sort has influenced various modern sorting techniques and is still used in specialized applications where data distribution is known to be roughly uniform.

How It Works

  1. Create Buckets:
    • Initialize array of empty buckets
    • Number of buckets based on input range
  2. Distribute:
    • Scatter elements into appropriate buckets
    • Based on element values
  3. Sort Buckets:
    • Sort elements within each bucket
    • Can use any sorting algorithm
  4. Gather:
    • Concatenate all buckets in order
    • Results in final sorted array

Visualization

Click Start to begin visualization

Implementation


def bucket_sort(arr):
    # Find range of input array
    if len(arr) == 0:
        return arr
        
    # Create buckets
    n_buckets = len(arr)
    buckets = [[] for _ in range(n_buckets)]
    
    # Distribute elements into buckets
    for num in arr:
        # Assuming numbers are in range [0,1)
        bucket_idx = int(n_buckets * num)
        buckets[bucket_idx].append(num)
    
    # Sort individual buckets
    for bucket in buckets:
        bucket.sort()
    
    # Concatenate all buckets into final result
    result = []
    for bucket in buckets:
        result.extend(bucket)
    
    return result

# Example usage
if __name__ == "__main__":
    arr = [0.78, 0.17, 0.39, 0.26, 0.72, 0.94, 0.21, 0.12, 0.23, 0.68]
    print("Original array:", arr)
    sorted_arr = bucket_sort(arr)
    print("Sorted array:", sorted_arr)
                                        

#include <vector>
#include <algorithm>
#include <iostream>

void bucketSort(std::vector<float>& arr) {
    int n = arr.size();
    
    // Create n empty buckets
    std::vector<std::vector<float>> buckets(n);
    
    // Put array elements into different buckets
    for(int i = 0; i < n; i++) {
        int bucketIndex = n * arr[i];
        buckets[bucketIndex].push_back(arr[i]);
    }
    
    // Sort individual buckets
    for(int i = 0; i < n; i++) {
        std::sort(buckets[i].begin(), buckets[i].end());
    }
    
    // Concatenate all buckets into arr[]
    int index = 0;
    for(int i = 0; i < n; i++) {
        for(float j : buckets[i]) {
            arr[index++] = j;
        }
    }
}

int main() {
    std::vector<float> arr = {0.897, 0.565, 0.656, 0.1234, 0.665, 0.3434};
    
    std::cout << "Original array: ";
    for(float i : arr) std::cout << i << " ";
    std::cout << std::endl;
    
    bucketSort(arr);
    
    std::cout << "Sorted array: ";
    for(float i : arr) std::cout << i << " ";
    std::cout << std::endl;
    
    return 0;
}
                                        

using System;
using System.Collections.Generic;
using System.Linq;

public class BucketSort {
    public static void Sort(float[] arr) {
        int n = arr.Length;
        
        // Create n empty buckets
        List<float>[] buckets = new List<float>[n];
        for(int i = 0; i < n; i++) {
            buckets[i] = new List<float>();
        }
        
        // Put array elements into different buckets
        for(int i = 0; i < n; i++) {
            int bucketIndex = (int)(n * arr[i]);
            buckets[bucketIndex].Add(arr[i]);
        }
        
        // Sort individual buckets
        for(int i = 0; i < n; i++) {
            buckets[i].Sort();
        }
        
        // Concatenate all buckets into arr[]
        int index = 0;
        for(int i = 0; i < n; i++) {
            foreach(float item in buckets[i]) {
                arr[index++] = item;
            }
        }
    }
    
    public static void Main(string[] args) {
        float[] arr = { 0.897f, 0.565f, 0.656f, 0.1234f, 0.665f, 0.3434f };
        
        Console.WriteLine("Original array: " + string.Join(" ", arr));
        Sort(arr);
        Console.WriteLine("Sorted array: " + string.Join(" ", arr));
    }
}
                                        

Applications

  • Sorting uniformly distributed data
  • Sorting floating-point numbers in range [0,1)
  • Database operations
  • External sorting with limited memory
  • Parallel sorting implementations

Complexity Analysis

Case Time Complexity Description
Best Case Ω(n + k) When elements are uniformly distributed across buckets
Average Case Θ(n + k) With uniform distribution and O(1) elements per bucket
Worst Case O(n²) When all elements go to the same bucket
Space Complexity O(n + k) Where k is the number of buckets

Notes:

  • n - Number of elements in the array
  • k - Number of buckets
  • Performance heavily depends on input distribution
  • Number of buckets affects time-space trade-off
  • Individual bucket sorting affects overall complexity

Advantages and Disadvantages

Advantages

  • Linear time complexity for uniform distribution
  • Stable sorting algorithm
  • Can be parallelized easily
  • Good for floating-point numbers

Disadvantages

  • Not in-place sorting
  • Requires extra space
  • Performance depends on data distribution
  • Not suitable for integer sorting