Sorting Algorithms | Bucket Sort: Principles, Implementation, and Optimization

xiaoxiao 2025-09-03 235

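Sorting is everywhere in daily life: lining up in order, queuing for tickets, exam rankings, company sales rankings, email sorted by date, VIP members shown first (in red) in a QQ friend list, and so on.

Let's start with an example that introduces our first algorithm.

After a final exam, the teacher wants to sort everyone's scores. Suppose 5 students scored 5, 9, 5, 1, and 6 (out of 10); sorted from highest to lowest, that is 9, 6, 5, 5, 1. Can you write a program that reads 5 arbitrary numbers and sorts them?

Take 5 minutes to think about how you would do it. There are of course many approaches; here we use the idea behind bucket sort.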

We take 11 buckets, numbered 0 to 10, one for each possible score from 0 to 10, as shown in Figure 1.

Figure 1: Prepare 11 buckets and number them

Next, we put each score into the bucket with the matching number, as shown in Figure 2.

Figure 2: Scores placed into the buckets

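Then we go from the highest-numbered bucket to the lowest and output the scores in each one: 9, 6, 5, 5, 1. Sorting done, with very little effort. That is the idea behind bucket sort.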
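The score example can be sketched in a few lines of Java. This is only an illustration of the idea, assuming every score lies in 0..maxScore; the class name `ScoreBuckets` and the method `sortDescending` are made up for this sketch.

```java
import java.util.Arrays;

public class ScoreBuckets {
    /** Returns the scores in descending order; assumes every score is in 0..maxScore. */
    public static int[] sortDescending(int[] scores, int maxScore) {
        int[] buckets = new int[maxScore + 1]; // buckets[i] counts how many students scored i
        for (int score : scores) {
            buckets[score]++;
        }
        int[] sorted = new int[scores.length];
        int pos = 0;
        for (int i = maxScore; i >= 0; i--) { // highest-numbered bucket first
            for (int c = 0; c < buckets[i]; c++) {
                sorted[pos++] = i;
            }
        }
        return sorted;
    }

    public static void main(String[] args) {
        int[] scores = {5, 9, 5, 1, 6};
        System.out.println(Arrays.toString(sortDescending(scores, 10))); // prints [9, 6, 5, 5, 1]
    }
}
```

Walking the buckets from index 0 upward instead would give ascending order.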

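What is bucket sort

Bucket sort, also called bin sort, is a sorting algorithm. In the simple form shown above it is one of the simplest sorting algorithms and, when the range of values is small, one of the fastest. The idea: first determine the range of the elements to be sorted, then prepare one bucket for every value in that range, then put each element into its bucket, and finally output the buckets in order.

This is really a simplified version of bucket sort. Imagine the scores ranged from 0 to 1,000,000: would we set up a million buckets?

In that situation a bucket no longer holds just one value; often a single bucket holds several different elements. Doesn't that sound a bit like a hash table? Full bucket sort does rest on the same principle as a hash table: a function maps each element to a bucket.

Besides storing the elements of a bucket in a linked list, we may also sort the elements inside each bucket with some other sorting algorithm, so in practice bucket sort is usually combined with other sorting algorithms.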
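As a rough sketch of buckets that hold several elements, the code below spreads values from a known range over a fixed number of buckets and keeps each bucket ordered with an insertion step. The class `RangeBuckets`, the helper `bucketIndex`, and the choice of 10 buckets are assumptions made for illustration; they do not come from the original article.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RangeBuckets {
    /** Maps a value in [min, max] to a bucket number in 0..bucketCount-1 (hypothetical helper). */
    static int bucketIndex(int value, int min, int max, int bucketCount) {
        // long arithmetic avoids overflow when the range is wide
        return (int) (((long) value - min) * bucketCount / ((long) max - min + 1));
    }

    /** Sorts data in ascending order; assumes every value lies in [min, max]. */
    public static void sort(int[] data, int min, int max, int bucketCount) {
        List<List<Integer>> buckets = new ArrayList<>();
        for (int i = 0; i < bucketCount; i++) {
            buckets.add(new ArrayList<>());
        }
        for (int v : data) {
            List<Integer> bucket = buckets.get(bucketIndex(v, min, max, bucketCount));
            // Insertion step: find the slot that keeps this bucket ordered
            int pos = bucket.size();
            while (pos > 0 && bucket.get(pos - 1) > v) {
                pos--;
            }
            bucket.add(pos, v);
        }
        int k = 0;
        for (List<Integer> bucket : buckets) { // buckets cover increasing value ranges
            for (int v : bucket) {
                data[k++] = v;
            }
        }
    }

    public static void main(String[] args) {
        int[] data = {730000, 12, 999999, 500, 250000, 730001, 0};
        sort(data, 0, 1_000_000, 10);
        System.out.println(Arrays.toString(data)); // prints [0, 12, 500, 250000, 730000, 730001, 999999]
    }
}
```

Here ten buckets cover a range of about a million values, so the space cost depends on the bucket count rather than on the width of the range.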

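Implementing bucket sort

Simple version

How do we implement bucket sort in code? It is quite simple: use an array. With 11 buckets, we declare an array of length 11; each time an element goes into a bucket, we add 1 to the value at that index. At the end we walk the array indices and print each index as many times as the value stored there. Walking from the last index to the first gives descending order, as in the score example; walking from the first to the last gives ascending order, which is what the code below does.

Let's look at the code for the simple version of bucket sort.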

public class BucketSort01 {
    public static void main(String[] args) {
        int[] array = {5, 9, 1, 9, 5, 3, 7, 6, 1}; // array to sort; every value must be in 0..10
        int[] buckets = new int[11];
        sort(array, buckets);
        print(buckets);
    }

    /** Count each value: buckets[v] ends up holding how many times v occurs. */
    public static void sort(int[] array, int[] buckets) {
        for (int i = 0; i < array.length; i++) {
            buckets[array[i]]++;
        }
    }

    /** Print the values in ascending order. */
    public static void print(int[] buckets) {
        // Walk the buckets in index order
        for (int i = 0; i < buckets.length; i++) {
            // buckets[i] is the number of elements equal to i, so print i that many times
            for (int j = 0; j < buckets[i]; j++) {
                System.out.print(i + " ");
            }
        }
    }
}

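Full version

A bucket does not always hold a single value; often one bucket holds several different elements. In this version each bucket covers a range of values, and the contents of each bucket are sorted separately.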

import java.util.ArrayList;
import java.util.Collections;

public class BucketSort03 {
    public static void main(String[] args) {
        int[] array = {50, 9, 1, 9, 53, 33, 27, 6, 1}; // array to sort
        sort(array);
        print(array);
    }

    /** Sort in ascending order. */
    public static void sort(int[] array) {
        if (array.length == 0) {
            return; // nothing to sort; also avoids dividing by zero below
        }
        // Find the largest and smallest elements
        int max = Integer.MIN_VALUE;
        int min = Integer.MAX_VALUE;
        for (int i = 0; i < array.length; i++) {
            max = Math.max(max, array[i]);
            min = Math.min(min, array[i]);
        }
        // Number of buckets: each bucket covers array.length consecutive values,
        // so the range (max - min) needs (max - min) / array.length + 1 buckets
        int bucketNum = (max - min) / array.length + 1;
        // Create the empty buckets
        ArrayList<ArrayList<Integer>> bucketArr = new ArrayList<>(bucketNum);
        for (int i = 0; i < bucketNum; i++) {
            bucketArr.add(new ArrayList<Integer>());
        }
        // Put every element into its bucket
        for (int i = 0; i < array.length; i++) {
            // (array[i] - min) / array.length is the number of the bucket this element belongs to
            int num = (array[i] - min) / (array.length);
            bucketArr.get(num).add(array[i]);
        }
        // Sort each bucket
        for (int i = 0; i < bucketArr.size(); i++) {
            Collections.sort(bucketArr.get(i));
        }
        // Merge the buckets back into the array
        int j = 0;
        for (ArrayList<Integer> tempList : bucketArr) {
            for (int i : tempList) {
                array[j++] = i;
            }
        }
    }

    /** Print the array. */
    public static void print(int[] array) {
        for (int i = 0; i < array.length; i++) {
            System.out.print(array[i] + " ");
        }
        System.out.println();
    }
}

Advanced version

A bucket sort built on radix sort. See: Radix Sort: Principles, Implementation, and Optimization

public class BucketSort04 {
    public static void main(String[] args) {
        int[] array = {51, 944, 1, 9, 57, 366, 79, 6, 1, 345}; // array to sort (non-negative values only)
        sort(array);
        System.out.println("Final sorted data:");
        print(array);
    }

    /** Sort in ascending order. */
    public static void sort(int[] data) {
        int n = data.length;
        // Arrays stand in for linked lists: this costs some space but keeps the operations simple.
        // Ten buckets: rows 0-9 of the 2D array are the buckets, and each row is that bucket's storage.
        int[][] bask = new int[10][n];
        // Number of slots used in each bucket
        int[] index = new int[10];
        // Number of digits in the longest value, e.g. 5978 has 4 digits.
        // Starting from 0 means an empty array simply skips the rounds below.
        int max = 0;
        for (int i = 0; i < n; i++) {
            int k = (data[i] + "").length();
            max = max > k ? max : k;
        }
        String str;
        // Each round pads every value to max digits, then distributes by one digit:
        // round 1 uses the ones digit, round 2 the tens digit, and so on.
        // After round 1 the values below 10 are in order, after round 2 the values below 100, ...
        for (int i = max - 1; i >= 0; i--) {
            System.out.println("Round " + (max - i) + " data after padding:");
            // Visit every number once
            for (int j = 0; j < n; j++) {
                str = "";
                // Left-pad with zeros up to max digits
                if (Integer.toString(data[j]).length() < max) {
                    for (int k = 0; k < max - Integer.toString(data[j]).length(); k++)
                        str += "0";
                }
                str += Integer.toString(data[j]);
                System.out.printf("%5s", str);
                // str.charAt(i) - '0' is the digit for this round and selects the bucket;
                // index[digit]++ yields the next free slot in that bucket.
                // index is reset to zero at the end of every round.
                bask[str.charAt(i) - '0'][index[str.charAt(i) - '0']++] = data[j];
            }
            // Copy the buckets back into data in bucket order
            int pos = 0;
            for (int j = 0; j < 10; j++) {
                // Bucket j holds index[j] values
                for (int k = 0; k < index[j]; k++) {
                    data[pos++] = bask[j][k];
                }
            }
            System.out.println();
            System.out.println("Round " + (max - i) + " contents of index:");
            print(index);
            System.out.println("Round " + (max - i) + " contents of the buckets:");
            print(bask);
            System.out.println("Round " + (max - i) + " data after the round:");
            print(data);
            System.out.println();
            // Reset index[x] to zero for the next round
            for (int x = 0; x < 10; x++)
                index[x] = 0;
        }
    }

    public static void print(int[][] array) {
        for (int j = 0; j < array.length; j++) {
            for (int k = 0; k < array[j].length; k++) {
                System.out.printf("%5d", array[j][k]);
            }
            System.out.println();
        }
    }

    public static void print(int[] array) {
        for (int j = 0; j < array.length; j++) {
            System.out.printf("%5d", array[j]);
        }
        System.out.println();
    }
}

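Console output:

After round 1 the values below 10 are in order; after round 2 the values below 100 are in order, and so on. In the bucket dumps, only the first index[j] entries of row j belong to the current round; the rest of the row still shows leftovers from earlier rounds.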

Round 1 data after padding:
  051  944  001  009  057  366  079  006  001  345
Round 1 contents of index:
    0    3    0    0    1    1    2    1    0    2
Round 1 contents of the buckets:
    0    0    0    0    0    0    0    0    0    0
   51    1    1    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0
  944    0    0    0    0    0    0    0    0    0
  345    0    0    0    0    0    0    0    0    0
  366    6    0    0    0    0    0    0    0    0
   57    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0
    9   79    0    0    0    0    0    0    0    0
Round 1 data after the round:
   51    1    1  944  345  366    6   57    9   79

Round 2 data after padding:
  051  001  001  944  345  366  006  057  009  079
Round 2 contents of index:
    4    0    0    0    2    2    1    1    0    0
Round 2 contents of the buckets:
    1    1    6    9    0    0    0    0    0    0
   51    1    1    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0
  944  345    0    0    0    0    0    0    0    0
   51   57    0    0    0    0    0    0    0    0
  366    6    0    0    0    0    0    0    0    0
   79    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0
    9   79    0    0    0    0    0    0    0    0
Round 2 data after the round:
    1    1    6    9  944  345   51   57  366   79

Round 3 data after padding:
  001  001  006  009  944  345  051  057  366  079
Round 3 contents of index:
    7    0    0    2    0    0    0    0    0    1
Round 3 contents of the buckets:
    1    1    6    9   51   57   79    0    0    0
   51    1    1    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0
  345  366    0    0    0    0    0    0    0    0
  944  345    0    0    0    0    0    0    0    0
   51   57    0    0    0    0    0    0    0    0
  366    6    0    0    0    0    0    0    0    0
   79    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0
  944   79    0    0    0    0    0    0    0    0
Round 3 data after the round:
    1    1    6    9   51   57   79  345  366  944

Final sorted data:
    1    1    6    9   51   57   79  345  366  944

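Time complexity of bucket sort

For N elements and M buckets, with about N/M elements per bucket on average, the average time complexity of bucket sort is: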

O(N)+O(M*(N/M)*log(N/M)) = O(N+N*(logN-logM)) = O(N+N*logN-N*logM)

When N = M, the limiting case where each bucket holds exactly one element, bucket sort reaches its best-case efficiency of O(N).
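To get a feel for the formula, the sketch below simply evaluates N + N*log2(N/M) for a few bucket counts. It is a back-of-the-envelope model that ignores constant factors and assumes the elements spread evenly over the buckets; the class name `BucketCostModel` and the method `estimate` are made up for this illustration.

```java
public class BucketCostModel {
    /** Evaluates N + N * log2(N / M); constant factors are ignored. */
    public static double estimate(int n, int m) {
        double perBucket = (double) n / m; // average number of elements per bucket
        return n + n * (Math.log(perBucket) / Math.log(2));
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        for (int m : new int[]{1, 1_000, 1_000_000}) {
            System.out.printf("N=%d, M=%d -> about %.0f steps%n", n, m, estimate(n, m));
        }
    }
}
```

With M = 1 the estimate is that of an ordinary comparison sort plus one pass over the data, and it shrinks toward N as M approaches N.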

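Summary: the average time complexity of bucket sort is O(N+C), where C = N*(logN - logM); this assumes the elements spread evenly over the buckets. For the same N, the more buckets M there are, the more efficient the sort, with a best case of O(N). The space complexity is O(N+M), so when the input is very large and the number of buckets is also large, the space cost is undeniably high. Bucket sort is also stable, provided elements are appended to their buckets in input order and the sort used inside each bucket is itself stable.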
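Stability only becomes visible when elements carry more than their sort key. The sketch below sorts hypothetical `Student` records by score with one bucket per score; because each record is appended to the end of its bucket, students with equal scores keep their input order. The types and names here are assumptions made for the illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class StableBucketDemo {
    /** Hypothetical record: a student name with a score in 0..maxScore. */
    public static final class Student {
        public final String name;
        public final int score;

        public Student(String name, int score) {
            this.name = name;
            this.score = score;
        }
    }

    /** Ascending by score; one bucket per score, filled in input order. */
    public static List<Student> sort(List<Student> input, int maxScore) {
        List<List<Student>> buckets = new ArrayList<>();
        for (int i = 0; i <= maxScore; i++) {
            buckets.add(new ArrayList<>());
        }
        for (Student s : input) {
            buckets.get(s.score).add(s); // appending keeps arrival order inside the bucket
        }
        List<Student> out = new ArrayList<>();
        for (List<Student> bucket : buckets) {
            out.addAll(bucket);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Student> in = Arrays.asList(
                new Student("Ann", 5), new Student("Bob", 9), new Student("Cid", 5));
        for (Student s : sort(in, 10)) {
            System.out.print(s.name + ":" + s.score + " "); // prints Ann:5 Cid:5 Bob:9
        }
        System.out.println();
    }
}
```

Ann stays ahead of Cid because both have score 5 and Ann came first in the input.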

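From this analysis we can see the character of bucket sort: it is fast and simple, but its weakness is poor space utilization. If the data spans too wide a range, the space required may be unaffordable; in other words, such data is not suited to bucket sort.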

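When to use bucket sort

The suitable scenarios for bucket sort are clear: when the data is fairly evenly distributed, or when the range it spans is not very large, bucket sort is fast and simple.

When the data spans a wide range, however, the space consumption becomes large; if the range of values is extremely large, the space cost is simply impractical, so the algorithm has its limits. Likewise, because the time complexity of the simple version is O(n+m), where n is the number of elements and m the number of buckets, performance also suffers when m is much larger than n.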
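A quick way to see the space problem is to count how many buckets the one-bucket-per-value version would need for a given input. The sketch below assumes the buckets start at the smallest value rather than at 0; the class `BucketSpaceCheck`, its methods, and the `factor` threshold are hypothetical and serve only as a rule of thumb.

```java
public class BucketSpaceCheck {
    /** Number of counters the one-bucket-per-value version needs (max - min + 1). */
    public static long bucketsNeeded(int[] data) {
        if (data.length == 0) {
            return 0;
        }
        int min = Integer.MAX_VALUE;
        int max = Integer.MIN_VALUE;
        for (int v : data) {
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        return (long) max - min + 1; // long, because the span can exceed the int range
    }

    /** Hypothetical rule of thumb: accept at most factor buckets per element. */
    public static boolean worthCounting(int[] data, int factor) {
        return bucketsNeeded(data) <= (long) factor * data.length;
    }

    public static void main(String[] args) {
        int[] scores = {5, 9, 5, 1, 6};
        int[] sparse = {1, 1_000_000};
        System.out.println(bucketsNeeded(scores)); // prints 9
        System.out.println(bucketsNeeded(sparse)); // prints 1000000
        System.out.println(worthCounting(scores, 4)); // prints true
        System.out.println(worthCounting(sparse, 4)); // prints false
    }
}
```

For the two-element array almost all of the million buckets stay empty, and the output pass still has to walk every one of them.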

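In practice, though, bucket sort is often implemented along the lines of a hash table. Space utilization is then much better; the time complexity goes up somewhat, but efficiency is still good.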
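One possible reading of this, sketched below, is to keep the buckets in a map so that only non-empty buckets take up space. The article does not say which structure it has in mind; the sketch uses a `TreeMap` keyed by bucket number so the buckets come back in order, at the cost of a logarithmic lookup per element. The class name `SparseBucketSort` and the parameter `bucketWidth` are assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class SparseBucketSort {
    /** Sorts data in ascending order; each bucket covers bucketWidth consecutive values. */
    public static void sort(int[] data, int bucketWidth) {
        // Only buckets that receive an element get an entry in the map
        Map<Integer, List<Integer>> buckets = new TreeMap<>();
        for (int v : data) {
            int id = Math.floorDiv(v, bucketWidth); // floorDiv keeps negative values in order too
            buckets.computeIfAbsent(id, k -> new ArrayList<>()).add(v);
        }
        int pos = 0;
        for (List<Integer> bucket : buckets.values()) { // TreeMap iterates in ascending key order
            Collections.sort(bucket);
            for (int v : bucket) {
                data[pos++] = v;
            }
        }
    }

    public static void main(String[] args) {
        int[] data = {1_000_000, 3, 999_950, 7, 42};
        sort(data, 100);
        System.out.println(Arrays.toString(data)); // prints [3, 7, 42, 999950, 1000000]
    }
}
```

With a bucket width of 100 this input occupies three buckets, where a dense array of buckets over the same range would need about ten thousand.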

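In day-to-day development, apart from cases with very demanding performance requirements and fairly evenly distributed data, we rarely use bucket sort. So even though it is simple and fast, it does not see much use.

Bucket sort is used more in specific settings, for example when the data range is limited or when there are particular requirements, such as quickly looking up values through a hash mapping or counting how many times each number occurs. All of this depends on knowing the range of the data; if the range is too large, you need a clever workaround or a different algorithm.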

Applications of bucket sort

Baidu Baike: https://baike.baidu.com/item/桶排序/4973777#5

When reposting, please cite the original address: https://www.6miu.com/read-5035688.html
