Why is the heap in heapsort "the wrong way"?

Question

In heapsort, the heap is a max-heap: each time, we extract the maximum element from index 0 and place it at the right side of the array. I'm now wondering why we don't build the max-heap in reverse, with its greatest value at the right (excluding already sorted elements). That way the value would already be at its final location and wouldn't need to be moved. Wouldn't that also be efficient when the input array turns out to be already sorted?

Is this due to implementation difficulties? For me, it seems to solve a lot of issues associated with heap sort.

Is there something I am missing?

As background info, here is the animation of standard heapsort taken from Wikipedia. Here the max-heap is at the left and values get moved to the right (the sorted partition):

It's not clear what you mean. A heap is not sorted in the traditional sense of the word. Rather, it's organized so that the lowest (or highest) value can repeatedly be extracted.
– 500 - Internal Server Error, Commented Jan 16 at 12:34
@500-InternalServerError I added a gif for clarification. What I wanted to know is why the biggest element of the heap is not on the right, so that less swapping would be needed.
– Fff, Commented Jan 16 at 12:38
Because the heap is not sorted. So it needs to be on the opposite side of the array relative to where the sorted elements need to end up. You then gradually shrink the heap and grow the sorted portion.
– Branko Dimitrijevic, Commented Jan 16 at 12:54

trincot · Accepted Answer · 2025-01-16 16:50:02Z

At first that idea looks nice: you would turn the input array into a max-heap with its root at the right, and you would not have to move the max element anymore, as it is already in its final position.

But after that operation we would need to reduce the heap in size from 𝑛 to 𝑛−1. This is easy when we need to give up the index where the last leaf is positioned, but not at all when the index to give up is where the heap's root is located. In your suggested scenario, the value at index 𝑛−1 becomes final, and it is that index that should be excluded from the heap. Once you remove that index from the heap's scope, you no longer have a heap. Instead you have two heaps, each with its own root: one root at index 𝑛−2 and another at index 𝑛−3. Merging the second heap into the first is an O(𝑛) operation, which would make the whole sorting algorithm inefficient.
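To make the "no longer a heap" point concrete, here is a minimal Python sketch. The helper name `is_mirrored_max_heap` and the sample array are made up for illustration. It assumes a mirrored layout in which the root sits at index `size - 1` and the children of logical node `i` are logical nodes `2*i + 1` and `2*i + 2`, counted from the right.

```python
def is_mirrored_max_heap(a, size):
    """Check the max-heap property for a heap whose root is at index size - 1.

    Logical node i lives at physical index size - 1 - i (an assumed layout).
    """
    def value(i):
        return a[size - 1 - i]

    # Every non-root node must not exceed its parent.
    return all(value((i - 1) // 2) >= value(i) for i in range(1, size))


# A valid mirrored max-heap: the root (7) is at the right end.
a = [4, 5, 2, 1, 6, 3, 7]

print(is_mirrored_max_heap(a, 7))  # True
# "Give up" index 6 and treat a[0..5] as a heap rooted at index 5:
print(is_mirrored_max_heap(a, 6))  # False: new root 3 is smaller than 6
```

Dropping the root index shifts every logical index by one, so the parent/child pairing changes for all nodes and the heap property is generally lost.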

The standard heap sort algorithm does not have this problem, as there the heap's root is always at index 0. The index to give up is the one where the last leaf is located. That index is easy to exclude from the heap: the shortened heap remains a valid heap. And so the operation to move the max value from index 0 to its final position, and have a leaf value sift down into the shorter heap, is a more efficient operation: it is O(log 𝑛), where 𝑛 is the current size of the heap.
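For reference, the standard scheme described above could be sketched in Python roughly as follows. This is one common textbook formulation, not code from the question, and the names `sift_down` and `heapsort` are chosen here for illustration.

```python
def sift_down(a, root, size):
    """Restore the max-heap property for the subtree at `root` within a[:size]."""
    while True:
        child = 2 * root + 1              # left child
        if child >= size:
            return                        # root is a leaf
        if child + 1 < size and a[child + 1] > a[child]:
            child += 1                    # right child is larger
        if a[root] >= a[child]:
            return                        # heap property already holds
        a[root], a[child] = a[child], a[root]
        root = child


def heapsort(a):
    """Sort the list in place in ascending order and return it."""
    n = len(a)
    # Build a max-heap with its root at index 0.
    for i in range(n // 2 - 1, -1, -1):
        sift_down(a, i, n)
    # Repeatedly swap the root with the last leaf and shrink the heap by one.
    for end in range(n - 1, 0, -1):
        a[0], a[end] = a[end], a[0]
        sift_down(a, 0, end)              # the heap is now a[:end]
    return a


print(heapsort([5, 3, 8, 1, 9, 2]))  # [1, 2, 3, 5, 8, 9]
```

The index given up in each round is `end`, the position of the last leaf, so `a[:end]` is still a complete tree rooted at index 0 and only one sift-down is needed.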

Thomas Koelle · Accepted Answer · 2025-01-16 12:45:18Z

Heapsort is mostly an umbrella term. There are both min-heaps and max-heaps, and whichever one you use, calling it repeatedly will behave as you describe.

Fff Jan 16 at 12:55

I didn't mean min-heap versus max-heap, but rather: why not build the heap so that its biggest item is already in the right position of the sorted list once the heap is finished? So you don't build the heap as in the gif, but mirrored from right to left, still as a max-heap.

Thomas Koelle Jan 16 at 13:36

Maybe forget the term heapsort and focus on the heap: en.wikipedia.org/wiki/Heap_(data_structure)

Jim Mischel · Accepted Answer · 2025-01-16 17:19:25Z

First of all, remember that building a heap doesn't necessarily (in fact almost never) sort the list. It puts the items in an order that makes it very efficient to repeatedly locate and remove the smallest (largest) item.

Sure, you can build the heap backwards. That is, consider a min heap with the numbers 1 through 7:

         1
      5     2
     7 6   4 3

That's stored as an array in memory as [1,5,2,7,6,4,3]. The root node (with value 1) is at index 0 in the array. And then the extraction process starts. It removes the '1', fixes up the heap, and the result is [2,5,3,7,6,4,1], and the heap size is reduced to 6.
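The extraction step above can be reproduced with a short Python sketch. The function name `extract_min_in_place` is invented for this illustration; it assumes the usual layout with the root at index 0 and children at `2*i + 1` and `2*i + 2`.

```python
def extract_min_in_place(heap, size):
    """Move the minimum to the end of heap[:size] and re-heapify the rest.

    Returns the new heap size (size - 1).
    """
    last = size - 1
    # Swap the root with the last leaf; the minimum is now parked at `last`.
    heap[0], heap[last] = heap[last], heap[0]
    i = 0
    while True:
        child = 2 * i + 1
        if child >= last:
            break                         # no children inside the shrunken heap
        if child + 1 < last and heap[child + 1] < heap[child]:
            child += 1                    # pick the smaller child
        if heap[i] <= heap[child]:
            break
        heap[i], heap[child] = heap[child], heap[i]
        i = child
    return last


heap = [1, 5, 2, 7, 6, 4, 3]
size = extract_min_in_place(heap, len(heap))
print(heap, size)  # [2, 5, 3, 7, 6, 4, 1] 6
```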

Provided you know how many items will be in the heap, it's easy enough to build the heap backwards. Change a few lines of code and you could have something that builds the heap backwards. That is, your memory representation of the heap would be [3,4,6,7,2,5,1]. And the tree view would look like this:

     7 6   4 3
      5     2
         1

But you have the same problem, except in reverse. You still take from the root and replace it with the lowest, right-most leaf (or, in this case, the highest, right-most one).
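As a rough sketch of what "the same problem in reverse" means, here is a mirrored heapsort in Python. The root stays fixed at the right end and the heap shrinks from the left, which is just the standard algorithm viewed in a mirror. The names `mirrored_heapsort` and `phys` are hypothetical, and the mapping from logical node `i` to physical index `n - 1 - i` is an assumption made for this example.

```python
def mirrored_heapsort(a):
    """Sort in place (ascending) using a min-heap whose root is at index n - 1."""
    n = len(a)

    def phys(i):
        # Logical heap index -> physical array index (root at the right end).
        return n - 1 - i

    def sift_down(i, size):
        while True:
            child = 2 * i + 1
            if child >= size:
                return
            if child + 1 < size and a[phys(child + 1)] < a[phys(child)]:
                child += 1
            if a[phys(i)] <= a[phys(child)]:
                return
            a[phys(i)], a[phys(child)] = a[phys(child)], a[phys(i)]
            i = child

    # Build the mirrored min-heap.
    for i in range(n // 2 - 1, -1, -1):
        sift_down(i, n)
    # The root never moves; the last leaf (leftmost heap slot) is given up.
    for size in range(n, 1, -1):
        leaf = phys(size - 1)
        a[n - 1], a[leaf] = a[leaf], a[n - 1]
        sift_down(0, size - 1)
    return a


print(mirrored_heapsort([3, 4, 6, 7, 2, 5, 1]))  # [1, 2, 3, 4, 5, 6, 7]
```

The extracted root still has to be swapped across the array to the slot vacated by the last leaf, so mirroring saves no moves compared with the usual layout.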

The limiting factor on Heapsort is the log(n) operations it has to make to fix up the heap after each removal. But it only works if the root remains in a constant place. If you "give up" (to use trincot's term) the root index, then you have to rebuild the entire heap: an O(n) operation.
