CISC 1115
Introduction to Programming Using Java
Lecture #15
Sorting & Searching
Overview
Searching
To search an array for a particular value
- Loop through the array comparing each element with the value
- If you find a match return either
true
, or the index
- If you drop through the loop, return some indication of failure --
false
or -1 (why -1?)
- A classical error is to return failure after not matching the first element only
The above is known as linear search (because it moves in a straight line through the
array performing the search).
Sorting
Sorting involves rearranging the elements of the array so that the elements are either in
ascending or descending order. That is:
- An array is sorted in ascending order if every element is less than (or equal to) the
element to the right
- An array is sorted in descending order if every element is greater than (or equal to) the
element to the right
Note that the last element has no element to the right to compare against (similarly element 0
has no element to its left to be compared to).
Selection Sort
The algorithm (i.e., the approach) used by selection sort corresponds closely to the technique you might use if
you were asked to sort a deck of cards, a pile of folders, a list of names:
- Find the smallest value, set it aside, and make it the first
- Find the smallest value of the remaining items and make it the second
- Repeat until you've gone through the whole set of items
Selection sort basically does the same thing, but we take into consideration that we're working with an array:
- We don't set the smallest value to the side, we rather move it to the beginning of the array
- We have to do something with the element that's already there-- we'll swap them
- Each successive 'smallest' element will go into the second, third, etc elements of the array
We can thus think of the array as being divided into two parts:
- The left side which is sorted and growing
- The right side which is unsorted and shrinking
What we will do is ave a pair of nested loops:
- The outer loop will determine which 'smallest' element we're selecting
- The first time through it will be the actual smallest element, the second time through the next-to-smallest,
etc.
- Each time through this outer loop, the sorted portion gets one bigger and the unsorted portion one element smaller.
- The inner loop will find the smallest element remaining in the unsorted portion
- Start at the left end of the sorted portion and move through the array
- Whenever we find an element smaller than the leftmost element, swap them
- When we have reached the end of the array, the leftmost element contains the smallest (remaining) value
Initially:
+----------------------------------------------------------------------------+
| |
| unsorted |
| |
+----------------------------------------------------------------------------+
After one iteration through the outer loop:
+----------------------------------------------------------------------------+
| | |
|smallest| unsorted |
| | |
+----------------------------------------------------------------------------+
Halfway through the outer loop:
+----------------------------------------------------------------------------+
| | |
| sorted | unsorted |
| | |
+----------------------------------------------------------------------------+
At end:
+----------------------------------------------------------------------------+
| |
| sorted |
| |
+----------------------------------------------------------------------------+
The Selection Sort Algorithm (SwappySelectionSort)
A very high-level statement of the approach:
Go through the array from beginning to end
Find the smallest element in the array from the current position to the end,
and place it in that position
In a bit more detail:
Go through the array from beginning to end
Go through the array from the current position to the end
If an element is less than the value in the current position swap them
Let's start getting down to code-level
for (currentPosition = 0; currentPosition < size; currentPosition++)
for (j = currentPosition+1; j < size; j++)
if (arr[j] < arr[currentPosition]
swap(arr[j], arr[currentPosition])
Here it is turned into a method:
void selectionSort(int [] arr, int size) {
for (currentPosition = 0; currentPosition < size; currentPosition++)
for (j = currentPosition+1; j < size; j++)
if (arr[j] < arr[currentPosition]
swap(arr, j, currentPosition)
}
An Optimization of Selection Sort (SelectionSort)
Rather than constantly swapping each time we find a new minimum value, we remember the position
of the smallest value encountered and it's only once we've gone completely through the array in the
inner loop that we swap the smallest value into the current position.
void selectionSort(int [] arr, int size) {
for (currentPosition = 0; currentPosition < size; currentPosition++)
int pos = currentPosition;
for (j = currentPosition+1; j < size; j++)
if (arr[j] < arr[currentPosition]
pos = j;
swap(arr, pos, currentPosition)
}
One More Optimization (LazySelectionSort)
Each time we've gone completely through the inner loop, before we swap the smallest element to the current position, we make sure
the smallest value is not already at the current position.
void selectionSort(int [] arr, int size) {
for (currentPosition = 0; currentPosition < size; currentPosition++)
int smallestPosition = currentPosition;
for (j = currentPosition+1; j < size; j++)
if (arr[j] < arr[currentPosition]
smallestPosition = j;
if (smallestPosition != currentPosition) swap(arr, pos, currentPosition)
}
and here it is with the inner loop (a single pass through the array) turned into a method:
void selectionSort(int [] arr, int size) {
for (currentPosition = 0; currentPosition < size; currentPosition++)
int smallestPosition = findSmallestPosition(arr, size);
if (smallestPosition != currentPosition) swap(arr, pos, currentPosition)
}
int findSmallestPosition(int [] arr, int size) {
int smallestPosition = currentPosition;
for (j = currentPosition+1; j < size; j++)
if (arr[j] < arr[currentPosition]
smallestPosition = j;
return smallestPosition;
}
A Second Sort - Bubble Sort
In bubble sort:
- We make repeated passes through the array
- During each pass, we compare adjacent elements.
- If two adjacent elements are out of order, we swap them.
- When we perform an entire pass without a single swap, we're done
Some Questions to Answer
- Why does this work?
- How do we know it will eventually end?
- Why are we bothering with a second sort?
Why does this work?
- Each time we go through the array, we're making it a bit more sorted
- Out-of-order adjacent elements are being put into proper order
- Furthermore, the largest elements are bubbling towards the right to their proper positions
How do we know this will eventually end?
- Each pass through the array has the next largest element settling into its proper position as a result
of the bubbling
- After n-1 passes every element has been deposited in its proper position
The Bubble Sort Algorithm(SmartBubbleSort)
A very high-level statement of the approach:
Pass through the array until a pass is completed with no swaps performed
Compare adjacent elements and swap if they're out of order
In a bit more detail:
Go through the array from the beginning to end
If an element at position i is less than the element at position i+1 swap them
If a swap was performed repeat
Let's start getting down to code-level
do {
bool swapped = false;
for (i = 0; i < n-1; i++)
if (arr[i] > arr[i+1] {
swap(arr, i, i+1)
swapped = true;
}
} while (swapped)
Notice the pattern of how swapped is used:
- It's reminiscent of finding the max--
- Assume first is max and then check rest of the array and correct as needed
- Same thing here
- Assume swapped will be false and then go through array and correct accordingly
Here is a very simple version of bubble sort for you to remember:
for (int i = 0; i < n; i++)
for (int j = 0; j < n-1; j++)
if (arr[j] > arr[j+1])
swap(arr, j, j+1);
- Each time through the outer loop, the next largest element 'bubbles' it to the right until it settles into its proper place
- Any smaller element that is swapped with a larger predecessor moves only one position to the left during a pass through the array
- If the smallest element is at the last position, it will take n-1 pass for it to be swapped to the first position (where it belongs)
- That means at the very worst, we need no more than n passes through the array
- The inner loop represents a single pass through; since we are comparing each element with its neighbor to the right; and
the last element has no neighbor to the right, we stop at the next to last element (
n-2
) (not doing so would
result in an ArrayIndexOutOfBoundsException
)
Here is a minor variation that's a bit more efficient:
for (int last = n; last > 0; last--)
for (int i = 0; i < last; i++)
if (arr[i] > arr[i+1])
swap(arr, i, i+1);
- The outer loop is still controlling the passes through the array, but it starts at the end this time
- Since the largest element bubbles to the last position on the first pass, there is no reason to
compare against it in the next (or subsequent passes)
- We thus use the index of the outer loop to represent the last position of the unsorted portion
of the array to be compared against; all positions beyond last are already sorted
And here is this version with the swapped optimizationi (i.e. checking for no swaps in a single pass):
boolean swapped = true;
for (int last = n; last > 0 && swapped; last--) {
swapped = false;
for (int i = 0; i < last; i++)
if (arr[i] > arr[i+1]) {
swap(arr, i, i+1);
swapped = true;
}
}
- The boolean variable
swapped
will be set in the inner loop (the loop performing the comparison pass)
if any swap is performed
- The outer loop (the loop controlling the number of passes) adds a check if any swap has been done in the previous pass of the
inner loop. If none have been done, the array is already sorted, and the sort terminates
- The outer loop is forced into a first pass by initializing
swapped
to true
- We have to go through the loop at least once even if it';s already sorted to verify that fact
- This 'forced' first pass is reflected in our original version by the use of a
do/while
Why do we have to bother with two sorts
- Examine bubble sort-- what if the array came in sorted to begin with? What about selection sort?
- Our selection sort is actually a bit simple-minded in constantly swapping-- there's a more
'swap-efficient' version that does significantly less movement than the version we presented--
and bubble sort for that matter.
Program 24.1
Write a program that initializes an array of integers prints them, sorts them using bubble sort, and prints the sorted array.
Binary Search — Searching a Sorted Array
We've already seen a linear search of an array, in which we begin at the first element and compare successive elements
for equality with the value we are looking for. If the array is sorted, we can do quite better than searching the entire array:
A High-level Version of the Approach
- Compare the value with the middle (sub)array element
- If they're equal, you're done!
- Otherwise, if the value being searched for is less then the element in the middle, we want to search the
subarray consisting of the left half of the array
- Otherwise, we should search the subarray consisting of the right half of the array
A More Detailed Presentation
As we will be looking at various subarrays (left side right side, etc), we initially view the entire array as a subarray with
bounds 0 … arr.length-1 (inclusive). We will call these bounds lo
and hi
respectively.
- if
lo
and hi
have crossed, i.e., lo is greater than hi, i.e., the subarray is empty, report failure
- Calculate the location of the middle element of the subarray
lo
… hi
; call it mid
- Compare the middle element with the value to be searched
- if equal, return the position
- if the value is less than the middle element, repeat step 1, using the subarray
lo
… mid-1
- otherwise repeat step 1, using the subarray
mid+1
… hi
Here is the actual Java code for binary search:
public static int find(int [] arr, int val) {
int lo = 0, hi = arr.length-1;
while (lo <= hi) {
int mid = (hi+lo)/2;
if (val == arr[mid])
return mid;
else if (val < arr[mid])
hi = mid - 1;
else
lo = mid + 1;
}
return -1;
}