Algorithms and Data Structures 2020

Week 2. Data Structures

Algorithms and Data Structures

3 Week 2. Data Structures

Reading 2 Goodrich, Tamassia, & Goldwasser: Chapter (2), 3

3.1 Key Concepts

   3.1.1 LinkedList and ArrayList
   3.1.2 ADT (Abstract Data Type)
   3.1.3 The Array ADT
   3.1.4 The List ADT
   3.1.5 Algorithm Theory
   3.1.6 Copies

3.1.1 LinkedList and ArrayList

In Java, you are probably familiar with the following Data Structures or Data Types:

Array
ArrayList
LinkedList

What is the difference between them?

Array is a primitive data type. Let’s leave that aside for now.

Both ArrayList and LinkedList are classes inheriting the abstract class List. Therefore they share a number of methods.

add(index,element)
get(index)
remove(index)

There are others, but these three are particularly interesting from an algorithmic point of view. We shall discuss their complexity.

3.1.2 ADT (Abstract Data Type)

Definition 3.1 (Abstract Data Type) An Abstract Data Type (ADT) is a set of operations.

In Object Oriented Programming the set of operations will normally be a set of methods. Java can codify an ADT as an interface, and to implement the ADT a class must implement the interface. However, the ADT define the semantics (behaviour) of the methods, while the interface only defines the syntax (call signature). Thus an implementation of the interface is not necessarily an implementation of the ADT.

The ADT itself, is the mathematical model describing the interface and its semantics. An API, in contrast, is the interface of the implementation.

1.

Static and Dynamic Definitions

Java Generics
Type parameters

3.1.3 The Array ADT

The Array ADT would normally provides methods to

get the element at a given position
set (change) the element at a given position

The Java ArrayList class in Java provides many more methods. Most importantly, you can change the size of the ArrayList. Adding or removing an element at the end of the ArrayList is an important functional extension. When Arrays are used in the theoretical literature, the size is usually assumed to be fixed.

THe other methods of ArrayList are convenience methods which can be implemented using the methods above.

3.1.4 The List ADT

The List ADT is stateful. It always has a cursor pointing to the current position.

add an element at the current position
remove an element at the current position
next get the next element in the list

The ListIterator Notably, the list classes in Java does not implement an interface representing the List ADT. Instead, the List interface has the listIterator() which returns an object implementing the ListIterator interface, which represents the List ADT.

This makes it possible to access the same List concurrently in different subprograms, because each iterator maintains its own cursor.

In Java:

AbstractList (Abstract Class)
LinkedList and ArrayList (Concrete Classes)
List (interface) $\sim$ Array (ADT)
ListIterator (Java) $\sim$ List (ADT)
LinkedList also implements the Deque ADT

3.1.5 Algorithm Theory

The interesting questions in algorithm theory is how we implement the ADTs so that they are correct and efficient, and exactly how efficient each operation is in terms of complexity.

3.1.6 Copies

1.: Equality
2.: Cloning - shallow and deep copies

3.2 Projects and Exercises

3.2.1 Compulsory Assignment
3.2.2 Exercises

3.2.1 Compulsory Assignment

Problem 3.1 (Experimental Running Time) Compare the running time of LinkedList and ArrayList in Java.

Use the file ArrayListDemo.java as a basis. Compile and run the program, and look at the output as well as the source code. What does this test tell us about the performance of ArrayList?

It is possible that the program runs out of memory. In that case you may have to increase the heap size with an option to JVM.

1.: Modify the program to use LinkedList instead of ArrayList. Compile and run the new version. What does it tell us about the performance of LinkedList? What are the advantages and disadvantages of LinkedList compared to ArrayList?
2.: Modify the programs to initialise a smaller array, maybe about 100 elements, and test both ArrayList and LinkedList. How do the two implementations compare when the list is small?
3.: Try with some different list sizes to see how the run time developes. Which operations run in constant time? Linear time? Slower than linear?
4.: Review the theoretical properties of LinkedList and ArrayList, and compare your empirical output to the theory. Are the test results reasonable?
5.: When would you prefer to use ArrayList and when would you prefer LinkedList? Try to find example applications where you would recommend one or the other.

Problem 3.2 (Optional) In the above tests, we measured in milliseconds, and mostly only a single operation at a time. This does not always allow us an exact comparison.

There are two things you can do:

1.: Use the java.lang.System.nanoTime() function instead.
2.: Make several, similar operations in sequence, e.g. for instance 100 get() calls on adjacent indices.

Try to modify the test programs, and see if you can learn a bit more about the cases where the first test showed zero milliseconds.

(Remember that we rarely care much about the time of a single operations, but when the same operation is made thousands or millions of times, even a difference of less than a millisecond may matter.)

Problem 3.3 (Based on Goorich et al C-3.20) Suppose you are making a multiplayer game with $n \geq 1000$ players, numbered from 1 to $n$ , interacting in an enchanted forest. The winner is the player who first meets every other player. The game is constructed so that a function, meet $(i, j)$ is called every time player $i$ meets player $j$ ,

Design the algorithm to be implemented in meet, so that it records who has met whom, and reliably detects when someone has won (i.e. has met every other player).

You should also make an init() algorithm to initialise the data structure used by meet $(i, j)$ .

1.: Can you demonstrate that the algorithm is correct?
2.: What is the time complexity?
3.: Is it possible to make it faster?

Hint. You may have to record both whether or not $i$ and $J$ have met, and how many players $i$ has met.

3.2.2 Exercises

Problem 3.4 (Hall of Fame) Consider a Computer Game where you are going to implement a Hall of Fame, i.e. a list of, say, the 100 best scores ever achieved. What data structure would you choose? What benefits and disadvantages do you consider to make that choice?

Describe the data structure (do not implement it) with all the operations required in the application. For each operation, consider the following:

1.: How fast/slow is the operation?
2.: When and how often do you expect the operation to be used?

Would your choices change if the number of scores to be stored changes?

Problem 3.5 (Based on Goorich et al C-3.19)

Design an algorithm, shuffle, which takes a list $A$ with $n$ elements and reorders it such that every permutation is equally likely. Assume that you have a function random( $n$ ) to return a random integer in the range $0, 1, \dots, n - 1$ .

1.: What is the running time (complexity) of your algorithm?
2.: Is it possible to solve the problem in linear time?
3.: Is it best to use ArrayList or LinkedList as input and output? Why?

Problem 3.6 Goodrich et al C-3.22–23. Comment on the run time complexity for your solutions.

Problem 3.7 Goodrich et al C-3.24. Comment on the run time complexity for your solutions.

Problem 3.8 Goodrich et al C-3.25.

Comment on the run time complexity for your solution.