Course: Programming II - Abstract Data Types. Introduction. Abstract Data Types

Author: Rosemary McGee

Introduction
• Course: Programming II, Abstract Data Types
• Lecturer: Alessandra Russo
  - email: [email protected]
  - office hours: available in my office (room 560) between 1:30-3:30pm on Wednesday
• Duration: 10 lectures and 5 tutorials

Introduction

Slide Number 1

This is the second half of the course “Programming II”. It will follow the standard course arrangement of 2 lectures and 1 tutorial per week. Unassessed exercise sheets will be handed out at each weekly tutorial. There will be an assessed coursework on this second part of the course, so I strongly recommend that you attend the tutorials and ask questions. The exam will cover everything that has been taught during the lectures, the tutorials and the labs. Hence, for a successful completion of this course, I strongly recommend that you:
a) attend BOTH lectures and tutorials;
b) read the lecture notes and the recommended reading;
c) ensure that late revision really is revision.


Aims
• To help you gain an understanding of, and ability to use, Abstract Data Types, in particular: Lists, Stacks, Queues, Trees, Heaps, Hash tables, Graphs.
• To extend your implementation skills by learning how to design and implement Abstract Data Types, and how to use them within Java program solutions to real problems.

The course is linked on its theoretical side (via notions of axioms and specifications) with the theory courses, and on its practical side with your laboratory course, which includes three exercises on abstract data types.

Slide Number 2

This part of the course is about Abstract Data Types (ADTs) and their role in the development of Java program solutions to real application problems. In particular, we will look at the most common ADTs, which are lists, stacks, queues, trees, heaps, hash tables, and graphs. The aims are:
• To help you gain a good understanding of, and ability to use, abstract data types, and to familiarise you with the ADTs listed above and their associated operations.
• To help you extend your programming skills by learning how to design and implement different data structure solutions for each of the above ADTs, and how to evaluate the practical benefits and limitations of different implementation choices.

Learning outcomes: at the end of this part of the course, you will
• Understand the basic principles, main features and operations of the above types of ADTs.
• Know the fundamental algorithms associated with these ADTs, including tree traversal and heapsort.
• Be able to develop Java implementations of ADTs using different approaches, and evaluate their differences.
• Be able to use ADTs and related implementations in designing and implementing efficient solutions to straightforward application problems.


Reading Material
• Books recommended are:
  - “Java Software Solutions: Foundations of Program Design”, J. Lewis and W. Loftus, Addison Wesley, 2000.
  - “Data Structures and Other Objects using Java”, Michael Main, Addison Wesley, 1999.
  - “Data Structures: an object-oriented approach”, D. Barnard, R. Holt and J. Hume, Holt Software Associates, 1995.
• Slides and notes, complemented with information given during lectures and tutorials.

Slide Number 3

The first book is the main textbook recommended for the whole course of Programming II. It does not cover ADTs in detail, but its last chapter is a brief summary of some of the main ADTs that we are going to see during these lectures. The second book, by contrast, provides a comprehensive presentation of ADTs, covering both specification and implementation issues. This is the main textbook for this part of the course. The third book is a basic textbook on ADTs within the object-oriented paradigm. It doesn’t refer to Java implementations, but provides simple descriptions of the basic properties and features of ADTs. The slides will be available on the Web at http://www.doc.ic.ac.uk/~ar3


Overview
• What is an Abstract Data Type (ADT)
• Introduce individual ADTs: Lists, Stacks, Queues, Trees, Heaps, Hash tables, Graphs
  - Understand the data type abstractly
  - Define the specification of the data type
  - Use the data type in small applications, based solely on its specification
  - Implement the data type
    - Static approach
    - Dynamic approach
• Some fundamental algorithms for some ADTs

Slide Number 4

This is a brief overview of the main topics that we are going to cover. We will begin with a general introduction to abstraction, and a definition of the main terminology used during these lectures, e.g. data structure, data type, and abstract data type. We will then look into each of the main abstract data types listed above. For each of these data types, we will first see what the data type means in abstract terms, then we will define its specification and see some examples (through your tutorial sheets) of how the data type can be used within Java application programs. This includes, for instance, how a Java application program can access the data type and how it can manipulate it. Then we will see how the abstract data type can be implemented in Java, so as to meet its specification. In particular, we will look at two different approaches to data type implementation: the static approach, which uses only arrays, and the dynamic approach, which deploys more elaborate data structures. Finally, for some ADTs like trees, heaps and hash tables, we will look at some specific algorithms for handling them.


What is Abstraction?

Definition: “To abstract” is to deduct, remove, consider apart from the concrete… [from a dictionary]

With Abstraction: “Certain properties and characteristics of the real objects are ignored, as they are peripheral and irrelevant to the particular problem” [N.Wirth, Algorithms and Data Structure, 1976]

Abstraction is at the heart of all problem solving. It can be described as ignoring unnecessary details and thus simplifying the task under consideration.

Slide Number 5

Before defining what an abstract data type is, it’s useful to understand what abstraction is, within the context of problem solving and program design. Looking at a dictionary definition of the word “abstraction”, you’ll find that to abstract means to draw away, to separate, to remove, to consider apart from the concrete. Niklaus Wirth describes the process of abstraction by saying that “certain properties and characteristics of the real objects are ignored, as they are peripheral and irrelevant to the particular problem”. We can therefore say that, in solving a particular task, abstraction means separating, removing or ignoring unnecessary details of real objects in order to simplify the task under consideration. A simple example is given by the following hierarchy of abstraction in a computer system, where each level is composed of entities from the level below:

Increasingly abstract
  ^   A program
  |   Modules
  |   Classes
  |   Type declarations, methods
  v   Machine code
Increasingly concrete


Two Processes of Abstraction: (1) Procedural Abstraction

Stepwise Refinement (Wirth, 1971): stepwise decomposition of a task into a number of separate, less complicated subtasks.

Modularity: a design technique that provides a modular solution to a task, where the complexity of the program is kept manageable by controlling the interaction of its modules.

Procedural Abstraction is the process of separating the purpose of a module from its implementation, by specifying what the module does, not how it does it.

Slide Number 6

Stepwise refinement in solving a given task consists of decomposing the task into smaller, simpler and independent subtasks. The modular approach to software design facilitates such a process by enabling the design of a problem solution in terms of modules and the interaction between modules. Procedural abstraction complements modularity by separating the purpose of a module from its implementation. Whereas modularity helps break a solution into modules, abstraction helps specify each module clearly before implementing it in a programming language. Each module can be seen as a black box that states what it does but not how it does it. No module knows how any other module performs its tasks.

In practice, abstraction means specifying what a module assumes and what action the module takes. When you use existing software components, e.g. Java libraries, you need to know what the component does and what its assumptions are; when you write a new method you need to decide what the method should do and what assumptions must hold for it to work. An example is the definition of pre- and post-conditions of methods that you have seen in the “Reasoned Programming” course.

Abstraction, therefore, facilitates separate, independent and isolated development of modules, as if each module were surrounded by a “wall”. However, at the same time you also need to know how the modules interact with each other; for instance, how a module can make use of another module. The surrounding wall therefore has to have a slit, which is not large enough to allow the outside world to see the method’s inner workings, but which allows things to pass through into and out of the method. An example of this slit is the passing of parameters to a method and the result value being returned by the method.
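To make the “wall with a slit” picture concrete, here is a minimal Java sketch. The class `MathUtil` and the method `max` are hypothetical names chosen for illustration; they are not taken from the course material. The comment states what the method assumes and what it guarantees, and the parameter and return value are the only things that pass through the slit.

```java
// Hypothetical example: MathUtil and max are illustrative names only.
final class MathUtil {
    private MathUtil() {}

    /**
     * Returns the largest value in the array.
     * Pre-condition:  values is not null and contains at least one element.
     * Post-condition: the result is an element of values, and no element
     *                 of values is larger than the result.
     */
    static int max(int[] values) {
        // Reject inputs that violate the pre-condition.
        if (values == null || values.length == 0) {
            throw new IllegalArgumentException("pre-condition violated: no elements");
        }
        // The "how" below is hidden behind the wall; callers rely only
        // on the pre- and post-conditions stated above.
        int best = values[0];
        for (int i = 1; i < values.length; i++) {
            if (values[i] > best) {
                best = values[i];
            }
        }
        return best;
    }
}
```

A caller would write something like `int m = MathUtil.max(new int[] {3, 9, 4});` without ever reading the loop inside.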


An example

Part of a program solution is sorting some data; one of the modules can be a sorting algorithm. The other modules know that the sorting module sorts, but they don’t know how it sorts.

[Diagram: aModule passes unorganised data to sort, saying “Sort this data for me; I don’t care how you do it.” The sort module, which says “I can sort data into ascending order”, returns the data sorted into ascending order.]

1. Design and implementation of aModule focuses just on its functionality.
2. Different sorting algorithms can be used, without affecting the rest of the solution.

Slide Number 7

Suppose, for instance, that a program needs to operate on sorted data. One of its modules will then be concerned with sorting some data (the box labelled “sort”) and other modules will use it to perform their own tasks (e.g., the box “aModule” might need to search a given collection of unsorted data). Although aModule knows that the module “sort” sorts data, it does not need to know how this sorting task is accomplished. However, aModule should know that it can pass an unsorted collection of data to sort and that the same collection of data will be returned by sort as a sorted collection of data.

[Diagram: aModule sends unsorted data to Sort; Sort returns sorted data to aModule.]

Once the module Sort is written, it can be used without knowing the details of its algorithm as long as there is a statement of its purpose and a description of its parameters.
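The relationship between aModule and Sort can be sketched in Java as follows. All names here (`Sorter`, `InsertionSorter`, `LibrarySorter`, `AModule`) are hypothetical, and returning a sorted copy rather than sorting in place is an assumption made to keep the example short. The point is only that `AModule` depends on the stated purpose of `sort`, so either sorting class can be plugged in.

```java
import java.util.Arrays;

// Hypothetical statement of purpose for the Sort module.
interface Sorter {
    /** Returns a new array holding the same values in ascending order. */
    int[] sort(int[] data);
}

// One possible "how": a hand-written insertion sort.
final class InsertionSorter implements Sorter {
    public int[] sort(int[] data) {
        int[] a = data.clone();
        for (int i = 1; i < a.length; i++) {
            int key = a[i];
            int j = i - 1;
            // Shift larger elements one place to the right.
            while (j >= 0 && a[j] > key) {
                a[j + 1] = a[j];
                j--;
            }
            a[j + 1] = key;
        }
        return a;
    }
}

// Another possible "how": delegate to the Java library.
final class LibrarySorter implements Sorter {
    public int[] sort(int[] data) {
        int[] a = data.clone();
        Arrays.sort(a);
        return a;
    }
}

// aModule knows only that a Sorter sorts, not how it does so.
final class AModule {
    private final Sorter sorter;

    AModule(Sorter sorter) {
        this.sorter = sorter;
    }

    /** Sorts the data, then searches the sorted result for target. */
    boolean contains(int[] unsorted, int target) {
        int[] sorted = sorter.sort(unsorted);
        return Arrays.binarySearch(sorted, target) >= 0;
    }
}
```

Replacing `new AModule(new InsertionSorter())` with `new AModule(new LibrarySorter())` changes the algorithm without touching `AModule` itself.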


Two Processes of Abstraction: (2) Data Abstraction

Data abstraction: to think in terms of what operations can be performed on the data, independently of how they are implemented.

With data abstraction:
• The underlying representation of the data is immaterial.
• It is easier to manipulate data in abstract form, so that the program reflects the way we think about the data, not the way the computer stores or manipulates them.

Data Abstraction is the process of defining a collection of data and a set of operations on that data, by specifying what operations can be done on the data, not how they are done.

Slide Number 8

Often the solution to a problem requires operations on data. Such operations can include, for instance, “Add” data to a data collection, “Remove” data from a data collection, and “Ask questions” about the data in a data collection. The details of the operation vary from application to application. Data abstraction asks that you think in terms of what you can do to a collection of data independently of how you do it. Data abstraction allows you, then, to think abstractly about data – that is, to focus on what operations you will perform on the data instead of how you will implement them. Using data abstraction, modules of a program will “know” what operations can be performed on the data, without knowledge of how the data is stored on the computer, or of how the operations are implemented. In this way, we can think about the data types in ways that are easy to understand, without considering the details of the implementation. In the next slide I give you an example of using the data type “date” in a program.


An Example: Use of “dates” in a program
• In abstract form we think of dates as “Day Month Year”.
• We identify a limited number of operations that make sense when applied to a date:
  - the next day after a date
  - the previous day before a date
  - comparison for equality of two dates.

How might such dates be stored in a computer?
1. Julian form: as the number of days since 1 January 1995
2. With three fields: year, month, day

e.g. 2 January 1996
  0366       (Julian form)
  96 01 02   (Three field)

Slide Number 9

This example considers the use of “dates” in a program. In abstract form, we think of a date as given by a Day, a Month, and a Year. In defining this data type, we could simply think of this common form of a date and identify a number of operations that make sense when applied to a date: for instance, the day after a given date, the day before a given date, the date a specified period before or after a given date, whether two dates are the same, etc. Data abstraction enables us to consider the underlying representation of the dates as immaterial, so that our program will reflect the way we think about dates, not the way the computer stores them. In terms of the representation of dates, a computer could store dates in various forms: for instance, using the Julian form, i.e. as the number of elapsed days since a known start date (e.g. since 1 January 1995), or using three fields, called respectively Year, Month, Day. So, for example, for the date “2 January 1996”, we would have the Julian form representation “0366”, or the three-field representation 96 01 02, each with its respective binary-coded decimal machine representation.
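The date example can be sketched in Java as one specification with two representations. The names `SimpleDate`, `DayCountDate` and `ThreeFieldDate` are hypothetical, and the use of `java.time.LocalDate` for the calendar arithmetic is a convenience assumed here rather than anything prescribed by the slides. A program written against `SimpleDate` cannot tell which representation it has been given.

```java
import java.time.LocalDate;

// Hypothetical specification of the "date" data type.
interface SimpleDate {
    int day();
    int month();
    int year();
    SimpleDate next();      // the day after this date
    SimpleDate previous();  // the day before this date

    /** Equality is defined on the abstract Day/Month/Year view. */
    default boolean sameAs(SimpleDate other) {
        return day() == other.day()
            && month() == other.month()
            && year() == other.year();
    }
}

// Representation 1: elapsed days since 1 January 1995 ("Julian form").
final class DayCountDate implements SimpleDate {
    private static final LocalDate START = LocalDate.of(1995, 1, 1);
    private final long elapsedDays;

    DayCountDate(long elapsedDays) {
        this.elapsedDays = elapsedDays;
    }

    private LocalDate calendar() {
        return START.plusDays(elapsedDays);
    }

    public int day() { return calendar().getDayOfMonth(); }
    public int month() { return calendar().getMonthValue(); }
    public int year() { return calendar().getYear(); }
    public SimpleDate next() { return new DayCountDate(elapsedDays + 1); }
    public SimpleDate previous() { return new DayCountDate(elapsedDays - 1); }
}

// Representation 2: three separate fields.
final class ThreeFieldDate implements SimpleDate {
    private final int year;
    private final int month;
    private final int day;

    ThreeFieldDate(int year, int month, int day) {
        LocalDate.of(year, month, day); // throws if the date is invalid
        this.year = year;
        this.month = month;
        this.day = day;
    }

    private static ThreeFieldDate from(LocalDate d) {
        return new ThreeFieldDate(d.getYear(), d.getMonthValue(), d.getDayOfMonth());
    }

    public int day() { return day; }
    public int month() { return month; }
    public int year() { return year; }
    public SimpleDate next() { return from(LocalDate.of(year, month, day).plusDays(1)); }
    public SimpleDate previous() { return from(LocalDate.of(year, month, day).minusDays(1)); }
}
```

With these definitions, `new DayCountDate(366)` and `new ThreeFieldDate(1996, 1, 2)` both denote 2 January 1996, so `sameAs` returns true for the pair.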


Example of Levels of Data Abstraction

Level of Abstraction                 Example
Abstract Data Type                   List
Predefined structured Data Types     Array of Real
Predefined simple Data Types         Real
Machine Language Type                0110111011

Data abstraction using Abstract Data Types permits control of the interaction between a program and its data structures. It guards against:
• Inadvertently erroneous use of the data
• Deliberate mis-use of the data
• Modification of purpose or implementation of shared data

Slide Number 10

We can have different levels of data abstraction. The most abstract one is the Abstract Data Type, whereas the others given in the slide are progressively more concrete examples of data abstraction. These levels are related to each other: e.g., structured data types can be defined in terms of simple data types (e.g., an array can be a collection of real numbers), and abstract data types can themselves be defined in terms of pre-defined structured data types. Data abstraction is a method for controlling the interaction between a program and its pre-defined structured data types. It guards against inadvertent and/or deliberate mis-use of the (structured) data types used in the implementations. An Abstract Data Type (ADT) defines a Data Abstraction. Basic definitions are given in the next slide.


Definitions

An Abstract Data Type is a collection of data together with a set of data management operations, called Access Procedures, defined on these data. Definition and use of an ADT are independent of the implementation of the data and of their access procedures.

A Data Structure, or structured data type, is an organised collection of data elements, created using
- Predefined constructors (e.g., array, vectors)
- Pre-defined types (e.g., Boolean, real, integer)
- User-defined types (e.g., a “date” class)

Slide Number 11

An Abstract Data Type (ADT) is a collection of data together with a set of operations, called access procedures, defined on that data. The description of an ADT must be rigorous enough to specify completely the effects of the operations on the data, yet it must not specify how to store the data, nor how to carry out the operations.

When we implement an ADT we choose a particular data structure. A data structure is therefore a construct within a programming language to store data. For example, the arrays that are built into Java are data structures. You can also define other data structures. For instance, suppose that you want to implement a list of employees’ names and salaries. You could use two Java arrays to store, respectively, the name and the salary of each employee.

When a program has to perform data operations that are not supported by the programming language, you will have to define an abstract data type and carefully specify what the ADT operations do. Defining an ADT can be seen as building a wall (of operations) around the data structures of a program. The interaction of the program with its data structures is defined by the operations described in the ADT.

[Diagram: the Program sends a “Request operation” through the wall of ADT operations (Add, Remove, Find) to the Data Structure, and receives a “Result operation” back through the same wall.]
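The employee example and the wall of operations can be sketched together in Java. The class `EmployeeList`, its method names and the fixed capacity of 100 are all assumptions made for illustration. The two arrays are private, so the rest of the program can reach them only through the Add, Remove and Find operations.

```java
import java.util.NoSuchElementException;

// Hypothetical ADT: a list of employees' names and salaries.
final class EmployeeList {
    private static final int CAPACITY = 100; // assumed fixed capacity

    // The data structure behind the wall: two parallel arrays.
    private final String[] names = new String[CAPACITY];
    private final double[] salaries = new double[CAPACITY];
    private int size = 0;

    /** Add: appends an employee to the list. */
    void add(String name, double salary) {
        if (size == CAPACITY) {
            throw new IllegalStateException("list is full");
        }
        names[size] = name;
        salaries[size] = salary;
        size++;
    }

    /** Remove: deletes the first employee with this name, if any. */
    boolean remove(String name) {
        int i = indexOf(name);
        if (i < 0) {
            return false;
        }
        // Close the gap by shifting later entries one place left.
        for (int j = i; j < size - 1; j++) {
            names[j] = names[j + 1];
            salaries[j] = salaries[j + 1];
        }
        size--;
        names[size] = null;
        return true;
    }

    /** Find: reports whether an employee with this name is present. */
    boolean contains(String name) {
        return indexOf(name) >= 0;
    }

    /** Find: returns the salary of the named employee. */
    double salaryOf(String name) {
        int i = indexOf(name);
        if (i < 0) {
            throw new NoSuchElementException(name);
        }
        return salaries[i];
    }

    int size() {
        return size;
    }

    // Private helper: not part of the wall, invisible to the program.
    private int indexOf(String name) {
        for (int i = 0; i < size; i++) {
            if (names[i].equals(name)) {
                return i;
            }
        }
        return -1;
    }
}
```

Because the arrays never leak out, they could later be replaced by a different data structure without changing any code that uses `EmployeeList`.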


Why abstraction
1. Less detail to worry about
2. May use previous work (don’t re-invent the wheel)
3. It complements modular design

Slide Number 12

In both procedural abstraction and data abstraction, abstraction allows us to concentrate on the desired properties of the object that we want to build, whether it’s a module or a collection of data, and on what this object should do for us, rather than on how it should do it. Unnecessary details are therefore removed, thus reducing the complexity of the task under consideration.

Procedural abstraction helps clarify the design of a solution, because it enables us to focus on the high-level functionality of each individual module without the distraction of the implementation details. Moreover, it allows us to modify parts of the solution without significantly affecting the other parts, and so to reuse previous work (without having to re-invent the wheel). For instance, in the example given in Slide 7, the development of aModule doesn’t need to take into account the details of the sorting algorithm. At the same time, we are able to change the sorting algorithm, by just replacing the module Sort, without affecting the whole solution.

Data abstraction also allows us to concentrate on the properties of the data type, rather than on how it is implemented. Moreover, the separation of the definition of an ADT from its implementation gives us the opportunity to change the implementation of the access procedures, or to change the supporting data structure, or to include additional operations (by just adding additional methods to the same ADT definition) without affecting the rest of the program.


Our Approach to ADT

To define an Abstract Data Type we need to:
• Establish the abstract concept of the data type
  - Start by considering why we need it;
  - Define what properties we would like it to have (i.e. axioms);
  - Define the necessary access procedures.
• Consider possible implementations
  - Static Implementation (array-based)
  - Dynamic Implementation (reference-based)

Slide Number 13

For the remainder of this course we will look at particular types of ADTs. Our general methodology for defining Abstract Data Types will be composed of two main parts: (a) the definition of the concept of the particular ADT, and (b) its implementation. In (a) we will mainly consider why the particular ADT would be needed, and what its main properties and its main access procedures are. We will call the properties of each ADT “Axioms”. As for the implementation, we will consider two main types, called static and dynamic implementations, respectively. The word static means that the memory required by the data structure that supports the particular data type is fixed in advance, in principle at compilation time (in Java, the size of an array is fixed when the array is created), whereas the word dynamic means that the memory is allocated at run-time, as required by the underlying data structure. The data structures that we will consider for static implementations will mainly be arrays. These, in fact, cannot change their size or shape once they have been created. For the dynamic implementation we will use, instead, reference-based implementations.
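As a preview of the two implementation styles, here is a minimal sketch using a stack of integers. The interface `IntStack` and the classes `ArrayIntStack` and `LinkedIntStack` are hypothetical names, and the choice of a stack is only for brevity. The array-based version has a capacity fixed when it is created, while the reference-based version allocates one node per element as the program runs.

```java
// Hypothetical specification shared by both implementations.
interface IntStack {
    void push(int value);
    int pop();
    boolean isEmpty();
}

// Static (array-based) implementation: capacity is fixed on creation.
final class ArrayIntStack implements IntStack {
    private final int[] items;
    private int top = 0; // index of the next free slot

    ArrayIntStack(int capacity) {
        items = new int[capacity];
    }

    public void push(int value) {
        if (top == items.length) {
            throw new IllegalStateException("stack is full");
        }
        items[top++] = value;
    }

    public int pop() {
        if (top == 0) {
            throw new IllegalStateException("stack is empty");
        }
        return items[--top];
    }

    public boolean isEmpty() {
        return top == 0;
    }
}

// Dynamic (reference-based) implementation: grows one node at a time.
final class LinkedIntStack implements IntStack {
    private static final class Node {
        final int value;
        final Node next;

        Node(int value, Node next) {
            this.value = value;
            this.next = next;
        }
    }

    private Node head = null; // top of the stack, or null when empty

    public void push(int value) {
        head = new Node(value, head);
    }

    public int pop() {
        if (head == null) {
            throw new IllegalStateException("stack is empty");
        }
        int value = head.value;
        head = head.next;
        return value;
    }

    public boolean isEmpty() {
        return head == null;
    }
}
```

Code that declares its variable as `IntStack` works unchanged with either class; only the line that constructs the stack differs.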


Summary

We want to ensure the quality and reliability of our programs by using methods which make them more comprehensible, amenable to proving their properties and as much like the way we think as possible.

So we use
• Data Abstraction
• Implementation Independence

• Both procedural and data abstraction ask you to think “what”, not “how”.
• An ADT is a collection of data and a set of operations on that data.
• Specifications indicate what ADT operations do, but not how to implement them.
• Data structures are part of an ADT’s implementation.
• ADT and data structures are not the same thing.

Slide Number 14

To summarise, when we do software development, one of our main goals should be to produce software that is reliable and of high quality. To achieve this goal, we need to make programs more comprehensible, amenable to proving their properties, and structured in a way that is close to our way of thinking, so that they are also more manageable. To do so we make use of data abstraction and implementation independence. The bullet points given above recap the definitions and concepts that we have seen in this first lecture. In the next lecture we will consider the first type of ADT: the list.
