Example 1 : For a program that can compute the average of n numbers, the actual numbers whose average we would like the program to compute in a particular run, as well as the value of n comprise the data. (If we consider the actual code of the program we are likely to find some more pieces of data.)The code of a program is based on certain assumptions about the data that is handled at different points during the execution of the program. These assumptions are mainly regarding -
Example 2 : Program to sort n names alphabetically.In Example 2, we have names as data items. Each name is a string of alphabetic characters (letters). The valid values of an alphabetic letter string and the operations on such strings are different from that of numeric data. So we may consider alphabetic letter string as another data type.
In example 1 it is clear that the given numbers have the relationship that is
their similarity of purpose, and it is distinct from the purpose of
n. Most likely the program will add the numbers one by one to obtain
their sum. This inherent relationship between the numbers leads us to
represent them in an array, as we shall see later. The
same is true for the list of names in example 2.
The memory bytes of a computer can basically hold numbers in binary form. But
as we have seen, we need to store other types of values, such as alphabetic
letters and strings of such letters. Even certain numeric types require
more than the basic accessible units. For this reason certain mapping is used
to convert from the users' data values to the numeric format supported by the
computers' hardware. For example, for the letters of the Roman alphabet there
is the ASCII encoding scheme wherein the letter A is represented internally
as 65, B as 66, and so on. Thus each letter of the Roman alphabet is
represented in a single byte. For other data types more bytes may be used to
represent a single value. For example, to represent a floating point number
(say) 8 consecutive bytes may be used, and different parts may have different
meanings (characteristic and mantissa).
Example 3 : There are the names of n students and the respective marks that each has obtained in a test.In Example 3, suppose we need a program that will tell the name of the student who has scored the highest mark in the test. The data types that are required are alphabetic letter string for the names and numeric for the marks. But in this case we need a way to associate each name with the corresponding mark. In other words we are dealing with an abstraction called tuple that is the pair (name, mark) for each student.
Some of the frequently encountered abstract data types are - tuples (also called records), lists, queues, circular queues, stacks, hierarchies (trees), tables, sets, files, graphs, etc.
Most programming languages provide some mechanisms for the user to implement such ADT using the basic data types. In data structures we consider the various possibilities of representing the ADTs using the different options supported by the programming languages. While it may be possible to implement an ADT in more than one way, the choice of implementation affects the convenience of the software developer, efficiency of the program and economy of memory utilization. To select the most suitable implementation of an ADT for a particular program, it is necessary to keep in mind the range of values required to be supported for each data item and the set of operations that are to be performed using the ADTs.
The basic elements of the mechanisms in programming languages for
implementing the ADTs are array ,
record and use of addresses .
Array is one of the basic data structures which can be used to implement several different ADTs.
Accessing the individual data items of an array - Suppose, each data item is of size s bytes and the memory area designated for the array starts at address A. Then the i th data item of the array is located at address A + (i - 1) * s. In general the objects cannot be distinguished in the memory by their contents.
The individual data items in an array must be of the same size, or else the
individual data items cannot be accessed using the above expression. This
requirement raises certain issues when the individual data items are likely
to be variable sizes, eg., names of persons.
Memory structure for a record - All the fields of a record are stored one after another in a contiguous memory area. The distance (offset) of a field from the beginning of the record is the former's characteristic, and the field name is actually mapped to this value. Thus if the starting address in memory of a record is known, each field can be accessed by adding the corresponding offset to it. The size of a record is the sum of the sizes of the individual fields. Note that a field can, in turn, be a record of some other type.
Pointer fields in records - A record can also have one or more "pointer", or address fields. These fields are used to locate other data items in the memory.
Like array, record is another basic data structures which can be used to
implement several different ADTs.
Unlike an array, the individual nodes of a linked list can be scattered at different places in the computer's memory. The notion of offset of a node from the beginning of the list does not hold here. The only way to get to a node is using its address. Generally a program does not maintain the address of the individual nodes outside the list. Hence the nodes can be accessed only by traversing the list from the head. This is a time consuming process compared to accessing the individual data items in an array. On the other hand, considering the economy of memory space, linked lists are suitable in situations when the maximum number of data items that may occur can vary greatly and cannot be determined in advance. However, for small number of items, the pointer fields may be an overhead.
Doubly linked list