Saturday, December 26, 2015

IT (3): Data Structure (Runtime complexity and algorithm approaches).......

Data Structure: Systematic organization of data, so as to optimize its usage
Data structure ought to be correct and should have low time complexity (less time usage, which can be achieved by less number of operations) and low space complexity (less memory usage).
The efficiency is a major criteria to tackle searching huge data size, maintain processor speed and cater to numerous user queries simultaneously.
Data structures  implements ADTs (stacks, queues, priority queue, tree, graph)
Interface: Operations (like, traversing, searching, insertion, deletion, sorting, merging)
Implementation: Architecture of data structure
Execution cases: worst, average and best (desirable)
A priori analysis:  theoretical analysis (before experiment)
A posterior analysis:  empirical analysis (after experiment)
Asymptomatic notations: Used to calculate run time complexity of an algorithm
Big Oh (O): Upper bound
Omega: Lower bound
Theta: Both upper and lower bound
Program run complexity can be constant, logarithmic, linear, nlogn, quadratic, cubic, polynomial, exponential 

Greedy approach: Achieves immediate optimization, but expensive for global scale
e.g. Travelling salesman problem, job scheduling problem

Divide and conquer approach: Consists of 3 steps i.e. divide/break, conquer/solve, merge/combine
e.g. Merge sort, Quick sort, Binary search

Dynamic programming approach: Almost like Divide and Conquer, though in stead of merging, memorization occurs
e.g. Fibonacci number series, project scheduling

Greedy algorithm (local optimization is objective) versus dynamic programming algorithm (global optimization is objective)

Built-in data:integer, boolean, float, character, string
Derived data: list, array, stack, queue
Linked List is a chain of data items. Each node of linked list contains an element and a pointer to next node. (types: simple (only forward navigation), doubly (both forward and backward navigation) and circular (last item points towards first item and first points towards last). Its used for insertion, deletion, display, search and remove.
Recursion: When a module or function calls itself
To prevent recursive function to go indefinitely, criteria followed are base criteria and progressive approach (to make the call closer to base criteria).
Stack frame keeps track of the recursion-related parameters.
While iteration and recursion do the same job, the latter is much efficient than the former. But, based on implementation, their time complexity varies. Space complexity of recursion is higher.

ADT: Abstract data type
Common ADT e.g. stacks (piles of things, vertical) and queues (sequence of things, horizontal).
-------------------------------------------------
Stacks(LIFO)
Insert or store: push
Remove or access: pop
Other stack-related functionalities are peek(), isFull(), isEmpty(), which furnishes information about the top element, full and empty status, respectively.
Top pointer indicates the outermost element.
Size can be fixed or dynamic
This ADT is implemented with array, structure, linked list
Common algorithm structure for each function is begin, return and end.
-------------------------------------------------
Queue(FIFO)
Insert or store: enqueue
Remove or access: dequeue
Other stack-related functionalities are peek(), isFull(), isEmpty(), which furnishes information about the top element, full and empty status, respectively.
Top pointer indicates the outermost element
Size can be fixed or dynamic
This ADT is implemented with array, structure, linked list
Common algorithm structure for each function is begin, return and end.
-------------------------------------------------
peek() returns an int
isFull() and isEmpty() return boolean value
-------------------------------------------------
Linear Search
Search engine moves along data sequence till the required element is found.
For each item in the sequence, if item ==value, return index of the item.
Run-time complexity O(n).
-------------------------------------------------
Binary Search
Search engine compares the value with middle element of the sequence.
Array must be sorted.
If value is larger, left half of array is discarded
I value is smaller, right half of array is discarded.
Very fast algorithm based on divide and conquer with run-time complexity O(logn).
#low, mid, high are the index of the elements
mid = low + (high - low) / 2
low = mid + 1

mid = low + (high - low) / 2
*Interpolation search, based on finding the position of required value is an improved form of binary search, with a run-time complexity of O(log(logn)).
#All the search methods return the index of the element.
-------------------------------------------------
Data structures: Array (simple list), record (tuple or struct), Hash table, Binary Tree, Graphs
Data structures can be linear (Array, List) or non-linear (Binary tree, B tree, Heaps, Tree)
Array: Bitmap, Dynamic array, Lookup table, matrix, sorted array
List: Linked list, skip list, doubly linked list
Hash table (hash map): To store data array, where each data element has its unique index. The knowledge of index makes data insertion and search fast. Hashing technique is used to generate the unique indices for the key values. Linear probing is used to find if a duplicate index has been generated. Key operations are search, insertion and deletion, all of which are based on computation of hash code and from it index finding.
Heap: binary heap, Fibonacci heap
Binary tree: binary search tree, AVL tree, red-black tree, scapegoat tree, T tree, splay tree
Sorting algorithms 
Sorting: Arranging data in a particular order (to make data search faster). Tow important aspect is in-place and stability.
Sorting can be in-place (no extra memory needed e.g. bubble sort, insertion sort, selection sort) or not in-place (extra memory needed e.g. merge sort).
Sorting can be stable (does not change sequence of original elements e.g. ) or not stable (does not change sequence of original elements e.g. ).
Bubble sort: Its simple algorithm that relies on swapping a pair of un-ordered numbers. Unsuitable for large data set as iterations number will be too large. Worst case complexity is O(n^2). Its in-place.
Insertion sort: A sorted sub-list is maintained and elements are inserted in proper position in the sub-list.. Unsuitable for large data set as iterations number will be too large. Worst case complexity is O(n^2). Its in-place.
Insertion sort: Array is divided into LHS sorted and RHS unsorted. Smallest element from unsorted part is put at proper position in sorted part. Unsuitable for large data set as iterations number will be too large. Worst case complexity is O(n^2). Its in-place.
--------------------------------
Merge sort: Sorting based on divide and conquer. The array is divided into equal parts (till units), followed by sorting and their combination. Worst case complexity is O(nlogn). Its not in-place.
Shell sort: Its an improved version of insertion sort. It sorts widely spaced elements followed by less widely spaced elements. The intervals are calculated by Knuth's formula. Divide and sort is the approach taken. Worst case complexity is O(n). Its in-place.
Quick sort: The right-most element is made pivot and based on it the array is divided into sub-arrays. Recursive sorting is done. Worst case complexity is O(nlogn). Its in-place.
***********************************
Graphs: Picture where objects are vertices and their connectors are edges
Depth first search (DFS) and Breadth first search (BFS) algorithms traverse the graph in depth and breadth-ward fashion, respectively.
Tree: Picture where nodes are linked by edges.(A root, paths, parent nodes , child nodes, leaf nodes)
One tree can have sub trees and many levels. elation between nodes can be parent-child or siblings.
Visiting: Checking the value of a node
Traversing: Passing through nodes (in order, pre order, post order)
A node has data part and references to its left and right child nodes.
Creation of a tree by insertion of an element and search by traversing.
Binary Tree: The tree-shaped data structure where each node can have at most 2 children.
Its merits include fast search as in ordered array and fast insertion/deletion  as in linked list.
LHS node has value less than parent whereas RHS node has value higher than parent.
element search starts from root (the head node) and depending on element to be searched, LHS or RHS sub tree is explored.

A binary search tree (BST) has worst case complexity is O(n), which is same with linear search. Data in sorted order leads to worst performance as tree becomes unbalanced with LHS and RHS height being stark. 
To address the lacuna with regular BST, other improved BST trees  have been developed.
AVL tree ensures that difference between LHS and RHS is not more than 1. To maintain balance, it performs 4 kinds of rotations (left, right, left-right, right-left).
Heap is a special type of balanced BST where root node key is compared with its children and arranged accordingly.
It can result in Min Heap (where value of root node is less than or equal to either of its children) or Max Heap (where value of root node is greater than or equal to either of its children).
#Tower of Hanoi is algorithm with formula (2^n)-1, where number of discs are n.
#Fibonacci Series is tye algorithm to generate  subsequent number by adding two previous numbers. 
-------------------------------------------
#Using the link http://www.tutorialspoint.com/compile_c_online.php
-------------------------------------------
#Hello World C program (first compile, then execute)
#include <stdio.h>

int main(){
   /* My first program in C */
   printf("Hello, World! \n");
 
   return 0;
}
gcc -o main *.c 
Hello, World!
-------------------------------------------
#Insertion of an element to an array (start; j = n; n = n+1; loop when  j >= k; array[j+1] =array[j]; j = j-1; array[k] =item; stop
#include <stdio.h>
main() {
   int init_array[] = {1,3,9};
   int item = 3, k = 4, n = 3;
   int i = 0, j = n;
 
   printf("Array elements were :\n");
   for(i = 0; i<n; i++) {
      printf("init_array[%d] = %d \n", i, init_array[i]);
   }
   n = n + 1;
   while( j >= k){
      init_array[j+1] = init_array[j];
      j = j - 1;
   }
   init_array[k] = item;
   printf("Updated array elements are:\n");
   for(i = 0; i<n; i++) {
      printf("init_array[%d] = %d \n", i, init_array[i]);
   }
}
Array elements were :                                                                                       
init_array[0] = 1                                                                                           
init_array[1] = 3                                                                                           
init_array[2] = 9                                                                                           
Updated array elements are:                                                                                 
init_array[0] = 1                                                                                           
init_array[1] = 3                                                                                           
init_array[2] = 9                                                                                           
init_array[3] = 4                                                                                           
-------------------------------------------
#Deletion of an element to an array (start; j = k; loop when  j < n; array[j-1] =array[j]; j = j+1; n = n-1; stop
#include <stdio.h>
main() {
   int init_array[] = {5,7,8};
   int k = 1, n = 3;
   int i, j;
 
   printf("The original array elements are :\n");

   for(i = 0; i<n; i++) {
      printf("init_array[%d] = %d \n", i, init_array[i]);
   }
 
   j = k;

   while( j < n){
      init_array[j-1] = init_array[j];
      j = j + 1;
   }

   n = n -1;
 
   printf("The array elements after deletion :\n");
for(i = 0; i<n; i++) {
      printf("init_array[%d] = %d \n", i, init_array[i]);
   }
}
The original array elements are :                                                                
init_array[0] = 5                                                                                
init_array[1] = 7                                                                                
init_array[2] = 8                                                                                
The array elements after deletion :                                                              
init_array[0] = 7                                                                                
init_array[1] = 8                                                                                
-------------------------------------------
#Search of an element to an array (start; j = 0; loop when  j < n; array[j] =item then print j, item; otherwise j = j+1; print j, item; stop
#include <stdio.h>
main() {
   int init_array[] = {1,7,9};
   int item = 7, n = 3;
   int i = 0, j = 0;
   printf("The original array elements are :\n");

   for(i = 0; i<n; i++) {
      printf("init_array[%d] = %d \n", i, init_array[i]);
   }
   while( j < n){
 if( init_array[j] == item ){
         break;
      }
      j = j + 1;
   }
   printf("Found element %d at position %d\n", item, j+1);

}
Original array elements are :                                                                    
init_array[0] = 1                                                                                
init_array[1] = 7                                                                                
init_array[2] = 9                                                                                
Found element 7 at position 2                                                                    
-------------------------------------------
Doubly linked list (implemented in C)
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <stdbool.h>

struct node {
   int data;  int key;
   struct node *next;    struct node *prev;
};

//this link always point to first Link
struct node *head = NULL;

//this link always point to last Link
struct node *last = NULL;
struct node *current = NULL;

//is list empty
bool isEmpty(){
   return head == NULL;
}

int length(){
   int length = 0;
   struct node *current;

   for(current = head; current != NULL; current = current->next){
      length++;
   }
return length;
}

//display the list in from first to last
void displayForward(){

   //start from the beginning
   struct node *ptr = head;

   //navigate till the end of the list
   printf("\n[ ");

   while(ptr != NULL){      
      printf("(%d,%d) ",ptr->key,ptr->data);
      ptr = ptr->next;
   }

   printf(" ]");
}

//display the list from last to first
void displayBackward(){

   //start from the last
   struct node *ptr = last;

   //navigate till the start of the list
   printf("\n[ ");

   while(ptr != NULL){  

      //print data
      printf("(%d,%d) ",ptr->key,ptr->data);

      //move to next item
      ptr = ptr ->prev;
      printf(" ");
   }

   printf(" ]");
}


//insert link at the first location
void insertFirst(int key, int data){

   //create a link
   struct node *link = (struct node*) malloc(sizeof(struct node));
   link->key = key;
   link->data = data;

   if(isEmpty()){
      //make it the last link
      last = link;
   }else {
      //update first prev link
      head->prev = link;
   }

   //point it to old first link
   link->next = head;

   //point first to new first link
   head = link;
}

//insert link at the last location
void insertLast(int key, int data){

   //create a link
   struct node *link = (struct node*) malloc(sizeof(struct node));
   link->key = key;
   link->data = data;

   if(isEmpty()){
      //make it the last link
      last = link;
   }else {
      //make link a new last link
      last->next = link;  
      //mark old last node as prev of new link
      link->prev = last;
   }

   //point last to new last node
   last = link;
}

//delete first item
struct node* deleteFirst(){

   //save reference to first link
   struct node *tempLink = head;

   //if only one link
   if(head->next == NULL){
      last = NULL;
   }else {
      head->next->prev = NULL;
   }

   head = head->next;
   //return the deleted link
   return tempLink;
}

//delete link at the last location

struct node* deleteLast(){
   //save reference to last link
   struct node *tempLink = last;

   //if only one link
   if(head->next == NULL){
      head = NULL;
   }else {
      last->prev->next = NULL;
   }

   last = last->prev;

   //return the deleted link
   return tempLink;
}

//delete a link with given key

struct node* delete(int key){

   //start from the first link
   struct node* current = head;
   struct node* previous = NULL;

   //if list is empty
   if(head == NULL){
      return NULL;
   }

   //navigate through list
   while(current->key != key){
      //if it is last node

      if(current->next == NULL){
         return NULL;
      }else {
         //store reference to current link
         previous = current;

         //move to next link
         current = current->next;          
      }
   }

   //found a match, update the link
   if(current == head) {
      //change first to point to next link
      head = head->next;
   }else {
      //bypass the current link
      current->prev->next = current->next;
   }  

   if(current == last){
      //change last to point to prev link
      last = current->prev;
   }else {
      current->next->prev = current->prev;
   }

   return current;
}

bool insertAfter(int key, int newKey, int data){
   //start from the first link
   struct node *current = head;

   //if list is empty
   if(head == NULL){
      return false;
   }

   //navigate through list
   while(current->key != key){

      //if it is last node
      if(current->next == NULL){
         return false;
      }else {        
         //move to next link
         current = current->next;          
      }
   }
   //create a link
   struct node *newLink = (struct node*) malloc(sizeof(struct node));
   newLink->key = key;newLink->data = data;

   if(current == last) {
      newLink->next = NULL; last = newLink;
   }else {
      newLink->next = current->next; current->next->prev = newLink;
   }

   newLink->prev = current; current->next = newLink;
   return true;
}
main() {
   insertFirst(1,10); insertFirst(2,20); insertFirst(3,30);

   printf("\nList (First to Last): "); displayForward();

   printf("\nList (Last to first): "); displayBackward();
 
   printf("\nList , after deleting first record: ");
   deleteFirst();displayForward();

   printf("\nList , after deleting last record: ");
   deleteLast();displayForward();

   printf("\nList , insert after key(2) : ");
   insertAfter(2,5, 11);displayForward();

   printf("\nList  , after delete key(2) : ");
   delete(2);displayForward();
  printf("\n");
}
List (First to Last):                                                                 
[ (3,30) (2,20) (1,10)  ]                                                             
List (Last to first):                                                                 
[ (1,10)  (2,20)  (3,30)   ]                                                          
List , after deleting first record:                                                   
[ (2,20) (1,10)  ]                                                                    
List , after deleting last record:                                                    
[ (2,20)  ]                                                                           
List , insert after key(2) :                                                          
[ (2,20) (2,11)  ]                                                                    
List  , after delete key(2) :                                                         
[ (2,11)  ]                                                                           
--------------------------------------------------

No comments:

Post a Comment