While data structures are the building blocks, algorithms are the procedures for laying these blocks in problem-solving.
In everything that we do, we are either generating or consuming data.
Whether it is making a transaction or scrolling over the feeds of social media, we are producing and using data at every point in time.
Data is simply raw information such as facts or numbers, often collected to be examined, processed and used for decision-making.
Things like the temperature of your room, the colour of the cloth that you are putting on, the number of followers on Twitter and even your age are all data. In fact, your exact geographical location is data.
In order to make the most out of these data, they have to be processed into meaningful information. This requires storing them in the computer in a form that is easy to retrieve and process.
So, data structures are ways of storing and organizing data in other to be easily accessed and operations efficiently carried out.
And this is where you need an algorithm for efficient storage and processing of the data.
So, what’s an algorithm?
Algorithms
In simple terms, an algorithm is a step-by-step sequence of activities or tasks designed to solve a problem.
Usually written in simple and plain English, the algorithm for a particular task is a finite sequence of instructions for performing the task.
An algorithm must be clear and precise enough to be understood by human beings. Though, programmers prefer a mix between plain English and a programming language. This intermediary is called pseudocode.
Pseudocode is an English-like nonstandard language that lets you state your solution with more precision than you can in plain English without using a formal programming language.
Pseudocodes allow programmers to focus on the program logic without having to be concerned just yet about the precise syntax of a particular programming language.
To formulate a good algorithm, it’s important that you ask yourself these questions before and after creating the algorithm.
- What is the purpose of the algorithm?
- Can it perform its intended task?
- Is it efficient?
In essence, creating an algorithm requires breaking down the problem into a series of smaller tasks and a specific order.
An algorithm is said to be efficient if it allows fast and easy retrieval of data and manipulation of data with minimal memory requirements and we will be looking at data structures in the next section.
Data Structures
Data structures are ways of collecting and organizing data for efficient storage and manipulation of data. For instance, in Python, you can store the name of a person with a string or a person’s detail with a dictionary.
You can even make use of an object-oriented programming approach and store a person with classes and objects.
The simplest of data structures are the primitive types and then there are complex types to store large and connected data such as stacks, queues, linked lists, trees, graphs and so on.
When data is organized into data structures, it becomes very easy to manipulate or perform certain operations on them. Some of the operations that can be performed on data structures include:
- Insertion
- Deletion
- Searching – finding if a given item exists in a data structure and the location
- Transversal or getting the specific items in a data structure
- Sorting
Primitive Types
Primitive data structures are also known as system-defined data types, and every other type of data structure is referred to as a user-defined or custom data type.
Python has four primitive data types and they include:
- Integer
- Floats
- Boolean
- Strings
Integers
Integers are whole numbers, such as 1, 10, 222, -5 and so on. They have no fractional part but can be positive or negative numbers.
Floats
These are numbers with decimal points or fractional parts. Numbers like 0.4. -100.004 and so on are floats. Like the integers, floats can be positive or negative.
Boolean
Boolean is a type that can only contain two values, True or False. It has only two possible values. These values are sometimes represented with 1 and 0, with 1 representing True and 0 representing False.
Strings
Strings are collections of different kinds of characters including alphabets, numbers and symbols. It comprises characters enclosed within double or single quotes. It is used to store text-based information or data.
User-defined data types
Aside from the primitives, every programming language comes with a set of built-in data types. These built-in data types are also referred to as user-defined data types. Here are the built-in data types that come with Python.
List
This is the most versatile data structure in Python. A list is an ordered sequence of information that can be accessed by index. It contains square brackets.
A list is created by enclosing data items in a square bracket [ ], with each item separated by a comma. These data items can be heterogenous in nature – meaning you can have integer, float, string and even other inbuilt data types in a list.
fruit = ['apple', 'mango', 'orange', 'grape'] num = [1, 2, 3, 4, 5, 6]
Tuple
It is created by enclosing data items in parentheses with each item separated by a comma.
dimension = (1.234, 4.405) students = (‘john’, ‘eve’, ‘steve’, ‘mark’)
Dictionary
Dictionary is a collection of items comprising key-value pairs separated by a comma that is enclosed in a curly brace. Each key is paired with a value using a colon.
country = {1:'England', 2:'Germany', 3:'Ghana', 4:'Mexico'}
1 is a key with the value ‘England’
Sets
Sets are closely related to sets in mathematics. A set is an unordered sequence of items with no duplicates. Items in a set are enclosed in curly braces and each item is separated from the other by a comma.
S = {1,2,3,4,5}
Abstract data types
These are data types that are created through a combination of other data structures. By default, these data types are not implemented in the programming language, but we can combine other data structures to create them.
Below are examples of these data structures.
- Stacks
- Queues
- LinkedLists
- Arrays
- Trees
- Hash Tables
- Graphs
Stacks
This is a type of data structure where data items can be added or removed from only one end. It uses Last In First Out (LIFO).
Just like a stack of books where you can collect a specific book by removing those at the end until you reach the location of the book.
You can only apply pop() or append(‘a’).
You add items in a stack at the end and remove or delete items from the end.
Queues
This represents the waiting line. Insertions can be made only at the end, deletions made only at the beginning. Insert and Delete operations are also known as enqueue and dequeue. A typical application of this type of data structure is in the spooling manager of a printer.
It’s a type of list where you can only add elements at the end and remove items from the beginning. It uses the concept of First In First Out (FIFO). You can only apply append and pop.
Linked Lists
Linked lists are collections of data items lined up in a row. Each node in a linked list contains its own information and the link to the next item in the collection.
Each item contains a pointer to its own item and a pointer to the next item. You can insert or delete items anywhere in the list. You can access s linked list through a reference to the first node of the list. Other nodes can be accessed by the link reference stored in the previous node.
An example of linked lists is the next and previous in browsers.
Arrays
An ordered collection of items. Items can be accessed by their indexes.
An array can be one-dimension or two-dimensional.
You can sequentially move through the items in an array using iteration tools or looping constructs.
Trees
This type of data structure resembles a real tree. It is an ordered sequence of linked nodes, in which each node has at most one parent node and zero or more children nodes within a specific order.
There are different types of trees like binary search trees, quadtrees and so on, but our emphasis is on the binary search tree.
A binary search tree does not have duplicate node values. The nodes on the left are smaller than the value of the parent node, while the nodes on the right have values less than that of the parent node.
Top-level nodes are called the root node. Nodes that have the same parents are known as siblings. Also, nodes under a given node are children of the node. Nodes that do not have children are known as leaves.
The height of a tree is the length of the longest downward steps from the root node to a leaf.
The shape of the binary tree depends on the order in which values are inserted into the tree.
Hash Tables
A Hash table is a data structure that stores items in key-value pairs. The keys are used to associate values. In Python, hash tables are implemented with dictionaries.
In some other languages, hash tables are also referred to as associated arrays.
Graphs
Graphs are non-linear data structures comprising a series of nodes called points or vertexes. The links that connect the vertexes are known as edges.