Strings in Python are sequences of characters such as digits, letters of the alphabet, symbols, and even non-printable characters. A string is a series of characters treated as a single unit.
In Python, a string consists of characters enclosed in matching single or double quotation marks.
When defining a string, the opening and closing quotation marks must match. For instance, if the string starts with a single quotation mark, it must also end with a single quotation mark.
S1 = 'this is string'
S2 = "this is correct"
#the next two lines raise a SyntaxError because the quotation marks do not match
#S3 = "this is incorrect'
#S4 = 'this is incorrect"
Indexing and slicing
Strings are sequences of characters, so each character has a position (index), counted from 0. As a result, the individual characters in a string can be accessed using their positional indexes.
For example:
name = 'henry'
print(name[0])
print(name[1])
print(name[2])
print(name[3])
print(name[4])
#output
#h
#e
#n
#r
#y
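As a small sketch of the indexing rule above: valid positions run from 0 to len(name) - 1, and asking for a position past the end raises an IndexError. The helper name char_at is hypothetical and only used here for illustration.

```python
def char_at(text, position):
    """Return the character at position, or None if the index is out of range."""
    try:
        return text[position]
    except IndexError:
        return None


name = 'henry'
first = char_at(name, 0)               # first character
last = char_at(name, len(name) - 1)    # last valid index is len(name) - 1
missing = char_at(name, 5)             # valid indexes are 0 to 4, so this is None

print(first, last, missing)
```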
Slicing strings
Slicing simply means extracting one or more characters from a string. The syntax for this is shown below:
color = 'yellow'
print(color[0:3])
#output
#yel
The slice above returns the characters of the string named color, starting at position 0 and ending at, but not including, position 3.
Therefore, color[3:6] will return ‘low’ as the result.
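The following sketch puts the two slices side by side and also shows two edge cases you may run into: a stop index past the end of the string is simply clamped, and a start index that comes after the stop index gives an empty string. The variable names are only illustrative.

```python
color = 'yellow'

head = color[0:3]        # characters at positions 0, 1, 2
tail = color[3:6]        # characters at positions 3, 4, 5
clamped = color[3:100]   # a stop index past the end is clamped to the end
empty = color[4:2]       # a start after the stop gives an empty string

print(head, tail, clamped, repr(empty))
print(head + tail == color)   # the two halves rebuild the original string
```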
Negative indexes
Aside from the regular positional indexing that starts from the left side of the string, you can also access a character by counting from the right side of the string using negative indexes.
The last character has an index of -1, the second to last has an index of -2, and so on.
color = 'yellow'
print(color[-1])
print(color[-2])
print(color[-3])
print(color[-4])
print(color[-5])
print(color[-6])
#outputs
#w
#o
#l
#l
#e
#y
Now, let’s apply this concept to slicing strings.
color = 'yellow'
print(color[0:-1])
print(color[0:-2])
print(color[0:-4])
print(color[-6:-1])
print(color[-6:5])
#output
#yello
#yell
#ye
#yello
#yello
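One way to read negative indexes, sketched below, is that an index of -k points at the same character as len(color) - k. The variable names are only illustrative.

```python
color = 'yellow'
length = len(color)

# Each negative index -k refers to the same character as length - k.
pairs_match = all(color[-k] == color[length - k] for k in range(1, length + 1))

last_two = color[-2:]           # from the second to last character to the end
without_last_two = color[:-2]   # everything except the last two characters

print(pairs_match, last_two, without_last_two)
```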
In slicing, if the start index is zero, you can leave it empty as shown below:
S = 'hello'
print(S[0:5])
print(S[:5])
#outputs
#hello
#hello
Likewise, if you want the slice to run through the last item in a sequence, you can leave the stop index empty. This is particularly useful if you don’t know the position of the last item in a sequence. Leaving both indexes empty returns the whole string.
S = 'hello'
print(S[0:5])
print(S[0:])
print(S[:])
#outputs
#hello
#hello
#hello
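Omitted indexes are handy for cutting a string into two parts at a given position. The sketch below uses a hypothetical helper named split_at; whatever position you pick, the two parts join back into the original string.

```python
def split_at(text, position):
    """Split text into the part before position and the part from position on."""
    return text[:position], text[position:]


S = 'hello'
left, right = split_at(S, 2)

print(left, right)
print(left + right == S)
```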
Slicing in steps
Including a second colon after the stop index lets you specify a step value. The step value is the amount by which the index increases on each move from the start index toward the stop index. If no value is provided, it defaults to 1, which means that every item is included. If the step value is 2, every second item is taken, skipping 1 item each time, as shown below.
S = 'hello'
print(S[::])
print(S[::2])
print(S[::3])
#outputs
#hello
#hlo
#hl
If the step value is a negative number, it means that the slicing will be in reverse as shown below.
S = 'hello'
print(S[::-1])
#output
#olleh
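A common use of the reversed slice is checking whether a word reads the same in both directions. The sketch below assumes a hypothetical helper named is_palindrome and also combines a start index with a step.

```python
def is_palindrome(text):
    """Return True if text reads the same forwards and backwards."""
    return text == text[::-1]


S = 'hello'
odd_positions = S[1::2]      # start at index 1, then take every second character
backwards_by_two = S[::-2]   # start at the end, then take every second character

print(is_palindrome('level'), is_palindrome(S))
print(odd_positions, backwards_by_two)
```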
Immutability of strings
Strings are immutable, meaning that you cannot remove or update characters in a string once it has been defined; attempting to do so raises an error. For instance:
color = 'yellow'
color[0] = 'Y'
#error message
#Traceback (most recent call last):
#File "/Users/ex.py", line 2, in <module>
#color[0] = 'Y'
#TypeError: 'str' object does not support item assignment
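Since a string cannot be changed in place, the usual approach is to build a new string instead. The sketch below shows two ways to do that, with slicing and with the built-in replace method; the original string stays untouched in both cases.

```python
color = 'yellow'

# Build a new string from a new first character and the rest of the old one.
capitalized = 'Y' + color[1:]

# str.replace also returns a new string rather than changing the original.
replaced = color.replace('y', 'Y')

print(capitalized, replaced)
print(color)   # the original string is unchanged
```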
Operations on strings
Strings and string manipulations are crucial in any programming language, especially Python. Let’s look at different operations that can be performed on a string.
Concatenation
You can add two or more strings together to get another string. This is done using the + operator.
S = 'hello' + 'world'
print(S)
S = 'hello' + ' ' + 'world' + '!'
print(S)
#outputs
#helloworld
#hello world!
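Two details are worth keeping in mind when concatenating, sketched below with illustrative variable names: the + operator only joins strings with strings, so a number has to be converted with str() first, and when you have a list of pieces the join method is a convenient alternative.

```python
words = ['hello', 'world']

# join places the separator string between the items of the list.
joined = ' '.join(words)
greeting = joined + '!'

# A number must be converted to a string before it can be concatenated.
count = 3
message = 'There are ' + str(count) + ' apples'

print(greeting)
print(message)
```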
Repetition
You can repeat a given character or group of characters any number of times using the * operator, as shown below:
print('h'*10)
print('hello'*3)
#output
#hhhhhhhhhh
#hellohellohello
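Repetition is often used to draw simple separators in text output. The sketch below underlines a title with as many dashes as the title has characters; the names title and underline are only illustrative.

```python
title = 'Report'

# Repeat the dash once for every character in the title.
underline = '-' * len(title)

# Repeating a string zero times gives an empty string.
nothing = 'ab' * 0

print(title)
print(underline)
```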
Iteration on strings
Strings are sequences of characters and can be iterated like every other type of sequence through looping.
color = 'yellow'
for char in color:
    print(char)
#output
#y
#e
#l
#l
#o
#w
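If you also need the position of each character while looping, the built-in enumerate function provides it. The sketch below collects the positions and counts the vowels in the same string; the variable names are only illustrative.

```python
color = 'yellow'

# enumerate yields (index, character) pairs while looping.
positions = []
for index, char in enumerate(color):
    positions.append((index, char))

# Count how many characters are vowels.
vowel_count = 0
for char in color:
    if char in 'aeiou':
        vowel_count += 1

print(positions)
print(vowel_count)
```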
Membership Test
Strings support membership tests. Using the in operator, you can determine whether a character or group of characters occurs in a given string. The outcome of this operation is either True or False.
color = 'yellow'
print('y' in color)
print('llo' in color)
print('x' in color)
#outputs
#True
#True
#False
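The membership test is case-sensitive, which the short sketch below illustrates. It also shows the not in form and one possible way to ignore case by lowercasing both sides first; the variable names are only illustrative.

```python
color = 'yellow'

has_upper_y = 'Y' in color                      # case matters, so this is False
has_y_any_case = 'Y'.lower() in color.lower()   # compare in lowercase instead
has_no_x = 'x' not in color                     # not in is the opposite test

print(has_upper_y, has_y_any_case, has_no_x)
```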
Triple quotes
Triple quotes are used to create multiline strings. They are also the conventional way to write docstrings, which are documentation strings placed as the first statement of a module, function, or class. Even though some consider this a way of writing comments, technically, docstrings and any other characters in triple quotes are strings.
You can use three single or three double quotes when defining such strings, but the opening and closing quotes must be of the same kind; otherwise, you will get a syntax error.
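To make the link between triple quotes and documentation concrete, here is a minimal sketch. The function greet and the variable verse are hypothetical; the point is that a triple-quoted string placed first in a function body is stored in its __doc__ attribute, while a triple-quoted string assigned to a variable is an ordinary string that keeps its line breaks.

```python
def greet(name):
    """Return a short greeting for name."""
    return 'Hello ' + name


# A triple-quoted string assigned to a variable keeps its line break.
verse = """first line
second line"""

print(greet.__doc__)
print(verse.count('\n'))
print(type(verse) is str)
```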
However, it’s recommended that you use ordinary single or double quotes when defining strings. Triple quotes are predominantly used for code documentation. Also, avoid using triple quotes for commenting unless the text is part of your program documentation.
"""
This is the documentation for the program
"""
#triple quotes used in defining strings
S = """Hello 1"""
print(S)
S = '''hello 2'''
print(S)
S = '''Wrong"""
#outputs (once the mismatched line is removed)
#Hello 1
#hello 2
#error raised while the mismatched line is present
#File "/Users/mac/Documents/portfolio/tools/ex.py", line 13
#S = '''Wrong"""
#^
#SyntaxError: unterminated triple-quoted string literal (detected at line 17)
Escape sequences
Escape sequences are character combinations starting with a backslash that are used to put non-printable or otherwise special characters into a string.
Here are some escape sequences and their meanings.
\' – single quote
\" – double quote
\\ – backslash
\n – new line
\r – carriage return
\t – horizontal tab
\b – backspace
\uxxxx – 16-bit Unicode hex value
\Uxxxxxxxx – 32-bit Unicode hex value
\ooo – character with octal value ooo
Examples of the use of escape sequences
Now, we will demonstrate the use of escape sequences with the following code examples.
Single and double quotes
The following examples illustrate how to include single or double quotes in a string. The simplest way is to use the other kind of quote as the outer quotation: if you want a double quote in your string, enclose the string in single quotes, and if you want a single quote in your string, enclose it in double quotes. Alternatively, you can escape the inner quote with a backslash (\' or \").
S = 'There are "3" apples'
print(S)
S = "There are '3' apples"
print(S)
#outputs
#There are "3" apples
#There are '3' apples
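When a string needs both kinds of quotes, switching the outer quotation is not enough and the \' and \" escape sequences from the list above come in. A minimal sketch, with illustrative variable names:

```python
# Inside single quotes, the single quote must be escaped.
S1 = 'It\'s a "quoted" word'

# Inside double quotes, the double quotes must be escaped.
S2 = "It's a \"quoted\" word"

print(S1)
print(S1 == S2)   # both literals describe the same string
```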
Tabs and new line
This is how to use the tab and new line escape sequences in Python.
S = 'This is line 1\nThis is line 2\nThis is line 3'
print(S)
S = 'Word 1\tWord 2\tWord 3'
print(S)
#outputs
#This is line 1
#This is line 2
#This is line 3
#Word 1  Word 2  Word 3
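An escape sequence is written with two characters but stored as a single one. The sketch below checks this with len, uses repr to display the escape sequences instead of applying them, and splits strings on the new line and tab characters. The variable names are only illustrative.

```python
S = 'a\tb\nc'

length = len(S)     # each escape sequence counts as one character
visible = repr(S)   # shows the escape sequences instead of applying them

lines = 'This is line 1\nThis is line 2'.split('\n')
columns = 'Word 1\tWord 2\tWord 3'.split('\t')

print(length)
print(visible)
print(lines)
print(columns)
```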
Backslash
The backslash has a special meaning to the Python interpreter, so in order to represent the character itself in your string, you have to escape it with another backslash as shown below:
S = 'C:\\Programs\\Hintacare'
print(S)
#output
#C:\Programs\Hintacare
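As an alternative to doubling every backslash, Python also offers raw strings, written with an r before the opening quote, in which backslashes are kept as they are. The sketch below compares both spellings of the same path; the variable names are only illustrative.

```python
escaped = 'C:\\Programs\\Hintacare'   # each \\ stands for one backslash
raw = r'C:\Programs\Hintacare'        # raw string: backslashes are kept as typed

backslash_length = len('\\')          # the escaped backslash is one character

print(escaped)
print(escaped == raw)
```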
Unicode and Octal Values
The following examples show how to represent Unicode and octal values in Python strings.
S = 'This is an octal value - \101'
print(S)
S = 'This is a 32-bit Unicode hex value - \U00000023'
print(S)
S = 'This is a 16-bit Unicode hex value - \u0066'
print(S)
#outputs
#This is an octal value - A
#This is a 32-bit Unicode hex value - #
#This is a 16-bit Unicode hex value - f
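The numbers used in these escape sequences are character codes, which you can inspect with the built-in ord and chr functions. The sketch below writes the letter f in three ways and compares the results; the variable names are only illustrative.

```python
from_u = '\u0066'       # 16-bit form, four hex digits
from_U = '\U00000023'   # 32-bit form, eight hex digits
from_octal = '\146'     # octal 146 is the same number as hex 66

code_point = ord('f')   # the numeric code of the character
same_char = chr(0x66)   # the character for a given code

print(from_u, from_U, from_octal)
print(code_point, same_char)
```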