String manipulations are an important part of any programming language. In Python, most of these are available as string methods. The table below gives a quick overview.
Overview Python String Methods
WHAT | PYTHON | Returns | PYTHON EXAMPLE | SPSS EXAMPLE |
---|---|---|---|---|
Extract Substring | [] | String | myString[0] | compute str01 = char.substr(str01,1,1). |
Concatenate 2(+) Strings | + or += | String | myString + myString | compute str01 = concat(str01,str02). |
Find Leftmost Occurrence of Substring | find | Integer | myString.find('a') | compute pos = char.index(str01,'a'). |
Find Rightmost Occurrence of Substring | rfind | Integer | myString.rfind('a') | compute pos = char.rindex(str01,'a'). |
Replace 1(+) Characters | replace | String | myString.replace('a','b') | compute str01 = replace(str01,'a','b'). |
Find Length of String | len | Integer | len(myString) | compute len01 = char.length(str01). |
Lowercase String | lower | String | myString.lower() | compute str01 = lower(str01). |
Uppercase String | upper | String | myString.upper() | compute str01 = upper(str01). |
Capitalize String | capitalize | String | myString.capitalize() | (None) |
Remove Characters from Left Part of String | lstrip | String | myString.lstrip() | compute str01 = ltrim(str01). |
Remove Characters from Right Part of String | rstrip | String | myString.rstrip() | compute str01 = rtrim(str01). |
Remove Characters from Left and Right Part of String | strip | String | myString.strip() | (None) |
Convert String to Integer | int | Integer | int(myString) | compute num01 = number(str01,comma16). (Or use ALTER TYPE.) |
Split String into Python List | split | List | myString.split(' ') | (None) |
Check if String Starts With... | startswith | Boolean | myString.startswith("var") | (None) |
Check if String Ends With... | endswith | Boolean | myString.endswith("var") | (None) |
Left Pad String with Zeroes | zfill | String | myString.zfill(3) | compute str01 = char.lpad(str01,3,'0'). |
Extract Substring in Python
We extract substrings in Python with square brackets that may contain one or two indices and a colon. Like so,
- myString[0] extracts the first character;
- myString[1:] extracts the second through last characters;
- myString[:4] extracts the first through fourth characters;
- myString[1:3] extracts the second through third characters;
- myString[-1] extracts the last character.
Python Substring Examples
begin program python3.
myString = 'abcdefghij'
print(myString[0]) # a
print(myString[1:]) # bcdefghij
print(myString[:4]) # abcd
print(myString[1:3]) # bc
print(myString[-1]) # j
end program.
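As a small additional sketch (plain Python; in SPSS, wrap it in begin program python3. and end program.): a slice includes its start index but excludes its stop index, and slice bounds beyond the string are clipped rather than raising an error. The variable names middle, lastThree, clipped and outOfRange are ours and only serve this illustration.

```python
myString = 'abcdefghij'

# A slice [start:stop] includes start but excludes stop.
middle = myString[2:5]
print(middle)       # cde
print(len(middle))  # 3, which is 5 - 2

# Negative indices count from the end of the string.
lastThree = myString[-3:]
print(lastThree)    # hij

# Slice bounds beyond the string are silently clipped...
clipped = myString[5:100]
print(clipped)      # fghij

# ...but a single index beyond the string raises an IndexError.
outOfRange = False
try:
    myString[100]
except IndexError:
    outOfRange = True
print(outOfRange)   # True
```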
Concatenating Strings in Python
Basically, + concatenates two or more strings. Additionally, myString += 'a' is a nice shorthand for myString = myString + 'a' that we'll often use for building SPSS syntax.
Python Concatenate Examples
begin program python3.
myString = 'abc'
print(myString + 'def') #abcdef
end program.
*2. CONCATENATE WITH "+=".
begin program python3.
myString = 'abc'
for i in range(5):
    myString += str(i)
print(myString) # abc01234
end program.
Note: in these examples, we're technically creating new string objects rather than truly changing existing string objects. This is because strings are immutable in Python.
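To illustrate what building SPSS syntax with += may look like, here's a minimal sketch. The variable names v1 through v3 are hypothetical, and the resulting command is only printed, not run. The last lines show str.join as an alternative that yields the same string.

```python
# Hypothetical variable names, used for illustration only.
varNames = ['v1', 'v2', 'v3']

# Build the command step by step with +=.
syntax = 'frequencies'
for varName in varNames:
    syntax += ' ' + varName
syntax += '.'
print(syntax)  # frequencies v1 v2 v3.

# The same result with str.join, which avoids repeated concatenation.
joined = 'frequencies ' + ' '.join(varNames) + '.'
print(joined == syntax)  # True
```

For a handful of variables, either approach is fine. With long lists, join is usually the tidier choice because each += creates a new string object.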
Find Leftmost Occurrence of Substring
Retrieving positions for single or multiple character substrings is done in Python with find. Keep in mind here that
- Python is fully case sensitive and
- Python objects are zero-indexed.
So in the examples below, the first character ('C') has index 0, the second ('y') has index 1 and so on.
Python Find Examples
begin program python3.
myString = 'Cycling in the mountains is fun.'
print(myString.find('c')) # 2
print(myString.find('in')) # 4
end program.
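The sketch below illustrates both points, plus one more that's easy to overlook: find returns -1 rather than raising an error if the substring doesn't occur. The variable names holding the results are ours.

```python
myString = 'Cycling in the mountains is fun.'

# 'C' and 'c' are different characters to Python.
upperPos = myString.find('C')
lowerPos = myString.find('c')
print(upperPos)  # 0
print(lowerPos)  # 2

# find returns -1 (not an error) if the substring doesn't occur.
missingPos = myString.find('z')
print(missingPos)  # -1

# For a case insensitive search, lowercase the string first.
insensitivePos = myString.lower().find('c')
print(insensitivePos)  # 0
```

Because -1 is also a valid index (the last character), it's safest to test the result of find before using it for extracting a substring.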
Find Rightmost Occurrence of Substring
In Python, rfind returns the index (again, starting from zero) for the rightmost occurrence of some substring in a string. The syntax below shows a couple of examples.
begin program python3.
myString = 'Cycling in the mountains is fun.'
print(myString.rfind('i')) # 25
print(myString.rfind('in')) # 21
end program.
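One possible use of rfind, sketched here under the assumption that words are separated by single spaces, is extracting the last word of a sentence by combining it with a substring. The names lastSpace and lastWord are ours.

```python
myString = 'Cycling in the mountains is fun.'

# Position of the rightmost space.
lastSpace = myString.rfind(' ')
print(lastSpace)  # 27

# Everything after the rightmost space is the last word.
lastWord = myString[lastSpace + 1:]
print(lastWord)  # fun.
```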
Replacing Characters in a Python String
Replacing characters in a string is done with replace in Python as shown below.
begin program python3.
myString = 'The cat caught the mouse in the living room.'
print(myString.replace('a','')) # The ct cught the mouse in the living room.
print(myString.replace('the','a')) # The cat caught a mouse in a living room.
end program.
Note: in the first print statement we replace all a's with an empty string. That is, we remove all a's from our example sentence. The second one leaves "The" untouched because replace is case sensitive.
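By default, replace changes every occurrence. As a brief sketch, its optional third argument limits the number of replacements, counting from the left. The variable name firstOnly is ours.

```python
myString = 'The cat caught the mouse in the living room.'

# Replace only the first (leftmost) occurrence of 'the'.
firstOnly = myString.replace('the', 'a', 1)
print(firstOnly)  # The cat caught a mouse in the living room.

# replace returns a new string: the original is left unchanged.
print(myString)   # The cat caught the mouse in the living room.
```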
Find Length of Python String
In Python, len returns the number of characters (not bytes) of some string object.
begin program python3.
myString = 'abcde'
print(len(myString)) # 5
end program.
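The difference between characters and bytes only shows up for non-ASCII characters. The sketch below assumes UTF-8 encoding, in which an accented e takes 2 bytes. The variable names numChars and numBytes are ours.

```python
# 'café', with the accented e written as an escape sequence.
myString = 'caf\u00e9'

# len counts characters...
numChars = len(myString)
print(numChars)  # 4

# ...whereas the UTF-8 encoded version holds 5 bytes.
numBytes = len(myString.encode('utf-8'))
print(numBytes)  # 5
```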
Convert Python String to Lowercase
For converting a Python string to lowercase, use lower as shown below.
begin program python3.
myString = 'SPSS Is Fun!'
print(myString.lower()) # spss is fun!
end program.
Convert Python String to Uppercase
In Python, upper converts a string object to uppercase.
begin program python3.
myString = 'This is Some Title'
print(myString.upper()) # THIS IS SOME TITLE
end program.
Capitalize Python String Object
In Python, “capitalizing” means returning a string with its first character in uppercase and all other characters in lowercase, even if they were uppercase in the original string.
begin program python3.
myString = 'aBcDeF'
print(myString.capitalize()) # Abcdef
end program.
Remove Characters from Left Part of String
In Python, lstrip() without arguments removes all whitespace (spaces, tabs and line breaks) from the beginning of a string. Other leading characters can be removed by specifying them within the parentheses, as in the second example below.
begin program python3.
myString = ' left padding removed'
print(myString.lstrip()) # left padding removed
end program.
*REMOVE ASTERISKS (*) FROM START OF STRING.
begin program python3.
myString = '****left padding removed'
print(myString.lstrip('*')) # left padding removed
end program.
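One detail worth a quick sketch: the argument of lstrip is treated as a set of characters rather than as a prefix, and stripping stops at the first character that's not in this set. The example strings and variable names below are ours.

```python
# Every leading '*' or '-' is removed, in whatever order they occur.
myString = '*-*-left padding removed'
stripped = myString.lstrip('*-')
print(stripped)  # left padding removed

# Without an argument, all leading whitespace is removed,
# including tabs and line breaks.
padded = ' \t\nleft padding removed'
unpadded = padded.lstrip()
print(unpadded)  # left padding removed

# Stripping stops at the first character that is not in the set.
partly = '**a*b'.lstrip('*')
print(partly)  # a*b
```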
Remove Characters from Right Part of String
The rstrip method works the same as lstrip but removes characters from the right side of some string.
begin program python3.
myString = 'right padding removed '
print(myString.rstrip()) # right padding removed
end program.
*REMOVE ASTERISKS (*) FROM END OF STRING.
begin program python3.
myString = 'right padding removed****'
print(myString.rstrip('*')) # right padding removed
end program.
Remove Characters from Left and Right Part of String
The strip method basically combines the Python lstrip and rstrip methods: it removes characters from both sides of a string.
begin program python3.
myString = ' left and right padding removed '
print(myString.strip()) # left and right padding removed
end program.
*REMOVE ASTERISKS (*) FROM START AND END OF STRING.
begin program python3.
myString = '****left and right padding removed****'
print(myString.strip('*')) # left and right padding removed
end program.
Sadly, this method doesn't have a direct SPSS equivalent, which is why we sometimes see LTRIM(RTRIM(MYSTRING)) in older syntax. Note that whitespace is often stripped automatically from string values in SPSS Unicode mode.
Convert String to Integer
In Python, int converts a string to an integer. If the string doesn't represent a valid integer (for instance '12a' or '1.5'), int raises a ValueError.
begin program python3.
myString = '123'
myInt = int(myString)
print(type(myInt)) # <class 'int'>
print(myInt) # 123
end program.
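If some strings may not hold valid integers, catching the error is one way to keep a program running. The sketch below uses a hypothetical helper function, toInt, that returns None for such strings. Both the function name and the choice for None are ours.

```python
def toInt(value):
    """Return value as an integer, or None if it can't be converted."""
    try:
        return int(value)
    except ValueError:
        return None

print(toInt('123'))   # 123
print(toInt(' 42 '))  # 42 (surrounding whitespace is allowed)
print(toInt('-7'))    # -7
print(toInt('12a'))   # None
print(toInt('1.5'))   # None
```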
Split Python String into List Object
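The example below splits a string into a Python list object. Called without arguments, split separates on any run of whitespace. Otherwise, it splits on the separator we specify, which can't be an empty string. Splitting a string into its separate characters can be done with a list comprehension, as in the second example below.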
begin program python3.
myString = 'A A C A B C'
myList = myString.split(' ')
print(type(myList)) # <class 'list'>
print(myList) # ['A', 'A', 'C', 'A', 'B', 'C']
end program.
*SPLIT STRING INTO PYTHON LIST WITHOUT SEPARATOR.
begin program python3.
myString = 'AACABC'
myList = [i for i in myString]
print(myList) # ['A', 'A', 'C', 'A', 'B', 'C']
end program.
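A short sketch of how the separator matters: split(' ') splits on every single space, so consecutive spaces result in empty strings, whereas split() without arguments treats any run of whitespace as one separator. The example string and variable names are ours.

```python
# Two spaces between A and B, three between B and C.
myString = 'A  B   C'

# Splitting on a single space yields empty strings for the extra spaces.
bySpace = myString.split(' ')
print(bySpace)  # ['A', '', 'B', '', '', 'C']

# Without arguments, any run of whitespace counts as one separator.
byWhitespace = myString.split()
print(byWhitespace)  # ['A', 'B', 'C']

# list() is a shorter alternative to the list comprehension
# for splitting a string into its separate characters.
chars = list('AACABC')
print(chars)  # ['A', 'A', 'C', 'A', 'B', 'C']
```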
Check if String Starts With...
begin program python3.
myString = 'abcdef'
print(myString.startswith('abc')) # True
print(myString.startswith('bcd')) # False
end program.
*TYPICAL USE OF STARTSWITH().
begin program python3.
if myString.startswith('a'):
    print("First character is 'a'.")
else:
    print("First character is not 'a'.")
end program.
Note: True and False are the (only) 2 possible values for Booleans. We mostly use them in Python if statements, when we only want to run one or more lines of code if some condition is met.
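A use that may come in handy is selecting names by their prefix. The sketch below assumes a list of hypothetical variable names. It also shows that startswith accepts a tuple of prefixes, in which case it returns True if the string starts with any of them.

```python
# Hypothetical variable names, used for illustration only.
varNames = ['age', 'inc_2019', 'inc_2020', 'sat_01']

# Keep only the names starting with 'inc_'.
incVars = [name for name in varNames if name.startswith('inc_')]
print(incVars)  # ['inc_2019', 'inc_2020']

# A tuple of prefixes matches names starting with any of them.
selected = [name for name in varNames if name.startswith(('inc_', 'sat_'))]
print(selected)  # ['inc_2019', 'inc_2020', 'sat_01']
```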
Check if String Ends With...
begin program python3.
myString = 'abcdef'
print(myString.endswith('f')) # True
print(myString.endswith('e')) # False
end program.
Left Pad String with Zeroes
In Python, zfill(3) left pads a string with zeroes up to a total length of 3 characters. We mostly do so when we want to sort numbers alphabetically: 002 comes before 010 and so on.
begin program python3.
myString = '1'
print(myString.zfill(3)) # 001
myString = '10'
print(myString.zfill(3)) # 010
end program.
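The sorting argument can be sketched as follows: strings holding numbers sort character by character, so '10' comes before '2' unless we pad them first. The list and variable names are ours.

```python
numbers = ['10', '2', '1']

# Alphabetical sorting compares character by character.
print(sorted(numbers))  # ['1', '10', '2']

# After zero padding, alphabetical order matches numeric order.
padded = [number.zfill(3) for number in numbers]
sortedPadded = sorted(padded)
print(sortedPadded)  # ['001', '002', '010']

# zfill never truncates: longer strings are returned unchanged.
longer = '12345'.zfill(3)
print(longer)  # 12345
```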
So that's about it for Python string methods. I hope you found this tutorial helpful.
Thanks for reading!