- SPSS Python Essentials
- Run Python from SPSS Syntax Window
- Wrap Python Code into Functions
- Write Your Own Python Module
- Create an SPSS Extension
SPSS Python Essentials
First off, using Python in SPSS always requires that you have
- SPSS,
- Python and
- the SPSS-Python plugin files installed on your computer.
These components are collectively known as the SPSS Python essentials. For recent SPSS versions, the Python essentials are installed by default. One way to check this is navigating to
in which you'll probably find some Python location(s) as shown below. So what should you see here? Well,
- if you see an active Python 3 location here, you're good to go;
- if you only see an active Python 2 location, then you can only use Python 2, which is no longer supported. Your best option is to upgrade to SPSS version 24 or (preferably) higher;
- if all locations are greyed out (or even absent), you have SPSS without any Python essentials installed. In this case, you'll need to (re)install a recent SPSS version.
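If you'd rather check from Python itself, the sketch below may help. It assumes you run it with the Python installation that SPSS uses, for instance inside a BEGIN PROGRAM PYTHON3 block. The function name spss_plugin_found is made up for this example. It merely reports the Python version and whether a module named spss can be found, without actually importing it.

```python
import importlib.util
import sys

def spss_plugin_found():
    """Return True if a module named 'spss' can be found by this Python.

    Hypothetical helper: it only looks the module up, it does not import it.
    """
    return importlib.util.find_spec("spss") is not None

python_major = sys.version_info.major
print("Running Python %d.%d" % (sys.version_info.major, sys.version_info.minor))
print("SPSS-Python plugin found: %s" % spss_plugin_found())
```

If this is run with some other Python on your computer, the plugin will most likely not be found, even if SPSS itself is set up correctly.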
Run Python from SPSS Syntax Window
Right. So if you have SPSS with the Python essentials properly installed, what's next?
Well, the simplest way to go is to run Python from an SPSS syntax window. Enclose all lines of Python between BEGIN PROGRAM PYTHON3. and END PROGRAM. as shown below.
Try and copy-paste-run the entire syntax below. Note that this Python block simply lowercases all variable names, regardless of what or how many they are.
data list free/V1 V2 v3 v4 EDUC gender SAlaRY.
begin data
end data.
*Run Python block for lowercasing all variable names.
begin program python3.
import spss,spssaux
oldNames = spssaux.GetVariableNamesList()
newNames = [var.lower() for var in oldNames]
spss.Submit("RENAME VARIABLES (%s = %s)."%(' '.join(oldNames),' '.join(newNames)))
end program.
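To see what the spss.Submit line actually sends to SPSS, you could build the command string in plain Python first. The sketch below does so with a hard-coded list of names instead of spssaux.GetVariableNamesList(), so it runs outside SPSS as well. Note that build_rename_syntax is a hypothetical helper, not part of the spss or spssaux modules.

```python
def build_rename_syntax(old_names):
    """Build a RENAME VARIABLES command that lowercases all given names (hypothetical helper)."""
    new_names = [name.lower() for name in old_names]
    # Both name lists are joined with spaces, just like in the block above
    return "RENAME VARIABLES (%s = %s)." % (" ".join(old_names), " ".join(new_names))

# Hard-coded names that mimic the test data used above
example_names = ["V1", "V2", "v3", "v4", "EDUC", "gender", "SAlaRY"]
syntax = build_rename_syntax(example_names)
print(syntax)
```

So Python doesn't rename anything by itself here: it merely writes a single line of SPSS syntax and hands it to SPSS.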
Wrap Python Code into Functions
Right, so we just ran some Python from an SPSS syntax window. Now, this works fine but doing so has some drawbacks:
- our syntax becomes less readable and manageable if it contains long Python blocks;
- if we use some Python block in several SPSS syntax files and we'd like to correct it, we'll need to correct it in each syntax file;
- the SPSS syntax editor is a poor text editor.
A first step towards resolving these issues is to wrap our Python code into a Python function.
data list free/V1 V2 v3 v4 EDUC gender SAlaRY.
begin data
end data.
*Define lowerCaseVars as Python function.
begin program python3.
def lowerCaseVars():
    import spss,spssaux
    oldNames = spssaux.GetVariableNamesList()
    newNames = [var.lower() for var in oldNames]
    spss.Submit("RENAME VARIABLES (%s = %s)."%(' '.join(oldNames),' '.join(newNames)))
end program.
*Run function.
begin program python3.
lowerCaseVars()
end program.
Note that we first define a Python function and then run it. Like so, you can develop a single SPSS syntax file containing several such functions.
Running this file just once (preferably with INSERT) defines all of your Python functions. You can now use these for all projects you'll work on during your SPSS session.
Write Your Own Python Module
We just defined and then ran a function. The next step is moving our function into a Python file: a plain text file with the .py extension that we'll place in C:\Program Files\IBM\SPSS Statistics\Python3\Lib\site-packages or wherever our site-packages folder is located.
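If you're not sure where that folder is, Python can probably tell you. A minimal sketch, assuming you run it inside a BEGIN PROGRAM PYTHON3 block so that it reports on the Python installation SPSS uses rather than some other Python on your computer:

```python
import sysconfig

# "purelib" is the folder where pure-Python modules are installed,
# which is normally the site-packages folder of the running Python
site_packages = sysconfig.get_paths()["purelib"]
print(site_packages)
```

The folder it prints depends entirely on your installation, so it may well differ from the example path mentioned above.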
We can now edit this file with Notepad++, which is much nicer than SPSS’ syntax editor. Since a Python file contains only Python, we'll leave out BEGIN PROGRAM PYTHON3. and END PROGRAM.
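As a sketch, assuming we name the file ruben.py (the module name that's imported in the syntax further below), its entire contents could be just the function we defined earlier:

```python
# ruben.py - assumed file name; save it in your site-packages folder

def lowerCaseVars():
    """Lowercase all variable names in the active SPSS dataset."""
    # These modules come with the SPSS Python essentials
    import spss, spssaux
    oldNames = spssaux.GetVariableNamesList()
    newNames = [var.lower() for var in oldNames]
    spss.Submit("RENAME VARIABLES (%s = %s)." % (' '.join(oldNames), ' '.join(newNames)))
```

Because the import line sits inside the function, the file itself can be opened and checked with any Python. Only calling lowerCaseVars() requires SPSS.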
If we now import our module in SPSS, we can readily run any function it contains as shown below. In this example, the file is named ruben.py, so the module is imported as ruben.
data list free/V1 v2 V3 V4 v5 V6.
begin data
end data.
*Import module and lowercase variable names.
begin program python3.
import ruben
ruben.lowerCaseVars()
end program.
Developing and using our own Python module has great advantages:
- each function is defined only once and it doesn't clutter up our syntax window;
- if we need to correct some function, we need to correct it only in one module that can be used by several SPSS syntax files;
- we can use functions within functions in our module. Doing so can make our code shorter and easier to manage.
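To illustrate that last point, the sketch below shows two functions sharing one helper. The names renameSyntax, lowerCaseSyntax and upperCaseSyntax are invented for this example. These functions only return SPSS syntax as text so you can try them outside SPSS; inside SPSS, you'd pass the result to spss.Submit.

```python
def renameSyntax(oldNames, newNames):
    """Helper: build one RENAME VARIABLES command from two lists of names."""
    return "RENAME VARIABLES (%s = %s)." % (' '.join(oldNames), ' '.join(newNames))

def lowerCaseSyntax(names):
    """Build syntax that lowercases all names, reusing the helper."""
    return renameSyntax(names, [name.lower() for name in names])

def upperCaseSyntax(names):
    """Build syntax that uppercases all names, reusing the same helper."""
    return renameSyntax(names, [name.upper() for name in names])

print(lowerCaseSyntax(["V1", "EDUC"]))
print(upperCaseSyntax(["v1", "educ"]))
```

If the RENAME VARIABLES command ever needs a correction, we'd now fix it in the helper only.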
A quick tip: if you're developing your module, reload it after each edit.
begin program python3.
import ruben,importlib # import ruben and importlib modules
importlib.reload(ruben) # use importlib to reload ruben module
ruben.lowerCaseVars() # run function from ruben module
end program.
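Why is reloading needed? Within one session, Python imports a module only once. A second import statement simply reuses the version that's already in memory, so your edits go unnoticed. The sketch below demonstrates this outside SPSS with a throwaway module in a temporary folder. The module name demo_reload_module and its greet function are made up for the occasion.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Skip bytecode caching so this demo always recompiles from the source file
sys.dont_write_bytecode = True

with tempfile.TemporaryDirectory() as folder:
    module_file = Path(folder) / "demo_reload_module.py"
    module_file.write_text("def greet():\n    return 'first version'\n")
    sys.path.insert(0, folder)  # make the temporary folder importable

    import demo_reload_module
    before = demo_reload_module.greet()

    # "Edit" the module on disk
    module_file.write_text("def greet():\n    return 'second version, edited'\n")

    import demo_reload_module  # a second import does not pick up the edit
    after_import = demo_reload_module.greet()

    importlib.reload(demo_reload_module)  # reloading does
    after_reload = demo_reload_module.greet()

    sys.path.remove(folder)

print(before)
print(after_import)
print(after_reload)
```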
Create an SPSS Extension
SPSS extensions are tools that can be developed by all SPSS users for a wide variety of tasks. For an outstanding collection of SPSS extensions, visit SPSS Tools - Overview.
Extensions are easy to install and can typically be run from SPSS menu dialogs as shown below.
So how does this work and what does it have to do with Python?
Well, most extensions define new SPSS syntax commands. These are not much different from built-in commands such as FREQUENCIES or DESCRIPTIVES. The syntax below shows an example from SPSS - Create All Scatterplots Tool.
SPSS TUTORIALS SCATTERS YVARS=costs XVARS=alco cigs exer age
/OPTIONS ANALYSIS=FITALLTABLES ACTION=RUN.
Now, running this SPSS syntax command basically passes its arguments (such as input/output variables, values or titles) on to an underlying Python function and runs it. This Python function, in turn, creates and runs SPSS syntax that gets the final job done.
Note that SPSS users don't see any Python when running this syntax, unless they manage to make the Python code crash. To actually see the Python code, you can unzip the SPSS extension (.spe) file and look for some Python (.py) file in the resulting folder.
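Roughly, and only as an illustration: the sketch below is not the code of the actual extension. It assumes a made-up function, scatterSyntax, that receives the variable names from the command above and builds one GRAPH command per pair of variables. A real extension would then pass such syntax to spss.Submit.

```python
def scatterSyntax(yvars, xvars):
    """Build one scatterplot command per y-x pair (hypothetical, not the extension's code)."""
    commands = []
    for yvar in yvars:
        for xvar in xvars:
            commands.append("GRAPH /SCATTERPLOT(BIVAR)=%s WITH %s." % (xvar, yvar))
    return commands

# Variable names taken from the example command above
for command in scatterSyntax(["costs"], ["alco", "cigs", "exer", "age"]):
    print(command)
```

This prints 4 GRAPH commands: one for each x variable combined with costs.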
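Since an .spe file can be unzipped, Python's standard zipfile module should be able to look inside one as well. The sketch below lists the Python files in such an archive without extracting anything. Note that listPythonFiles is a hypothetical helper and the file name in the usage comment is just a placeholder.

```python
import zipfile

def listPythonFiles(spePath):
    """Return the names of all .py files inside a zipped extension file (hypothetical helper)."""
    with zipfile.ZipFile(spePath) as archive:
        return [name for name in archive.namelist() if name.lower().endswith(".py")]

# Usage, with a placeholder file name:
# print(listPythonFiles("my_extension.spe"))
```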
Unzipping an SPSS extension (.spe) file results in a folder in which you'll usually find a Python (.py) file.
A final note on SPSS extensions is that developing them is seriously challenging and takes a lot of practice. However, well-written extensions can save you tons of time and effort over the years to come.
Thanks for reading!