Most real-world data contain some (or many) missing values. It's always a good idea to inspect the amount of missingness to avoid unpleasant surprises later on. For this purpose, SPSS has some missing values functions that are mostly used with COMPUTE, IF and DO IF. This tutorial demonstrates how to use them effectively, using the last 5 variables in hospital.sav.

Setting User Missing Values
Before discussing SPSS missing values functions, we'll first set 6 as a user missing value for the last 5 variables by running the line of syntax below.
missing values doctor_rating to facilities_rating (6).
SPSS Missing Values Functions
Expression | Meaning | Returns |
---|---|---|
MISSING | Evaluate whether value is system missing or user missing | True or false |
SYSMIS | Evaluate whether value is system missing | True or false |
NMISS | Return number of missing values over variables | Numeric value |
NVALID | Return number of valid values over variables | Numeric value |
SPSS MISSING Function
The SPSS MISSING function evaluates whether a value is missing (either a user missing value or a system missing value). For example, we'll flag cases that have a missing value on doctor_rating with the syntax below. If the COMPUTE command puzzles you, see Compute A = B = C for an explanation.
*1. Flag cases with a missing value on doctor_rating.
compute mis_1 = missing(doctor_rating).
*2. Move flagged cases to top of file.
sort cases mis_1 (d).
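MISSING is also useful as the first condition in DO IF. If a DO IF or ELSE IF condition itself evaluates to missing, SPSS skips the rest of the structure for that case, so testing for missingness first keeps such cases from being silently dropped. The lines below are a minimal sketch only: rating_group is a hypothetical new variable, and the cutoff of 3 assumes that valid ratings run from 1 through 5.

```spss
*Hypothetical sketch: group doctor_rating, handling missing values first.
*Assumes valid ratings run from 1 through 5 and an arbitrary cutoff of 3.
do if missing(doctor_rating).
  compute rating_group = 0.
else if doctor_rating <= 3.
  compute rating_group = 1.
else.
  compute rating_group = 2.
end if.
value labels rating_group 0 'Missing' 1 'Rating 3 or lower' 2 'Rating higher than 3'.
frequencies rating_group.
```

With the missing cases caught up front, every case should receive a value on rating_group.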

SPSS SYSMIS Function
The SPSS SYSMIS function evaluates whether a value is system missing. For example, the syntax below uses IF to replace all system missing values with 99. We'll then label this value, specify it as user missing and run a quick check with FREQUENCIES.
*1. Replace system missing values with 99.
if sysmis(doctor_rating) doctor_rating = 99.
*2. Add value label 99.
add value labels doctor_rating 99 'Recoded system missing value'.
*3. Specify 6 and 99 as user missings.
missing values doctor_rating (6,99).
*4. Quick check.
frequencies doctor_rating.
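When several variables need the same treatment, RECODE with its SYSMIS keyword may be a more compact route than one IF per variable. The sketch below assumes you want the same code (99) and the same label on all 5 rating variables, which the tutorial itself only does for doctor_rating.

```spss
*Sketch: convert system missing values to 99 for all 5 rating variables.
recode doctor_rating to facilities_rating (sysmis = 99).
add value labels doctor_rating to facilities_rating 99 'Recoded system missing value'.
*MISSING VALUES overwrites earlier definitions, so 6 is listed again.
missing values doctor_rating to facilities_rating (6, 99).
frequencies doctor_rating to facilities_rating.
```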

SPSS NMISS Function
The SPSS NMISS function counts missing values within cases over variables. Cases with many missing values may be suspicious and you may want to exclude them from analysis with FILTER or SELECT IF. The syntax below runs a quick scan for such cases.
*1. Count missing values per case over the 5 rating variables.
compute mis_2 = nmiss(doctor_rating to facilities_rating).
*2. Apply variable label. Tip: indicate number of variables involved here.
variable labels mis_2 'Number of missing values over doctor_rating to facilities_rating (5 variables)'.
*3. Quick check.
frequencies mis_2.
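Once mis_2 exists, excluding cases with too many missing values could look like the sketch below. The cutoff of 2 is an arbitrary assumption and filt_1 is a hypothetical variable name. FILTER is used rather than SELECT IF because it leaves the excluded cases in the data.

```spss
*Sketch: analyze only cases with at most 2 missing values (arbitrary cutoff).
compute filt_1 = (mis_2 <= 2).
filter by filt_1.
*Procedures run at this point use only the filtered cases.
frequencies doctor_rating.
*Switch the filter off again.
filter off.
```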

SPSS NVALID Function
The SPSS NVALID function counts the number of valid values over variables within each case. It equals the number of variables minus NMISS over those variables. Note that the dot operator, as in MEAN.3(...), is a faster alternative for excluding cases from statistical functions (such as MEAN and SUM): the function then returns a system missing value unless at least that many of its arguments are valid.
compute valid_1 = nvalid(doctor_rating to facilities_rating).
exe.
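As a sketch of both points, the lines below compute a mean only for cases with at least 3 valid ratings and then check that the valid and missing counts add up to the number of variables. The threshold of 3 and the names mean_1 and check_1 are assumptions made for illustration.

```spss
*Sketch: mean over the 5 ratings, only if at least 3 of them are valid.
compute mean_1 = mean.3(doctor_rating to facilities_rating).
*Check: valid plus missing counts should add up to 5 for every case.
compute check_1 = nvalid(doctor_rating to facilities_rating) + nmiss(doctor_rating to facilities_rating).
frequencies check_1.
```

Cases with fewer than 3 valid ratings get a system missing value on mean_1, so they drop out of any analysis of that variable without a separate filter.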
THIS TUTORIAL HAS 7 COMMENTS:
By Nicky on December 8th, 2019
I’ve seen articles that use logistic regression, chi-square tests, or t-tests to test whether a particular set of demographic variables predicts missingness. Can SPSS perform this analysis?
By Ruben Geert van den Berg on December 8th, 2019
Hi Nicky, good question!
Some patterns of missingness have been defined by (I believe) Rubin: MCAR, MAR and MNAR.
I believe some of those are implemented in the SPSS Missing Value Analysis module. This requires an additional license. You can see if you have it by running
SHOW LICENSE.
Or see if there's "Missing Value Analysis" directly in the Analyze menu. If it isn't there, you don't have it.
I don't use this module myself so I can't guide you any further than that.
Hope that helps!