3.13 Expression Check (S)

Purpose: Checks whether the values in a string column of the student’s tibble are correctly transformed.

Motivation: Many questions involve string manipulation, formatting, and concatenation. This Check targets one specific way a string transformation can go wrong and returns a hint toward the correct fix. It is one of the more common Special Checks and can take many forms.

#Expression Check Examples (with example hints)

#Example 1: Checking if a column has removed the apostrophe "s" (i.e., 's) from its values
else if(sum(str_detect(variable_name$column_name, "'s"), na.rm = TRUE) > 0){
  test.results[2, 4] <- "There are `'s` in `column_name`. Please remove all of the `'s` when modifying `column_name`."
}

#Example 2: Checking if a column has correctly removed all observations that are not a full name (i.e., names that only contain the first initial and last name, like "D. Smith")
#The pattern is anchored with ^ so that full names containing a middle initial (e.g., "John D. Smith") are not flagged
else if(sum(str_detect(variable_name$column_name, "^\\w\\.\\s\\w+$"), na.rm = TRUE) > 0){
  test.results[2, 4] <- "Make sure to remove all observations that are not a full name (e.g., D. Smith) from `column_name`. Hint: Try functions like str_replace() or grepl() with the appropriate regular expression."
}

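#Example 3: Checking if a column has correctly rounded its values to the hundredths place
#Note: the pattern expects exactly two digits after the decimal point in the text of each value, so a value stored as 2.5 (rather than "2.50") is flagged
else if(!all(str_detect(variable_name$column_name, "\\.\\d{2}$"), na.rm = TRUE)){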
  test.results[2, 4] <- "All values in `column_name` should be formatted to the second decimal place. Hint: Try using the round() function with the `digits` argument."
}

#Example 4: Checking if a column has correctly converted its year values to the corresponding decade (e.g., 1987 should be converted to 1980)
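#Test each value separately: the sum of the column can be a multiple of 10 even when individual values are not (e.g., 1985 + 1985)
else if(any(variable_name$column_name %% 10 != 0, na.rm = TRUE)){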
  test.results[2, 4] <- "Make sure `column_name` only contains multiples of 10 (e.g., 1950, 1960, etc.). Hint: Convert the year values into multiples of 10. For example, 1987 should be converted to 1980. Consider using the `%/%` (integer division) operator, but there are other possible solutions as well."
}
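
To see how conditions like these behave, it can help to run the patterns on a few toy values before placing them in a Check. The following is a minimal sketch and not part of the grading scripts: all object names are illustrative, and base R's grepl() is used in place of str_detect() so that the snippet runs without loading any packages. Note that grepl() returns FALSE for NA, whereas str_detect() returns NA.

```r
# Toy columns; every name below is illustrative and not part of the grading scripts.
name_col   <- c("D. Smith", "Dana Smith", NA)
price_col  <- c("3.14", "2.50", "7.1")
decade_col <- c(1980, 1987, NA)

# Example 2 pattern (anchored here): a single initial, a period, a space, a last name.
initial_only <- grepl("^\\w\\.\\s\\w+$", name_col)

# Example 3 pattern: exactly two digits after the decimal point at the end of the text.
two_decimals <- grepl("\\.\\d{2}$", price_col)

# Example 4 idea: test each year on its own for divisibility by 10.
not_decade <- decade_col %% 10 != 0

print(initial_only)                    # TRUE FALSE FALSE
print(two_decimals)                    # TRUE TRUE FALSE
print(any(not_decade, na.rm = TRUE))   # TRUE

# round() alone does not pad trailing zeros once a number is turned into text.
rounded_text <- as.character(round(2.5, 2))   # "2.5"
padded_text  <- sprintf("%.2f", 2.5)          # "2.50"
print(c(rounded_text, padded_text))
```

The last two lines suggest why a pattern-based rounding check may flag a correctly rounded numeric value: the pattern only passes when the trailing zero is present in the text.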

Technicals

Since the Expression Check assumes that the student’s answer has the correct column name, it is essential that the Column Name Check is placed beforehand.

The Expression Check is a subset of the Value Check.

Like the Calculation Check and the NA Check, the Expression Check should be placed before or within the Value Check and does not have to be implemented alongside it.
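
The ordering described above can be sketched as a small if / else if chain. This is only an illustration under stated assumptions: `answer`, its columns, the shape of `test.results`, and the one-line stand-in for the Column Name Check are all hypothetical, and grepl() is used instead of str_detect() so the snippet runs in base R.

```r
# Hypothetical student answer; `answer`, `owner`, and `decade` are illustrative names.
answer <- data.frame(
  owner  = c("Dana's", "Lee", NA),
  decade = c(1980, 1987, 1990),
  stringsAsFactors = FALSE
)

# Hypothetical results table; only the [2, 4] hint cell mirrors the examples above.
test.results <- matrix("", nrow = 2, ncol = 4)

# Stand-in for the Column Name Check. It comes first so that the later
# branches can safely refer to answer$owner and answer$decade.
if (!all(c("owner", "decade") %in% names(answer))) {
  test.results[2, 4] <- "Check your column names."
} else if (any(grepl("'s", answer$owner, fixed = TRUE))) {
  # Expression Check: runs only once the column names are known to be correct.
  test.results[2, 4] <- "There are `'s` in `owner`. Please remove all of the `'s`."
} else if (any(answer$decade %% 10 != 0, na.rm = TRUE)) {
  # A second Expression Check; it is reached only if the one above passes.
  test.results[2, 4] <- "Make sure `decade` only contains multiples of 10."
}

print(test.results[2, 4])
```

Because the branches are chained, only the first failing Check writes a hint; here that is the apostrophe hint, even though `decade` also contains a value that is not a multiple of 10.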

For examples of how to check multiple columns simultaneously, see below.

#Dynamic Expression Check (with an example hint)

#Checking if three columns (column_A, column_B, column_C) have converted their values to lowercase...
else if(any(str_detect(variable_name$column_A, "[A-Z]"),
            str_detect(variable_name$column_B, "[A-Z]"),
            str_detect(variable_name$column_C, "[A-Z]"), na.rm = TRUE)){

  #Flag which of the three columns still contain an uppercase letter (NA values are ignored)
  q2_upper_check <- c(any(str_detect(variable_name$column_A, "[A-Z]"), na.rm = TRUE),
                      any(str_detect(variable_name$column_B, "[A-Z]"), na.rm = TRUE),
                      any(str_detect(variable_name$column_C, "[A-Z]"), na.rm = TRUE))

  q2_upper_name <- c("column_A", "column_B", "column_C")

  #Name only the flagged columns in the hint; the message is kept on one line so no line breaks or indentation leak into it
  test.results[2, 4] <- paste0("The following column(s) contain observations that are not converted to lowercase: ",
                               paste(q2_upper_name[q2_upper_check], collapse = ", "),
                               ". Hint: use str_to_lower() when necessary.")
}
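
When more than a few columns are involved, the same idea can be written once and applied to a vector of column names. The following is a sketch under assumptions, not the method used in the grading scripts: `answer`, `cols_to_check`, `has_upper`, and `hint` are hypothetical names, and grepl() stands in for str_detect() so the snippet runs in base R.

```r
# Hypothetical student answer with the three columns from the example above.
answer <- data.frame(
  column_A = c("apple", "Pear"),
  column_B = c("kiwi", NA),
  column_C = c("FIG", "plum"),
  stringsAsFactors = FALSE
)

cols_to_check <- c("column_A", "column_B", "column_C")

# One TRUE/FALSE per column: does any value still contain an uppercase letter?
# grepl() returns FALSE for NA, so missing values do not trigger the Check.
has_upper <- vapply(cols_to_check,
                    function(col) any(grepl("[A-Z]", answer[[col]])),
                    logical(1))

hint <- ""
if (any(has_upper)) {
  # List only the columns that failed, separated by commas.
  hint <- paste0("The following column(s) contain observations that are not converted to lowercase: ",
                 paste(cols_to_check[has_upper], collapse = ", "),
                 ". Hint: use str_to_lower() when necessary.")
}

print(hint)
```

With this layout, adding a fourth column only requires extending `cols_to_check`; the condition and the hint do not need to be edited.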