
Check Missing Values and Validity of Year Column
This function checks for missing values in a specified column of a data frame
that contains years. It also ensures that the values are numeric and fall within
a valid range. If transform_numeric is TRUE, the function attempts to convert
non-numeric values to numeric and, as the example below shows, stops with an
error when the column cannot be converted.
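The three checks described above (missing values, numeric conversion, range) can be sketched in base R. This is only an illustration under stated assumptions: sketch_year_check and the names of its list elements are hypothetical, and the code is not the package's implementation.

```r
# Hypothetical helper, NOT the package's implementation: it mirrors the
# checks described above using base R only.
sketch_year_check <- function(years, year_range = c(1800, 2024)) {
  # Positions that are missing before any conversion is attempted
  missing_idx <- which(is.na(years))

  # as.numeric() yields NA (with a warning) for text it cannot parse
  numeric_years <- suppressWarnings(as.numeric(as.character(years)))

  # Positions that hold a value but have no numeric reading
  unparsed_idx <- which(is.na(numeric_years) & !is.na(years))

  # Positions whose numeric value lies outside the allowed range
  out_of_range_idx <- which(
    !is.na(numeric_years) &
      (numeric_years < year_range[1] | numeric_years > year_range[2])
  )

  list(
    missing = missing_idx,
    unparsed = unparsed_idx,
    out_of_range = out_of_range_idx
  )
}

res <- sketch_year_check(c("2001", NA, "two thousand and ten", "2050"))
```

For this input the sketch flags position 2 as missing, position 3 as not convertible, and position 4 as outside the default range.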
Usage
check_year_column(
  df,
  col_year,
  year_range = c(1800, 2024),
  transform_numeric = TRUE
)
Arguments
- df
A data frame containing the data to be checked.
- col_year
The name of the column in the data frame that contains the year values.
- year_range
A numeric vector of length 2 specifying the valid range of years (e.g.,
c(1900, 2024)). Defaults to c(1800, 2024).
- transform_numeric
Logical. If TRUE, attempts to convert non-numeric year values to numeric. Defaults to TRUE.
Value
A list containing:
- missing_values: Indices of missing values in the year column.
- invalid_years: Indices of values that fall outside the valid year range.
- updated_df: A data frame with the updated year values and a flag indicating whether the values were transformed.
Examples
df <- data.frame(year = c("2001", "2005", NA, "two thousand and ten", "2018", "2050"))
result <- check_year_column(df, "year", year_range = c(1800, 2024))
#> Error in check_year_column(df, "year", year_range = c(1800, 2024)): The column could not be converted to numeric.
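The error above is consistent with the entry "two thousand and ten", which has no numeric reading. A minimal base-R sketch for locating such entries before calling the function follows. The variable names parsed and bad_rows are illustrative, and nothing here relies on the package's internals.

```r
df <- data.frame(year = c("2001", "2005", NA, "two thousand and ten", "2018", "2050"))

# as.numeric() returns NA (with a warning) for text it cannot parse;
# as.character() guards against the column having been read as a factor
year_text <- as.character(df$year)
parsed <- suppressWarnings(as.numeric(year_text))

# Rows that hold a value but have no numeric reading
bad_rows <- which(is.na(parsed) & !is.na(year_text))
year_text[bad_rows]
```

Here bad_rows points at row 4, the only entry that is present but not convertible. Row 3 is a genuine NA and is left for the missing-value check.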