This function imports data from a CSV or Excel file and converts all character columns to UTF-8. For CSV files, the file encoding is detected automatically and the column separator can be customized. For Excel files, the sheet to read can be specified. For both file types, you can indicate whether the first row contains column names.

Usage

auto_import_utf8(
  file_path,
  file_type = c("csv", "xlsx"),
  sheet = NULL,
  header = TRUE,
  sep = ","
)

Arguments

file_path

A string specifying the path to the file to import.

file_type

A string specifying the file type. Options are "csv" or "xlsx". Default is "csv".

sheet

A string or integer specifying the sheet to read from an Excel file. Ignored if file_type is "csv". Default is NULL, which means the first sheet is read.

header

A logical value indicating whether the first row contains column names. Default is TRUE.

sep

A string specifying the column separator in a CSV file. Ignored if file_type is "xlsx". Default is ",".

Value

A data frame with all character columns converted to UTF-8 encoding.
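One way to confirm this property on a returned data frame is to check every character column for valid UTF-8 byte sequences. The helper below is a minimal sketch using base R's validUTF8(); the name all_chr_utf8 is hypothetical and not part of the package.

```r
# Hypothetical helper: TRUE if every character column of `df`
# contains only valid UTF-8 byte sequences.
all_chr_utf8 <- function(df) {
  is_chr <- vapply(df, is.character, logical(1))
  all(vapply(df[is_chr], function(col) all(validUTF8(col)), logical(1)))
}

# A correctly encoded column passes the check
ok <- data.frame(city = "Gen\u00e8ve", n = 1, stringsAsFactors = FALSE)
all_chr_utf8(ok)

# A Latin-1 byte (0xE8) that was never converted fails the check
bad <- data.frame(city = rawToChar(as.raw(c(0x47, 0xe8))), stringsAsFactors = FALSE)
all_chr_utf8(bad)
```

A data frame with no character columns passes trivially, since there is nothing to validate.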

Details

  • For CSV files, the function automatically detects the file's encoding using the stringi::stri_enc_detect() function and reads the data with data.table::fread().

  • For Excel files, the function uses readxl::read_excel() to read the data.

  • After importing, all character columns are explicitly converted to UTF-8 encoding to ensure consistency.

  • You can specify whether the first row contains column names (header) and adjust the separator (sep) for CSV files.

  • The sheet parameter allows you to specify which sheet to read when importing Excel files.
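The CSV path described above (detect the encoding, read with fread(), convert to UTF-8) can be sketched roughly as follows. This is an illustration under stated assumptions, not the package's actual source: the helper names guess_encoding and read_csv_utf8 and the encoding argument are hypothetical, as is the fallback to "UTF-8" when detection fails. Because fread() only accepts "unknown", "UTF-8", or "Latin-1" for its encoding argument, the sketch reads the bytes as-is and converts afterwards with stringi::stri_encode().

```r
library(stringi)
library(data.table)

# Hypothetical helper: guess a file's encoding from its raw bytes.
guess_encoding <- function(file_path) {
  bytes <- readBin(file_path, what = "raw", n = file.size(file_path))
  # stri_enc_detect() returns a list of data frames, best candidate first
  candidates <- stringi::stri_enc_detect(bytes)[[1]]
  best <- candidates$Encoding[1]
  if (is.na(best)) "UTF-8" else best  # assumption: fall back to UTF-8
}

# Hypothetical helper: read a CSV file and return UTF-8 character columns.
read_csv_utf8 <- function(file_path, header = TRUE, sep = ",",
                          encoding = guess_encoding(file_path)) {
  # Read the bytes without re-encoding; conversion happens below
  df <- data.table::fread(file_path, header = header, sep = sep,
                          encoding = "unknown", data.table = FALSE)
  # Convert each character column from the source encoding to UTF-8
  for (j in which(vapply(df, is.character, logical(1)))) {
    df[[j]] <- stringi::stri_encode(df[[j]], from = encoding, to = "UTF-8")
  }
  df
}
```

Encoding detection is heuristic and can be unreliable on short or mostly-ASCII files, which is why this sketch lets the caller pass encoding explicitly, for example read_csv_utf8("data/file.csv", sep = ";", encoding = "ISO-8859-1").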

Examples

if (FALSE) { # \dontrun{
# Import a semicolon-separated CSV file; character columns are converted to UTF-8
df_csv <- auto_import_utf8("data/file.csv", file_type = "csv", header = TRUE, sep = ";")

# Import an Excel file, reading from the first sheet by default
df_xlsx <- auto_import_utf8("data/file.xlsx", file_type = "xlsx")

# Import an Excel file, specifying a sheet and no header
df_xlsx_sheet <- auto_import_utf8("data/file.xlsx", file_type = "xlsx", sheet = "Sheet2", header = FALSE)
} # }