Enums and Factors Guide
How to map Rust enums to R factors and character strings.
miniextendr provides two complementary systems for enum-like types:
| System | R Representation | Partial Match | Default | Use Case |
|---|---|---|---|---|
| RFactor | Factor (integer + levels) | No | - | Categorical data for table(), lm(), etc. |
| MatchArg | Character scalar | Yes | First choice | Parameter validation (match.arg() style) |
RFactor: enum as R Factor
Maps a Rust enum to an R factor with levels. Each variant becomes a level.
#[derive(Copy, Clone, RFactor)]
pub enum Color {
Red, // level index 1
Green, // level index 2
Blue, // level index 3
}
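The level indices in the comments above can be illustrated with a small standalone sketch. It uses plain Rust only; `LEVELS`, `to_code`, and `from_code` are hypothetical names chosen for illustration and are not part of miniextendr's API. The only fact it relies on is that R factor codes are 1-based positions into the levels vector.

```rust
// Standalone sketch: plain Rust, no miniextendr types involved.
#[derive(Copy, Clone, Debug, PartialEq)]
pub enum Color {
    Red,
    Green,
    Blue,
}

/// Level labels in declaration order (assumed to mirror the variant order).
pub const LEVELS: [&str; 3] = ["Red", "Green", "Blue"];

/// Hypothetical helper: R factor codes are 1-based, so add one to the discriminant.
pub fn to_code(color: Color) -> i32 {
    color as i32 + 1
}

/// Hypothetical helper: map a 1-based code back to a variant,
/// rejecting codes that fall outside the level range.
pub fn from_code(code: i32) -> Option<Color> {
    match code {
        1 => Some(Color::Red),
        2 => Some(Color::Green),
        3 => Some(Color::Blue),
        _ => None,
    }
}
```

Subtracting one from a code gives the index into `LEVELS`, which is how a factor's integer payload and its labels relate on the R side.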
Use in functions:
#[miniextendr]
pub fn describe(color: Color) -> &'static str {
match color {
Color::Red => "warm",
Color::Green => "cool",
Color::Blue => "cool",
}
}
#[miniextendr]
pub fn favorite() -> Color {
Color::Blue
}
From R:
describe(factor("Red", levels = c("Red", "Green", "Blue")))
# [1] "warm"
favorite()
# [1] Blue
# Levels: Red Green Blue
Rename Variants
#[derive(Copy, Clone, RFactor)]
#[r_factor(rename_all = "snake_case")]
pub enum Status {
InProgress, // level: "in_progress"
Completed, // level: "completed"
NotStarted, // level: "not_started"
}
#[derive(Copy, Clone, RFactor)]
pub enum Priority {
#[r_factor(rename = "lo")]
Low,
#[r_factor(rename = "med")]
Medium,
#[r_factor(rename = "hi")]
High,
}
Supported rename_all values: snake_case, kebab-case, lower, upper.
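To make the snake_case mapping concrete, here is a simplified sketch of the conversion applied to variant names. `to_snake_case` is a hypothetical helper, not the function the derive macro uses, and its handling of acronyms or digits may differ from the real implementation.

```rust
/// Hypothetical helper: convert a PascalCase variant name to snake_case.
/// Simplified: inserts `_` before every uppercase ASCII letter except the
/// first, so an acronym such as `HTTPError` becomes `h_t_t_p_error`.
pub fn to_snake_case(variant: &str) -> String {
    let mut out = String::with_capacity(variant.len() + 4);
    for (i, ch) in variant.chars().enumerate() {
        if ch.is_ascii_uppercase() {
            if i > 0 {
                out.push('_');
            }
            out.push(ch.to_ascii_lowercase());
        } else {
            out.push(ch);
        }
    }
    out
}
```

Applied to the `Status` variants above, this yields the level names shown in the comments.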
Factor Vectors
Use FactorVec<T> for vectors and FactorOptionVec<T> for vectors with NA:
use miniextendr_api::{FactorVec, FactorOptionVec};
#[miniextendr]
pub fn all_colors() -> FactorVec<Color> {
FactorVec(vec![Color::Red, Color::Green, Color::Blue])
}
#[miniextendr]
pub fn parse_colors(input: FactorOptionVec<Color>) -> Vec<&'static str> {
input.0.iter().map(|c| match c {
Some(Color::Red) => "red",
Some(Color::Green) => "green",
Some(Color::Blue) => "blue",
None => "NA",
}).collect()
}
From R:
all_colors()
# [1] Red Green Blue
# Levels: Red Green Blue
x <- factor(c("Red", NA, "Blue"), levels = c("Red", "Green", "Blue"))
parse_colors(x)
# [1] "red" "NA" "blue"
Caching
The #[derive(RFactor)] macro generates a OnceLock-cached levels STRSXP. The
levels string vector is allocated once and reused for all subsequent conversions,
giving ~4x speedup for single-value conversions.
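The caching pattern can be sketched with `std::sync::OnceLock` alone. This is an illustration of the technique, not the generated code: a `Vec<&'static str>` stands in for the levels STRSXP, and `cached_levels`, `build_count`, and the `BUILDS` counter are hypothetical names added only to show that initialization runs once.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Counts how often the level vector is built (for demonstration only).
static BUILDS: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical stand-in for the cached levels vector: a plain Vec of labels.
/// The closure runs on the first call only; later calls return the same Vec.
pub fn cached_levels() -> &'static Vec<&'static str> {
    static LEVELS: OnceLock<Vec<&'static str>> = OnceLock::new();
    LEVELS.get_or_init(|| {
        BUILDS.fetch_add(1, Ordering::SeqCst);
        vec!["Red", "Green", "Blue"]
    })
}

/// Number of times the initializer has actually run.
pub fn build_count() -> usize {
    BUILDS.load(Ordering::SeqCst)
}
```

Every call after the first skips the allocation entirely, which is where the saving on repeated single-value conversions would come from.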
Via #[miniextendr]
Instead of #[derive(RFactor)], you can use the attribute macro:
#[miniextendr]
#[derive(Copy, Clone)]
pub enum Color { Red, Green, Blue }
These are equivalent. #[miniextendr] on a fieldless enum dispatches to the same
RFactor derive internally.
MatchArg: enum as string parameter
Maps a Rust enum to R character strings with match.arg() validation. Supports
partial matching and defaults to the first variant when NULL is passed.
#[derive(Copy, Clone, MatchArg)]
pub enum Mode {
Fast, // choice: "Fast"
Safe, // choice: "Safe"
Debug, // choice: "Debug"
}
Use in functions:
#[miniextendr]
pub fn run(mode: Mode) -> String {
match mode {
Mode::Fast => "running fast".into(),
Mode::Safe => "running safe".into(),
Mode::Debug => "running debug".into(),
}
}
The generated R wrapper shows the choice list directly as the formal default:
run <- function(mode = c("Fast", "Safe", "Debug")) {
mode <- if (is.factor(mode)) as.character(mode) else mode
mode <- base::match.arg(mode)
.Call(C_run, mode)
}
The enum's CHOICES are spliced in at cdylib-write time (not stored in an R
variable), so ?run and tab-completion both show the real options. If you set
an explicit default = "\"Safe\"", the splice rotates that value to position 1
of the vector. Because match.arg(arg) returns arg[1] when arg matches the formal
default, the rotated value becomes the effective default while the rest of
the choices remain visible: function(mode = c("Safe", "Fast", "Debug")). The
default value must be one of the enum's choices; otherwise the cdylib panics
at write time.
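The reordering described above can be sketched as a move-to-front over the choice list. `splice_default` is a hypothetical helper written for illustration; it returns an `Err` for an unknown default, whereas the text above says the real splice panics.

```rust
/// Hypothetical helper: move `default` to the front of `choices`,
/// keeping the remaining choices in their original order.
/// Returns an error if `default` is not one of the choices.
pub fn splice_default<'a>(
    choices: &[&'a str],
    default: &str,
) -> Result<Vec<&'a str>, String> {
    let pos = choices
        .iter()
        .position(|c| *c == default)
        .ok_or_else(|| format!("default {:?} is not one of {:?}", default, choices))?;
    let mut out = Vec::with_capacity(choices.len());
    // The requested default goes first...
    out.push(choices[pos]);
    // ...followed by every other choice, order preserved.
    out.extend(
        choices
            .iter()
            .enumerate()
            .filter(|(i, _)| *i != pos)
            .map(|(_, c)| *c),
    );
    Ok(out)
}
```

With the `Mode` choices and a default of `"Safe"`, the result is `Safe, Fast, Debug`, matching the formal shown in the paragraph above.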
From R:
run("Fast") # exact match
run("F") # partial match → "Fast"
run() # NULL → default (first choice: "Fast")
run("Saf") # partial match → "Safe"
run("X") # Error: 'arg' should be one of "Fast", "Safe", "Debug"
Rename Variants
Same syntax as RFactor but with #[match_arg(...)]:
#[derive(Copy, Clone, MatchArg)]
#[match_arg(rename_all = "snake_case")]
pub enum BuildStatus {
InProgress, // choice: "in_progress"
Completed, // choice: "completed"
}
#[derive(Copy, Clone, MatchArg)]
pub enum Priority {
#[match_arg(rename = "lo")] Low,
#[match_arg(rename = "med")] Medium,
#[match_arg(rename = "hi")] High,
}
Via #[miniextendr]
#[miniextendr(match_arg)]
#[derive(Copy, Clone)]
pub enum Mode { Fast, Safe, Debug }
Inline String Choices
For simple cases where you don't need an enum, use choices(...) on a &str parameter:
#[miniextendr]
pub fn correlate(
x: f64, y: f64,
#[miniextendr(choices("pearson", "kendall", "spearman"))] method: &str,
) -> String {
format!("method={}, cor={}", method, x * y)
}
Multiple Choices with several_ok
R's match.arg(..., several.ok = TRUE) accepts multiple values from the choice
list and returns a character vector. miniextendr exposes this with
several_ok, which is valid on both match_arg and choices:
// Enum: Vec<Mode> - each element validated against MatchArg::CHOICES
#[miniextendr]
pub fn pick_modes(#[miniextendr(match_arg, several_ok)] modes: Vec<Mode>) -> String { ... }
// Inline: Vec<String> - each element validated against the inline list
#[miniextendr]
pub fn pick_metrics(
#[miniextendr(choices("mean", "median", "sd", "var"), several_ok)] metrics: Vec<String>,
) -> String { ... }
Accepted container shapes: Vec<T>, Box<[T]>, &[T], [T; N], and
Missing<Vec<T>> (optional). several_ok without match_arg or choices
is a compile error (no choice list to validate against). several_ok on a
scalar type (e.g. Mode without a Vec) is also a compile error.
Default behavior when the R caller omits the argument: the full choice list,
matching base::match.arg. Pass a single string to get partial matching;
pass a character vector to get multiple exact or partial matches.
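The matching behavior can be approximated in plain Rust. In miniextendr the validation is done by R's `match.arg` in the generated wrapper, so the following is only a sketch of the rule (an exact match wins, otherwise a prefix must be unique); `resolve_choice` and `resolve_choices` are hypothetical names, and edge cases may differ from R's `pmatch`.

```rust
/// Hypothetical helper: resolve one input against the choice list.
/// An exact match wins; otherwise a prefix match counts only if it is unique.
pub fn resolve_choice<'a>(input: &str, choices: &[&'a str]) -> Option<&'a str> {
    if let Some(exact) = choices.iter().find(|c| **c == input) {
        return Some(*exact);
    }
    let mut prefixed = choices.iter().filter(|c| c.starts_with(input));
    match (prefixed.next(), prefixed.next()) {
        (Some(only), None) => Some(*only),
        _ => None, // no match, or an ambiguous prefix
    }
}

/// Hypothetical helper: several_ok-style validation over many inputs,
/// failing on the first element that does not resolve.
pub fn resolve_choices<'a>(
    inputs: &[&str],
    choices: &[&'a str],
) -> Result<Vec<&'a str>, String> {
    inputs
        .iter()
        .map(|input| {
            resolve_choice(input, choices)
                .ok_or_else(|| format!("{:?} should be one of {:?}", input, choices))
        })
        .collect()
}
```

Note that a prefix shared by two choices, such as `"me"` against `"mean"` and `"median"`, resolves to nothing rather than to the first candidate.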
On Impl-Block Methods
Rust rejects attribute macros on function parameters inside impl items, so
match_arg / choices / several_ok on impl methods use method-level
attributes that name the parameter. This works for all class systems
(r6, env, s3, s4, s7, vctrs) on both constructors and instance
methods:
#[miniextendr(r6)]
impl Counter {
// Constructor: first choice becomes the R default.
#[miniextendr(match_arg(mode))]
pub fn new(mode: Mode) -> Self { ... }
// several_ok variant - note the distinct attribute name.
#[miniextendr(match_arg_several_ok(modes))]
pub fn reset(&mut self, modes: Vec<Mode>) -> i32 { ... }
// Inline string choices - pass the list as a comma-separated string.
#[miniextendr(choices(level = "low, medium, high"))]
pub fn describe(level: String) -> String { ... }
}
Each form generates the same R match.arg prelude you get on standalone
functions, including the choices vector as the formal default.
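For the `choices(level = "low, medium, high")` form, the comma-separated string has to be split into individual choices before it can become an R vector. A minimal sketch of that step follows; `parse_choice_list` is a hypothetical helper, and the trimming of whitespace and skipping of empty entries are assumptions rather than documented behavior.

```rust
/// Hypothetical helper: split a comma-separated choice string into entries.
/// Assumes surrounding whitespace is insignificant and empty entries are skipped.
pub fn parse_choice_list(list: &str) -> Vec<&str> {
    list.split(',')
        .map(str::trim)
        .filter(|s| !s.is_empty())
        .collect()
}
```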
The vctrs class system accepts match_arg on its fn new() constructor
even though vctrs constructors return a data vector (Vec<T>) rather than
Self. The vctrs generator recognizes a receiverless new as the
constructor regardless of return type.
Auto-Injected @param Docs
When you leave a match_arg parameter undocumented, miniextendr fills in
the roxygen @param line at write time using the enum's CHOICES:
#' @param mode One of "Fast", "Safe", "Debug".
This runs for both standalone functions and impl-block methods across every
class system. Explicit @param lines you write yourself are preserved
verbatim; only missing entries are auto-generated.
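As an illustration of how such a line can be assembled from a choice list, consider the sketch below. `param_doc` is a hypothetical helper; only the output format is taken from the example above.

```rust
/// Hypothetical helper: build a roxygen @param line from a parameter name
/// and its choices, quoting each choice and joining them with commas.
pub fn param_doc(name: &str, choices: &[&str]) -> String {
    let quoted: Vec<String> = choices.iter().map(|c| format!("\"{}\"", c)).collect();
    format!("#' @param {} One of {}.", name, quoted.join(", "))
}
```

Calling `param_doc("mode", &["Fast", "Safe", "Debug"])` reproduces the line shown above.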
Returning Vec<Enum>
Functions can return Vec<T> for any MatchArg enum. Each variant round-trips
to its choice string:
#[miniextendr]
pub fn all_modes() -> Vec<Mode> {
vec![Mode::Fast, Mode::Safe, Mode::Debug]
}
From R:
all_modes()
# [1] "Fast" "Safe" "Debug"
This is provided by a blanket impl<T: MatchArg> IntoR for Vec<T> in
miniextendr-api. No extra derive is required.
MatchArg as Base Trait
MatchArg is the base trait for all enum-like types. RFactor requires MatchArg
as a supertrait, so any RFactor type also has MatchArg::CHOICES, from_choice(),
and to_choice(). Use MatchArg as a bound for generic code over both systems:
use miniextendr_api::MatchArg;
fn describe_choices<T: MatchArg>() -> String {
T::CHOICES.join(", ")
}
fn lookup<T: MatchArg>(choice: &str) -> Option<T> {
T::from_choice(choice)
}
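To experiment with this kind of generic code without the crate, a standalone stand-in for the trait can be written by hand. `MatchArgLike` below is hypothetical: the `CHOICES` and `from_choice` shapes follow the usage above, while the `to_choice` signature and the exact-match behavior of `from_choice` are assumptions.

```rust
/// Hypothetical stand-in for the real trait; member signatures are assumptions.
pub trait MatchArgLike: Sized + Copy {
    const CHOICES: &'static [&'static str];
    fn from_choice(choice: &str) -> Option<Self>;
    fn to_choice(self) -> &'static str;
}

#[derive(Copy, Clone, Debug, PartialEq)]
pub enum Mode {
    Fast,
    Safe,
    Debug,
}

// Hand-written impl; in miniextendr the derive macro would generate this.
impl MatchArgLike for Mode {
    const CHOICES: &'static [&'static str] = &["Fast", "Safe", "Debug"];

    fn from_choice(choice: &str) -> Option<Self> {
        match choice {
            "Fast" => Some(Mode::Fast),
            "Safe" => Some(Mode::Safe),
            "Debug" => Some(Mode::Debug),
            _ => None,
        }
    }

    fn to_choice(self) -> &'static str {
        match self {
            Mode::Fast => "Fast",
            Mode::Safe => "Safe",
            Mode::Debug => "Debug",
        }
    }
}

/// Generic over any type implementing the stand-in trait.
pub fn describe_choices<T: MatchArgLike>() -> String {
    T::CHOICES.join(", ")
}

/// Checks that every choice string parses and maps back to itself.
pub fn round_trips<T: MatchArgLike>() -> bool {
    T::CHOICES
        .iter()
        .all(|c| T::from_choice(c).map(T::to_choice) == Some(*c))
}
```

A round-trip check like `round_trips` is a cheap way to confirm that a choice list and its conversions stay in sync.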
Comparison Table
| Feature | RFactor | MatchArg |
|---|---|---|
| R storage | factor(1, levels=c(...)) | "Fast" (character) |
| Validation | Type check (is factor with correct levels) | match.arg() with partial matching |
| Default on NULL | Error | First choice |
| Vec support | FactorVec<T>, FactorOptionVec<T> | Vec<T> return + several_ok inputs |
| Partial matching | No | Yes ("F" → "Fast") |
| Factor input | Native | Converted to character first |
| Use case | Categorical data | Parameter selection |
When to Use Which
RFactor when:
- Data is categorical (colors, species, status codes)
- Working with R functions expecting factors (table(), lm(), ggplot2)
- Need vector support with NA handling
- Factor level ordering matters
MatchArg when:
- Building an API with string-based options
- Want R's match.arg() partial matching and error messages
- Want a default value when the argument is omitted
- Validating user input parameters
See Also
- MINIEXTENDR_ATTRIBUTE.md: #[miniextendr] on enums
- TYPE_CONVERSIONS.md: full type conversion reference