dvs_add()
Track files in dvs storage
dvs_add() hashes files, copies their contents into storage, writes a .dvs
meta file next to each one, and returns a tibble of results. Pass paths, a glob,
or both.
dvs_add(paths = character(0), message = NULL, glob = NULL, dry_run = NULL)🔗Parameters
| Name | Type | Default | Behavior |
|---|---|---|---|
paths | character | character(0) | Files to add. |
message | character(1) | NULL | Message recorded in each meta file. |
glob | character(1) | NULL | Library-expanded glob. Uses a literal path separator (use ** to recurse). |
dry_run | logical(1) | NULL (treated as FALSE) | Report what would be added; write nothing. |
A progress bar is shown automatically while files are processed (the wrapper
creates an internal progress_callback handle); it is not a parameter. No bar
is shown for a dry run.
🔗Setup
options(width = 1000)
library(dvs)
proj <- tempfile("project_")
store <- tempfile("storage_")
dir.create(proj)
dir.create(store)
dir.create(file.path(proj, "data", "sub"), recursive = TRUE)
write.csv(mtcars[1:8, ], file.path(proj, "data", "f1.csv"))
write.csv(mtcars[9:16, ], file.path(proj, "data", "f2.csv"))
write.csv(iris[1:10, ], file.path(proj, "data", "g1.csv"), row.names = FALSE)
write.csv(iris[11:20, ], file.path(proj, "data", "g2.csv"), row.names = FALSE)
write.csv(iris[21:30, ], file.path(proj, "data", "sub", "g3.csv"), row.names = FALSE)
setwd(proj)
dvs_init(store)DVS Initialized🔗paths
Add a single file. The result has one row per file: the on-disk size, the
stored (compressed) stored_size, the content hash, and outcome.
setwd(proj)
dvs_add("data/f1.csv")# A tibble: 1 × 5
path outcome hash size stored_size
<chr> <chr> <chr> <bytes> <bytes>
1 data/f1.csv copied 5e2c49dd5c8a24a2ffe102e42804812c7eabc2c49682240480558eac390c5d65 488 B 306 B
Add several files in one call:
setwd(proj)
dvs_add(c("data/f2.csv"))# A tibble: 1 × 5
path outcome hash size stored_size
<chr> <chr> <chr> <bytes> <bytes>
1 data/f2.csv copied c9c0ca2ba19bf1d65159ba2b7300c3336fdf113c92b79901e06be4b40fcd8871 500 B 275 B🔗message
Record a message in the meta files. It surfaces later as the message column
of dvs_status().
setwd(proj)
dvs_add("data/g1.csv", message = "iris head sample")# A tibble: 1 × 5
path outcome hash size stored_size
<chr> <chr> <chr> <bytes> <bytes>
1 data/g1.csv copied 881b133ef44731569b7844e1dc60f10e5fc4a9f2c2ac36fe20b2bba636d6bbce 312 B 140 B🔗glob
A library-expanded glob uses a literal path separator: data/*.csv matches
files in data/ only, not in subdirectories.
setwd(proj)
dvs_add(glob = "data/*.csv")# A tibble: 4 × 5
path outcome hash size stored_size
<chr> <chr> <chr> <bytes> <bytes>
1 data/f1.csv present 5e2c49dd5c8a24a2ffe102e42804812c7eabc2c49682240480558eac390c5d65 488 B 488 B
2 data/f2.csv present c9c0ca2ba19bf1d65159ba2b7300c3336fdf113c92b79901e06be4b40fcd8871 500 B 500 B
3 data/g1.csv present 881b133ef44731569b7844e1dc60f10e5fc4a9f2c2ac36fe20b2bba636d6bbce 312 B 312 B
4 data/g2.csv copied a8eebfc0b894a2818e8c9edffe742df01a934073e3602a978836d622665786f7 312 B 147 B
Use ** to cross directory boundaries. data/**/*.csv matches the nested file
that data/*.csv skipped.
setwd(proj)
dvs_add(glob = "data/**/*.csv")# A tibble: 5 × 5
path outcome hash size stored_size
<chr> <chr> <chr> <bytes> <bytes>
1 data/f1.csv present 5e2c49dd5c8a24a2ffe102e42804812c7eabc2c49682240480558eac390c5d65 488 B 488 B
2 data/f2.csv present c9c0ca2ba19bf1d65159ba2b7300c3336fdf113c92b79901e06be4b40fcd8871 500 B 500 B
3 data/g1.csv present 881b133ef44731569b7844e1dc60f10e5fc4a9f2c2ac36fe20b2bba636d6bbce 312 B 312 B
4 data/g2.csv present a8eebfc0b894a2818e8c9edffe742df01a934073e3602a978836d622665786f7 312 B 312 B
5 data/sub/g3.csv copied 6d19965b2ae61ed235114e9739c4e1d1d52994d3be9f95e436ac1b7c69bb4582 310 B 146 B🔗dry_run
With dry_run = TRUE the result reports what would be added, but no blob or meta
file is written.
setwd(proj)
write.csv(iris[31:40, ], "data/preview.csv", row.names = FALSE)
dvs_add("data/preview.csv", dry_run = TRUE)# A tibble: 1 × 5
path outcome hash size stored_size
<chr> <chr> <chr> <bytes> <bytes>
1 data/preview.csv copied 172ce4213a1914a59a72e11b4555784b29fd0722787b2a8dbb52cdd9cdbf966a 314 B 314 B
The file is not tracked afterward:
setwd(proj)
dvs_status("data/preview.csv")# A tibble: 0 × 0🔗Return value
A tibble with one row per file:
| Column | Type | Description |
|---|---|---|
path | character | File path. |
outcome | character | copied (new content stored) or present (already stored). |
hash | character | blake3 content hash. |
size | dvs_bytes | On-disk size. |
stored_size | dvs_bytes | Stored (compressed) size. |
Rows that fail carry an error column instead. The size and stored_size
columns are dvs_bytes values that print as human-readable sizes.
🔗Differences from the CLI
The CLI command is dvs add. It prints lines and exits non-zero
on partial failure, and adds --threads and --json, which the R surface does
not have. Threads are set process-wide with set_dvs_threads().