Extract the non-numeric parts of a string, where numbers may optionally include decimals, scientific notation and thousands separators.
str_extract_non_numerics(
string,
decimals = FALSE,
leading_decimals = decimals,
negs = FALSE,
sci = FALSE,
big_mark = "",
commas = FALSE
)
string: A string.
decimals: Do you want to include the possibility of decimal numbers (TRUE) or not (FALSE, the default)?
leading_decimals: Do you want to allow a leading decimal point to be the start of a number?
negs: Do you want to allow negative numbers? Note that double negatives are not handled here (see the examples).
sci: Make the search aware of scientific notation e.g. 2e3 is the same as 2000.
big_mark: A character. Allow this character to be used as a thousands separator. This character will be removed from between digits before they are converted to numeric. You may specify many at once by pasting them together e.g. big_mark = ",_" will allow both commas and underscores. Internally, this will be used inside a [] regex block so e.g. "a-z" will behave differently to "az-". Most common separators (commas, spaces, underscores) should work fine.
commas: Deprecated. Use big_mark instead.
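To make the big_mark caveat about [] regex blocks concrete, here is a minimal base R sketch. It does not call strex and is not the package's implementation: the variable names and the lookaround pattern used to strip separators are assumptions chosen purely for illustration.

```r
# Base R only: illustrates regex character-class semantics, not strex internals.
x <- "a-b_c,z"

# Inside [], a hyphen between two characters denotes a range, so [a-z]
# matches every lowercase letter but not the literal "-".
range_removed <- gsub("[a-z]", "", x)
range_removed
#> [1] "-_,"

# A trailing hyphen is literal, so [az-] matches only "a", "z" and "-".
literal_removed <- gsub("[az-]", "", x)
literal_removed
#> [1] "b_c,"

# Assumed sketch of removing separators between digits before conversion,
# with both "," and "_" allowed (as with big_mark = ",_").
stripped <- gsub("(?<=[0-9])[,_](?=[0-9])", "", "1,100_000", perl = TRUE)
as.numeric(stripped)
#> [1] 1100000

# Scientific notation: 2e3 is the same number as 2000.
as.numeric("2e3") == 2000
#> [1] TRUE
```

If a hyphen is wanted as a separator, placing it last in big_mark avoids it being read as a range.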
str_first_non_numeric(...) is just str_nth_non_numeric(..., n = 1).
str_last_non_numeric(...) is just str_nth_non_numeric(..., n = -1).
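The n = 1 and n = -1 conventions can be pictured with a small base R sketch. nth_piece() is a hypothetical helper written for this illustration, not a strex function, and the pieces vector simply hard-codes the non-numeric parts of "abc123def456".

```r
# Hypothetical helper (not part of strex): pick the n-th element of a
# character vector, counting from the end when n is negative.
nth_piece <- function(pieces, n) {
  if (n < 0) n <- length(pieces) + n + 1  # -1 maps to the last position
  pieces[n]
}

# Non-numeric pieces of "abc123def456", hard-coded for the illustration.
pieces <- c("abc", "def")

first_piece <- nth_piece(pieces, 1)
last_piece <- nth_piece(pieces, -1)

first_piece
#> [1] "abc"
last_piece
#> [1] "def"
```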
Other non-numeric extractors: str_nth_non_numeric()
strings <- c(
  "abc123def456", "abc-0.12def.345", "abc.12e4def34.5e9",
  "abc1,100def1,230.5", "abc1,100e3,215def4e1,000"
)
str_extract_non_numerics(strings)
#> [[1]]
#> [1] "abc" "def"
#>
#> [[2]]
#> [1] "abc-" "." "def."
#>
#> [[3]]
#> [1] "abc." "e" "def" "." "e"
#>
#> [[4]]
#> [1] "abc" "," "def" "," "."
#>
#> [[5]]
#> [1] "abc" "," "e" "," "def" "e" ","
#>
str_extract_non_numerics(strings, decimals = TRUE, leading_decimals = FALSE)
#> [[1]]
#> [1] "abc" "def"
#>
#> [[2]]
#> [1] "abc-" "def."
#>
#> [[3]]
#> [1] "abc." "e" "def" "e"
#>
#> [[4]]
#> [1] "abc" "," "def" ","
#>
#> [[5]]
#> [1] "abc" "," "e" "," "def" "e" ","
#>
str_extract_non_numerics(strings, decimals = TRUE)
#> [[1]]
#> [1] "abc" "def"
#>
#> [[2]]
#> [1] "abc-" "def"
#>
#> [[3]]
#> [1] "abc" "e" "def" "e"
#>
#> [[4]]
#> [1] "abc" "," "def" ","
#>
#> [[5]]
#> [1] "abc" "," "e" "," "def" "e" ","
#>
str_extract_non_numerics(strings, big_mark = ",")
#> [[1]]
#> [1] "abc" "def"
#>
#> [[2]]
#> [1] "abc-" "." "def."
#>
#> [[3]]
#> [1] "abc." "e" "def" "." "e"
#>
#> [[4]]
#> [1] "abc" "def" "."
#>
#> [[5]]
#> [1] "abc" "e" "def" "e"
#>
str_extract_non_numerics(strings,
  decimals = TRUE, leading_decimals = TRUE,
  sci = TRUE
)
#> [[1]]
#> [1] "abc" "def"
#>
#> [[2]]
#> [1] "abc-" "def"
#>
#> [[3]]
#> [1] "abc" "def"
#>
#> [[4]]
#> [1] "abc" "," "def" ","
#>
#> [[5]]
#> [1] "abc" "," "," "def" ","
#>
str_extract_non_numerics(strings,
  decimals = TRUE, leading_decimals = TRUE,
  sci = TRUE, big_mark = ",", negs = TRUE
)
#> [[1]]
#> [1] "abc" "def"
#>
#> [[2]]
#> [1] "abc" "def"
#>
#> [[3]]
#> [1] "abc" "def"
#>
#> [[4]]
#> [1] "abc" "def"
#>
#> [[5]]
#> [1] "abc" "def"
#>
str_extract_non_numerics(c("22", "1.2.3"), decimals = TRUE)
#> Warning: `NA`s introduced by ambiguity.
#> ℹ The first such ambiguity is in string number 2 which is '1.2.3'.
#> ✖ The offending part of that string is '.2.3'.
#> [[1]]
#> character(0)
#>
#> [[2]]
#> [1] NA
#>