Find the names of all functions in an R expression

I am trying to find the names of all the functions used in an arbitrary legal R expression, but I cannot find a function that flags the example below as a function rather than a name.

    test <- expression(
      this_is_a_function <- function(var1, var2) {
        this_is_a_function(var1 - 1, var2)
      })
    all.vars(test, functions = FALSE)
    [1] "this_is_a_function" "var1" "var2"

all.vars(expr, functions = FALSE) seems to return function declarations (f <- function() {}) in the expression, while filtering out function calls (`+`(1, 2), ...).

Is there any function, in the core libraries or elsewhere, that will flag 'this_is_a_function' as a function rather than a name? It should work on arbitrary expressions that are syntactically legal but might not evaluate correctly (for example, `+`(1, "duck")).

I found similar questions, but they don't seem to contain a solution.

If clarification is required, leave a comment below. I use the parser package to parse expressions.

Edit: @Hadley

I have expressions containing whole scripts, which usually consist of a main function containing definitions of nested functions, with a call to the main function at the end of the script.

All functions are defined inside expressions, and I do not mind if I need to include '<-' and '{', since I can easily filter them myself.

The motivation is to take all my R-scripts and collect basic statistics on how my use of functions has changed over time.

Edit: current solution

A regular-expression approach captures function definitions, combined with the method from James's comment for capturing function calls. It usually works, since I never use right assignment.

    function_usage <- function(code_string) {
      # Takes a script as a string and extracts the names of function definitions
      require(stringr)
      code_string <- str_replace(code_string, 'expression\\(', '')
      # Patterns for `name <- function` and `name = function`
      arrow_assign <- '.+[ \n]+<-[ \n]+function'
      equal_assign <- '.+[ \n]+=[ \n]+function'
      # Keep the text to the left of the assignment operator
      function_names <- sapply(
        strsplit(str_match(code_string, arrow_assign), split = '[ \n]+<-'),
        function(x) x[1])
      function_names <- c(function_names, sapply(
        strsplit(str_match(code_string, equal_assign), split = '[ \n]+='),
        function(x) x[1]))
      return(table(function_names))
    }
+4
2 answers

Short answer: is.function checks whether a variable actually holds a function. This does not work on (unevaluated) calls, because they are calls. You also need to watch out for masking:

    mean <- mean(x)
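A minimal sketch of both points, assuming a small numeric vector `x` and using `local()` only to keep the masking from leaking into the session (both are illustrative choices, not part of the original answer):

```r
x <- c(1, 2, 3)

# An unevaluated call is a call object, not a function,
# even though it names one.
cl <- quote(mean(x))
is.function(cl)     # FALSE
is.call(cl)         # TRUE
is.function(mean)   # TRUE

# Masking: after `mean <- mean(x)` the name `mean` refers to a number
# in that scope, so is.function(mean) no longer reports a function.
masked <- local({
  mean <- mean(x)   # the call on the right still finds base::mean
  is.function(mean)
})
masked              # FALSE
```

R skips non-function bindings when it looks up a name in call position, so `mean(x)` would keep working after the assignment. This is why checking the variable and checking the call can give different answers.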

Longer answer:

IMHO there is a big difference between the two occurrences of this_is_a_function.

In the first case, you assign a function to a variable named this_is_a_function once the expression is evaluated. The difference is the same as the difference between 2+2 and 4.
However, simply looking for <- function() does not guarantee that the result is a function:

    f <- function (x) {x + 1} (2)

The second occurrence is syntactically a function call. From the expression you can determine that a variable named this_is_a_function holding a function must exist for the call to evaluate correctly. But you cannot tell from this statement alone whether it exists. You can, however, check whether such a variable exists and whether it is a function.
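One way to run that check is base R's `exists()` with `mode = "function"`. The sketch below is only an illustration: `names_a_function` is a hypothetical helper name, and it assumes the head of the call is a plain symbol.

```r
# Hypothetical helper: does `name` currently resolve to a function
# when looked up from `envir`?
names_a_function <- function(name, envir = parent.frame()) {
  exists(name, envir = envir, mode = "function")
}

call_expr <- quote(this_is_a_function(var1 - 1, var2))

# The first element of a call is the thing being called.
callee <- as.character(call_expr[[1]])

defined_before <- names_a_function(callee)  # FALSE if nothing by that name is defined yet
plus_is_function <- names_a_function("+")   # TRUE: `+` is an ordinary function object
```

The answer depends on the environment at the moment of the check, not on the expression itself, so the same call can be valid in one session and invalid in another.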

Because functions are stored in variables like any other data type, in the first case you can know that the result of function() will be a function, and it follows that immediately after this expression is evaluated, a variable named this_is_a_function will hold a function.
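That syntactic guarantee can be exploited without evaluating anything. The following is a rough sketch, not a complete solution: `find_definitions` is a hypothetical helper that walks the parse tree and reports names assigned a literal `function(...)` expression via `<-` or `=`. It assumes the left-hand side is a plain name, and it misses functions produced any other way (right assignment, `assign()`, function factories).

```r
# Hypothetical helper: names that are assigned a literal function expression.
find_definitions <- function(x) {
  # Symbols and constants cannot contain definitions.
  if (!is.recursive(x)) return(character(0))

  found <- character(0)
  is_assignment <- is.call(x) && length(x) == 3 &&
    (identical(x[[1]], as.name("<-")) || identical(x[[1]], as.name("=")))

  # name <- function(...) ...   or   name = function(...) ...
  if (is_assignment && is.name(x[[2]]) &&
      is.call(x[[3]]) && identical(x[[3]][[1]], as.name("function"))) {
    found <- as.character(x[[2]])
  }

  # Recurse into every child to pick up nested definitions.
  c(found, unlist(lapply(as.list(x), find_definitions), use.names = FALSE))
}

script <- expression({
  outer <- function(a) {
    inner = function(b) b + 1
    inner(a)
  }
  value <- outer(2)
})

definitions <- find_definitions(script)
definitions
# [1] "outer" "inner"
```

`value` is not reported because its right-hand side is a call to `outer`, not a `function` expression. Whether that call returns a function can only be known by evaluating it.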

However, R is full of names and functions: "<-" is the name of the assignment function (a variable holding the assignment function) ...

After evaluating the expression, you can check it with is.function(this_is_a_function). However, this is by no means the only expression that returns a function. Consider:

    > f <- function () {g <- function (){}}
    > body (f)[[2]][[3]]
    function() { }
    > class (body (f)[[2]][[3]])
    [1] "call"
    > class (eval (body (f)[[2]][[3]]))
    [1] "function"

all.vars(expr, functions = FALSE) seems to return function declarations (f <- function() {}) in the expression, while filtering out function calls (`+`(1, 2), ...).

I would say it is the other way around: in that expression, f is the variable (name) that will be assigned the function (once the call is evaluated). `+`(1, 2) evaluates to a number, unless you keep it from being evaluated:

    > e <- expression (1 + 2)
    > e [[1]]
    1 + 2
    > e [[1]][[1]]
    `+`
    > class (e [[1]][[1]])
    [1] "name"
    > eval (e [[1]][[1]])
    function (e1, e2)  .Primitive("+")
    > class (eval (e [[1]][[1]]))
    [1] "function"
+4

Instead of looking for function definitions, which is virtually impossible to do correctly without actually evaluating the code, it is easier to look for function calls.

The following function recursively walks the expression/call tree, returning the names of all objects called as a function:

    find_calls <- function(x) {
      # Base case
      if (!is.recursive(x)) return()

      recurse <- function(x) {
        sort(unique(as.character(unlist(lapply(x, find_calls)))))
      }

      if (is.call(x)) {
        f_name <- as.character(x[[1]])
        c(f_name, recurse(x[-1]))
      } else {
        recurse(x)
      }
    }

It works as expected for a simple test case:

    x <- expression({
      f(3, g())
      h <- function(x, y) {
        i()
        j()
        k(l())
      }
    })
    find_calls(x)
    # [1] "{"        "<-"       "f"        "function" "g"        "i"        "j"
    # [8] "k"        "l"
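Since the stated goal is usage statistics, a counting variant may be more useful than a de-duplicated list. The sketch below is an assumption-laden adaptation of the same tree walk, not part of the original answer: `count_calls` is a hypothetical name, and it records a call only when the thing being called is a plain symbol.

```r
# Hypothetical variant of the tree walk: tally calls instead of de-duplicating.
count_calls <- function(x) {
  collect <- function(node) {
    # Symbols and constants contain no calls.
    if (!is.recursive(node)) return(character(0))

    callee <- character(0)
    if (is.call(node) && is.name(node[[1]])) {
      callee <- as.character(node[[1]])
    }

    # Recurse into all children, including a non-symbol head such as f()().
    c(callee, unlist(lapply(as.list(node), collect), use.names = FALSE))
  }
  table(collect(x))
}

script <- expression({
  f(1)
  f(g(2))
  h <- function(a) f(a)
})

counts <- count_calls(script)
counts[["f"]]   # 3
counts[["g"]]   # 1
```

Syntactic entries such as `{`, `<-` and `function` are counted as well and can be filtered out afterwards by name.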
+2