Take on Recursion

My take on a recursive Elisp function for accumulating a list while walking up an AST

Created on [2023-04-01], last updated [2023-06-04]

Over at Take on Rules, Jeremy Friesen has a couple of recent posts about an Emacs command he wrote for grabbing the qualified name of the function at point in Ruby files.

I don’t know much about Ruby, but the way he implemented this command caught my attention because he’s relying on the new treesit package from Emacs 29 to examine the syntax tree of his Ruby code, and I’ve been playing around with this nice parsing framework myself recently.

Apparently, Ruby functions can reside within nested modules. Jeremy’s command thus leverages the function treesit-defun-at-point to obtain the syntax tree node corresponding to the Ruby function at point, and calls a recursive helper function called jf/treesit/module_space that walks up the syntax tree and accumulates the names of the enclosing modules it finds along the way to the root of the syntax tree. There’s just a tiny problem which is that this helper function accumulates the module names in a rather awkward manner… As Jeremy notes:

The list returned by jf/treesit/module_space is '(nil ("Hello" ("World"))); which is a ugly but workable. Perhaps someone will write to me with a refactor of this code.

Let’s have a look at this function’s implementation:

(defun jf/treesit/module_space (node)
  (when-let* ((parent (treesit-parent-until
                       node
                       (lambda (n) (member (treesit-node-type n)
                                           '("class" "module" "assignment")))))
              (parent_name (treesit-node-text
                            (car
                             (treesit-filter-child
                              parent
                              (lambda (n)
                                (member (treesit-node-type n)
                                        '("constant" "scope_resolution"))))))))
    (list (jf/treesit/module_space parent) parent_name)))

Basically what this does is climbing up the syntax tree with treesit-parent-until to find the next module boundary (if there is one) and then grabbing its name with treesit-node-text. Now all that’s left is to perform a recursive call and return its result along with the found module name, but the way it bundles them together as two elements of a list causes the result to be a nested tree structure, instead of a simple list. Then, to obtain the desired list of module names, the calling command needs to apply the -flatten function to the tree that jf/treesit/module_space returns.

Here’s my take on this function:

(defun esy/treesit/module_space (node &optional acc)
  (if-let ((parent (treesit-parent-until
                    node
                    (lambda (n) (member (treesit-node-type n)
                                        '("class" "module" "assignment")))))
           (parent_name (treesit-node-text
                         (car
                          (treesit-filter-child
                           parent
                           (lambda (n)
                             (member (treesit-node-type n)
                                     '("constant" "scope_resolution"))))))))
      (esy/treesit/module_space parent (cons parent_name acc))
    acc))

It’s extremely similar to the original version, but the crucial difference is that we use an accumulator argument ACC to hold the result of our syntax tree traversal as we climb up to the root. At each step in the way, ACC holds a list of module names which is the cdr of the list that we’ll eventually return. We extend this list with the name of the next parent module during the recursive call, and when we reach the root of the syntax tree we simply return the accumulated list.

For example, if we have Ruby code that looks something like this:

module Foo
  module Bar
    module Baz
      def spam
        :true
      end
    end
  end
end

With point inside the function definition we get a simple list:

(esy/treesit/module_space (treesit-defun-at-point))
⇒ ("Foo" "Bar" "Baz")

Which is a little cleaner than what we get with jf/treesit/module_space:

(jf/treesit/module_space (treesit-defun-at-point))
⇒ (((nil "Foo") "Bar") "Baz")