In Clojure, when trees with heterogeneous node types are represented using records or vectors?

Question

In Clojure, when trees with heterogeneous node types are represented using records or vectors?

What is the best idiomatic clojure practice for representing a tree consisting of different types of node:

but. building trees from several different types of records that define the use of deftype or defrecord:

(defrecord node_a [left right]) (defrecord node_b [left right]) (defrecord leaf []) (def my-tree (node_a. (node_b. (leaf.) (leaf.)) (leaf.)))

C. building trees from vectors, with keywords denoting types:

 (def my-tree [:node-a [:node-b :leaf :leaf] :leaf])

The most clojure code that I see seems to promote the use of general-purpose data structures (vectors, maps, etc.), rather than data types or records. Hiccup, to take one example, presents html very nicely using a vector + keyword approach.

When do we prefer one style over another?

+6

idioms types clojure tree

Rob lachlan Nov 04 '10 at 21:32

source share

2 answers

Brian carper · Answer 1 · 2010-11-04T22:02:56+0000

You can put as many elements in the vector as you want. A record has a specified number of fields. If you want to limit your nodes to only N sub-nodes, entries can be good, for example. when a binary tree, where a node should have only left and right. But for something like HTML or XML, you probably want to support an arbitrary number of sub-nodes.

Using vectors and keywords means that “expanding” a set of supported node types is as simple as putting a new keyword in a vector. [:frob "foo"] OK in Hiccup, even if its author had never heard of frobbing. Using records, you will potentially need to define a new record for each node type. But then you get the advantage of catching typos and checking trays. [:strnog "some bold text?"] will not be caught by Hiccup, but (Strnog. "foo") will be a compile-time error.

Vectors, which are one of the main Clojure data types, you can use Clojure's built-in functions to control them. Want to expand your tree? Just conj on it or update-in or whatever. You can grow your tree gradually in this way. With records, you are probably stuck in constructor calls, otherwise you need to write a ton of shell functions for constructors.

It seems like this partially boils down to the dynamic and static argument. Personally, I would go along the dynamic route (vector + keyword) if there was no specific need for the benefits of using records. Most likely, it is easier to program this method, and it will become more flexible for the user due to the fact that the user will ultimately create a mess. But Clojure users are probably used to regularly handling dangerous weapons. Clojure is a largely dynamic language, it is often important to stay dynamic.

Alex miller · Answer 2 · 2010-11-05T02:56:01+0000

This is a good question. I think both of them are suitable for various problems. Nested vectors are a good solution if each node can contain a variable set of information - in particular, template systems will work well. Entries are a good solution for a small number of fixed node types, where nesting is much more limited.

We work a lot with heterogeneous record trees. Each node is one of several known types, each of which has a different set of known fixed keys. In this case, the reason entries are better, since you can select data from node by key, which is O (1) (actually the Java method call is very fast), and not O (n) (where you have to view the contents of node ), and it’s also usually easier to access.

The entries in 1.2 are imho not quite “finished,” but it's pretty easy to compile this stuff yourself. We have defrecord2 , which adds constructor functions (new-foo), field validation, print support, print support, tree navigation support / editing through lightning, etc.

An example of where we use this is to represent AST or execution plans, where nodes can be like Join, Sort, etc.

Vectors will be better at creating things like strings where you can place an arbitrary number of things on each node. If you can fill out 1+ <s> inside the <div>, then you cannot create a record containing the field: p, which just doesn't make any sense. This is the case when vectors are much more flexible and idiomatic.

In Clojure, when trees with heterogeneous node types are represented using records or vectors?

More articles: