First of all, you should never read Clojure code directly from untrusted data sources. Use EDN or another serialization format instead.
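As a minimal sketch of the EDN route, assuming the input is plain data: clojure.edn/read-string (available since Clojure 1.5) reads data only and has no eval syntax. The names config and eval-attempt are made up for illustration.

```clojure
(require '[clojure.edn :as edn])

;; Plain data is read back as ordinary Clojure values.
(def config (edn/read-string "{:name \"demo\" :ports [8080 8081]}"))

;; The EDN reader does not support the #= eval syntax, so this input
;; throws instead of being evaluated. We catch the exception here only
;; to show that the input was rejected.
(def eval-attempt
  (try
    (edn/read-string "#=(+ 1 2)")
    (catch Exception _
      :rejected)))
```

Unlike clojure.core/read-string, this does not depend on remembering to bind anything first.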
That said, since Clojure 1.5 there is a reasonably safe way to read strings without evaluating them: bind the *read-eval* var to false before calling read-string. In Clojure 1.4 and earlier, reading could still cause side effects even with this binding, because Java constructors could be invoked. Those problems have since been fixed.
Here is some example code:
    (defn read-string-safely [s]
      (binding [*read-eval* false]
        (read-string s)))

    (read-string-safely "#=(eval (def x 3))")
    => RuntimeException EvalReader not allowed when *read-eval* is false. clojure.lang.Util.runtimeException (Util.java:219)

    (read-string-safely "(def x 3)")
    => (def x 3)

    (read-string-safely "#java.io.FileWriter[\"precious-file.txt\"]")
    => RuntimeException Record construction syntax can only be used when *read-eval* == true clojure.lang.Util.runtimeException (Util.java:219)
Regarding reader macros
The dispatch macro (#) and tagged literals are invoked at read time. Clojure data has no representation for them, because by the time you have the data all of these constructs have already been processed. As far as I know, there is no built-in way to generate a syntax tree of Clojure code.
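A small sketch of what "processed at read time" means in practice, using only clojure.core/read-string on trusted literal strings. The var names are made up for illustration. In each case the returned data is the expanded form, and nothing in it records which syntax the source text used.

```clojure
;; 'x is expanded by the reader into the list (quote x).
(def quoted (read-string "'x"))

;; @a is expanded into (clojure.core/deref a).
(def deref-form (read-string "@a"))

;; #(...) is expanded into a (fn* [...] ...) list with a generated
;; parameter name; the original #() syntax is gone.
(def anon-fn (read-string "#(+ % 1)"))

;; The #inst tagged literal is turned into a java.util.Date by the
;; default data reader before the caller ever sees it.
(def inst (read-string "#inst \"2020-01-01T00:00:00Z\""))
```

This is why the reader's output cannot be used to recover the source text exactly.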
To retain this information you will have to use an external parser. You can either roll your own parser or use a parser generator such as Instaparse or ANTLR. A complete Clojure grammar for either of these may be hard to find, but you could extend one of the EDN grammars to cover the additional Clojure forms. A quick Google search turned up an ANTLR grammar for Clojure syntax; you could alter it to support any constructs that are missing.
There is also Sjacket, a library made for Clojure tools that need to keep information about the source code itself. It looks like a good fit for what you are trying to do, but I have no personal experience with it. Judging by its tests, its parser does support reader macros.
Dirk Geurs