Self-referential Seqs Considered Dangerous

Published on Saturday, 2013-07-27

There was a lively conversation on freenode’s Clojure channel a few days ago about strange behavior involving self-referential chunked seqs.

The normal gotcha with chunked seqs is that more elements of the sequence may be realized than you expect, which is usually only noticeable if your lazy sequence has side-effects associated with its realization:

(take 3 (map print (range)))

;; returns (nil nil nil) and prints
;; 012345678910111213141516171819202122232425262728293031

Note that the return value here ((nil nil nil)) is as expected, since print returns nil, but the associated side effects are not.

The discussion centered around an example where the return value itself is unexpected, due to the sequence being self-referential. A self-referential immutable data structure is only possible with some flavor of delayed evaluation – in this case, we use lazy seqs and vars. Haskell achieves self-referentiality using the lazy nature of the entire execution model, but while this is a common tactic in Haskell, it is not a common one in Clojure (for reasons that should become apparent).

Self-referential Vanilla Seqs

For a simple example of the Clojure approach, we can try to define the Fibonacci series self-referentially:

(def fibs (list* 0 1 (map + fibs (rest fibs))))
;; IllegalArgumentException Don't know how to create ISeq from:
;; clojure.lang.Var$Unbound  clojure.lang.RT.seqFrom (RT.java:505)

This fails when the compiler tries to evaluate (list* 0 1 (map + fibs (rest fibs))) before assigning any value to #'fibs. The var exists, but has an “Unbound” value, and the rest function is trying to treat it as a seq right away.

We can try to delay evaluation a little bit by wrapping the expression in lazy-seq:

(def fibs2 (lazy-seq (list* 0 1 (map + fibs2 (rest fibs2)))))

(take 10 fibs2)
;; StackOverflowError   user/fn--4171 (NO_SOURCE_FILE:1)

This one surprised me for a few minutes. We successfully suppress any errors at compile time, but when we try to get some of the elements the same kind of issue as before pops up – when we try to evaluate the body of the lazy-seq expression, the rest function is trying to walk a step or two into the sequence, which requires evaluating the body of the lazy-seq expression, which requires…

Okay, so since the culprit is the rest function being a little too eager, presumably we can just wrap that bit in lazy-seq to get the behavior we need:

(def fibs3 (list* 0 1 (map + fibs3 (lazy-seq (rest fibs3)))))

(take 10 fibs3)
;; IllegalArgumentException Don't know how to create ISeq from:
;; clojure.lang.Var$Unbound  clojure.lang.RT.seqFrom (RT.java:505)

This one I was sure would work. Note that (take 2 fibs3) works fine here, so the exception is not thrown until the lazy-seq internal to the map function is being forced. The culprit this time is not the third argument to map but the second – since fibs3 is unbound at the time the arguments to map are being evaluated, the thing passed to map is the “unbound” value which doesn’t change later on. So we need to delay that as well:

(def fibs4 (list* 0 1 (map + (lazy-seq fibs4)
                             (lazy-seq (rest fibs4)))))

(take 10 fibs4)
;; (0 1 1 2 3 5 8 13 21 34)

Phew. It’s worth mentioning that we can cover both args at once by lifting the lazy-seq up around the call to map:

(def fibs5 (list* 0 1 (lazy-seq (map + fibs5 (rest fibs5)))))

(take 10 fibs5)
;; (0 1 1 2 3 5 8 13 21 34)

So self-referential seqs are possible if you’re careful about what gets evaluated when. If this weren’t argument enough to avoid the technique, it gets worse when we introduce chunked sequences.

Self-referential Chunked Seqs

Chunked sequences are less lazy, because they realize their elements in groups of 32 for better performance. Due to implementation details this leads to problems when an element of a sequence depends on the preceding element and both elements happen to be part of the same chunk. This might not be so bad if it blew up like in the previous examples, but surprisingly we can silently get incorrect results.

I had a hard time coming up with a “natural” example of this, so instead we’ll have to settle for a contrived example. We’ll define a sequence of numbers where all but the first several numbers are filtered from (range) by checking if the element of the sequence at the index (rem n 10) is a number (which it always should be).

(def foos
  (list* -3 -2 -1
            ;; we expect this function to always return true
    (filter #(number?
               ;; Using nth here instead throws an
               ;; IndexOutOfBoundsException.  Why is left
               ;; as an exercise for the reader.
               (first (drop (rem % 10) foos)))
            (range))))

(take 20 foos)
;; => (-3 -2 -1 0 1 2 10 11 12 20 21 22 30 31 32 33 34 35 36 37)

We expect the result to be (-3 -2 -1 0 1 2 3 4 5 ...), but instead we see a gap from 3 to 9. Just to verify that this is caused by chunking, we can tweak the code to use (iterate inc 0), which is an unchunked equivalent of (range):

(def foos2
  (list* -3 -2 -1
    (filter #(number?
               (first (drop (rem % 10) foos2)))
            (iterate inc 0))))

(take 20 foos2)
;; => (-3 -2 -1 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16)

What’s going on is that the first chunk of (range) is realized and run through the filter function before the chunk being returned from filter is properly populated, and as a result the expression (first (drop (rem % 10) foos)) is returning nil. So the effect is the elements of the sequence temporarily appear as if they were nil.

As indicated in a comment above, sometimes this sort of thing can instead cause an IndexOutOfBoundsException, and certainly in a lot of numeric cases an unexpected nil will mean a NullPointerException. So it won’t always silently give incorrect results.

Self-reference is an innocent-enough technique in Haskell due to the fully-lazy execution model. Clojure’s partial laziness provides a handful of various ways to screw yourself up, and so you’re much better off avoiding self-reference altogether.

Epilogue

It’s worth distinguishing the sort of self-reference talked about above from a function that returns a lazy sequence which is internally dependent on the return value from a separate (recursive) call of the same function. A great example of this is computing primes, which by one strategy requires having a list of smaller primes to test for divisibility.

We can do this with a function that returns a list of primes (using defn instead of def), which internally calls itself. You still have to be careful with this kind of recursion (see the comment in the code for how chunking is a danger here), but I think it’s a little easier to reason about and avoids some of the self-reference problems discussed above.

(defn primes
  []
  (cons 2
    (lazy-seq
      ;; don't make the recursive call until the
      ;; lazy-seq is forced
      (let [ps (primes)]
              ;; using (drop 3 (range)) here gives a
              ;; StackOverflowError due to chunking
        (for [n (iterate inc 3)
              :let [root (Math/sqrt n)]
              :when (->> ps
                         (take-while #(<= % root))
                         (filter #(zero? (rem n %)))
                         (empty?))]
          n)))))

(nth (primes) 9999) ;; => 104729

Even less problematic (and actually idiomatic) is a function that returns a lazy-seq and uses a recursive call for the tail, like this simplified version of filter:

(defn filter
  [func coll]
  (lazy-seq
    (when-let [[x & xs] (seq coll)]
      (if (func x)
        (cons x (filter func xs))
        (filter func xs)))))

Thanks to @ltw_ for helping this all to make more sense.