Can Z3 be used to define substrings?

Question

Can Z3 be used to define substrings?

I am trying to use Z3 to talk about substrings, and have come across some unintuitive behavior. Z3 returns “sat” when asked to determine if “xy” appears inside “xy”, but it returns “unknown” when asked if “x” is in “x” or “x” is in “xy” "

I commented on the following code to illustrate this behavior:

(set-logic AUFLIA) (declare-sort Char 0) ;characters to build strings are _x_ and _y_ (declare-fun _x_ () Char) (declare-fun _y_ () Char) (assert (distinct _x_ _y_)) ;string literals (declare-fun findMeX () (Array Int Char)) (declare-fun findMeXY () (Array Int Char)) (declare-fun x () (Array Int Char)) (declare-fun xy () (Array Int Char)) (declare-fun length ( (Array Int Char) ) Int ) ;set findMeX = 'x' (assert (= (select findMeX 0) _x_)) (assert (= (length findMeX) 1)) ;set findMeXY = 'xy' (assert (= (select findMeXY 0) _x_)) (assert (= (select findMeXY 1) _y_)) (assert (= (length findMeXY) 2)) ;set x = 'x' (assert (= (select x 0) _x_)) (assert (= (length x) 1)) ;set xy = 'xy' (assert (= (select xy 0) _x_)) (assert (= (select xy 1) _y_)) (assert (= (length xy) 2))

Now that the problem is configured, we are trying to find the substrings:

 ;search for findMeX='x' in x='x' (push 1) (assert (exists ((offset Int)) (and (<= offset (- (length x) (length findMeX))) (>= offset 0) (forall ((index Int)) (=> (and (< index (length findMeX)) (>= index 0)) (= (select x (+ index offset)) (select findMeX index))))))) (check-sat) ;'sat' expected, 'unknown' returned (pop 1)

If instead search findMeXY in xy , then the solver returns 'sat', as expected. It would seem that since the solver can process this request for "xy", it should be able to process it for "x". Also, when searching for findMeX='x' in 'xy='xy' it returns "unknown".

Can someone suggest an explanation, or perhaps an alternative model for expressing this problem in the SMT solver?

+8

smt z3

Katie Aug 19 '11 at 20:06

source share

1 answer

Leonardo de moura · Accepted Answer · 2011-08-22T21:54:23+0000

Short answer for observed behavior: Z3 returns "unknown" because your statements contain quantifiers.

Z3 contains many procedures and heuristics for processing quantifiers. Z3 uses a method called Instant-Based Quantifier Instantiation (MBQI) to create models to satisfy requests like yours. The first step is to create an interpretation of the candidate based on the interpretation that satisfies the free statements of the quantifier. Unfortunately, in the current Z3, this step does not interact seamlessly with array theory. The main problem is that the interpretation based on the theory of arrays cannot be modified by this module.

The fair question is: why does it work when we remove push / pop commands? It works because Z3 uses more aggressive simplification (preprocessing) procedures when incremental solution commands (such as push and pop commands) are not used.

I see two possible workarounds for your problem.

You can avoid quantifiers and continue to use array theory. This is possible in your example, since you know the length of all the "lines". This way you can expand the quantifier. Although this approach may seem inconvenient, it is used in practice and in many validation and testing tools.
You can avoid the theory of arrays. You declare the string as an uninterpreted sort, for example, for Char. Then you declare a char -of function, which should return the ith character of the string. You can axiomatize this operation. For example, you can say that two strings are equal if they are the same length and contain the same characters:

 (assert (forall ((s1 String) (s2 String)) (=> (and (= (length s1) (length s2)) (forall ((i Int)) (=> (and (<= 0 i) (< i (length s1))) (= (char-of s1 i) (char-of s2 i))))) (= s1 s2))))

The following link contains your script encoded using the second approach: http://rise4fun.com/Z3/yD3

The second approach is more attractive and allows you to prove more complex properties about strings. However, it is very easy to write feasible quantified formulas that Z3 will not be able to build a model. The Z3 Guide describes the main features and limitations of the MBQI module. It contains resolvable fragments that can be processed by Z3. By the way, note that the theory of a falling array will not be a big problem if you have quantifiers. The guide shows how to code arrays using quantifiers and functions.

Further information on how MBQI works can be found in the following documents:

Can Z3 be used to define substrings?

More articles: