Even easier in Swift 3
utterance.preUtteranceDelay = 1.0
or
utterance.postUtteranceDelay = 1.0
for one second of delay, assuming that each number is its own utterance (as in a loop). You may need to reduce the delay slightly to account for the time it actually takes to speak each number.
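Here is a rough sketch of the loop version, in case it helps. The helper name `makeUtterances(for:pause:)` and the sample numbers are my own, made up for illustration; only `AVSpeechUtterance`, `postUtteranceDelay`, and `AVSpeechSynthesizer.speak(_:)` come from AVFoundation.

```swift
import AVFoundation

// Hypothetical helper (not part of AVFoundation): builds one utterance
// per number and asks for a pause after each one.
func makeUtterances(for numbers: [Int], pause: TimeInterval) -> [AVSpeechUtterance] {
    return numbers.map { number in
        let utterance = AVSpeechUtterance(string: String(number))
        utterance.postUtteranceDelay = pause  // pause in seconds after this utterance
        return utterance
    }
}

// Keep a strong reference to the synthesizer for as long as it should speak.
let synthesizer = AVSpeechSynthesizer()

// Utterances passed to speak(_:) are queued and spoken in order.
for utterance in makeUtterances(for: [1, 2, 3], pause: 1.0) {
    synthesizer.speak(utterance)
}
```

Using `preUtteranceDelay` instead should work the same way, except the pause comes before each number rather than after it.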