Two alternatives:
    s = "[test| blah] \n [foo |bar bar bar]\n[test| abc |123 | 456 789]"

    s.split(/\s*\n\s*/).map{ |p| p.scan(/[^|\[\]]+/).map(&:strip) }
    #=> [["test", "blah"], ["foo", "bar bar bar"], ["test", "abc", "123", "456 789"]]

    s.split(/\s*\n\s*/).map do |line|
      line.sub(/^\s*\[\s*/,'').sub(/\s*\]\s*$/,'').split(/\s*\|\s*/)
    end
    #=> [["test", "blah"], ["foo", "bar bar bar"], ["test", "abc", "123", "456 789"]]
Both of them start by splitting on line breaks (discarding any surrounding whitespace).
The first then scans each chunk for runs of characters that are not [ , | or ] , and strips the extra whitespace from each result (by calling strip on it).
The second instead removes the leading [ and the trailing ] (along with any whitespace), and then splits on | (along with any surrounding whitespace).
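To make the steps concrete, here is a small sketch that runs both approaches on the last line only and keeps the intermediate values. The variable names (lines, raw_fields, inner and so on) are made up for illustration and are not part of the answer's code.

```ruby
s = "[test| blah] \n [foo |bar bar bar]\n[test| abc |123 | 456 789]"

# Shared first step: split on newlines, eating the whitespace around them.
lines = s.split(/\s*\n\s*/)
# lines is ["[test| blah]", "[foo |bar bar bar]", "[test| abc |123 | 456 789]"]

# First approach: collect runs of characters that are not [, | or ].
raw_fields = lines.last.scan(/[^|\[\]]+/)
# raw_fields is ["test", " abc ", "123 ", " 456 789"] (whitespace still attached)
fields_by_scan = raw_fields.map(&:strip)

# Second approach: peel off the brackets, then split on | with optional whitespace.
inner = lines.last.sub(/^\s*\[\s*/, '').sub(/\s*\]\s*$/, '')
# inner is "test| abc |123 | 456 789"
fields_by_split = inner.split(/\s*\|\s*/)

p fields_by_scan   # ["test", "abc", "123", "456 789"]
p fields_by_split  # ["test", "abc", "123", "456 789"]
```

Both routes end at the same array; they differ only in whether the whitespace is removed after extracting the fields (strip) or as part of the split pattern.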
You cannot get the final result you want with a single scan . About the closest you can get is this:
s.scan /\[(?:([^|\]]+)\|)*([^|\]]+)\]/ #=> [["test", " blah"], ["foo ", "bar bar bar"], ["123 ", " 456 789"]]
... which loses information (only the last repetition of the repeated group is kept), or this:
s.scan /\[((?:[^|\]]+\|)*[^|\]]+)\]/ #=> [["test| blah"], ["foo |bar bar bar"], ["test| abc |123 | 456 789"]]
... which captures the contents of each "array" as one capture, or this:
s.scan /\[(?:([^|\]]+)\|)?(?:([^|\]]+)\|)?(?:([^|\]]+)\|)?([^|\]]+)\]/ #=> [["test", nil, nil, " blah"], ["foo ", nil, nil, "bar bar bar"], ["test", " abc ", "123 ", " 456 789"]]
... which is hard-coded to a maximum of four elements and inserts nil entries that you would need to remove with .compact .
There is no way to use Ruby's scan with a regular expression such as /(?:(aaa)b)+/ and get multiple captures, one for each time the repetition matches.
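If you did go with the hard-coded version, the cleanup could look roughly like this. It is only a sketch: the names four_slot and cleaned are invented for the example, and the four-element limit still applies.

```ruby
s = "[test| blah] \n [foo |bar bar bar]\n[test| abc |123 | 456 789]"

# Up to three optional "field|" groups followed by one mandatory final field.
four_slot = /\[(?:([^|\]]+)\|)?(?:([^|\]]+)\|)?(?:([^|\]]+)\|)?([^|\]]+)\]/

# Drop the nil placeholders left by unused optional groups, then trim whitespace.
cleaned = s.scan(four_slot).map { |captures| captures.compact.map(&:strip) }

p cleaned
# [["test", "blah"], ["foo", "bar bar bar"], ["test", "abc", "123", "456 789"]]
```

Note that this is already a scan followed by a second pass over the results, so it does not get around the single-scan limitation.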
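A tiny demonstration of that limitation, using a made-up input string and pattern so that the repetitions are distinguishable:

```ruby
text = "a-b-c-"

# The group (\w) is repeated three times within one match,
# but only the capture from the final repetition is reported.
last_only = text.scan(/(?:(\w)-)+/)
p last_only  # [["c"]]

# One way around it: match the whole repeated run first,
# then scan inside that run for the individual pieces.
run = text[/(?:\w-)+/]
each_piece = run.scan(/(\w)-/).flatten
p each_piece  # ["a", "b", "c"]
```

This is the same two-step idea as the split-then-scan solutions at the top: one pass to find each bracketed chunk, and another to pull out its fields.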