I've just started playing with regex and I seem to be a bit stuck! I wrote a massive multi-line find and replace in TextSoap. This is for cleaning up recipes that I have OCR'd, and because they contain both Ingredients and Directions, I can't simply change "1" to "1.", as that would also rewrite "1 Tbsp" as "1. Tbsp".
So I check whether the following two lines (possibly with extra blank lines in between) start with the next consecutive numbers, using these patterns as the find:
^(1) (.*)\n?((\n))(^2 (.*)\n?(\n)^3 (.*)\n?(\n))
^(2) (.*)\n?((\n))(^3 (.*)\n?(\n)^4 (.*)\n?(\n))
^(3) (.*)\n?((\n))(^4 (.*)\n?(\n)^5 (.*)\n?(\n))
^(4) (.*)\n?((\n))(^5 (.*)\n?(\n)^6 (.*)\n?(\n))
^(5) (.*)\n?((\n))(^6 (.*)\n?(\n)^7 (.*)\n?(\n))
and as a replacement for each of the above:
$1. $2 $3 $4$5
My problem is that although this works the way I wanted, it never completes the task for the last three numbers ...
Example text I want to clean up:
1 This is the first step in the list
2 Second lot if instructions to run through
3 Doing more of the recipe instruction
4 Half way through cooking up a storm
5 almost finished the recipe
6 Serve and eat
And I want it to look like this:
1. This is the first step in the list
2. Second lot if instructions to run through
3. Doing more of the recipe instruction
4. Half way through cooking up a storm
5. almost finished the recipe
6. Serve and eat
Is there a way to check the line or two above instead, so that this runs backwards? I have looked at lookahead and lookbehind, and I'm a little confused by them. Does anyone have a way to clean up a numbered list like this, or can anyone help me with the regex I'm after, please?
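To make the goal concrete, here is a rough sketch of the behaviour I'm after, written in Python rather than as a TextSoap pattern. It only adds the period when a line belongs to a run of consecutive numbers starting at 1, so a lone "1 Tbsp" line is left alone. The function name `number_steps` and the `min_run` threshold are made up for illustration, and an ingredient list that happens to read 1, 2, 3 on successive lines would still be numbered by mistake.

```python
import re

# A candidate step line: a number, one space, then the rest of the line.
STEP = re.compile(r"^(\d+) (.*)$")

def number_steps(text, min_run=3):
    """Add a period after the number on lines that form a 1, 2, 3, ... run.

    min_run is a hypothetical safety threshold: shorter runs are left untouched.
    """
    lines = text.split("\n")
    runs, run = [], []  # finished runs, and the run being built (line indices)
    for i, line in enumerate(lines):
        if not line.strip():
            continue  # blank lines between steps do not break a run
        m = STEP.match(line)
        n = int(m.group(1)) if m else None
        if n == len(run) + 1:
            run.append(i)  # this line carries the next expected number
        else:
            runs.append(run)
            run = [i] if n == 1 else []  # a "1 ..." line may start a new run
    runs.append(run)

    for run in runs:
        if len(run) >= min_run:
            for i in run:
                lines[i] = STEP.sub(r"\1. \2", lines[i])
    return "\n".join(lines)

sample = (
    "1 Tbsp butter\n"
    "Salt\n"
    "\n"
    "1 This is the first step in the list\n"
    "\n"
    "2 Second lot if instructions to run through\n"
    "3 Doing more of the recipe instruction\n"
    "4 Half way through cooking up a storm\n"
    "5 almost finished the recipe\n"
    "6 Serve and eat"
)
print(number_steps(sample))
```

Because the whole run is collected before anything is rewritten, the last lines of the list are treated the same as the first ones, which is the part my find patterns cannot manage.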