Regex break bbcode into pieces

Question


I have this string:

str = "some html code [img]......[/img] some html code [img]......[/img]"

and I want to get this:

 ["[img]......[/img]","[img]......[/img]"]

0

split ruby regex bbcode

squarezw 24 sept '10 at 16:14


4 answers

Please do not use BBCode. It is evil.

BBCode came to life when developers were too lazy to parse HTML properly and decided to invent their own markup language. As with all products of laziness, the result is completely inconsistent, unstandardized, and widely adopted.

Try using a more user-friendly markup language like Markdown (which Stack Overflow uses) or Textile. Both have parsers for Ruby:

Maruku for Markdown
RedCloth for Textile

If you still do not want to follow my advice and want to go with BBCode, do not reinvent the wheel: use a BBCode parser. To answer your question directly, there is the least desirable option: use a regular expression.

 /\[img\].*?\[\/img\]/

As seen on Rubular. I would use /\[img\](.*?)\[\/img\]/ instead, though, so that it captures the contents inside the img tags. Note that this is pretty fragile and breaks if there are nested img tags. Hence the advice to use a parser.
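To illustrate the difference between the two patterns, here is a minimal sketch using String#scan. The file names a.png and b.png are made up for the example and are not from the question.

```ruby
# Hypothetical sample input, shaped like the string in the question.
str = "some html code [img]a.png[/img] some html code [img]b.png[/img]"

# Without a capture group, scan returns each whole match.
whole_tags = str.scan(/\[img\].*?\[\/img\]/)

# With one capture group, scan returns an array of one-element arrays,
# so flatten leaves just the text between the tags.
contents = str.scan(/\[img\](.*?)\[\/img\]/).flatten

p whole_tags
p contents
```

The first call should give the complete [img]...[/img] pieces, the second only what sits between the tags.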

+45

NullUserException 25 sept. '10 at 2:49

There is a BBCode parser for Ruby on Google Code.

Do not use a regular expression for this.

+4

Tomalak 24 sept '10 at 16:17

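 str = "some html code [img]......[/img] some html code [img]......[/img]"
 # Split on the closing tag, then strip everything up to and including [img].
 # Note: this yields only the tag contents, not the full [img]...[/img] pieces.
 p str.split("[/img]").each { |x| x.sub!(/.*\[img\]/, "") }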

-1

ghostdog74 25 sept. '10 at 2:40


Yaser sulaiman · Accepted Answer · 2010-09-24T19:35:07+0000

 irb(main):001:0> str = "some html code [img]......[/img] some html code [img]......[/img]"
 => "some html code [img]......[/img] some html code [img]......[/img]"
 irb(main):002:0> str.scan(/\[img\].*?\[\/img\]/)
 => ["[img]......[/img]", "[img]......[/img]"]

Keep in mind that this is a very specific answer based on your exact question. Change str by, say, adding an image tag inside the image tag, and all hell breaks loose.
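As a rough illustration of that failure mode, the sketch below runs the same pattern over a string with one image tag nested inside another. The input string is hypothetical, made up for this example.

```ruby
# Hypothetical input: an [img] tag nested inside another [img] tag.
nested = "x [img]outer [img]inner[/img] tail[/img] y"

# The lazy .*? stops at the first [/img] it finds, which belongs to the
# inner tag, so the outer tag is cut short and " tail[/img]" is left over.
matches = nested.scan(/\[img\].*?\[\/img\]/)

p matches
```

The single match ends at the inner closing tag, so neither the outer tag nor the inner one comes out as a clean piece. A regular expression of this kind cannot keep track of nesting depth.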
