Simply "THE" BBCode parser.

Oct 21, 2011 at 8:21 AM

I came across this parser by chance, because the parser I was already using had too many problems to solve. Though it took a little time to understand, it then worked like a charm.

A lot of people are complaining about the parser not being "ERROR FREE". That is completely WRONG. The parser is error free, but by default it runs in ErrorMode.Strict, so it gives errors for everything that is not properly written: unknown tags, tags without a closing tag and so on.

BUT. Once you set ErrorMode.ErrorFree, it will work as expected. To set it, you'll need to create a custom parser like this:


var parser = new BBCodeParser(ErrorMode.ErrorFree, null, new[]
                {
                    // custom tag set, passed as the third argument
                    new BBTag("b", "<b>", "</b>"),
                    new BBTag("i", "<span style=\"font-style:italic;\">", "</span>"),
                    new BBTag("u", "<span style=\"text-decoration:underline;\">", "</span>"),
                    new BBTag("code", "<pre class=\"prettyprint\">", "</pre>"),
                    new BBTag("img", "<img src=\"${content}\" />", "", false, true),
                    new BBTag("quote", "<blockquote>", "</blockquote>"),
                    new BBTag("list", "<ul>", "</ul>"),
                    new BBTag("*", "<li>", "</li>", true, false),
                    new BBTag("url", "<a href=\"${href}\">", "</a>", new BBAttribute("href", ""), new BBAttribute("href", "href")),
                });



Very Strong points:

- Based on a tree of tags, so it understands nested tags and writes them out properly.

- Understands tags whose opening and closing are on different lines. That seems normal, but most of the RegEx-based BBCode parsers out there don't work with multiline input.

- In ErrorFree mode, it will take everything and try to parse it, without errors. It will even add all the missing closing tags at the end of the parsed text. REALLY IMPRESSIVE.
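
To illustrate the last point, here is a minimal, self-contained sketch of the general idea of appending missing closing tags by tracking open tags on a stack. It is NOT the parser's actual implementation (which builds a syntax tree); `TagBalancer` and `CloseOpenTags` are hypothetical names made up for this example, and tags with attributes such as [url=...] are simply treated as plain text here.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

static class TagBalancer
{
    // Hypothetical helper, not part of the parser's API.
    // Appends a closing tag for every known tag that is still open at the end of the input.
    public static string CloseOpenTags(string text, ISet<string> knownTags)
    {
        var open = new Stack<string>();
        int i = 0;
        while (i < text.Length)
        {
            int start = text.IndexOf('[', i);
            if (start < 0) break;
            int end = text.IndexOf(']', start + 1);
            if (end < 0) break;

            string inner = text.Substring(start + 1, end - start - 1);
            bool closing = inner.Length > 0 && inner[0] == '/';
            string name = closing ? inner.Substring(1) : inner;

            if (!knownTags.Contains(name))
            {
                // Unknown tag (e.g. array indexing like x[0]): treat it as plain text.
                i = start + 1;
                continue;
            }

            if (closing)
            {
                // Only pop when the closing tag matches the innermost open tag.
                if (open.Count > 0 && open.Peek() == name) open.Pop();
            }
            else
            {
                open.Push(name);
            }
            i = end + 1;
        }

        // Close the remaining tags, innermost first.
        var sb = new StringBuilder(text);
        while (open.Count > 0) sb.Append("[/").Append(open.Pop()).Append(']');
        return sb.ToString();
    }
}
```

With the known tags "b" and "i", the input "[b]bold [i]both" comes back as "[b]bold [i]both[/i][/b]", while "x[0]" is left untouched.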

Weak points:

- The lack of documentation or complex examples. I needed to download the source code and execute it step by step to understand it.

- No built-in line break handling (<br/>). Still, you can get it simply by adding a BR tag to your custom parser like this:


  new BBTag("br", "<br/>", "",true,false), 


and replacing all \r\n with [br] before parsing like this:


   text = text.Replace("\r\n", "[br]");
   return parser.ToHtml(text);
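
Note that the replacement above only catches Windows-style "\r\n" line endings. A minimal sketch of a slightly more defensive variant, assuming the input may also contain bare "\n" or "\r"; `LineBreaks.ToBbLineBreaks` is a hypothetical helper name, and it relies on the custom [br] tag defined above:

```csharp
using System;

static class LineBreaks
{
    // Hypothetical helper, not part of the parser's API.
    // Normalizes all line endings to "\n" first, then replaces each one with [br].
    public static string ToBbLineBreaks(string text)
    {
        if (text == null) return string.Empty;
        return text
            .Replace("\r\n", "\n")   // Windows
            .Replace("\r", "\n")     // bare carriage return
            .Replace("\n", "[br]");  // every remaining line ending
    }
}
```

You would then call parser.ToHtml(LineBreaks.ToBbLineBreaks(text)) instead of doing the Replace inline.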



Oct 21, 2011 at 12:43 PM

Thanks for the review!

Error Mode: I will change the default. This trips people up all the time. I originally intended to use the parser in strict mode myself, but it turned out people wanted to type "[i]" (array indexing) without escaping the brackets. So I gave up.

"The lack of documentation or complex examples" Yeah that is true. However, the example on the homepage is the main intended usage. Most people should be able to adapt it.

Line break management: This issue has caused us a lot of headaches. I now think the correct way to handle it is to emit every TextNode inside a p tag. That should solve it. Remember that you can transform the syntax tree to your liking. You can also change the way everything is output.

As a final note: I despise regex-based parsing. It causes an endless stream of problems as well as security holes (passing JS in an img url, for example). For the same reason, you cannot sanitize HTML with regexes (I have created a syntax-tree based HTML sanitizer/rewriter as well, for an RSS-based news section). I might open-source it, too.
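
To make the img url example concrete, here is a minimal sketch of a scheme whitelist check, assuming only absolute http/https URLs should be allowed as image sources. `ImageUrls.IsSafeImageUrl` is a hypothetical helper and not part of the parser; it only shows the kind of check that is easy to express once the URL is available as a separate value rather than buried in a regex match.

```csharp
using System;

static class ImageUrls
{
    // Hypothetical helper, not part of the parser's API.
    // Accepts only absolute URLs with an http or https scheme,
    // so values like "javascript:alert(1)" are rejected.
    public static bool IsSafeImageUrl(string url)
    {
        Uri uri;
        if (!Uri.TryCreate(url, UriKind.Absolute, out uri)) return false;
        return uri.Scheme == Uri.UriSchemeHttp || uri.Scheme == Uri.UriSchemeHttps;
    }
}
```

A whitelist is used here rather than a blacklist of "bad" schemes, since new dangerous schemes would otherwise slip through by default.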

Oct 21, 2011 at 1:18 PM
Edited Oct 21, 2011 at 1:20 PM

If you like, I made an Italian language resource file (not that I need it, since in ErrorFree mode the messages never pop up). How can I send it to you?

Edit: OK, thanks, I got the email. I'm sending it to you. Bye

Oct 21, 2011 at 1:46 PM

I have released a new version which is ErrorFree by default (breaking change) and contains the Italian language resources. Thanks for contributing!