I've released a new version of Postmarkup, my bbcode rendering engine for Python. If you are not familiar with bbcode, it is a simple markup language used by many message boards. For example, [b]Hello, World![/b] would render Hello, World! in bold.
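To give a feel for what a bbcode renderer does, here is a toy sketch. It is not Postmarkup's code or API; the function name render_simple_bbcode and the tag mapping are made up for illustration, and it only swaps a couple of simple tags without checking that they are balanced.

```python
import html
import re

# Hypothetical mapping from bbcode tag names to HTML element names.
SIMPLE_TAGS = {"b": "strong", "i": "em"}

# Matches [name] and [/name] with no attributes.
_re_simple_tag = re.compile(r"\[(/?)(\w+)\]")


def render_simple_bbcode(markup):
    """Render a tiny subset of bbcode to HTML (illustration only)."""

    def replace(match):
        closing, name = match.group(1), match.group(2).lower()
        element = SIMPLE_TAGS.get(name)
        if element is None:
            return match.group(0)  # leave unknown tags as literal text
        return "<%s%s>" % (closing, element)

    # Escape first so that any HTML typed by the author is shown, not executed.
    return _re_simple_tag.sub(replace, html.escape(markup, quote=False))


print(render_simple_bbcode("[b]Hello, World![/b]"))
```

A real engine also has to cope with unclosed and badly nested tags, which this sketch ignores.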

There are a number of bugfixes in the 1.1.4 release, mostly to fix the possibility of HTML injection by manipulation of the tags and attributes. The link tag was particularly problematic for this, so it has been re-written. I've also made a number of optimizations so that it will render HTML faster. It wasn't exactly slow, but I have noticed that most people use Postmarkup as a filter in web frameworks (rather than storing the pre-rendered HTML in a database), so the speed boost may be appreciated.
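For anyone wondering why a link tag is such an easy target, here is a minimal sketch of the kind of checks a link renderer needs. It is an illustration under my own assumptions, not the rewritten Postmarkup code: render_link and ALLOWED_SCHEMES are hypothetical names.

```python
import html
from urllib.parse import urlparse

# Hypothetical whitelist; anything else (e.g. javascript:) is refused.
ALLOWED_SCHEMES = {"http", "https", "ftp", "mailto"}


def render_link(url, text):
    """Render an anchor tag, refusing unsafe URLs (illustration only)."""
    url = url.strip()
    scheme = urlparse(url).scheme.lower()
    if scheme not in ALLOWED_SCHEMES:
        # Fall back to plain text. Note this also rejects relative URLs.
        return html.escape(text)
    # Escape quotes in the URL so it cannot break out of the href attribute.
    return '<a href="%s">%s</a>' % (html.escape(url, quote=True), html.escape(text))


print(render_link('http://example.com/" onmouseover="alert(1)', "click me"))
print(render_link("javascript:alert(1)", "click me"))
```

The two things being guarded against are a URL that closes the attribute early and injects its own, and a URL whose scheme runs script when clicked.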

I've also added the option to turn new lines into paragraphs rather than inserting break tags. Break tags are a little more literal, in that the bbcode author will get what they expect in the output when they hit the return key, but paragraph tags make for more elegant markup that can be styled a little more easily. Another difference with paragraph tags is that a run of consecutive newlines will result in only a single paragraph break.
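The difference between the two modes can be sketched like this. These helper functions are hypothetical and are not how Postmarkup does it internally; they just show the behaviour described above on plain text.

```python
import html
import re


def newlines_to_breaks(text):
    """Every newline becomes a break tag, so blank lines are preserved."""
    return html.escape(text, quote=False).replace("\n", "<br/>\n")


def newlines_to_paragraphs(text):
    """Each run of newlines becomes a single paragraph boundary."""
    chunks = [chunk.strip() for chunk in re.split(r"\n+", text)]
    return "\n".join(
        "<p>%s</p>" % html.escape(chunk, quote=False) for chunk in chunks if chunk
    )


sample = "One\n\n\nTwo"
print(newlines_to_breaks(sample))      # three break tags between One and Two
print(newlines_to_paragraphs(sample))  # two paragraphs, no empty ones
```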

Another new feature is the ability to run the resulting HTML through a cleanup filter that removes redundant markup, which can be produced if the bbcode author doesn't explicitly close tags. The markup will still be valid, but it may contain something like <b> </b>, which doesn't do anything useful. Incidentally, I was kind of pleased with the method that does this -- it seemed almost too simple. Here's the code; let me know if you come up with a better way!

    # Matches simple blank tags containing only whitespace, e.g. <b> </b>
    _re_blank_tags = re.compile(r"\<(\w+?)\>\s*\</\1\>")

    @classmethod
    def cleanup_html(cls, html):
        """Cleans up html. Currently only removes blank tags, i.e. tags containing only
        whitespace. Only applies to tags without attributes. Tag removal is done
        recursively until there are no more blank tags. So <strong><em></em></strong>
        would be completely removed.

        html -- A string containing (X)HTML

        """

        original_html = ''
        while original_html != html:
            original_html = html
            html = cls._re_blank_tags.sub(u"", html)
        return html
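If you want to try the idea outside the class, here is a standalone sketch. The pattern is an assumption based on the docstring (an attribute-free opening tag, optional whitespace, then the matching closing tag), so treat it as an approximation rather than the exact expression in the library.

```python
import re

# Assumed pattern: <name>, optional whitespace, </name> with the same name.
_re_blank_tags = re.compile(r"<(\w+)>\s*</\1>")


def cleanup_html(html):
    """Repeatedly strip blank, attribute-free tags until nothing changes."""
    previous = None
    while previous != html:
        previous = html
        html = _re_blank_tags.sub("", html)
    return html


print(cleanup_html("<p>Hi <b> </b>there<strong><em></em></strong></p>"))
```

The loop matters because removing the inner <em></em> is what turns the outer <strong></strong> into a blank tag, which a single pass would miss.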

Yet another new feature is the ability to retrieve additional information generated when the bbcode is rendered. When the render_to_html method is called it creates a dictionary which the tag classes can use to store any additional data needed when rendering. Previously this dictionary was discarded after rendering, but now the interface allows for an alternative dictionary to be supplied so that it can be accessed after rendering. This could be used to create tags that supply meta information and don't contribute to the resulting HTML. For instance, if this blog post were using Postmarkup, it might be nice to do something like [tags] python, postmarkup, code, tech [/tags] or [template] halloween.html [/template].
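The pattern is easier to see in a toy version. The sketch below is not Postmarkup's interface; render_with_tag_data and the "tags" key are names I've invented for illustration. It shows a caller-supplied dictionary being filled in during rendering by a tag that produces no HTML of its own.

```python
import html
import re

# Matches a [tags]...[/tags] block (hypothetical meta tag).
_re_tags_block = re.compile(r"\[tags\](.*?)\[/tags\]", re.DOTALL | re.IGNORECASE)


def render_with_tag_data(markup, tag_data=None):
    """Render markup, storing meta information in tag_data (illustration only)."""
    if tag_data is None:
        tag_data = {}  # private dictionary, thrown away when we return

    def collect(match):
        names = [name.strip() for name in match.group(1).split(",")]
        tag_data.setdefault("tags", []).extend(name for name in names if name)
        return ""  # the meta tag contributes nothing to the output

    body = _re_tags_block.sub(collect, markup)
    return html.escape(body.strip(), quote=False)


data = {}
output = render_with_tag_data("Hello [tags] python, postmarkup, code, tech [/tags]", data)
print(output)
print(data["tags"])
```

Because the caller owns the dictionary, whatever the tags stored is still there once rendering has finished.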

Postmarkup is licensed under my politeware license, which allows you to do anything at all you want with it, as long as you say thanks.

This blog post was posted to It's All Geek to Me on Saturday November 1st, 2008 at 2:54PM
 

6 Responses to "New version of Postmarkup"

  • Jason Peddle
    November 14th, 2008, 12:08 a.m.

    This is perfection, and seemingly braindead to extend. Thank you.

  • November 15th, 2008, 2:24 p.m.

    I use this library in my websites and I have one feature request:
    make a "br" tag from a single newline and a paragraph from two (or more) newlines.

    What do you think of it? I'd even like to join the project to work on a few features :)

  • November 30th, 2008, 11:07 a.m.

    Are you planning to add support for smilies that are present in phpbb databases?

    An example:

    <!-- s:D --><img src="{SMILIES_PATH}/icon_biggrin.gif" alt=":D" title="Very Happy" /><!-- s:D -->

  • Ralph Corderoy
    December 24th, 2008, 2:26 p.m.

    The code is getting mangled. The definition of _re_blank_tags looks wrong; There's a backslash escaping the closing double quote. And how can it be a blank tag if it has \w+ in it?

  • January 26th, 2009, 1:19 a.m.

    Thanks, I am adapting this library to use it on a new website I'm working on at www.cafesurvey.com. I use TinyMCE so people who don't know HTML can edit fields but this will be useful for fields which are too small for the full-up TinyMCE editor.

  • Ahlywog
    February 25th, 2010, 9:32 p.m.

    @ Anatoliy

    But this is, hopefully, what you're looking for. His library is extensive enough that it provides the features necessary to do almost anything.

    add_tag(LineBreakTag, 'br')


    class LineBreakTag(TagBase): # Ahlywog Contribution

        """A tag used to include line breaks in BB code. """

        def __init__(self, name):
            TagBase.__init__(self, name, inline=True)

        def render_open(self, parser, node_index):
            return u"<br />"

    Here is another I added for my own use; it allows you to call a function from within the open tag and pass it whatever is in the contents between the tags.

    So info => MyFunction(info)

    add_tag(FunctionReturnTag, 'fr')

    class FunctionReturnTag(TagBase): # Ahlywog Contribution

        """This tag allows you to specify a function in the params then the info to be passed to that function between the tags. """
        """All information sent to the function will be in the form of a tuple. """

        def __init__(self, name):
            TagBase.__init__(self, name, inline=True)

        def render_open(self, parser, node_index):
            output = u""
            if self.params:
                if self.get_contents(parser):
                    self.skip_contents(parser)
                    if self.params.strip() in dir(sys.modules['__main__']):
                        function = getattr(sys.modules['__main__'], self.params.strip())
                        args = self.get_contents(parser).strip().split(',')
                        output = function(args)
            return output

    It's a little dirty but it works for now.

Will McGugan

My name is Will McGugan. I am an unabashed geek, an author, a hacker and a Python expert – amongst other things!

© 2008 Will McGugan.

A technoblog blog, design by Will McGugan