Wednesday, November 07, 2007

SnTT: NotesSAX Parsing

Jake was asking about how to filter out malicious tags and code, and I suggested using the NotesSAXParser after the HTML is converted to XHTML. I'm surprised that no one uses this approach. Fundamentally, XHTML is XML, and the right tools for parsing it are the NotesSAXParser and NotesDOMParser. I figured that maybe there has not been an example of its use. So here is a sample class pulled right out of our home-grown content management system, which uses NotesSAXParser. It's a little early for Show-and-Tell Thursdays...
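To make the filtering idea concrete, here is a rough sketch in Python's `xml.sax`, which follows the same event-driven model as NotesSAXParser. This is not the LotusScript class from the CMS. The names `WhitelistFilter`, `sanitize`, `ALLOWED_TAGS` and `ALLOWED_ATTRS` are made up for illustration, and the whitelist itself is only an assumption about what you might want to permit. It assumes the input is already well-formed XHTML with a whitelisted root element.

```python
import xml.sax
from xml.sax.saxutils import escape, quoteattr

# Hypothetical whitelist; adjust for your own content rules.
ALLOWED_TAGS = {"p", "div", "span", "a", "b", "i", "em", "strong", "ul", "ol", "li"}
ALLOWED_ATTRS = {"href", "title", "class"}


class WhitelistFilter(xml.sax.ContentHandler):
    """Re-emits only whitelisted elements and attributes; drops everything else."""

    def __init__(self):
        super().__init__()
        self.out = []
        self.skip_depth = 0  # > 0 while inside a dropped element such as <script>

    def startElement(self, name, attrs):
        if self.skip_depth or name not in ALLOWED_TAGS:
            self.skip_depth += 1  # drop this element and its whole subtree
            return
        parts = [name]
        for key, value in attrs.items():
            if key not in ALLOWED_ATTRS:
                continue  # drops onclick, style, and so on
            if key == "href" and value.strip().lower().startswith("javascript:"):
                continue  # drops script URLs
            parts.append(f"{key}={quoteattr(value)}")
        self.out.append("<" + " ".join(parts) + ">")

    def endElement(self, name):
        if self.skip_depth:
            self.skip_depth -= 1
            return
        self.out.append(f"</{name}>")

    def characters(self, content):
        if not self.skip_depth:
            self.out.append(escape(content))


def sanitize(xhtml):
    """Return the XHTML with non-whitelisted tags and attributes removed."""
    handler = WhitelistFilter()
    xml.sax.parseString(xhtml.encode("utf-8"), handler)
    return "".join(handler.out)
```

Calling `sanitize()` on a fragment that contains a `<script>` element or an `onclick` attribute returns the same markup without them. Dropping the whole subtree of a disallowed element is a design choice in this sketch; you could instead keep the text and drop only the tag.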


  1. Looks interesting Tony. Have you got an example of its use?

  2. Here is the XML that describes the home page for homepage.xml.
    The RenderEngine takes the XML, which is a mix of XHTML and the CMS tags, to build the HTML.
    Is that what you mean?

  3. That seems like an ingenious approach to website templating.

    It has the power of the SAX parser combined with the ability to fill in content on a node-by-node basis.

    Regarding why no one else uses it: Notes/Domino is a giant toolbox. There are several "right tools". Once you find one that is fast/functional enough, it's easy to stop looking for a better tool.

    A couple of questions:
    Is the HTML for a page created on WQO?

    Have you used other ways of templating before/do you have performance comparisons?

    If someone messes up (if that's possible), and enters something like this <<div>, how does the rendering engine handle this?

  4. Tommy, yes, a WQO generates the content. I've used templating before for email generation using tags like ${doc:fieldname}, but not for HTML generation, so I don't have any comparisons. I did look at the page generation performance for the websites that I've developed and was happy with the response time. However, those websites didn't have a large site map, which would impact performance. If I get to that stage then I'll add in some caching.

    I've spent time coding JSP for plain web apps on WebSphere and for WebSphere Portal skins and themes, so I was looking for something that was more aligned with open standards. There seem to be a number of different ways that templating has been approached, like project dx, but again not a standard approach outside of the Notes/Domino world.

    The render engine displays the SAX errors in-line with the HTML, along with some diagnostics, so you can see straight away where the problem is. This includes invalid XML too, so your example would be displayed as an error.

    You need to ensure that you have correct XML, which can be a constraint sometimes, but that's not a bad thing in reality.
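
The inline error reporting described in the last comment can be sketched roughly as follows, again in Python's `xml.sax` and not as the actual render engine. The `render` function and the `render-error` class name are hypothetical, and the empty `ContentHandler` is only a stand-in for whatever handler really builds the page.

```python
import xml.sax
from html import escape


def render(template):
    """Parse the template; on invalid XML return an inline error block instead."""
    handler = xml.sax.ContentHandler()  # stand-in for a real rendering handler
    try:
        xml.sax.parseString(template.encode("utf-8"), handler)
    except xml.sax.SAXParseException as err:
        # Report where the parser gave up so the author can find the bad markup.
        return (
            '<pre class="render-error">'
            f"XML error at line {err.getLineNumber()}, "
            f"column {err.getColumnNumber()}: {escape(err.getMessage())}"
            "</pre>"
        )
    return template  # a real engine would return the generated HTML here
```

With this sketch, a template containing something like `<<div>` comes back as an error block carrying the parser's line and column, while well-formed input passes through unchanged.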