Stripping Styles As Part Of Sanitation
By Adrian Sutton
Somewhere along the line I stumbled across Mark Pilgrim’s description of how the Universal Feed Parser sanitizes HTML. A key part of this is that the Universal Feed Parser strips styles, because IE on Windows can wind up executing part of that style as JavaScript.
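To make the vector concrete: IE’s proprietary expression() syntax lets a CSS property value evaluate JavaScript, so even an innocuous-looking style attribute can carry script. The filter below is a minimal sketch of my own (the pattern and function name are illustrative, not the Universal Feed Parser’s actual code):

```python
import re

# IE on Windows will evaluate JavaScript embedded in CSS via its
# proprietary expression() syntax, e.g.
#   <div style="width: expression(alert('xss'))">
# The behavior: and javascript: forms are related IE-only vectors.
# Illustrative pattern only; not the Universal Feed Parser's code.
SUSPICIOUS_CSS = re.compile(r"expression\s*\(|behavior\s*:|javascript\s*:",
                            re.IGNORECASE)

def style_is_dangerous(style_value: str) -> bool:
    """Return True if a style attribute value could execute script in IE."""
    return bool(SUSPICIOUS_CSS.search(style_value))

print(style_is_dangerous("color: red; font-weight: bold"))    # False
print(style_is_dangerous("width: expression(alert('xss'))"))  # True
```

A filter like this is trivially evaded with CSS escapes and comments, which is why stripping styles wholesale is the safer option – and exactly the trade-off being made.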
While obviously at the moment this has to be done, it seems completely unreasonable to me that any content that wants to be syndicated accurately needs to avoid CSS entirely. It seems to me that rather than stripping style information, we should be pressuring Microsoft (and any other affected browser vendors) to fix the browser so that it doesn’t ever treat information in CSS as executable code.
There are two aspects of RSS security to consider. Firstly, and most seriously, there is the possibility of security risks posed by tags like embed, object, applet, script and a bunch of others. Out of the box these can allow cross-site scripting exploits, and with user intervention (accepting a security dialog) can completely hose a user’s machine. The security dialog is a lot less protection than it would normally be because the user may not realize they are viewing syndicated content and that the site they’re on isn’t the source of the security dialog. These things are big, dangerous security risks and obviously need to be stripped if any of the aggregated content sources is not as trusted as the end publication point¹. Another point of entry here is that the syndicated content is effectively moved into a more trusted domain (e.g. it’s viewed from your web server at localhost instead of from some random remote server on the web), thus it gets more permissions than it should otherwise. Stop trusting localhost. IE has already moved to treating any file on your hard drive as completely untrusted and preventing it from executing JavaScript etc. We need to treat all content coming into our browsers as if it came from a random site on the internet – the concept of different security zones in browsers has always been a hideously bad idea.
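For the tag side of it, the general shape of a sanitizer is an allowlist: keep a known-safe set of tags and drop everything else. A minimal sketch (my own allowlist and names, not the Universal Feed Parser’s implementation, and it deliberately ignores attribute filtering):

```python
from html.parser import HTMLParser

# Purely illustrative allowlist; a real sanitizer would also filter
# attributes (href, src, style, on*) on the tags it keeps.
ALLOWED_TAGS = {"p", "a", "b", "i", "em", "strong", "ul", "ol", "li",
                "blockquote", "pre", "code", "br", "img", "div", "span"}

class TagStripper(HTMLParser):
    """Rebuilds the input, silently dropping any tag not in the allowlist."""

    def __init__(self):
        super().__init__()
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag in ALLOWED_TAGS:
            rendered = "".join(f' {k}="{v}"' for k, v in attrs if v is not None)
            self.out.append(f"<{tag}{rendered}>")

    def handle_endtag(self, tag):
        if tag in ALLOWED_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

def strip_dangerous_tags(html: str) -> str:
    stripper = TagStripper()
    stripper.feed(html)
    return "".join(stripper.out)

print(strip_dangerous_tags("<p>hi</p><script>alert(1)</script>"))
# -> "<p>hi</p>alert(1)" (the tags are gone, though a real implementation
#    would drop the script body as well, not just its tags)
```

Note the direction of the allowlist: a new dangerous tag is excluded by default, whereas a blocklist has to chase every new threat.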
Secondly, there is the potential for syndicated content to mess up the display of the final site by including CSS. If we were in a world where executable code couldn’t be embedded in CSS, then the worst that could happen is that the content’s rendering gets messed up. No cross-site scripting, no security dialogs popping up as if they were from a more trusted site, just a bunch of things laid out on the page wrong. The most serious case of this that I can think of is using CSS to overlay form elements from the syndicated content, say a username and password field, over the real login box on the page. Similarly, it would be possible to position HTML elements on the page such that they looked like they came from the trusted site instead of from the syndicated content.
If the final site were a public site like Google News, then such messing up of the content, even if it wasn’t devious, would be extremely bad, and obviously stripping style information is really the only way to go. If, however, the final destination site is my news aggregator, then chances are that I don’t really care. I know everything that is displayed comes from syndicated content, and if something messes up the display I’ll drop that feed from my subscriptions. Web-based aggregators fall into the first category, but a planet aggregator I run locally falls into the second. Readers like NetNewsWire that display the stories separately instead of combining them are an even safer case – the content is completely delineated from everything else, has no extra abilities to interact with other things, and is essentially the same as viewing the page on the web. This is particularly so in NetNewsWire’s case, since it uses WebKit and so it’s even the same rendering engine. As long as the rendering engine is set up to treat the content as a random web site instead of in some ultra-stupid trusted mode, I can’t see a way that any content could be malicious.
The best way that I can see to solve this is to improve the final renderer rather than trying to limit what can be aggregated. Firstly, no browser should have a trusted mode – everything should be treated as if it was downloaded from a remote site somewhere on the internet.
Secondly, browsers need to provide a way to specify that part of the page has been syndicated and is actually from a separate source. Any syndicated source would run its JavaScript in a fresh sandbox; it would have no access to any part of the DOM outside of that specific syndicated content area, or to any other resource – it is effectively a brand new web page that just happens to render as part of the existing one. There is no way to jump back out of that syndicated block from inside it. Aggregators would then just have to make sure that the syndicated content is well-formed XHTML² and put it inside a DIV with the special syndicated content marker, as in the sketch below. Everything inside that DIV is now separated from the rest of the page. Further, nothing from inside that DIV can render outside of its boundaries.
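Sketching what the aggregator side might look like under this scheme (the syndicated attribute is hypothetical – no browser implements it – and parsing is just one way to enforce the well-formedness requirement):

```python
import xml.etree.ElementTree as ET

def wrap_syndicated(content: str, source_url: str) -> str:
    """Wrap well-formed XHTML in a hypothetical syndicated-content DIV.

    Parsing first guarantees the fragment is well-formed, so a stray
    </div> in the feed can't close the wrapper early and escape back
    into the trusted page (the concern in footnote 2).
    """
    try:
        ET.fromstring(f"<root>{content}</root>")  # raises on bad XHTML
    except ET.ParseError as err:
        raise ValueError(f"content is not well-formed XHTML: {err}")
    # "syndicated" is the imagined marker attribute the browser would
    # honour by sandboxing everything inside the DIV.
    return f'<div syndicated="{source_url}">{content}</div>'

print(wrap_syndicated("<p>A post</p>", "http://example.com/feed"))
# wrap_syndicated("<p>oops</div>", ...) raises ValueError instead of
# letting the extra </div> leak into the surrounding page
```

The browser, not the aggregator, would be the one enforcing the sandbox; the aggregator’s only job is to guarantee the boundary can’t be broken by malformed markup.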
Using the syndicated content flag would only ever add restrictions. For example, if you were browsing a site that you had explicitly disabled JavaScript on, syndicated content in that page could not use JavaScript, even if the syndicated content reportedly came from a completely trusted site. Browsers could choose to never allow syndicated content to trigger a security dialog, or to use both the original page URL and the reported syndication source in the dialog so the user is effectively asked to trust both sources. Personally, I’d lean heavily towards just not allowing syndicated content to trigger security dialogs, because users tend to approve them without thinking.
The essential part of the syndication flag isn’t that it’s a way of saying where the content comes from; it’s a way of saying “this site doesn’t vouch for this content, don’t trust it even if you trust me, oh and by the way, I think it came from over there”. Now, I haven’t thought about this for more than about half an hour, so there may be a reason it wouldn’t work. Regardless of the feasibility of this scheme though, we need to come up with some way of syndicating things without giving up the latest (and downright old) web technologies and the enhanced user experience they bring. Even if we have to put up with stripping styles from feeds for now, what can we do to make it safe to keep them in the future?
1 – In other words, if your intranet pulls data via RSS from your internal systems you probably don't need to worry – all the systems, and thus all the content, are under your own control and RSS is just being used as a transport mechanism.
2 – Thus avoiding an extraneous closing DIV tag, which would allow the content to suddenly become part of the main page again.