Third party content: The paradoxes of the web

When the World Wide Web started, it was just a collection of static HTML pages interconnected by hyperlinks. More importantly, each website loaded its content from its own server (technically speaking, there was no cross origin content). Today, the web we browse daily has content that comes from many origins. People use buzzwords such as "Mashups", "Web 2.0", "Social Web" etc. when referring to the present day's dynamic web.

Advertisements, JavaScript libraries, images, stylesheets, social plugins, multimedia etc. are examples of content which can load from multiple origins into a website. They may exist independently of each other in a website (e.g., the Facebook "Like" button), or may interact with each other to provide a richer experience (e.g., plotting petrol stations on Google Maps). Whatever it is used for, integrating third party content has certainly proved useful on the modern web.


By definition, paradoxes are arguments which give rise to inconsistencies. For example, answer this seemingly simple question with Yes/No: Is the answer to this question "No"?

Whatever your answer is, you will see that you are contradicting yourself.

If you observe closely, most of the problems on the web are paradoxes. Whether you judge the factors below as safe or unsafe, you will be contradicting yourself :-)

Paradox 1: Content inclusions

As said earlier, including third party content such as advertisements, social plugins, scripts etc. in a web page has enabled and enhanced interactivity between websites. When people speak of interactivity on the web, the first thing that comes to mind is JavaScript. Third party scripts such as Google Analytics, page hit counters, and libraries such as jQuery and its plugins have gained popularity over the years. Developers "trust" third party scripts and believe them to be good and secure. However, as we know, there are good scripts as well as malicious scripts (Hello XSS!), and browsers have no way of differentiating between the two, just as we fail to answer the below paradox:

"_ _ _, third party scripts are not safe" [Fill in the blank with "Yes"/"No"]

Do we have a robust solution for safe content inclusion? Well, after several proposals such as input sanitization and automatic analysis of content, Content Security Policy has gained acceptance. Yet the 'trust' factor is still present.
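
To make the Content Security Policy idea concrete, here is a minimal sketch of how a server might assemble the header value. The helper name build_csp and the hosts cdn.example.com and widgets.example.net are made up for illustration; only the header name and the directive names come from CSP itself.

```python
def build_csp(directives: dict) -> str:
    """Serialize {directive: [sources]} into a Content-Security-Policy value."""
    parts = []
    for name, sources in directives.items():
        # Each directive is its name followed by a space-separated source list.
        parts.append(" ".join([name, *sources]))
    # Directives are separated by semicolons.
    return "; ".join(parts)


# Hypothetical policy: own origin by default, one allow-listed script host,
# one allow-listed host for framed widgets.
policy = build_csp({
    "default-src": ["'self'"],
    "script-src": ["'self'", "https://cdn.example.com"],
    "frame-src": ["https://widgets.example.net"],
})

# The header a response would carry (name, value).
csp_header = ("Content-Security-Policy", policy)
```

Note that the policy only says where scripts may come from. Whatever the allow-listed host serves still runs with full privileges, which is exactly the 'trust' factor mentioned above.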

Paradox 2: Content requests

Even before third party content is included, technically speaking, an HTTP request has to happen. A request, even without a valid response, is sufficient to do both good and bad. HTML elements such as <img>, <script>, <iframe>, <form> etc., through their src/href/action attributes, take any URL and trigger an HTTP request. Note that these requests are not restricted to the same origin (otherwise, we would not be able to load images from multiple websites). However, as we know, there could be malicious cross origin requests as well (Hello CSRF!), and browsers have no way of differentiating between the two.

Do we have any way of differentiating between safe and unsafe requests? Should we block all cross origin requests with cookies? That breaks almost the entire Web. Should we block cross origin requests with parameters? APIs will not work. Should we block cross origin POST? PayPal/Like/Tweet/OpenID workflows will not work. Should we use HTTP headers or tokens to differentiate? Many sites do so today, but this is not robust. The scenario turns complex even without discussing techniques such as JSONP, HTML5 CORS and complex access control policies. So the same paradox holds here as well.
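
As an illustration of the token approach, here is a minimal sketch of a per-session CSRF token check. The function names and the plain dict standing in for a server-side session are assumptions made for this example, not any particular framework's API.

```python
import hmac
import secrets


def issue_csrf_token(session: dict) -> str:
    """Create a random token, remember it in the session, and return it
    so the server can embed it in its own forms."""
    token = secrets.token_urlsafe(32)
    session["csrf_token"] = token
    return token


def is_request_allowed(method: str, session: dict, submitted_token) -> bool:
    """Accept state-changing requests only if they echo the session's token."""
    # Assumption: GET/HEAD/OPTIONS never change state on this site.
    if method.upper() in ("GET", "HEAD", "OPTIONS"):
        return True
    expected = session.get("csrf_token")
    if not expected or not submitted_token:
        return False
    # Constant-time comparison, on bytes so non-ASCII input cannot raise.
    return hmac.compare_digest(expected.encode(), submitted_token.encode())


session = {}
token = issue_csrf_token(session)
same_site_post = is_request_allowed("POST", session, token)      # carries the token
cross_site_post = is_request_allowed("POST", session, "forged")  # attacker's guess
```

The check works only as long as the token stays secret and every state-changing endpoint remembers to call it, which is part of why this is a convention rather than a robust, browser-enforced solution.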

"_ _ _, one cannot differentiate between genuine and malicious cross origin requests" [Fill in the blank with "Yes"/"No"]

Paradox 3: Social plugins

This case is even more problematic. Technically speaking, social plugins (such as Facebook's "Like", Twitter's "Tweet", Google's "+1" etc.) are third party content embedded in <iframe> tags. Framing third party content in iframes is an important step towards security. If the third party content is not framed, it has complete access to the DOM, storage and network of a website, which is dangerous. Content in cross origin iframes cannot access the content of parent web pages due to Same Origin Policy restrictions. However, framing content leads to another dangerous attack called "Clickjacking". Though clickjacking has defenses such as X-Frame-Options, frame busting etc., they cannot be applied to social plugins, as they defeat the whole purpose of wrapping third party content in iframes. A more detailed explanation of this problem is stated in one of my previous posts.
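
The conflict can be seen in a small sketch of how a server might choose its framing header. The function framing_headers and its flag are hypothetical; X-Frame-Options and its DENY value are the real header discussed above.

```python
def framing_headers(is_embeddable_widget: bool) -> dict:
    """Pick response headers depending on whether the page must be framable."""
    if is_embeddable_widget:
        # A social plugin has to render inside other sites' iframes,
        # so it cannot send X-Frame-Options and stays open to clickjacking.
        return {}
    # An ordinary page can simply refuse to be framed by anyone.
    return {"X-Frame-Options": "DENY"}


login_page_headers = framing_headers(is_embeddable_widget=False)
like_button_headers = framing_headers(is_embeddable_widget=True)
```

The ordinary page gets the defense, while the plugin, which is the content most likely to be framed by strangers, gets none.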

"_ _ _, social plugins cannot exist without iframes" [Fill in the blank with "Yes"/"No"]

Paradox 4: Sandbox Iframes [HTML5]

HTML5 introduced the iframe sandbox, which attempts to make iframes even more secure. By default, a sandboxed iframe cannot run scripts, open popups, navigate the top-level frame etc. Several websites use JavaScript based frame busting techniques to defend against clickjacking. This means that if an attacker frames a website A which uses a frame busting script, the script makes site A take over the attacker's page. However, if the attacker uses a sandboxed iframe to frame website A, the sandbox prevents script execution, so the frame busting code fails, thereby enabling the attacker to carry out a clickjacking attack. In short, sandboxed iframes protect from bad scripts, but also disable script based clickjacking protection.
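
Here is a rough sketch of the attacker's side of this, assuming a simplified model in which a frame busting script needs both the allow-scripts and allow-top-navigation sandbox tokens to work. The function names and the URL are illustrative; the sandbox attribute and its tokens are the real HTML5 ones.

```python
from html import escape


def sandboxed_iframe(src: str, allow: tuple = ()) -> str:
    """Build an <iframe> tag whose sandbox attribute lists the given tokens.
    An empty sandbox attribute applies every restriction."""
    tokens = " ".join(allow)
    return '<iframe src="{}" sandbox="{}"></iframe>'.format(
        escape(src, quote=True), escape(tokens, quote=True)
    )


def frame_busting_can_run(allow: tuple = ()) -> bool:
    """Simplified model: the framed page's script must be allowed to run
    and must be allowed to navigate the top-level window."""
    return "allow-scripts" in allow and "allow-top-navigation" in allow


# What an attacker would put on their page to frame site A (hypothetical URL).
attack_markup = sandboxed_iframe("https://site-a.example/")
defense_survives = frame_busting_can_run(())
```

With no tokens granted, the framed site's defensive script never executes, so the same attribute that stops bad scripts also switches off the victim's protection.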

"Sandboxed iframes do not allow scripts. Yes or No?". Implies, sandbox breaks clickjacking.

Conclusion? The web is not short of contradictions. We have more hacks than solutions, and several solutions contradict themselves, as seen above. The bigger problem today is not solving these well known problems, but solving them effectively, without breaking millions of websites. More about some of the interesting proposals to solve these problems in my upcoming posts.

