Another barrier that is frequently used with applications that must accept user-generated HTML is to separate cookie domains: put sensitive pages on a separate origin from the user-generated content. For example, you could have admin.foo.com and comments.foo.com. If sensitive cookies are only set up for domain=admin.foo.com, an XSS on comments.foo.com won’t net anything useful.
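A minimal sketch of what that could look like on the server side, using Python’s standard http.cookies module. The cookie name "session" and the helper build_session_cookie are made up for illustration; the host admin.foo.com is the one from the example above.

```python
from http.cookies import SimpleCookie


def build_session_cookie(session_id: str) -> str:
    """Return a Set-Cookie header line scoped to the admin host only."""
    cookie = SimpleCookie()
    cookie["session"] = session_id                      # hypothetical cookie name
    cookie["session"]["domain"] = "admin.foo.com"       # never sent to comments.foo.com
    cookie["session"]["path"] = "/"
    cookie["session"]["secure"] = True                  # only sent over HTTPS
    cookie["session"]["httponly"] = True                # hidden from document.cookie
    return cookie.output(header="Set-Cookie:")


header = build_session_cookie("abc123")
print(header)
```

Leaving the Domain attribute off entirely is stricter still, since the browser then sends the cookie only to the exact host that set it.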
So that’s what you’ve been so busy working on since your last post? Makes me glad I’m wracking my brain with WPF and XAML instead of Web 2.0 stuff.
No, we just improved it. That’s how code evolves. Giving up is lame.
When you find yourself at the bottom of a hole it’s best to stop digging.
Also what Mr Blasdel said.
Uh, couldn’t someone just filter the response from the server to remove the httpOnly flag? It seems very half-assed to use a feature that is client-side, in SOME browsers. This is a circumstance where it’s important enough to come up with a solution that isn’t just more obfuscated, but that actually increases security by an order of magnitude.
Just my opinion.
Sorry if I didn’t give you sufficient credit
My point was less about re-auth in general, but more about trying to detect who had a legitimately rotating IP address. If detected, cookies can’t be trusted… so force the user into an auth scheme that used cookies as secondary to something else. Primary would be SSL Certs or (shudder) Basic Auth over HTTPS.
Here was the list I initially had:
That’s probably good enough for anonymous comments. These ones are also safe and useful for untrusted comments:
That’s 9 tags. If you want to add a video or an image, you could use a bit of DHTML or Flash to pop up a media selector widget for approved sites: Flickr, YouTube, etc. People get to select URLs to pages, but that’s it. On the back end, check the URL to see if it looks hacked. If so, reject it.
For trusted contributors, you could open it up even more and use tables, headers, links, etc… in which case you’re looking at closer to 20 tags.
For very trusted contributors, you get to use attributes like SRC for IMG, and maybe even SCRIPT nodes.
Of course, @dood mcdoogle summed it up quite well when he said that input filtering cannot ever be sufficient… so you always need an output filtering step. However, there’s no harm in pre-parsing your data and teaching your audience what will and what will not be tolerated.
My tags got gobbled… I think these are critical for anonymous comments:
B, I, UL, OL, LI, PRE, CODE, STRIKE, and BLOCKQUOTE
Anything else, and you probably want to be a verified or trusted user.
Quite an eye opener; thanks Jeff. Also, WTF, when are you going to accept me as a beta user?!
I’m not sure why you people are being so hard headed. He didn’t say that he didn’t ALSO fix the sanitizer. But like all things in web security adding the HttpOnly flag raises the bar. Why not do it? He isn’t advocating using HttpOnly in lieu of other good security measures.
As for sanitizing input versus output I prefer to sanitize output. There are too many other systems downstream that are impacted by sanitizing the input. I write enterprise systems, not forums. There is a big difference. I can’t pass a company name of Smith%27s%20Dairy to some back end COBOL system. They wouldn’t know what to do with it.
For those of you that decide to sanitize your input, it must be nice to write web applications that live in a vacuum…
The Web needs an architectural do-over.
With recent vulnerabilities like the Gmail vulnerability I’m really starting to question whether it is possible to write a secure web app that people will still want to use. Even if it is, it seems like it is little more than a swarm of technologies that interact in far more ways than are immediately obvious.
Why not keep a dictionary that maps the cookie credential to the IP used when the credential was granted, and make sure that the IP matches the dictionary entry on every page access?
Most of us get our IP addresses through DHCP, which means they can change whenever our system (or router) is rebooted.
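For reference, the dictionary idea being quoted could be sketched roughly like this. The names session_ips, issue_session and is_request_allowed are hypothetical, and an in-memory dict stands in for whatever session store a real site uses. Because of the DHCP problem, a mismatch is probably better treated as a reason to re-authenticate than as a hard failure.

```python
import secrets

# Maps session token -> IP address seen when the token was issued.
# Hypothetical in-memory store; a real site would use its session backend.
session_ips = {}


def issue_session(client_ip: str) -> str:
    """Create a new random session token and remember the issuing IP."""
    token = secrets.token_urlsafe(32)
    session_ips[token] = client_ip
    return token


def is_request_allowed(token: str, client_ip: str) -> bool:
    """True only if the token is known and comes from the IP it was issued to."""
    return session_ips.get(token) == client_ip


token = issue_session("203.0.113.7")
print(is_request_allowed(token, "203.0.113.7"))   # same address as at login
print(is_request_allowed(token, "198.51.100.9"))  # stolen cookie or rotated IP
```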
I’m still quite leery of your sanitiser, for the reasons I described on RefactorMyCode: you’re doing blacklisting even if you think you’re doing whitelisting. Your blacklist is more or less anything that looks like BLAH BLAH X BLAH, where X isn’t on the whitelist. As you can see, it’s very hard to write that rule correctly. Your bouncer is still kicking bad guys out of the queue. Instead your bouncer should be picking up good guys and carrying them through the door. If the bouncer messes up, the default behaviour should be nobody gets in, not everybody getting in!
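To make the bouncer analogy concrete, here is a rough sketch of the "carry the good guys through the door" approach: escape everything first, then re-enable only exact, attribute-free tags from an allowed list. This is an illustration under stated assumptions, not the sanitizer under discussion. The tag list is the nine-tag one suggested earlier in the thread, and sanitize_comment is a made-up name.

```python
import html
import re

# Tags allowed for anonymous comments (the nine-tag list from this thread).
ALLOWED_TAGS = ("b", "i", "ul", "ol", "li", "pre", "code", "strike", "blockquote")

# Matches only an already-escaped opening or closing tag with no attributes.
_ALLOWED_TAG_RE = re.compile(
    r"&lt;(/?)(" + "|".join(ALLOWED_TAGS) + r")&gt;", re.IGNORECASE
)


def sanitize_comment(raw: str) -> str:
    """Escape all markup, then restore only the allowed bare tags."""
    escaped = html.escape(raw, quote=True)  # default: nothing gets in
    return _ALLOWED_TAG_RE.sub(
        lambda m: "<" + m.group(1) + m.group(2).lower() + ">", escaped
    )


print(sanitize_comment("<b>bold</b> and <script>alert(1)</script>"))
```

If the pattern fails to recognise something, it stays escaped and shows up as literal text, so a mistake in the rule means nobody gets in. The sketch does not balance tags, so a rejected opening tag can leave its closing tag orphaned.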
As an interesting side note to those who say you should sanitize late rather than early:
I have run into all kinds of XSS when opening tables in my database. Yes, I learned that opening said tables in PHPMyAdmin might not be a good idea.
That was an interesting experience to be sure.
I have to agree with what most people are saying. Allowing direct HTML posting that other users can see is sure to cause at least headaches, if not major problems. You’re better off using some kind of wiki system, or some kind of subset of HTML, where only the tags you are interested in are allowed.
Hey, but how do I set the HttpOnly flag on cookies? I certainly did not find it in the preferences/options dialog.
IP spoofing over UDP = easy, IP spoofing over TCP = hard
The biggest problem in security is that a lot of people think that hard is the same as impossible. It is not. We can patch this and that hole after we’ve completed implementing our design and make it harder to attack our system, but we’ll never really know if we’re 100% safe.
In that regard, giving up is not lame. Playing catch-up is better than not. It’s also better than going back to the drawing board when you’re well into beta (aka scope creep), unless you have infinite budget. I do believe, though, that in the design stage, as Schneier says, security is about trade-offs. If a feature introduces security risks that are absolutely not tolerable, then it might indeed be a good idea to drop it altogether, if designing built-in protection against a class of attacks is not feasible.
IP spoofing over UDP = easy, IP spoofing over TCP = hard
As someone who has written an IP stack, I’m not really sure what about TCP makes it particularly hard. I’m not saying it isn’t, I just don’t see why it would be offhand.
It might (might) be tough to push aside the rightful IP holder from an established connection. However, initiating a connection with a spoofed IP should be just as easy as spoofing your IP in UDP and getting the victim to respond to you.
Friends don’t let friends allow XSS attacks.
When you emit a session id, record the IP. Naturally you also emitted it over SSL, in which case you record the cert they were granted for the session. Therefore each request is validated by IP and cert?
It’s amazing how easily cookies can be hijacked. Shouldn’t there be some way to encrypt them too so that even if they do manage to get the cookie, it’s useless?
I have run into all kinds of XSS when opening tables in my database. Yes, I learned that opening said tables in PHPMyAdmin might not be a good idea.
That just shows you that PHPMyAdmin is not a safe program. The PHPMyAdmin program could not possibly know whether or not the data in the database has been scrubbed. So it should default to scrubbing it on output. It also can’t enforce the rule that all input should be scrubbed before putting it into the database.
It also shows that all programs fall into this same category. There could be an SQL injection vulnerability in your code that lets the user force data into the database unscrubbed. So ALL programs (including yours) should make the assumption that the data could be tainted and scrub it before outputting it to the screen.
It is the one true way to be safe. Making assumptions is always a bad idea. Be sure. Scrub all output.
If you don’t allow unsafe characters, then just completely remove them from input. Done
Think about what this means. What is an unsafe character?
In the context of the user’s message, nothing. It’s only when you go to insert that message directly into an HTML/JS document that certain characters take on a different meaning. And so at that time you escape them. This way the user’s message displays as they intended it AND it doesn’t break the HTML. Everyone wins.
It’s the same for when you’re putting it into SQL, or into a shell-command, or into a URL, etc. You can’t store your data escaped for every single purpose in your DB, you need to do the escaping exactly when it’s needed and keep your original data raw and intact.
Your policy of stripping unsafe characters gets in the way of the user’s perfectly legitimate message. And there’s absolutely no reason for that.
You store user input verbatim, and you always remember to escape when displaying output, and you hope input cleaning works 100%
There is no hope required. You don’t have to always remember if you have a standard method of building DB queries and building HTML documents/templating, and it’s tested. And you should have this.
Where and when to escape (assuming a DB store):
- Untrusted data comes in
- Validate it (do NOT alter it)
- If it’s valid, store it (escape for SQL here)
Later, when you use it:
- To display it in an HTML page: retrieve from DB and escape for HTML
- To use it in a Unix command line: retrieve from DB and escape for shell
- To put it into a URL: retrieve from DB and URL encode
The key is not MODIFYING the user’s data. Just accept or reject. Then you escape if necessary when you use it in different contexts.
Now you can do anything you want with your data. You don’t have to impose confusing constraints on what your users can and can’t say.
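Here is a small sketch of that flow in Python, using only the standard library. The table name, the is_valid rule and the example.com URL are invented for illustration. The point is that the stored value stays byte-for-byte what the user typed, and each output context applies its own escaping.

```python
import html
import shlex
import sqlite3
from urllib.parse import quote

raw = "Smith's Dairy <b> & sons"  # untrusted input, kept verbatim


def is_valid(name: str) -> bool:
    """Accept or reject only; never modify. (Hypothetical rule: length limit.)"""
    return 0 < len(name) <= 200


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE companies (name TEXT)")
if is_valid(raw):
    # Parameter binding handles the SQL context; the data is stored unchanged.
    conn.execute("INSERT INTO companies (name) VALUES (?)", (raw,))

stored = conn.execute("SELECT name FROM companies").fetchone()[0]

# Escape at the moment of use, once per target context.
html_fragment = "<p>" + html.escape(stored) + "</p>"
shell_command = "echo " + shlex.quote(stored)
url = "https://example.com/search?q=" + quote(stored, safe="")

print(html_fragment)
print(shell_command)
print(url)
```

The same raw string can still be handed to a downstream system that knows nothing about HTML, because nothing HTML-specific was ever written into the database.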