How Can I Clean And Sanitize A URL Submitted By A User For Redisplay In Java?
Solution 1:
URLs containing ' are perfectly valid. If you are outputting them to an HTML document without escaping, then the problem lies in your lack of HTML-escaping, not in the input checking. You need to ensure that you call an HTML-encoding method every time you output any variable text (including URLs) into an HTML document.
Java does not have a built-in HTML encoder (poor show!) but most web libraries do (take your pick, or write it yourself with a few string replaces). If you use JSTL tags, c:out does it for free, because its escapeXml attribute is on by default:
<a href="<c:out value="${link}"/>">ok</a>
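If you are not using JSTL, the "few string replaces" approach mentioned above might look roughly like the following. This is only a minimal sketch: the class name HtmlEscaper and the method escapeHtml are made up for illustration, and it assumes the value ends up in HTML text or inside a quoted attribute.

```java
// Hypothetical helper, not part of any library: escapes the five characters
// that are significant in HTML text and in quoted attribute values.
final class HtmlEscaper {
    private HtmlEscaper() {}

    public static String escapeHtml(String text) {
        StringBuilder out = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&#39;");  break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Example input chosen for illustration: a URL containing quote characters.
        String link = "http://example.com/?q='\"><script>alert(1)</script>";
        System.out.println("<a href=\"" + escapeHtml(link) + "\">ok</a>");
    }
}
```

A well-tested encoder from a library is usually the better choice in production; the point here is only that the escaping happens at output time, whatever the URL contains.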
Whilst your main problem is HTML-escaping, it is still potentially beneficial to validate the input URL to catch mistakes. You can do that by parsing it with new URL(...) and seeing if you get a MalformedURLException.
You should also check that the URL begins with a known-good protocol such as http:// or https://. This prevents anyone from using dangerous URL protocols like javascript:, which can lead to cross-site scripting as easily as HTML injection can.
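The two checks above (parse the URL, then allow only known-good protocols) could be combined along these lines. Treat it as a sketch under assumptions: UrlChecker and isAcceptableUrl are illustrative names, and the allow-list of http and https is just the example from this answer.

```java
import java.net.MalformedURLException;
import java.net.URL;
import java.util.Locale;

// Hypothetical helper: accepts a user-supplied URL only if it parses
// and uses an allow-listed protocol.
final class UrlChecker {
    private UrlChecker() {}

    public static boolean isAcceptableUrl(String input) {
        if (input == null) {
            return false;
        }
        try {
            // Throws MalformedURLException for unparseable input and for
            // protocols the JVM has no handler for (javascript: is one such case).
            // Note: this constructor is deprecated in recent JDKs but still works.
            URL url = new URL(input.trim());
            String protocol = url.getProtocol().toLowerCase(Locale.ROOT);
            // Allow-list, not deny-list: anything other than http/https is rejected.
            return protocol.equals("http") || protocol.equals("https");
        } catch (MalformedURLException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isAcceptableUrl("http://example.com/a'b"));
        System.out.println(isAcceptableUrl("javascript:alert(1)"));
    }
}
```

Note that a URL passing this check can still contain ' or other characters that matter in HTML, so the output still has to be escaped when it is written into the page.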
Solution 2:
I think what you are looking for is output encoding. Have a look at OWASP ESAPI, which is a tried and tested way to perform encoding in Java.
Also, just a suggestion: if you want to check whether a user is submitting a malicious URL, you can check it against Google's malware database. You can use the Safe Browsing API for that.
Solution 3:
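You can use UrlValidator from Apache Commons Validator:
import org.apache.commons.validator.routines.UrlValidator;

// Accept only these schemes; anything else (e.g. javascript:) is rejected.
String[] schemes = {"http", "https"};
UrlValidator urlValidator = new UrlValidator(schemes);
if (urlValidator.isValid("http://somesite.com")) {
    // valid
}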