Encode User Input Before Outputting It to a Page - Or Else!

Turn bad user input into good ... and prevent your Web server from being hacked.

 

By Jeff Prosise

 

Here's an extraordinarily simple ASP.NET Web page - one that displays a personalized greeting consisting of "Hello" followed by a name typed by the user when the Click Me button is clicked. Despite its simplicity, this page suffers from a potentially fatal security flaw that can put cookies, your Web server, and even other computers behind your firewall at risk. Can you spot the flaw?

 

<html>
  <body>
    <form runat="server">
      <asp:TextBox ID="Input" RunAt="server" />
      <asp:Button Text="Click Me" OnClick="OnSubmit"
        RunAt="server" />
      <asp:Label ID="Output" RunAt="server" />
    </form>
  </body>
</html>

 

<script language="C#" runat="server">
void OnSubmit (Object sender, EventArgs e)
{
    Output.Text = "Hello, " + Input.Text;
}
</script>

 

If you guessed that the flaw has to do with the fact that the code echoes raw user input to the page, you guessed correctly. A Web page should never, under ANY CIRCUMSTANCES, echo raw, unfiltered user input to the page. Why? Because it leaves that page susceptible to cross-site scripting (XSS) attacks. To demonstrate, run the page and type the following text into the text box:

 

<script>alert('Gotcha')</script>

 

Click the Click Me button and a message box pops up in your browser. The problem? When the page echoed the script block back to the browser, the browser interpreted it as code to be executed. This particular script is benign, but malicious users - read: hackers - can enter scripts that are far from benign. Even a page as simple as this one can't afford to assume that all users have honorable intentions. (Note: If you run this page with ASP.NET 1.1, you'll need to add an <%@ Page ValidateRequest="false" %> directive to the top of the page to see the message box, because request validation in ASP.NET 1.1 otherwise rejects input that looks like markup.)
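
To see why the browser runs the script, it helps to look at the markup the page sends back. The following console program is only a rough sketch, not the page itself: RawEchoDemo and BuildGreeting are hypothetical names standing in for the OnSubmit handler, and the span element approximates what a Label control renders.

```csharp
using System;

// Hypothetical stand-in for the page's OnSubmit handler (not part of the article's page).
static class RawEchoDemo
{
    // Builds the greeting exactly as the page does: raw input, no encoding.
    public static string BuildGreeting(string input)
    {
        return "Hello, " + input;
    }

    static void Main()
    {
        string input = "<script>alert('Gotcha')</script>";

        // Approximation of the Label's rendered markup (assumed to be a span).
        Console.WriteLine("<span id=\"Output\">" + BuildGreeting(input) + "</span>");
        // prints: <span id="Output">Hello, <script>alert('Gotcha')</script></span>
    }
}
```

Because the script element arrives as live markup rather than as text, the browser has no way to tell it apart from script the page author wrote.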

 

The solution is simple. Before echoing user input to a page, use Server.HtmlEncode to HTML-encode it. The following page is functionally equivalent to the previous one, but because it HTML-encodes the text typed by the user before outputting it to the page - turning < and > characters, for example, into &lt; and &gt; - the browser displays injected markup as plain text instead of executing it, which defeats the attack shown above:

 

<html>
  <body>
    <form runat="server">
      <asp:TextBox ID="Input" RunAt="server" />
      <asp:Button Text="Click Me" OnClick="OnSubmit"
        RunAt="server" />
      <asp:Label ID="Output" RunAt="server" />
    </form>
  </body>
</html>

 

<script language="C#" runat="server">
void OnSubmit (Object sender, EventArgs e)
{
    // HTML-encode the user's text before echoing it to the page
    Output.Text = "Hello, " + Server.HtmlEncode (Input.Text);
}
</script>
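
 

The effect of the encoding can also be checked outside a Web page. The console program below is a minimal sketch under stated assumptions: it uses System.Net.WebUtility.HtmlEncode as a stand-in for Server.HtmlEncode so that it runs without a Web server, and EncodedEchoDemo and BuildGreeting are hypothetical names. The payload is a simplified variant of the one used earlier.

```csharp
using System;
using System.Net;

// Hypothetical stand-in for the fixed OnSubmit handler (not part of the article's page).
static class EncodedEchoDemo
{
    // Assumption: WebUtility.HtmlEncode substitutes for Server.HtmlEncode here.
    public static string BuildGreeting(string input)
    {
        return "Hello, " + WebUtility.HtmlEncode(input);
    }

    static void Main()
    {
        // Markup characters are replaced with character entities.
        Console.WriteLine(BuildGreeting("<script>alert(1)</script>"));
        // prints: Hello, &lt;script&gt;alert(1)&lt;/script&gt;

        // Ordinary input passes through unchanged.
        Console.WriteLine(BuildGreeting("Jeff"));
        // prints: Hello, Jeff
    }
}
```

The browser shows the first result as the literal text of the script tag, so nothing executes. Keep in mind that this covers text echoed into the body of a page; input placed inside script blocks, URLs, or attribute values may need encoding suited to that context.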

 

Remember: user input is inherently evil and should be treated as such. Server.HtmlEncode turns bad user input into good and might just prevent your Web server from being hacked - or worse.