Why Load Testing Ajax is Hard


By: Patrick Lightbody

Article on the Ajaxian blog site from December 2008 on the challenges of load testing sites incorporating Ajax.

Today we are fortunate to have a guest post by Patrick Lightbody, most recently of BrowserMob fame (and previously Selenium work, OpenQA, WebWork, and more). Let’s listen in to him talk to us about load testing, and let him know your thoughts in the comments below:

I’ve been developing and testing complex web apps for a long time. I was the co-creator of WebWork (now Struts 2.0) and an early champion of DWR, writing one of the first AJAX form validation frameworks for Java web apps. But over the years, I noticed that as our web technologies and techniques got more sophisticated, our testing techniques were not keeping up.

That was why I founded OpenQA and helped grow Selenium into the popular testing tool that it is today. Selenium helps with functional testing of complex AJAX apps, but there isn’t an equivalent for load testing, which is why I started BrowserMob, a new type of load testing service.

Traditional load testing
To achieve high levels of concurrency, traditional load testing tools (both open source and commercial) send large numbers of HTTP requests to simulate many concurrent users interacting with your web page. These tools record the traffic that comes from a browser session and then require the load tester to tweak a generated script so that it works properly when played back X times concurrently.
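The replay model described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `replay_script`, `run_load_test`, and the `fake_send` stub are hypothetical names, and the stub stands in for a real HTTP client so that no traffic is actually sent.

```python
from concurrent.futures import ThreadPoolExecutor


def replay_script(user_id, recorded_urls, send):
    """Replay one recorded session; `send` performs a single request."""
    return [send(user_id, url) for url in recorded_urls]


def run_load_test(recorded_urls, concurrent_users, send):
    """Play the same recorded script back once per simulated user, in parallel."""
    with ThreadPoolExecutor(max_workers=concurrent_users) as pool:
        futures = [
            pool.submit(replay_script, user_id, recorded_urls, send)
            for user_id in range(concurrent_users)
        ]
        # Results are collected in submission order, one list per user.
        return [future.result() for future in futures]


def fake_send(user_id, url):
    """Hypothetical stand-in for an HTTP request; always reports status 200."""
    return (user_id, url, 200)


results = run_load_test(["/", "/search?q=banana"], 3, fake_send)
```

Note that every simulated user replays exactly the same URLs, which is why the recorded script usually needs tweaking before it behaves like independent users.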

A common problem is that the initial recording embeds cookie values that are tied to an individual session. Additional unique state might be encoded in hidden form elements, all of which requires some fine tuning after the fact. If you’ve ever tried to run a load test, this is probably a very familiar process. It has worked reasonably well up until recent years, but AJAX has made this process even more difficult.
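As a rough sketch of that fine tuning, the script can read the per-session values out of each response instead of replaying the recorded ones. The example below uses Python’s standard `html.parser`; the form markup and the `csrf_token` field name are made up for illustration and do not come from any real site.

```python
from html.parser import HTMLParser


class HiddenFieldParser(HTMLParser):
    """Collects name/value pairs of <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        is_hidden = (attr.get("type") or "").lower() == "hidden"
        if is_hidden and attr.get("name"):
            self.fields[attr["name"]] = attr.get("value") or ""


def extract_hidden_fields(html):
    """Return the hidden form fields found in one HTML response."""
    parser = HiddenFieldParser()
    parser.feed(html)
    return parser.fields


# Hypothetical response body; a real test would use the page each
# simulated user has just received.
SAMPLE_PAGE = """
<form action="/login" method="post">
  <input type="hidden" name="csrf_token" value="a1b2c3">
  <input type="text" name="user">
</form>
"""

fields = extract_hidden_fields(SAMPLE_PAGE)
```

Each simulated user would then submit its own extracted values with the next request rather than the values captured at recording time.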

Ajax + load testing = hard
The reason Ajax has complicated things is that it encourages more logic and state to run inside the browser session. This means that just watching the traffic across the wire doesn’t necessarily tell the full story. The richer an app gets, the more difficult it gets to simulate the exact effects of hundreds or thousands of users hitting your site.

This is the problem I decided to solve when I started BrowserMob. It’s on-demand, low-cost and uses real browsers to completely change the way load testing is recorded and played back.

Do real browsers really matter?
Real browsers absolutely matter. There are two major reasons:
1. It simplifies the script creation process by letting you avoid all the complexities and hacks you have to do with traditional load testing tools.
2. It ensures that you’ll see 100% of the traffic and load against your site that a real user would cause.

We’ll look in-depth at each of these topics separately to see how use of real browsers helps and how a service like BrowserMob compares to existing load testing technologies.

Simplifies script creation
In modern web applications, AJAX is just about everywhere. And we’re not necessarily talking about super rich applications like Google Maps or Yahoo Mail; even simple sites like google.com now use advanced AJAX techniques. Google’s auto-complete is a real-world example.

In this case, as the user types into the search box, the web browser executes JavaScript logic that in turn makes AJAX calls to Google’s search engine, asking for search suggestions to display. It does this on every keystroke. This is a standard auto-complete control that most Ajaxian readers are very familiar with.

When recording a script with a traditional load testing tool, one of two things may happen here:
* The recorder will see the AJAX traffic and capture it for playback in the load test
* The recorder will not see the AJAX traffic and will only capture the request made when the user clicks the “submit” button

Obviously these Ajax requests are causing real load, so we want to make sure they get played back in a load test. Let’s assume you’re using a tool, such as JMeter, that does capture the AJAX traffic.

The recorded traffic is effectively:
http://clients1.google.com/complete/search?hl=en&gl=us&q=b
http://clients1.google.com/complete/search?hl=en&gl=us&q=ba
http://clients1.google.com/complete/search?hl=en&gl=us&q=ban
http://clients1.google.com/complete/search?hl=en&gl=us&q=bana
http://clients1.google.com/complete/search?hl=en&gl=us&q=banan
http://clients1.google.com/complete/search?hl=en&gl=us&q=banana

Each keystroke by the user is included in each subsequent search term. Let’s ignore the requirement of validating the results that come back from the AJAX requests for the moment (they are usually in JSON or XML format and difficult to validate using most tools). Instead, let’s just add a twist to the load test requirement for doing searches: the load test must search for 100 different search terms.

Parameterization is a very common requirement, since it ensures that the load is realistic and doesn’t get cached in any unnatural way. This means that now, in addition to searching for the term “banana”, we’re also searching for “apple” and “orange”, among others.

However, this means your script can’t just blindly submit requests to the previously recorded URLs, since those were tied to the “banana” term. Instead, it must request the successive prefixes of whichever search term it is using, such as:

http://clients1.google.com/complete/search?hl=en&gl=us&q=a
http://clients1.google.com/complete/search?hl=en&gl=us&q=ap
http://clients1.google.com/complete/search?hl=en&gl=us&q=app
http://clients1.google.com/complete/search?hl=en&gl=us&q=appl
http://clients1.google.com/complete/search?hl=en&gl=us&q=apple

Unfortunately, this is where even the best traditional load testing tools fall down. They don’t provide any help here, so it’s up to you to write (if it’s even possible in the tool) scripting logic that breaks the randomly selected search term into its successive prefixes and then issues an Ajax request for each one.
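To give a sense of what that hand-written logic involves, here is a minimal Python sketch. It assumes the application sends exactly one suggestion request per keystroke, using the URL pattern recorded above; `SEARCH_TERMS`, `suggestion_urls`, and `urls_for_virtual_user` are illustrative names, and a real test would load its 100 terms from a data file.

```python
import random
from urllib.parse import urlencode

BASE_URL = "http://clients1.google.com/complete/search"

# Illustrative pool; the scenario in the text calls for 100 terms.
SEARCH_TERMS = ["banana", "apple", "orange"]


def suggestion_urls(term):
    """Return one suggestion URL per keystroke: 'a', 'ap', 'app', ..."""
    urls = []
    for end in range(1, len(term) + 1):
        query = urlencode({"hl": "en", "gl": "us", "q": term[:end]})
        urls.append(f"{BASE_URL}?{query}")
    return urls


def urls_for_virtual_user(rng):
    """Pick a random term for one simulated user and expand it."""
    return suggestion_urls(rng.choice(SEARCH_TERMS))


apple_urls = suggestion_urls("apple")
user_urls = urls_for_virtual_user(random.Random())
```

Even this small sketch hard-codes assumptions about the application (the URL pattern, the parameters, one request per keystroke), so it has to be revisited whenever the auto-complete logic changes.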

At this point, you’re basically rewriting the same logic that the web app developer wrote originally. If you’re a QA engineer, this may be difficult since you don’t know all the internal AJAX logic coded into the application. If you’re the developer, it’s still annoying because it’s tedious and likely in a language other than the JavaScript you originally wrote your code in.

So how do real browsers help?
Because BrowserMob uses real browsers to both record and play back load, you don’t have to worry about trying to simulate the logic in a web browser. Instead, all you have to do is record the human interaction with the browser, such as typing in a randomly selected search term. BrowserMob will then pass those instructions on to the hundreds or thousands of browsers participating in the load test, and those browsers will in turn “do the right thing” and issue the proper AJAX requests.

And if the underlying logic, such as the request URL pattern for those AJAX requests, changes? With traditional load testing it’s up to you to detect and fix the problem. If your test uses real browsers to play back the traffic, your script won’t need to change one bit – the new AJAX logic will be run by the browser in real time.

Ensuring realistic playback
We’ve seen how the use of real browsers helps with script creation, but what about playback? As we just learned, using real browsers simplifies the process of recording and shrinks the behavior coded into the script itself. This means we’re letting the real browser (the same type of program your end users will use) make the decisions about what requests to make.

For example, visit http://ebay.com and take note of what the home page displays. Reload the page and you will likely see something different: the upper right section can show a completely different set of images. That’s because eBay’s home page chooses what to display based on complex and multi-variant logic determined at runtime. A load tester is very unlikely to know in advance which images will be displayed on any given request.

It’s true that some load testing tools will try to parse the pages in real time and figure out which images should be displayed, but that’s hardly comforting once you’ve learned they can’t deal with even the simplest Ajax components, as we just saw. And as most AJAX developers know, resources such as images and stylesheets are more and more likely to come from complex JavaScript logic rather than from a simple static reference in an HTML page.
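A small Python sketch shows why parsing the page is not enough. The page below is invented for illustration (the paths and the banner logic are not taken from eBay or any real site), and `StaticResourceParser` is a hypothetical stand-in for the kind of static parsing a load testing tool might do.

```python
from html.parser import HTMLParser


class StaticResourceParser(HTMLParser):
    """Collects resources referenced statically in the HTML markup."""

    def __init__(self):
        super().__init__()
        self.resources = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag in ("img", "script") and attr.get("src"):
            self.resources.append(attr["src"])
        elif tag == "link" and attr.get("href"):
            self.resources.append(attr["href"])


def static_resources(html):
    """Return the resource URLs a static parser can discover."""
    parser = StaticResourceParser()
    parser.feed(html)
    return parser.resources


# Hypothetical page: one image is referenced statically, another is
# chosen by JavaScript at runtime.
SAMPLE_PAGE = """
<html><head><link rel="stylesheet" href="/static/site.css"></head>
<body>
<img src="/static/logo.png">
<script>
  var n = Math.floor(Math.random() * 3);
  var img = document.createElement("img");
  img.src = "/promo/banner-" + n + ".jpg";
  document.body.appendChild(img);
</script>
</body></html>
"""

found = static_resources(SAMPLE_PAGE)
```

The parser finds the stylesheet and the logo but never sees the promo banner, because that URL only exists once the script has run in a browser.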

Instead, the only way to guarantee that every single object (image, JavaScript, AJAX request, advertisement from an ad partner, etc.) gets requested is to use a real web browser during playback. While it is much more resource intensive, it is also a major time saver on both the front end, as scripts are much simpler to write, and the back end, as you can be confident that the most realistic level of load was produced.

So next time you hear of load testing happening on one of your Ajax apps, make sure those doing the testing understand the complexities and difficulties associated with testing a complex web app. Help them be on the lookout for the issues highlighted here.
