Grid and P2P add to automated testing on the cloud
It's a given that testing cloud-based applications before deployment is critical to guaranteeing an application's functionality, security, scalability, and reliability. And with an increasing number of applications to deploy, especially given the trend of adapting all types of applications for mobile devices, effective testing can become a bottleneck. That's where automated cloud app testing comes into play.
This article introduces the effect of adding grid computing and peer-to-peer collaboration to automated testing in the cloud, discusses some of the key concepts of each as they relate to cloud computing, illustrates the considerations for integrating each into automated testing in the cloud, and provides a real-world scenario and software as an example.
But first, a basic backgrounder on grid computing and peer-to-peer functionality:
Defining grid computing
Everyone remembers when the concept of "grid computing" first became popular. It was the early 2000s, when projects like SETI@home and the Human Genome Project harnessed the power of thousands of computers to work on complex problems. Grid computing refers to combining resources from multiple administrative domains in order to reach a common goal; the grid can be considered a distributed system with non-interactive workloads that involve a large number of files.
Grids tend to be more loosely coupled, heterogeneous, and physically dispersed than clusters. A grid can also be constructed on a LAN; in fact, a private cloud is really a kind of virtual private grid.
Defining collaborative peer-to-peer
Another concept that emerged at the same time as grid computing was called peer-to-peer (P2P), immortalized by the short-lived Napster project and later made resoundingly successful by Skype with its collaborative peer-to-peer model. P2P is a distributed application architecture that partitions tasks or workloads between peers (peers being equally privileged participants in the application).
Peers form a peer-to-peer network of nodes. Peers make a portion of their resources, such as processing power, disk storage, or network bandwidth, directly available to other network participants without the need for central coordination by servers or stable hosts. Peers can be suppliers and consumers of resources (in the client-server model, only servers supply and clients consume).
Peer-to-peer systems are often implemented as an abstract overlay network built at the application layer on top of the native or physical network. These overlays are used for peer discovery and indexing, and they are what make a P2P system independent of the underlying network topology. The P2P overlay network consists of all the participating peers as network nodes.
Since nodes share their resources as well as their demands with the overall system in a P2P network, the more nodes that are engaged, the larger the total capacity of the system to serve demands. Also, the decentralized nature of the P2P network means that it is more robust since no single points of failure are present.
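The P2P concepts above can be made concrete with a small sketch. This is an illustrative, hypothetical example (not any real P2P protocol): each peer keeps a neighbor list that forms the overlay, and resource queries are answered by flooding them to neighbors, with no central server or stable host involved.

```python
# Minimal P2P overlay sketch (hypothetical, for illustration only):
# peers hold overlay links, not knowledge of the physical topology,
# and every peer can both supply and consume resources.

class Peer:
    def __init__(self, name, resources=None):
        self.name = name
        self.resources = set(resources or [])
        self.neighbors = []            # overlay links to other peers

    def join(self, bootstrap):
        """Join the overlay via any existing peer (no central server)."""
        self.neighbors.append(bootstrap)
        bootstrap.neighbors.append(self)

    def find(self, resource, ttl=3, seen=None):
        """Flood a query through the overlay; return a supplying peer."""
        seen = seen or set()
        if self.name in seen or ttl < 0:
            return None
        seen.add(self.name)
        if resource in self.resources:
            return self                # this peer acts as a supplier
        for n in self.neighbors:       # this peer acts as a consumer
            found = n.find(resource, ttl - 1, seen)
            if found:
                return found
        return None

# Three peers form an overlay; any peer can reach any resource.
a, b, c = Peer("a", {"cpu"}), Peer("b", {"disk"}), Peer("c", {"bandwidth"})
b.join(a)
c.join(b)
print(a.find("bandwidth").name)   # the query travels a -> b -> c
```

Note that adding a fourth peer adds both its demands and its resources to the system, which is exactly why total capacity grows with the number of nodes.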
Integrating the concepts into a cloud testing environment
So what happens if you integrate both grid computing and collaborative peer-to-peer concepts into the cloud test environment?
* Adding grid functionality gives you the ability to spin up an automated test execution grid on the cloud that can run a single test in a small fraction of the usual time, or run many instances of the same test in that same period. Any data-driven functional test, regardless of the functional test tool being used, can be executed in parallel the same way a large problem like the Human Genome Project is divided among many computers.
* Adding P2P functionality allows the test execution grid to be shared via the browser session with as many collaborators as desired, simply by passing along the link. A built-in chat window lists all the online participants, and those with proper access can run their own tests. (And because we're describing an automated test execution system rather than a manual one, participants rarely need to establish a direct remote desktop protocol (RDP) session with running instances.)
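The grid bullet above can be sketched in a few lines. This is not the Cloud Lab or CapCal code; it is a simplified illustration in which a thread pool stands in for the grid nodes, and run_test is a hypothetical placeholder for one iteration of whatever functional test tool is in use.

```python
# Illustrative sketch: a data-driven test is parallelized by splitting
# its input rows across "grid nodes", simulated here by a thread pool.

from concurrent.futures import ThreadPoolExecutor

def run_test(row):
    """Placeholder for one data-driven test iteration."""
    user, expected = row
    actual = expected              # a real test would drive the app here
    return (user, actual == expected)

def split(rows, nodes):
    """Deal the test-data rows round-robin across the grid nodes."""
    return [rows[i::nodes] for i in range(nodes)]

def run_on_grid(rows, nodes=4):
    """Each node runs its slice in parallel; results are merged."""
    results = []
    with ThreadPoolExecutor(max_workers=nodes) as grid:
        for partial in grid.map(lambda s: [run_test(r) for r in s],
                                split(rows, nodes)):
            results.extend(partial)
    return results

data = [(f"user{i}", "ok") for i in range(12)]
results = run_on_grid(data)
print(all(passed for _, passed in results))  # True
```

With four nodes, the twelve rows finish in roughly the time three would take on one machine, which is the "small fraction of the usual time" effect described above.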
Let's talk about adding these functionalities to the cloud environment and some of the details you should consider with each before we introduce the real-world software that resulted from this integration effort.
Integrating grid functions
What makes grid computing unique is the ability to divide a large task into many smaller ones that can be carried out in parallel. In the classic case of the SETI@home project, this task was about analyzing radio signals from deep space to identify patterns that could indicate the presence of extraterrestrial intelligence. Other well-known examples of grid computing include the sequencing of the human genome and calculating a million digits of pi.
Besides the grid itself, a central server or server cluster for distributing the task and collecting the results is a necessary component of grid computing architecture. Depending on the task being performed, the design and architecture of the server component can vary from simply sending data to and receiving results from the node computers, to performing complex control and synchronization activities, or both. So both the size of the grid itself and the complexity of the task are largely determined by the command and control server.
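One concrete control-and-synchronization duty of such a server can be sketched under simplifying assumptions: in a load test, every agent must be ready before any of them fires, and then all must fire at once. Here a threading.Barrier stands in for the server's start signal; the agent bodies and hit recording are hypothetical placeholders.

```python
# Sketch of server-side synchronization: no agent proceeds until all
# agents are ready, so the load burst starts at the same instant.

import threading

AGENTS = 8
start_gate = threading.Barrier(AGENTS)   # the "go" signal to all agents
hits = []
lock = threading.Lock()

def agent(agent_id):
    # ... an agent would download its task assignment here ...
    start_gate.wait()                    # block until every agent is ready
    with lock:                           # one simulated "hit", collected
        hits.append(agent_id)            # centrally by the server

threads = [threading.Thread(target=agent, args=(i,)) for i in range(AGENTS)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len(hits))  # 8 -- one collected result per agent
```

The distribute/collect half of the server's job is just the sending and receiving of data mentioned above; it is this kind of synchronization that pushes the design toward the complex end of the spectrum.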
Earlier in this decade, I developed a grid computing project called Capacity Calibration. In the project, people around the world downloaded an agent program and allowed their computers to be used for testing the capacity and performance of websites. One notable example of this was testing NASA's website used to stream video of the space shuttle launch. More than 1,500 agents worldwide simulated thousands of hits per second to verify service levels before NASA took the site live.
Capacity Calibration (or CapCal) has since been migrated to the cloud, where the grid itself can be spun up and torn down on demand. Because a considerable amount of control and synchronization is required, in addition to a large amount of data being transferred, running on the cloud makes a huge difference. In particular, the variable network connectivity and availability of random agents on the web made it an extra challenge to coordinate and control massive load tests. With IBM® SmartCloud Enterprise, the same work can be done across different data centers with precise and accurate metrics and 99.99 percent availability.
With my current project, Cloud Lab Grid Automation (which I'll discuss later in the article), the challenge and the approach are similar — how to harness the power of a number of virtual machines and apply them to a common task (in this case, functional test automation). I needed the Cloud Lab server to be able to quickly spin up new instances and add them to the virtual lab, tearing them down when no longer needed.
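The spin-up/tear-down lifecycle just described can be sketched as a small pool manager. The provision() and terminate() functions below are hypothetical stand-ins for a cloud provider's API calls (they are not the IBM SmartCloud Enterprise API), and VirtualLab is an invented name for illustration.

```python
# Sketch of an on-demand virtual lab: grow the instance pool for a big
# test run, shrink it when the capacity is no longer needed.

import itertools

_ids = itertools.count(1)

def provision():
    """Stand-in for a cloud API call that boots a fresh VM instance."""
    return f"vm-{next(_ids)}"

def terminate(instance):
    """Stand-in for a cloud API call that destroys an instance."""
    pass

class VirtualLab:
    def __init__(self):
        self.instances = []

    def grow(self, n):
        """Spin up n new instances and add them to the lab."""
        self.instances += [provision() for _ in range(n)]

    def shrink(self, n):
        """Tear down n instances once they are no longer needed."""
        for _ in range(min(n, len(self.instances))):
            terminate(self.instances.pop())

lab = VirtualLab()
lab.grow(5)                # burst to five test machines for a big run
lab.shrink(3)              # release capacity (and cost) after the run
print(len(lab.instances))  # 2
```

Since instances are paid for while they run, the shrink step matters as much as the grow step: the lab only holds capacity for as long as a test actually needs it.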
With each of these instances being a full-blown desktop Windows® environment, all kinds of interesting possibilities arise. Take, for example, the problem of testing the performance of a legacy client-server application with a thick client written in Java™ Swing or .NET. These are notoriously difficult to test because there is no way to simulate multiple clients on a single machine. With IBM SmartCloud Enterprise, the answer is simple: just spin up multiple machines instead!