Selenium 2.0 brings simpler design, better API

Selenium RC became the de facto standard for automated testing when it was introduced 11 years ago, and now, with WebDriver added, Selenium 2.0 is the new de facto standard. The reasons Selenium is preferred are clear: it is open source, supports multiple programming languages, and has a large user community.

Selenium WebDriver is the next evolutionary step after Selenium RC. The new tool interacts with the browser in a completely different way, calling browser commands directly through the browser’s native API. The functions and methods involved differ from browser to browser, but the point is that WebDriver’s calls more closely resemble a real user’s actions. Other improvements include:

  • The WebDriver interface is simpler and more meaningful.
  • WebDriver has a more compact and object-oriented API.
  • WebDriver controls the browser more effectively and fixes some of the limitations of Selenium RC, such as handling pop-ups, dialogs, and file uploads and downloads.

Three software components needed to work with WebDriver

  1. The browser you want to automate. This can be any real browser installed on an OS, with its own configuration, custom or default. (In fact, WebDriver can also work with fictional browsers, but that will be covered in a later blog post.)
  2. A driver to control the browser. Technically, the driver is a web server that launches and controls the browser by sending commands to it. Every browser has its own driver. Here is a link to a list of available drivers.
  3. Scripts/tests containing the commands for the browser’s driver. These use the Selenium WebDriver bindings (compiled, ready-to-use libraries), which are available in different languages.

All client implementations of WebDriver that communicate with the browser or with a RemoteWebDriver server use a common wire protocol, which defines a RESTful web service using JSON over HTTP. A client implementation of WebDriver offers an object-oriented interface, whereas the JSON Wire Protocol is a much flatter API made up of command requests and their responses. This architecture has allowed the driver for each real browser to be developed independently of the others. Moreover, driver development became something that could be handed over to the browser vendors.
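To make the "flat" nature of the wire protocol concrete, here is a minimal sketch in Python that builds (but does not send) the HTTP requests behind a few common commands. The helper names (`wire_request`, `new_session`, `navigate`, `get_title`, `quit_session`) and the `BASE_URL` value are assumptions made for illustration; a real driver may listen on a different host and port. The paths and JSON payloads follow the JSON Wire Protocol as commonly documented.

```python
import json
from urllib.request import Request

# Assumed address of a locally running driver server (hypothetical value;
# 9515 is the port ChromeDriver usually listens on by default).
BASE_URL = "http://localhost:9515"


def wire_request(method, path, payload=None):
    """Build, but do not send, the HTTP request for one wire-protocol command."""
    data = None if payload is None else json.dumps(payload).encode("utf-8")
    headers = {} if data is None else {"Content-Type": "application/json; charset=utf-8"}
    return Request(BASE_URL + path, data=data, headers=headers, method=method)


def new_session(browser_name):
    # POST /session asks the driver to start a browser and open a session.
    return wire_request("POST", "/session",
                        {"desiredCapabilities": {"browserName": browser_name}})


def navigate(session_id, url):
    # POST /session/:sessionId/url tells the browser to load a page.
    return wire_request("POST", f"/session/{session_id}/url", {"url": url})


def get_title(session_id):
    # GET /session/:sessionId/title reads the current page title.
    return wire_request("GET", f"/session/{session_id}/title")


def quit_session(session_id):
    # DELETE /session/:sessionId closes the browser and ends the session.
    return wire_request("DELETE", f"/session/{session_id}")


if __name__ == "__main__":
    # "abc123" stands in for the session id a real driver would return.
    for req in (new_session("chrome"),
                navigate("abc123", "https://example.com"),
                get_title("abc123"),
                quit_session("abc123")):
        print(req.get_method(), req.full_url, req.data)
```

Each object-oriented call in a client binding, such as asking a driver object to open a page, ultimately boils down to one request of this kind, which is why the same client can talk to any driver that understands the protocol.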

At the moment, the JSON Wire Protocol is a working draft of a standard, while WebDriver is a de facto standard for functional automated testing. ChromeDriver is now developed and supported by the Chrome browser development team and FirefoxDriver by the Firefox browser development team, and even Microsoft has recently provided a driver for its new Edge browser.

On the other hand, once you have a universal protocol and a client that works over it, you can use them with any backend that performs automated actions. For instance, Appium is a tool for mobile automation and Winium is a tool for automating Windows applications, but both use the WebDriver JSON Wire Protocol and client. There are even more exotic variants, such as QtWebdriver, which allows users to automate Qt applications using the WebDriver client and API.
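The idea that any backend can sit behind the protocol can be sketched with a toy example. The `ToyBackend` class below is purely hypothetical and is not how Appium, Winium, or QtWebdriver are implemented; it skips HTTP entirely and only shows that a backend needs to map a handful of method-and-path pairs to its own actions and reply in the protocol's `sessionId`/`status`/`value` response shape. Status `0` means success, and `6` and `9` are the protocol's codes for a missing session and an unknown command.

```python
import re
import uuid


class ToyBackend:
    """Hypothetical in-process backend that answers a few wire-style commands."""

    def __init__(self):
        self.sessions = {}  # session id -> current "location" of the automated app

    def handle(self, method, path, body=None):
        # POST /session: create a new session and echo the requested capabilities.
        if method == "POST" and path == "/session":
            session_id = uuid.uuid4().hex
            self.sessions[session_id] = None
            return self._reply(session_id, 0, (body or {}).get("desiredCapabilities", {}))

        match = re.fullmatch(r"/session/([^/]+)(/.*)?", path)
        if not match or match.group(1) not in self.sessions:
            return self._reply(None, 6, {"message": "no such session"})

        session_id, suffix = match.group(1), match.group(2)
        if suffix == "/url" and method == "POST":
            # A real backend would drive a browser, phone, or desktop app here.
            self.sessions[session_id] = body["url"]
            return self._reply(session_id, 0, None)
        if suffix == "/url" and method == "GET":
            return self._reply(session_id, 0, self.sessions[session_id])
        if suffix is None and method == "DELETE":
            del self.sessions[session_id]
            return self._reply(session_id, 0, None)
        return self._reply(session_id, 9, {"message": "unknown command"})

    @staticmethod
    def _reply(session_id, status, value):
        # Every response uses the same flat envelope.
        return {"sessionId": session_id, "status": status, "value": value}


if __name__ == "__main__":
    backend = ToyBackend()
    sid = backend.handle("POST", "/session", {"desiredCapabilities": {"app": "calc"}})["sessionId"]
    backend.handle("POST", f"/session/{sid}/url", {"url": "screen://main"})
    print(backend.handle("GET", f"/session/{sid}/url"))
```

Because the client only sees requests and responses, swapping the browser for a mobile device or a desktop application changes nothing on the client side.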

Beyond its openness and flexibility, WebDriver can be easily integrated with different test frameworks and testing tools. It also allows users to build more specialized tools for functional testing, performance testing, web crawling, and more.