Tue 22 February 2022

Web app test automation with `cdt`

Codethink recently created a simple command line tool to help with automated, script-driven testing of web apps, and we have released it as open source.

Modern cars contain computers to present slick displays showing everything from car information screens and maps to music players and phone books. Often they support voice commands as well as touch screen interaction. These are generally referred to as In-Vehicle Infotainment (IVI) systems.

One of our recent projects was to improve the performance of an IVI system for a large automotive company. Our goals were to make startup faster, make interactions such as scrolling smoother, make switching between applications faster, and make certain key applications load more quickly.

Browser-based user interfaces

The system we were optimising featured a sleek-looking user interface, multiple screens, and a large assortment of different applications and info panes.

The interface was implemented as a web application running inside a browser. This is a good choice because web development is a highly mature technology, and an enormous amount of effort has gone into making modern browser engines as efficient as possible.

However, it posed a problem for us, because we wanted to do rigorous automated performance testing against the vanilla system supplied to us, and against a system running whatever changes we made. Anything involving touching the physical screen would make the process far too manual, error-prone and time-consuming.

What we needed was a way to do all the necessary interaction from the command line, so that we could write scripts to run various benchmarks and tests automatically. This would enable a rigorous testing process that would let us track performance changes and compare different optimisations directly.

Chrome DevTools Protocol

Browsers such as Chrome and Firefox have inspectors that allow users to manipulate the page, and interact with the JavaScript console. Much of this functionality is available through an API that works by sending JSON messages over a websocket. This is called the Chrome DevTools Protocol. It allows other processes to connect to a browser session and carry out particular actions.

Note: This isn't limited to Blink-based browsers like Chrome. Firefox nightly also has a partial implementation of the Chrome DevTools Protocol.

For this project we initially wanted to be able to simulate tapping the screen at a given coordinate. This uses the Input.dispatchTouchEvent method.

{
    "id":1,
    "method":"Input.dispatchTouchEvent",
    "params":{
        "type":"touchStart",
        "touchPoints":[
            {"x":100,"y":100}
        ]
    }
}

Each message has an ID field, which you can set to a unique number when you make the call. It is returned in the response, enabling you to match responses to their original calls, when there may be multiple messages in flight at any given time. The above message would instruct the browser to pretend the screen had been touched at a display coordinate of (100, 100).
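As a rough illustration of the ID mechanism, the Python sketch below builds messages with unique IDs and then pairs out-of-order responses with the calls that produced them. The helper names (`make_message`, `tap_messages`, `match_responses`) are hypothetical, and composing a tap from a `touchStart` followed by a `touchEnd` is an assumption about how a tap might be simulated, not a description of cdt's internals.

```python
import itertools
import json

_next_id = itertools.count(1)  # source of unique message IDs


def make_message(method, params):
    """Build one protocol message with a fresh, unique ID."""
    return {"id": next(_next_id), "method": method, "params": params}


def tap_messages(x, y):
    """Return a touchStart/touchEnd pair (assumed shape of a simulated tap)."""
    start = make_message(
        "Input.dispatchTouchEvent",
        {"type": "touchStart", "touchPoints": [{"x": x, "y": y}]},
    )
    end = make_message(
        "Input.dispatchTouchEvent",
        {"type": "touchEnd", "touchPoints": []},
    )
    return [start, end]


def match_responses(sent, responses):
    """Map each response ID to the method name of the call it answers."""
    pending = {msg["id"]: msg["method"] for msg in sent}
    return {
        resp["id"]: pending[resp["id"]]
        for resp in responses
        if resp.get("id") in pending
    }


sent = tap_messages(100, 100)
wire = [json.dumps(msg) for msg in sent]  # text that would go over the websocket

# Made-up responses, deliberately in reverse order: the ID still identifies
# which call each one belongs to.
replies = [
    {"id": sent[1]["id"], "result": {}},
    {"id": sent[0]["id"], "result": {}},
]
matched = match_responses(sent, replies)
```

The sketch stops short of opening a websocket; it only shows the message shapes and the bookkeeping that the ID field makes possible.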

Note: When run with the command line argument --remote-debugging-port=9222 Chrome will expose the Chrome DevTools Protocol, and you will see a list of available browsing contexts at http://localhost:9222/json.
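For illustration, here is a minimal Python sketch of reading that list and picking out a websocket endpoint to connect to. The function names are hypothetical, and the sample data is made up; it only mirrors the `type` and `webSocketDebuggerUrl` fields that Chrome includes in each entry.

```python
import json
import urllib.request


def fetch_contexts(host="localhost", port=9222):
    """Download the list of browsing contexts from the debugging port."""
    with urllib.request.urlopen(f"http://{host}:{port}/json") as response:
        return json.load(response)


def websocket_url(contexts, context_type="page", index=0):
    """Return the websocket URL of the index-th context of the given type."""
    matching = [c for c in contexts if c.get("type") == context_type]
    return matching[index]["webSocketDebuggerUrl"]


# Made-up listing in the shape returned by the /json endpoint.
sample = [
    {
        "id": "A1",
        "type": "page",
        "title": "Home",
        "url": "http://example.com/",
        "webSocketDebuggerUrl": "ws://localhost:9222/devtools/page/A1",
    },
    {
        "id": "B2",
        "type": "iframe",
        "title": "Widget",
        "url": "http://example.com/widget",
        "webSocketDebuggerUrl": "ws://localhost:9222/devtools/page/B2",
    },
]

first_page = websocket_url(sample)
```

Against a live browser, `websocket_url(fetch_contexts())` would give the endpoint for the first page, assuming the browser was started with the debugging port enabled.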

cdt: Chrome DevTools Tool

We have released cdt as an open source project. It is a tool for driving browser interactions from the command line and it was written in C using the Libwebsockets library.

It was made quickly, so that we could get meaningful performance information about the changes we were making for the real work of the project as early as possible. So please note that it is a bit rough in places and currently exposes only a tiny fraction of the Chrome DevTools Protocol's functionality.

That said, we found it very useful already, so it might be useful to others too. It also serves as an example of how to connect to a Chrome DevTools Protocol websocket from C using Libwebsockets.

One nice thing about cdt is that it enables remote access, which is perfect for remote working and distributed teams. (cdt was developed and tested against an automotive bench rig in our office, from home in another part of the country.)

What does it do?

Each of the browser's browsing contexts (think "tab", "window", "iframe", etc) has its own Chrome DevTools Protocol endpoint. The first thing cdt does is try to connect to one. In cdt these are called displays, and you can specify the one you want to connect to.

cdt also takes a command. The command carries out some action against the remote Chrome DevTools Protocol endpoint.

The currently available commands for test automation are:

| Command | Function |
|---|---|
| tap | Issues touch events to simulate tapping at given coordinate |
| run | Runs the supplied JavaScript script on the remote |
| swipe | Synthesizes a scroll gesture over a time period |
| run-log | Runs the supplied JavaScript on remote, capturing console.log |
| screencast | Fetches continuous screenshots and saves locally |
| screenshot | Fetches screenshot of the remote and saves locally |

This simple set of commands allowed us to build richer sets of tests as Bash and Python scripts.

Sometimes, while developing the automated tests, it is necessary to do some manual interaction with the remote web app. This is supported by cdt using the additional sdl command, which launches an interactive graphical display of the remote application, and allows touch interaction. This is useful for determining which coordinates to include in automated scripts in order to tap specific things. This was implemented using Simple DirectMedia Layer (SDL).

| Command | Function |
|---|---|
| sdl | Interactive front end that renders display and supports touch |

How did we use it?

This is a generic tool for interacting with web applications, so in order to write specific tests for the devices we were working with, we wrote shell scripts and Python scripts to drive cdt.

For example, we could use the tap and swipe commands to ensure a known starting state, and then start a screencast, and tap to launch a particular application. Each recorded screencast frame contains a creation timestamp from the server end. This allows us to know when things happen, with no need to worry about network latency between us and the device being tested. We used ImageMagick to compare screenshots to reference screenshots (with masking to remove areas like the current time).
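To show why server-side timestamps are enough for timing, here is a small Python sketch that measures how long the display took to settle, using only the timestamps carried by the frames. The frame layout (a `data` field plus `metadata.timestamp` in seconds) mirrors the protocol's `Page.screencastFrame` event; how cdt stores frames on disk is not covered here, and the `settle_time` helper and the sample frames are hypothetical.

```python
def settle_time(frames):
    """Seconds from the first frame to the last frame whose image changed.

    Only timestamps recorded on the device are used, so network latency
    between the test machine and the device does not affect the result.
    """
    start = frames[0]["metadata"]["timestamp"]
    last_change = start
    previous = frames[0]["data"]
    for frame in frames[1:]:
        if frame["data"] != previous:
            last_change = frame["metadata"]["timestamp"]
            previous = frame["data"]
    return last_change - start


# Made-up frames: the image stops changing after the third frame.
frames = [
    {"data": "AAAA", "metadata": {"timestamp": 10.0}},
    {"data": "BBBB", "metadata": {"timestamp": 10.25}},
    {"data": "CCCC", "metadata": {"timestamp": 10.5}},
    {"data": "CCCC", "metadata": {"timestamp": 11.0}},
]

elapsed = settle_time(frames)
```

A real test would compare decoded images with a mask rather than raw frame data, since areas such as a clock change on every frame.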

This let us build rich and rigorous tests that were specific to the project we were tasked with. These tests could remotely restart the rigs, in order to test from a cold start, and collect various benchmarks.

Since this tooling allowed us to collect a lot of data, the scripting we wrote around it allowed a lot of metadata to be recorded with the benchmark recordings. This let us state why any given test was run, what it should be compared to, what it was testing, etc. All the resulting recording files were output in YAML, and saved in git.

The raw recording files were huge and not much fun to read, so in addition to the tooling to run the tests, we wrote tooling to take those recording files and translate them into individual reports, or comparison reports, comparing any number of different recordings. These used Python's Jinja templating engine to output in a pleasant format.

Future work

As stated above, cdt was thrown together hastily, so there is quite a bit more that could be done with it.

One idea is to support more of the Chrome DevTools Protocol, because at the moment only a tiny sliver of it is supported.

As an example, in order to fetch the console.log messages, the run-log command does some crude hackery which re-uses the already implemented Runtime.evaluate method, but should really be using the proper Log functionality.
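As a hedged sketch of what an event-based approach could look like, the Python below separates incoming messages into call responses (which carry an `id`) and log events (which carry a `method` but no `id`). It assumes the relevant domains have already been enabled with `Log.enable` and `Runtime.enable`, and it accepts both `Log.entryAdded` and `Runtime.consoleAPICalled` events, since which one carries a given message can vary. The `collect_log_entries` helper and the sample messages are hypothetical.

```python
import json


def collect_log_entries(raw_messages):
    """Split incoming messages into call responses and readable log lines."""
    responses = {}
    log_lines = []
    for raw in raw_messages:
        message = json.loads(raw)
        if "id" in message:
            # A response to one of our own calls.
            responses[message["id"]] = message
        elif message.get("method") == "Log.entryAdded":
            entry = message["params"]["entry"]
            log_lines.append(f'{entry["level"]}: {entry["text"]}')
        elif message.get("method") == "Runtime.consoleAPICalled":
            params = message["params"]
            # Each argument is a remote object; primitives carry a "value".
            text = " ".join(
                str(arg.get("value", arg.get("description", "")))
                for arg in params["args"]
            )
            log_lines.append(f'{params["type"]}: {text}')
    return responses, log_lines


# Made-up traffic: one call response and two log events.
raw = [
    json.dumps({"id": 1, "result": {}}),
    json.dumps({
        "method": "Runtime.consoleAPICalled",
        "params": {
            "type": "log",
            "args": [
                {"type": "string", "value": "loaded"},
                {"type": "number", "value": 3},
            ],
        },
    }),
    json.dumps({
        "method": "Log.entryAdded",
        "params": {
            "entry": {
                "source": "network",
                "level": "error",
                "text": "Failed to load resource",
            }
        },
    }),
]

responses, log_lines = collect_log_entries(raw)
```

The point of the design is that log output arrives as unsolicited events, so the caller no longer needs to wrap scripts in extra JavaScript to capture it.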

Other areas that could be improved include the JSON message handling, and the internal representation for messages.

Also, it would be interesting to look at other applications for the tool. One example is the openQA project, which currently works with a VNC backend. It needs a method for sending touch events and a method for receiving the framebuffer, so this seems like a good match, and perhaps there is potential for a cdt backend in the future. Codethink has set up an openQA instance for testing, so there is potential for cdt to be included there, to add new types of tests.
