Armen Zambrano's battlefield: Usability improvements for Firefox automation initiative

Tuesday, July 19, 2016

Usability improvements for Firefox automation initiative - Status update #1

The developer survey conducted by Engineering Productivity last fall indicated that debugging test failures that are reported by automation is a significant frustration for many developers. In fact, it was the biggest deficit identified by the survey. As a result, the Engineering Productivity Team (aka A-Team) is working on improving the user experience for debugging test failures in our continuous integration and speeding up the turnaround for Try server jobs.

This quarter’s main focus is on:

Debugging tests on interactive workers (only Linux on TaskCluster)
Improving end-to-end times on Try (Thunder Try project)

For all bugs and priorities, see the project management page:

https://wiki.mozilla.org/EngineeringProductivity/Projects/Debugging_UX_improvements

In this email you will find the progress we’ve made recently. In future updates you will see a delta from this email.

PS: These status updates will be sent fortnightly.

Debugging tests on interactive workers

Accomplished recently:

Landed support for running reftest and xpcshell via tests.zip
Many UX improvements to the interactive loaner workflow

Upcoming:

Make sure Xvfb is running so you can actually run the tests!
Mochitest support + all other harnesses

Thunder Try - Improve end-to-end times on Try

Project #1 - Artifact builds on automation

Tracking bug: https://bugzilla.mozilla.org/show_bug.cgi?id=1284882

Accomplished recently:

Landed prerequisites for Windows and OS X artifact builds on try.
Identified which tests should be skipped with artifact builds

Upcoming:

Provide a try syntax flag to trigger only artifact builds instead of full builds, starting with opt Linux 64.

Project #2 - S3 Cloud Compiler Cache

Tracking bug: https://bugzilla.mozilla.org/show_bug.cgi?id=1280641

Accomplished recently:

The Rust rewrite of sccache has reached feature parity with the Python version of sccache
Now testing sccache2 on Try

Upcoming:

We want to roll out a two-tier sccache for Try, which will enable it to benefit from cache objects from integration branches
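To make the two-tier idea concrete, here is a minimal sketch of a two-tier cache lookup. It is not sccache's actual design or API: the class name TwoTierCache, the use of plain dicts in place of S3 buckets, and the assumption that the integration-branch tier is read-only from Try's point of view are all hypothetical, chosen only for illustration.

```python
class TwoTierCache:
    """Hypothetical two-tier lookup: a writable Try tier backed by a
    read-only tier holding objects produced on integration branches."""

    def __init__(self, try_tier, integration_tier):
        # Both tiers are dict-like stand-ins for real storage (e.g. S3 buckets).
        self.try_tier = try_tier
        self.integration_tier = integration_tier

    def get(self, key):
        # Check the Try tier first, then fall back to the integration tier.
        if key in self.try_tier:
            return self.try_tier[key]
        return self.integration_tier.get(key)

    def put(self, key, value):
        # Assumption: Try jobs only ever write to their own tier, so they
        # cannot affect the cache used by integration branches.
        self.try_tier[key] = value
```

With this layout, a Try push that compiles a file unchanged from an integration branch would hit the second tier instead of recompiling, while objects written by Try stay confined to the Try tier.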

Project #3 - Metrics

Tracking bug: https://bugzilla.mozilla.org/show_bug.cgi?id=1286856

Accomplished recently:

Preliminary analysis of job data from Treeherder ( http://nbviewer.jupyter.org/url/people.mozilla.org/%7Ewlachance/try%20analysis.ipynb ), looking at the following questions:

Which jobs finish last?
Which jobs have the highest wait times?
Which jobs have the longest total wall clock time (i.e. are the largest consumers of resources)?
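The three questions above could be answered roughly as follows. This is a sketch only and is not taken from the linked notebook: the Job record and its fields (name, push_id, submitted, started, finished, all times in seconds) are hypothetical and do not reflect Treeherder's actual schema.

```python
from dataclasses import dataclass


@dataclass
class Job:
    # Hypothetical record; field names are assumptions, not Treeherder's schema.
    name: str
    push_id: int
    submitted: float
    started: float
    finished: float


def wait_time(job):
    """Seconds the job spent queued before it started running."""
    return job.started - job.submitted


def run_time(job):
    """Seconds of wall clock time the job spent running."""
    return job.finished - job.started


def last_finisher_per_push(jobs):
    """Map each push to the job that finished last on it."""
    last = {}
    for job in jobs:
        current = last.get(job.push_id)
        if current is None or job.finished > current.finished:
            last[job.push_id] = job
    return last


def total_by_name(jobs, metric):
    """Sum a metric per job name, largest total first."""
    totals = {}
    for job in jobs:
        totals[job.name] = totals.get(job.name, 0.0) + metric(job)
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)
```

Calling total_by_name(jobs, wait_time) ranks job types by time spent waiting, and total_by_name(jobs, run_time) ranks them by total wall clock time consumed.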

Upcoming:

Putting Mozharness steps’ data inside Treeherder’s database for aggregate analysis

Other

Upcoming:

TaskCluster Linux builds currently run on a mix of m3/r3/c3 2xlarge AWS instances, depending on pricing and availability. We plan to assess how more powerful AWS instance types affect build speeds, as one potential way of reducing end-to-end (e2e) Try times.