These are some of the things we discussed:
- There is no need to tie ourselves to the Rev3 minis for unit tests. Adding another pool of faster machines would let us keep the minis for performance tests only and run the unit tests on faster hardware, without needing identical hardware for every platform. The drawback is the extra maintenance of a third pool of slaves and many more reference images, so we can revisit this in another quarter; there are other short-term options to reduce how long unit test jobs take. The good news is that our infrastructure is now flexible enough to run unit test jobs on different pools of slaves.
- We can shave setup and teardown times. To run our unit tests we have to download the build and the tests, remove everything from the previous run, check out the tools repository to unpack the Mac dmg files, and download the symbols in case the browser crashes. We need to work out which of these steps can be shortened so the test suite itself starts sooner. There is room for significant optimization here; at a quick glance we determined that on Windows we could save between 20% and 30% (a per-step timing sketch follows the list).
- We can investigate whether the test framework itself could be optimized. I don't recall much of the detail, but I believe Bob Moss' team could help us speed up our functional and performance tests. For instance, we could leave it to the framework to download and unpack the symbols only if the build crashed (sketched below).
- Our minis are dual core; how could we take advantage of that? Could we run two buildbot instances? Could we hand off two jobs, each in its own thread? There is a lot of experimenting and many technical considerations here, especially the fact that we have to reboot after every run and would have to wait for both jobs to finish (see the sketch below the list).
- We need better tools to determine step times. Imagine if I could tell you that suite A on average wastes X% of its time on platform Y doing setup/teardown, or pinpoint when a spike in test run times appeared. I saw our new intern Syed playing with SQL queries yesterday to answer some of these questions. Happy to see this happening :) (a query sketch is included below).
- Quick format instead of remove. The step that removes the previous build and tests can take a few minutes on Windows, which is far too long. Instead we could quick-format the drive where these get unpacked, which should be very fast. Here is the bug where the investigation is to happen. This can also help make our Talos times more reliable (a rough comparison is sketched after the list).
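For the setup/teardown point above, a minimal sketch of how we could time each step of a job to see where the minutes actually go. The step names and functions are placeholders, not our real buildbot steps; only the measurement pattern matters:

```python
# Time each setup/teardown step of a unit test job and report its share.
import time
from contextlib import contextmanager

durations = {}

@contextmanager
def timed(step_name):
    start = time.time()
    try:
        yield
    finally:
        durations[step_name] = time.time() - start

def run_job(steps):
    # steps: list of (name, callable) pairs mirroring the job's setup/teardown.
    for name, func in steps:
        with timed(name):
            func()
    total = sum(durations.values())
    for name, secs in sorted(durations.items(), key=lambda kv: -kv[1]):
        print("%-22s %6.1fs  (%4.1f%% of job)" % (name, secs, 100.0 * secs / total))

if __name__ == "__main__":
    # Placeholder steps; a real job would download the build and tests,
    # clean the previous run, check out the tools repo, fetch symbols, etc.
    run_job([
        ("download build", lambda: time.sleep(0.1)),
        ("download tests", lambda: time.sleep(0.1)),
        ("clean previous run", lambda: time.sleep(0.2)),
    ])
```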
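For the test framework point, a sketch of downloading and unpacking the symbols lazily, only once a crash has actually been detected. The harness hooks (a run_tests callable returning a crashed flag, the symbols URL) are assumptions, not the real harness API:

```python
# Fetch crash symbols only when they are needed, instead of on every run.
import tempfile
import urllib.request
import zipfile

def fetch_symbols(symbols_url, dest_dir):
    """Download and unpack the crashreporter symbols zip into dest_dir."""
    zip_path, _ = urllib.request.urlretrieve(symbols_url)
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(dest_dir)
    return dest_dir

def run_suite(run_tests, symbols_url):
    # run_tests() stands in for whatever launches the browser tests;
    # it is assumed to return (exit_code, crashed).
    exit_code, crashed = run_tests()
    if crashed:
        # Only now pay the download/unpack cost, to process the minidumps.
        symbols_dir = fetch_symbols(symbols_url, tempfile.mkdtemp())
        print("crash detected, symbols unpacked to", symbols_dir)
    return exit_code
```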
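For the dual-core point, a sketch of handing two jobs to a mini at once and waiting for both before the reboot. The commands are placeholders; the point is that the wall time is gated by the slower of the two jobs:

```python
# Run two test jobs side by side on a dual-core slave.
import subprocess

def run_two_jobs(cmd_a, cmd_b):
    proc_a = subprocess.Popen(cmd_a)
    proc_b = subprocess.Popen(cmd_b)
    # The slave cannot reboot until *both* jobs are done, so the saving over
    # running them back to back is min(t_a, t_b), not t_a + t_b.
    return proc_a.wait(), proc_b.wait()

# Example with placeholder commands:
# run_two_jobs(["python", "runtests.py", "--suite=mochitest"],
#              ["python", "runtests.py", "--suite=xpcshell"])
```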
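For the step-times point, a sketch of the kind of query Syed has been playing with, assuming a hypothetical step_times table with per-step durations recorded per suite and platform:

```python
# Report how much of each suite's time goes to setup/teardown per platform.
import sqlite3

QUERY = """
SELECT suite, platform,
       100.0 * SUM(CASE WHEN step IN ('setup', 'teardown') THEN seconds ELSE 0 END)
             / SUM(seconds) AS overhead_pct
FROM step_times
GROUP BY suite, platform
ORDER BY overhead_pct DESC
"""

def overhead_report(db_path):
    # A similar query grouped by date would show when a spike in run times appeared.
    conn = sqlite3.connect(db_path)
    for suite, platform, pct in conn.execute(QUERY):
        print("%-12s %-10s %5.1f%% spent on setup/teardown" % (suite, platform, pct))
    conn.close()
```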
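For the quick-format point, a rough comparison of today's recursive removal with a quick format of a dedicated partition. The drive letter, the dedicated partition itself, and the way format's prompts are answered are all assumptions to be verified in the bug:

```python
# Compare "delete the previous run" with quick-formatting a dedicated partition.
import shutil
import subprocess
import time

def clean_by_removal(test_dir):
    # Today's approach: recursive delete, which can take a few minutes on Windows.
    shutil.rmtree(test_dir, ignore_errors=True)

def clean_by_quick_format(drive_letter="E:"):
    # Quick-format a partition that holds only the unpacked builds and tests.
    # format.com prompts for the volume label and a Y/N confirmation; feeding
    # them via stdin here is an assumption to check on a real slave.
    subprocess.run(["format", drive_letter, "/FS:NTFS", "/Q", "/V:tests"],
                   input="\nY\n", text=True, check=True)

def time_step(func, *args):
    start = time.time()
    func(*args)
    return time.time() - start
```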
This work by Zambrano Gasparnian, Armen is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License.
One little math experiment I did a while back in bug 489333 may be interesting, too. Basically, you can calculate the optimal number of slaves for a parallelizable task if there's a constant setup time per slave.
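For reference, one plausible form of that calculation, assuming the work parallelizes perfectly and each extra slave adds a fixed, non-overlapping setup cost (the actual model and numbers live in bug 489333):

```python
# Minimal sketch: with total work W and a constant setup cost c per slave,
#   T(n) = c*n + W/n,  dT/dn = c - W/n**2 = 0  =>  n* = sqrt(W/c)
import math

def optimal_slaves(total_work_minutes, setup_minutes_per_slave):
    n = math.sqrt(total_work_minutes / setup_minutes_per_slave)
    # The optimum is rarely an integer, so check both neighbours.
    wall_time = lambda k: setup_minutes_per_slave * k + total_work_minutes / k
    return min((max(1, math.floor(n)), math.ceil(n)), key=wall_time)

# e.g. 120 minutes of tests with 5 minutes of setup per slave -> 5 slaves
```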
Just something that comes to mind: could it save time if you download and unpack the packages in one go? Like, wget -O- | tar -jx- ?
Both very interesting points. Thanks for your input!
First, as you said, we can try to make the setup/teardown time more constant by optimizing it. One difference is that we don't clobber on the minis as we do on the builders, since there are hundreds of GBs available. We should optimize the removal of the previous run, or move it out of the way instead.
The wget/tar combination is nicely out of the box :). There are other ideas lurking around in the bugs about extracting only the subset of tests we need.
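For what it's worth, the same single-pass idea expressed in Python, streaming the archive straight into the extractor; the URL is a placeholder and the archive is assumed to be a bzip2 tarball:

```python
# Stream the archive from the network directly into the extractor, so there is
# no separate "download, then unpack" step and no temporary file on disk.
import tarfile
import urllib.request

def fetch_and_unpack(url, dest_dir):
    with urllib.request.urlopen(url) as response:
        # Mode "r|bz2" reads the stream sequentially, which is all we need here.
        with tarfile.open(fileobj=response, mode="r|bz2") as archive:
            archive.extractall(dest_dir)
```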
There's a bug on doing crash processing on another server:
https://bugzilla.mozilla.org/show_bug.cgi?id=561754
All the code is basically written; it just needs to be hooked up to buildbot. I think catlee started looking at it but ran out of time.
Running the tests on faster hardware (but the same OS environment) is just plain smart. Why didn't anyone think of that before?
We should look into fixing our test suites so that they can run some tests in parallel. It's a good idea in general.
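A rough sketch of what that could look like inside a harness, assuming the tests in a suite are independent of each other; test discovery and the single-test runner are placeholders:

```python
# Run independent tests from one suite concurrently instead of one by one.
# The real win depends on the tests not sharing profiles, ports or other
# global state.
from concurrent.futures import ThreadPoolExecutor

def run_suite_in_parallel(test_paths, run_single_test, workers=2):
    # run_single_test(path) is assumed to return True on pass, False on failure.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(run_single_test, test_paths))
    failures = [path for path, ok in zip(test_paths, results) if not ok]
    return failures
```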