Friday, April 22, 2016

The Joy of Automation

This post announces The Joy of Automation YouTube channel. On this channel you can watch presentations about automation work by Mozilla's Platform Operations. I hope other folks besides me will share their videos there.

This follows the idea that mconley started with The Joy of Coding and his livehacks.
At the moment there are only "Unscripted" videos of me hacking away. I hope to do live hacks one day, but for now they're offline videos.

Mistakes I made, in case any Platform Ops members who want to contribute would like to avoid them:

  • Lower the volume of the background music
  • Find a source of music without ads and whose music does not get the video blocked in certain countries (e.g. Germany)
  • Do not record in .flv format since most video editing software does not handle it
  • Add an intro screen so you don't see me hiding OBS
  • Have multiple bugs to work on in case you get stuck on the first one



Creative Commons License
This work by Zambrano Gasparnian, Armen is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License.

Sunday, April 17, 2016

Project definition: Give Treeherder the ability to schedule TaskCluster jobs

This is a project definition that I put up for GSoC 2016. This helps students to get started researching the project.

The main things I cover here are:

  • Background
    • Where we came from, where we are and where we are heading
  • Goal
    • Use case for developers
  • Breakdown of components
    • Rather than all aspects being mixed and not logically separate

NOTE: This project has a few parts that carry risk and could change the implementation. It depends on close collaboration with dustin.


-----------------------------------
Mentor: armenzg 
IRC:   #ateam channel

Give Treeherder the ability to schedule TaskCluster jobs

This work will enable "adding new jobs" on Treeherder to work with pushes lacking TaskCluster jobs (our new continuous integration system).
Read this blog post to learn how the project was built for Buildbot jobs (our old continuous integration system).

The main work for this project is tracked in bug 1254325.

In order for this to work we need the following pieces:

A - Generate data source with all possible tasks

B - Teach Treeherder to use the artifact

  • This will require close collaboration with Treeherder engineers
  • This work can be done locally with a Treeherder instance
  • It can also be deployed to the “staging” version of Treeherder to do tests
  • Alternative mentor for this section: camd

C - Teach pulse_actions to listen for requests from Treeherder

  • pulse_actions is a pulse listener of Treeherder actions
  • You can see pulse_actions’ workflow here
  • Once part B is completed, we will be able to listen for messages requesting certain TaskCluster tasks to be scheduled and we will schedule those tasks on behalf of the user
  • RISK: Depending on whether the TaskCluster actions project is completed on time, we might instead make POST requests to an API

Project definition: SETA re-write

As an attempt to attract candidates to GSoC, I wanted to make sure that the possible projects were achievable rather than lead them down a path of pain and struggle. It also helps me picture the order in which it makes the most sense to accomplish the work.

It was also a good exercise for students: they had a lot to read about the project and had to ask questions about what was not clear.

I want to share this and another project definition in case it is useful for others.

----------------------------------
We want to rewrite SETA to be easy to deploy through Heroku and to support TaskCluster (our new continuous integration system) [0].

Please read this document carefully before starting to ask questions. There is high interest in this project and it is burdensome to have to re-explain it to every new prospective student.

Main mentor: armenzg (#ateam)
Co-mentor: jmaher (#ateam)

Please read jmaher’s blog post carefully [1] before reading any further.

Now that you have read jmaher’s blog post, I will briefly go into some specifics.
SETA reduces the number of jobs that get scheduled on a developer’s push.
A job is every single letter you see on Treeherder. For every developer’s push, a number of these jobs are scheduled.
On every push, Buildbot [6] decides what to schedule depending on the data that it fetched from SETA [7].

The purpose of this project is two-fold:
  1. Write SETA as an independent project that is:
    1. maintainable
    2. more reliable
    3. automatically deployed through Heroku app
  2. Support TaskCluster, our new CI (continuous integration system)

NOTE: The current code of SETA [2] lives within a repository called ouija.

Ouija does the following for SETA:
  1. It has a cronjob which kicks in every 12 hours to scrape information about jobs from every push
  2. It stores the information about jobs (which it grabs from Treeherder) in a database

SETA then queries the database to determine which jobs should be scheduled. SETA chooses jobs that are good at reporting issues introduced by developers. SETA has its own set of tables and adds the data there for quick reference.

Involved pieces for this project:
  1. Get familiar with deploying apps and using databases in Heroku
  2. Host SETA in Heroku instead of http://alertmanager.allizom.org/seta.html
  3. Teach SETA about TaskCluster
  4. Change the gecko decision task to reliably use SETA [5][6]
    1. If the SETA service is not available we should fall back to run all tasks/jobs
  5. Document how SETA works and set up auto-deployments of docs and Heroku
    1. Write automatically generated documentation
    2. Add auto-deployments to Heroku and readthedocs
  6. Add tests for SETA
    1. Add tox/travis support for tests and flake8
  7. Re-write SETA using ActiveData [3] instead of using data collected by Ouija
  8. Make the current CI (Buildbot) use the new SETA Heroku service
  9. Create SETA data for per test information instead of per job information (stretch goal)
    1. On Treeherder we have jobs that contain tests
    2. Tests re-order between those different chunks
    3. We want to run jobs at a per-directory level or per-manifest
  10. Add priorities into SETA data (stretch goal)
    1. Priority 1 gets triggered every time
    2. Priority 2 gets triggered every Y pushes
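
The fallback in step 4.1 can be sketched roughly as follows. This is only an illustration under assumptions: tasks_to_schedule and fetch_seta_tasks are hypothetical names, not part of SETA or the gecko decision task, and the sketch assumes SETA answers with the names of the tasks it wants scheduled.

```python
def tasks_to_schedule(all_tasks, fetch_seta_tasks):
    """Return the tasks to run for a push.

    all_tasks: every task name that could run on the push.
    fetch_seta_tasks: hypothetical callable that asks the SETA service
        which tasks should be scheduled and raises if the request fails.
    """
    try:
        wanted = set(fetch_seta_tasks())
    except Exception:
        # Assumption: any failure means SETA is unavailable, so we fall
        # back to running everything rather than risk skipping coverage.
        return list(all_tasks)
    # Keep the original ordering and only the tasks SETA asked for.
    return [task for task in all_tasks if task in wanted]


def seta_is_down():
    """Stand-in for a fetcher whose request to SETA fails."""
    raise ConnectionError("SETA did not respond")


if __name__ == "__main__":
    every_task = ["build", "mochitest-1", "mochitest-2"]
    print(tasks_to_schedule(every_task, lambda: ["build", "mochitest-2"]))
    print(tasks_to_schedule(every_task, seta_is_down))
```

Passing the fetcher in as an argument keeps the fallback logic testable without a running SETA service.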

[8] http://alertmanager.allizom.org/seta.html



Wednesday, April 13, 2016

Improving how I write Python tests

This post focuses on what I've learned about writing Python tests, using mocks and patching functions properly. This is not an exhaustive post.

What I'm writing now is something I should have learned many years ago as a Python developer. It can be embarrassing to admit it; however, I decided to share this with you since I know it would have helped me earlier in my career, and I hope it might help you as well.

Somebody has probably written about this topic, and if you're aware of a good blog post covering a similar topic please let me know. I would like to see what else I've missed.

Also, if you want to start a Python project from scratch or improve your current one, I suggest you read "Open Sourcing a Python Project the Right Way". Many of the things its author mentions are what I follow for mozci.

This post might also be useful for new contributors trying to write tests for your project.

My takeaway

These are some of the things I've learned

  1. Make running tests easy
    • We use tox to help us create a Python virtual environment, install the dependencies for the project and to execute the tests
    • Here's the tox.ini I use for mozci
  2. If you use py.test learn how to not capture the output
    • Use the -s flag to not capture the output
    • If your project does not print but uses logging instead, add the pytest-capturelog plugin to py.test and it will immediately log for you
  3. If you use py.test learn how to jump into the debugger upon failures
    • Use --pdb to drop into the Python debugger upon failure
  4. Learn how to use @patch and Mock properly

How I write tests

This is what I do:

  • If no tests exist for a module, create the file for it
    • If you're testing module.py create a test called test_module.py
  • If you already have tests but want to add coverage to a function, determine the minimal py.test call that runs only that test or set of tests

@patch properly and use Mocks

What I'm doing now to patch modules is the following:

  • What function are you testing? (aka test subject)
    • Have a look at the function you're adding tests for and list which functions it calls (aka test resources)
  • Which of those test resources do you need to patch?
    • To patch the test resources I use @patch + I change the return_value. You can see an example in test_buildbot_bridge.py. I use two different styles of patching if you're interested
    • I normally patch test resources that hit the network (to keep a controlled environment) or whose replacement makes the test execution faster
    • You can have pieces of code that are shared between tests to avoid duplicating mocking code
  • Determine if you need to Mock objects and function calls
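
Here is a minimal sketch of the steps above. It is not taken from mozci: BuildApi, fetch_jobs and count_failed_jobs are hypothetical names made up for illustration. To keep the example in a single file it uses patch.object; in a real project you would more likely use @patch with the dotted path of the function as seen from the module under test. It uses unittest.mock from the standard library (the separate mock package offers the same API on older Pythons).

```python
from unittest.mock import Mock, patch


class BuildApi:
    """Hypothetical test resource: the real thing would hit the network."""

    def fetch_jobs(self, revision):
        raise RuntimeError("network access is not allowed in unit tests")


def count_failed_jobs(api, revision):
    """Test subject: count the jobs of a revision whose result is 'failed'."""
    jobs = api.fetch_jobs(revision)
    return sum(1 for job in jobs if job["result"] == "failed")


# Shared between tests to avoid duplicating mocking code.
FAKE_JOBS = [{"result": "failed"}, {"result": "success"}, {"result": "failed"}]


# Style 1: decorator. The created mock is passed in as an extra argument.
@patch.object(BuildApi, "fetch_jobs", return_value=FAKE_JOBS)
def test_count_with_decorator(fetch_jobs):
    assert count_failed_jobs(BuildApi(), "abc123") == 2
    fetch_jobs.assert_called_once_with("abc123")


# Style 2: context manager. The patch only lasts for the with block.
def test_count_with_context_manager():
    with patch.object(BuildApi, "fetch_jobs") as fetch_jobs:
        fetch_jobs.return_value = []
        assert count_failed_jobs(BuildApi(), "abc123") == 0


# No patching at all: hand the test subject a Mock object instead.
def test_count_with_mock_object():
    api = Mock()
    api.fetch_jobs.return_value = FAKE_JOBS
    assert count_failed_jobs(api, "abc123") == 2
```

Saved as test_module.py, these run with py.test like any other test. Once a patch ends, BuildApi.fetch_jobs is the original method again.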


The way Mozilla CI tools is designed begs for integration tests; however, I don't think it is worth going beyond unit testing + mocking. The reason is that mozci might not stick around once we have fully migrated from Buildbot, which was the hard part to solve.



Tuesday, April 12, 2016

mozci-trigger now installs with pip install mozci-scripts

If you use mozci from the command line this applies to you; otherwise, carry on! :)

In order to use mozci from the command line you now have to install with this:
pip install mozci-scripts
instead of:
pip install mozci

This helps to maintain the scripts separately from the core library since we can control which version of mozci the scripts use.

All scripts now live under the scripts/ directory instead of the library:
https://github.com/mozilla/mozilla_ci_tools/tree/master/scripts



