This quarter has been a tough one for me. It has been a mix of organizing people and projects, and implementing prototypes. It's easy to forget what you have worked on, especially when plans and ideas change along the way.
Beware! This blog post may be a little unstructured, especially towards the end.
The quarter as a whole
Before the quarter commenced, my main objective was going to be adding TaskCluster jobs to Treeherder. However, that plan quickly changed as GSoC submissions came around: we realized we had two or three candidates interested in helping us, which turned into creating three potential projects. Two of the projects that came out of it are "refactor SETA and enable TaskCluster support" and "add unscheduled TaskCluster jobs to Treeherder." Both of these bring us closer to feature parity with Buildbot. Once this was settled, a lot of conversations happened with the TaskCluster team to make sure that Dustin's work and ours lined up well (he has been refactoring how the "gecko decision task" schedules tasks).
Around this time a project was completed that made creating TaskCluster clients an excellent self-serve experience. This was key for me in reducing the number of times I had to interrupt the TaskCluster team to ask them to adjust my clients.
Also around this time, another change was deployed that allowed developers' credentials to assume almost all of the scopes required to schedule TaskCluster tasks, without an intermediary tool with powerful scopes. This is very useful for creating tools that let developers schedule tasks directly from the command line with their personal credentials. I created some prototypes to prove this concept. Here's a script to schedule a Linux 64 task. Here's the blog post explaining it.
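To give a flavour of the concept, here is a minimal sketch of what such a command-line script can look like using the taskcluster Python client; the worker type, image, and metadata values are illustrative placeholders, not the exact values from my prototype.

import datetime
import taskcluster

# Your personal TaskCluster credentials (client ID and access token)
creds = {'clientId': '...', 'accessToken': '...'}
queue = taskcluster.Queue({'credentials': creds})

now = datetime.datetime.utcnow()
task_id = taskcluster.slugId()
task = {
    'provisionerId': 'aws-provisioner-v1',
    'workerType': 'desktop-test',     # illustrative Linux64 worker type
    'created': taskcluster.stringDate(now),
    'deadline': taskcluster.stringDate(now + datetime.timedelta(hours=24)),
    'payload': {
        'image': 'ubuntu:14.04',
        'command': ['/bin/bash', '-c', 'echo "Hello from a Linux64 task"'],
        'maxRunTime': 600,            # seconds
    },
    'metadata': {
        'name': 'Hello Linux64 task',
        'description': 'Task scheduled from the command line',
        'owner': 'you@example.com',
        'source': 'https://example.com/link-to-your-script',
    },
}

print(queue.createTask(task_id, task))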
During this quarter, Dustin refactored how tasks are scheduled in the gecko decision task. For the project "adding new TaskCluster jobs," this was a risk, since it could have made scheduling tasks either more complicated or impossible without significant changes. After many discussions, it seemed that we were fine to proceed as planned.
Out of these conversations a new idea was born: "action tasks." The beauty of action tasks is that they're atomic units of processing that can make complicated scheduling requests very easy. You can read martianwars' blog post (under "What are action tasks?") to learn more about them. Action tasks are defined in-tree and are used to schedule task labels for a push. The project as originally defined had a very big scope (goal: make Treeherder find the action task definitions and integrate them in the UI), and we hit some technical issues that made me concerned we would hit more (e.g., the limited scopes granted to developers; this is no longer an issue). My focus switched to making pulse_actions requests visible on Treeherder. When switching deliverables, I did not realize that we could have taken just the first part of the project and implemented that. In any case, a reduced scope is being implemented by martianwars: after Dustin's refactoring, we need to put our graph through "optimizations" that determine which nodes should be removed from it. That optimization code lives in-tree, which makes action tasks the right solution, since they run the same in-tree logic.
While working on my deliverables, I also made various discoveries and created several utility projects.
New projects
TaskCluster developer experiments (prototypes):
In this repository I created various prototypes that make scheduling tasks from the command line extremely easy.
- Blog post: Schedule Linux64 tasks from the command line
Replay pulse messages (new package):
This project allows you to dump a Pulse queue into a file. It also allows you to "replay" the messages and process them as if they were coming from a live queue. This was crucial for testing code changes to pulse_actions.
- Blog post: Replay Pulse messages
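The core idea is simple enough to sketch in a few lines; the function names below are hypothetical, not the package's actual API.

import json

def dump_message(message, filepath='messages.json'):
    # Call this from the live Pulse consumer for every message received
    with open(filepath, 'a') as f:
        f.write(json.dumps(message) + '\n')

def replay_messages(callback, filepath='messages.json'):
    # Feed the recorded messages to the same handler the live consumer uses,
    # so code changes can be exercised without connecting to Pulse
    with open(filepath) as f:
        for line in f:
            callback(json.loads(line))

# Example: replay_messages(handle_message), where handle_message(data) is the
# function pulse_actions would normally run for each incoming message.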
Treeherder submitter (new package):
Treeherder submitter is a Python package that makes it very easy to submit jobs to Treeherder. I made pulse_actions submit jobs to Treeherder with this package. I had to write this package because the Treeherder client allowed me to shoot myself in the foot. Various co-workers have written similar code; however, the code I found was not packaged for reuse by others (understandably). This package lets you submit jobs with the minimum amount of data necessary and helps you transition between job states (pending, running, and completed).
Unfortunately, I have not had time to upstream this code, with the end of the quarter being upon me; however, I would like to upstream it if the team is happy with it. On the other hand, Treeherder will soon be switching to a Pulse-based submission model for ingestion, and the Python client might not be used anymore.
- Blog post: Not yet
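To show what "minimum amount of data" and state transitions mean in practice, here is a hypothetical usage sketch; the import, class, and method names are illustrative only and are not the package's documented API.

# Hypothetical usage sketch -- names below are illustrative, not the real API
from treeherder_submitter import TreeherderSubmitter  # hypothetical import

th = TreeherderSubmitter(
    host='treeherder.allizom.org',  # staging instance
    client_id='...',                # Treeherder API credentials
    secret='...',
)

# Submit with the minimum data needed; everything else gets sane defaults
job = th.submit_job(
    repository='try',
    revision='abcdef123456',
    job_name='pulse_actions job',
    state='pending',
)

job.transition('running')                       # when the work starts
job.transition('completed', result='success')   # when the work finishes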
TaskCluster S3 uploader (new package):
This package allows you to upload files to an S3 bucket with a 31-day expiration.
It takes advantage of a cool feature that the TaskCluster team provides: an API where you request temporary S3 credentials for the bucket associated with your TaskCluster credentials. You can then upload to your assigned prefix in that bucket. It is extremely easy!
- Blog post: Not yet
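Here is a rough sketch of the underlying flow, assuming the taskcluster and boto3 libraries; the bucket and prefix values are placeholders for whatever is associated with your credentials.

import boto3
import taskcluster

# Your personal TaskCluster credentials
auth = taskcluster.Auth({'credentials': {'clientId': '...', 'accessToken': '...'}})

bucket = 'some-taskcluster-bucket'  # bucket tied to your credentials (placeholder)
prefix = 'your-assigned-prefix/'    # you may only write under this prefix (placeholder)

# Step 1: ask TaskCluster for temporary S3 credentials for that bucket/prefix
response = auth.awsS3Credentials('read-write', bucket, prefix)
creds = response['credentials']

# Step 2: use the temporary credentials with a regular S3 client
s3 = boto3.client(
    's3',
    aws_access_key_id=creds['accessKeyId'],
    aws_secret_access_key=creds['secretAccessKey'],
    aws_session_token=creds['sessionToken'],
)

# Objects uploaded under the prefix expire after 31 days via the bucket's lifecycle rule
s3.upload_file('report.html', bucket, prefix + 'report.html')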
Discoveries
- Writing tests for a large project can be very time-consuming, especially if it calls for "integration tests"
- Writing mocks and patching functions for services like BuildApi, allthethings.json, TaskCluster, and Treeherder can take a lot of work
- Trying to test a Pulse multi-exchange-based consumer can be hard
- This is probably because it is difficult to write integration tests
- I developed "pulse replay" to help me here; however, I did not create automated tests for each scenario
- Contributors and I don't like writing tests
- I'm glad that when doing reviews I can ask contributors to write tests; otherwise, I don't think we would have what we have!
- Writing tests is not easy, especially integration tests. It takes time to learn how to write them properly.
- It also does not give you the satisfaction of thinking, "I built this feature."
- The good thing about writing tests this quarter is that I finally learned how to write them.
- I also have another post in the works about how to increase test coverage
- I also learned that code written by contributors and reviewed by me does not necessarily end up with the same quality it would have if I had fully focused on it myself. It is not that they don't write superb code; it is that, thanks to my experience with the project, I have more context. I noticed this when I started writing tests, which puts me into an "ideal code" or "big picture" mode. While writing tests I can also spot refactoring opportunities that make the code more maintainable and understandable. Writing tests puts you in a very different mindset from reviewing someone else's code, even though I tried to enter this "maintainable code" mindset while reviewing code for others.
- I've improved my knowledge of writing tests
- I really didn’t have much experience in this before this quarter
- Project management is more interesting and complicated than I thought it could be!
- I started working on project management to improve debugging issues for Try
- Doing project management at Engineering Productivity is something new to all of us. As we begin, we want to take a light-handed approach.
- Project management templates
- I've also created some guidelines and templates on how to do project management for Platform Operations
- This is still a work in progress
- Created an open process to design a team logo
- We ordered some logos and t-shirts for the team and it was a lot of fun!
- I'm glad the process was done as transparently as possible and that people got to vote for their favourite logo
- Blog post: Open Platform Operations logo design
This work by Zambrano Gasparnian, Armen is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License.