One of our core services is Staff Augmentation. Potential clients come to us with multiple reasons to scale their teams:
- to increase their productivity
- to meet tight deadlines
- to add experienced devs to their team
- to diagnose and advise on performance issues
The story I’m about to tell has a bit of everything.
The initial picture
Last year we were approached by a well-established company in the printed advertisement industry. They have a 10-year-old Rails application that serves as the foundation for their business: they use it as a back-office for admin tasks, as a hub for their franchisees to collect information, and even to assemble and design the printed materials.
They came to us because they had recently hired a new CTO with vast experience in other technologies. After assessing the company, he knew they needed experienced Rails devs to help with its low productivity. The rate of fixes and new features released was very low and they wanted to increase the team’s output.
The naive solution to low productivity is simple: just throw more devs at the problem. But this hadn’t worked for them in the past.
When a client engages us for staff augmentation or maintenance services, we like to start by assessing the application and the team so we can provide developers with the right experience for the project. This is what we always assess:
- the source code
- the infrastructure
- the existing team
- the team’s workflow
The code
This assessment is not very deep unless strictly needed. We look for the Ruby version, the Rails version, what gems are used, what services, third-party APIs, etc. We are specifically looking for unconventional things. Rails is a convention-based framework. Each application has its own business rules, scale, and complexity, but after years of working with many Rails codebases and legacy applications, we’ve developed a quick sense of familiarity with the apps we need to maintain that lets us know where things are and how to do them.
On the code side, we found nothing extraordinary. Nothing critical enough to explain such a hit on productivity.
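As an illustration of that first pass, here is a minimal sketch that pulls the Ruby version and a few gem versions out of a Gemfile.lock. The helper name `lockfile_summary` and the sample lockfile are made up for this example (they are not the client’s real data), and the regex-based parsing is an assumption that only covers the common lockfile layout; Bundler’s own parser would be the more robust choice.

```ruby
# Hypothetical helper: summarize the versions recorded in a Gemfile.lock.
# Regex-based sketch that assumes the usual lockfile layout.
def lockfile_summary(lockfile_text, gems_of_interest = %w[rails])
  # The "RUBY VERSION" section holds a line such as "   ruby 2.7.8p225".
  ruby_version = lockfile_text[/^RUBY VERSION\n\s+ruby (\S+)/, 1]

  # Resolved gems are indented by exactly four spaces: "    name (version)".
  # Their dependencies are indented further and are skipped.
  versions = lockfile_text.scan(/^ {4}(\S+) \(([^)]+)\)$/).to_h

  {
    ruby: ruby_version,
    gems: versions.slice(*gems_of_interest),
    gem_count: versions.size
  }
end

# Made-up lockfile excerpt, for illustration only.
SAMPLE_LOCKFILE = <<~LOCK
  GEM
    remote: https://rubygems.org/
    specs:
      devise (4.9.3)
        bcrypt (~> 3.0)
      rails (6.1.7.6)

  RUBY VERSION
     ruby 2.7.8p225
LOCK

p lockfile_summary(SAMPLE_LOCKFILE)
```

A summary like this is only a starting point: it tells us which versions we are dealing with, not whether the code follows Rails conventions.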
The infrastructure
On the infrastructure side, the application was deployed to a bunch of VPSs (Virtual Private Servers) and managed by a single DevOps engineer who had been working on automating the deployment process for a long time. That meant he was the only one who could deploy the application and maintain the infrastructure.
The staging and production applications were on different VPSs, and deploys were done manually. The reason why is a long story that deserves its own post, so I’m not going to go into detail here. It’s enough to know that this was a huge waste of time for everyone and that it was error-prone.
Lastly, staging hadn’t been working for a while and there were several disparities between staging and production.
🚩 First red flag detected
The existing team
The existing team mainly consisted of freelancers who had limited availability (most of them worked part-time or quarter-time). They were also distributed geographically and in different time zones.
They met only once a week. For us, good and fluent communication is key to a healthy collaboration. Meeting weekly, in combination with infrequent communication on Slack, means a lot of wasted time when someone encounters a blocker or needs help understanding a bug or new feature.
Also, it looked like there wasn’t enough communication between the team members to sustain the asynchronicity of the distributed team.
🚩 Second red flag detected
The workflow
We already mentioned that the staging servers were unusable and that communication was lacking. As we expected, this had a direct impact on the workflow and the team’s productivity.
There were a lot of tickets in JIRA, but there was a lack of organization. There were tickets that no longer made sense, epics for people instead of for grouping related tickets, and there were no sprints. So tracking who’s doing what and how long the tasks have been in development or waiting for development was impossible. This was a big problem as it gave everyone the impression that nothing was moving forward.
🚩 Third red flag detected
On the technical side, each of the freelancers had their own workflow and way of working. To try and standardize this, a lot of rules were enforced in the GitHub repository:
- two people had to approve each pull request
- an AI code reviewer was added that would leave comments for the author
This did standardize workflows, but with a downside: merging a pull request was a slow process. And the impact was even worse when paired with the asynchronous, infrequent communication.
🚩 Fourth red flag detected
As mentioned above, the number of JIRA tickets was huge and disorganized, and the workflow for merging pull requests was slow. As you can imagine, the number of open and abandoned pull requests was also huge. There were pull requests that had been open for 2 or 3 years and no longer made sense, or were so far behind the main branch that they were extremely hard to merge. This bottleneck is a clear indication that the workflow wasn’t working out in their favor.
🚩 Fifth red flag detected
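To make the backlog of abandoned pull requests visible, a check along these lines can help. This is a minimal sketch under stated assumptions: the `PullRequest` struct, the `stale_pull_requests` helper, the 90-day threshold, and the sample entries are all hypothetical, and in practice the data would come from the GitHub API rather than being hard-coded.

```ruby
require "date"

# Hypothetical record; real data would be fetched from the GitHub API.
PullRequest = Struct.new(:number, :title, :updated_on, keyword_init: true)

# Returns the pull requests that were last updated more than
# `max_idle_days` before `today`, oldest first.
def stale_pull_requests(pull_requests, today:, max_idle_days: 90)
  cutoff = today - max_idle_days
  pull_requests.select { |pr| pr.updated_on < cutoff }
               .sort_by(&:updated_on)
end

# Made-up sample data, for illustration only.
SAMPLE_PULL_REQUESTS = [
  PullRequest.new(number: 101, title: "Fix invoice totals", updated_on: Date.new(2024, 5, 20)),
  PullRequest.new(number: 42,  title: "New template editor", updated_on: Date.new(2021, 11, 3)),
  PullRequest.new(number: 77,  title: "Upgrade PDF library", updated_on: Date.new(2022, 8, 15))
].freeze

stale = stale_pull_requests(SAMPLE_PULL_REQUESTS, today: Date.new(2024, 6, 1))
stale.each { |pr| puts "##{pr.number} #{pr.title} (last updated #{pr.updated_on})" }
```

The threshold is a judgment call; what matters is that someone reviews the resulting list and decides to close, rebase, or finish each entry.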
The proposed solutions
After a few days of working with the team, we had identified the five red flags mentioned above. To address them, we proposed several changes.
The infrastructure
We proposed to migrate the application to Heroku to:
- simplify the infrastructure management
- simplify the deployment process
- fix the staging/production disparity
- have a working testing environment
This process was not as simple as it sounds, but that’s a story for another post. 😉
The team
Having a team of freelancers geographically distributed is not the same as having a nearshore team. For one, we are closer to the client’s time zone. For another, we have solid shared views on our craft: pragmatic, simple, and maintainable code and workflows.
So we proposed to reduce the distributed team in favor of a more cohesive nearshore team.
Communication
For us, good and fluent communication is key to a healthy collaboration. We needed to have clearer and more frequent communication. So we proposed to:
- concentrate the async communication in Slack
- add sync daily standups
- keep the weekly planning
Planning and tracking
For the organization of JIRA, we proposed to:
- use epics for high-level feature specifications
- use tickets for each task inside the epics
- use independent tickets for small tasks or fixes
- plan ahead a couple of weeks’ worth of work in sprints
Finally, to wrap this process up we would use the weekly planning meeting to review next week’s priorities and discuss the mid-term plan with the team.
Workflow
We also proposed to change the way the team was working. We needed a workflow with less friction that allowed the team to move from development to production swiftly and effectively. So we proposed our standard workflow:
- one pull request per ticket
- once a pull request is open, any team member can review it
- once a team member approves it, it can be deployed to staging
- once a team member tests it (we normally include QA analysts in our teams), it can be merged
- once a pull request is merged, it can be deployed to production
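The steps above form a strict sequence, which can be sketched as a tiny state machine. This is only an illustration: the class `PullRequestFlow` and the state names are made up for this example and are not part of any tool we use.

```ruby
# Hypothetical model of the workflow above: each step unlocks the next one.
class PullRequestFlow
  STEPS = %i[open approved on_staging tested merged in_production].freeze

  attr_reader :state

  def initialize
    @state = STEPS.first
  end

  # Moves to `target` only if it is the step right after the current one.
  def advance_to(target)
    expected = STEPS[STEPS.index(state) + 1]
    unless target == expected
      raise ArgumentError, "cannot go from #{state} to #{target}"
    end

    @state = target
  end
end

flow = PullRequestFlow.new
flow.advance_to(:approved)    # a team member approved the pull request
flow.advance_to(:on_staging)  # deployed to staging
flow.advance_to(:tested)      # tested on staging
flow.advance_to(:merged)      # merged into the main branch
puts flow.state
```

Skipping a step, for example merging a pull request that was never tested on staging, raises an error, which is the point: nothing reaches production without having been reviewed and tested.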
The results
After 3 months of working with the team, we were able to:
- reduce the number of servers from 10 (applications and services) to 2 Heroku apps and the corresponding services
- reduce the infrastructure bill by a factor of 3
- reduce the time it takes to deploy the application to minutes
- fix the testing and deploy pipeline with 1 working staging server, where all features are tested before deploying to production
- fix the communication channels and the rituals of the team
- fix the priorities and planning
We also reduced the team’s size from 4 people to 1.
Conclusion
At the beginning of this collaboration, everyone thought that what this project needed was more developers to increase the team’s productivity. That’s not always the answer. Efficiency is, and it is achieved by smartly allocating the budget to the right things.