Tuesday, July 2, 2024

closing 5 years, retrospect



After just a bit more than five years, today was my last day at Deep Instinct, and that's a great time for some reflection. I've seen some good, I've seen some bad, and I managed to learn from both. It's a bit daunting to try and pack five whole years into a single post, so I'll just paint an image in wide brush strokes. It will be inaccurate, and I'll miss a lot, but it is what it is. 

Also, it will be long, bear with me. 

Year 1

This was a year of growing, and of adjusting expectations. it was my first time working in a start-up, with only one previous workplace, and I was up for a surprise. There were no working procedures, minimal infrastructure, and the very strange part - it seemed like everyone were ok with that.  In this year I learned how does it look to have a project that can make or break the company, I learned that building communication channels takes a lot longer than I anticipated, and that what I believed were industry standards were not necessarily common. 

Our main focus on this year has been to build a team and to create the necessary infrastructure for system testing our products. So we spent some time hiring (note for readers - if you aim above the standard skill-set in your area, you have a long and arduous process ahead of you), and I found out that with more than 5 junior people to shuffle around (I wouldn't really call this mentoring) I was overburdened.  I came with an idea on how should our framework look like, and was lucky enough to have someone in the team come up with a better suited idea. There was a lot of work done on the technical side, and not a lot done on integrating with the other teams or the product work. We simply were not yet ready. by the end of this period we had a working system test framework, a team of ~12 people, and we've proven our value to the organization.

Years 2 & 3

Those are not actually full calendar years, but it's a nice title for the period after we've made enough catching up and focused on integrating with our environment. Some of this effort was being done during the first period, but this was more in the way of laying some foundations for the future. Now we've started turning up the notch on talking with other groups. This is where we've started feeling the missing procedures and culture around us - getting invited to a design review is quite easy, but getting invited to a design review that does not happen is a different thing altogether. We've hacked around and compromised a lot - starting with just asking for someone to give us some sort of a handover in lieu of the design review would be one such example. In other places, where we did join the table, we made sure to provide value so that we will be invited again, and we volunteered to help other groups use our infrastructure for their end, partially to gain a reputation boost (but mostly because it was needed for the company). We had a great opportunity when a new product was starting from scratch (ish) and we provided them with a dedicated person. We then used this team as a model to say "this is how we think we should work", thus initiating the third phase of our 4 steps plan (worded a bit differently, I wrote about it here. Roughly labeling, the steps are stabilize, grow, disperse and disband) - we didn't actually complete the growing part: we didn't have a good enough infrastructure to share, and most of the people we had were not skilled enough in testing yet, but reality is messy, you don't get to complete things cleanly before moving on. 

We also struggled with our own growth - when our team size neared 20, and the number of products grew, we started feeling the pain of stepping on each other's toes and trying to focus on too many things. So we split into two teams, which in turn required us to split our code-base to match - a realization we got to a few months after the team split, since we now had people focusing on one task and we didn't want us distracted by noises from the other side of the split. Another aspect of this split was that I found out I'm a difficult employee - as part of the split my then manager became a group lead, and a team lead was recruited. At the point where she has arrived, all of the team were people I either recruited or welcomed to the team, and I was a nexus of knowledge about most of the things happening around our team. That is to say - while the authority was hers, I had way more social power. It took us a while to sync, and there was this one time where I used my excessive power and clashed with her in front of the entire team - I knew I was wrong a few minutes after doing so, but the damage was done. I got a good lesson on the difficulty of apologizing publicly, and later we've both synced and balanced our power in a more suitable way, so I'm guessing it all worked out in the end, but I hope I will remember this lesson for the future.
We have also faced another problem - A lot of people were leaving us to pursue other opportunities. There have been a lot of factors to this - we have hired mostly juniors out of university, we were a small start-up with a lot of changes, but there were two factors that bothered me. First, there was the feeling that in order to progress in one's career, one had to get out of testing. The second was that in our company's culture, testers are seen as a second class employees - I'm pretty sure no one will admit to this even to themselves, but it can be seen in all of the tiny decisions - who gets invited to early discussions, how often do people feel comfortable telling you how to do your job, who gets credit for work done, and so on. It ties back nicely to the first problem, but it was noticeable enough to deserve its own place. It took me a while, but the first problem has led me down the path that led to building an in-testing career growth model. The second part of the problem was one I did not address at all, and when I look back on, there were probably some actions I could have attempted. The way I see it, there are 3 main reasons for testers to be on the bottom of the food chain: First is the reputation of the profession in the outside world, which is beyond me to change. Second is the fact that in most places, testers who write code are less capable than "proper developers" (which leads to a vicious cycle of hiring with low standards and reinforcing the conception of testers having inferior skills) and the third is that testing have no explicit contribution: outcomes such as risk reduction, placing safeguards, and maintaining test code are part of other roles as well, especially when actively pushing towards full integration of testing into the teams. I'm still bashing my head occasionally against this question, with recent thoughts influenced by "Wiring the winning organization" (A book review will come soon, I hope). 

All in all, there were a lot of challenges during this period, but it was a good time. 

Year 4

Year 4 has started on January 2nd, 2022. those with a keen eye and access to our HR system might notice that there are 2 months missing between my actual 4th anniversary on April 2nd and this date, but as I mentioned, the previous years have been so only in the most general way. In that date, we had a new VP engineering (it's formally named VP R&D, but we do have a separate research department) join us and replace the person who was filling this role at the time. 

 I learned one thing during this time - While it's true that Rome wasn't built in a day, it didn't take that long to burn it down. At first, I thought it's just a style of communication and trying to prove that he's the boss, but soon enough there were so many examples of bad leadership that I couldn't push down the fact that I was dealing with a bully. A few red flags for the future, and perhaps for the readers as well:

  1. Everyone who were here before the bully are stupid, as is every decision made.
  2. People that are still here need a firm sheriff  to teach them how to do their work.
  3. All decisions should go through the bully, if there was a decision made without him, it will be reversed.
  4. Listening is for other people.

I could see the damage really fast - the number of crises that required people to stay up late and to work on weekends skyrocketed, communication between teams was reduced, and everyone tried to make sure that when something goes wrong, they will have someone to point at and say "I did my part, so it must be those people over there". Add to that a replacement of 2 of the 3 group leaders within 3 months (one quit, the other was fired in the most insensitive manner), and it's no surprise that things were a mess. About a month after the VP has joined, he announced on two major projects - a complete rewrite of our SaaS product, and a shift to scrum. Both failed spectacularly.
Let's start with the technical project - As it is with many companies, there are more urgent tasks than there is time or people, so shifting people to work on the project would mean slowing down on the roadmap, which can feel fatal for a startup. Also, do recall point #2 - people other than the VP are incompetent by definition. We also have no real documentation of the system we are aiming to rebuild and no experience is creating good requirements or specification even for smaller projects. The solution? To hire an external company to build part of our core business components. Yes, it is stupid as it sounds. Yes, it was said in some circles at the time. Saying it to the VP would have been hopeless, and an invitation to be bulldozed. 

So, while this project is happening, the other project can't happen without the people in the company, right? Moving to Scrum is a people and processes task. Well, this has botched as well, and I wrote about it in some length already here. Since that time, the project has concluded and I got to talk with one of the consultants and asked him why some very basic things didn't happen, or at least mentioned as goals and his answer was very respectful and general, as he was masking the fact that the VP who hired them also blocked most of their initiatives (I did ask if I understood correctly, and got the "I can't answer this directly" kind of answer). 

So, why did I stay? At first, I was mostly shielded by my skip-level manager, the one remaining group leader. In July I approached him to tell that I have a job offer, but if he can promise me that he won't break in 6 months, I'm staying. His response was that while everything is dynamic and he can't make promises, at the moment he has no thoughts of leaving and is not looking around for alternatives. A month later he called me to tell that he got a job offer and is accepting it. Well, that's life, and I wished him well. Then I took my time - things were not terrible for me personally, so I thought to choose a place I'd be happy to. In January 10th 2023 I got an offer from a place that I thought could be good enough, but I took some time to deliberate. The 4th year has ended at January 15th 2023, when we were told that the VP has chosen to pursue other opportunities and will be leaving effective immediately. The next day I told the place that offered me a position that I'm staying. 

Naturally, there was a lot more going this year. For instance, I got to do some close mentoring with some people, which was very interesting - I worked formally with two people, one who joined us in order to stretch his abilities, and was very receptive to my attempts in teaching. I found out that I need to learn how to map the skills and gaps of my colleagues and then find proper way to teach those. For instance, I did manage to see that some of my mentees were struggling with a top-down code design, and then found that I'm not sure how to teach that (tried to force TDD, I think it worked to some extent), other skills such as modeling or communicating confidence - I don't think I managed to teach as much. 

It was also the year where I could put my knowledge in testing theory to use, even if a bit violently. The story was as follows: After my skip-level manager has skedaddled, another was brought in his place, though without either his skills, his care or his work ethics (though he does hold the record of being the fastest person to be fired that I've seen). Shortly after that, there was a crisis (did I mention there were a lot of those?) a product that has been developed for an entire version without any sort of testing or attention of testers (all of the testers were busy on the previous crisis) and without mitigating this problem by telling the people working "ok, you don't have testers, do your best", was now blocked by all of the tickets in a "ready for testing" state.  The new group lead, eager to show... something, has stated (or, perhaps, iterated the bully) "no compromises on testing, I want full test plans", which, given the deadlines we were facing was simply not possible, so I did some rough estimation and got to somewhere between 13 and 40 days just to write the test plans for the 42 features, let alone executing them, to which I got the lovely response "we need to be more agile". I'm still quite proud of not responding snarkily  "agile doesn't mean cutting corners and doing a shoddy work", instead I suggested that we'd use SBTM, which won't solve our problems of not having enough time, but will mean that we'll start working and finding problems faster. I might have exaggerated its benefits a bit, but I can now say that I got to win an argument only because I was more educated on testing than the other side and could slap some important looking documents at them (I have James Bach to thank for creating this document) , so yay me. Oh, also there was the part where we didn't waste a ton of time. 

Year 5

So, the bully has gone. There were some rumors around the cause of him getting fired, and sadly, "someone noticed he's menacing the entire organization" was not one of the options (in fact, another rumor had it that they actually had another VP coaching him on the bullying part, which, if true, saddens me greatly because it means his behavior was known). Now we had a chance to start a healing process and make up for the year of moving backwards (my feeling was that our culture went back to a state which is roughly equivalent to what was a year before the bully, but that's just a guess). We had several challenges to overcome - we had a beaten down department, we had lost a huge number of people (while I can't tell exactly the numbers, today, about half of our engineering department were hired in the year and a half after he was fired), the group leaders we had were hired specifically to match his management style - to relay his decisions and to have as little agency as possible and make no decisions themselves. Personally they were nice, but didn't have the skills to lead a healing process, and finally is the person who has replaced the bully,  that I think is a good manager, but took on an impossible task - managing two departments (he was, and still is, our VP of research) with over 100 people at the time, and having the style of a good manager, which is to delegate a lot to his direct reports, which would have been great had we the right people and the right structure in place. Add to that the market pressure as funds of the last round were slowly but decisively dwindling, and the result is that we had an absent VP. On top of all that, since last October we got a war in Israel, with many people, the new VP included, were called for months of reserve duty, risking their very lives defending the citizens here.

So, not the ideal conditions for healing. But, there were opportunities as well - before the bully left, he declared that we're moving to (finally) disband the QA department, sure - as always, it was done for the wrong reasons and seemed like he was going to do that the wrong way, but once he left we talked to the group leaders and understood he did not share with them more details than he had shared with us (not a lot, in case you've wondered, only a general statement), so we started discussing how to do this transition safely, and where should the responsibilities that were currently held by the test teams be. This task proved too much for our group leaders, so they decided to leave the status quo more or less as is, with some cosmetic changes perhaps, this left my manager to start and dedicate more of our team to specific products, which is a smaller step then what I aimed for, but at least it was in the right direction. What we both did not know at the time, was that the bully, after clashing with my manager who was defending the team quite admirably, has started the process of firing her, which while halted due to him being let go, did paint her as a troublemaker in front of HR and upper management. The difficult discussions around the re-org were the last straw, and despite the team rallying up behind her, she was let go. 

this was the point where I had to assume another role - the team's anchor. Several people approached me with concerns and wondered whether they should start looking for another place. I spent some time talking with them and trying to help them figure out what they need and how to get that here. I was feeling very ambiguous about this - on one hand, I wanted to join their rant and tell them I agree, on the other, I was the face of the system, and had a clear incentive of convincing them to stay. I was very careful not to lie, but there were some questions I had to dance around. There were some nicer aspects to this, though. Once I realized this is part of my role, I looked on the 2nd test team - we had a major problem there, a few months before my skip-level manager left, their team lead made a lateral move to another department, and the team was left without a manager. As long as my skip level was there, he was able to provide some guidance while looking for another team lead, but this effort stopped when he left, and the team of mostly juniors was left without a leader, and the few seniors there left or detached shortly after. So, now we have an orphaned team for over a year. People were leaving. I did a quick assessment of the people still there, and found that while most of the people there are ok, there is one that I really didn't want leaving, so I approached her and asked how she was feeling and what she was missing, which has led to us meeting weekly (ish) and me trying to be the professional authority she could learn from or consult with. That ended up being really fun.

On the work side, with a new VP, we finally started to work on the shadow project declared by the bully, only to find that, quite unsurprisingly, the contracting company who were now getting paid for over a year has managed to produce  some lengthy architecture documents that were not really suited to what we needed, and that the bulk of work still needs doing. Surprised? I wasn't. So, teams started scrambling in order to get up to speed, and the added difficulty of coordinating with a contracting team made it that much more difficult. I was brought in to assess the "testing" they have done, just to see some demo robot-framework tests done against a mockup, with no thought on the SDLC, and no one on the contracting team who even understood my basic questions. So it was up to me to devise a testing strategy for this new product, and for the modifications of the other parts that needed to change in order to support this rewrite. 

At any rate, reality was not waiting on us to sort our mess, with post-covid funding being harder to get and some major customers backing out of deals in the last moment, our company figured out that in order to stay in business, it is better to update the income forecast to be more on the pessimistic side, which meant a reduction in force - one of my mentees, whom I just told a week earlier that probably had a month to improve before termination was an option was the first to be named, and another, more appreciated employee was laid off as well. Again, I found myself as the go-to person for people disturbed by the change, and this time I could only tell them "the company claims that after this reduction we should be clear for at least a year, so if the only reason is job stability, keep an eye, but this need not be the fatal sign. As part of this, the company has pivoted and changed its market focus, so the entire rewrite project was called off, a year and a half too late. We returned to the previous initiative of a gradual upgrade of the current system, because, well, it made more sense. so new strategy and cool buzzword technologies? not anymore. At least this time when I did damage control I could be honest and say that my only problem with aborting this project was that it happened too late. 

Another thing that happened is that management declared that "we have a quality problem", and while they also said "quality is everyone's responsibility", the way they behaved  was more in the way of "we have a QA group, so they should patch quality in the end". We found out about this when we were told a "QA expert" consultant was hired and will be joining us. Yes, it was a slap in our face and ego, but we decided to let it pass and try to use this opportunity to further push our goals. This consultant, unlike the agile transition ones, was quite good - She did her research, did some coalition building, and started to look for ways to push for a change. The main difficulty? her hands were tied by the limited scope - fixing only the test team and not the surrounding problems we have could make a minor improvement at best. To make the problem worse, resource and time allocation were not really made (there were promises for time, but they were not scheduled or considered by the other projects) and we had a new test team lead (for the orphan team) that, at the very least, wasn't aligned with the rest of the company - but we still needed to push through his incessant resistance. 

I've learned a lot from the consultant. I learned how to take a vague goal and make it concrete and presentable to management, I learned from our failure on the importance of setting clear milestones, and of saying "I don't need 10% of the department time as a constant, I need 5% now, in the form of one specific person, and in 30 days I'll need 20% of all hands on deck", not doing so meant that we had people bored while we were preparing, and schedules tight when we actually needed the people. One thing we tried to do but didn't get to an agreement was on our definition of "quality" (which is a term I really hate for its vagueness) and following that, on the goals we could theoretically achieve.

 However, not all slaps in the face were as successful as this one, there were a few others, and two that had a significant impact, both were a case of a problematic hire. The first was a team leader for the orphaned team - it just so happened that the hiring manager, a developer in his past, had no idea how to assess testing skills, and thus, for most candidates in this position, he asked the other test team lead we had to interview, and due to the nature of the market - there were a lot of candidates that fit on paper, but had no relevant skills, and they were failing the simple skill checks. Then, there was this one candidate who did not get interviewed by said team lead - the benign explanation is that the team lead was under a time crunch and needed some space, and the hiring manager decided to skip this phase despite the obvious gaps in his filtering. The candidate that was hired was a very wrong fit - he didn't have the skills to help his team improve (they needed a strong coder) and he wasn't able to find someone else to provide this sort of service. He tried to bring a solution that worked for him in a completely different place without understanding the situation and without talking to his team, and he had the wrong mindset for what we were trying to achieve. All of this is because someone assumed they can hire a testing professional without at least having a conversation with the local testing experts (there were two of us, at the time, so I know it didn't happen) to understand what should they be looking for, and how to check for that. Failing to do that, the first person who knows to parrot some stuff that sound like testing will have an easy was to slip in.

The second slap is pretty recent - a new QA group lead was brought along. This is a double slap in our face - first, because bringing an outsider without letting someone from within to apply for the position is showing a lack of confidence in the people you have (this was also the case with the team lead position). Second, because bringing a group lead was the opposite of what we were doing here for the past few years, and finally, because no one talked to us, again. Now, it's complicated to have one interview their future manager, so it makes a lot of sense to leave us out of the hiring loop, but after so many bad choices, why not ask for our input? maybe we'll be able to convince you you don't need this function, maybe we won't and you'll explain to us what is it that you're trying to achieve, maybe we'll agree that in our current state we need to find a team lead that will have the explicit goal of disbanding this group within one year and then build a community of practice instead - but this didn't happen, and I found myself facing another manager that I don't appreciate professionally. In this specific case, it took the manager less than a month to make some unforgivable mistakes that were a clear signal for me that on top of what might be just professional differences between us and not sheer incompetence on his side, there are also a severe issue with his management style and skills that I can't accept. To be fair - this particular problem is something that is difficult to interview for, and I'm not sure it could have been avoided. It is still not a pleasant experience.

The one last thing that sticks in my mind is the most recent project I was involved in - a new, AWS native product that is supposed to be a major revenue boost for the company. It started while I was busy putting off some fires, or as we call it here "releasing a version". When I tried to get involved in the discussions with AWS, I got a pushback, and since I was busy with the fires, didn't insist. It turned out to be quite a mistake, since by the time I was ready to join, the initial architecture was already decided, and aside from being quite complex for reasons it took me a while to learn, there was zero thought on how to manage this coding project or how to test it, and I had to spend perhaps a month just wrapping my head around all of the new things I had to learn just to come up with a strategy, then break it down to a plan, and do this while everyone is asking "how soon can you start writing system tests?" and answer that we need about a month to change our infrastructure, and that it is more difficult to do because the product has no deployment scheme of the latest code, so each step is just taking five times the time and effort it should.  Dealing with the constraints I had was tough, and I wrote about it in my last blog post. I'm quite happy with the result, but sadly I won't see how well it works, or be there to help push for the strategy to be fulfilled - there are a lot of ideas that are new to the company (sadly, one of them is unit tests), and there is surely some tinkering required with the new mechanisms (such as contract tests and managing the various contracts), but that's someone else's task now. 


I'm off to find new challenges, and will be starting tomorrow in a new and exciting place.

No comments:

Post a Comment