
Jason Barry @ Netlify Talks Migrating to DevCycle

Jason Barry, Senior Staff Frontend Engineer at Netlify, shares the development strategies they used to make their feature flag platform migration an easy one.

Last year, Netlify made the decision to switch to DevCycle for their feature flag system.

Netlify adheres to the philosophy that every developer and manager should have the access and freedom to create the feature flags they need. However, the pricing model of LaunchDarkly, their previous feature flag provider, made this approach prohibitively expensive.

Jason Barry, Senior Staff Frontend Engineer at Netlify, talked with us about the strategies and tools that facilitated an easy and secure transition to DevCycle. He also talked to us about the DevCycle features that became indispensable, including beta opt-in for their dashboard, nested flag grouping to reduce code complexity, and more.

Taraneh: Can you fill us in a little bit on what your current role and responsibilities are?

Jason: Yeah! So I'm a front-end engineer. I work on the Netlify dashboard primarily, and various front-end, client-side applications for Netlify. I've worked a lot on collaborative deploy previews, where every branch you have gets a dedicated front-end environment, and we overlay some UI that syncs back with your Git provider. You can invite the team reviewers and such to leave comments, and that syncs back to your open GitHub/GitLab pull requests. That's one of our front-end properties. And then [as I said, I worked] on the main Netlify dashboard where teams can come in and administer/configure their sites and their deploys, and things like that. Also [I've worked on] the CLI, which is another client for interacting with Netlify in a command line environment. We use feature flags pretty much throughout all of those, and it really helps us a lot.

Taraneh: Very cool. So what does your tech stack look like?

Jason: We're a React shop, so we use a TypeScript, React Redux app. It's been around since the dawn of the company, so like, [the] 2014 era. It's quite a large repo, and we have about 15 or so front-end engineers contributing to it on any given day. That's the core project, and also the biggest user of front-end feature flags at Netlify.

Taraneh: What is the makeup of your team structure?

Jason: Organizationally, we congregate into pods. That's kind of like the base unit. Pods are cross-functional teams. You'll have front-end engineers, back-end engineers, designers, technical writers, PM's—things like that—all of those organize into pods. But then, on more of a horizontal level, we have a guild (so to speak) which is just everyone from every pod who has the same role. So, all the front-end engineers together are on the same guild, but touch many different pods.

So because the front-end app is a shared repo, right, we have all of our pods coming together to implement features, so we have to work together and organize and communicate to build that app.

Taraneh: So you mentioned that you use feature flags frequently throughout this process? Are there a couple examples you could give us?

Jason: Sure. Primarily, the one we use the most is to gate feature rollout. [Let's say] we're developing a new feature. You don't want to necessarily release it in one giant PR, right? Because your reviewers will be upset with you that they're reviewing 1000 plus line pull requests. It's just a lot easier to manage when you can ship to production in smaller bite-sized chunks. [Then] you could ship 5% and roll it out to prod and your users don't need to know about it, right? Because it's gated behind this feature flag. That way, too, you're not having to deal with huge merge conflicts. It's much easier to just ship smaller pieces, and feature flags allow us to do that.

Also controlling rollout. Let's say we have a new feature or maybe even a bug fix. We're not sure how it's going to be received, right? Is the bug fix going to work? Will users even use this feature, right? We can do a controlled rollout, [which] I think you call, like, a "gradual rollout", which is percentage-based. So for, you know, 10% of traffic, you roll it out to those users, and then we slowly ramp it up as we monitor the results. And it's not just random per request, [DevCycle] does a really good job of mapping it to the user. So if someone falls within that bucket, they continue to receive that feature and ... they're forever in that bucket until we change it.
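Editor's note: the "sticky" percentage rollout Jason describes is commonly implemented by hashing a stable user identifier into a bucket, so the same user always lands on the same side of the threshold. The sketch below illustrates that general technique in TypeScript. It is not DevCycle's actual bucketing algorithm, which the interview does not describe; the function names and the choice of an FNV-1a hash are assumptions made for illustration.

```typescript
// FNV-1a 32-bit hash: small and deterministic, which is all this sketch needs.
// (Assumption: the real service may use a different hash entirely.)
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0; // force an unsigned 32-bit result
}

// Map a (flag, user) pair to a stable bucket in the range [0, 100).
// Including the flag key means different flags get independent buckets.
function bucketFor(flagKey: string, userId: string): number {
  return fnv1a(`${flagKey}:${userId}`) % 100;
}

// A user is in the rollout when their bucket falls below the percentage.
// Raising the percentage only ever adds users; nobody already in drops out.
function isInRollout(flagKey: string, userId: string, percentage: number): boolean {
  return bucketFor(flagKey, userId) < percentage;
}
```

Because the bucket depends only on the flag key and the user ID, ramping from 10% to 50% keeps the original 10% of users in the feature, which matches the "forever in that bucket until we change it" behavior described above.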

Taraneh: What does the decision to create new flags look like from an end to end process? Who's involved? Who helps? Who decides, how are they organized? 

Jason: We do it! It's up to the developer or the person implementing the feature. That's what's really nice, we don't need to ask like, “Hey, can I make this feature flag?”, [because] your model [has] unlimited seats. Not paying on a per-seat basis for your pricing was instrumental to us. Anyone on the team should feel capable of going in and making a feature flag for their particular needs.

So, yeah, [we use feature flags] pretty much whenever. We use a flag whenever someone wants to release something that is bigger than what fits in one pull request, or if there's a release that's timed with a certain marketing announcement. [In that case] we want to make sure that it goes out at an exact date because we're announcing something at a conference and we want the attendees to be able to see it as soon as the announcement is made, so it can be released at the click of a button rather than merging something and waiting for the build to complete, and trying to time it that way.

So, yeah, just gating features behind like "one click" and it's just instantly released. It’s a huge help, and developers on our team feel empowered to make those feature flags without needing to ask someone else for approval.

Taraneh: Are there any other unique features of DevCycle that your team is currently using?

Jason: Yeah, we use [DevCycle’s] feature opt-in quite heavily. So, we have an area in Netlify called “Netlify Labs”. It's like a feature opt-in. I'm sure you've seen it on many SaaS products where as a user, you can opt-in to receiving experimental features. It's kind of like a pre-beta if you want to see the latest and greatest cutting edge stuff, you can as a user go into Netlify Labs. It's in a section in the Netlify dashboard, and click, “I want this one turned on. That one turned off,” that sort of thing.

So, before, we kind of frankenstein’d together a solution using LaunchDarkly with feature flags for feature flags. So it was a very meta, frankly confusing, implementation where we had a feature flag that would determine whether or not the labs entry would show up in labs, right?

And that would show the name of the feature, the description, right? Kind of like explaining to the user what this feature does. And then we had another one, that was actually controlling whether the feature was enabled, right?

And it was a frequent source of confusion. Our convention was to just reuse the same flag name, but put a “two” at the end. Everyone mixed up like “Is this the feature that shows it to you in labs? Or is this the feature flag that actually controls whether or not the feature is enabled?” So… yeah, it was confusing for sure.

But because you have first-class support for user feature opt-in, now basically, we just flip a switch on an existing feature flag, right? We toggle this boolean and fill out like “here's the title, here's the description,” and then anytime someone opts in, you have your EdgeDB that handles whether or not that user should receive that feature. So, yeah, migrating to that let us delete a lot of code, which is really nice.

Taraneh: It always feels good to have a PR with negative contributions. 

Jason: It’s the best feeling! The only thing better than writing code is deleting code.

Taraneh: I think you've painted a really good picture of how your team works together to use feature flags in a general sense. Moving onto the migration side of things, when and how did your team make the decision to do that? What were you looking for in your next feature flag platform?

Jason: Well, really the impetus for us to switch was saving cost. So many feature flag platforms charged by the seat and you didn't. With our growing team, we realized that we could save a lot of money.

Before we became DevCycle customers, we just had a limited number of seats, and we had a dedicated Slack channel where people would ask “Can you enable targeting for this feature for me? I need it for this user and this ID” and blah blah blah. We had a dedicated Slack channel just for that, because our budget didn't allow for every single person to have a seat. So when we learned that your pricing model doesn't charge per seat, it was kind of a no-brainer!

And then you've also been really good at listening to our feature requests. You didn't have feature parity out of the gate, one of which was like SDK support. But now that's landed where you can specify which SDKs you want a certain feature for. That was pretty big for us too because we share feature flags between front-end and back-end, and we didn't want the names of back-end feature flags to be pulled into our front-end, right? That could [show] feature details and things like that. So being able to control which SDK a feature is [attached] to, while still being able to share flags between front-end and back-end—or like cross SDK—was really cool.

And then there's also the performance factor too. You've done a really good job at making it like super performant. I know especially in the Go SDK, our run-time folks are very happy with the performance improvements. And really, yeah, the impetus was, the pricing model. So, we came for the pricing and stayed for the feature parity.

Taraneh: Did you at any point consider an in-house solution?

Jason: We did, yeah. So not just building an in-house solution for us to use but that we would eventually like productize, and sell, right? Because we are a front-end platform, it's such a frequent use case of developers needing to control feature roll-out, and we're like, “What if we build this ourselves?”, not just to use but to sell as well and make into our platform.

But it's such a large overhead and we didn't think we wanted to spend our resources… kind of building that one and competing in a bit of a saturated market. There are lots of good players in the space that do it well, and then there are so many [other] things that we could be building. It didn't necessarily make sense for us to build it when there are so many good options out there in the market.

Taraneh: What was your role in the migration project overall?

Jason: Yeah. So I led the migration for the Netlify UI. I was responsible for the front-end portion of the migration. That was our biggest web property. We have others of course, and some don't use feature flags at all (our docs, I don't think, use feature flags), but collaborative deploy previews uses them as well, [however] it's much smaller, maybe two or three. [That said], we have a couple hundred in our main Netlify dashboard.

Taraneh: Migration is not a trivial thing. I’m curious to gain some insight into all the major steps that were involved in that transition and how you tackled them as a team?

Jason: Yeah. So, we had a pretty cool migration strategy, and you were very helpful too. DevCycle has a tool called “feature importer” that's open-source and on your GitHub that we used and that got us like 90 percent of the way there. That was great. Basically, you select a project in LaunchDarkly and then it downloads it, and converts it in the right shape and then sends it to your API so that you see all the feature flags with all the targeting rules and everything like that for all the environments, the audiences, and everything. It's all replicated. LaunchDarkly calls them something else, [so it's] basically just converting them into your world and nomenclature. That was a big help.

But then from the coding side, we actually had a pretty good strategy that I liked. The first thing we did was stop importing directly from LaunchDarkly. We created a hook that was genericized to work with either LaunchDarkly or DevCycle, right?

So basically grep for every literal import directly from the LaunchDarkly SDK and create our own helper hook that we call “use flag” and also “use flags”, because I think that had a more direct parity with LaunchDarkly [where] you import all the flags at once and then you select which ones you want there.

But I like your method better because you can set a default per flag. So, if DevCycle goes down for any reason, then the fallback is in the code. So say, “use flag”, “flag name” and then you provide the default value because some of our feature flags are defaulted to “true”, right? We want users to be able to opt-out, for example, that was one of our use cases. So, to have to define “true” in code, was super helpful and kind of gave us peace of mind if, for whatever reason, DevCycle goes down or something like that, then, it resolves to sensible defaults. So that was really nice.
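Editor's note: the idea of a provider-agnostic flag lookup with an in-code default can be sketched as below. This is an illustration under stated assumptions, not Netlify's actual hook: it is a plain function rather than a React hook so it stays self-contained, it handles boolean flags only, and `FlagSource`, `makeGetFlag`, and `getFlag` are hypothetical names. In a real app each vendor SDK would be wrapped in an adapter that satisfies the interface.

```typescript
// Hypothetical adapter interface. Each vendor SDK (LaunchDarkly, DevCycle)
// would be wrapped so that application code never imports the SDK directly.
interface FlagSource {
  // Returns undefined when the flag is unknown to the provider.
  read(key: string): boolean | undefined;
}

// Build a lookup function bound to one provider.
function makeGetFlag(source: FlagSource) {
  // The default lives in the calling code, so a provider outage or a
  // missing flag still resolves to a sensible value.
  return function getFlag(key: string, defaultValue: boolean): boolean {
    try {
      const value = source.read(key);
      return typeof value === "boolean" ? value : defaultValue;
    } catch {
      // Provider threw (for example, it is unreachable): use the default.
      return defaultValue;
    }
  };
}
```

A flag that should be on unless a user opts out would then be read as `getFlag("some-flag", true)`, which keeps the feature enabled even if the provider cannot be reached.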

So basically, the first step was stop importing directly from LaunchDarkly and have our custom "use flag" hook. Then what we did was we wrapped our app container in the DevCycle client, right? So we actually had DevCycle and LaunchDarkly clients running concurrently side-by-side and the DevCycle one was the outermost. I think it didn't actually matter which one was the outermost, that was just how we did it.

But the cool thing was that we used a LaunchDarkly feature flag to control the roll out to DevCycle. So we used a feature flag to control which feature flag provider users used, right? And that was a huge help because it let us test internally before rolling out to a wider audience. We actually built [something] internally in our command palette so you can select if you wanted to get your feature flags pulling from LaunchDarkly or DevCycle. You just select it and then the page would refresh, and that's how you would choose your feature flag provider.
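Editor's note: choosing the flag provider with a flag can be reduced to a small selection function. The sketch below is hypothetical: `selectProvider` and its input shape are invented for illustration, and the rule that a manual choice takes precedence over the migration flag is an assumption, since the interview does not say how the command palette option interacted with the LaunchDarkly flag.

```typescript
type ProviderName = "launchdarkly" | "devcycle";

interface SelectionInput {
  // Value of the (hypothetical) LaunchDarkly flag that gates the migration.
  migrationFlagEnabled: boolean;
  // Optional manual choice, for example one saved by an internal tool.
  override?: ProviderName;
}

// Decide which provider should serve flags for this user.
// Assumption: an explicit override wins over the migration flag.
function selectProvider(input: SelectionInput): ProviderName {
  if (input.override !== undefined) {
    return input.override;
  }
  return input.migrationFlagEnabled ? "devcycle" : "launchdarkly";
}
```

Keeping the decision in one function means the rest of the app only asks "which provider?" once, and the answer can be changed by ramping a single flag.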

The other big thing was swapping out our old convention. I touched on this previously, the “feature flag for a feature flag” for feature opt-in, and we didn't want to continue that pattern anymore. This is probably where we spent the most custom work, because your feature importer didn't know about our wacky system, so we had to build something.

[The feature importer] imported all of the rules, but we didn't want to use those flags anymore, anyway, right? So all of the "feature flag 2" names that had all the targeting lists [that said] "here are all the users that have opted into this", which were, like, hundreds or even thousands of individual email addresses on the flag, that was no good for us, because we wanted to end up deleting those flags anyway. So I wrote a script to basically, for all of those feature opt-in flags, point them to your EdgeDB instead. Your docs were great in learning how to do that. Basically [I] just wrote a script that would pull all the targeting rules for the individual email addresses, and then point them to the EdgeDB Bucketing API.
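Editor's note: the data-shaping half of such a script might look like the sketch below, which turns per-flag email target lists into per-user opt-in records. Everything here is hypothetical: `LegacyFlag` and `OptInRecord` are invented shapes, not LaunchDarkly's or DevCycle's real schemas, and the step that actually writes the records to EdgeDB is omitted because the interview does not describe that call.

```typescript
// Hypothetical shape of an exported legacy flag (not a real vendor schema).
interface LegacyFlag {
  key: string;              // e.g. "new-nav2"
  targetedEmails: string[]; // users individually targeted on this flag
}

// Hypothetical per-user record to be written to the opt-in store.
interface OptInRecord {
  email: string;
  optedInFeatures: string[];
}

// The "same flag name with a two at the end" convention from the interview.
// Assumption: no ordinary flag name happens to end in "2".
const OPT_IN_SUFFIX = "2";

function toOptInRecords(flags: LegacyFlag[]): OptInRecord[] {
  const byEmail = new Map<string, Set<string>>();
  for (const flag of flags) {
    if (!flag.key.endsWith(OPT_IN_SUFFIX)) continue; // not an opt-in flag
    const featureKey = flag.key.slice(0, -OPT_IN_SUFFIX.length);
    for (const raw of flag.targetedEmails) {
      const email = raw.trim().toLowerCase(); // normalize before grouping
      let features = byEmail.get(email);
      if (features === undefined) {
        features = new Set<string>();
        byEmail.set(email, features);
      }
      features.add(featureKey);
    }
  }
  // One record per user, listing every feature they had opted into.
  return Array.from(byEmail.entries()).map(([email, features]) => ({
    email,
    optedInFeatures: Array.from(features).sort(),
  }));
}
```

Inverting the data from "emails per flag" to "features per user" is the point of the exercise: once opt-ins live on the user, the suffixed flags and their long target lists can be deleted.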

So, yeah, we had to do a custom solution there because you offer like a feature opt-in widget, which is great (you can set your branding and color and things like that, and it happens in an iframe), but we wanted more control of the UI. And one that worked across all our environments: because we have local, deploy preview, staging, and production, we wanted each to have their own EdgeDB environment. So for example, you can opt into a feature in a deploy preview and test it out there. But it wouldn't opt you in on production, for example.

So that was a bit complex because to opt-in a user, we basically have a lambda to control that. So we wrote equivalents of those LaunchDarkly lambdas for a DevCycle lambda, and then on the client-side we would say, “Are you opted-in to receive the DevCycle provider?", if so, hit the lambda. If not, hit the old LaunchDarkly lambda.

But then, that was tricky as well because we had to be careful because that could create drift in data, right? Because we were using LaunchDarkly as the source of truth until everything was migrated over to DevCycle. But if someone was already on DevCycle—like someone internal—and they opted into a feature, if we ever switch them back to LaunchDarkly, we wouldn't have that data, right? They would have been opted in DevCycle but not LaunchDarkly. So then we're kind of forking user preferences. There's a bit of drift. So, that was a risk.

We considered having an opt-in actually opt you in on both platforms and an opt-out opt you out on both platforms, but then that raised the question of, "What if one request succeeds in one platform and fails in the other? What do we do?" So we ended up basically just hitting one and trying not to roll back users. Like, once users were on DevCycle [we tried our best] to keep them on DevCycle so that any of their user preferences would not be lost.
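Editor's note: the "write to exactly one platform" decision, together with per-environment opt-ins, can be modeled with two in-memory stores. This is a sketch under assumptions: `OptInStore` and `recordOptIn` are hypothetical stand-ins for the two lambdas mentioned above, and the environment names are taken from the interview rather than from any real configuration.

```typescript
type Provider = "launchdarkly" | "devcycle";
type Environment = "local" | "deploy-preview" | "staging" | "production";

// In-memory stand-in for one provider's opt-in storage (hypothetical).
class OptInStore {
  private readonly optIns = new Set<string>();

  // Opt-ins are keyed by environment, so a deploy preview opt-in
  // never leaks into production.
  private static key(env: Environment, userId: string, feature: string): string {
    return `${env}|${userId}|${feature}`;
  }

  set(env: Environment, userId: string, feature: string, optedIn: boolean): void {
    const key = OptInStore.key(env, userId, feature);
    if (optedIn) {
      this.optIns.add(key);
    } else {
      this.optIns.delete(key);
    }
  }

  has(env: Environment, userId: string, feature: string): boolean {
    return this.optIns.has(OptInStore.key(env, userId, feature));
  }
}

// Write to exactly one store: the provider currently serving this user.
// There is deliberately no dual write, so no partial-failure case to handle.
function recordOptIn(
  stores: Record<Provider, OptInStore>,
  active: Provider,
  env: Environment,
  userId: string,
  feature: string,
  optedIn: boolean,
): void {
  stores[active].set(env, userId, feature, optedIn);
}
```

The trade-off is visible in the model: after an opt-in is recorded while a user is on DevCycle, the LaunchDarkly store knows nothing about it, which is the drift that made rolling users back undesirable.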

That overall was successful as a strategy. We never had “Oh no, it's not working. Let's put everyone back to LaunchDarkly.” We never had to do that, which was really nice. 

Adam: Speaking of PRs that delete code and your feature 1 and feature 2 setup, that sounds like it might have been a little bit complicated. The [new] simplified system of a first class opt-in, did that allow you to simplify the code around that as well, and delete a bunch of code?

Jason: Absolutely. Yeah. [In] the labs component itself, we were handling a lot of logic just client-side, and the naming convention was not great. We made a lot of mistakes because of it, right? We didn't know. We thought that we had a flag enabled and it turned out that users couldn't see it, right? Because they could only see the entry in labs, right? So having that first-class support really cleared things up from the code's perspective as well.

Adam: Yeah. And I guess the channel you were talking about earlier, the one where people request changes to LaunchDarkly, have you been able to archive that now?

Jason: Archived! Yeah, it's great. We just log in with SSO and make whatever changes we need.

Adam: I feel like there must have been a little bit of a celebration there, finally archiving that channel.

Jason: Yeah. I mean, luckily, I had a seat on both, so, I never had to go through those pains, but a lot of my teammates did. 

Taraneh: What were some of the tangible differences that you experienced in your workflow after switching to DevCycle?

Jason: Yeah. The biggest one was not having to use that channel for sure. Everything else has been really nice. I mean, yeah, you listen to our feature requests really well, and I can just feel like I can count on it. It works how I expect.

I [also] really like your model of targeting users as an ordered array rather than just a set, if that makes sense? The rule sets execute top down, like in your UI, you have like, "okay, if this passes, then the user gets targeting. Otherwise they move on to this section, otherwise they move on to this section. Otherwise they don't get the flag at all", right? Whereas in LaunchDarkly, it's kind of like "here's everything all at once" and you kinda have to see which evaluations supersede each other. So, I liked your method, like your model, of how users get targeted.
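Editor's note: evaluating targeting rules as an ordered list, where the first matching rule decides the outcome, can be sketched as follows. The `TargetingRule` shape and `evaluate` function are hypothetical and only illustrate the top-down, first-match-wins model Jason describes; they do not reflect DevCycle's internal data structures.

```typescript
interface User {
  id: string;
  email: string;
}

// Hypothetical rule shape: a predicate plus the value to serve on a match.
interface TargetingRule<V> {
  name: string;
  matches: (user: User) => boolean;
  serve: V;
}

// Walk the rules top-down and stop at the first one that matches.
// Later rules are never consulted once an earlier rule has matched.
function evaluate<V>(rules: TargetingRule<V>[], user: User): V | undefined {
  for (const rule of rules) {
    if (rule.matches(user)) {
      return rule.serve;
    }
  }
  return undefined; // no rule matched: the user does not get the flag
}
```

With this model, reading a flag's configuration from top to bottom is enough to predict what a given user receives, because rule order is the only precedence mechanism.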