I have worked in numerous businesses over the years, on many different types of projects, from Windows apps all the way through to popular service-based websites. Each business had a different process for deploying new code and getting features into its production environment. Some processes made it easier to get code live than others. One thing that all of the processes had in common was that a rollback mechanism was in place, allowing you to ‘back out’ any changes that may have caused issues in the production environment. Feature switching works in a similar way to a normal rollback process, but adds another layer to the code deployment process.
Implementing a rollback process is a great way to ensure that you can always get your software, product or service back up and running with minimal effort. In my opinion, though, this process is an ‘all or nothing’ approach. If you are releasing one feature or fix at a time, this release mechanism works fine.
Some projects I have worked on deployed a lot of new features into production software by batching the new functionality or fixes into one release. However, if you had a handful of new features you were trying to get out to customers ASAP and one of the features had an issue, you would have to back out all the other new functionality too, regardless of whether or not the other features worked as expected.
One or two of the businesses I worked with either had a ‘Feature Toggle’ mechanism in place, or we managed to build one into our software delivery process. I will explain what ‘Feature Switching’ (also known as feature toggling or feature tagging) is in the next section.
What Is Feature Switching
So, what is feature switching, toggling or tagging? Well, at its highest level, feature switching is a way to manage code that allows you to put new enhancements into a product and switch them on or off as and when you want them to be used.
“What do you mean?”, I hear you ask. I will try to explain it using an example. I have a website that I log in to. After I log in I see the main page, which shows 4 graphs. These graphs detail key information relating to my business. Each graph takes well over a couple of seconds to load and display its information. The styling of the graphs is like something from the 1980s.
Easy As Flicking A Light Switch
I have paid a designer and developer to improve these graphs. Because it is a completely new way of displaying key information, it has the potential to break the home page of my website.
To get round this I use a feature switching system to create 4 feature switches. I ask the developer to implement code that checks whether the feature switch for a particular graph has been enabled. If so, the website displays the new version. Otherwise I want my website to display the older version of the graph, which we know works. Below is an example piece of code showing how I would like the graphs to be displayed.
if (featureSwitch.IsEnabled("GraphOne")) then show the new graph, else show the old graph
By default all the feature switches are turned off. I release an update to my website with 4 pieces of code similar to the example above. I log in to the website and the home page looks identical to the old version of my website. Using my switching system I enable the ‘GraphOne’ switch. Returning to my website, I refresh the home page. “Hey presto”, the new graph appears in the top left section of my screen, replacing the old graph in the process.
I can now repeat this process for the other 3 graphs, ensuring each graph is working as expected before moving on to the next graph.
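The walkthrough above can be sketched in a few lines of Python. This is only a minimal illustration, not any particular product's code: the `FeatureSwitches` class, its `enable`, `disable` and `is_enabled` methods and the `render_graph` function are hypothetical names made up for the example, and the switches are held in memory only.

```python
class FeatureSwitches:
    """Hypothetical in-memory switch store; every switch defaults to off."""

    def __init__(self):
        self._enabled = set()

    def enable(self, name):
        self._enabled.add(name)

    def disable(self, name):
        self._enabled.discard(name)

    def is_enabled(self, name):
        return name in self._enabled


def render_graph(switches, name):
    """Return which version of a graph the home page would show."""
    if switches.is_enabled(name):
        return f"new {name}"
    return f"old {name}"


switches = FeatureSwitches()
graphs = ["GraphOne", "GraphTwo", "GraphThree", "GraphFour"]

# Straight after the release every switch is off, so the page is unchanged.
before = [render_graph(switches, g) for g in graphs]

# Enable one switch; only that graph changes on the next page load.
switches.enable("GraphOne")
after = [render_graph(switches, g) for g in graphs]
```

Here `before` lists the old version of all four graphs, while `after` differs only in its first entry, which is the behaviour described for the ‘GraphOne’ switch.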
Benefits of Feature Switching
There are many benefits of using feature switching. I am going to run through what I believe are the key reasons for using a feature switching system.
Safely Deploy Code Only Once
I know from my experience of production releases how stressful they can be, especially if you have to release code out of hours, after already doing a full day’s work.
It can take time to deploy a release to production, depending on the size of the features that it contains. It isn’t worth thinking about having to roll back and redeploy a previous version of the code and components. Both deploying and redeploying take time and can potentially take your product or service off-line. What happens if your rollback fails? It doesn’t bear thinking about.
If something does go wrong, you have to take your production system down again to restore it to the previous working state. Using feature switching reduces the risk of something going wrong. Once you have pushed the code into production and validated that everything is working as the previous version did, you will not need to redeploy any code. It is then a case of enabling a feature switch and seeing if that works. Oops, something has broken, or the feature that is behind the switch isn’t working as expected? To rectify the issue, it is just a case of disabling the switch again and the previous behavior is restored.
No further downtime and no need to roll back. Amazing.
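One way to get this ‘no redeploy’ behaviour is to keep the switch values outside the deployed code and read them each time they are checked. The sketch below assumes the switches live in a small JSON file. The file name `switches.json` and the helper names `is_enabled` and `set_switch` are assumptions made for illustration; a real system might just as easily use a database or a configuration service.

```python
import json
import os
import tempfile


def is_enabled(path, name):
    """Read the switch file on every check so changes apply without a redeploy."""
    try:
        with open(path, encoding="utf-8") as f:
            switches = json.load(f)
    except (FileNotFoundError, json.JSONDecodeError):
        return False  # a missing or unreadable file means "everything off"
    return switches.get(name) is True


def set_switch(path, name, value):
    """Flip one switch by rewriting the file; the running code is untouched."""
    try:
        with open(path, encoding="utf-8") as f:
            switches = json.load(f)
    except (FileNotFoundError, json.JSONDecodeError):
        switches = {}
    switches[name] = value
    with open(path, "w", encoding="utf-8") as f:
        json.dump(switches, f)


# Hypothetical location for the switch file, created just for this demo.
path = os.path.join(tempfile.mkdtemp(), "switches.json")

deployed = is_enabled(path, "GraphOne")      # code is live, switch still off
set_switch(path, "GraphOne", True)
switched_on = is_enabled(path, "GraphOne")   # new behaviour is live
set_switch(path, "GraphOne", False)
switched_off = is_enabled(path, "GraphOne")  # previous behaviour restored
```

Reading the file on every check keeps the sketch simple. With heavier traffic you would probably cache the values for a short period instead.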
Removes ‘All or Nothing’ Approach
I touched on this in the example in the previous section. If I need to update 4 graphs on a page of my website, I can deploy the code for all 4 graphs and switch them on as I need them.
Using the example, I have found that the first two graphs work perfectly, as designed. However, the last two graphs aren’t displaying information correctly. Let’s say there is a bug in the third and fourth graphs. Both graphs are only showing information for the last seven days when, in fact, the graphs have been designed to show key info from the last 30 days. I switch off the last two feature switches and the older graphs are again displayed, showing the correct information.
I have managed to get two out of the four features out in a release. If I hadn’t used feature switching then I would have had to roll back all four features, meaning we would have had a failed release and additional product downtime. Argh.
Makes Testing and Fault Finding Easier
Again, using my example: without feature switching, I would have to test that all four of the new graphs were working as I would like at the same time. This means that checking everything is working can be a rushed process. I know only too well that rushing can lead to mistakes. I don’t want to sign off a release by mistake.
Toggling on each graph and testing that it works as required before moving on to the next graph enables the person testing to focus on making sure everything is correct with a particular feature. Only a small part of the system may not be working correctly at any one time, which reduces the impact on customers.
If the worst case happens and one out of the four graphs doesn’t work, a development team can use the feature switch to pinpoint what the issue is within a production environment.
As the code with the bug in it is still in production, a development team could switch on the feature when a minimal number of customers are using the product and investigate what the issue is with the aid of other departments such as infrastructure. Once the issue has been found, the feature can be switched off again. The team can then move on to fixing the issue, confident that they know what the underlying problem is.
I have in the past been on releases where an issue has only occurred in a production environment. All the testing and validation in our staging environments did not throw up any issues. If we had rolled back the code we wouldn’t have been able to rally round and investigate the issue. It would have been left at “we think this is why it broke”, not “Yep, that’s why it isn’t working”. (I have a saying that I use when working with development teams: “Assumption is the mother of all f**k-ups”.)
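The ‘switch it on, investigate, switch it off again’ routine can be made harder to get wrong by letting the code restore the switch for you. Below is a small Python sketch of that idea. The `switches` dictionary, the `temporarily_enabled` helper and the `investigate` function are all hypothetical stand-ins, and the error raised simply imitates the seven-day bug from the earlier example.

```python
from contextlib import contextmanager

switches = {"GraphThree": False}  # hypothetical in-memory switch store
observations = []


@contextmanager
def temporarily_enabled(name):
    """Enable a switch for the duration of an investigation, then restore it."""
    previous = switches.get(name, False)
    switches[name] = True
    try:
        yield
    finally:
        # Runs even if the investigation raises, so the faulty feature
        # never stays on by accident.
        switches[name] = previous


def investigate():
    # Stand-in for whatever diagnostics the team collects in production.
    observations.append(("GraphThree enabled", switches["GraphThree"]))
    raise RuntimeError("graph only shows the last 7 days")


try:
    with temporarily_enabled("GraphThree"):
        investigate()
except RuntimeError as error:
    observations.append(("error seen", str(error)))
```

After this runs, the switch is back to off even though the investigation hit an error, and `observations` holds what was seen while the feature was live.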
Makes Coding New Features Easier
In software development, you may have heard about the SOLID principles, Clean Code and the KISS principle. These are great and I’m a big believer in using these guidelines. However, software can get complicated very quickly.
For example, say I want to implement a new improvement. If that new functionality didn’t work and I had to somehow fall back to the old way of serving up a graph, my code would get very messy, very quickly, and the principles I touched on would get thrown out of the window.
Putting new functionality behind a feature switch lets you, as a developer, focus on building one thing in a professional and maintainable way. You don’t need to worry about worst case scenarios and fallbacks. All that is taken care of for you by the feature switching system.
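As a rough illustration of keeping the fallback out of the feature code, the Python sketch below moves the old-versus-new decision into a small decorator. The `switched` decorator, the `enabled` set and both graph functions are hypothetical names invented for this example. The point is only that the new function contains no fallback logic of its own.

```python
import functools

enabled = set()  # hypothetical switch store; names in the set are "on"


def switched(name, old):
    """Decorator: run the decorated (new) function only when `name` is on,
    otherwise call the `old` implementation with the same arguments."""
    def decorate(new):
        @functools.wraps(new)
        def wrapper(*args, **kwargs):
            implementation = new if name in enabled else old
            return implementation(*args, **kwargs)
        return wrapper
    return decorate


def old_graph_one(data):
    return "table: " + ", ".join(str(v) for v in data)


@switched("GraphOne", old=old_graph_one)
def graph_one(data):
    # The new code contains no fallback logic of its own.
    return "chart: " + " ".join("#" * v for v in data)


off_result = graph_one([1, 2, 3])  # switch off: old implementation runs
enabled.add("GraphOne")
on_result = graph_one([1, 2, 3])   # switch on: new implementation runs
```

Callers always use `graph_one`. Which implementation actually runs is decided by the switch at call time.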
Remove Feature Switching and Old Code
It may just be me and my OCD, but I love deleting old, unused code. I get a kick out of it. This goes for feature switches too. I love deleting old ones. If you think about it, it’s a good thing.
You have improved a part of a product that has now been switched on in production for a good while. Why wouldn’t you want to remove that old smelly code and the feature switch that is around it? It means you have done a great job.
Code is becoming more and more organic, evolving every day at a faster pace than ever before. Old code and functionality need to be removed on a constant basis to keep your project lean. Keeping your code footprint as small as possible allows deployments to be a quick and easy process.
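To make this clean-up a habit, it helps to know which switches have been on long enough to be retired. The Python sketch below assumes the switching system records the date each switch was last turned on. The `enabled_since` data, the dates in it and the 30-day threshold are all made up for illustration, since ‘a good while’ will mean different things to different teams.

```python
from datetime import date, timedelta

# Hypothetical record of when each switch was last turned on (None = off).
enabled_since = {
    "GraphOne": date(2024, 1, 10),
    "GraphTwo": date(2024, 3, 1),
    "GraphThree": None,
}


def removal_candidates(enabled_since, today, min_days=30):
    """Return switches that have been on for at least `min_days` days."""
    cutoff = today - timedelta(days=min_days)
    return sorted(
        name
        for name, since in enabled_since.items()
        if since is not None and since <= cutoff
    )


candidates = removal_candidates(enabled_since, today=date(2024, 3, 15))
```

With this sample data only ‘GraphOne’ qualifies, so its switch and the old graph code behind it are the ones to delete.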
Implementing Feature Switching
I am not going to go into detail about how to implement a feature switching system. The majority of feature switch systems that companies use have been built ‘in-house’ by their existing development teams, and are therefore tailored to the needs of each business. However, the usage and underlying principle in each system are pretty much the same.
Using feature switching adds an additional layer of safety to your release process, as well as giving you a more granular ‘rollback’ process that sits on top of your existing rollback mechanism. It takes pressure off your development team too, as they are confident that if something does go wrong, the code with the issue can be turned off very quickly if need be.
If you would like to find out more about feature switching and how to implement the system in a project, please do not hesitate to contact me. I will be more than happy to help.