For nearly 50 years, the switch statement (also known as the case statement) has been an integral part of programming. In recent years, however, some are claiming that the switch statement has outlived its usefulness. Others go even further by labeling the switch statement as a code-smell.
In 1952, Stephen Kleene conceived the switch statement in his book, Introduction to Metamathematics. The first notable implementation was in ALGOL 58 in 1958. Later, the switch statement was included in the enduring C programming language, which, as we know, has influenced most modern programming languages.
Fast forward to the present day and virtually every language has a switch statement. However, a few languages have omitted it, the most notable being Smalltalk.
This piqued my curiosity: why was the switch statement excluded from Smalltalk?
Andy Bower, one of the creators/proponents behind Dolphin Smalltalk, shared his thoughts on why Smalltalk excluded the switch statement:
When I first came to Smalltalk from C++, I couldn’t understand how a supposedly fully fledged language didn’t support a switch/case construct. After all when I first moved up to “structured programming” from BASIC I thought that switch was one of the best things since sliced bread. However, because Smalltalk didn’t support a switch I had to look for and understand, how to overcome this deficiency. The correct answer is, of course, to use polymorphism and to make the objects themselves dispatch to the correct piece of code. Then I realized that it wasn’t a “deficiency” at all, but Smalltalk was forcing me into much finer grained OOP design than I had got(ten) used to in C++. If there had been a switch statement available it would have taken me a lot longer to learn this or, worse, I might still be programming C++/Java pseudo-object style in Smalltalk.
I would contend that in normal OOP there is no real need for a switch statement. Sometimes, when interfacing to a non-OOP world (like receiving and dispatching WM_XXXX Windows messages that are not objects but just integers), then a switch statement would be useful. In these situations, there are alternatives (like dispatching from a Dictionary) and the number of times they crop up doesn’t warrant the inclusion of additional syntax.
Was Andy right? Are we better off without the switch statement? Would other languages also benefit from excluding the switch statement?
To shed some light on this question, I’ve put together a comparison between a switch statement, a dictionary, and polymorphism. Let’s call it a smackdown. May the best implementation win!
Each implementation has a method that takes one parameter, an integer, and returns a string. We’ll use cyclomatic complexity and maintainability index to examine each implementation. We’ll then take a holistic view of all three implementations.
The code.
Switch Statement
Metric | Value |
---|---|
Maintainability Index | 72 |
Cyclomatic Complexity | 6 |

    public class SwitchWithFourCases
    {
        public string SwitchStatment(int color)
        {
            var colorString = "Red";
            switch (color)
            {
                case 1:
                    colorString = "Green";
                    break;
                case 2:
                    colorString = "Blue";
                    break;
                case 3:
                    colorString = "Violet";
                    break;
                case 4:
                    colorString = "Orange";
                    break;
            }
            return colorString;
        }
    }
Dictionary
Metric | Value |
---|---|
Maintainability Index | 73 |
Cyclomatic Complexity | 3 |

    public class DictionaryWithFourItems
    {
        public string Dictionary(int color)
        {
            var colorString = "Red";
            var colors = new Dictionary<int, string> {{1, "Green"}, {2, "Blue"}, {3, "Violet"}, {4, "Orange"}};
            var containsKey = colors.ContainsKey(color);
            if (containsKey)
            {
                colorString = colors[color];
            }
            return colorString;
        }
    }
Polymorphism
Metric | Value |
---|---|
Total Maintainability Index | 94 |
Total Cyclomatic Complexity | 15 |
Interface
Metric | Value |
---|---|
Maintainability Index | 100 |
Cyclomatic Complexity | 1 |

    public interface IColor
    {
        string ColorName { get; }
    }
Factory
Metric | Value |
---|---|
Maintainability Index | 76 |
Cyclomatic Complexity | 4 |

    public class ColorFactory
    {
        public string GetColor(int color)
        {
            IColor defaultColor = new RedColor();
            var colors = GetColors();
            var containsKey = colors.ContainsKey(color);
            if (containsKey)
            {
                var c = colors[color];
                return c.ColorName;
            }
            return defaultColor.ColorName;
        }

        private static IDictionary<int, IColor> GetColors()
        {
            return new Dictionary<int, IColor>
            {
                {1, new GreenColor()},
                {2, new BlueColor()},
                {3, new VioletColor()},
                {4, new OrangeColor()},
                {5, new MagentaColor()}
            };
        }
    }
Implementation
Metric | Value |
---|---|
Maintainability Index | 97 |
Cyclomatic Complexity | 2 |

    public class BlueColor : IColor
    {
        public string ColorName => "Blue";
    }

    public class RedColor : IColor
    {
        public string ColorName => "Red";
    }

    public class GreenColor : IColor
    {
        public string ColorName => "Green";
    }

    public class MagentaColor : IColor
    {
        public string ColorName => "Magenta";
    }

    public class VioletColor : IColor
    {
        public string ColorName => "Violet";
    }

    // Referenced by ColorFactory but missing from the original listing.
    public class OrangeColor : IColor
    {
        public string ColorName => "Orange";
    }
The Results
Before I dive into the results, let’s define Cyclomatic Complexity and Maintainability Index:
- Cyclomatic Complexity is the measure of logic branching. The lower the number, the better.
- Maintainability Index measures maintainability of the code. It’s on a scale between 0 and 100. The higher the number, the better.
Implementation | Cyclomatic Complexity | Maintainability Index |
---|---|---|
Switch Statement | 6 | 72 |
Dictionary | 3 | 73 |
Polymorphism | 15 | 94 |
We will examine cyclomatic complexity first.
The results for cyclomatic complexity are straightforward. The dictionary implementation is the simplest. Does this mean it’s the best solution? No, as we’ll see when we evaluate the maintainability index.
Most would think, as I did, that the implementation with the lowest cyclomatic complexity is the most maintainable. How could it be any other way?
In our scenario, the implementation with the lowest cyclomatic complexity isn’t the most maintainable. In fact, it’s the opposite: the most complex implementation is the most maintainable! Mind blown!
If you recall, the higher the maintainability index score, the better. Cutting to the chase, polymorphism has the best maintainability index score — but it also has the highest cyclomatic complexity. What gives? That doesn’t seem right.
Why is the most complex implementation the most maintainable? To answer this, we must understand the maintainability index.
The maintainability index consists of four metrics: cyclomatic complexity, lines of code, the number of comments, and the Halstead Volume. The first three metrics are relatively well known, but the last one, the Halstead Volume, is relatively unknown. Like cyclomatic complexity, the Halstead Volume attempts to objectively measure code complexity.
In simple terms, Halstead Volume measures the number of moving parts (variables, system calls, arithmetic, coding constructs, etc.) in code. The higher the number of moving parts the more complexity. The lower the number of moving parts, the lower the complexity. This explains why the polymorphic implementation scores high on the maintainability index; the classes have little to no moving parts. Another way to look at the Halstead Volume is it measures “moving parts” density.
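To make the relationship concrete, here is a minimal sketch of how such an index can be computed. It assumes the normalized 0 to 100 formula that Microsoft documents for Visual Studio code metrics, which omits the comment term mentioned above; the tool used for the numbers in this article may weight things differently. The class and method names are hypothetical.

```csharp
using System;

// Hypothetical helper, not part of the article's repository.
public static class MaintainabilityMetrics
{
    // Halstead Volume: V = N * log2(n), where N is the total number of
    // operators and operands and n is the number of distinct ones.
    public static double HalsteadVolume(int totalOperatorsAndOperands, int distinctOperatorsAndOperands)
    {
        return totalOperatorsAndOperands * Math.Log(distinctOperatorsAndOperands, 2);
    }

    // Assumed formula (Visual Studio style, normalized to 0..100):
    // MAX(0, (171 - 5.2*ln(V) - 0.23*CC - 16.2*ln(LOC)) * 100 / 171)
    public static double MaintainabilityIndex(double halsteadVolume, int cyclomaticComplexity, int linesOfCode)
    {
        var raw = 171
                  - 5.2 * Math.Log(halsteadVolume)
                  - 0.23 * cyclomaticComplexity
                  - 16.2 * Math.Log(linesOfCode);
        return Math.Max(0, raw * 100 / 171);
    }
}
```

Under this formula, cyclomatic complexity carries a small linear weight, while the Halstead Volume and lines of code enter through logarithms. That is consistent with a set of tiny classes scoring well even when their summed cyclomatic complexity is high.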
What is software, if it’s not to change? To reflect the real world, we are introducing change. I’ve added a new color to each implementation.
Below are the revised results.
Implementation | Cyclomatic Complexity | Maintainability Index |
---|---|---|
Switch Statement | 7 | 70 |
Dictionary | 3 | 73 |
Polymorphism | 17 | 95 |
The switch statement and the polymorphic approaches both increased in cyclomatic complexity (the switch statement from 6 to 7, polymorphism from 15 to 17), but interestingly, the dictionary didn’t increase. At first I was puzzled by this, but then I realized that the dictionary treats the colors as data, while the other two implementations treat the colors as code. Let’s get down to brass tacks.
Turning our attention to the maintainability index, only one implementation, the switch statement, decreased in maintainability. Polymorphism’s maintainability score improved even though its complexity also increased (we’d prefer it to decrease). As I mentioned above, this is counter-intuitive.
Our comparison shows that, from a complexity standpoint, the dictionary scales well: adding an entry adds data, not branches, so its cyclomatic complexity stayed at 3. The polymorphic approach is by far the most maintainable and seems to increase in maintainability as more scenarios are added. The switch statement increased in complexity and decreased in maintainability when the new scenario was added. Even before we added the new scenario, it had the lowest maintainability index, and its method had twice the cyclomatic complexity of the dictionary’s.
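The "colors as data" point can be sketched as follows. This is an illustrative variant, not the implementation that was measured: the `ColorLookup` class and its `Register` method are hypothetical, and it uses `TryGetValue` in place of the `ContainsKey` check shown earlier.

```csharp
using System.Collections.Generic;

// Hypothetical variant of the dictionary implementation.
public class ColorLookup
{
    // The color table is plain data; adding an entry adds no new branch.
    private readonly Dictionary<int, string> _colors = new Dictionary<int, string>
    {
        {1, "Green"}, {2, "Blue"}, {3, "Violet"}, {4, "Orange"}
    };

    // Registers (or overwrites) a color without touching GetColor.
    public void Register(int key, string name)
    {
        _colors[key] = name;
    }

    // One lookup and one branch, no matter how many colors exist.
    public string GetColor(int color)
    {
        return _colors.TryGetValue(color, out var name) ? name : "Red";
    }
}
```

Calling `Register(5, "Magenta")` makes `GetColor(5)` return "Magenta" while the control flow of `GetColor` stays exactly the same, which is why a branch-counting metric does not move.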
Jem Finch from Google shared his thoughts on the switch statement’s shortcomings:
1. Polymorphic method implementations are lexically isolated from one another. Variables can be added, removed, modified, and so on without any risk of impacting unrelated code in another branch of the switch statement.
2. Polymorphic method implementations are guaranteed to return to the correct place, assuming they terminate. Switch statements in a fall through language like C/C++/Java require an error-prone “break” statement to ensure that they return to the statement after the switch rather than the next case block.
3. The existence of a polymorphic method implementation can be enforced by the compiler, which will refuse to compile the program if a polymorphic method implementation is missing. Switch statements provide no such exhaustiveness checking.
4. Polymorphic method dispatching is extensible without access to (or recompiling of) other source code. Adding another case to a switch statement requires access to the original dispatching code, not only in one place but in every place the relevant enum is being switched on.
5. … you can test polymorphic methods independent of the switching apparatus. Most functions that switch like the example the author gave will contain other code that cannot then be separately tested; virtual method calls, on the other hand, can.
6. Polymorphic method calls guarantee constant time dispatch. No sufficiently smart compiler is necessary to convert what is naturally a linear time construct (the switch statement with fall through) into a constant time construct.
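Points 4 and 5 can be illustrated with a small sketch. The `ColorRegistry` class below is hypothetical and is not the `ColorFactory` shown earlier; the interface and color classes mirror the ones above so that the block stands alone.

```csharp
using System.Collections.Generic;

public interface IColor
{
    string ColorName { get; }
}

public class RedColor : IColor
{
    public string ColorName => "Red";
}

public class GreenColor : IColor
{
    public string ColorName => "Green";
}

// Hypothetical dispatcher: it knows nothing about concrete colors.
public class ColorRegistry
{
    private readonly Dictionary<int, IColor> _colors = new Dictionary<int, IColor>();
    private readonly IColor _defaultColor;

    public ColorRegistry(IColor defaultColor)
    {
        _defaultColor = defaultColor;
    }

    public void Register(int key, IColor color)
    {
        _colors[key] = color;
    }

    public IColor Resolve(int key)
    {
        return _colors.TryGetValue(key, out var color) ? color : _defaultColor;
    }
}

// Added later, possibly in another assembly; ColorRegistry is not edited.
public class MagentaColor : IColor
{
    public string ColorName => "Magenta";
}
```

`MagentaColor` can be unit tested on its own (`new MagentaColor().ColorName`) and plugged in with `Register(5, new MagentaColor())`, without reopening the dispatching code. The equivalent change to a switch statement means editing every place that switches on the value.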
Unfortunately, or fortunately, depending on your camp, most languages have a switch statement, and they aren’t going anywhere anytime soon. With this in mind, it’s good to know what’s happening under the hood when compiling switch statements.
There are three switch statement optimizations that can occur:
- If-else chain – When a switch statement has a small number of cases or sparse cases (non-adjacent values, such as 10, 250, 1000), the compiler converts it to a chain of if/else-if comparisons.
- Jump Table – For larger sets of adjacent cases (1, 2, 3, 4, 5), the compiler converts the switch statement to a jump table. A jump table is essentially an array of code addresses (think goto targets) indexed by the case value, so dispatch is a single indexed jump.
- Binary Search – For large sets of sparse cases, the compiler can implement a binary search to identify the case quickly, similar to how an index works in a database. In extraordinary cases, where a switch has a large number of both sparse and adjacent cases, the compiler will use a combination of the three optimizations.
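The jump table and binary search strategies can be imitated by hand to show the idea. This is only an analogy, assuming made-up keys and a hypothetical `SwitchLowering` class: a real compiler emits branch instructions and tables of code addresses, whereas this sketch stores the result strings directly.

```csharp
using System;

// Hand-written analogy of what a compiler may generate; names are hypothetical.
public static class SwitchLowering
{
    // Jump-table idea: dense keys 1..4 index directly into an array.
    private static readonly string[] DenseTable = { "Green", "Blue", "Violet", "Orange" };

    public static string DenseLookup(int color)
    {
        // One range check, then a constant-time indexed access.
        return color >= 1 && color <= DenseTable.Length
            ? DenseTable[color - 1]
            : "Red"; // default case
    }

    // Binary-search idea: sparse keys are kept sorted and searched in O(log n).
    private static readonly int[] SparseKeys = { 10, 250, 1000, 40000 };
    private static readonly string[] SparseValues = { "Green", "Blue", "Violet", "Orange" };

    public static string SparseLookup(int color)
    {
        // Array.BinarySearch returns a negative number when the key is absent.
        var index = Array.BinarySearch(SparseKeys, color);
        return index >= 0 ? SparseValues[index] : "Red";
    }
}
```

A dense table wastes no slots when the keys are adjacent, while the sorted-key search avoids allocating a huge, mostly empty table for keys such as 10 and 40000.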
Summary
In an object-oriented world, the switch statement, conceived in 1952, is a mainstay of the software engineer. A notable exception is Smalltalk, where the designers opted to exclude the switch statement.
When compared to two equivalent alternatives, the dictionary and polymorphism, the switch statement did not fare as well.
The switch statement is here to stay, but as our comparison has shown, there are better alternatives to it.
The implementations are available on GitHub.