March 24, 2019 ▪ 7 min read
Within an individual engineering team, it’s fairly difficult to determine the actual value gained by switching from a dynamic to a static type system.
A group of researchers created a study to look into just this: do static type systems improve software quality, and if so, by how much?
Definition and methodology
The research team classified a bug as ts-detectable [1] if adding type annotations would cause an error on a line changed by the bug fix, and the new type annotations are consistent with the fixed program.
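To make the definition concrete, here is a minimal sketch of a bug that would count as detectable. It is a made-up example, not one from the study's corpus: the function names and the port scenario are assumptions chosen purely for illustration.

```typescript
// Hypothetical unannotated code: `port` is read from configuration as a
// string, so `+` concatenates instead of adding.
function nextPortUntyped(port: any): any {
  return port + 1;
}

// Annotated version: the parameter type states the intended contract.
function nextPort(port: number): number {
  return port + 1;
}

const configuredPort: string = "8080"; // assumed configuration value

// With the annotation in place, the buggy call `nextPort(configuredPort)`
// is rejected by the type checker (a string is not assignable to a number),
// on the same line that the fix below changes.
const buggy = nextPortUntyped(configuredPort); // "80801" at runtime
const fixed = nextPort(Number(configuredPort)); // 8081

console.log(buggy, fixed);
```

The annotations are also consistent with the fixed program, since the corrected call type-checks, which is the second half of the definition.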
The researchers classified each bug as ts-detectable or not in each of TypeScript 2.0 and Flow 0.30. First, they checked whether the bug report was clearly not type-related (for example, if the bug was due to a misunderstanding of the product specification), and if so they labeled it as undetectable. Next, they looked at the intended behavior of the bug fix and attempted to add type annotations that would cause the type system to error at the area of code patched by the bug fix. If they could do this, the sample was deemed ts-detectable; otherwise it was not. The researchers also time-boxed each bug investigation to 10 minutes, after which the bug was deemed unknown.
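The labeling procedure was manual, but its decision flow can be sketched in code. The `Investigation` shape, its field names, and the `classify` function below are hypothetical and only restate the steps described above; they are not the researchers' tooling.

```typescript
type Label = "ts-detectable" | "undetectable" | "unknown";

// Hypothetical record of one manual investigation.
interface Investigation {
  clearlyNotTypeRelated: boolean;   // e.g. a misread product specification
  annotationTriggersError: boolean; // an annotation made the checker error at the patched code
  minutesSpent: number;
}

const TIME_BOX_MINUTES = 10;

function classify(investigation: Investigation): Label {
  // Step 1: reports that are clearly not type-related are undetectable.
  if (investigation.clearlyNotTypeRelated) {
    return "undetectable";
  }
  // Step 2: an annotation that errors at the patched code makes the bug detectable.
  if (investigation.annotationTriggersError) {
    return "ts-detectable";
  }
  // Step 3: running out the time box without an answer leaves the bug unknown.
  if (investigation.minutesSpent >= TIME_BOX_MINUTES) {
    return "unknown";
  }
  return "undetectable";
}

console.log(
  classify({ clearlyNotTypeRelated: false, annotationTriggersError: true, minutesSpent: 4 })
);
```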
The team was able to label all 400 bug samples: Flow and TypeScript each detected 60 of the bugs. This means that, at a 95% confidence level, the percentage of detectable bugs for each falls within [11.5%, 18.5%], with a mean of 15%.
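The interval can be reproduced with a quick calculation. This sketch assumes the usual normal approximation for a proportion (z = 1.96 for 95%); the summary here does not state which method the paper used, and `proportionInterval` is a name invented for this example.

```typescript
// Normal-approximation confidence interval for a proportion.
// Assumption: z = 1.96 corresponds to a 95% confidence level.
function proportionInterval(
  hits: number,
  sampleSize: number,
  z: number = 1.96
): { mean: number; low: number; high: number } {
  const mean = hits / sampleSize;
  const standardError = Math.sqrt((mean * (1 - mean)) / sampleSize);
  const margin = z * standardError;
  return { mean, low: mean - margin, high: mean + margin };
}

const asPercent = (x: number): string => (x * 100).toFixed(1) + "%";

const interval = proportionInterval(60, 400);
console.log(asPercent(interval.low), asPercent(interval.mean), asPercent(interval.high));
// prints: 11.5% 15.0% 18.5%
```

With 60 of 400 bugs detected, the margin of error works out to roughly 3.5 percentage points on either side of 15%, which matches the range quoted above.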
Most of the bugs were easy to determine: only 18 of the 400 hit the time-box period of 10 minutes, mostly due to external modules and interfaces making it harder to isolate the source of the bug with type annotations. The team spent more time on these bugs, utilizing published type definitions and documentation when necessary, and were able to classify them all.
The authors think this greatly understates the impact of static typing, because:
- They only surveyed publicly visible bugs, which means any bug caught during development was not included. They also think public bugs are more often caused by a misunderstanding of the specification, which type systems cannot detect.
- These results do not include any other strengths of static type systems, like developer efficiency or app performance.
- This experiment uses the relatively weak type systems of TypeScript and Flow.
- The authors have limited expertise in Flow and TypeScript, which means they could have incorrectly deemed a bug undetectable.
What about the undetectable bugs?
The vast majority of bugs that were undetectable were due to “specification errors”, which constituted 78% of total bugs. This is covered by UIError, and the catch-all SpecError in the above histogram. This demonstrates the importance of careful specification before development.
The second most common error type was StringError, often due to a wrong URL.
Comparing TypeScript and Flow
While TypeScript and Flow had the same number of ts-detectable bugs, they don’t have complete overlap. There were 3 bugs that were only detectable in Flow, and 3 only in TypeScript.
In all three of the bugs detectable in Flow but not TypeScript, the bug was a result of concatenating a possibly undefined or null value with another string. For example:
var x = " " + null + " ";
This highlights a TypeScript weakness, at least as of version 2.0, in its null handling.
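A slightly larger sketch shows why this matters in practice. The `greetUnchecked` and `greetChecked` helpers are hypothetical, made up for this example. TypeScript's checker accepts the first one because one side of the `+` is a string, so the null silently ends up in the output.

```typescript
// Hypothetical helper: TypeScript accepts this concatenation even though
// `name` may be null, because the other operand is a string.
function greetUnchecked(name: string | null): string {
  return "Hello, " + name + "!";
}

// Handling the null case explicitly avoids leaking "null" into the text.
function greetChecked(name: string | null): string {
  const safeName = name === null ? "guest" : name;
  return "Hello, " + safeName + "!";
}

console.log(greetUnchecked(null)); // Hello, null!
console.log(greetChecked(null));   // Hello, guest!
```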
Two of the three bugs detectable in TypeScript but not Flow were due to Flow’s incomplete support for using string literals as an index.
Also of note: of the 60 ts-detectable bugs, 22 of them needed null checks. This is a feature added in TypeScript 2.0, and needs the strictNullChecks compiler option to be enabled.
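As an illustration of the kind of bug a null check catches, here is a small sketch that assumes strict null checking is turned on (the strictNullChecks compiler option in TypeScript). The `User` type, the sample data, and `displayName` are hypothetical.

```typescript
interface User {
  id: number;
  name: string;
}

const users: User[] = [{ id: 1, name: "Ada" }]; // hypothetical data

function displayName(id: number): string {
  // `find` returns `User | undefined` when no element matches.
  const user = users.find((candidate) => candidate.id === id);
  // Under strict null checking, reading `user.name` without this guard
  // is reported as a type error instead of failing at runtime.
  if (user === undefined) {
    return "unknown user";
  }
  return user.name;
}

console.log(displayName(1)); // Ada
console.log(displayName(2)); // unknown user
```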
Approximating the cost of using static typings
While it’s incredibly difficult to directly measure the effort required for programmers to use a static type system, the authors tried to approximate it with two measures:
- token tax, which was the number of tokens in an annotation needed to trigger a type error for the bug in question. The goal for this was to proxy the number of decisions a programmer would make when adding type annotations.
- time tax, which was the time spent adding annotations.
Note that these measures underestimate the real-world cost, because they are calculated per bug, with the authors adding only the type annotations needed to target the bug in question (and not annotating the entire module).
These metrics are only intended to track the incremental cost for each change a developer makes. If we assume an entire project is already using a static type checker, then the cost to a developer is only the time and tokens needed as part of their code change.
Using these measures, the authors find that the mean annotation tax for each bug is: 1.7 tokens and 231 seconds for Flow, and 2.4 tokens and 306 seconds for TypeScript.
The authors note this discrepancy is largely due to Flow’s stronger type inference and its more compact syntax for nullable types, which made it faster for them to annotate.
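To get a feel for how nullable syntax affects the token tax, compare Flow's maybe type `?string` with the closest TypeScript union, `string | null | undefined`. The counter below is a rough sketch: `countTokens` and its tokenization rule (each run of word characters is one token, each punctuation character is one token) are assumptions for illustration, and the study's own way of counting tokens may differ.

```typescript
// Rough token counter. Assumption: a run of word characters counts as one
// token and every other non-whitespace character counts as one token.
function countTokens(annotation: string): number {
  const tokens = annotation.match(/[\w$]+|[^\s\w$]/g);
  return tokens === null ? 0 : tokens.length;
}

const flowNullable = "?string";                         // Flow maybe type
const typeScriptNullable = "string | null | undefined"; // closest TypeScript union

console.log(countTokens(flowNullable));       // 2
console.log(countTokens(typeScriptNullable)); // 5
```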
However, the authors also noted that the most time-consuming aspect of this project was handling external modules and their typings. For many projects, Flow didn’t have built-in support, and the team leaned on the TypeScript community’s set of type definitions.
The research is pretty clear that static typing does indeed help prevent a significant percentage of bugs, even for engineers not familiar with the language or codebase.
While it’s possible those bugs could have been caught through other means — such as testing or linting — there’s a huge benefit in preventing this class of issues as part of your development cycle (as opposed to your testing process). There are also many other adjacent benefits of static typing that were not explored in this research, such as improved code editor integration and faster new engineer onboarding to a codebase.
One takeaway from this research is that there is value in an incremental conversion of your codebase to static typing. The researchers did not convert entire projects and were still able to prevent bugs. Rather than hold off conversion until your team has bandwidth to fully migrate, start incrementally with the most commonly edited files or those most susceptible to bugs.
TypeScript and Flow performed similarly in the goal of preventing bugs, but may differ more significantly in how easily they can be adopted within a company or toolkit. Use the software that best fits your team’s needs and goals: the differences highlighted here are small enough to be negligible.
[1] ts-detectable: Given a static type system ts, a bug b is ts-detectable when adding or changing type annotations causes the program containing b to error on a line changed by a fix, and the new annotations are consistent with f, a fixed version of the program.
I'm a software engineer at Asana. I manage a few engineers and am the program & tech lead on a product team. I spend a lot of time thinking about running effective teams, fostering growth, product+engineering collaboration, and engineering design patterns.