Intel unveiled ControlFlag – a machine programming research system that can autonomously detect errors in code. Even in its infancy, this self-supervised system shows promise as a productivity tool to assist software developers with the labor-intensive task of debugging.
In preliminary tests, ControlFlag trained and learned novel defects on over 1 billion unlabeled lines of production-quality code.
ControlFlag and debugging
In a world increasingly run by software, developers continue to spend a disproportionate amount of time fixing bugs rather than coding. It’s estimated that of the $1.25 trillion that software development costs the IT industry every year, 50 percent is spent debugging code.
Debugging is expected to take an even bigger toll on developers and the industry at large. As we progress into an era of heterogenous architectures — one defined by a mix of purpose-built processors to manage the massive sea of data available today — the software required to manage these systems becomes increasingly complex, creating a higher likelihood for bugs. In addition, it is becoming difficult to find software programmers who have the expertise to correctly, efficiently and securely program across diverse hardware, which introduces another opportunity for new and harder-to-spot errors in code.
When fully realized, ControlFlag could help alleviate this challenge by automating the tedious parts of software development, such as testing, monitoring and debugging. This would not only enable developers to do their jobs more efficiently and free up more time for creativity, but it would also address one of the biggest price tags in software development today.
“We think ControlFlag is a powerful new tool that could dramatically reduce the time and money required to evaluate and debug code. According to studies, software developers spend approximately 50% of the time debugging. With ControlFlag, and systems like it, I imagine a world where programmers spend notably less time debugging and more time on what I believe human programmers do best — expressing creative, new ideas to machines,” said Justin Gottschlich, principal scientist and director/founder of Machine Programming Research at Intel Labs.
How ControlFlag works
ControlFlag’s bug detection capabilities are enabled by machine programming, a fusion of machine learning, formal methods, programming languages, compilers and computer systems.
ControlFlag specifically operates through a capability known as anomaly detection. As humans existing in the natural world, there are certain patterns we learn to consider “normal” through observation. Similarly, ControlFlag learns from verified examples to detect normal coding patterns, identifying anomalies in code that are likely to cause a bug. Moreover, ControlFlag can detect these anomalies regardless of programming language.
A key benefit of ControlFlag’s unsupervised approach to pattern recognition is that it can intrinsically learn to adapt to a developer’s style. With limited inputs for the control tools that the program should be evaluating, ControlFlag can identify stylistic variations in programming language, similar to the way that readers recognize the differences between full words or using contractions in English.
The tool learns to identify and tag these stylistic choices and can customize error identification and solution recommendations based on its insights, which minimizes ControlFlag’s characterizations of code in error that may simply be a stylistic deviation between two developer teams.
Intel has even started evaluating using ControlFlag internally to identify bugs in its own software and firmware product development. It is a key element of Intel’s Rapid Analysis for Developers project, which aims to accelerate velocity by providing expert assistance.