Symbolic Execution

July 19, 2016

What is Symbolic Execution

Symbolic execution is a method of finding errors that are otherwise difficult to find in a software application. The idea behind the use of symbolic execution/testing is to test the application by providing symbolic values than any specific data value.

Symbolic execution has gained much popularity in the recent past. This technique ensures high coverage and for finding deep errors in a very complex code or software applications. The aim is to discover maximum number of paths in a very short span of time, to generate a set of input values that covers that path and also check for various kinds of errors which includes uncaught exceptions, security issues, memory corruptions etc. The ability of symbolic execution method to generate a concrete set of input values, is one of the major strength, because it enables a developer to uncover a list of input values that triggers a bug. This information can be used to independently confirm and debug the error without the use of a symbolic execution tool.

Steps for implementing symbolic execution :

The sequence of execution requires selection of paths, the implementation of which depends on a set of data values.
In symbolic execution, symbolic values replace the data values with set of expressions, that is, one expression per output variable.
A very common approach is to carefully analyse the program to create a flow graph.
The flow graph is used to identify the various decision points. When one traverses the flow graph, we get to know the various assignment statements and branch predicates.

Issues with Symbolic Execution :

Path Explosion -In case of static code coverage, the size of a program in a large software application can grow exponentially. This results in an inefficiency to cover a large portion of the relevant parts of a program within a limited amount of time. The symbolic execution implicitly filters those paths which does not depend on symbolic input and those which are not feasible to execute as per the given path constraints.
Constraint Solving -The issue here arises in the fact that choice of which branch condition is a feasible option. A program branch generally depends on a small number of program variables and thus on a very small number of constraints from the path condition.
Memory modelling -This technique deals with implementing symbolic execution is to take into consideration memory related factor. For instance, a memory model that approximates a fixed width integer variable with actual mathematical integers may prove to be more efficient. This on the contrary may result in the arithmetic overflows which may result in symbolic execution to miss out paths, or explore those that are otherwise infeasible.
Handling Concurrency -The real world program is quite dynamic in nature. This makes testing a difficult task. But such difficulties have been dealt with the advent of dynamic symbolic execution techniques.

Advantages:

Executing calls directly - It is simple to implement. That is, calls to the program can be done depending upon the sequence of the given program.
Modelling the environment - In this, the environment used for executing the program invokes system calls to assess the impact of program execution.
Forking the entire system state - The tools for symbolic execution that are based on virtual machines , solves the environment issue by forking/emulating the entire VM state. This approach does not require one to write and maintain complex models. It allows any program binary to be executed symbolically.

Based on our understanding about symbolic execution, we may arrive at a conclusion that this method is applied in the following manner :

Path domain checking
Test data generation
Partition Analysis
Symbolic debugging

Conclusion :

Among the various kinds of techniques available for software testing, symbolic execution offers a unique approach to solve the issue of code coverage. Provided the wide range of tools available, symbolic execution automatically generates inputs that helps to trigger an action to catch a potential bug hidden in the program. Software errors ranging from low-level to higher level program crashes are caught.