Mutation Testing

Don’t just cover your code — Equip it to kill mutants

Welcome to the fourth and final instalment of my series on Advanced Software Testing Techniques. Throughout this journey, I have shared some insights gained from my academic pursuits and hands-on experiences in the realm of software testing. Today, I will delve into the intriguing world of mutation testing—an indispensable method that forms the focal point of this enlightening series.

🎉 Prepare to embark on a thought-provoking exploration of this cutting-edge technique and discover how it can revolutionize your approach to software quality assurance. So without further ado, let us dive into the depths of mutation testing and unlock its hidden potential.

🔍Overview

Mutation testing is a technique that involves introducing deliberate bugs or “mutations” into the software code to identify weak tests or code. It offers a high level of error detection and can identify hidden defects that other testing methods cannot. The process provides concrete suggestions for additional testing and helps to identify flaws in designs that were not previously considered. The big benefit of mutation testing is that it helps to ensure the quality of code by detecting problematic bugs that are difficult to find with traditional testing methods.

It is a technique for evaluating a program's test set: once a set of tests has been generated, we can judge how effective it is from the results those tests produce on the mutants of the program.

💻 Example

Let's have a look at this code sample, where we read the x and y variables and, based on their values, write a different computed result (read and write are placeholders for the program's input and output):

public class MutationExample {
    public static void main(String[] args) {
        int x = 0;
        int y = 0;
        read(x,y);
        if (x > 0) {
            write(x + y);
        } else {
            write(x * y);
        }
    }

    public static void write(int value) {
        //...
    }

    public static void read(int x, int y) {
        //...
    }
}

And here we have the same code with a mutation applied:

public class MutationExample {
    public static void main(String[] args) {
        int x = 0;
        int y = 0;
        read(x,y);
        if (x >= 0) {
            write(x + y);
        } else {
            write(x * y);
        }
    }

    public static void write(int value) {
        //...
    }

    public static void read(int x, int y) {
        //...
    }
}

The mutation is a single alteration in the if condition of the main method. In the original code, the condition is if (x > 0), and in the mutated code, it has been changed to if (x >= 0). The modification here is to replace the strict greater-than-comparison (>) with a greater-than-or-equal-to comparison (>=).

And what is the explanation...? 🤔

By making this simple change, the behaviour of the program can potentially be affected. In the original code, write(x + y) is called only when x is greater than 0; otherwise, write(x * y) is called. In the mutated code, write(x + y) is also called when x equals 0. The two versions therefore differ only for x = 0, where the original writes 0 and the mutant writes y, so a test needs x = 0 and a non-zero y to tell them apart.

Now, during mutation testing, the test suite will be executed against this mutated version of the MutationExample class. If the test suite fails to detect this artificial fault introduced by the mutation (i.e., it passes successfully), it indicates that the tests may have insufficient coverage or are not sensitive enough to this particular mutation. On the other hand, if the test suite fails, it means the tests are effective in detecting the mutation, demonstrating their ability to identify faults and validating the quality of the test suite.
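To make this concrete, here is a minimal sketch in which the branch from the example is pulled out into two pure functions so that a test can compare them directly. The names compute, computeMutant and kills are assumptions made for illustration; they are not part of the original example.

```java
class BoundaryMutantDemo {
    // Original logic: the value passed to write(...) in the example.
    static int compute(int x, int y) {
        return (x > 0) ? x + y : x * y;
    }

    // Mutant: '>' replaced by '>='.
    static int computeMutant(int x, int y) {
        return (x >= 0) ? x + y : x * y;
    }

    // A test input kills the mutant when the two versions disagree on it.
    static boolean kills(int x, int y) {
        return compute(x, y) != computeMutant(x, y);
    }

    public static void main(String[] args) {
        System.out.println("(1, 1) kills mutant: " + kills(1, 1)); // false
        System.out.println("(0, 5) kills mutant: " + kills(0, 5)); // true
        System.out.println("(0, 0) kills mutant: " + kills(0, 0)); // false
    }
}
```

Only the boundary input x = 0 combined with a non-zero y separates the two versions, which is why boundary-value tests matter for this kind of mutation.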

🚙 Utility of mutation testing

Evaluating Test Suite Effectiveness:

Mutation testing helps gauge the ability of a test suite to detect faults or errors in the code.

Suppose you have a test suite for a sorting algorithm (selection sort). By applying mutation testing, you introduce a mutation that changes the starting index of the inner loop. If the test suite detects this mutation, it indicates that the tests are effective at identifying faults and ensuring correct sorting.

// Original code
public void sort(int[] array) {
    for (int i = 0; i < array.length - 1; i++) {
        // Find the index of the smallest element in array[i..]
        int index = i;
        for (int j = i + 1; j < array.length; j++) {
            if (array[j] < array[index]) {
                index = j;
            }
        }
        // Swap the smallest element into position i
        int smallerNumber = array[index];
        array[index] = array[i];
        array[i] = smallerNumber;
    }
}

// Mutation: the inner loop starts at i - 1 instead of i + 1
public void sort(int[] array) {
    for (int i = 0; i < array.length - 1; i++) {
        int index = i;
        for (int j = i - 1; j < array.length; j++) {
            if (array[j] < array[index]) {
                index = j;
            }
        }
        int smallerNumber = array[index];
        array[index] = array[i];
        array[i] = smallerNumber;
    }
}

@Test
public void testSort() {
    int[] array = {3, 1, 2};
    sort(array);
    // Passes on the original code. On the mutant, j starts at -1 and
    // array[-1] throws ArrayIndexOutOfBoundsException, so the test fails
    // and the mutant is killed, indicating effective fault detection
    assertArrayEquals(new int[] {1, 2, 3}, array);
}

Fault Localization

When a mutation is not detected by the test suite, mutation testing helps identify potential weaknesses in the test cases or areas of the code that may lack appropriate test coverage. This allows developers to focus their efforts on improving specific test cases or adding new ones to enhance fault detection and localization.

During mutation testing, if a specific mutation goes undetected by the test suite, you can analyze which test cases failed to identify the fault. This helps pinpoint weaknesses in the test suite, allowing you to add targeted test cases or improve existing ones to enhance fault localization.

// Original code
public int divide(int a, int b) {
    return a / b;
}

// Mutation: Incorrect operation
public int divide(int a, int b) {
    return a * b;
}

@Test
public void testDivide() {
    // Passes on the original code (4 / 1 = 4) and also on the mutant (4 * 1 = 4):
    // the mutant survives, indicating a potential weakness in the test suite
    assertEquals(4, divide(4, 1));
}
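As a rough sketch of how a surviving mutant points to a missing test, the snippet below compares division with its multiplication mutant on two inputs. The helper names divideMutant and kills are hypothetical, and the inputs are chosen purely for illustration.

```java
class SurvivingMutantDemo {
    static int divide(int a, int b) {
        return a / b;
    }

    // Mutant: '/' replaced by '*'.
    static int divideMutant(int a, int b) {
        return a * b;
    }

    // True when the input produces different results, i.e. an assertion on
    // the original's result would fail when run against the mutant.
    static boolean kills(int a, int b) {
        return divide(a, b) != divideMutant(a, b);
    }

    public static void main(String[] args) {
        // Dividing by 1 cannot tell '/' from '*': the mutant survives.
        System.out.println("divide(4, 1) kills mutant: " + kills(4, 1));   // false
        // A divisor other than 1 exposes the difference.
        System.out.println("divide(10, 5) kills mutant: " + kills(10, 5)); // true
    }
}
```

The surviving mutant localizes the gap: the suite never exercises a divisor for which division and multiplication disagree, so that is the test case to add.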

Quality Assurance and Code Quality Improvement

By identifying areas of the code that are not adequately covered by tests, mutation testing helps improve the overall quality of the software. It encourages the development of more robust and comprehensive test suites, leading to higher-quality code and reduced likelihood of undetected faults in production.

Imagine you are developing a calculator application. By performing mutation testing, you introduce a mutation that alters the logic for multiplication. If the test suite detects this mutation, the multiplication logic is adequately covered; if it does not, the surviving mutant shows where more thorough tests are needed.

// Original code
public int multiply(int a, int b) {
    return a * b;
}

// Mutation: Faulty multiplication logic
public int multiply(int a, int b) {
    return a - b;
}

@Test
public void testMultiply() {
    // Passes on the original code (4 * 5 = 20); fails on the mutant (4 - 5 = -1),
    // so the mutant is killed
    assertEquals(20, multiply(4, 5));
}

Identifying Weaknesses in Design and Specifications

Mutation testing can reveal flaws in the software's design or specifications that were not initially considered. By exploring different scenarios and potential code modifications, mutation testing helps expose vulnerabilities and can guide improvements in the design and requirements.

Suppose you have a test suite for a banking application. Through mutation testing, you introduce a mutation that modifies the interest calculation formula. If the test suite fails to identify this mutation, it suggests flaws in the design or specifications, indicating the need for reevaluation and potential adjustments.

// Original code
public double calculateInterest(double principal, double rate, int years) {
    return principal * rate * years;
}

// Mutation: Incorrect interest calculation formula
public double calculateInterest(double principal, double rate, int years) {
    return principal + (principal * rate * years);
}

@Test
public void testCalculateInterest() {
    // Passes on the original code (1000 * 0.05 * 10 = 500); fails on the mutant,
    // which returns 1500. If nobody can say which value is the intended one,
    // the mutant has exposed a gap in the design or specification
    assertEquals(500.0, calculateInterest(1000, 0.05, 10), 1e-9);
}

Confidence in Software Reliability

Through mutation testing, developers and stakeholders gain greater confidence in the reliability and resilience of the software. By demonstrating the ability of the test suite to detect artificial faults, mutation testing contributes to a higher level of assurance that the software can handle real-world scenarios and potential issues.

By using mutation testing, you introduce mutations that simulate exceptional scenarios, such as invalid inputs or edge cases. If the test suite successfully detects these mutations, it instils confidence in the software's reliability and resilience to handle unexpected situations.

// Original code
public int divide(int a, int b) {
    return a / b;
}

// Mutation: Division by zero
public int divide(int a, int b) {
    return a / 0;
}

@Test
public void testDivide() {
    // Passes on the original code; on the mutant the call throws
    // ArithmeticException, so the test fails and the mutant is killed
    assertEquals(2, divide(10, 5));
    // Passes on both versions: documents the expected behaviour for a zero divisor
    assertThrows(ArithmeticException.class, () -> divide(10, 0));
}

Continuous Integration and Regression Testing

Mutation testing can be integrated into the continuous integration (CI) process, allowing developers to automatically run mutation tests alongside other automated tests. This helps identify regressions and ensure that code modifications do not inadvertently introduce new faults or weaken the existing test coverage.

➗Mutation operators

A mutation operator is a rule that is applied to a program to create mutants.

Many mutation operators have been explored by researchers. Here are some examples of mutation operators for imperative languages:

  • Statement deletion;

  • Statement duplication or insertion, e.g. goto fail;

  • Replacement of boolean subexpressions with true and false;

  • Replacement of some arithmetic operations with others, e.g. + with *, - with / ;

  • Replacement of some boolean relations with others, e.g. > with >=, == and <= ;

  • Replacement of variables with others from the same scope (variable types must be compatible);

  • Removal of the method body.

These mutation operators are also called traditional mutation operators. There are also mutation operators for object-oriented languages, concurrent constructions, complex objects like containers, etc. Operators that target object-oriented features are called class-level mutation operators. For example, the muJava tool offers various class-level mutation operators such as Access Modifier Change, Type Cast Operator Insertion, and Type Cast Operator Deletion. Mutation operators have also been developed to perform security vulnerability testing of programs.
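The arithmetic operator replacement rule from the list above can be sketched as follows. Real tools mutate source code or bytecode; here lambdas stand in for the mutated expressions, and the class and method names are assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.IntBinaryOperator;

class OperatorReplacementDemo {
    // Candidate operators: "+" is the original, the others act as its mutants.
    static Map<String, IntBinaryOperator> operators() {
        Map<String, IntBinaryOperator> ops = new LinkedHashMap<>();
        ops.put("+", (a, b) -> a + b);
        ops.put("-", (a, b) -> a - b);
        ops.put("*", (a, b) -> a * b);
        ops.put("/", (a, b) -> a / b);
        return ops;
    }

    // Counts the mutants whose result differs from the original on (a, b).
    // Assumes b is non-zero so that the "/" mutant does not throw.
    static int killedBy(int a, int b) {
        Map<String, IntBinaryOperator> ops = operators();
        int expected = ops.get("+").applyAsInt(a, b);
        int killed = 0;
        for (Map.Entry<String, IntBinaryOperator> e : ops.entrySet()) {
            if (e.getKey().equals("+")) continue; // skip the original
            if (e.getValue().applyAsInt(a, b) != expected) killed++;
        }
        return killed;
    }

    public static void main(String[] args) {
        // 2 + 2 == 2 * 2, so the "*" mutant survives this input.
        System.out.println("killed by (2, 2): " + killedBy(2, 2) + " of 3"); // 2 of 3
        System.out.println("killed by (2, 3): " + killedBy(2, 3) + " of 3"); // 3 of 3
    }
}
```

The input (2, 2) is a reminder that a test can pass for the wrong reason: it cannot tell addition from multiplication, while (2, 3) can.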

🤔Mutation Testing Technique

Mutation testing follows several steps to evaluate how well a test suite detects artificial faults introduced through code mutations. Here are the key steps and concepts involved:

Mutant Generation:

In this technique, a set of mutants (modified versions) of the original code is generated by applying various mutation operators. Each mutant represents a specific artificial fault, created by altering the code in a predefined way.

Test Suite Execution:

The generated mutants are subjected to the test suite execution. The test suite consists of a collection of test cases designed to validate the correctness and robustness of the software.

Mutation Score:

The mutation score is a metric that quantifies the effectiveness of the test suite in detecting mutations. It is calculated by determining the percentage of mutants that are killed (i.e., detected as faulty) by the test suite. A higher mutation score indicates a more comprehensive and effective test suite.

Equivalent Mutants:

Equivalent mutants are mutations that do not alter the program's behaviour or result in observable changes. These mutants are typically excluded from the evaluation process, as they do not contribute to assessing the quality of the test suite.

Surviving Mutants:

Surviving mutants are the mutations that are not detected by the test suite, meaning the corresponding faults are not identified. These surviving mutants indicate weaknesses in the test suite and highlight areas where additional test cases or improvements are required.

Test Case Effectiveness:

During mutation testing, the focus is on assessing the effectiveness of individual test cases in detecting mutations. Test cases that are successful in killing mutants are considered valuable and contribute to the overall mutation score.

Mutation Operators:

Mutation operators are specific rules or patterns used to modify the code and create mutants. These operators define the types of mutations that can be applied, such as changing conditional operators, altering arithmetic operations, modifying variable assignments, and more.
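The steps above (generate mutants, run the tests, compute the score) can be sketched as one small loop. This is a simplified illustration and not how any particular tool works: the mutants are written by hand as lambdas, the original's own output is used as the expected value, and all names are assumptions.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.IntPredicate;

class MutationRunDemo {
    // Program under test: the condition from the running example.
    static final IntPredicate ORIGINAL = x -> x > 0;

    // Mutant generation: hand-written first-order mutants of the condition.
    // None of them is equivalent to the original.
    static Map<String, IntPredicate> mutants() {
        Map<String, IntPredicate> m = new LinkedHashMap<>();
        m.put("x >= 0", x -> x >= 0);
        m.put("x < 0", x -> x < 0);
        m.put("x == 0", x -> x == 0);
        m.put("true", x -> true);
        return m;
    }

    // Test execution: a mutant is killed if any test input makes it
    // disagree with the original.
    static boolean isKilled(IntPredicate mutant, int[] inputs) {
        for (int x : inputs) {
            if (mutant.test(x) != ORIGINAL.test(x)) return true;
        }
        return false;
    }

    // Mutation score: percentage of the mutants that are killed.
    static double score(int[] inputs) {
        Map<String, IntPredicate> all = mutants();
        int killed = 0;
        for (IntPredicate mutant : all.values()) {
            if (isKilled(mutant, inputs)) killed++;
        }
        return 100.0 * killed / all.size();
    }

    public static void main(String[] args) {
        System.out.println("score with {5}:    " + score(new int[] {5}));    // 50.0
        System.out.println("score with {5, 0}: " + score(new int[] {5, 0})); // 100.0
    }
}
```

With only the input 5, the mutants x >= 0 and true survive; adding the boundary input 0 kills both and raises the score to 100%.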

⬇️First-order mutants

First-order mutants are a type of mutation that involves making a single small change to the original code. These mutations typically introduce simple faults, such as changing a conditional operator, altering an arithmetic operation, or modifying a variable assignment.

First-order mutants are created by applying basic mutation operators that target specific elements of the code, such as binary operators, unary operators, constants, and variables. The purpose of first-order mutants is to evaluate the ability of the test suite to detect simple and easily identifiable faults.

For example, consider a conditional statement if (x > 0) in the original code. A first-order mutant could change the relational operator from > to >=, resulting in the mutated code if (x >= 0). This mutation tests whether the test suite is capable of detecting the change in the condition and its impact on the program's behaviour.

First-order mutants are relatively straightforward and provide a good starting point for mutation testing. They help assess the test suite's ability to identify basic faults and ensure that even small changes in the code can be effectively detected by the tests. However, first-order mutants may not capture more complex faults or subtler issues that require higher-order mutations to manifest.

⬆️Higher-order mutants

Higher-order mutants are a more advanced type of mutation that involves making multiple and potentially complex changes to the original code. Unlike first-order mutants, which make single small alterations, higher-order mutants introduce more extensive modifications that can result in significantly different program behaviours.

Higher-order mutants are created by applying compound mutation operators that combine multiple mutation operators or introduce intricate changes across different parts of the code. These mutations aim to challenge the test suite's ability to detect more complex faults and assess the robustness of the testing approach. They provide a more comprehensive assessment of the test suite's ability to handle intricate code modifications and ensure the software's resilience against more sophisticated faults.

It is important to note that higher-order mutants require careful consideration and analysis due to the increased complexity and potential impact on the program's behaviour. The generation and evaluation of higher-order mutants can be more computationally expensive compared to first-order mutants, but they provide valuable insights into the thoroughness and effectiveness of the testing process.

💻 Example

public class MutationExample {
    public static void main(String[] args) {
        int x = 0;
        int y = 0;
        read(x,y);
        if (x > 0) {
            write(x + y);
        } else {
            write(x * y);
        }
    }

    public static void write(int value) {
        //...
    }

    public static void read(int x, int y) {
        //...
    }
}

A second-order mutant of the same code, with two changes applied at once (> replaced by >=, and x + y replaced by x + y + 1):

public class MutationExample {
    public static void main(String[] args) {
        int x = 0;
        int y = 0;
        read(x,y);
        if (x >= 0) {
            write(x + y + 1);
        } else {
            write(x * y);
        }
    }

    public static void write(int value) {
        //...
    }

    public static void read(int x, int y) {
        //...
    }
}

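In general, only first-order mutants are used in practice.

The reasons are:

  • The number of second- and higher-order mutants is very large, which slows down the tests;

  • The coupling effect: tests that kill first-order mutants tend to kill higher-order mutants as well.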
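To give a feel for the first reason, the sketch below counts how many second-order mutants could be built from a given number of first-order mutants. It assumes that any two first-order mutants can be combined, which makes the figure an upper bound; the method name is hypothetical.

```java
class MutantCountDemo {
    // Upper bound on second-order mutants: every unordered pair of
    // first-order mutants, n * (n - 1) / 2.
    static long secondOrderPairs(long firstOrder) {
        return firstOrder * (firstOrder - 1) / 2;
    }

    public static void main(String[] args) {
        for (long n : new long[] {10, 100, 1000}) {
            System.out.println(n + " first-order -> up to "
                    + secondOrderPairs(n) + " second-order");
        }
    }
}
```

For 1000 first-order mutants this already gives up to 499500 second-order mutants, each of which would need its own run of the test suite.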

🪜Basics of mutation testing

Mutation testing is built upon two fundamental concepts: the competent programmer hypothesis and the coupling effect. Let's begin by exploring the first concept...

Competent programmer hypothesis

When tackling a specific problem, programmers typically develop code that closely resembles a correct solution for that problem. Consequently, the presence of errors or faults within the code can be identified by solely employing first-order mutants.

The hypothesis assumes that developers write code that is close to correct and largely follows the intended logic. According to this hypothesis, the majority of faults in a program are introduced through small mistakes or oversights in the code implementation, rather than fundamental flaws in the underlying design.

In other words, the competent programmer hypothesis posits that if a developer creates a faulty piece of code, it is more likely to be due to human error or oversight rather than a deliberate intention to implement faulty logic. Therefore, when conducting mutation testing, the hypothesis assumes that most mutations introduced into the code will likely result in faults that deviate from the intended behaviour. The effectiveness of a test suite is evaluated by its ability to detect these introduced faults (mutations) and identify deviations from the expected behaviour. If the test suite fails to detect a significant number of mutations, it suggests that the tests may not be thorough enough or sensitive to potential faults, highlighting areas for improvement.

Coupling effect

The choice of test data plays a crucial role in distinguishing errors within slightly deviating programs, particularly when those errors become more complex. Empirical evidence reveals that a test suite capable of differentiating between a program and its first-order mutants is highly likely to be effective in discerning second-order mutant programs as well.

An intuitive explanation for this phenomenon lies in the observation that simple errors tend to be more elusive and challenging to detect. On the other hand, complex errors tend to manifest themselves more prominently, making them more susceptible to identification by a broader range of tests.
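A single example cannot prove an empirical claim, but the sketch below shows the coupling effect on the running example: two test inputs chosen to kill two first-order mutants also kill the second-order mutant that combines them. The method names and test inputs are assumptions made for illustration.

```java
import java.util.function.IntBinaryOperator;

class CouplingDemo {
    static int original(int x, int y) {
        return (x > 0) ? x + y : x * y;
    }

    // First-order mutant A: '>' replaced by '>='.
    static int mutantA(int x, int y) {
        return (x >= 0) ? x + y : x * y;
    }

    // First-order mutant B: 'x + y' replaced by 'x + y + 1'.
    static int mutantB(int x, int y) {
        return (x > 0) ? x + y + 1 : x * y;
    }

    // Second-order mutant AB: both changes combined.
    static int mutantAB(int x, int y) {
        return (x >= 0) ? x + y + 1 : x * y;
    }

    // A mutant is killed if some test input makes it differ from the original.
    static boolean killed(IntBinaryOperator mutant, int[][] tests) {
        for (int[] t : tests) {
            if (mutant.applyAsInt(t[0], t[1]) != original(t[0], t[1])) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        // (0, 5) targets mutant A, (1, 1) targets mutant B.
        int[][] tests = { {0, 5}, {1, 1} };
        System.out.println("A killed:  " + killed(CouplingDemo::mutantA, tests));  // true
        System.out.println("B killed:  " + killed(CouplingDemo::mutantB, tests));  // true
        System.out.println("AB killed: " + killed(CouplingDemo::mutantAB, tests)); // true
    }
}
```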

💪Strong mutation / 😩weak mutation

A test T is considered to kill a mutant M if it can distinguish the behaviour of M from that of the original program P, that is, if P and M behave differently when T is run.

Now, the question arises: When do we observe the behaviours of the two programs?

The first condition concerns the moment right after the mutation: when T is run, P and M are in different states immediately after the modified instruction is executed, meaning that the values of the affected variables differ.

The second condition concerns the end of the program: the state change propagates until execution completes, so the returned values or other effects (modifications to global variables, files, or databases) differ once the program has finished.

Weak mutation requires only the first condition to be satisfied, while strong mutation requires both conditions to be met.

Program P

public class MutationExample {
    public static void main(String[] args) {
        int x = 0;
        int y = 0;
        read(x,y);
        y = y + 1;
        if (x > 0) {
            write(x);
        } else {
            write(y);
        }
    }
}

Mutant M

public class MutationExample {
    public static void main(String[] args) {
        int x = 0;
        int y = 0;
        read(x,y);
        y = y - 1;
        if (x > 0) {
            write(x);
        } else {
            write(y);
        }
    }
}

The (1, 1) test distinguishes between P and M from the point of view of weak mutation (after the mutated statement, y is 2 in P and 0 in M), but does not distinguish between P and M from the point of view of strong mutation (both write 1).

The (0, 1) test distinguishes between P and M from the point of view of strong mutation (P writes 2, M writes 0).

Strong mutation: the stronger criterion. It ensures that the test T detects faults that actually change the program's observable behaviour.

Weak mutation: requires less computing power; closely related to the idea of coverage.
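The difference between the two criteria can be checked with a small sketch that records both the state right after the mutated statement and the value finally written. The run method and its delta parameter (+1 for P, -1 for M) are assumptions introduced to avoid duplicating the program.

```java
class WeakStrongDemo {
    // Runs the program with the given increment (+1 for P, -1 for M) and
    // returns {value of y right after the mutated statement, value written}.
    static int[] run(int x, int y, int delta) {
        y = y + delta;                 // P: y = y + 1, M: y = y - 1
        int stateAfterMutation = y;
        int output = (x > 0) ? x : y;  // the value passed to write(...)
        return new int[] {stateAfterMutation, output};
    }

    // Weak mutation: the states differ right after the mutated statement.
    static boolean weaklyKills(int x, int y) {
        return run(x, y, +1)[0] != run(x, y, -1)[0];
    }

    // Strong mutation: the difference reaches the program's output.
    static boolean stronglyKills(int x, int y) {
        return run(x, y, +1)[1] != run(x, y, -1)[1];
    }

    public static void main(String[] args) {
        System.out.println("(1, 1) weak: " + weaklyKills(1, 1)
                + ", strong: " + stronglyKills(1, 1)); // weak: true, strong: false
        System.out.println("(0, 1) weak: " + weaklyKills(0, 1)
                + ", strong: " + stronglyKills(0, 1)); // weak: true, strong: true
    }
}
```

For the input (1, 1) the changed value of y is never written, so the difference exists internally but never becomes visible in the output.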

🟰Equivalent mutants

Equivalent mutants refer to mutated versions of code that have the same observable behaviour as the original code. In other words, equivalent mutants produce the same outputs for the given set of test cases and do not introduce any discernible differences in the program's behaviour.

To evaluate the effectiveness of the tests, we need to decide which mutants are equivalent.

From a theoretical point of view: in general, the problem of determining whether a mutant is equivalent to the parent program is undecidable (it is as hard as the halting problem).

In practice: equivalence is determined by analyzing the code.

Determining equivalent mutants can be a complex process, and it is the main practical problem of the mutation testing technique.

An example of an equivalent mutant for the provided code can be achieved by simply swapping the order of the initialisation for x and y:

// Original code
public class MutationExample {
    public static void main(String[] args) {
        int x = 0;
        int y = 0;
        read(x, y);
        y = y + 1;
        if (x > 0) {
            write(x);
        } else {
            write(y);
        }
    }
}

// Mutation code
public class MutationExample {
    public static void main(String[] args) {
        int y = 0;     
        int x = 0; // <-- swapped
        read(x, y);
        y = y + 1;
        if (x > 0) {
            write(x);
        } else {
            write(y);
        }
    }
}
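A different, commonly cited kind of equivalent mutant comes from relational operator replacement in a loop condition. The sketch below is an illustration with assumed names: since i starts at 0, increases by 1, and an array length is never negative, i < a.length and i != a.length stop the loop at exactly the same point.

```java
class EquivalentMutantDemo {
    // Original: loop condition uses '<'.
    static int sum(int[] a) {
        int total = 0;
        for (int i = 0; i < a.length; i++) {
            total += a[i];
        }
        return total;
    }

    // Mutant: '<' replaced by '!='. The loop still ends when i reaches
    // a.length, so no test input can tell the two versions apart.
    static int sumMutant(int[] a) {
        int total = 0;
        for (int i = 0; i != a.length; i++) {
            total += a[i];
        }
        return total;
    }

    public static void main(String[] args) {
        int[][] inputs = { {}, {7}, {1, 2, 3}, {-4, 4, 10} };
        for (int[] input : inputs) {
            System.out.println(sum(input) == sumMutant(input)); // true each time
        }
    }
}
```

Agreement on sample inputs only illustrates the point; the equivalence itself follows from reasoning about the code, which is exactly the manual analysis that makes equivalent mutants costly.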

🧮Mutation score (MS)

Mutation score is a metric used in mutation testing to quantify the effectiveness of a test suite in detecting mutations. It represents the percentage of non-equivalent mutants that are killed (detected) by the test suite.

The mutation score is calculated by dividing the number of killed mutants by the number of non-equivalent mutants and multiplying the result by 100 to get a percentage. A higher mutation score indicates a more effective test suite, as it signifies a greater ability to detect and kill mutants.

MS(T) = D / (L + D), where:

• D – the number of distinguished (killed) mutants;

• L – the number of live mutants (non-equivalent mutants that the tests do not kill);
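As a worked example of the formula, the sketch below computes the score for made-up numbers: 10 generated mutants, of which 2 are equivalent, 6 are killed and 2 stay alive. The method name and the figures are hypothetical.

```java
class MutationScoreDemo {
    // MS(T) = D / (L + D), expressed as a percentage.
    // killed is D, live is L; equivalent mutants are excluded from both.
    static double mutationScore(int killed, int live) {
        if (killed + live == 0) {
            throw new IllegalArgumentException("no non-equivalent mutants");
        }
        return 100.0 * killed / (killed + live);
    }

    public static void main(String[] args) {
        int generated = 10;   // hypothetical numbers
        int equivalent = 2;
        int killed = 6;
        int live = generated - equivalent - killed; // 2
        System.out.println("MS = " + mutationScore(killed, live) + "%"); // MS = 75.0%
    }
}
```

Dividing by all 10 generated mutants would give 60% instead, which understates the suite's effectiveness because the 2 equivalent mutants can never be killed.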

📍Conclusion

In conclusion, mutation testing emerges as a powerful technique that revolutionizes the landscape of software testing. Throughout this article, we have explored the intricacies and utilities of mutation testing, delving into its fundamental concepts, techniques, and benefits.

By introducing artificial faults (mutations) into the code and assessing the ability of the test suite to detect these mutations, mutation testing provides valuable insights into the quality and effectiveness of the testing process. It evaluates the thoroughness of the test suite, identifies areas of improvement, and enhances the overall reliability and resilience of the software.

Happy coding! 🙂