
Descartes and ElasTest Working Together

Patxi Gortázar, Professor at Universidad Rey Juan Carlos and ElasTest project coordinator, joined the latest STAMP plenary meeting in Madrid on October 9th, 2019.
ElasTest provides observability tools that help understand what is going on in the code of a piece of software when it fails. STAMP tools, for their part, improve the quality of test cases. Applying STAMP Descartes to a Java project yields useful insights into the mutants that survive, while ElasTest offers accurate observability tools. Using both open source tools together provides more information about how a DevOps team is performing on test technologies, including mutation testing and software testing automation.
For more information about this integration, read the ElasTest article "Amplifying the value of your tests" and check out the demo in the ElasTest live instance.

Polyglot Applications and Mutation Testing

In your edge computing, cloud computing or IoT environment, chances are you're mixing several services written in different languages.

"Multiple modern architectures are now polyglot, with a different language on the server and the clients, or even different languages among microservices", says Martin Monperrus, Professor at KTH. 

Google, eBay, Twitter, and Amazon are among the big technology companies that have evolved to support a polyglot microservices architecture. "The essence of a polyglot architecture is to delegate the decision over which technology stack and programming languages to use to the service developers", explains Tripta Gupta in her article entitled "Analyzing Polyglot Microservices".

In STAMP, we focus on Java because it's the #1 language for server-side enterprise applications. However, several mutation testing tools are now centered on popular development languages, including:

  • Infection, a mutation testing tool for PHP
  • MutPy and Mutmut, mutation testing tools for Python
  • Muter, a mutation testing tool for Swift
  • Unima, a mutation testing tool for C# 
  • Stryker Mutator, a mutation testing tool for C#, Scala, JavaScript and TypeScript

This list is far from exhaustive. We'd like to hear about your favorite mutation testing tools and your suggestions about STAMP Descartes.

DSpot first implementation in the Pharo Smalltalk ecosystem


Title: Test amplification in the Pharo Smalltalk Ecosystem
Authors: Mehrdad Abdi, Henrique Rocha and Serge Demeyer, University of Antwerp
Event: International Workshop on Smalltalk Technologies 2019, Cologne 

The STAMP team is glad to announce a brilliant external contribution to the DSpot open source test amplification tool, from the University of Antwerp.

Thanks to Mehrdad Abdi, Henrique Rocha and Serge Demeyer, DSpot is no longer limited to Java applications. The research team has successfully replicated the core of DSpot in Smalltalk, developing test amplification in the Pharo environment. The three researchers improved the mutation score of a test suite by applying DSpot to a simple Bank application with a few methods and test cases. They learned a lot from this experiment and are currently fine-tuning internal algorithms to scale up to more realistic Smalltalk systems.

Abstract: Test amplification is the act of strengthening existing unit tests to exercise the boundary conditions of the unit under test. It is an emerging research idea which has been demonstrated to work for Java, relying on the type system to safely transform the code under test. In this paper we report on a feasibility study concerning test amplification in the context of the Smalltalk ecosystem. We introduce a proof-of-concept test amplifier named Small-Amp, and discuss the advantages and challenges we encountered while incorporating the tool into the Pharo Smalltalk environment. We demonstrate that by building on top of the Refactoring Browser API and the MuTalk mutation tool, it is feasible to build a test amplifier in Pharo Smalltalk despite the absence of a type system.
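
To make the idea of test amplification more concrete, here is a minimal Python sketch. It is not the code of DSpot or Small-Amp, which transform real test methods in Java and Pharo. The Account class, the OverdraftMutant and all function names are hypothetical. The sketch assumes a test can be reduced to a parameterized scenario, varies its literals towards boundary values, records the observed result as the expected value, and then checks which amplified inputs detect a mutant that the original test misses.

```python
# Hypothetical sketch of test amplification; not DSpot's or Small-Amp's code.

class Account:
    def __init__(self):
        self.balance = 0

    def deposit(self, amount):
        if amount > 0:
            self.balance += amount

    def withdraw(self, amount):
        if 0 < amount <= self.balance:
            self.balance -= amount


class OverdraftMutant(Account):
    # Mutant: the balance check in withdraw() has been dropped.
    def withdraw(self, amount):
        if amount > 0:
            self.balance -= amount


def scenario(account_cls, deposit, withdrawal):
    """The existing test scenario, with its literals turned into parameters."""
    account = account_cls()
    account.deposit(deposit)
    account.withdraw(withdrawal)
    return account.balance


def amplify_inputs(deposit, withdrawal):
    """Input amplification: vary the literals towards boundary values."""
    for d in (deposit, 0, -deposit):
        for w in (withdrawal, 0, d, d + 1):
            yield d, w


def amplify(account_cls, deposit, withdrawal):
    """Assertion amplification: record each observed balance as the expected value."""
    return {
        (d, w): scenario(account_cls, d, w)
        for d, w in amplify_inputs(deposit, withdrawal)
    }


def killing_inputs(expected, mutant_cls):
    """Inputs whose recorded expectation fails when run against the mutant."""
    return [
        inputs for inputs, balance in expected.items()
        if scenario(mutant_cls, *inputs) != balance
    ]


# The original test deposits 100 and withdraws 30.
expected = amplify(Account, 100, 30)
killed_by = killing_inputs(expected, OverdraftMutant)
```

In this sketch the original inputs (100, 30) do not detect the mutant, while the amplified boundary input (100, 101) does. A real amplifier would keep only the amplified tests that improve the mutation score.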

Descartes and DSpot Demo Applied to OW2 Joram


OW2 Joram is a JMS-compatible message-oriented middleware.
Thanks to a new GitLab issue generator extension for Descartes, the Joram team found and solved a critical issue in unit tests.

The OW2 Joram project, as a STAMP use case, reveals:

  • Descartes detects a critical issue in the Joram unit tests: code is removed, yet the test suite stays green, as if everything were normal.
  • The issue is automatically created in the Joram GitLab.
  • DSpot, focused on the issue, generates a test that fixes it.

In practice, the Joram team used only the first two steps on a real issue: Descartes detected it and the GitLab issue generator reported it. The team then fixed the issue by hand, without DSpot.
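
The situation where code is removed and the test suite stays green can be illustrated with a small Python sketch. Descartes itself works on Java programs, so this is only an analogy. The Queue class and both tests are hypothetical and do not come from Joram.

```python
# Illustrative analogy only; all names are hypothetical.

class Queue:
    def __init__(self):
        self.items = []

    def push(self, item):
        self.items.append(item)


class MutantQueue(Queue):
    # Extreme mutant: the whole body of push() is removed.
    def push(self, item):
        pass


def weak_test(queue_cls):
    # Calls push() but never checks its effect.
    queue = queue_cls()
    queue.push("msg")
    return True


def strong_test(queue_cls):
    # Asserts on the observable effect of push().
    queue = queue_cls()
    queue.push("msg")
    return queue.items == ["msg"]


def survives(test, mutant_cls):
    # A mutant survives when the test still passes on it.
    return test(mutant_cls)


print(survives(weak_test, MutantQueue))    # True: the test suite stays green
print(survives(strong_test, MutantQueue))  # False: the mutant is detected
```

Both tests pass on the original Queue. Only the second one notices that the body of push() has disappeared, which is the kind of gap that a generated issue would report.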

For more information:

DSpot Amplification Visualization

In this post we share a visualization of the unit test amplification process. In the example, DSpot is used to amplify two unit tests from JSoup.

Production Traffic For Testing


Publication: Information and Software Technology
Authors: Jeff Anderson, Maral Azizi, Saeed Salem, and Hyunsook Do

Title: On the Use of Usage Patterns from Telemetry Data for Test Case Prioritization

In an original work about Production Traffic for Testing, Jeff Anderson, Maral Azizi, Saeed Salem, and Hyunsook Do present a new opportunity in the area of regression testing techniques. Here is the abstract:

Context: Modern applications contain pervasive telemetry to ensure reliability and enable monitoring and diagnosis. This presents a new opportunity in the area of regression testing techniques, as we now have the ability to consider usage profiles of the software when making decisions on test execution. Objective: The results of our prior work on test prioritization using telemetry data showed improvement rate on test suite reduction, and test execution time. The objective of this paper is to further investigate this approach and apply prioritization based on multiple prioritization algorithms in an enterprise level cloud application as well as open source projects. We aim to provide an effective prioritization scheme that practitioners can implement with minimum effort. The other objective is to compare the results and the benefits of this technique factors with code coverage-based prioritization approaches, which is the most commonly used test prioritization technique. 

Method: We introduce a method for identifying usage patterns based on telemetry, which we refer to as “telemetry fingerprinting.” Through the use of various algorithms to compute fingerprints, we conduct empirical studies on multiple software products to show that telemetry fingerprinting can be used to more effectively prioritize regression tests.
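
The abstract does not detail the fingerprinting algorithms, so the following Python sketch is only one possible reading. It assumes that a fingerprint is the normalized frequency of each feature seen in an event stream, and that tests are ranked by the cosine similarity between their own fingerprint and the production one. The event names and test names are hypothetical.

```python
import math
from collections import Counter


def fingerprint(events):
    """Normalized frequency of each feature in an event list (assumed definition)."""
    counts = Counter(events)
    total = sum(counts.values())
    return {feature: n / total for feature, n in counts.items()}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(a[k] * b.get(k, 0.0) for k in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def prioritize(production_events, test_traces):
    """Order tests so that those closest to production usage run first."""
    usage = fingerprint(production_events)
    scores = {
        name: cosine(usage, fingerprint(trace))
        for name, trace in test_traces.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical telemetry and test traces.
production = ["login", "search", "search", "search", "checkout"]
tests = {
    "test_admin_report": ["login", "report"],
    "test_search_flow": ["login", "search", "search"],
    "test_checkout": ["login", "checkout"],
}
order = prioritize(production, tests)
print(order)  # ['test_search_flow', 'test_checkout', 'test_admin_report']
```

With this toy data, the search test comes first because search dominates the production traffic, and the admin report test comes last because its main feature never appears in the telemetry.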

Results: Our experimental results show that the proposed techniques were able to reduce over 30 percent in regression test suite run times compared to the coverage-based prioritization technique in detecting discoverable faults. Further, the results indicate that fingerprints are effective in identifying usage patterns, and that the fingerprints can be applied to improve regression testing techniques.

Conclusion: In this research, we introduce the concept of fingerprinting software usage patterns through telemetry. We provide various algorithms to compute fingerprints and conduct empirical studies that show that fingerprints are effective in identifying distinct usage patterns. By applying these techniques, we believe that regression testing techniques can be improved beyond the current state-of-the-art, yielding additional cost and quality benefits.

Search-Based Test Case Implantation for Testing Untested Configurations

Publication: Information and Software Technology
Authors: Dipesh Pradhan, Shuai Wang, Tao Yue, Shaukat Ali, Marius Liaaen

Title: Search-Based Test Case Implantation for Testing Untested Configurations

Modern large-scale software systems are highly configurable, and thus require a large number of test cases to be implemented and revised for testing a variety of system configurations. This makes testing highly configurable systems very expensive and time-consuming.

Driven by our industrial collaboration with a video conferencing company, we aim to automatically analyze and implant existing test cases (i.e., an original test suite) to test the untested configurations.

We propose a search-based test case implantation approach (named SBI) consisting of two key components:
1) A test case analyzer that statically analyzes each test case in the original test suite to obtain the program dependence graph for test case statements, and
2) A test case implanter that uses multi-objective search to select suitable test cases for implantation using three operators, i.e., selection, crossover, and mutation (at the test suite level), and implants the selected test cases using a mutation operator at the test case level including three operations (i.e., addition, modification, and deletion).

We empirically evaluated SBI with an industrial case study and an open source case study by comparing the implanted test suites produced by three variants of SBI with the original test suite using evaluation metrics such as statement coverage (SC), branch coverage (BC), and mutation score (MS). Results show that for both the case studies, the test suites implanted by the three variants of SBI performed significantly better than the original test suites. The best variant of SBI achieved on average 19.3% higher coverage of configuration variable values for both the case studies. Moreover, for the open source case study, the best variant of SBI managed to improve SC, BC, and MS with 5.0%, 7.9%, and 3.2%, respectively.

SBI can be applied to automatically implant a test suite with the aim of testing untested configurations and thus achieving higher configuration coverage.

Configuration Tests: the JHipster web development stack use case


Title: Test them all, is it worth it? Assessing configuration sampling on the JHipster Web development stack
Authors: Axel Halin, Alexandre Nuttinck, Mathieu Acher, Xavier Devroey, Gilles Perrouin, Benoit Baudry
Publication: Springer Empirical Software Engineering, April 2019, Vol. 24

A group of software researchers, partially involved in the EU-funded STAMP project, has published interesting results based on configuration tests on JHipster, a popular open source application generator to create Spring Boot and Angular/React projects. 

Abstract: Many approaches for testing configurable software systems start from the same assumption: it is impossible to test all configurations.
This motivated the definition of variability-aware abstractions and sampling techniques to cope with large configuration spaces. Yet, there is no theoretical barrier that prevents the exhaustive testing of all configurations by simply enumerating them if the effort required to do so remains acceptable. Not only this: we believe there is a lot to be learned by systematically and exhaustively testing a configurable system. In this case study, we report on the first ever endeavour to test all possible configurations of the industry-strength, open source configurable software system JHipster, a popular code generator for web applications. We built a testing scaffold for the 26,000+ configurations of JHipster using a cluster of 80 machines during 4 nights for a total of 4,376 hours (182 days) CPU time. We find that 35.70% configurations fail and we identify the feature interactions that cause the errors. We show that sampling strategies (like dissimilarity and 2-wise):
(1) are more effective to find faults than the 12 default configurations used in the JHipster continuous integration;
(2) can be too costly and exceed the available testing budget. We cross this quantitative analysis with the qualitative assessment of JHipster’s lead developers.
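
As an illustration of what 2-wise sampling means, here is a minimal Python sketch of a greedy pairwise sampler. It is not the tooling used in the study. The option names and values are hypothetical, and the sketch assumes there are no constraints between options, whereas a real configurable system such as JHipster forbids some combinations.

```python
from itertools import combinations, product


def pairs_of(config):
    """All pairs of (option, value) settings covered by one configuration."""
    return set(combinations(sorted(config.items()), 2))


def greedy_pairwise(options):
    """Pick configurations until every pair of option values is covered."""
    names = sorted(options)
    all_configs = [
        dict(zip(names, values))
        for values in product(*(options[name] for name in names))
    ]
    uncovered = set().union(*(pairs_of(c) for c in all_configs))
    sample = []
    while uncovered:
        # Greedy step: take the configuration covering the most uncovered pairs.
        best = max(all_configs, key=lambda c: len(pairs_of(c) & uncovered))
        sample.append(best)
        uncovered -= pairs_of(best)
    return all_configs, sample


# Hypothetical options, loosely inspired by a web application generator.
options = {
    "database": ["sql", "mongodb", "cassandra"],
    "build": ["maven", "gradle"],
    "auth": ["jwt", "session", "oauth2"],
    "cache": ["yes", "no"],
}
all_configs, sample = greedy_pairwise(options)
print(len(all_configs), "configurations in total,", len(sample), "in the 2-wise sample")
```

The sample covers every pair of option values with far fewer configurations than the full enumeration, which is why sampling is attractive when exhaustive testing exceeds the budget.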

Read the full article on the Springer website.

Five Machine Learning Usages in Software Testing

According to the Reqtest team, machine learning is a hot trend this year, bringing revolutionary changes in workflows and processes.
In software testing, machine learning can be used for:

  • Test suite optimization, to identify redundant and unique test cases.
  • Predictive analytics, to predict the key parameters of software testing processes on the basis of historical data.
  • Log analytics, to identify the test cases which need to be executed automatically.
  • Traceability, extracting keywords from the Requirements Traceability Matrix (RTM) to achieve test coverage.
  • Defect analytics, to identify high-risk areas of the application for the prioritization of regression test cases.

Read nine more recent testing trends from the Reqtest editors.

Maven Central Top Libraries


Analysing the Maven Central Repository during the second half of 2018, a group of scientific researchers led by Benoit Baudry, Professor in Software Technology at the KTH Royal Institute of Technology, reveals that Maven Central contains more than 2.5 million artifacts, a real treasure of extraordinary software development. More than 17% of the libraries have several versions that are actively used by a large number of clients.
However, 1.3 million dependencies declared are actually not used. Also, a vast majority of APIs can be reduced to a small, compact core and still serve most of their clients. 

For a more accurate exploration of the Maven Central ecosystem, read Benoit Baudry's article "A journey at the heart of 2.4 million Maven artifacts".