technical-news

34 posts

Sep 16 2019

First DSpot implementation in the Pharo Smalltalk ecosystem


Title: Test amplification in the Pharo Smalltalk Ecosystem
Authors: Mehrdad Abdi, Henrique Rocha and Serge Demeyer, University of Antwerp
Event: International Workshop on Smalltalk Technologies 2019, Cologne 

The STAMP team is glad to announce a brilliant external contribution to the DSpot open source test amplification tool, from the University of Antwerp.

Thanks to Mehrdad Abdi, Henrique Rocha and Serge Demeyer, DSpot is no longer limited to Java applications. The research team has successfully ported the core of DSpot to Smalltalk, implementing test amplification in the Pharo environment. The three researchers improved the mutation score of a test suite by applying their tool to a simple Bank application with a few methods and test cases. They learned a lot from this experiment and are currently fine-tuning the internal algorithms to scale up to more realistic Smalltalk systems.

Abstract
Test amplification is the act of strengthening existing unit tests to exercise the boundary conditions of the unit under test. It is an emerging research idea which has been demonstrated to work for Java, relying on the type system to safely transform the code under test. In this paper we report on a feasibility study concerning test amplification in the context of the Smalltalk ecosystem. We introduce a proof-of-concept test amplifier named Small-Amp, and discuss the advantages and challenges we encountered while incorporating the tool into the Pharo Smalltalk environment. We demonstrate that by building on top of the Refactoring Browser API and the MuTalk mutation tool, it is feasible to build a test amplifier in Pharo Smalltalk despite the absence of a type system.
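
To give a concrete picture of what an amplifier produces, here is a hedged sketch in DSpot's original Java/JUnit 4 setting, with an invented Bank class echoing the toy example above (Small-Amp does the analogous thing on Pharo SUnit tests):

    import static org.junit.Assert.assertEquals;
    import org.junit.Test;

    // Minimal Bank class so the sketch is self-contained (hypothetical).
    class Bank {
        private int balance = 0;
        public void deposit(int amount) { balance += amount; }
        public int balance() { return balance; }
    }

    public class BankTest {

        @Test
        public void testDeposit() {            // original, human-written test
            Bank bank = new Bank();
            bank.deposit(100);
            assertEquals(100, bank.balance());
        }

        // The kind of test an amplifier derives automatically: inputs are
        // mutated (input amplification) and new assertions are generated
        // from observed behaviour (assertion amplification).
        @Test
        public void testDepositAmplified() {
            Bank bank = new Bank();
            bank.deposit(101);                 // literal mutated
            bank.deposit(0);                   // call duplicated with a new input
            assertEquals(101, bank.balance()); // newly generated assertion
        }
    }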

Jun 26 2019

Descartes and DSpot Demo Applied to OW2 Joram


OW2 Joram is a JMS-compatible message-oriented middleware.
Thanks to a new Gitlab issue generator extension for Descartes, the Joram team found and solved a critical issue in its unit tests.

The OW2 Joram project, as a STAMP use case, demonstrates the full workflow:

  • Descartes detects a critical issue in the Joram unit tests: code is removed, yet the test suite stays green, so everything looks "normal".
  • The issue is automatically filed in the Joram Gitlab issue tracker.
  • DSpot, focused on the issue, generates a test that closes it.

These tools were really used to detect and fix an issue in Joram, at least for the Descartes detection and Gitlab issue generation part: the issue was indeed fixed by the Joram team, although without DSpot.
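
To make the "code is removed but the test suite stays green" scenario concrete, here is a hedged Java sketch (invented names, not actual Joram code; assumes JUnit 4): a pseudo-tested method, the original test that misses the problem, and the kind of amplified test that closes the issue.

    import static org.junit.Assert.assertEquals;
    import org.junit.Test;

    // Hypothetical class standing in for the code under test.
    class MessageQueue {
        private int pending = 0;
        public void push(String msg) { pending++; }  // Descartes empties this body
        public int pending() { return pending; }
    }

    public class MessageQueueTest {

        @Test
        public void testPush() {                  // original test: stays green even
            MessageQueue q = new MessageQueue();  // when push() is emptied, because
            q.push("hello");                      // it asserts nothing about the effect
        }

        @Test
        public void testPushChecksEffect() {      // the kind of test an amplifier
            MessageQueue q = new MessageQueue();  // like DSpot generates: it fails
            q.push("hello");                      // when the body of push() is removed
            assertEquals(1, q.pending());
        }
    }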


Jun 14 2019

DSpot amplification visualization

In this post we share a visualization of the unit test amplification process. In the example, DSpot is used to amplify two unit tests from JSoup.
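
As a flavour of what the amplified JSoup tests look like, here is a hedged sketch (test names and inputs invented; assumes JUnit 4 and jsoup on the classpath): a seed test, then an amplified variant with a mutated input literal and extra generated assertions.

    import static org.junit.Assert.assertEquals;
    import org.junit.Test;
    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;

    public class JsoupAmplificationSketch {

        @Test
        public void parsesSimpleDocument() {   // seed test
            Document doc = Jsoup.parse("<html><head><title>First</title></head>"
                    + "<body><p>Hello</p></body></html>");
            assertEquals("First", doc.title());
        }

        @Test
        public void parsesSimpleDocumentAmplified() {
            Document doc = Jsoup.parse("<html><head><title></title></head>"
                    + "<body><p>Hello</p><p></p></body></html>"); // input literal mutated
            assertEquals("", doc.title());                        // generated assertion
            assertEquals(2, doc.select("p").size());              // generated assertion
        }
    }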


Jun 03 2019

Production Traffic For Testing


Publication: Information and Software Technology
Authors: Jeff Anderson, Maral Azizi, Saeed Salem, and Hyunsook Do
URL: https://www.sciencedirect.com/science/article/abs/pii/S0950584919301223?dgcid=rss_sd_all

Title: On the Use of Usage Patterns from Telemetry Data for Test Case Prioritization

In an original work on using production traffic for testing, Jeff Anderson, Maral Azizi, Saeed Salem, and Hyunsook Do explore a new opportunity in the area of regression testing techniques. Here is the abstract:

Context: Modern applications contain pervasive telemetry to ensure reliability and enable monitoring and diagnosis. This presents a new opportunity in the area of regression testing techniques, as we now have the ability to consider usage profiles of the software when making decisions on test execution.

Objective: The results of our prior work on test prioritization using telemetry data showed improvements in test suite reduction and test execution time. The objective of this paper is to further investigate this approach and apply prioritization based on multiple prioritization algorithms to an enterprise-level cloud application as well as open source projects. We aim to provide an effective prioritization scheme that practitioners can implement with minimum effort. The other objective is to compare the results and benefits of this technique with code coverage-based prioritization, the most commonly used test prioritization technique.

Method: We introduce a method for identifying usage patterns based on telemetry, which we refer to as “telemetry fingerprinting.” Through the use of various algorithms to compute fingerprints, we conduct empirical studies on multiple software products to show that telemetry fingerprinting can be used to more effectively prioritize regression tests.
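
As a very rough sketch of how a fingerprint can drive prioritization (feature names, usage weights, and the test-to-feature mapping are all invented; the paper's fingerprinting algorithms are more elaborate), one can score each regression test by the production-usage weight of the functionality it exercises and order the suite accordingly:

    import java.util.*;

    public class TelemetryPrioritizer {

        public static void main(String[] args) {
            // usage fingerprint: relative frequency of each feature in production
            Map<String, Double> fingerprint = Map.of(
                    "checkout", 0.55,
                    "search", 0.30,
                    "profile", 0.10,
                    "admin", 0.05);

            // which features each regression test exercises (hypothetical)
            Map<String, Set<String>> testsToFeatures = Map.of(
                    "CheckoutTest", Set.of("checkout"),
                    "SearchTest", Set.of("search"),
                    "AdminTest", Set.of("admin"),
                    "EndToEndTest", Set.of("checkout", "search", "profile"));

            // score each test by the total usage weight of the features it covers
            List<String> ordered = testsToFeatures.entrySet().stream()
                    .sorted(Comparator.comparingDouble((Map.Entry<String, Set<String>> e) ->
                            e.getValue().stream().mapToDouble(fingerprint::get).sum()).reversed())
                    .map(Map.Entry::getKey)
                    .toList();

            System.out.println(ordered); // [EndToEndTest, CheckoutTest, SearchTest, AdminTest]
        }
    }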

Results: Our experimental results show that the proposed techniques reduced regression test suite run times by over 30 percent compared to the coverage-based prioritization technique, while still detecting the discoverable faults. Further, the results indicate that fingerprints are effective in identifying usage patterns, and that they can be applied to improve regression testing techniques.

Conclusion: In this research, we introduce the concept of fingerprinting software usage patterns through telemetry. We provide various algorithms to compute fingerprints and conduct empirical studies that show that fingerprints are effective in identifying distinct usage patterns. By applying these techniques, we believe that regression testing techniques can be improved beyond the current state-of-the-art, yielding additional cost and quality benefits.

Apr 02 2019

Search-Based Test Case Implantation for Testing Untested Configurations

Publication: Information and Software Technology
Authors: Dipesh Pradhan, Shuai Wang, Tao Yue, Shaukat Ali, Marius Liaaen
URL: https://www.sciencedirect.com/science/article/pii/S0950584919300540?dgcid=rss_sd_all

Title: Search-Based Test Case Implantation for Testing Untested Configurations

Context
Modern large-scale software systems are highly configurable, and thus require a large number of test cases to be implemented and revised for testing a variety of system configurations. This makes testing highly configurable systems very expensive and time-consuming.

Objective
Driven by our industrial collaboration with a video conferencing company, we aim to automatically analyze and implant existing test cases (i.e., an original test suite) to test the untested configurations.

Method
We propose a search-based test case implantation approach (named SBI) consisting of two key components: 1) a test case analyzer that statically analyzes each test case in the original test suite to obtain the program dependence graph of its statements, and 2) a test case implanter that uses multi-objective search to select suitable test cases for implantation, using three operators (selection, crossover, and mutation) at the test suite level, and implants the selected test cases using a mutation operator at the test case level with three operations (addition, modification, and deletion).
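
As a hedged illustration of the test-case-level mutation operator named above (test statements modeled as plain strings, donor statements invented; this shows the operator's shape, not the authors' implementation):

    import java.util.*;

    public class TestCaseMutator {
        private static final Random RNG = new Random(42);

        // One mutation of a test case: add, modify, or delete a statement.
        static List<String> mutate(List<String> testCase, List<String> donorStatements) {
            List<String> mutant = new ArrayList<>(testCase);
            int pos = RNG.nextInt(mutant.size());
            switch (RNG.nextInt(3)) {
                case 0 -> mutant.add(pos, pick(donorStatements));        // addition
                case 1 -> mutant.set(pos, pick(donorStatements));        // modification
                case 2 -> { if (mutant.size() > 1) mutant.remove(pos); } // deletion
            }
            return mutant;
        }

        static String pick(List<String> donors) {
            return donors.get(RNG.nextInt(donors.size()));
        }

        public static void main(String[] args) {
            List<String> test = List.of(
                    "conf.set(\"codec\", \"h264\");",
                    "session.start(conf);",
                    "assertTrue(session.isActive());");
            List<String> donors = List.of(          // statements mined from other tests
                    "conf.set(\"codec\", \"vp9\");",
                    "conf.set(\"resolution\", \"1080p\");");
            System.out.println(mutate(test, donors));
        }
    }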

Results
We empirically evaluated SBI with an industrial case study and an open source case study by comparing the implanted test suites produced by three variants of SBI with the original test suite, using evaluation metrics such as statement coverage (SC), branch coverage (BC), and mutation score (MS). Results show that for both case studies, the test suites implanted by the three variants of SBI performed significantly better than the original test suites. The best variant of SBI achieved on average 19.3% higher coverage of configuration variable values for both case studies. Moreover, for the open source case study, the best variant of SBI managed to improve SC, BC, and MS by 5.0%, 7.9%, and 3.2%, respectively.

Conclusion
SBI can be applied to automatically implant a test suite with the aim of testing untested configurations and thus achieving higher configuration coverage.

Mar 22 2019

Configuration Tests: the JHipster web development stack use case


Title: Test them all, is it worth it? Assessing configuration sampling on the JHipster Web development stack
Authors: Axel Halin, Alexandre Nuttinck, Mathieu Acher, Xavier Devroey, Gilles Perrouin, Benoit Baudry
Publication: Springer Empirical Software Engineering, April 2019, vol. 24

A group of software researchers, partially involved in the EU-funded STAMP project, has published interesting results on configuration testing of JHipster, a popular open source application generator for Spring Boot and Angular/React projects.

Abstract: Many approaches for testing configurable software systems start from the same assumption: it is impossible to test all configurations.
This motivated the definition of variability-aware abstractions and sampling techniques to cope with large configuration spaces. Yet, there is no theoretical barrier that prevents the exhaustive testing of all configurations by simply enumerating them, if the effort required to do so remains acceptable. Not only this: we believe there is a lot to be learned by systematically and exhaustively testing a configurable system. In this case study, we report on the first ever endeavour to test all possible configurations of the industry-strength, open source configurable software system JHipster, a popular code generator for web applications. We built a testing scaffold for the 26,000+ configurations of JHipster using a cluster of 80 machines during 4 nights, for a total of 4,376 hours (182 days) of CPU time. We find that 35.70% of the configurations fail, and we identify the feature interactions that cause the errors. We show that sampling strategies (like dissimilarity and 2-wise):
(1) are more effective at finding faults than the 12 default configurations used in the JHipster continuous integration;
(2) can be too costly and exceed the available testing budget. We cross this quantitative analysis with the qualitative assessment of JHipster’s lead developers.
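
To make the two ingredients concrete, here is a hedged Java sketch (a toy feature model with invented options, nowhere near JHipster's real 26,000+ configurations): it enumerates the full configuration space, then draws a dissimilarity-driven sample by greedily maximizing the minimum Hamming distance to the configurations already chosen.

    import java.util.*;

    public class ConfigSampler {

        public static void main(String[] args) {
            // toy feature model (hypothetical options)
            Map<String, List<String>> features = new LinkedHashMap<>();
            features.put("database", List.of("h2", "mysql", "postgresql"));
            features.put("auth", List.of("jwt", "oauth2", "session"));
            features.put("client", List.of("angular", "react"));

            List<Map<String, String>> all = enumerate(features);
            System.out.println("Total configurations: " + all.size()); // 3 * 3 * 2 = 18

            // dissimilarity sampling: pick configurations far from those already chosen
            for (Map<String, String> config : dissimilaritySample(all, 5)) {
                System.out.println(config);
            }
        }

        // Cartesian product of all feature options.
        static List<Map<String, String>> enumerate(Map<String, List<String>> features) {
            List<Map<String, String>> configs = new ArrayList<>();
            configs.add(new LinkedHashMap<>());
            for (Map.Entry<String, List<String>> feature : features.entrySet()) {
                List<Map<String, String>> extended = new ArrayList<>();
                for (Map<String, String> partial : configs) {
                    for (String option : feature.getValue()) {
                        Map<String, String> config = new LinkedHashMap<>(partial);
                        config.put(feature.getKey(), option);
                        extended.add(config);
                    }
                }
                configs = extended;
            }
            return configs;
        }

        // Greedy max-min selection under Hamming distance.
        static List<Map<String, String>> dissimilaritySample(List<Map<String, String>> all, int k) {
            List<Map<String, String>> chosen = new ArrayList<>();
            chosen.add(all.get(0));
            while (chosen.size() < k) {
                Map<String, String> best = all.get(0);
                int bestDistance = -1;
                for (Map<String, String> candidate : all) {
                    int distance = chosen.stream()
                            .mapToInt(c -> hamming(c, candidate)).min().orElse(0);
                    if (distance > bestDistance) {
                        bestDistance = distance;
                        best = candidate;
                    }
                }
                chosen.add(best);
            }
            return chosen;
        }

        static int hamming(Map<String, String> a, Map<String, String> b) {
            int distance = 0;
            for (String key : a.keySet()) {
                if (!a.get(key).equals(b.get(key))) distance++;
            }
            return distance;
        }
    }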

Read the full article on the Springer website

Feb 27 2019

Five Machine Learning Usages in Software Testing

According to the Reqtest team, machine learning is a hot trend this year, bringing revolutionary changes in workflows and processes.
In software testing, machine learning can be used for:

  • Test suite optimization, to identify redundant and unique test cases (see the sketch after this list).
  • Predictive analytics, to predict the key parameters of software testing processes on the basis of historical data.
  • Log analytics, to identify the test cases which need to be executed automatically.
  • Traceability, extracting keywords from the Requirements Traceability Matrix (RTM) to achieve test coverage.
  • Defect analytics, to identify high-risk areas of the application for the prioritization of regression test cases.
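
Here is the sketch for the first item (coverage sets invented; real ML-based optimization clusters richer execution profiles): a test is flagged as redundant when the code it covers is strictly contained in the coverage of another test.

    import java.util.*;

    public class RedundancyCheck {

        public static void main(String[] args) {
            // hypothetical per-test coverage: which methods each test executes
            Map<String, Set<String>> coverage = Map.of(
                    "testA", Set.of("m1", "m2", "m3"),
                    "testB", Set.of("m2", "m3"),   // subset of testA -> redundant
                    "testC", Set.of("m3", "m4"));  // covers m4 uniquely -> keep

            for (String t : coverage.keySet()) {
                boolean redundant = coverage.entrySet().stream()
                        .anyMatch(e -> !e.getKey().equals(t)
                                && e.getValue().containsAll(coverage.get(t))
                                && !e.getValue().equals(coverage.get(t)));
                System.out.println(t + (redundant ? " is redundant" : " is unique"));
            }
        }
    }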

Read nine more recent testing trends from the Reqtest editors.

Feb 25 2019

Maven Central Top Libraries


Analysing the Maven Central Repository during the second half of 2018, a group of researchers led by Benoit Baudry, Professor in Software Technology at the KTH Royal Institute of Technology, found that Maven Central contains more than 2.5 million artifacts, a real treasure trove of software development. More than 17% of the libraries have several versions that are actively used by a large number of clients.
However, 1.3 million of the declared dependencies are actually never used. Also, a vast majority of APIs can be reduced to a small, compact core and still serve most of their clients.

For a more accurate exploration of the Maven Central ecosystem, read Benoit Baudry's article posted on Medium.com:
A journey at the heart of 2.4 million Maven artifacts

Feb 11 2019

Global vs Local Coverage


On the XWiki project, we've been pursuing a strategy of automatically failing our Maven build whenever the test coverage of a Maven module falls below a threshold indicated in that module's pom.xml. We're using Jacoco to measure this local coverage.
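
The setup looks roughly like the following jacoco-maven-plugin configuration (the version number and the 74% threshold are illustrative, not XWiki's actual values):

    <!-- in the module's pom.xml; version and threshold are illustrative -->
    <plugin>
      <groupId>org.jacoco</groupId>
      <artifactId>jacoco-maven-plugin</artifactId>
      <version>0.8.11</version>
      <executions>
        <execution>
          <id>prepare-agent</id>
          <goals><goal>prepare-agent</goal></goals>
        </execution>
        <execution>
          <id>check-coverage</id>
          <goals><goal>check</goal></goals>
          <configuration>
            <rules>
              <rule>
                <element>BUNDLE</element>
                <limits>
                  <limit>
                    <counter>INSTRUCTION</counter>
                    <value>COVEREDRATIO</value>
                    <minimum>0.74</minimum>
                  </limit>
                </limits>
              </rule>
            </rules>
          </configuration>
        </execution>
      </executions>
    </plugin>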

We've been doing this for over 6 years now and we've been generally happy about it. This has allowed us to raise the global test coverage of XWiki by a few percent every year.

More recently, I joined the STAMP European research project, and one of our KPIs is global coverage, so I got curious and wanted to look at precisely how much we gain every year.

I realized that, while we've been generally increasing our global coverage (computed using Clover), there are times when it actually drops or barely increases, even though at the local level every module increases its local coverage...
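
A toy computation (all numbers invented) shows how this can happen: global coverage is a size-weighted average, so a large, lightly-tested new module can drag it down even while every existing module improves.

    public class CoverageMath {
        public static void main(String[] args) {
            // year 1: module A = 100 lines @ 80%, module B = 1000 lines @ 50%
            double global1 = (100 * 0.80 + 1000 * 0.50) / (100 + 1000);
            // year 2: A improves to 85%, B improves to 52%, but a new
            // 2000-line module C arrives with only 40% coverage
            double global2 = (100 * 0.85 + 1000 * 0.52 + 2000 * 0.40) / 3100;
            System.out.printf("global year 1 = %.1f%%%n", global1 * 100); // ~52.7%
            System.out.printf("global year 2 = %.1f%%%n", global2 * 100); // ~45.3%
        }
    }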

Read the full post and learnings from Vincent Massol, XWiki CTO

Dec 21 2018

Short circuiting method executions to assess test quality

Today, a Medium article by Benoit Baudry, Professor in Software Technology at KTH and STAMP project coordinator, shares interesting results about the Descartes mutation testing tool. This software can automatically short-circuit covered methods and determine a list of pseudo-tested methods in Java projects. Running Descartes on 21 open source Java projects, the team analyzed more than 28,000 methods, with three main results (a sketch of the short-circuit transformation follows the list):

  • Short-circuiting the complete execution of methods provides valuable feedback to developers. A developer gets a clear goal when writing a test: make the method no longer pseudo-tested. Developers are also more comfortable reasoning at the granularity of a method than at the statement level of fine-grained traditional mutation testing.
  • Short-circuiting methods revealed pseudo-tested methods in all the projects we analyzed, even those with very high code coverage. Development teams of all Java projects can benefit from this type of analysis to assess and improve their test suites.
  • Interviews with developers reveal that some pseudo-tested methods point to major weaknesses in the test suite. We have collected empirical evidence of test suites being fixed after a short-circuiting experiment.
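
For readers new to the technique, here is a hedged sketch of what short-circuiting means (hypothetical method and constants; the actual tool transforms compiled Java projects through the PIT framework):

    // Hedged sketch of extreme mutation, the analysis behind Descartes.
    public class ExtremeMutationSketch {

        // original method, covered by the test suite
        public int discount(int price) {
            return price > 100 ? price / 10 : 0;
        }

        // the kind of extreme mutants tried for an int method, shown under
        // different names here so that the sketch compiles:
        public int discountMutant0(int price) { return 0; }
        public int discountMutant1(int price) { return 1; }

        // If the test suite stays green when either mutant replaces the
        // original body, discount() is reported as pseudo-tested: tests
        // execute it, but none of them observes its result.
    }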