
Jan 14 2019

Andy Zaidman, TU Delft Associate Professor in Software Engineering

STAMP uses state-of-the-art search-based software engineering techniques to reproduce existing crashes


How would you present STAMP?

STAMP is about being smart about testing. We know we need to test, but often we don’t do it or we don’t do it fully enough. What STAMP brings to the table is making full use of the tests that are already there and, by applying smart approaches, creating additional tests.

Some of the key technologies that STAMP uses are so-called amplification operators, or small systematic changes that are made to test code, so that the test’s course is altered. If this leads to exercising an interesting path through the code, potentially uncovering a bug, we have a new test that we can add. 
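To make the idea of an amplification operator concrete, here is a minimal sketch in Java. It is an illustration only, not code from any STAMP tool: the class `LiteralAmplifier`, its methods and the `percentOf` example are hypothetical names, and "interesting" is narrowed here to "the variant makes the code under test throw".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntUnaryOperator;

// Hypothetical illustration; not the API of any STAMP tool.
final class LiteralAmplifier {

    // Amplification operator: derive small systematic variants of an integer test input.
    static List<Integer> amplify(int original) {
        return List.of(original + 1, original - 1, original * 2, -original, 0);
    }

    // Keep only the variants that make the code under test throw,
    // i.e. inputs that exercise a path the original test did not reach.
    static List<Integer> interestingInputs(IntUnaryOperator codeUnderTest, int original) {
        List<Integer> interesting = new ArrayList<>();
        for (int variant : amplify(original)) {
            try {
                codeUnderTest.applyAsInt(variant);
            } catch (RuntimeException e) {
                interesting.add(variant);
            }
        }
        return interesting;
    }

    public static void main(String[] args) {
        // Code under test: fails on a zero divisor, which the original input 4 never triggers.
        IntUnaryOperator percentOf = divisor -> 100 / divisor;
        System.out.println(interestingInputs(percentOf, 4));
    }
}
```

Starting from the original input 4, only the variant 0 survives the filter, and it would become the input of a new candidate test.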

Similarly, using techniques borrowed from artificial intelligence, we construct a new test that replicates a crash that has previously occurred. 


What is your role in STAMP?

My primary role is to supervise and guide the two researchers from Delft University of Technology who work on crash replication. As such, I am primarily concerned with WP3, the work package on runtime test amplification.

What key innovation do you bring or help to develop? 

We use state-of-the-art search-based software engineering techniques to reproduce existing crashes. We start from a stack trace, which is a simple list of the method calls that were executed just before the program crashed. Then, using a genetic algorithm we try to approximate the exact sequence of method calls from the stack trace, with the expectation that we thus also recreate the crash. 
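A search of this kind needs a way to score how close a candidate test comes to the target crash. The sketch below shows one possible fitness function, under the assumption that a stack trace is reduced to a list of method names with the innermost frame first. The class `StackTraceFitness` and the frame names are hypothetical, and the fitness actually used in STAMP is not described here. A genetic algorithm would evolve candidate call sequences and prefer those with a lower distance.

```java
import java.util.List;

// Hypothetical illustration of a fitness function for crash reproduction.
final class StackTraceFitness {

    // Fitness to minimise: number of target frames that the candidate's trace
    // fails to reproduce at the same depth (innermost frame first).
    // 0 means the candidate reproduces every frame of the target trace.
    static int distance(List<String> target, List<String> candidate) {
        int mismatches = 0;
        for (int i = 0; i < target.size(); i++) {
            boolean matches = i < candidate.size() && target.get(i).equals(candidate.get(i));
            if (!matches) {
                mismatches++;
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        List<String> target = List.of("Parser.readToken", "Parser.parse", "App.load");
        List<String> candidate = List.of("Parser.readToken", "Parser.parse", "App.main");
        System.out.println(distance(target, candidate));
    }
}
```

In this example the candidate matches the two innermost frames and misses the third, so its distance is 1.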

Being able to recreate a crash is important, because it is often the first step in understanding and subsequently debugging the software. In addition, once the bug has been resolved, the test can be altered, so that it becomes a regression test.

A word about yourself and your organization

I am an associate professor in software engineering working specifically in the area of software testing. I study how people test software and how to make the process of testing easier.

Dec 19 2018

Vincent Massol, XWiki CTO

STAMP innovates running UI functional tests on various configurations and environments


How would you present STAMP?

STAMP is a European Research project with the aim of pushing the limits in Java software testing. Its novelty comes from the focus it has, which is to try to generate new tests based on existing tests. This is different from other initiatives trying to generate tests from source code only and this gives STAMP a much higher chance of getting tangible results. 

STAMP develops three key technologies:

1) The usage of Mutation Testing as a way to measure existing test quality but more importantly as a strategy to mutate test code to generate new tests.

2) The ability to mutate Dockerfiles to execute test suites under various configurations and thus to automatically find out which configurations are supported by the software.

3) The ability to take a Java stack trace from production and to generate automatically a test that, when executed, leads to exactly the same stack trace! This means finding the conditions leading to the error and thus providing the developers help to fix the issue.


What is your role in STAMP?

XWiki contributes to STAMP mostly as a Use Case Provider, which means providing needs and use cases for the various tools being developed by STAMP. It also means testing the various tools developed by the Academic partners on a real production project with not only a sizable code base (hundreds of thousands of lines of code), but also a relatively well-tested code base (70% test coverage overall, more than 10K tests) and with a big focus on quality. Inside STAMP, XWiki also develops some tools and scripts such as:

  • Jenkins pipeline script to handle flickering tests so that they are recognized as flickers and don’t generate false positives
  • Maven plugin that compares OpenClover reports taken at two dates and can fail the build if a module’s contribution to the global test coverage is negative.
  • A functional test framework based on TestContainers, used in the XWiki code base to run XWiki’s Selenium tests across various configurations, inside Docker containers.
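The coverage comparison in the list above can be sketched as follows. This is a simplified illustration, not the XWiki Maven plugin: it compares per-module percentages directly instead of computing each module's contribution to the global coverage, and the class `CoverageGate`, the module names and the figures are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of a coverage gate between two reports.
final class CoverageGate {

    // Returns, in alphabetical order, the modules whose coverage percentage
    // is lower in the newer report than in the older one.
    static List<String> regressedModules(Map<String, Double> older, Map<String, Double> newer) {
        List<String> regressed = new ArrayList<>();
        for (Map.Entry<String, Double> entry : newer.entrySet()) {
            Double previous = older.get(entry.getKey());
            if (previous != null && entry.getValue() < previous) {
                regressed.add(entry.getKey());
            }
        }
        Collections.sort(regressed);
        return regressed;
    }

    public static void main(String[] args) {
        Map<String, Double> older = Map.of("core", 72.5, "ui", 60.0);
        Map<String, Double> newer = Map.of("core", 70.0, "ui", 61.0);
        List<String> regressed = regressedModules(older, newer);
        if (!regressed.isEmpty()) {
            // A build plugin would fail the build at this point.
            System.out.println("Coverage dropped in: " + regressed);
        }
    }
}
```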

What key innovation do you bring or help to develop? 

Key innovations:

  • Automated UI testing using Docker and Selenium to reduce bugs related to configurations
  • Increasing test quality through Mutation Testing. Code coverage doesn’t guarantee the quality of tests: it only ensures that the code is reached, and says nothing about assertions or whether the test is useful.
  • New test generation (from production stack traces, from existing tests)
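The point about coverage versus assertions can be shown with a small sketch. It is a hand-written illustration, not output from a mutation testing tool: the class `MutationDemo`, the two tests and the single mutant are hypothetical. Both tests execute the same code and so give the same coverage, but only the one that checks the result detects the mutant.

```java
import java.util.function.IntBinaryOperator;
import java.util.function.Predicate;

// Hypothetical illustration: same coverage, different ability to detect a mutant.
final class MutationDemo {

    static final IntBinaryOperator ORIGINAL = (a, b) -> a + b;
    static final IntBinaryOperator MUTANT = (a, b) -> a - b; // mutation: + replaced by -

    // Executes the code under test but asserts nothing about the result.
    static boolean weakTest(IntBinaryOperator add) {
        add.applyAsInt(2, 3);
        return true;
    }

    // Executes the same code and checks the result.
    static boolean strongTest(IntBinaryOperator add) {
        return add.applyAsInt(2, 3) == 5;
    }

    // A mutant is killed when a test that passes on the original fails on the mutant.
    static boolean kills(Predicate<IntBinaryOperator> test) {
        return test.test(ORIGINAL) && !test.test(MUTANT);
    }

    public static void main(String[] args) {
        System.out.println("weak test kills mutant: " + kills(MutationDemo::weakTest));
        System.out.println("strong test kills mutant: " + kills(MutationDemo::strongTest));
    }
}
```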

The main challenges we face are about generating new tests from existing ones and doing that in a reasonable timeframe. Another difficulty is in generating new tests that are relevant to the developers and discarding those that are not.

The most useful innovation for XWiki and the one we’re very excited about is the ability to run UI functional tests on various configurations/environments. We’ve already put this in production and we can run XWiki tests on Tomcat/Jetty/MySQL/PostgreSQL/HSQLDB/Chrome/Firefox/LibreOffice and all combinations of these environments and different versions of them. This has already allowed us to find several bugs that were only happening on some environments.
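Running "all combinations" of environments amounts to enumerating a configuration matrix. The sketch below shows that enumeration in plain Java. It is an illustration only and does not use the XWiki test framework or TestContainers: the class `ConfigurationMatrix` is a hypothetical name, and the three axes are a subset of the environments named above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration: enumerate every combination of configuration options.
final class ConfigurationMatrix {

    // Builds the cartesian product of the axes, one option per axis in each result.
    static List<List<String>> combinations(List<List<String>> axes) {
        List<List<String>> result = new ArrayList<>();
        result.add(List.of());
        for (List<String> axis : axes) {
            List<List<String>> extended = new ArrayList<>();
            for (List<String> partial : result) {
                for (String option : axis) {
                    List<String> next = new ArrayList<>(partial);
                    next.add(option);
                    extended.add(next);
                }
            }
            result = extended;
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<String>> axes = List.of(
                List.of("Tomcat", "Jetty"),
                List.of("MySQL", "PostgreSQL", "HSQLDB"),
                List.of("Chrome", "Firefox"));
        // Each combination would be one environment in which the UI tests are run.
        combinations(axes).forEach(System.out::println);
    }
}
```

Two servers, three databases and two browsers already give 12 environments, before versions are taken into account, which is why automating the runs matters.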

A word about yourself and your organization

I’ve been working for the XWiki open source project since 2005 and for the XWiki SAS company (as its CTO) since 2006. XWiki SAS is sponsoring the development of the XWiki open source project and I have the privilege to lead this team of talented developers. What’s interesting is that there’s a complete separation between the XWiki SAS company and the open source project, which has its own governance. We strongly believe in community-driven open source!

The XWiki open source project has always been very keen on software quality and over the years we’ve implemented a lot of testing strategies and tools to ensure this quality (check my blog post to know more about what we are working on). Thus, we were very happy to join the STAMP research project to try to push test quality even further and participate in the future of testing.


Oct 15 2018

Henry Coles, PITest designer

Practicing effective mutation testing


How do you see mutation testing tools being adopted in business projects, as an effective method with significant benefits on software updates? 

The most effective way I've seen mutation testing tools being deployed in business projects is when a developer simply starts using one locally to check their own work as they develop. If they are self motivated to write good code and good tests then the tool saves them time and effort by automating some of the thinking and highlighting areas of the code and tests that need attention. 

I am often contacted by people trying a top down approach where mutation testing is set up on a CI server and developers are then told that they must attain some mutation score or other. I've never tried this with mutation testing, but have worked in situations where this has been done with plain code coverage. The results were never good: the code coverage scores went up, but the benefit that these scores are meant to be a proxy for (good tests that enable the code to be refactored) was never achieved. Instead developers generally wrote tests that were either ineffective or, much worse, tightly coupled and overfitted to the code, making it hard to change.

Can you give us more information about today’s business challenges solved by Pitest? How is it relevant in today's environment compared to 10 years ago?

The challenge is "how can we push out code that works and is easy to change, in order to enable our business of doing X". 

A key learning in industry over the last 20 years or so is that simple code with well written tests and quick feedback cycles, is easier to change than "clever" code without tests or with slow feedback cycles. Pitest automates checking one aspect of a good test (its strength), so allows code to be developed more quickly with fewer mistakes.

Unfortunately strength is only one aspect of what makes a test good so, like any tool, the person wielding it needs skill and understanding about what it is they are trying to achieve.

I don't think the relevance has changed greatly compared to 10 years ago. I often see it asserted that mutation testing tools are more relevant now as we have access to more computational power, but I am not sure this has been the main driving factor. Looking at the tools that are in common use (pitest, mutant, infection) the main change has been along the lines of "do-smarter": making more effective use of the available CPU cycles. It is telling that languages such as C#, where tools that implement the key coverage targeting optimisation have not been available, have seen little practical uptake of mutation testing, despite the equal availability of more computational power.
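One common reading of coverage targeting is that a mutant only needs to be run against the tests that execute the mutated line, since no other test could detect it. The sketch below illustrates that selection step under this assumption. It is not pitest's implementation: the class `CoverageTargeting`, the test names and the line numbers are hypothetical.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration of selecting tests per mutant from coverage data.
final class CoverageTargeting {

    // coverage maps a test name to the set of line numbers that test executes.
    // Only tests that execute the mutated line can possibly detect the mutant.
    static Set<String> testsToRun(Map<String, Set<Integer>> coverage, int mutatedLine) {
        Set<String> selected = new TreeSet<>();
        for (Map.Entry<String, Set<Integer>> entry : coverage.entrySet()) {
            if (entry.getValue().contains(mutatedLine)) {
                selected.add(entry.getKey());
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        Map<String, Set<Integer>> coverage = Map.of(
                "testAdd", Set.of(10, 11),
                "testDivide", Set.of(20, 21),
                "testAddThenDivide", Set.of(10, 11, 20));
        // A mutant on line 20 is run against two tests instead of all three.
        System.out.println(testsToRun(coverage, 20));
    }
}
```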

What kind of sector in the industry is using more Pitest? And how frequently?

I have a small insight into this based on hits to the pitest website from links hosted on corporate CI servers. Based on this the main users would appear to be the financial services and insurance industry, but this sample is skewed towards people running pitest in a certain way.

Based on contacts via e-mail, pitest is being used in a very wide variety of industries (recruitment, bio tech, fashion, tractor sales, big science). It is not possible to say how frequently. Personally, I run it against code I'm working on up to a few hundred times a day.

Do you have to convince new DevOps teams that mutation testing improves the code quality? Is that a cultural challenge?

I don't try to convince people to use mutation testing. If a team needs convincing there are probably more important things they need to do before they start using a mutation testing tool (learning how to write "good" tests being the most important one - where "good" covers much more than the test's ability to detect faults).

There is an infinite range of different attitudes and cultures in development, ranging from "tests are a waste of time" through to "I see no point in mutation testing because I'm a rockstar developer and there can't possibly be anything wrong with my tests".

So, yes, it can be a cultural challenge. Other teams instantly see the benefits and use it enthusiastically.

Would you appreciate more collaborations with STAMP project partners? On which aspects?

While I would love to be involved in the STAMP project and its partners, mutation testing is currently something I can only work on in my spare time, which is very limited. That said, an area I currently find particularly interesting is mutant subsumption.


Henry Coles (@0hjc) is Head Of Development Practice at NCR Edinburgh, Scotland. He has been writing software professionally for 20 years, most of it in Java. Henry has produced many open source tools including pitest and an open source book, Java for Small Teams.

Sep 18 2018

Brice Morin, SINTEF Senior Research Scientist

CAMP builds a set of Docker images to test multiple configurations


How would you present STAMP? 

Writing rich test suites able to bring a high level of confidence in your software is a costly and time-consuming endeavour. STAMP takes your existing tests and automatically amplifies them, to generate more tests, increasing test coverage and reducing the number of regressions.

What is your role in STAMP?

I am involved in Work Package 2, where SINTEF and other partners develop novel techniques for configuration testing, in particular leveraging new and rapidly-adopted technologies such as Docker containers.


What key innovation do you bring or help to develop?

With the rapid adoption of new development paradigms such as microservice architectures, configuration testing is becoming a crucial task within the DevOps process, for example to ensure that containers, typically building on top of third-party base images, can still work whenever a base image is updated. To help with configuration testing, we have developed CAMP, which automatically builds a set of Docker images and configurations from a set of constraints, in order to test the system in multiple configurations.
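The idea of deriving configurations from constraints can be sketched as follows. This is not CAMP's code or input format: the class `ConstrainedDockerfiles`, the image tags, the `DATABASE` variable and the constraint are hypothetical, and brute-force enumeration stands in for whatever solving technique CAMP actually uses.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiPredicate;

// Hypothetical illustration: generate Dockerfile variants that satisfy a constraint.
final class ConstrainedDockerfiles {

    // Enumerates every (base image, database) pair, keeps those that satisfy
    // the constraint, and renders one Dockerfile text per surviving pair.
    static List<String> generate(List<String> baseImages, List<String> databases,
                                 BiPredicate<String, String> constraint) {
        List<String> dockerfiles = new ArrayList<>();
        for (String image : baseImages) {
            for (String database : databases) {
                if (constraint.test(image, database)) {
                    dockerfiles.add("FROM " + image + "\nENV DATABASE=" + database + "\n");
                }
            }
        }
        return dockerfiles;
    }

    public static void main(String[] args) {
        // Hypothetical constraint: the older base image is not supported with PostgreSQL.
        BiPredicate<String, String> supported =
                (image, database) -> !(image.equals("tomcat:8") && database.equals("postgresql"));
        List<String> dockerfiles = generate(
                List.of("tomcat:8", "tomcat:9"), List.of("mysql", "postgresql"), supported);
        dockerfiles.forEach(System.out::println);
    }
}
```

Each generated text could then be built into an image and used to run the same test suite, so that a failure points at a specific configuration.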

A word about yourself and your organization

SINTEF is a large, multi-disciplinary and independent research institute. SINTEF Digital is an institute within SINTEF, whose mission is to support and accelerate the digitization of our society. New paradigms such as Cloud Computing and Microservices are important drivers for digitization, and SINTEF Digital has developed strong experience in Cloud systems, e.g. through FP7 MODAClouds and PaaSage, and in microservices through strategic internal projects.

Jul 05 2018

Daniele Gagliardi, Engineering Group Technical Manager

An augmented software testing design now delivered as a pure service


How would you present STAMP? 

From my point of view STAMP is some kind of augmented reality in the service of software testing design: when I, as a developer, design test cases and test configurations for my software, STAMP helps me to enhance my design with several variants, also assessing how good my test design is.
It’s not a bare “generate-automatically-code” tool that works for me (designing a test case is an intellectual human activity that no computer in the world can emulate), it’s a tool that empowers my design.

What is your role in STAMP?

I’m the WP4 leader. This work package aims to integrate STAMP into developers’ toolboxes and toolchains, and to provide potential STAMP users with relevant documentation and courseware to adopt it as easily as possible. In this activity, I have had a very fruitful collaboration with all other partners, mainly with OW2, INRIA, ATOS, XWiki and ActiveEon as they are the main partners involved in WP4, and this is turning out to be a very enriching experience for me.


What key innovation do you bring or help to develop?

At the beginning of the project, the main challenge was to grasp the concept of mutation testing, which is not so well known in the day-to-day software development activities of a system integrator such as Engineering Group.
Once the internals of mutation testing and the automatic generation of test cases and test configurations became clearer, the next question was: how can this be integrated into everyday software tools so that it becomes easy to use? So far we have focused on the most used tools (developer productivity tools such as Maven, Gradle and Eclipse, and automation tools such as Jenkins and GitLab), but the exciting aspect is making these amplification services available as microservices, in order to offer them as cloud services. This packaging is interesting because it frees STAMP adopters from the constraint of having STAMP execution environments (system and hardware resources that need to be managed), providing this “augmented software testing design” as a pure service.

A word about yourself and your organization

I’m an employee at Engineering Group and I’m working within a team which provides all company employees with the tools and the best practices needed to do their job in the best way. We provide an infrastructure that helps people to work in agile or traditional ways, with all the tools needed to automate software development and quality assurance processes as much as possible. Moreover, we provide internal consultancy about software testing with a focus on test automation, performance and security tests, and teach several courses about software testing in our corporate IT & Management School. The involvement in the STAMP project was a natural consequence.


Daniele Gagliardi is an electronic engineer with a passion for computer science. He's currently working as a technical manager at Engineering Group, in Padua (Italy) with a small but great team of 9 people supporting all Engineering Group employees in making their software solutions as good as possible with the best testing tools and methodologies available today.

Jun 01 2018

Jesús Gorroñogoitia, Research Line Expert on Software Engineering, ATOS Research

Improving bug detection wherever the software is executed


How would you present STAMP? 

STAMP aims at improving the Software Engineering QA process in a 3-dimensional approach that largely improves the efficiency of the design, implementation and execution of test cases on a SUT (System under test) over multiple configurations. These 3 dimensions are: 

1) amplifying the test cases, in terms of their number and quality (i.e. quality of assertions),
2) amplifying the SUT configurations and executions,
3) amplifying the reproduction of runtime crashes (i.e. reproducing runtime exceptions). 

The ultimate purpose of these amplifications is to improve the ability of the testing process to detect and anticipate software defects under all the circumstances this software is executed.

What is your role in STAMP?

My main role is to lead the industrial validation of the STAMP results, in terms of the adequacy (i.e. fitness for purpose) of the STAMP techniques, methods, tools and services to improve the industrial QA process in software engineering. I am also playing a secondary role as leader of the Atos team in the industrialization of the STAMP tools, particularly integrating them within the Eclipse IDE, as well as improving their performance.


What key innovation do you bring or help to develop?

We are bringing MDE techniques for abstracting container-based configuration deployment technologies (e.g. Docker, Ansible, Chess, etc.) into a platform independent metamodel (DSL) for test configuration amplification. We have extensive expertise in IDE tooling development for software engineering and in the optimization of processes in the JVM.

A word about yourself and your organization

I am working as Research Line Expert in Software Engineering in the IT group of the Atos Research and Innovation Department (ARI). In the last 12 years I’ve been working on EU funded projects from the FP6 program, on topics such as Service Oriented Computing (SOC), Model Driven Engineering (MDE), Open-Source Software Collaborative Environments, Autonomous Computing, Testing or Semantics.
ARI is the R&D hub for new technologies and a key reference for the whole Atos group. More than 150 employees in ARI are participating in the research, development and innovation (RDI) projects that enrich the Atos offer portfolio, market view or position with respect to emerging technologies.
Atos is a leader in digital services with pro forma annual revenue of circa € 13 billion and circa 100,000 employees in 73 countries, serving a global client base.


Jesús Gorroñogoitia has a degree in Theoretical Physics from the Universidad Complutense de Madrid (UCM), complemented by a Master in Condensed Matter and Statistical Physics from UNED (Madrid). He has been working in diverse ICT companies as Software Analyst and Architect for 20 years. In Atos Research & Innovation (ARI) he is currently the ARI Research Line Expert on Software Engineering, working on topics such as Service Oriented Computing (SOC), Model Driven Engineering (MDE), Open-Source Software Collaborative Environments, Autonomous Computing, Testing or Semantics. Currently, he has the role of architect and integration leader in the H2020 SUPERSEDE project, technical team leader in the H2020 STAMP project and technology consultant in the MegaM@ART ECSEL project. He is also a member of the OW2 Technology Council and the Cluster on Software Engineering for Services and Applications.

Feb 27 2018

Caroline Landry, Software Architect and Project Manager, INRIA

Reducing the number of regression bugs and improving test coverage


How would you present STAMP? 

The main goal of STAMP is to automatically generate tests from existing assets (scenarios, configurations and logs), to detect regressions and reduce test costs.
Writing and maintaining test suites manually is costly or … not done! :-)
So, using test amplification, an innovative technology, STAMP raises software quality by reducing the number of regression bugs and improving test coverage.

What is your role in STAMP?

As the technical project manager, I’m in charge of the global project organization and the reporting to the European Commission. I’m also involved in technical work packages, especially WP1, about Unit Test Amplification.


What key innovation do you bring or help to develop?

Test amplification is a new field of research in software testing, and the concept can be applied to several domains of software testing, which correspond to different steps in the project lifecycle:

  • unit tests (coding phase),
  • configuration tests (integration/validation phase),
  • log analysis (operational phase)

The work on unit test amplification also explores another innovative technology: extreme mutation testing. Mutation testing is a robust technique, but a drawback is the number of generated mutants, because the traditional approach works at the instruction level. Extreme mutation consists in removing all the instructions of a method at once, so it significantly reduces the number of mutants.
Industrial applications are obvious, at least for the STAMP team :-), though several challenges remain. Among them are the usability of such technologies in a DevOps approach, as software testing can be very time-consuming (and not just for humans), and the way to integrate the tools into a CI system.
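As a small illustration of extreme mutation, the sketch below shows a method next to a hand-written extreme mutant of it. It is not output from a STAMP tool: the class `ExtremeMutationDemo`, the `Pricing` interface and the tests are hypothetical, and replacing the stripped body with `return 0` is an assumption needed so that a non-void method still compiles. A traditional tool could generate several mutants from the three statements of `discount`, whereas here a single mutant covers the whole method.

```java
// Hypothetical illustration of an extreme mutant and the tests that face it.
final class ExtremeMutationDemo {

    interface Pricing {
        int discount(int price, boolean member);
    }

    // Original method.
    static int discount(int price, boolean member) {
        int rate = member ? 20 : 5;
        int reduced = price - price * rate / 100;
        return Math.max(reduced, 0);
    }

    // Extreme mutant: every instruction removed, a constant returned instead.
    static int discountMutant(int price, boolean member) {
        return 0;
    }

    // Only checks that the result is not negative.
    static boolean weakTest(Pricing pricing) {
        return pricing.discount(100, true) >= 0;
    }

    // Checks the exact expected value.
    static boolean strongTest(Pricing pricing) {
        return pricing.discount(100, true) == 80;
    }

    public static void main(String[] args) {
        System.out.println("weak test passes on mutant: "
                + weakTest(ExtremeMutationDemo::discountMutant));
        System.out.println("strong test passes on mutant: "
                + strongTest(ExtremeMutationDemo::discountMutant));
    }
}
```

The weak test still passes once the body is gone, which signals that it does not really verify the method, while the strong test fails on the mutant as it should.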

A word about yourself and your organization

I’m a software engineer, and I worked in industry for almost 30 years before joining Inria, the national institute of research on digital sciences. Research at Inria covers fields as diverse as healthcare, transport, energy, communications, security and privacy protection, smart cities and the factory of the future.
I’m a member of the DIVERSE team, which currently works on 4 main research axes: software language engineering, software variability, software adaptation and software diversification. But the foundation behind all our research activities is abstraction and model manipulation to automatically generate software.


Nov 10 2017

Benoit Baudry, Professor in Software Technology, KTH 

Automatically Enhancing Test Suites to Improve Software Quality


How would you present STAMP? 

STAMP addresses the need for increased quality of automatic testing in a continuous delivery pipeline. Companies that have adopted DevOps already have a culture of automatic testing, but also acknowledge that the quality of their test suites can be improved. STAMP develops technology that has exactly this objective: automatically enhance existing test assets, such as unit test suites or test configurations, to improve software quality in DevOps. 

What is your role in STAMP?

I am the scientific and technical coordinator of the project. As such, I lead all collaborative activities, actively disseminate the results of the project and coordinate the management tasks. I also coordinate the scientific and research activities on unit test amplification within WP1.


What key innovation do you bring or help to develop?

I contribute to the development and experimentation of a novel concept in the area of test automation, which is called “test amplification”. The key idea is to start from existing test assets, i.e., any program or script that already automates a testing task, and then generate variants of these assets through automatic transformations. The intuition is that these assets embed essential knowledge put there by a human developer, but that this knowledge is naturally only partial because it is manually defined. In this context, machines can be very good at exploring large quantities of variants that rely on the same knowledge but trigger diverse behaviors that need to be tested.

A word about yourself and your organization

I am a scientist working in the area of software engineering. Until 2017, I was at INRIA, in Rennes, France. Now, I am at KTH, the Royal Institute of Technology, in Stockholm, Sweden. I lead a group of students and engineers who investigate algorithms and tools to automatically diversify software components (unit test cases in STAMP, libraries and applications in the context of other projects).
I strongly believe in the value of EU projects to strengthen scientific collaborations within Europe, to increase the impact of science on innovation through direct experiments with use case providers and to increase the visibility of science and software tools through open source consortia.  

Learn more about testing your software tests with mutants through this EclipseCon Europe 2017 video presentation by Benoit Baudry: