Project Background Information: Software Data/Behaviour Mining

Swinburne University of Technology, Melbourne, Australia


The context. Today's software industry demands short development cycles and continuous delivery of software systems and services, as highlighted by the DevOps movement. To meet these needs, automated system testing must be conducted against production-like conditions throughout the development cycle. The highly interconnected nature of enterprise software systems poses a particular challenge, because the behaviour of any software service depends on its specific interconnections with other services in the production environment. Light-weight, realistic emulations of these dependency services are therefore required to facilitate system testing.


The project. This research project addresses this challenge by developing:

(1)    techniques that automatically analyse the interaction traces of software services, derive executable behaviour models for them, and emulate these services by executing the derived behavioural models for runtime interaction with other services;

(2)    techniques that automatically correlate the interaction traces and behavioural models of a network of interconnected software services, and discover the business process and the mutual impacts between the services;

(3)    domain-specific languages that can be used to describe the service behaviour models at a high level of abstraction, so that they are amenable to manipulation by engineers as well as to automatic execution.

The project is multidisciplinary, combining techniques from Data Mining, Services Computing and Model-Driven Software Engineering.
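
To make item (1) more concrete, the following is a minimal sketch of trace-based service emulation. It is not the project's own technique: it assumes a hypothetical recorded trace of request/response pairs (the `TRACE` list, whose messages are invented for illustration) and simply replays the response recorded for the most similar request, using the Python standard library's `difflib.SequenceMatcher` as a stand-in similarity measure. The names `similarity` and `emulate` are likewise illustrative.

```python
from difflib import SequenceMatcher

# Hypothetical recorded trace: (request, response) pairs captured from a real
# service. The message formats here are invented purely for illustration.
TRACE = [
    ("GET /account/1001", "200 name=alice balance=50"),
    ("GET /account/1002", "200 name=bob balance=75"),
    ("DELETE /account/1001", "204"),
]


def similarity(a: str, b: str) -> float:
    """Return a similarity ratio in [0, 1] between two message strings."""
    return SequenceMatcher(None, a, b).ratio()


def emulate(request: str, trace=TRACE) -> str:
    """Reply with the response recorded for the most similar recorded request."""
    _, best_response = max(trace, key=lambda pair: similarity(request, pair[0]))
    return best_response


if __name__ == "__main__":
    # A request seen in the trace is answered with its recorded response.
    print(emulate("GET /account/1002"))
    # An unseen request falls back to the closest recorded request.
    print(emulate("DELETE /account/1002"))
```

A replay of this kind returns the recorded response verbatim, so it cannot adapt fields such as account numbers to a new request; deriving executable behaviour models, as item (1) proposes, is aimed at going beyond such simple lookup.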


The personnel. The joint research project is funded by the Australian Research Council and CA Technologies (NASDAQ: CA), and involves Swinburne University of Technology, the University of Melbourne and CA Technologies. It is led by Professor Jun Han (Swinburne), with Associate Professor Jean-Guy Schneider (Swinburne), Professor Chris Leckie (Melbourne), Professor Chengfei Liu (Swinburne) and Dr Steve Versteeg (CA) as the other chief/partner investigators. In addition, the project has one postdoctoral research fellow and four PhD students investigating various research issues.


The background readings below provide further information concerning the research issues targeted.


For further information, please contact Professor Jun Han.


Some background readings:

1. Data Mining for Software Engineering. IEEE Computer 42(8): 55-62.

2. Generating service models by trace subsequence substitution. QoSA: 123-132.

3. Interaction Traces Mining for Efficient System Responses Generation. ACM SIGSOFT Software Engineering Notes 40(1): 1-8.

4. From Network Traces to System Responses: Opaquely Emulating Software Services. CoRR abs/1510.01421 (2015).

5. A virtual deployment testing environment for enterprise software systems. QoSA: 101-110.

6. Enterprise software service emulation: constructing large-scale testbeds. ICSE 2016.

7. Emulation of Cloud-Scale Environments for Scalability Testing. QSIC: 201-209.

8. A Business Protocol Unit Testing Framework for Web Service Composition. CAiSE: 17-34.

9. Automatic generation of software behavioral models. ICSE: 501-510 (GK-tail).

10. Leveraging existing instrumentation to automatically infer invariant-constrained models. SIGSOFT FSE: 267-277 (Synoptic).

11. Automatically Generating Test Cases for Specification Mining. IEEE Trans. Software Eng. 38(2): 243-257.

12. Automatic mining of specifications from invocation traces and method invariants. SIGSOFT FSE: 178-189.