The WOMBAT project aims to provide new means to understand the existing and emerging threats that are targeting the Internet economy and the net citizens. To reach this goal, the proposal includes three key workpackages: (i) real time gathering of a diverse set of security related raw data, (ii) enrichment of this input by means of various analysis techniques, and (iii) root cause identification and understanding of the phenomena under scrutiny. The acquired knowledge will be shared with all interested security actors (ISPs, CERTs, security vendors, etc.), enabling them to make sound security investment decisions and to focus on the most dangerous activities first. Special care will also be devoted to raising the confidence of European citizens in the net economy by using the expertise gained to improve security awareness in Europe.


D24/D6.4 Second Open Workshop Proceedings

This is the deliverable for the second WOMBAT open workshop, BADGERS, which took place within the EuroSys 2011 conference on April 10 in Salzburg (Austria). In this document we discuss the preparation of the second workshop and compare our expectations with the feedback and impressions we collected from authors and attendees. Proceedings are included.


D23/D5.3 Early Warning System: Experimental report

A large part of Workpackage 5 concerns the Early Warning System functionality. This deliverable offers a report of the experiments carried out as part of the effort to create the Early Warning System. Several specialized alerting systems are presented, including FIRE, Exposure, BANOMAD and HoneyBuddy myIMhoneypot.


D22/D5.2 Root Causes Analysis: Experimental Report

This deliverable offers an extensive report of all experiments carried out with respect to root cause analysis techniques. This final deliverable for Workpackage 5 (Threats Intelligence) builds upon D12 (D5.1 - Technical Survey on Root Cause Analysis) and benefits from the modifications made to the various software modules developed in WP4, following up the experimental feedback.
The R&D efforts carried out in WP5 with respect to root cause analysis have produced a novel framework for attack attribution called triage. This framework has been successfully applied to various wombat datasets to perform intelligence analyses by taking advantage of several structural and contextual features of the data sets developed by the different partners. These experiments enabled us to get insights into the underlying root phenomena that have likely caused many security events observed by sensors deployed by wombat partners.
In this deliverable, we provide an in-depth description of experimental results obtained with triage, in particular with respect to (i) the analysis of Rogue AV campaigns (based on HARMUR data), and (ii) the analysis of different malware variants attributed to the Allaple malware family (based on data from SGNET, VirusTotal and Anubis).
Finally, we describe another experiment performed on a large spam data set obtained from Symantec.Cloud (formerly MessageLabs), for which triage was successfully used to analyze spam botnets and their ecosystem, i.e., how those botnets are used by spammers to organize and coordinate their spam campaigns. Thanks to this application, we are considering a possible technology transfer of triage to Symantec.Cloud, which is interested in carrying out regular intelligence analyses of its spam data sets and may also consider integrating triage into its Skeptic spam filtering technology.


D21/D4.7 Consolidated report with evaluation results

This is the final deliverable for Workpackage 4 within the wombat project. In this document we discuss the final extensions and improvements to our data collection and analysis techniques that were implemented as part of wombat. Furthermore, we present some additional results obtained from the analysis of data collected within wombat.


The Wombat API (WAPI) is now available on SourceForge


WAPI, or WOMBAT API, is a SOAP-based API built in the context of the project to facilitate the remote access and exploration of security-related datasets.

The package contains all the essential code to start using the WAPI. The WAPI represents an attempt to tackle two main challenges for security data providers:

- Many of the data access primitives are not easily scriptable. Many data sources provide web-based interfaces that, while easily accessible by human operators, are not convenient for automated analysis.

- The interfaces for security datasets are very diverse in structure and methodology. The analyst who wants to take advantage of multiple data sources to perform correlations among them is thus forced to implement ad-hoc plugins and parsers for each data feed. This process is not necessarily a simple task, and requires the analyst to fully understand, for example, the schema of the SQL database provided by the data owner.
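
The second challenge above can be illustrated with a small sketch. It is not the WAPI itself and does not use SOAP: the class names, field names and sample feeds below are all hypothetical, chosen only to show how a common interface removes the need for a separate parser in every analysis script.

```python
from abc import ABC, abstractmethod
import csv
import io
import json


class DatasetAdapter(ABC):
    """Hypothetical uniform interface; not the actual WAPI object model."""

    @abstractmethod
    def events(self):
        """Yield dicts with the common keys 'source', 'md5' and 'timestamp'."""


class CsvFeed(DatasetAdapter):
    """Wraps a feed published as CSV with columns 'hash' and 'seen' (assumed)."""

    def __init__(self, name, text):
        self.name, self.text = name, text

    def events(self):
        for row in csv.DictReader(io.StringIO(self.text)):
            yield {"source": self.name, "md5": row["hash"], "timestamp": row["seen"]}


class JsonFeed(DatasetAdapter):
    """Wraps a feed published as a JSON list with different field names (assumed)."""

    def __init__(self, name, text):
        self.name, self.text = name, text

    def events(self):
        for item in json.loads(self.text):
            yield {
                "source": self.name,
                "md5": item["sample_md5"],
                "timestamp": item["first_seen"],
            }


def correlate(adapters):
    """Map each sample hash to the set of sources that reported it."""
    seen = {}
    for adapter in adapters:
        for event in adapter.events():
            seen.setdefault(event["md5"], set()).add(event["source"])
    return seen


# Two made-up feeds with different native formats.
feeds = [
    CsvFeed("feed_a", "hash,seen\nabc123,2009-09-22\ndef456,2009-09-23\n"),
    JsonFeed("feed_b", '[{"sample_md5": "abc123", "first_seen": "2009-09-22"}]'),
]

# Samples reported by more than one source.
shared = {h: s for h, s in correlate(feeds).items() if len(s) > 1}
print(shared)
```

The correlation code only touches the common keys, so adding a third data source means writing one more adapter rather than changing the analysis.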

You can find the package on SourceForge.

More information and details on WAPI are available in the deliverable D10/D6.3.

WOMBAT second open workshop proceedings

This volume collects the proceedings of the second WOMBAT Project Workshop, held on April 10 in Salzburg.


Wombat Deliverable D18/D4.6 Final description of contextual features

The objective of Workpackage 4 is to develop techniques to characterize the malicious
code that is collected in the previous workpackage. The main idea is to enrich the
collected code thanks to metadata that might reveal insights into the origin of the code
and the intentions of those that created, released or used it.
This deliverable is an extension of D15 (D4.5), and provides a final description of the
contextual features collected within the wombat consortium. Furthermore, it presents
initial results, statistics, and insights obtained by analyzing the collected contextual
features.


WOMBAT second open workshop Call for Papers

This deliverable is a final report on the experimental results obtained by using structural
features to characterize executable code. It discusses and evaluates a number of
techniques, based on these features, that have been developed in the context of the wombat
project, and aim to provide a deeper understanding of malicious code and of the relations
between malicious code samples.

Wombat Deliverable D16/D4.2 Analysis Report of Behavioral Features

This deliverable provides a discussion of the features used to characterize the behavior
of code, and a discussion of preliminary results of applying these features to a set of
malicious code. It discusses the project's results in behavior-based clustering, malware
detection at end hosts in different ways, system call analysis, but also our work on
shellcode behavior.


Wombat Deliverable D15/D4.5 Intermediate Report on Contextual Features

The objective of Workpackage 4 is to develop techniques to characterize the malicious code that is collected in the previous workpackage. The main idea is to enrich the collected code thanks to metadata that might reveal insights into the origin of the code and the intentions of those that created, released or used it. This deliverable provides a preliminary discussion of possible contextual features of malware, and for each feature, an estimate of its effectiveness and the difficulty of obtaining it. Some of these features can be used to analyze potential threats and discriminate collected samples that are mere variations of already known threats.


Wombat Deliverable D13/D3.3 Sensor Deployment

This deliverable reports the deployment of all types of sensors implemented in the WOMBAT project and includes descriptions of experiences with the sensors from several months of deployment and experimentation. The deployed sensors are SGNET, HARMUR, Shelia, Paranoid Android, HoneySpider Network, Bluebat and NoAH. The early experiences show that the WOMBAT Project is fulfilling our preliminary expectations about having powerful tools for collecting data. These data are useful for categorizing attackers and malware behaviors. Moreover, our experiments reveal that the sensors can cooperate with each other, thereby enriching the information offered for analysis.


Wombat Deliverable D12/D5.1 Root Causes Analysis

This deliverable aims at giving an overview of existing techniques for root cause analysis, and provides some preliminary results with respect to the root cause analysis work performed in the project so far. The deliverable is mainly made up of six published peer-reviewed papers and one technical report that has reached a wide audience.

This deliverable provides a preliminary discussion of structural features that can be used to characterize executable code. Furthermore, it discusses a number of techniques, based on these features, that are being developed in the context of the wombat project, and aim to provide a deeper understanding of malicious code and of the relations between malicious code samples.


Wombat Deliverable D10/D6.3 First WOMBAT open workshop proceedings

This volume collects the presentations and handouts of the first WOMBAT open Workshop, held on September 22-23, 2009 in St. Malo. The workshop focused on the introduction of early results of the project, and in particular on the Wombat APIs or WAPI, a set of APIs developed by the project partners to allow integrated access to different attack datasets.
The aim of the workshop was to give participants first-hand experience of how the WAPIs help the analyst and the researcher in investigating new phenomena. The demos and presentations were prepared thanks to the collective effort of the project partners: France Telecom, Hispasec, Politecnico di Milano, Technical University of Vienna, Institut Eurecom, FORTH-ICS, Symantec Corporation, Vrije Universiteit Amsterdam, Institute for Infocomm Research, NASK.


WOMBAT Deliverable D08/D4.1 Specification language for code behavior

This document provides a specification language to describe the behavior of code. Consistently with the requirements for an extensible, layered architecture for the behavioral analysis of malware, four different languages are defined, ranging from a complete, low-level description of the code's behavior to a high-level analysis report that is suitable for a human analyst. Furthermore, current approaches to behavioral malware analysis and detection within the wombat project are discussed, most of which already take advantage (or can be extended to take advantage) of the provided specification language.


WOMBAT first open workshop programme

The program of the first WOMBAT open workshop is now available.

September 22nd:

  • 12:00-14:00: Registration and lunch
  • 14.00-14.10: Official welcome and introduction (H. Debar)
  • 14.10-14.40: "Introduction to the WOMBAT datasets" (M. Dacier)
  • 14.45-15.15: "The WOMBAT WAPI: idea, implementation and use" (C. Leita)
  • 15.15-16.00: "The SHELIA and HNS client honeypot datasets" (H. Bos; P. Kijewski)
  • 16.00-16.15: Coffee Break and preparation for the demos
  • 16.15-18.30: Demos
  • Adjourn

September 23rd:

  • 9.00-9.30: "Clustering malware with ANUBIS and SGNET and interaction with the WAPI" (P. M. Comparetti)
  • 9.30-12.30: Demos with coffee break
  • 12.30: lunch, closing of the workshop

The registration for the WOMBAT workshop can be done on the RAID+ESORICS registration page. If you wish to register for the workshop alone, you should use the RAID+WOMBAT registration and mention it in the comments. The registration fee for the workshop alone is 50 Euros.

WOMBAT first open workshop

The WOMBAT consortium will organise its first open workshop in St Malo, France, on September 22-23 (from Tuesday 12:00 to Wednesday 12:00).

The workshop is conveniently co-located with RAID and organised just before the main conference. The workshop will be practical and hands-on. Attendance will be limited to 45 researchers. Registration should be made through the RAID registration site by selecting the RAID+WOMBAT option.

By means of presentations, participants will learn what sources of information Wombat makes available to analysts, security experts and researchers. These sources include malware repositories and attack-related databases such as those of Anubis, Symantec, HoneySpider, VirusTotal, Noah, SGNet, and several others. Moreover, participants will get hands-on experience in a tutorial session in which they use a variety of sensors and databases to analyse different security incidents.

We believe that the availability of a large set of databases and a way to access all of them conveniently will be crucial for any security expert. By means of a simple API, WOMBAT allows users to do so in an intuitive manner, while allowing the data owners to keep control over exactly what data can be shared and how.

WOMBAT Origami


WOMBAT first open workshop

The WOMBAT project will organize its first open workshop in St Malo, France, September 22-23, 2009 noon-to-noon, just before RAID 2009. Attendance will be limited to 45 researchers. Additional information will be announced here.

WOMBAT derivatives


Lecture at ZISC by Marc Dacier from Symantec

Marc Dacier from Symantec presented a one-hour lecture at the ZISC Information Security Colloquium, including pointers to WOMBAT.

In order to assure accuracy and realism of resilience assessment methods and tools, it is essential to have access to field data that are unbiased and representative. Several initiatives are taking place that offer access to malware samples for research purposes. Papers are published where techniques have been assessed thanks to these samples. Definition of benchmarking datasets is the next step ahead. In this presentation, we report on the lessons learned while collecting and analyzing malware samples in a large scale collaborative effort. Three different environments are described and their integration used to highlight the open issues that remain with such data collection. Three main lessons are offered to the reader. First, creation of representative malware samples datasets is probably an impossible task. Second, false negative alerts are not what we think they are. Third, false positive alerts exist where we were not used to see them. These three lessons have to be taken into account by those who want to assess the resilience of techniques with respect to malicious faults.

These are the results of joint work carried out in the context of the European funded WOMBAT project, together with partners from Hispasec Sistemas, EURECOM institute and Symantec Research Labs Europe. Slides: Zurich_ZISC_presentation.pdf

WOMBAT presentation at the e-COPP conference

As part of his presentation at the e-COPP conference, P. Kijewski (NASK) will introduce the WOMBAT project.

WOMBAT paper accepted at NDSS2009

The following paper has been accepted at the Network and Distributed Systems Security (NDSS) 2009 conference:

Title: Scalable, Behavior-Based Malware Clustering
  • Ulrich Bayer, TUV
  • Paolo Milani Comparetti, TUV
  • Clemens Hlauschek, TUV
  • Christopher Kruegel, UCSB
  • Engin Kirda, Eurecom

Anti-malware companies receive thousands of malware samples every day. To process this large quantity, a number of automated analysis tools were developed. These tools execute a malicious program in a controlled environment and produce reports that summarize the program's actions. Of course, the problem of analyzing the reports still remains. Recently, researchers have started to explore automated clustering techniques that help to identify samples that exhibit similar behavior. This allows an analyst to discard reports of samples that have been seen before, while focusing on novel, interesting threats. Unfortunately, previous techniques do not scale well and frequently fail to generalize the observed activity well enough to recognize related malware.

In this paper, we propose a scalable clustering approach to identify and group malware samples that exhibit similar behavior. For this, we first perform dynamic analysis to obtain the execution traces of malware programs. These execution traces are then generalized into behavioral profiles, which characterize the activity of a program in more abstract terms. The profiles serve as input to an efficient clustering algorithm that allows us to handle sample sets that are an order of magnitude larger than previous approaches. We have applied our system to real-world malware collections. The results demonstrate that our technique is able to recognize and group malware programs that behave similarly, achieving a better precision than previous approaches. To underline the scalability of the system, we clustered a set of more than 75 thousand samples in less than three hours.
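
To make the idea of grouping samples by behavioral profile concrete, here is a minimal sketch. The abstract does not name the similarity measure or the clustering algorithm, so everything below is an assumption for illustration: profiles are modeled as sets of made-up feature strings, similarity is the Jaccard index, and clusters are formed by single linkage over all pairs above a threshold.

```python
def jaccard(a, b):
    """Jaccard similarity of two feature sets (1.0 for two empty sets)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster_profiles(profiles, threshold=0.7):
    """Single-linkage clustering: link samples whose similarity >= threshold."""
    names = list(profiles)
    parent = {n: n for n in names}  # union-find forest over sample names

    def find(n):
        # Follow parent links to the cluster representative, compressing the path.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    # Naive all-pairs comparison: quadratic in the number of samples.
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if jaccard(profiles[a], profiles[b]) >= threshold:
                parent[find(a)] = find(b)

    clusters = {}
    for n in names:
        clusters.setdefault(find(n), []).append(n)
    return sorted(sorted(c) for c in clusters.values())


# Hypothetical behavioral profiles: each feature abstracts one observed action.
profiles = {
    "sample_1": {"file:create:autorun.inf", "reg:set:Run", "net:connect:6667", "proc:create"},
    "sample_2": {"file:create:autorun.inf", "reg:set:Run", "net:connect:6667", "proc:create",
                 "mutex:create"},
    "sample_3": {"net:send:smtp", "file:read:addressbook", "reg:query:Outlook"},
    "sample_4": {"net:send:smtp", "file:read:addressbook", "reg:query:Outlook", "dns:query:mx"},
}

for group in cluster_profiles(profiles):
    print(group)
```

The all-pairs loop is exactly the part that stops scaling on large sample sets; the sketch keeps it only for clarity and says nothing about how the paper avoids that cost.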

WOMBAT Participation at the FIA Conference in Madrid, Dec. 2008

The WOMBAT project will be represented at the Future Internet Assembly conference in Madrid, December 2008, by the following people:
  • Vincent Boutroux, France Télécom R&D/Orange Labs
  • Sotiris Ioannidis, FORTH (also representing FORWARD)
  • Philip Homburg, VU (also representing FORWARD)
  • Paolo Milani Comparetti, TUV

WOMBAT participation at the ICT 2008 Conference in Lyon

The WOMBAT project will be represented by the following people at the ICT 2008 Conference:
  • Vincent Boutroux, France Télécom R&D/Orange Labs
  • Marc Dacier, Symantec

WOMBAT contribution to the Think-Trust project

Hervé Debar participates in working group 1 of the Think-Trust project.

WOMBAT participation at the SEC 2008 Conference

The WOMBAT project was represented by Hervé Debar at the SEC 2008 Conference in Paris, September 2008. 

Wombat Deliverable D06/D3.1 Infrastructure Design


This document contains a description of the wombat architecture and a high level design
of the new sensors. The wombat architecture is covered by a comprehensive review of
all its components. The data sources are also part of this architecture, especially the
new ones that will be implemented as part of the wombat project. Each of them is
described at the design level, focusing on the way it will be integrated with
the wombat infrastructure.


PhD Defense of Corrado Leita

M. Corrado LEITA will publicly defend his UNS Doctoral Thesis 
on Thursday, December 4th 2008 at 2:00 pm, in the Amphitheater MARCONI at EURECOM.

Topic of the Thesis:

"SGNET: automated protocol learning for the observation of malicious threats"

Jury members :

  • Marc DACIER (Symantec)
  • Vern PAXSON (ICSI)
  • Hervé DEBAR (France Télécom R&D/Orange Labs)
  • Engin KIRDA (Eurecom)
  • Christopher KRUEGEL (UCSB)

One of the main prerequisites for the development of reliable defenses to protect a network resource is the collection of quantitative data on Internet threats. This attempt to "know your enemy" has led to an increasing interest in the collection and exploitation of datasets providing intelligence on network attacks. The creation of these datasets is a very challenging task. The challenge derives from the need to cope with the spatial and quantitative diversity of malicious activities. The observations need to be performed from a broad perspective, since the activities are not uniformly distributed over the IP space. At the same time, the data collectors need to be sophisticated enough to extract a sufficient amount of information on each activity and perform meaningful inferences. How can the need to deploy a vast number of data collectors be reconciled with the sophistication required to make meaningful observations? This work addresses this challenge by proposing a protocol learning technique based on bioinformatics algorithms. The proposed technique automatically generates low-cost protocol responders starting from a set of samples of network interaction. Its characteristics are exploited in a distributed honeypot deployment that collected information on Internet attacks for a period of 8 months in 23 different networks distributed all over the world (Europe, Australia, United States). This information is organized in a central dataset enriched with contextual information from a number of sources and analysis tools. Simple data mining techniques proposed in this work allow the generation of a valuable overview of the propagation techniques employed by today's malware.
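
As a rough illustration of how a bioinformatics algorithm can be applied to protocol messages, the sketch below aligns two sample messages and keeps only the parts on which they agree. The abstract does not say which algorithms the thesis uses; Needleman-Wunsch global alignment is assumed here as a representative choice, and the scoring values, function names and sample messages are hypothetical.

```python
def align(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment of two strings.

    Returns two equal-length lists of characters, with None marking a gap.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append(None)
            i -= 1
        else:
            out_a.append(None)
            out_b.append(b[j - 1])
            j -= 1
    return out_a[::-1], out_b[::-1]


def template(a, b):
    """Keep characters on which the aligned messages agree; '?' marks variable parts."""
    x, y = align(a, b)
    return "".join(c if c == d else "?" for c, d in zip(x, y))


# Two made-up requests that differ only in one field.
print(template("GET /a", "GET /b"))
```

Positions that survive in the template are candidates for fixed protocol syntax, while the `?` positions are candidates for variable fields. A responder built from such templates is only as good as the sample conversations it was derived from.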

Interaction with the WOMBAT project - data provision

The WOMBAT project has received numerous requests for interaction, either to provide data to the project for analysis or to use the information collected by the project.

Our current answer to these requests is to suggest that, if you are interested in participating, you join one of the project partners' initiatives. The current suggestion is to install an SGNet honeypot through the project. This will enable you to collect data and provide it to the project. It will also enable you to access some of the data collected by others through well specified interfaces, and carry out your own data analysis research.

If you are a large data collector, we also have an interface for data exchange, run by FORTH in Greece. Please contact us if you feel that you fall into this category.

WOMBAT Deliverable D05/D2.3 Requirements analysis

This document outlines the requirements for early warning systems built on technology provided by the WOMBAT project, setting out both functional and non-functional requirements. The collected requirements reflect the identified user needs and the key directions to be followed within the research and development Work-packages (WP3-Data Collection and Distribution, WP4-Data Enrichment and Characterization, WP5-Threat Intelligence).

The document starts from an assessment of user requirements gathered from potential users including external participants in the Amsterdam Workshop and the WOMBAT development group. This part covers expectations of distinct classes of data users such as: security vendors, malware researchers, ISPs, CERT teams, Government, financial institutions and home users. It details the requirements for the system architecture, data and system functions, and specifies performance, availability and security features to provide sufficient functionality. It also defines user interface, testing and configuration management requirements.


WOMBAT Deliverable D03/D2.2 Analysis of the state of the art

This document contains a detailed analysis of the state-of-the-art tools and research approaches for malware collection and analysis. We have reviewed high/medium/low-interaction honeypots and malware collection tools and worldwide initiatives. The analysis of the collected malware is covered by a comprehensive review of the most relevant research proposals, also including techniques that have been used to analyze running programs in general, to be adapted for the wombat purposes.


Institute for Infocomm Research


Vrije Universiteit Amsterdam

Stefano Zanero from Politecnico di Milano will present the WOMBAT project during the ENISA Awareness phone conference on June 20th, 2008. We will briefly describe WOMBAT, an EU FP7 STREP project that began on 01/01/2008 and aims to build an automatic, global network that can perform early warning, automatic classification and analysis of malware and exploits as they propagate, or are used, worldwide. We will outline the challenges we see in the project, and the project goals.

NASK announces participation to WOMBAT


Symantec announces participation to the WOMBAT project


Hispasec announces participation to WOMBAT on its blog


Contribution of NASK

Partner description
The Research and Academic Computer Network (NASK) is a research and development unit active in Poland since March 1991. It was set up to connect Poland and the scientific-academic community to the Internet. Currently, NASK is one of the main Internet Service Providers in Poland and operator of the '.pl' country top level domain. The primary NASK group that will take part in the project is CERT (Computer Emergency Response Team) Polska, a team within NASK, set up to handle Internet security incidents for the '.pl' constituency. It will be supported by members of the NASK Research Division. CERT Polska has been operational since 1996 (until 2000 known as CERT NASK). The team cooperates with other IRTs from around the world under the auspices of FIRST (Forum of Incident Response and Security Teams) and with many ISPs, banks and government institutions in Poland. It also runs ARAKIS, a nation-wide early warning system that uses a large distributed network of sensors located in various Polish institutions to collect and analyze network activity to detect new threats. CERT Polska has contributed to EU funded projects under FP5 and the Safer Internet Action Plan (SpotSpam and NIFC Hotline Polska). Representatives from NASK, including CERT Polska team members, play active roles (Management Board member, National Liaison Officer and Working Group members) in cooperation with ENISA.
Partner specific involvement in the wombat project
NASK has extensive practical experience in the area of honeypot technology achieved through the design, implementation, deployment and maintenance of a wide network of honeypot based sensors (one of the initial data sources for WOMBAT). The CERT contribution will be unique as it will be based on over 10 years of practical experience in security incident handling. The team will focus on the development of threat intelligence acquisition from a CERT perspective (WP5). Moreover, it will engage in state of the art analysis, formulation of requirements (WP2), design of interfaces between WOMBAT and the ARAKIS system (WP3), testing of new sensors (WP3), as well as the evaluation of the proposed data enrichment and malware characterization methods (WP4). Dissemination will also be handled, in particular in the IRT community (WP6).

Contribution of France Télécom R&D

Partner description
France Télécom R&D is the corporate research and development arm of France Télécom, in charge of specifying, implementing and testing advanced services for the company. The group involved in the project is the Network and Services Security (NSS) laboratory of the Middleware and Advanced Platforms (MAPS) research and development center. The NSS laboratory is 65 people strong and covers all areas of research and development in information systems security. Beyond security engineering, the laboratory has a strong focus on research, funding more than 12 man-years on research issues, hosting 10 PhD students and managing external research contracts with leading French research institutions such as Supélec or the Groupement des Ecoles de Telecommunications (GET). The laboratory also contributes to several European projects, such as Ecrypt (NoE), Artist (NoE), Resist (NoE), Diadem (STREP) and Daidalos (IP).
Partner specific involvement in the wombat project
France Télécom R&D is the project coordinator; it has extensive experience in handling collaborative research projects at the European level. Furthermore, France Télécom R&D has significant research contributions in the project related to malware collection and analysis. As a participant in several honeypot alliances and an operator of specific wireless honeypot technologies, France Télécom will produce scientific research results for the project. As an industrial partner, France Télécom wishes to exploit the project results during the development of the Livebox™ home or SME Internet gateway, and the hardening of its networking infrastructure. In WP3, France Télécom will contribute alternative honeypot technologies, related to wireless networks and to client-side honeypots. In WP4, France Télécom will develop new malware models using grammars to describe their behavior, and will use these grammars in WP5 to evaluate the detection capabilities of the tools we have in place for detecting malware propagation.

WOMBAT Workshop, April 21st-22nd, Amsterdam, NL

On April 21st-22nd, the WOMBAT project will organize an invitation-only workshop (located in Amsterdam, Netherlands) to address the difficulties in collaboration and attack data sharing. The discussion will address standards for data exchange, infrastructural challenges, and the resolution of privacy and competition issues in data sharing. The project partners will present the vision of the project, and a draft version of our requirements analysis. The invited participants will share their own technical infrastructures and research directions. Some of the revised papers presented at the workshop will be released in a volume of proceedings.

Update: The proceedings are published by IEEE in their electronic library.

Project fact sheet from the EC

The aim of WOMBAT is to provide new means to understand the existing and emerging threats that are targeting the Internet economy and the net citizens.

