Abstract. Detecting advanced attacks is increasingly complex, and no single solution suffices. Defenders can leverage logs and alarms produced by network and security devices, but big data analytics solutions are necessary to transform huge volumes of raw data into useful information. Existing anomaly detection frameworks either work offline or aim to mark a host as compromised, with a high risk of false alarms. We propose a novel online approach that monitors the behavior of each internal host, detects suspicious activities possibly related to advanced attacks, and correlates these anomaly indicators to produce a ranked list of the most likely compromised hosts, which is provided to human analysts. Because of the huge number of devices and traffic logs, we make scalability one of our top priorities: most computations are independent of the number of hosts and can be trivially parallelized. A large set of experiments demonstrates that our proposal can pave the way to novel forms of detection of advanced malware.