Yun Shen

Yun Shen has been a researcher at Symantec Research Labs since 2012. His current research focuses on applying data-driven approaches to better understand malicious activity on the Internet. Through the collection and analysis of large-scale datasets, he develops novel and robust mitigation techniques to make the Internet a safer place. His research involves a mix of quantitative analysis, machine learning, and systems design.

Before joining Symantec, he was a researcher at HP Labs Bristol, working on privacy-enhancing technologies and cloud computing infrastructure. Prior to this, he conducted government-funded research on intelligence analysis at the University of Bristol. He has authored a number of papers in international journals and conferences and several US patents. Dr. Shen received his PhD in Computer Science from the University of Hull (UK), where his research focused on indexing and retrieval of distributed XML data. He received his bachelor's degree in Computer Science from Sichuan University (China).

Selected Academic Papers

  • Tiresias: Predicting Security Events Through Deep Learning
    Yun Shen, Enrico Mariconti, Pierre-Antoine Vervier, and Gianluca Stringhini
    In Proceedings of the 25th ACM Conference on Computer and Communications Security (ACM CCS 2018)

  • Before Toasters Rise Up: A View Into the Emerging IoT Threat Landscape
    Pierre-Antoine Vervier and Yun Shen
    In Proceedings of the 21st International Symposium on Research in Attacks, Intrusions and Defenses (RAID 2018)

  • Multi-label Learning with Highly Incomplete Data via Collaborative Embedding
    Yufei Han, Guolei Sun, Yun Shen, Xiangliang Zhang
    In Proceedings of the 24th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2018)

    We proposed a weakly supervised multi-label learning approach based on the idea of collaborative embedding. It provides a flexible framework for efficient multi-label classification in both transductive and inductive modes by coupling the reconstruction of missing features with weak label assignment in a joint optimisation framework.

  • Marmite: Spreading Malicious File Reputation Through Download Graphs
    Gianluca Stringhini, Yun Shen, Yufei Han, Xiangliang Zhang
    Annual Computer Security Applications Conference (ACSAC 2017)

    We presented Marmite, a system that is able to detect malicious files by leveraging a global download graph and label propagation with Bayesian confidence.

  • Accurate spear phishing campaign attribution and early detection
    Y Han, Y Shen
    ACM SAC 2016

    In this paper, we introduce four categories of email profiling features that capture various characteristics of spear phishing emails. Building on these features, we implement and evaluate an affinity graph-based semi-supervised learning model for campaign attribution and detection.

  • Content-Agnostic Malware Detection in Heterogeneous Malicious Distribution Graph
    I M Alabdulmohsin, Y Han, Y Shen, X Zhang
    CIKM 2016

    We propose a novel Bayesian label propagation model to unify the multi-source information, including content-agnostic features of different node types and topological information of the heterogeneous network. Our approach does not need to examine the source code or inspect the dynamic behaviour of a binary. Instead, it estimates the maliciousness of a given file through a semi-supervised label propagation procedure, which has linear time complexity w.r.t. the number of nodes and edges. The evaluation on 567 million real-world download events validates that our proposed approach efficiently detects malware with high accuracy.

  • Partially Supervised Graph Embedding for Positive Unlabelled Feature Selection
    Yufei Han and Yun Shen
    IJCAI 2016

    We propose to encode the weakly supervised information in PU learning tasks into pairwise constraints between training instances. Violations of pairwise constraints are measured and incorporated into a partially supervised graph embedding model.

  • Insights into rooted and non-rooted Android mobile devices with behavior analytics
    Y Shen, N Evans, A Benameur
    ACM SAC 2016

    We presented the first quantitative analysis of mobile devices that compares rooted devices with non-rooted devices. We attempted to map high-level thoughts about the characteristics of users who root their devices to the low-level data at our disposal.

  • All your Root Checks are Belong to Us: The Sad State of Root Detection
    N S Evans, A Benameur, Y Shen
    ACM MobiWac 2015

    We analyzed security-focused applications as well as BYOD solutions that check for evidence that a device is “rooted”.