Peer-Reviewed Publications from NortonLifeLock Research Group
In Proceedings of the 23rd ACM Conference on Computer and Communications Security (ACM SIGSAC 2016)
In this paper, we introduce four categories of email profiling features that capture various characteristics of spear phishing emails. Building on these features, we implement and evaluate an affinity graph-based semi-supervised learning model for campaign attribution and detection.
In Proceedings of the 31st ACM/SIGAPP Symposium on Applied Computing (ACM SAC 2016)
We present the first quantitative analysis of mobile devices that compares rooted devices with non-rooted ones. We attempt to map high-level intuitions about the characteristics of users who root their devices to the low-level data at our disposal.
In Proceedings of the 25th USENIX Security Symposium (USENIX Security 2016)
We perform the first systematic study of the prevalence of potentially unwanted programs (PUP) and their distribution through pay-per-install (PPI) services, which link advertisers that want to promote their programs with affiliate publishers willing to bundle their programs with offers for other software.
In Proceedings of the VLDB Endowment, Vol. 10, No. 3, 2016
NG-DBSCAN is a scalable, distributed implementation of the DBSCAN clustering algorithm. Its distinguishing feature is that it scales while supporting arbitrary data and distance functions.
In Proceedings of the 25th International Joint Conference on Artificial Intelligence (IJCAI 2016)
We propose to encode the weakly supervised information in positive-unlabeled (PU) learning tasks as pairwise constraints between training instances. Violations of these pairwise constraints are measured and incorporated into a partially supervised graph embedding model.
In Proceedings of the 25th International World Wide Web Conference (WWW 2016)
We study the problem of determining the proper aggregation granularity for a stream of time-stamped edges. To this end, we propose ADAGE and demonstrate its value in automatically finding the appropriate aggregation intervals on edge streams for belief propagation to detect malicious files and machines.
In Proceedings of the 21st IEEE Symposium on Computers and Communication (ISCC 2016)
We use distributed and scalable clustering techniques to estimate population, including mobility, from mobile phone call data.
In Proceedings of the 25th ACM International Conference on Information and Knowledge Management (CIKM 2016)
We propose a novel Bayesian label propagation model that unifies multi-source information, including content-agnostic features of different node types and topological information of the heterogeneous network. Our approach needs neither to examine the source code nor to inspect the dynamic behaviours of a binary. Instead, it estimates the maliciousness of a given file through a semi-supervised label propagation procedure whose time complexity is linear in the number of nodes and edges. An evaluation on 567 million real-world download events validates that the proposed approach detects malware efficiently and with high accuracy.
IEEE Transactions on Computers, 2016
Size-based scheduling algorithms can perform disastrously with skewed workloads and incorrect size information. PSBS is a scheduling discipline that performs very well even when job sizes are incorrect.
In Proceedings of the IEEE International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS 2016)
In the context of large-scale data architectures, we propose an efficient technique to speed up the routing of a large number of real-time queries while minimizing the number of machines that each query touches (the query span).