Publications

Peer-Reviewed Publications from NortonLifeLock Research Group

Academic Papers - 2013

pdf
MutantX-S: Scalable Malware Clustering Based on Static Features

In Proceedings of the 2013 USENIX Annual Technical Conference (USENIX ATC ’13)
In this paper, we present an efficient malware clustering technique that uses instruction-based features to provide high accuracy.

pdf
Spatio-Temporal Mining of Software Adoption & Penetration

In Proceedings of the 2013 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM ’13)

pdf
Server-side code injection attacks: a historical perspective

In Proceedings of the 16th International Symposium on Research in Attacks, Intrusions, and Defenses (RAID 2013)

pdf
MINESTRONE: Testing the SOUP

In Proceedings of the 6th USENIX Workshop on Cyber Security Experimentation and Test (CSET ’13)

pdf
Cloud Resiliency and Security via Diversified Replica Execution and Monitoring

In Proceedings of the 6th International Symposium on Resilient Cyber Systems (ISRCS 2013)

pdf
Detecting Deadlock in Programs with Data-Centric Synchronization

In Proceedings of the 35th International Conference on Software Engineering (ICSE ’13)
We present an analysis for establishing deadlock-freedom for programs written in AJ, a Java extension in which programmers declaratively specify synchronization constraints on data members, relieving them from writing error-prone synchronization code.

pdf
SpamTracer: How stealthy are spammers?

In Proceedings of the 32nd IEEE International Conference on Computer Communications (IEEE INFOCOM 2013)
In this paper we present SpamTracer, a system designed to collect and analyze the routing behavior of spam networks in order to determine whether they use BGP hijacks to stealthily send spam from stolen networks.

pdf
A Safety-First Approach to Memory Models

IEEE Micro Top Picks, Volume 33, Number 3, May/June 2013
The concurrency semantics of mainstream programming languages provide "safety" only under the assumption that programmers have implemented proper synchronization to prevent data races. But since simple programming mistakes can break this assumption and result in unreliable program behavior, we argue instead for providing a safety-first model that assumes an access may participate in a data race unless proven otherwise.

pdf
DCU-Symantec at the WMT 2013 Quality Estimation Shared Task

In Proceedings of the 8th Workshop on Statistical Machine Translation (ACL 2013)

pdf
Evaluation of MT systems to translate user generated content

In Proceedings of the 13th Machine Translation Summit (MT Summit XIII)

pdf
Community-based post-editing of machine-translated content: monolingual vs. bilingual

In Proceedings of the 2nd MT Summit XIV Workshop on Post-editing Technology and Practice (WPTP 2013)

pdf
Domain adaptation in statistical machine translation of user-forum data using component-level mixture modeling

In Proceedings of the 13th Machine Translation Summit (MT Summit XIII)

pdf
The ACCEPT Post-Editing environment: a flexible and customisable online tool to perform and analyse machine translation post-editing

In Proceedings of the 14th Machine Translation Summit (MT Summit 2013)

pdf
Quality Estimation-guided Data Selection for Domain Adaptation of SMT

In Proceedings of the 14th Machine Translation Summit (MT Summit 2013)

Academic Papers - 2012

pdf
DISCLOSURE: Detecting Botnet Command and Control Servers Through Large-Scale NetFlow Analysis

In Proceedings of the 28th Annual Computer Security Applications Conference (ACSAC ’12)
We present Disclosure, a large-scale, wide-area botnet detection system that combines several novel techniques for analysing NetFlow data.

pdf
Before We Knew It: An Empirical Study of Zero-Day Attacks In The Real World

In Proceedings of the 2012 ACM Conference on Computer and Communications Security (CCS ’12)
We describe a method for automatically identifying zero-day attacks from field-gathered data that records when benign and malicious binaries are downloaded on 11 million real hosts around the world.

pdf
Ethical Considerations of Sharing Data for Cybersecurity Research

In Proceedings of the 15th International Conference on Financial Cryptography and Data Security (FC 2011)

pdf
A Detailed Analysis of Phrase-based and Syntax-based Machine Translation: The Search for Systematic Differences

In Proceedings of the 10th Conference of the Association for Machine Translation in the Americas (AMTA 2012)

pdf
The Provenance of WINE

In Proceedings of the 9th European Dependable Computing Conference (EDCC 2012)
In the WINE benchmark, which provides field data for cyber security experiments, we aim to make the experimental process self-documenting. The data collected includes provenance information—such as when, where and how an attack was first observed or detected—and allows researchers to gauge information quality.

pdf
Translation Quality-Based Supplementary Data Selection by Incremental Update of Translation Models

In Proceedings of the 24th International Conference on Computational Linguistics (COLING 2012)

pdf
Spammers operations: a multifaceted strategic analysis

Security and Communication Networks (Wiley) (09 October 2012)
This paper explores several facets of spammers' operations by studying their strategic behavior on a long-term basis.

pdf
Toward a Standard Benchmark for Computer Security Research: The Worldwide Intelligence Network Environment (WINE)

In Proceedings of the 1st EuroSys Workshop on Building Analysis Datasets and Gathering Experience Returns for Security (BADGERS ’11)

pdf
Ask WINE: Are We Safer Today? Evaluating Operating System Security through Big Data Analysis

In Proceedings of the 5th USENIX Workshop on Large-Scale Exploits and Emerging Threats (LEET ’12)
In this position paper, we argue that in order to answer conclusively whether end-users are safer today, we must analyze field data collected on real hosts that are targeted by attacks—e.g., the approximately 50 million records of anti-virus telemetry available through Symantec’s WINE platform.

pdf
Using Automatic Machine Translation Metrics to Analyze the Impact of Source Reformulations

In Proceedings of the 10th Biennial Conference of the Association for Machine Translation in the Americas (AMTA-2012)

pdf
End-to-End Sequential Consistency

In Proceedings of the 39th Annual International Symposium on Computer Architecture (ISCA ’12)
By allowing compiler and hardware to cooperate, we show how strong, safe memory models for concurrent programs can be provided with minimal impact on performance.

pdf
Visual Analytics for BGP Monitoring and Prefix Hijacking Identification

IEEE Network, Volume 26, Issue 6, November-December 2012
In this article, we give a short survey of visualization methods that have been developed for BGP monitoring, in particular for the identification of prefix hijacks. Our goal is to illustrate how network visualization has the potential to assist an analyst in detecting abnormal routing patterns in massive amounts of BGP data.

pdf
VisTracer: A Visual Analytics Tool to Investigate Routing Anomalies in Traceroutes

In Proceedings of the 9th Symposium on Visualisation for Cyber Security (VizSec ’12)
This paper proposes VisTracer, a visual analytics tool specifically tailored for the analysis of traceroute measurements for the purpose of uncovering routing anomalies potentially resulting from BGP hijacks.

pdf
A Data-Centric Approach to Synchronization

ACM Transactions on Programming Languages and Systems (TOPLAS), Volume 34, Issue 1, April 2012
Concurrency-related errors, such as data races, are frustratingly difficult to track down and eliminate in large, object-oriented programs. We describe AJ, an extension to Java that uses a declarative, data-centric synchronization paradigm to eliminate a large class of concurrency bugs with low programmer effort.

pdf
Industrial Espionage and Targeted Attacks: Understanding the Characteristics of an Escalating Threat

In Proceedings of the 15th International Workshop on Recent Advances in Intrusion Detection (RAID 2012)
We provide an in-depth analysis of a large corpus of targeted attacks identified by Symantec during the year 2011.

pdf
File Routing Middleware for Cloud Deduplication

In Proceedings of the 2nd International Workshop on Cloud Computing Platforms (CloudCP ’12)
We propose the idea of performing local deduplication operations within each cloud node, and introduce file similarity metrics to determine which node is the best deduplication host for a particular incoming file. This approach reduces the problem of scalable cloud deduplication to a file routing problem, which we can address using a software layer capable of making the necessary routing decisions.

pdf
Domain Adaptation in SMT of User-Generated Forum Content Guided by OOV Word Reduction: Normalization and/or Supplementary Data?

In Proceedings of the 16th Annual Conference of the European Association for Machine Translation (EAMT 2012)

pdf
DCU-Symantec Submission for the WMT 2012 Quality Estimation Task

In Proceedings of the 7th Workshop on Statistical Machine Translation (NAACL 2012)

pdf
Evaluation of Machine-Translated User Generated Content: A pilot study based on User Ratings

In Proceedings of the 16th Annual Conference of the European Association for Machine Translation (EAMT 2012)

Related News


Secure Systems

Central to trust in an increasingly digital world is the ability to detect and prevent attacks in modern (and not so modern) information systems. This research includes building secure software, supporting forensics, malware analysis, browser/web/network security, and information-centric security.


Robust and Fair Machine Learning, Data Mining, and Artificial Intelligence

The tremendous growth in the learning capacity of Machine Learning methods has yet to be met with a corresponding growth in our ability to understand these models. Equally troubling, our ability to build robust machine learning models has not kept pace with research in adversarial attacks against machine learning. As we increasingly hand over decision making to automated machine learning and AI systems, we must find ways that the life-altering decisions made by these systems can be audited for fairness, safety, robustness to adversaries, and the preservation of privacy of any personally identifiable information over which they operate.


Privacy, Identity, and Trust

Consumers and corporations are driven to engage in a digital world that they cannot adequately trust. We are developing paradigms to enable online commerce and facilitate machine learning in ways that provide privacy and protect user identities, by leveraging such concepts as local differential privacy, federated machine learning, identity brokering, and blockchain technology.
