An Off-the-shelf Approach to Authorship Attribution

Research output: Contributions to collected editions/works › Article in conference proceedings › Research › peer-review

Authorship detection is a challenging task because of the many design choices the user has to make. Performance depends heavily on the right set of features, the amount of data, in-sample vs. out-of-sample settings, and profile- vs. instance-based approaches. So far, the variety of combinations has rendered off-the-shelf methods for authorship detection inappropriate. We propose a novel and generally deployable method that does not share these limitations. We treat authorship attribution as an anomaly detection problem in which author regions are learned in feature space. The right feature space for a given task is identified automatically by representing the optimal solution as a linear mixture of multiple kernel functions (multiple kernel learning, MKL). Our approach allows labelled as well as unlabelled examples to be included, which remedies the in-sample and out-of-sample problems. Empirically, we observe that the proposed technique is either better than or on par with baseline competitors. However, our method relieves the user of critical design choices (e.g., the feature set) and can therefore be used as an off-the-shelf method for authorship attribution.
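The idea of treating attribution as anomaly detection over a mixture of kernels can be illustrated with a minimal sketch. This is not the method from the paper: the character n-gram features, the cosine kernels, the centroid-based distance score, and all names (`char_ngrams`, `mixed_kernel`, `distance_to_author`, `WEIGHTS`) are assumptions chosen for illustration, and the mixture weights are fixed by hand here, whereas an MKL approach would learn them from data.

```python
from collections import Counter
from math import sqrt

def char_ngrams(text, n):
    """Count overlapping character n-grams in a lowercased text."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

def cosine_kernel(a, b):
    """Cosine similarity between two sparse count vectors (a normalised linear kernel)."""
    dot = sum(v * b.get(k, 0) for k, v in a.items())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def mixed_kernel(x, y, weights):
    """Linear mixture of cosine kernels, one kernel per n-gram order."""
    return sum(w * cosine_kernel(char_ngrams(x, n), char_ngrams(y, n))
               for n, w in weights.items())

def distance_to_author(text, author_texts, weights):
    """Squared feature-space distance from `text` to the centroid of an author's texts.

    A large distance marks the text as an anomaly with respect to that author.
    """
    m = len(author_texts)
    self_term = mixed_kernel(text, text, weights)
    cross = sum(mixed_kernel(text, t, weights) for t in author_texts) / m
    within = sum(mixed_kernel(s, t, weights)
                 for s in author_texts for t in author_texts) / (m * m)
    return self_term - 2 * cross + within

# Hypothetical, hand-picked mixture weights (n-gram order -> weight).
# In an MKL setting these would be learned rather than fixed.
WEIGHTS = {1: 0.2, 2: 0.3, 3: 0.5}

# Toy data, invented for this example only.
known = ["the cat sat on the mat and the cat slept",
         "the cat ate and the cat sat on the rug"]
same = distance_to_author("the cat sat on the rug and slept", known, WEIGHTS)
other = distance_to_author("quarterly revenue growth exceeded analyst forecasts",
                           known, WEIGHTS)
print(same < other)
```

In this toy setting a text is attributed to an author when its distance to that author's region falls below a threshold. Choosing that threshold, and learning the kernel weights instead of fixing them, are the parts this sketch leaves out.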

Original language: English
Title of host publication: COLING 2014 - 25th International Conference on Computational Linguistics, Proceedings of COLING 2014: Technical Papers
Number of pages: 10
Place of publication: Dublin
Publisher: Association for Computational Linguistics (ACL)
Publication date: 2014
Pages: 895-904
ISBN (print): 978-194164326-6
ISBN (electronic): 9781941643266
Publication status: Published - 2014
Externally published: Yes
Event: 25th International Conference on Computational Linguistics - COLING 2014 - Dublin, Ireland
Duration: 23.08.2014 - 29.08.2014
Conference number: 25
https://aclanthology.info/volumes/proceedings-of-coling-2014-the-25th-international-conference-on-computational-linguistics-technical-papers
