Supervised clustering of streaming data for email batch detection

Research output: Contributions to collected editions/works > Article in conference proceedings > Research > peer-review

Standard

Supervised clustering of streaming data for email batch detection. / Haider, Peter; Brefeld, Ulf; Scheffer, Tobias.
Proceedings of the 24th international conference on Machine learning. ed. / Zoubin Ghahramani. New York: Association for Computing Machinery, Inc, 2007. p. 345-352.

Harvard

Haider, P, Brefeld, U & Scheffer, T 2007, Supervised clustering of streaming data for email batch detection. in Z Ghahramani (ed.), Proceedings of the 24th international conference on Machine learning. Association for Computing Machinery, Inc, New York, pp. 345-352, Proceedings of the 24th international conference on Machine learning - ICML 2007, Corvallis, Oregon, United States, 20.06.07. https://doi.org/10.1145/1273496.1273540

APA

Haider, P., Brefeld, U., & Scheffer, T. (2007). Supervised clustering of streaming data for email batch detection. In Z. Ghahramani (Ed.), Proceedings of the 24th international conference on Machine learning (pp. 345-352). Association for Computing Machinery, Inc. https://doi.org/10.1145/1273496.1273540

Vancouver

Haider P, Brefeld U, Scheffer T. Supervised clustering of streaming data for email batch detection. In Ghahramani Z, editor, Proceedings of the 24th international conference on Machine learning. New York: Association for Computing Machinery, Inc. 2007. p. 345-352 doi: 10.1145/1273496.1273540

Bibtex

@inbook{c18a5632af0d4bb78aadc3a531618a5c,
title = "Supervised clustering of streaming data for email batch detection",
abstract = "We address the problem of detecting batches of emails that have been created according to the same template. This problem is motivated by the desire to filter spam more effectively by exploiting collective information about entire batches of jointly generated messages. The application matches the problem setting of supervised clustering, because examples of correct clusterings can be collected. Known decoding procedures for supervised clustering are cubic in the number of instances. When decisions cannot be reconsidered once they have been made, owing to the streaming nature of the data, then the decoding problem can be solved in linear time. We devise a sequential decoding procedure and derive the corresponding optimization problem of supervised clustering. We study the impact of collective attributes of email batches on the effectiveness of recognizing spam emails.",
keywords = "Informatics, Business informatics",
author = "Peter Haider and Ulf Brefeld and Tobias Scheffer",
year = "2007",
doi = "10.1145/1273496.1273540",
language = "English",
isbn = "978-1-59593-793-3",
pages = "345--352",
editor = "Zoubin Ghahramani",
booktitle = "Proceedings of the 24th international conference on Machine learning",
publisher = "Association for Computing Machinery, Inc",
address = "United States",
note = "Proceedings of the 24th international conference on Machine learning - ICML 2007, ICML 2007 ; Conference date: 20-06-2007 Through 24-06-2007",
url = "https://dl.acm.org/doi/proceedings/10.1145/1273496",

}

RIS

TY - CHAP

T1 - Supervised clustering of streaming data for email batch detection

AU - Haider, Peter

AU - Brefeld, Ulf

AU - Scheffer, Tobias

N1 - Conference code: 24

PY - 2007

Y1 - 2007

N2 - We address the problem of detecting batches of emails that have been created according to the same template. This problem is motivated by the desire to filter spam more effectively by exploiting collective information about entire batches of jointly generated messages. The application matches the problem setting of supervised clustering, because examples of correct clusterings can be collected. Known decoding procedures for supervised clustering are cubic in the number of instances. When decisions cannot be reconsidered once they have been made, owing to the streaming nature of the data, then the decoding problem can be solved in linear time. We devise a sequential decoding procedure and derive the corresponding optimization problem of supervised clustering. We study the impact of collective attributes of email batches on the effectiveness of recognizing spam emails.

AB - We address the problem of detecting batches of emails that have been created according to the same template. This problem is motivated by the desire to filter spam more effectively by exploiting collective information about entire batches of jointly generated messages. The application matches the problem setting of supervised clustering, because examples of correct clusterings can be collected. Known decoding procedures for supervised clustering are cubic in the number of instances. When decisions cannot be reconsidered once they have been made, owing to the streaming nature of the data, then the decoding problem can be solved in linear time. We devise a sequential decoding procedure and derive the corresponding optimization problem of supervised clustering. We study the impact of collective attributes of email batches on the effectiveness of recognizing spam emails.

KW - Informatics

KW - Business informatics

UR - http://www.scopus.com/inward/record.url?scp=34547983265&partnerID=8YFLogxK

U2 - 10.1145/1273496.1273540

DO - 10.1145/1273496.1273540

M3 - Article in conference proceedings

AN - SCOPUS:34547983265

SN - 978-1-59593-793-3

SP - 345

EP - 352

BT - Proceedings of the 24th international conference on Machine learning

A2 - Ghahramani, Zoubin

PB - Association for Computing Machinery, Inc

CY - New York

T2 - Proceedings of the 24th international conference on Machine learning - ICML 2007

Y2 - 20 June 2007 through 24 June 2007

ER -
