Discriminative Identification of Duplicates

Activity: Talk or presentation › Conference Presentations › Research

Peter Haider - Speaker

Ulf Brefeld - Speaker

Tobias Scheffer - Speaker

The problem of finding duplicates in data is ubiquitous in
data mining. We cast the problem of finding duplicates in sequential data
as a poly-cut problem on a fully connected graph. The edge weights are
parameterized pairwise similarities between objects, optimized by
structural support vector machines on labeled training sets. Our approach
adapts the similarity measure to the data and is independent of the number
of clusters. We present three large-margin approximations for learning the
pairwise similarities: an integrated QP formulation, a sequential
multi-class approach, and a pairwise classifier.
We report on experimental results.
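
To make the graph view in the abstract concrete, here is a minimal sketch, not the speakers' implementation. It scores each pair of objects with a linear similarity w · φ(a, b) and groups objects that are linked by positive-weight edges. Everything named here is an assumption made for illustration: the feature function `pair_features`, the hand-set weight vector `w`, and the connected-components grouping, which is a simple stand-in for the poly-cut optimization. In the talk the weights are learned by structural support vector machines, which this sketch does not attempt.

```python
from itertools import combinations


def pair_features(a: str, b: str) -> list[float]:
    """Hypothetical pairwise features for two token sequences (illustrative only)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    jaccard = len(ta & tb) / len(union) if union else 1.0
    len_ratio = min(len(a), len(b)) / max(len(a), len(b), 1)
    return [jaccard, len_ratio, 1.0]  # the constant 1.0 acts as a bias term


def similarity(w: list[float], a: str, b: str) -> float:
    """Edge weight between a and b: inner product of w and the pair features."""
    return sum(wi * fi for wi, fi in zip(w, pair_features(a, b)))


def greedy_partition(items: list[str], w: list[float]) -> list[list[str]]:
    """Group items connected by positive-weight edges (union-find).

    The number of groups is not fixed in advance; it follows from the weights.
    This is a heuristic stand-in, not an exact poly-cut solver.
    """
    parent = list(range(len(items)))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(items)), 2):
        if similarity(w, items[i], items[j]) > 0:
            parent[find(i)] = find(j)

    clusters: dict[int, list[str]] = {}
    for i, item in enumerate(items):
        clusters.setdefault(find(i), []).append(item)
    return list(clusters.values())


if __name__ == "__main__":
    messages = [
        "buy cheap watches now",
        "buy cheap watches today",
        "meeting agenda for monday",
    ]
    w = [1.0, 0.0, -0.5]  # hand-set for the example; learned in the talk
    for group in greedy_partition(messages, w):
        print(group)
```

With these illustrative weights, the two near-identical messages share an edge of positive weight and fall into one group, while the unrelated message stays on its own. No cluster count is passed in, which mirrors the abstract's point that the approach does not depend on the number of clusters.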
18.09.2006 to 22.09.2006

Event

European Conference on Machine Learning

18.09.06 to 22.09.06

Berlin, Berlin, Germany

Event: Conference
