Harvesting information from captions for weakly supervised semantic segmentation

Research output: Contributions to collected editions/works, Article in conference proceedings, Research, peer-review

Authors

Since acquiring pixel-wise annotations for training convolutional neural networks for semantic image segmentation is time-consuming, weakly supervised approaches that only require class tags have been proposed. In this work, we propose another form of supervision, namely image captions as they can be found on the Internet. These captions have two advantages: they do not require additional curation, as is the case for the clean class tags used by current weakly supervised approaches, and they provide textual context for the classes present in an image. To leverage such textual context, we deploy a multi-modal network that learns a joint embedding of the visual representation of the image and the textual representation of the caption. The network estimates text activation maps (TAMs) for class names as well as compound concepts, i.e., combinations of nouns and their attributes. The TAMs of compound concepts describing classes of interest substantially improve the quality of the estimated class activation maps, which are then used to train a network for semantic segmentation. We evaluate our method on the COCO dataset, where it achieves state-of-the-art results for weakly supervised image segmentation.
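The abstract does not say how a text activation map is computed, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that the image is represented by a grid of feature vectors already projected into the joint embedding space, that a class name or compound concept is represented by a single vector in the same space, and that the map is the cosine similarity between the two at every location, clipped at zero and scaled by its maximum. The function names `cosine` and `text_activation_map` and the toy values are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def text_activation_map(features, text_embedding):
    """Assumed formulation: features is an H x W grid of D-dim vectors in the
    joint space; the result is an H x W map with values in [0, 1]."""
    # Similarity of every spatial location to the text concept, negatives clipped.
    raw = [[max(0.0, cosine(vec, text_embedding)) for vec in row] for row in features]
    peak = max(max(row) for row in raw)
    if peak == 0.0:
        return raw  # the concept does not activate anywhere
    # Scale so the strongest location has activation 1.0.
    return [[value / peak for value in row] for row in raw]

# Toy 2 x 2 feature grid with 2-dim features (illustrative values only).
features = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[1.0, 1.0], [-1.0, 0.0]],
]
text_embedding = [1.0, 0.0]  # hypothetical embedding of one concept
tam = text_activation_map(features, text_embedding)
for row in tam:
    print([round(value, 3) for value in row])
```

In this toy grid the top-left location matches the concept exactly, the bottom-left location matches it partially, and the other two do not activate. In a pipeline like the one the abstract describes, such maps would then be thresholded or otherwise combined into class activation maps that supervise a segmentation network; that step is not sketched here.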

Original language: English
Title of host publication: 2019 International Conference on Computer Vision Workshops : ICCV 2019 : proceedings : 27 October-2 November 2019, Seoul, Korea
Number of pages: 10
Place of Publication: Piscataway
Publisher: Institute of Electrical and Electronics Engineers Inc.
Publication date: 10.2019
Pages: 4481-4490
Article number: 9022140
ISBN (print): 978-1-7281-5024-6
ISBN (electronic): 978-1-7281-5023-9
DOIs
Publication status: Published - 10.2019
Externally published: Yes
Event: 17th IEEE/CVF International Conference on Computer Vision Workshop - ICCVW 2019 - Seoul, Korea, Republic of
Duration: 27.10.2019 - 28.10.2019
Conference number: 17
https://iccv2019.thecvf.com/

Bibliographical note

Publisher Copyright:
© 2019 IEEE.

    Research areas

  • Multimodal learning, Semantic segmentation, Weakly supervised learning, Weakly supervised semantic segmentation
  • Informatics
