Harvesting information from captions for weakly supervised semantic segmentation

Research output: Contributions to collected editions/works › Article in conference proceedings › Research › peer-review

Standard

Harvesting information from captions for weakly supervised semantic segmentation. / Sawatzky, Johann; Banerjee, Debayan; Gall, Juergen.
2019 International Conference on Computer Vision Workshops: ICCV 2019 : proceedings : 27 October-2 November 2019, Seoul, Korea. Piscataway: Institute of Electrical and Electronics Engineers Inc., 2019. p. 4481-4490, Article 9022140 (IEEE International Conference on Computer Vision workshops; Vol. 2019).

Harvard

Sawatzky, J, Banerjee, D & Gall, J 2019, Harvesting information from captions for weakly supervised semantic segmentation. in 2019 International Conference on Computer Vision Workshops: ICCV 2019 : proceedings : 27 October-2 November 2019, Seoul, Korea, 9022140, IEEE International Conference on Computer Vision workshops, vol. 2019, Institute of Electrical and Electronics Engineers Inc., Piscataway, pp. 4481-4490, 17th IEEE/CVF International Conference on Computer Vision Workshop - ICCVW 2019, Seoul, Korea, Republic of, 27.10.19. https://doi.org/10.1109/ICCVW.2019.00549

APA

Sawatzky, J., Banerjee, D., & Gall, J. (2019). Harvesting information from captions for weakly supervised semantic segmentation. In 2019 International Conference on Computer Vision Workshops: ICCV 2019 : proceedings : 27 October-2 November 2019, Seoul, Korea (pp. 4481-4490). Article 9022140 (IEEE International Conference on Computer Vision workshops; Vol. 2019). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICCVW.2019.00549

Vancouver

Sawatzky J, Banerjee D, Gall J. Harvesting information from captions for weakly supervised semantic segmentation. In 2019 International Conference on Computer Vision Workshops: ICCV 2019 : proceedings : 27 October-2 November 2019, Seoul, Korea. Piscataway: Institute of Electrical and Electronics Engineers Inc. 2019. p. 4481-4490. 9022140. (IEEE International Conference on Computer Vision workshops). doi: 10.1109/ICCVW.2019.00549

Bibtex

@inbook{13c2379a3a944f5bacd91e0409b3aeca,
title = "Harvesting information from captions for weakly supervised semantic segmentation",
abstract = "Since acquiring pixel-wise annotations for training convolutional neural networks for semantic image segmentation is time-consuming, weakly supervised approaches that only require class tags have been proposed. In this work, we propose another form of supervision, namely image captions as they can be found on the Internet. These captions have two advantages. They do not require additional curation as it is the case for the clean class tags used by current weakly supervised approaches and they provide textual context for the classes present in an image. To leverage such textual context, we deploy a multi-modal network that learns a joint embedding of the visual representation of the image and the textual representation of the caption. The network estimates text activation maps (TAMs) for class names as well as compound concepts, i.e. combinations of nouns and their attributes. The TAMs of compound concepts describing classes of interest substantially improve the quality of the estimated class activation maps which are then used to train a network for semantic segmentation. We evaluate our method on the COCO dataset where it achieves state of the art results for weakly supervised image segmentation.",
keywords = "Multimodal learning, Semantic segmentation, Weakly supervised learning, Weakly supervised semantic segmentation, Informatics",
author = "Johann Sawatzky and Debayan Banerjee and Juergen Gall",
note = "Publisher Copyright: {\textcopyright} 2019 IEEE; 17th IEEE/CVF International Conference on Computer Vision Workshop - ICCVW 2019; Conference date: 27-10-2019 through 28-10-2019",
year = "2019",
month = oct,
doi = "10.1109/ICCVW.2019.00549",
language = "English",
isbn = "978-1-7281-5024-6",
series = "IEEE International Conference on Computer Vision workshops",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
pages = "4481--4490",
booktitle = "2019 International Conference on Computer Vision Workshops",
address = "United States",
url = "https://iccv2019.thecvf.com/",

}

RIS

TY - CHAP
T1 - Harvesting information from captions for weakly supervised semantic segmentation
AU - Sawatzky, Johann
AU - Banerjee, Debayan
AU - Gall, Juergen
N1 - Conference code: 17
PY - 2019/10
Y1 - 2019/10
N2 - Since acquiring pixel-wise annotations for training convolutional neural networks for semantic image segmentation is time-consuming, weakly supervised approaches that only require class tags have been proposed. In this work, we propose another form of supervision, namely image captions as they can be found on the Internet. These captions have two advantages. They do not require additional curation as it is the case for the clean class tags used by current weakly supervised approaches and they provide textual context for the classes present in an image. To leverage such textual context, we deploy a multi-modal network that learns a joint embedding of the visual representation of the image and the textual representation of the caption. The network estimates text activation maps (TAMs) for class names as well as compound concepts, i.e. combinations of nouns and their attributes. The TAMs of compound concepts describing classes of interest substantially improve the quality of the estimated class activation maps which are then used to train a network for semantic segmentation. We evaluate our method on the COCO dataset where it achieves state of the art results for weakly supervised image segmentation.

AB - Since acquiring pixel-wise annotations for training convolutional neural networks for semantic image segmentation is time-consuming, weakly supervised approaches that only require class tags have been proposed. In this work, we propose another form of supervision, namely image captions as they can be found on the Internet. These captions have two advantages. They do not require additional curation as it is the case for the clean class tags used by current weakly supervised approaches and they provide textual context for the classes present in an image. To leverage such textual context, we deploy a multi-modal network that learns a joint embedding of the visual representation of the image and the textual representation of the caption. The network estimates text activation maps (TAMs) for class names as well as compound concepts, i.e. combinations of nouns and their attributes. The TAMs of compound concepts describing classes of interest substantially improve the quality of the estimated class activation maps which are then used to train a network for semantic segmentation. We evaluate our method on the COCO dataset where it achieves state of the art results for weakly supervised image segmentation.

KW - Multimodal learning
KW - Semantic segmentation
KW - Weakly supervised learning
KW - Weakly supervised semantic segmentation
KW - Informatics
UR - http://www.scopus.com/inward/record.url?scp=85082499279&partnerID=8YFLogxK
U2 - 10.1109/ICCVW.2019.00549
DO - 10.1109/ICCVW.2019.00549
M3 - Article in conference proceedings
AN - SCOPUS:85082499279
SN - 978-1-7281-5024-6
T3 - IEEE International Conference on Computer Vision workshops
SP - 4481
EP - 4490
BT - 2019 International Conference on Computer Vision Workshops
PB - Institute of Electrical and Electronics Engineers Inc.
CY - Piscataway
T2 - 17th IEEE/CVF International Conference on Computer Vision Workshop - ICCVW 2019
Y2 - 27 October 2019 through 28 October 2019
ER -
