Harvesting information from captions for weakly supervised semantic segmentation

Publication: Contribution to collected works, Article in conference proceedings, Research, peer-reviewed

Standard

Harvesting information from captions for weakly supervised semantic segmentation. / Sawatzky, Johann; Banerjee, Debayan; Gall, Juergen.
2019 International Conference on Computer Vision Workshops: ICCV 2019 : proceedings : 27 October-2 November 2019, Seoul, Korea. Piscataway: Institute of Electrical and Electronics Engineers Inc., 2019. pp. 4481-4490, Article 9022140 (IEEE International Conference on Computer Vision workshops; Vol. 2019).

Harvard

Sawatzky, J, Banerjee, D & Gall, J 2019, Harvesting information from captions for weakly supervised semantic segmentation. in 2019 International Conference on Computer Vision Workshops: ICCV 2019 : proceedings : 27 October-2 November 2019, Seoul, Korea., 9022140, IEEE International Conference on Computer Vision workshops, vol. 2019, Institute of Electrical and Electronics Engineers Inc., Piscataway, pp. 4481-4490, 17th IEEE/CVF International Conference on Computer Vision Workshop - ICCVW 2019, Seoul, South Korea, 27 October 2019. https://doi.org/10.1109/ICCVW.2019.00549

APA

Sawatzky, J., Banerjee, D., & Gall, J. (2019). Harvesting information from captions for weakly supervised semantic segmentation. In 2019 International Conference on Computer Vision Workshops: ICCV 2019 : proceedings : 27 October-2 November 2019, Seoul, Korea (pp. 4481-4490). Article 9022140 (IEEE International Conference on Computer Vision workshops; Vol. 2019). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICCVW.2019.00549

Vancouver

Sawatzky J, Banerjee D, Gall J. Harvesting information from captions for weakly supervised semantic segmentation. In 2019 International Conference on Computer Vision Workshops: ICCV 2019 : proceedings : 27 October-2 November 2019, Seoul, Korea. Piscataway: Institute of Electrical and Electronics Engineers Inc. 2019. p. 4481-4490. 9022140. (IEEE International Conference on Computer Vision workshops). doi: 10.1109/ICCVW.2019.00549

BibTeX

@inbook{13c2379a3a944f5bacd91e0409b3aeca,
title = "Harvesting information from captions for weakly supervised semantic segmentation",
abstract = "Since acquiring pixel-wise annotations for training convolutional neural networks for semantic image segmentation is time-consuming, weakly supervised approaches that only require class tags have been proposed. In this work, we propose another form of supervision, namely image captions as they can be found on the Internet. These captions have two advantages. They do not require additional curation as it is the case for the clean class tags used by current weakly supervised approaches and they provide textual context for the classes present in an image. To leverage such textual context, we deploy a multi-modal network that learns a joint embedding of the visual representation of the image and the textual representation of the caption. The network estimates text activation maps (TAMs) for class names as well as compound concepts, i.e. combinations of nouns and their attributes. The TAMs of compound concepts describing classes of interest substantially improve the quality of the estimated class activation maps which are then used to train a network for semantic segmentation. We evaluate our method on the COCO dataset where it achieves state of the art results for weakly supervised image segmentation.",
keywords = "Multimodal learning, Semantic segmentation, Weakly supervised learning, Weakly supervised semantic segmentation, Informatics",
author = "Johann Sawatzky and Debayan Banerjee and Juergen Gall",
note = "Publisher Copyright: {\textcopyright} 2019 IEEE.; 17th IEEE/CVF International Conference on Computer Vision Workshop - ICCVW 2019, ICCVW 2019 ; Conference date: 27-10-2019 Through 28-10-2019",
year = "2019",
month = oct,
doi = "10.1109/ICCVW.2019.00549",
language = "English",
isbn = "978-1-7281-5024-6",
series = "IEEE International Conference on Computer Vision workshops",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
pages = "4481--4490",
booktitle = "2019 International Conference on Computer Vision Workshops",
address = "United States",
url = "https://iccv2019.thecvf.com/",

}

RIS

TY - CHAP

T1 - Harvesting information from captions for weakly supervised semantic segmentation

AU - Sawatzky, Johann

AU - Banerjee, Debayan

AU - Gall, Juergen

N1 - Conference code: 17

PY - 2019/10

Y1 - 2019/10

N2 - Since acquiring pixel-wise annotations for training convolutional neural networks for semantic image segmentation is time-consuming, weakly supervised approaches that only require class tags have been proposed. In this work, we propose another form of supervision, namely image captions as they can be found on the Internet. These captions have two advantages. They do not require additional curation as it is the case for the clean class tags used by current weakly supervised approaches and they provide textual context for the classes present in an image. To leverage such textual context, we deploy a multi-modal network that learns a joint embedding of the visual representation of the image and the textual representation of the caption. The network estimates text activation maps (TAMs) for class names as well as compound concepts, i.e. combinations of nouns and their attributes. The TAMs of compound concepts describing classes of interest substantially improve the quality of the estimated class activation maps which are then used to train a network for semantic segmentation. We evaluate our method on the COCO dataset where it achieves state of the art results for weakly supervised image segmentation.

AB - Since acquiring pixel-wise annotations for training convolutional neural networks for semantic image segmentation is time-consuming, weakly supervised approaches that only require class tags have been proposed. In this work, we propose another form of supervision, namely image captions as they can be found on the Internet. These captions have two advantages. They do not require additional curation as it is the case for the clean class tags used by current weakly supervised approaches and they provide textual context for the classes present in an image. To leverage such textual context, we deploy a multi-modal network that learns a joint embedding of the visual representation of the image and the textual representation of the caption. The network estimates text activation maps (TAMs) for class names as well as compound concepts, i.e. combinations of nouns and their attributes. The TAMs of compound concepts describing classes of interest substantially improve the quality of the estimated class activation maps which are then used to train a network for semantic segmentation. We evaluate our method on the COCO dataset where it achieves state of the art results for weakly supervised image segmentation.

KW - Multimodal learning

KW - Semantic segmentation

KW - Weakly supervised learning

KW - Weakly supervised semantic segmentation

KW - Informatics

UR - http://www.scopus.com/inward/record.url?scp=85082499279&partnerID=8YFLogxK

U2 - 10.1109/ICCVW.2019.00549

DO - 10.1109/ICCVW.2019.00549

M3 - Article in conference proceedings

AN - SCOPUS:85082499279

SN - 978-1-7281-5024-6

T3 - IEEE International Conference on Computer Vision workshops

SP - 4481

EP - 4490

BT - 2019 International Conference on Computer Vision Workshops

PB - Institute of Electrical and Electronics Engineers Inc.

CY - Piscataway

T2 - 17th IEEE/CVF International Conference on Computer Vision Workshop - ICCVW 2019

Y2 - 27 October 2019 through 28 October 2019

ER -
