2020 |
 |
Amélie Royer, Christoph H. Lampert. "Localizing Grouped Instances for Efficient Detection in Low-Resource Scenarios", Winter Conference on Applications of Computer Vision (WACV) 2020
[Bibtex] [Abstract] [PDF] [Poster]
|
@InProceedings{odgi,
author = {Royer, Am\'{e}lie and Lampert, Christoph H.},
title = {Localizing Grouped Instances for Efficient Detection in Low-Resource Scenarios},
booktitle = {Winter Conference on Applications of Computer Vision (WACV)},
year = {2020}
}
State-of-the-art detection systems are generally evaluated on their ability to exhaustively retrieve objects densely distributed in the image, across a wide variety of appearances and semantic categories. Orthogonal to this, many real-life object detection applications, for example in remote sensing, instead require dealing with large images that contain only a few small objects of a single class, scattered heterogeneously across the image. In addition, they are often subject to strict computational constraints, such as limited battery capacity and computing power.
To tackle these more practical scenarios, we propose a novel flexible detection scheme that efficiently adapts to variable object sizes and densities: we rely on a sequence of detection stages, each of which has the ability to predict groups of objects as well as individuals. Similar to a detection cascade, this multi-stage architecture spares computational effort by discarding large irrelevant regions of the image early during the detection process. The ability to group objects provides further computational and memory savings, as it allows working with lower image resolutions in the early stages, where groups, being more salient, are easier to detect than individuals. We report experimental results on two aerial image datasets and show that the proposed method is as accurate as, yet computationally more efficient than, standard single-shot detectors, consistently across three different backbone architectures.
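To make the group-then-refine mechanism concrete, here is a minimal Python sketch of a two-stage pipeline in this spirit. Everything in it is an illustrative assumption, not the paper's implementation: stage_detect stands in for any single-shot detector, and the downscale factor is arbitrary.

import numpy as np

def stage_detect(image):
    """Hypothetical single-shot detector: returns (box, label, score)
    triples in the input image's own pixel coordinates, where label is
    either 'group' or 'individual'."""
    return []  # a real backbone (e.g. a YOLO-style network) goes here

def detect_grouped(image, downscale=4):
    # Stage 1 runs on a coarse version of the image: groups of small
    # objects are more salient there than the individual objects.
    coarse = image[::downscale, ::downscale]
    final = []
    for box, label, score in stage_detect(coarse):
        x0, y0, x1, y1 = (int(c * downscale) for c in box)
        if label == 'individual':
            final.append(((x0, y0, x1, y1), score))
        else:
            # Stage 2 revisits only the group regions, at full resolution;
            # everything else is discarded early, which is where the
            # computational savings come from.
            for (u0, v0, u1, v1), lbl, s in stage_detect(image[y0:y1, x0:x1]):
                if lbl == 'individual':
                    final.append(((x0 + u0, y0 + v0, x0 + u1, y0 + v1), s))
    return final

# e.g.: detections = detect_grouped(np.zeros((4000, 4000, 3)))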
|
|
 |
Amélie Royer, Christoph H. Lampert. "A Flexible Selection Scheme for Minimum-Effort Transfer Learning", Winter Conference on Applications of Computer Vision (WACV) 2020
[Bibtex] [Abstract] [PDF] [Poster]
|
@InProceedings{flextuning,
author = {Royer, Am\'{e}lie and Lampert, Christoph H.},
title = {A Flexible Selection Scheme for Minimum-Effort Transfer Learning},
booktitle = {Winter Conference on Applications of Computer Vision (WACV)},
year = {2020}
}
Fine-tuning is a popular way of exploiting knowledge contained in a pre-trained convolutional network for a new visual recognition task. However, the orthogonal setting of transferring knowledge from a pretrained network to a visually different yet semantically close source is rarely considered: this commonly happens with real-life data, which is not necessarily as clean as the training source (noise, geometric transformations, different modalities, etc.).
To tackle such scenarios, we introduce a new, generalized form of fine-tuning, called flex-tuning, in which any individual unit (e.g. layer) of a network can be tuned, and the most promising one is chosen automatically. In order to make the method appealing for practical use, we propose two lightweight and faster selection procedures that prove to be good approximations in practice. We study these selection criteria empirically across a variety of domain shifts and data-scarcity scenarios, and show that fine-tuning individual units, despite its simplicity, yields very good results as an adaptation technique. As it turns out, in contrast to common practice, in many domain-shift scenarios it is best to tune an intermediate or early unit rather than the last fully-connected one, and flex-tuning detects this accurately.
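As a rough PyTorch-style sketch of the exhaustive selection loop (not the paper's faster approximate criteria), assuming hypothetical helpers make_pretrained, train_briefly and evaluate:

def flex_tune(make_pretrained, unit_names, train_briefly, evaluate):
    """make_pretrained() -> a fresh copy of the pretrained torch.nn.Module;
    unit_names: candidate submodules to tune (e.g. 'layer1', 'fc')."""
    best_model, best_score = None, float('-inf')
    for name in unit_names:
        model = make_pretrained()
        for p in model.parameters():        # freeze the whole network ...
            p.requires_grad = False
        for p in dict(model.named_modules())[name].parameters():
            p.requires_grad = True          # ... except the chosen unit
        train_briefly(model)                # short fine-tuning on target data
        score = evaluate(model)             # held-out target-domain accuracy
        if score > best_score:              # keep the most promising unit
            best_model, best_score = model, score
    return best_model, best_score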
|
|
 |
Krishnendu Chatterjee, Martin Chmelík, Deep Karkhanis, Petr Novotný and Amélie Royer. "Multiple-Environment Markov Decision Processes: Efficient Analysis and Applications", International Conference on Automated Planning and Scheduling (ICAPS), 2020.
[Bibtex] [Abstract] [PDF]
|
@InProceedings{memdp,
author = {Chatterjee, Krishnendu and Chmelik, Martin and Karkhanis, Deep
and Novotny, Petr and Royer, Am\'{e}lie},
title = {Multiple-Environment Markov Decision Processes: Efficient Analysis and Applications},
booktitle = {International Conference on Automated Planning and Scheduling (ICAPS)},
year = {2020}
}
|
2018 |
 |
Amélie Royer, Konstantinos Bousmalis, Stephan Gouws, Fred Bertsch, Inbar Mosseri, Forrester Cole, Kevin Murphy. "XGAN: Unsupervised Image-to-Image Translation for Many-to-Many Mappings", Domain Adaptation for Visual Understanding Workshop at ICML/IJCAI-ECAI, 2018.
[Bibtex] [Abstract] [PDF] [Slides]
|
@InProceedings{DBLP:journals/corr/abs-1711-05139,
author = {Am\'{e}lie Royer and
Konstantinos Bousmalis and
Stephan Gouws and
Fred Bertsch and
Inbar Mosseri and
Forrester Cole and
Kevin Murphy},
title = {{XGAN:} Unsupervised Image-to-Image Translation for Many-to-Many Mappings},
booktitle = {Domain Adaptation for Visual Understanding Workshop at ICML'18},
year = {2018}
}
Style transfer usually refers to the task of applying color and texture information from a specific style image to a given content image while preserving the structure of the latter. Here we tackle the more generic problem of semantic style transfer: given two unpaired collections of images, we aim to learn a mapping between the corpus-level style of each collection, while preserving semantic content shared across the two domains.
We introduce XGAN ("Cross-GAN"), a dual adversarial autoencoder, which captures a shared representation of the common domain semantic content in an unsupervised way, while jointly learning the domain-to-domain image translations in both directions. We exploit ideas from the domain adaptation literature and define a semantic consistency loss which encourages the model to preserve semantics in the learned embedding space. We report promising qualitative results for the task of face-to-cartoon translation. The cartoon dataset we collected for this purpose, CartoonSet, is publicly available at https://google.github.io/cartoonset/index.html as a new benchmark for semantic style transfer.
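The semantic consistency loss can be illustrated with a short PyTorch sketch. The encoder/decoder arguments are placeholders, and the cosine distance is one plausible choice of embedding distance, not necessarily the paper's exact formulation:

import torch.nn.functional as F

def semantic_consistency_loss(enc_a, dec_b, enc_b, x_a):
    """Encourage the shared embedding of a domain-A image to survive the
    A->B translation round trip: enc_a(x_a) ~ enc_b(dec_b(enc_a(x_a)))."""
    e_a = enc_a(x_a)     # shared-space embedding of the input
    x_ab = dec_b(e_a)    # translation into domain B
    e_ab = enc_b(x_ab)   # re-embed the translated image
    # Cosine distance between the two embeddings (an L2 distance
    # would work as well).
    return 1.0 - F.cosine_similarity(e_a.flatten(1), e_ab.flatten(1)).mean()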
|
2017 |
 |
Amélie Royer*, Alexander Kolesnikov*, Christoph H. Lampert. "Probabilistic Image Colorization",
British Machine Vision Conference (BMVC), 2017.
(∗) equal contribution
[Bibtex] [Abstract] [PDF] [Poster]
|
@InProceedings{ Royer_2017_BMVC,
author = {Royer, Am\'{e}lie and Kolesnikov, Alexander and Lampert, Christoph H.},
title = {Probabilistic Image Colorization},
booktitle = {British Machine Vision Conference (BMVC)},
month = {September},
year = {2017}
}
We develop a probabilistic technique for colorizing grayscale natural images. In light of the intrinsic uncertainty of this task, the proposed probabilistic framework has numerous desirable properties. In particular, our model is able to produce multiple plausible and vivid colorizations for a given grayscale image and is one of the first colorization models to provide a proper stochastic sampling scheme.
Moreover, our training procedure is supported by a rigorous theoretical framework that does not require any ad hoc heuristics and allows for efficient modeling and learning of the joint pixel color distribution. We demonstrate strong quantitative and qualitative experimental results on the CIFAR-10 dataset and the challenging ILSVRC 2012 dataset.
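The stochastic sampling that such a model enables can be illustrated with a toy autoregressive loop. Here color_logits is a hypothetical network interface, the raster-scan order and chroma quantization are illustrative assumptions, and this sketches only the general mechanism, not the paper's exact model:

import numpy as np

def sample_colorization(gray, color_logits, num_bins=32, rng=None):
    """gray: (H, W) grayscale image. color_logits(gray, chroma, i, j)
    returns (num_bins,) logits for pixel (i, j)'s quantized chroma,
    conditioned on the grayscale input and previously sampled pixels."""
    rng = rng or np.random.default_rng()
    h, w = gray.shape
    chroma = np.zeros((h, w), dtype=np.int64)
    for i in range(h):                      # raster-scan autoregressive pass
        for j in range(w):
            logits = color_logits(gray, chroma, i, j)
            p = np.exp(logits - logits.max())
            p /= p.sum()
            # Sampling (instead of taking the argmax) is what yields
            # multiple plausible colorizations of the same input.
            chroma[i, j] = rng.choice(num_bins, p=p)
    return chroma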
|
2016 |
 |
Amélie Royer, Guillaume Gravier, Vincent Claveau. "Audio word similarity for clustering with zero resources based on iterative HMM classification",
IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2016.
[Bibtex] [Abstract] [PDF] [Poster]
|
@InProceedings{ Royer_2016_ICASSP,
author = {Royer, Am\'{e}lie and Gravier, Guillaume and Claveau, Vincent},
title = {Audio word similarity for clustering with zero resources based on iterative HMM classification},
booktitle = {International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
month = {March},
year = {2016}
}
Recent work on zero resource word discovery makes intensive use of audio fragment clustering to find repeating speech patterns. In the absence of acoustic models, the clustering step traditionally relies on dynamic time warping (DTW) to compare two samples and thus suffers from the known limitations of this technique.
We propose a new sample comparison method, called 'similarity by iterative classification', that exploits the modeling capacities of hidden Markov models (HMMs) with no supervision. The core idea relies on the use of HMMs trained on randomly labeled data and exploits the fact that similar samples are more likely to be classified together by a large number of random classifiers than dissimilar ones.
The resulting similarity measure is compared to DTW on two tasks, namely nearest-neighbor retrieval and clustering, showing that the generalization capabilities of probabilistic machine learning significantly benefit audio word comparison and overcome many of the limitations of DTW-based comparison.
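A minimal sketch of the core idea, with a generic scikit-learn classifier standing in for the paper's HMMs over variable-length audio; the parameter values are illustrative assumptions:

import numpy as np
from sklearn.linear_model import LogisticRegression

def iterative_similarity(X, n_rounds=50, n_classes=8, seed=0):
    """X: (n_samples, n_features). Returns an (n, n) similarity matrix:
    the fraction of rounds in which two samples land in the same
    predicted class, across classifiers trained on *random* labelings."""
    rng = np.random.default_rng(seed)
    n = len(X)
    co_assign = np.zeros((n, n))
    for _ in range(n_rounds):
        y = rng.integers(0, n_classes, size=n)        # random labeling
        pred = LogisticRegression(max_iter=200).fit(X, y).predict(X)
        # Similar samples tend to be pulled into the same class even
        # under random supervision; count those co-assignments.
        co_assign += (pred[:, None] == pred[None, :])
    return co_assign / n_rounds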
|
2015 |
 |
Amélie Royer, Christoph H. Lampert. "Classifier Adaptation at Prediction Time",
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015.
[Bibtex] [Abstract] [PDF]
|
@InProceedings{Royer_2015_CVPR,
author = {Royer, Am\'{e}lie and Lampert, Christoph H.},
title = {Classifier Adaptation at Prediction Time},
booktitle = {Conference on Computer Vision and Pattern Recognition (CVPR)},
year = {2015}
}
Classifiers for object categorization are usually evaluated by their accuracy on a set of i.i.d. test examples. This provides us with an estimate of the expected error when applying the classifiers to a single new image. In real applications, however, classifiers are rarely used for only a single image and then discarded. Instead, they are applied sequentially to many images, and these are typically not i.i.d. samples from a fixed data distribution: they carry dependencies, and their class distribution varies over time.
In this work, we argue that the phenomenon of correlated data at prediction time is not a nuisance but a blessing in disguise. We describe a probabilistic method for adapting classifiers at prediction time without having to retrain them. We also introduce a framework for creating realistically distributed image sequences, which offers a way to benchmark classifier adaptation methods, such as the one we propose.
Experiments on the ILSVRC2010 and ILSVRC2012 datasets show that adapting object classification systems at prediction time can significantly reduce their error rate, even with additional human feedback.
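As a concrete, simplified example of prediction-time adaptation, one can re-weight a fixed classifier's posteriors by a running estimate of the test-time class distribution. The sketch below shows this generic prior-adaptation scheme under assumed interfaces; it illustrates the idea, not the paper's exact probabilistic model:

import numpy as np

def adapt_sequence(posteriors, train_prior, lr=0.05):
    """posteriors: (T, C) class probabilities from a fixed classifier on a
    stream of T images; train_prior: (C,) class prior at training time."""
    q = train_prior.copy()             # running test-time prior estimate
    adapted = []
    for p in posteriors:
        p_adj = p * q / train_prior    # Bayes re-weighting of the posterior
        p_adj /= p_adj.sum()
        adapted.append(p_adj)
        q = (1 - lr) * q + lr * p_adj  # update the prior from the prediction
    return np.array(adapted)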
|