Where: EGR 0108

Speaker: Wojtek Czaja (UMD (MATH)) -

Where: EGR 0108

Speaker: Aquia Richburg (UMD) -

Where: EGR 0108

Speaker: Dan Elton (UMD (ME)) -

Where: EGR 0108

Speaker: Ilya Kavalerov (UMD (ECE)) -

Where: EGR 0108

Speaker: Zeyad Emam (UMD (AMSC)) -

Abstract: I will present the paper by the same title by Ghiasi et al.

Where: EGR 0108

Speaker: Tom Goldstein (UMD (CS)) -

Where: EGR 0108

Speaker: Prof Furong Huang (UMD (CS)) - https://sites.google.com/site/furongfionahuang/

Where: EGR 0108

Speaker: Michael Pekala -

Where: EGR 0108

Speaker: Matthew Guay (NIH) -

Abstract: In this talk I will define conditional random fields for image processing, and describe how they may be approximated with recurrent neural networks for easy incorporation into deep neural network models.
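The approximation described above unrolls mean-field inference as a fixed sequence of identical updates, which is what lets it be treated as a recurrent network. A minimal sketch of that idea, with illustrative names (`kernel`, `compat`) and a simplified dense pairwise term standing in for the Gaussian filtering used in practice:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def meanfield_crf(unary, kernel, compat, n_iters=5):
    """Sketch of mean-field CRF inference unrolled like an RNN.

    unary  : (n_pixels, n_labels) unary logits from a base network
    kernel : (n_pixels, n_pixels) pairwise affinity between pixels
    compat : (n_labels, n_labels) label compatibility penalty
    Each iteration filters the current marginals, applies the
    compatibility transform, and renormalizes -- the same fixed
    update an unrolled recurrent network would apply at each step.
    """
    q = softmax(unary)
    for _ in range(n_iters):
        msg = kernel @ q          # message passing / filtering step
        pairwise = msg @ compat   # compatibility transform
        q = softmax(unary - pairwise)
    return q
```

Because every iteration applies the same parameters, the whole loop can be trained end-to-end with the rest of a deep network by backpropagating through the unrolled steps.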

Where: EGR 0108

Speaker: Lechao Xiao (Google Brain) - https://ai.google/research/people/105681

Abstract: In recent years, state-of-the-art methods in computer vision have utilized increasingly deep convolutional neural network architectures (CNNs), with some of the most successful models employing hundreds or even thousands of layers. A variety of pathologies such as vanishing/exploding gradients make training such deep networks challenging. While residual connections and batch normalization do enable training at these depths, it has remained unclear whether such specialized architecture designs are truly necessary to train deep CNNs. In this talk, we demonstrate that it is possible to train vanilla CNNs with ten thousand layers or more simply by using an appropriate initialization scheme. We derive this initialization scheme theoretically by developing a mean field theory for signal propagation and by characterizing the conditions for dynamical isometry, the equilibration of singular values of the input-output Jacobian matrix. These conditions require that the convolution operator be an orthogonal transformation in the sense that it is norm-preserving. We present an algorithm for generating such random initial orthogonal convolution kernels and demonstrate empirically that they enable efficient training of extremely deep architectures.
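One way to realize a norm-preserving convolution at initialization is a "delta-orthogonal" kernel: a random orthogonal matrix at the spatial center of the kernel and zeros elsewhere. The sketch below is a simplified illustration of that construction, not necessarily the exact algorithm from the talk:

```python
import numpy as np

def delta_orthogonal_kernel(k, c_out, c_in, seed=0):
    """Sketch of a delta-orthogonal conv initializer.

    The k x k kernel is zero everywhere except the spatial center,
    which holds a random (semi-)orthogonal c_in x c_out matrix, so the
    convolution acts as a norm-preserving map at initialization.
    """
    rng = np.random.default_rng(seed)
    n = max(c_out, c_in)
    # Random orthogonal matrix via QR decomposition of a Gaussian matrix.
    a = rng.standard_normal((n, n))
    q, r = np.linalg.qr(a)
    q = q * np.sign(np.diag(r))  # sign fix for a uniform (Haar) sample
    w = np.zeros((k, k, c_in, c_out))
    w[k // 2, k // 2] = q[:c_in, :c_out]
    return w
```

Under this scheme the input-output Jacobian has well-conditioned singular values (the dynamical isometry condition mentioned above), which is what makes very deep vanilla CNNs trainable without residual connections or batch normalization.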

Where: EGR 0108

Speaker: Liam Fowl (UMD (Math)) -

Abstract: I will present the paper "Dropout as a Low-Rank Regularizer for Matrix-Factorization" by Vidal et al.

Where: EGR 0108

Speaker: Micah Goldblum (UMD) -

Where: Kirwan Hall 3206

Speaker: Valeriia Cherepanova (UMD) -

Abstract: I will present a recent paper with the same title out of Google DeepMind.

Where: Kirwan Hall 3206

Speaker: Ankan Bansal (UMD) - http://ankan.umiacs.io/

Abstract: Scene understanding is a high-level vision task which involves not just localizing and recognizing objects and people but also inferring their layouts and interactions with each other. However, current systems for even atomic tasks like object detection suffer from several shortcomings. Most object detectors can only detect a limited number of object categories; and automated systems for detecting interactions between humans and objects perform very poorly. We hypothesize that scene understanding can be improved by using additional semantic data from outside sources.

Given that it is nearly impossible to collect labelled training data for thousands of object categories, we introduce the problem of zero-shot object detection (ZSD). We first present an approach for ZSD using semantic information encoded in word-vectors which are trained on a large text corpus. One of the most important challenges associated with ZSD is the definition of a “background” class. It is easy to define a “background” class in fully-supervised settings; however, it is not clear what constitutes a “background” class in ZSD. We present principled approaches for dealing with this challenge. We evaluate our approaches on challenging sets of object classes, not restricting ourselves to similar and/or fine-grained categories as in prior works on zero-shot classification.

Next, we tackle the problem of detecting human-object interactions (HOIs). Here, again, it is impossible to collect labelled data for each type of possible interaction. We show that solutions for HOI detection can greatly benefit from semantic information. We present two approaches for solving this problem. In the first approach, we exploit functional similarities between objects to share knowledge between models for different classes. The main idea is that humans look similar while interacting with functionally similar objects. We show that, using this idea, even a simple model can achieve state-of-the-art results for HOI detection both in the supervised and zero-shot settings. Our second model uses semantic information in the form of spatial layout of a person and an object to detect their interactions. This model contains a layout module which primes the visual module to make the final prediction.
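The core zero-shot mechanism described above can be sketched as scoring each region proposal against class word vectors in a shared semantic space. The function and parameter names below (`zero_shot_scores`, `proj`) are illustrative assumptions, not the speaker's actual implementation:

```python
import numpy as np

def zero_shot_scores(region_feats, word_vecs, proj):
    """Sketch of word-vector scoring for zero-shot detection.

    region_feats : (n_regions, d_vis) visual features of region proposals
    word_vecs    : (n_classes, d_sem) word embeddings of class names
    proj         : (d_vis, d_sem) learned projection into semantic space
    Each region is scored against every class -- including classes unseen
    during training -- by cosine similarity between its projected feature
    and the class word vector.
    """
    z = region_feats @ proj
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    w = word_vecs / np.linalg.norm(word_vecs, axis=1, keepdims=True)
    return z @ w.T  # (n_regions, n_classes) cosine similarities
```

Because the class side is represented by word vectors rather than learned per-class weights, new categories can be scored at test time simply by supplying their embeddings.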

Where: Kirwan Hall 3206

Speaker: Yogesh Balaji (UMD) - http://www.cs.umd.edu/~yogesh/

Abstract: Generative Adversarial Networks (GANs) are a popular class of deep generative models that have achieved impressive results in applications such as image generation, speech synthesis, etc. While GANs provide a framework for sampling data from a distribution, they preclude computation of sample likelihoods, which limits their use in statistical inference problems. In this talk, I will discuss our recent results on resolving this issue by constructing an explicit probability model for GANs that can be used for computing sample likelihoods. In particular, we prove that under this probability model, a family of Wasserstein GANs maximize a variational lower-bound on average sample log likelihoods. Using these results, I will show how sample likelihoods can be estimated for datasets such as MNIST, SVHN, CIFAR-10, etc.
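For reference, the standard Wasserstein GAN critic objective that the talk's result reinterprets is shown below; this is only the textbook objective (with the Lipschitz constraint omitted), not the speaker's probability-model construction:

```python
import numpy as np

def wgan_critic_objective(critic, real, fake):
    """Sketch of the Wasserstein GAN critic objective.

    The critic f is trained to maximize E[f(real)] - E[f(fake)], a
    dual estimate of the Wasserstein-1 distance between the data and
    generator distributions (subject to a Lipschitz constraint on f,
    not enforced in this sketch).
    """
    return critic(real).mean() - critic(fake).mean()
```

The talk's contribution is to show that, under an explicit probability model, critics trained this way yield a variational lower bound on average sample log-likelihoods.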

Where: Kirwan Hall 3206

Speaker: Dong Dong (UMD (Math)) -

Abstract: First we will introduce Laplacian eigenmaps, which is an example of a nonlinear dimension reduction technique. Then we will talk about an ongoing effort to extend this method to allow more flexibility for applications.
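The classical Laplacian eigenmaps construction referenced above can be sketched in a few lines: build a Gaussian-affinity graph over the data, form the graph Laplacian, and embed each point using the eigenvectors with the smallest nonzero eigenvalues. This is a minimal unnormalized variant for illustration; practical versions typically use k-nearest-neighbor graphs and a normalized Laplacian:

```python
import numpy as np

def laplacian_eigenmaps(X, n_components=2, sigma=1.0):
    """Sketch of Laplacian eigenmaps for nonlinear dimension reduction.

    X : (n_points, d) data matrix. Builds Gaussian affinities W, forms
    the graph Laplacian L = D - W, and returns the eigenvectors of L
    with the smallest nonzero eigenvalues as the low-dimensional
    embedding.
    """
    # Pairwise squared distances and Gaussian affinities.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    D = np.diag(W.sum(axis=1))
    L = D - W
    vals, vecs = np.linalg.eigh(L)
    # Skip the constant eigenvector (eigenvalue ~ 0).
    return vecs[:, 1:1 + n_components]
```

The embedding minimizes a weighted sum of squared distances between neighboring points, so points that are close on the data graph stay close in the reduced space.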

Where: Kirwan Hall 3206

Speaker: Yiran Li (UMN) -

Abstract: It is known that deep neural networks can be used as feature extractors to approximate functions, and there have been studies of the relationship between the approximation error rate and the structure and size of the network. We propose a novel network structure based on Gabor frames, and prove its approximation error rate for certain types of smooth functions as a function of its structure and size. Our work also builds a bridge between theory and application in that it is implementable with existing tools such as TensorFlow. We also present implementation details of the Gabor network, along with a discussion of its potential applications as a function approximator.
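A Gabor frame is built from translates and modulations of a window function (typically a Gaussian), and a first network layer based on one computes inner products of the input signal with these atoms. The sketch below uses real-valued atoms and illustrative function names; it is a generic illustration of the building block, not the network proposed in the talk:

```python
import numpy as np

def gabor_atom(t, a, b, scale=1.0):
    """Sketch of one real Gabor atom: a translated, modulated Gaussian.

    g_{a,b}(t) = exp(-(t - b)^2 / (2 * scale^2)) * cos(a * (t - b)),
    where b is the translation and a the modulation frequency.
    """
    return np.exp(-((t - b) ** 2) / (2 * scale ** 2)) * np.cos(a * (t - b))

def gabor_layer(signal, t, freqs, shifts, scale=1.0):
    """First layer of a Gabor-style network: inner products of the
    sampled signal with each atom on a grid of (frequency, shift) pairs."""
    atoms = np.stack([gabor_atom(t, a, b, scale)
                      for a in freqs for b in shifts])
    return atoms @ signal  # one coefficient per (frequency, shift) pair
```

Since the layer is just a fixed (or structured-trainable) linear map followed by a nonlinearity, it drops directly into standard frameworks such as TensorFlow, which is the implementability point made in the abstract.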