RIT on Deep Learning Archives for Fall 2018 to Spring 2019

Organizational Meeting

When: Fri, September 7, 2018 - 12:00pm
Where: EGR0108
Speaker: Wojtek Czaja (UMD (MATH)) -

Intro to Deep Learning (Part 1)

When: Fri, September 14, 2018 - 12:00pm
Where: EGR 0108
Speaker: Aquia Richburg (UMD) -

Intro to Deep Learning (Part 2)

When: Fri, September 21, 2018 - 12:00pm
Where: EGR 0108
Speaker: Dan Elton (UMD (ME)) -

GANs, Wavelets and Fourier Scattering Networks, and Generative Scattering Networks

When: Fri, September 28, 2018 - 12:00pm
Where: EGR0108
Speaker: Ilya Kavalerov (UMD (ECE)) -

Laplacian Pyramid Reconstruction and Refinement for Semantic Segmentation

When: Fri, October 5, 2018 - 12:00pm
Where: EGR 0108
Speaker: Zeyad Emam (UMD (AMSC)) -
Abstract: I will present the paper by the same title by Ghiasi et al.


When: Fri, October 12, 2018 - 12:00pm
Where: EGR0108
Speaker: Tom Goldstein (UMD (CS)) -


When: Fri, October 19, 2018 - 12:00pm
Where: EGR0108
Speaker: Prof Furong Huang (UMD (CS)) - https://sites.google.com/site/furongfionahuang/

PhD Final Oral Exam: Harmonic Analysis and Machine Learning

When: Fri, October 26, 2018 - 12:00pm
Where: EGR0108
Speaker: Michael (Pekala) -

Conditional random fields as recurrent neural networks

When: Fri, November 2, 2018 - 12:00pm
Where: EGR0108
Speaker: Matthew Guay (NIH) -
Abstract: In this talk I will define conditional random fields for image processing, and describe how they may be approximated with recurrent neural networks for easy incorporation into deep neural network models.
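To illustrate the idea in the abstract, here is a minimal sketch of mean-field inference for a conditional random field, written so that each update is one step of a recurrent loop. It makes simplifying assumptions that are not from the talk: a 4-connected pixel grid, a Potts-style pairwise term with a single hypothetical `weight` parameter, and NumPy instead of a deep learning framework. The function names are illustrative only.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def mean_field_step(q, unary_logits, weight=1.0):
    """One mean-field update on a 4-connected grid with a Potts pairwise term.

    q and unary_logits have shape (height, width, n_labels).
    """
    # Message passing: sum the beliefs of the four neighbours (zero beyond the border).
    padded = np.pad(q, ((1, 1), (1, 1), (0, 0)))
    messages = (padded[:-2, 1:-1] + padded[2:, 1:-1]
                + padded[1:-1, :-2] + padded[1:-1, 2:])
    # Potts compatibility: neighbours that favour a label raise that label's score.
    return softmax(unary_logits + weight * messages)

def crf_mean_field(unary_logits, n_iters=5, weight=1.0):
    """Unrolled mean-field inference: the same step applied n_iters times."""
    q = softmax(unary_logits)
    for _ in range(n_iters):
        q = mean_field_step(q, unary_logits, weight)
    return q

# Toy example: left half is label 0, right half is label 1, one noisy pixel.
truth = np.zeros((8, 8), dtype=int)
truth[:, 4:] = 1
unary_logits = 2.0 * np.eye(2)[truth]      # logit 2.0 for the true label, 0.0 otherwise
unary_logits[3, 1] = [0.0, 1.0]            # noisy unary that weakly prefers the wrong label

q = crf_mean_field(unary_logits)
labels = q.argmax(axis=-1)
```

Because every iteration reuses the same parameters, the loop has the shape of a recurrent network whose hidden state is the belief map `q`. In this toy example the pairwise term is intended to pull the noisy pixel back to the label of its neighbours.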

Training 10k-layer CNNs with mean field theory and dynamical isometry [joint w/ Norbert Wiener Center Seminar]

When: Fri, November 16, 2018 - 12:00pm
Where: EGR0108
Speaker: Lechao Xiao (Google Brain) - https://ai.google/research/people/105681
Abstract: In recent years, state-of-the-art methods in computer vision have utilized increasingly deep convolutional neural network architectures (CNNs), with some of the most successful models employing hundreds or even thousands of layers. A variety of pathologies such as vanishing/exploding gradients make training such deep networks challenging. While residual connections and batch normalization do enable training at these depths, it has remained unclear whether such specialized architecture designs are truly necessary to train deep CNNs. In this talk, we demonstrate that it is possible to train vanilla CNNs with ten thousand layers or more simply by using an appropriate initialization scheme. We derive this initialization scheme theoretically by developing a mean field theory for signal propagation and by characterizing the conditions for dynamical isometry, the equilibration of singular values of the input-output Jacobian matrix. These conditions require that the convolution operator be an orthogonal transformation in the sense that it is norm-preserving. We present an algorithm for generating such random initial orthogonal convolution kernels and demonstrate empirically that they enable efficient training of extremely deep architectures.
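The role of orthogonality in dynamical isometry can be seen in a much simpler setting than the one in the talk. The sketch below is not the speaker's algorithm for orthogonal convolution kernels; it assumes a deep fully connected linear network (no convolutions, no nonlinearities) and compares the singular values of the input-output Jacobian under orthogonal and scaled Gaussian initialization. The width, depth, and function names are arbitrary choices for illustration.

```python
import numpy as np

def orthogonal_matrix(n, rng):
    """Sample an n x n orthogonal matrix from the QR factorization of a Gaussian matrix."""
    q, r = np.linalg.qr(rng.standard_normal((n, n)))
    # Flip column signs so that the sample is uniformly (Haar) distributed.
    return q * np.sign(np.diag(r))

def jacobian_singular_values(weights):
    """Singular values of the input-output Jacobian of a deep linear network.

    For a linear network the Jacobian is simply the product of the weight matrices.
    """
    jacobian = np.eye(weights[0].shape[1])
    for w in weights:
        jacobian = w @ jacobian
    return np.linalg.svd(jacobian, compute_uv=False)

rng = np.random.default_rng(0)
width, depth = 32, 50

orthogonal_init = [orthogonal_matrix(width, rng) for _ in range(depth)]
gaussian_init = [rng.standard_normal((width, width)) / np.sqrt(width)
                 for _ in range(depth)]

sv_orthogonal = jacobian_singular_values(orthogonal_init)
sv_gaussian = jacobian_singular_values(gaussian_init)

print("orthogonal: min %.3g, max %.3g" % (sv_orthogonal.min(), sv_orthogonal.max()))
print("gaussian:   min %.3g, max %.3g" % (sv_gaussian.min(), sv_gaussian.max()))
```

With orthogonal weights every layer preserves norms, so all singular values of the product stay at 1. With Gaussian weights of matching variance the singular values spread over many orders of magnitude as depth grows, which is the kind of ill-conditioning the abstract associates with training difficulty.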

A Look Into the Effects of Dropout as a Regularizer

When: Fri, November 30, 2018 - 12:00pm
Where: EGR0108
Speaker: Liam Fowl (UMD (Math)) -
Abstract: I will present the paper "Dropout as a Low-Rank Regularizer for Matrix-Factorization" by Vidal et al.

Lifelong Learning with Dynamically Expandable Networks

When: Fri, December 7, 2018 - 12:00pm
Where: EGR0108
Speaker: Micah Goldblum (UMD) -

On the importance of single directions for generalization

When: Fri, March 29, 2019 - 12:00pm
Where: Kirwan Hall 3206
Speaker: Valeriia Cherepanova (UMD) -
Abstract: I will present a recent paper with the same title out of Google DeepMind.

Semantic Knowledge for Scene Understanding

When: Fri, April 12, 2019 - 12:00pm
Where: Kirwan Hall 3206
Speaker: Ankan Bansal (UMD) - http://ankan.umiacs.io/
Abstract: Scene understanding is a high-level vision task which involves not just localizing and recognizing objects and people but also inferring their layouts and interactions with each other. However, current systems for even atomic tasks like object detection suffer from several shortcomings. Most object detectors can only detect a limited number of object categories, and automated systems for detecting interactions between humans and objects perform very poorly. We hypothesize that scene understanding can be improved by using additional semantic data from outside sources.
Given that it is nearly impossible to collect labelled training data for thousands of object categories, we introduce the problem of zero-shot object detection (ZSD). We first present an approach for ZSD using semantic information encoded in word-vectors which are trained on a large text corpus. One of the most important challenges associated with ZSD is the definition of a “background” class. It is easy to define a “background” class in fully-supervised settings. However, it’s not clear what constitutes a “background” in ZSD. We present principled approaches for dealing with this challenge. We evaluate our approaches on challenging sets of object classes, not restricting ourselves to similar and/or fine-grained categories as in prior works on zero-shot classification.
Next, we tackle the problem of detecting human-object interactions (HOIs). Here, again, it is impossible to collect labelled data for each type of possible interaction. We show that solutions for HOI detection can greatly benefit from semantic information. We present two approaches for solving this problem. In the first approach, we exploit functional similarities between objects to share knowledge between models for different classes. The main idea is that humans look similar while interacting with functionally similar objects. We show that, using this idea, even a simple model can achieve state-of-the-art results for HOI detection both in the supervised and zero-shot settings. Our second model uses semantic information in the form of spatial layout of a person and an object to detect their interactions. This model contains a layout module which primes the visual module to make the final prediction.

EntropicGANs meet VAEs: A statistical approach to compute sample likelihoods in GANs

When: Fri, April 26, 2019 - 12:00pm
Where: Kirwan Hall 3206
Speaker: Yogesh Balaji (UMD) - http://www.cs.umd.edu/~yogesh/
Abstract: Generative Adversarial Networks (GANs) are a popular class of deep generative models that have achieved impressive results in applications such as image generation, speech synthesis, etc. While GANs provide a framework for sampling data from a distribution, they do not permit computation of sample likelihoods, and this limits their use in statistical inference problems. In this talk, I will discuss our recent results on resolving this issue by constructing an explicit probability model for GANs that can be used for computing sample likelihoods. In particular, we prove that under this probability model, a family of Wasserstein GANs maximizes a variational lower bound on average sample log-likelihoods. Using these results, I will show how sample likelihoods can be estimated for datasets such as MNIST, SVHN, CIFAR-10, etc.

On variants of Laplacian eigenmaps

When: Fri, May 3, 2019 - 12:00pm
Where: Kirwan Hall 3206
Speaker: Dong Dong (UMD (Math)) -
Abstract: First we will introduce Laplacian eigenmaps, which is an example of a nonlinear dimension reduction technique. Then we will talk about an ongoing effort to extend this method to allow more flexibility for applications.
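For readers unfamiliar with the baseline method, the following is a minimal sketch of standard Laplacian eigenmaps, not of the variants discussed in the talk. It assumes a fully connected graph with Gaussian affinities and a hypothetical bandwidth parameter `sigma`; the helix data set and the function names are illustrative choices.

```python
import numpy as np

def gaussian_affinity(points, sigma=1.0):
    """Dense affinity matrix W_ij = exp(-|x_i - x_j|^2 / (2 sigma^2)), zero diagonal."""
    sq_dists = np.sum((points[:, None, :] - points[None, :, :]) ** 2, axis=-1)
    weights = np.exp(-sq_dists / (2.0 * sigma ** 2))
    np.fill_diagonal(weights, 0.0)
    return weights

def laplacian_eigenmaps(points, n_components=2, sigma=1.0):
    """Embed points using the generalized eigenproblem (D - W) y = lambda D y."""
    weights = gaussian_affinity(points, sigma)
    degree = weights.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(degree)
    # Symmetric normalized Laplacian: I - D^{-1/2} W D^{-1/2}.
    l_sym = np.eye(len(points)) - d_inv_sqrt[:, None] * weights * d_inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(l_sym)   # eigenvalues in ascending order
    # Skip the trivial eigenvector (eigenvalue 0) and map back: y = D^{-1/2} v.
    keep = slice(1, n_components + 1)
    return eigvals[keep], d_inv_sqrt[:, None] * eigvecs[:, keep]

# Example: 200 points on a helix in 3D, embedded into 2 dimensions.
t = np.linspace(0.0, 4.0 * np.pi, 200)
helix = np.column_stack([np.cos(t), np.sin(t), 0.3 * t])
eigenvalues, embedding = laplacian_eigenmaps(helix, n_components=2, sigma=0.5)
```

The symmetric normalization is used only so that a standard symmetric eigensolver applies; rescaling its eigenvectors by D^{-1/2} recovers solutions of the generalized problem. The choice of graph and weights is one natural place where a variant of the method could differ.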

Gabor Networks with Proven Approximation Properties

When: Fri, May 10, 2019 - 12:00pm
Where: Kirwan Hall 3206
Speaker: Yiran LI (UMN) -
Abstract: It is known that deep neural networks can be used as feature extractors to approximate functions, and there have been studies of the relationship between the approximation error rate and the structure and size of the network. We propose a novel network structure based on Gabor frames and prove its approximation error rate for certain types of smooth functions as a function of its structure and size. Our work also builds a bridge between theory and application in that it is implementable with existing tools such as TensorFlow. We also present implementation details of the Gabor network, along with a discussion of its potential applications as a function approximator.