Publications by authors named "John R Hershey"

Deep clustering is the first method to handle general audio separation scenarios with multiple sources of the same type and an arbitrary number of sources, performing impressively in speaker-independent speech separation tasks. However, little is known about its effectiveness in other challenging situations such as music source separation. Contrary to conventional networks that directly estimate the source signals, deep clustering generates an embedding for each time-frequency bin, and separates sources by clustering the bins in the embedding space.

View Article and Find Full Text PDF

We present a generative model and inference algorithm for 3D nonrigid object tracking. The model, which we call G-flow, enables the joint inference of 3D position, orientation, and nonrigid deformations, as well as object texture and background texture. Optimal inference under G-flow reduces to a conditionally Gaussian stochastic filtering problem.

View Article and Find Full Text PDF