Kuldeep Kulkarni

I am a research scientist at Adobe Research, Bengaluru, India. Before that, I did a post-doc stint at Carnegie Mellon University, where I worked with Aswin Sankaranarayanan. I received my PhD in Electrical Engineering from Arizona State University under the supervision of Pavan Turaga. Prior to that, I received my undergraduate degree in Electrical Engineering from the National Institute of Technology Karnataka, Surathkal, India, in 2009. I hail from Ilkal, a tiny little town in India, where I spent 18 wonderful years.

Email  /  Resume  /  LinkedIn


My broad research interests are in the areas of computer vision, deep learning and compressive sensing. Below are some selected projects I have worked on. For a more comprehensive set of publications, please refer to my resume.

ReconNet: Non-Iterative Reconstruction of Images from Compressively Sensed Measurements
Kuldeep Kulkarni, Suhas Lohit, Pavan Turaga, Ronan Kerviche, Amit Ashok
The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016
Project Page (Code available now).

This work was covered by 'Nuit Blanche'

The goal of this paper is to present a non-iterative and, more importantly, extremely fast algorithm to reconstruct images from compressively sensed (CS) random measurements. To this end, we propose a novel convolutional neural network (CNN) architecture which takes in CS measurements of an image as input and outputs an intermediate reconstruction. We call this network ReconNet. The intermediate reconstruction is fed into an off-the-shelf denoiser to obtain the final reconstructed image. On a standard dataset of images, we show significant improvements in reconstruction results (both in terms of PSNR and time complexity) over state-of-the-art iterative CS reconstruction algorithms at various measurement rates. Further, through qualitative experiments on real data collected using our block single pixel camera (SPC), we show that our network is highly robust to sensor noise and can recover visually better quality images than competing algorithms at extremely low sensing rates of 0.1 and 0.04. To demonstrate that our algorithm can recover semantically informative images even at a low measurement rate of 0.01, we present a very robust proof-of-concept real-time visual tracking application.
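As a rough illustration of the sensing side only (not the ReconNet architecture itself), the sketch below measures an image block by block with a random Gaussian matrix at the measurement rates quoted above. The 32x32 block size, the 128x128 test image, the Gaussian matrix and the helper names `tile_image` and `blockwise_measure` are all assumptions made for this example.

```python
import numpy as np

def tile_image(image, block):
    """Split `image` into non-overlapping block x block tiles, one flattened tile per row.

    Assumes both image dimensions are multiples of `block`.
    """
    h, w = image.shape
    return (image.reshape(h // block, block, w // block, block)
                 .swapaxes(1, 2)
                 .reshape(-1, block * block))

def blockwise_measure(image, phi, block):
    """Apply the same measurement matrix `phi` to every tile of `image`."""
    return tile_image(image, block) @ phi.T

rng = np.random.default_rng(0)
block = 32                                # hypothetical block size
image = rng.random((128, 128))            # stand-in for a real image
tiles = tile_image(image, block)

shapes = {}
for rate in (0.1, 0.04, 0.01):
    m = round(rate * block * block)       # measurements per block at this rate
    phi = rng.standard_normal((m, block * block)) / np.sqrt(m)
    y = blockwise_measure(image, phi, block)
    shapes[rate] = y.shape                # (number of blocks, measurements per block)
    print(f"rate={rate}: {y.shape[1]} measurements for each of {y.shape[0]} blocks")
```

Each row of `y` is the kind of short measurement vector that a reconstruction network would take as input for one block.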

Fast Integral Image Estimation at 1% measurement rate
Kuldeep Kulkarni, Pavan Turaga
Under revision at IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2016
arXiv:1601.07258 [cs.CV]

We propose a framework called ReFInE to directly obtain integral image estimates from a very small number of spatially multiplexed measurements of the scene without iterative reconstruction of any auxiliary image, and demonstrate their practical utility in visual object tracking. Specifically, we design measurement matrices which are tailored to facilitate extremely fast estimation of the integral image, by using a single-shot linear operation on the measured vector. Leveraging a prior model for the images, we formulate a nuclear norm minimization problem with second order conic constraints to jointly obtain the measurement matrix and the linear operator. Through qualitative and quantitative experiments, we show that high quality integral image estimates can be obtained using our framework at very low measurement rates. Further, on a standard dataset of 50 videos, we present object tracking results which are comparable to the state-of-the-art methods, even at an extremely low measurement rate of 1%.
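To make the terms concrete, here is a minimal sketch of what an integral image is and why it can be estimated with a single linear operation. It is not the ReFInE method: the measurement matrix below is a generic random Gaussian matrix and the linear operator is built from its pseudo-inverse, both stand-ins chosen purely for illustration, whereas the paper designs the two jointly. The image size and all variable names are assumptions.

```python
import numpy as np

def integral_image(img):
    """Cumulative sum over rows and columns, padded with a leading zero row and column."""
    ii = np.cumsum(np.cumsum(img, axis=0), axis=1)
    return np.pad(ii, ((1, 0), (1, 0)))

def box_sum(ii, top, left, bottom, right):
    """Sum of img[top:bottom, left:right] using four lookups in the integral image."""
    return ii[bottom, right] - ii[top, right] - ii[bottom, left] + ii[top, left]

rng = np.random.default_rng(0)
n = 8
img = rng.random((n, n))
ii = integral_image(img)

# The integral image is linear in the image: vec(II) = S @ vec(img),
# where S = kron(T, T) and T is a lower-triangular matrix of ones.
T = np.tril(np.ones((n, n)))
S = np.kron(T, T)
ii_linear = (S @ img.ravel()).reshape(n, n)

# Stand-in for a single-shot estimate from m < n*n measurements y = phi @ vec(img).
m = 16
phi = rng.standard_normal((m, n * n)) / np.sqrt(m)   # generic matrix, not a designed one
y = phi @ img.ravel()
H = S @ np.linalg.pinv(phi)                          # one linear operator applied to y
ii_est = (H @ y).reshape(n, n)                       # crude estimate, no accuracy claim
```

Because `H` is fixed once the measurement matrix is chosen, the estimate costs one matrix-vector product per frame, which is the property the abstract refers to as a single-shot linear operation.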

Reconstruction-free action inference from compressive imagers
Kuldeep Kulkarni, Pavan Turaga
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) (Impact factor: 5.781; accepted on July 31, 2015)
supplement / preprint

Persistent surveillance from camera networks, such as at parking lots, UAVs, etc., often produces large amounts of video data, creating significant challenges for inference in terms of storage, communication and computation. Compressive cameras have emerged as a potential solution to deal with the data deluge issues in such applications. However, inference tasks such as action recognition require high-quality features, which implies reconstructing the original video data. Much work in compressive sensing (CS) theory is geared towards solving the reconstruction problem, where state-of-the-art methods are computationally intensive and provide low-quality results at high compression rates. Thus, reconstruction-free methods for inference are much desired. In this paper, we propose reconstruction-free methods for action recognition from compressive cameras at high compression ratios of 100 and above. Recognizing actions directly from CS measurements requires features which are mostly nonlinear and thus not easily applicable. This leads us to search for properties that are preserved in compressive measurements. To this end, we propose the use of spatio-temporal smashed filters, which are compressive-domain versions of pixel-domain matched filters. We conduct experiments on publicly available databases and show that one can obtain recognition rates that are comparable to the oracle method in the uncompressed setup, even for high compression ratios.

Reconstruction-free Inference on Compressive Measurements
Suhas Lohit, Kuldeep Kulkarni, Pavan Turaga, Jian Wang, Aswin Sankaranarayanan
4th IEEE Workshop on Computational Cameras and Displays (CCD), held in conjunction with IEEE CVPR, June 2015 (Best Paper Award)

Spatial-multiplexing cameras have emerged as a promising alternative to classical imaging devices, often enabling acquisition of "more for less". One popular architecture for spatial multiplexing is the single-pixel camera (SPC), which acquires coded measurements of the scene with pseudorandom spatial masks. Significant theoretical developments over the past few years provide a means for reconstruction of the original imagery from coded measurements at sub-Nyquist sampling rates. Yet, accurate reconstruction generally requires high measurement rates and high signal-to-noise ratios. In this paper, we enquire whether one can solve high-level visual inference problems (e.g., face recognition or action recognition) from compressive cameras without the need for image reconstruction. This is an interesting question since, in many practical scenarios, our goals extend beyond image reconstruction. However, most inference tasks require non-linear features, and it is not clear how to extract such features directly from compressed measurements. In this paper, we show that one can extract nontrivial correlational features directly without reconstruction of the imagery. As a specific example, we consider the problem of face recognition beyond the visible spectrum, e.g., in the short-wave infra-red region (SWIR), where pixels are expensive. We base our framework on smashed filters, which suggest that inner products between high-dimensional signals can be computed in the compressive domain to a high degree of accuracy. We collect a new face image dataset of 30 subjects, obtained using an SPC. Using face recognition as an example, we show that one can indeed perform reconstruction-free inference with a very small loss of accuracy at very high compression ratios of 100 and more.
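The inner-product property behind smashed filters can be sketched in a few lines. This is a toy illustration under stated assumptions, not the pipeline from the paper: the templates and test signal are synthetic Gaussian vectors, the measurement matrix is a scaled Gaussian matrix, and the compression ratio of 16 is far milder than the ratios reported above.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 4096, 256                      # signal length and number of measurements

# Hypothetical templates, and a noisy test signal drawn near template 2.
templates = rng.standard_normal((5, n))
signal = templates[2] + 0.1 * rng.standard_normal(n)

# Gaussian measurement matrix scaled so that inner products are preserved in expectation.
phi = rng.standard_normal((m, n)) / np.sqrt(m)

y = phi @ signal                      # compressive measurements of the test signal
smashed = templates @ phi.T           # templates projected into the compressive domain

pixel_scores = templates @ signal     # matched filter: correlations in the pixel domain
compressed_scores = smashed @ y       # smashed filter: correlations in the compressive domain

pixel_match = int(np.argmax(pixel_scores))
compressed_match = int(np.argmax(compressed_scores))
print(pixel_match, compressed_match)
```

Both scores are plain correlations; the only difference is that the second set is computed from `m` measurements instead of `n` pixels, with no reconstruction step in between.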

Recurrence Textures for Activity Recognition from compressive cameras
Kuldeep Kulkarni, Pavan Turaga
IEEE International Conference on Image Processing, 2012

Recent advances in camera architectures and associated mathematical representations now enable compressive acquisition of images and videos at low data rates. In such a setting, we consider the problem of human activity recognition, which is an important inference problem in many security and surveillance applications. We propose a framework for understanding human activities as a non-linear dynamical system, and propose a robust, generalizable feature that can be extracted directly from the compressed measurements without reconstructing the original video frames. The proposed feature is termed recurrence texture and is motivated by recurrence analysis of non-linear dynamical systems. We show that it is possible to obtain discriminative features directly from the compressed stream and demonstrate their utility in recognition of activities at very low data rates.
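For readers unfamiliar with recurrence analysis, the sketch below builds a plain recurrence matrix from compressed frames of a synthetic periodic sequence. It only illustrates the generic recurrence-plot idea; the actual recurrence texture feature is defined in the paper and is not reproduced here. The synthetic video, the Gaussian measurement matrix, the threshold and the function name `recurrence_matrix` are assumptions.

```python
import numpy as np

def recurrence_matrix(frames, threshold):
    """R[i, j] = 1 when frames i and j are closer than `threshold` in Euclidean distance."""
    diffs = frames[:, None, :] - frames[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    return (dists < threshold).astype(np.uint8)

rng = np.random.default_rng(0)

# Hypothetical periodic "video": 40 frames of 1024 pixels that repeat every 10 frames.
t = np.arange(40)
basis = rng.standard_normal((2, 1024))
video = (np.outer(np.cos(2 * np.pi * t / 10), basis[0])
         + np.outer(np.sin(2 * np.pi * t / 10), basis[1]))

# Compress every frame with the same random matrix (64 measurements per frame).
phi = rng.standard_normal((64, 1024)) / np.sqrt(64)
compressed = video @ phi.T

# The recurrence structure is computed from the compressed frames only.
R = recurrence_matrix(compressed, threshold=1.0)
```

In this toy case the matrix has ones exactly where two frames are a whole number of periods apart, so the periodicity of the motion shows up as diagonal stripes even though no frame is ever reconstructed.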

Course Projects

What makes Federer look so elegant?
Kuldeep Kulkarni, Vinay Venkataraman, December 2013
