
NIPS 2015 Workshop book

Workshop organizers make last-minute changes to their schedule. Download this document again to get the latest changes, or use the NIPS mobile application.

Generated Fri Feb 01, 2019

Schedule Highlights

Dec. 11, 2015

510 ac, Optimization for Machine Learning (OPT2015) - Sra, Agarwal, Bottou, J. Reddi
510 bd, Machine Learning For Healthcare (MLHC) - Karaletsos, Ranganath, Saria, Sontag
511 a, ABC in Montreal - Meeds, Gutmann, Prangle, Marin, Everitt
511 b, Machine Learning for Spoken Language Understanding and Interactions - Celikyilmaz, Gasic, Hakkani-Tur
511 c, Nonparametric Methods for Large Scale Representation Learning - Wilson, Smola, Xing
511 d, Learning Faster from Easy Data II - van Erven, Koolen
511 e, Modelling and inference for dynamics on complex interaction networks: joining up machine learning and statistical physics - Opper, Roudi, Sollich
511 f, Statistical Methods for Understanding Neural Systems - Fletcher, Macke, Adams, Sohl-Dickstein
512 a, Probabilistic Integration - Osborne, Hennig
512 bf, Bounded Optimality and Rational Metareasoning - Gershman, Lieder, Griffiths, Goodman
512 cg, Cognitive Computation: Integrating neural and symbolic approaches - Garcez, Besold, Miikkulainen, Marcus
512 dh, Multimodal Machine Learning - Morency, Baltrusaitis, Courville, Cho
512 e, Machine Learning for (e-)Commerce - Arcaute, Ghavamzadeh, Mannor, Theocharous
513 ab, Advances in Approximate Bayesian Inference - Tran, Broderick, Mandt, McInerney, Mohamed, Kucukelbir, Hoffman, Lawrence, Blei
513 cd, Deep Reinforcement Learning - Abbeel, Schulman, Singh, Silver
513 ef, The 1st International Workshop "Feature Extraction: Modern Questions and Challenges" - Storcheus, Kumar, Rostamizadeh
514 a, Adaptive Data Analysis - Smith, Roth, Feldman, Hardt
514 bc, Time Series Workshop - Anava, Khaleghi, Kuznetsov, Rakhlin
515 a, Machine Learning and Interpretation in Neuroimaging (day 1) - Rish, Wehbe, Murphy, Langs, Cecchi, Grosse-Wentrup
515 bc, Applying (machine) Learning to Experimental Physics (ALEPH) and «Flavours of Physics» challenge - Serdyukov, Ustyuzhanin, Chrząszcz, Dettori, Bettler

Dec. 12, 2015

510 ac, Reasoning, Attention, Memory (RAM) Workshop - Weston, Chopra, Bordes
510 bd, Machine Learning in Computational Biology - Fusi, Goldenberg, Mostafavi, Quon, Stegle
511 a, Learning, Inference and Control of Multi-Agent Systems - Gómez, Neumann, Yedidia, Stone
511 b, Bayesian Optimization: Scalability and Flexibility - Shahriari, Adams, de Freitas, Shah, Calandra
511 c, Multiresolution methods for large-scale learning - Dhillon, Kondor, Nowak, O'Neil, Teneva
511 d, Machine Learning Systems - Beutel, Chen, Singh, Angelino, Weimer, Gonzalez
511 e, BigNeuro 2015: Making sense of big neural data - Dyer, Vogelstein, Koerding, Freeman, Tolias
511 f, Extreme Classification 2015: Multi-class and Multi-label Learning in Extremely Large Label Spaces - Varma, Cisse
512 a, Quantum Machine Learning - Wiebe, Lloyd
512 bf, Networks in the Social and Information Sciences - Airoldi, Choi, Clauset, Ugander, Toulis
512 cg, Cognitive Computation: Integrating neural and symbolic approaches (day 2) - Garcez, Besold, Miikkulainen, Marcus
512 dh, Learning and privacy with incomplete data and weak supervision - Patrini, Jebara, Nock, Kotzias, Yu
512 e, Challenges in Machine Learning (CiML 2015): "Open Innovation" and "Coopetitions" - Guyon, Viegas, Hamner, Kégl
513 ab, Scalable Monte Carlo Methods for Bayesian Analysis of Big Data - Shahbaba, Teh, Welling, Doucet, Andrieu, Vollmer, Jacob
513 cd, Non-convex Optimization for Machine Learning: Theory and Practice - Anandkumar, Uma Naresh, Chaudhuri, Liang, Oh
513 ef, Black box learning and inference - Tenenbaum, van de Meent, Kulkarni, Eslami, Paige, Wood, Ghahramani
514 a, Machine Learning From and For Adaptive User Technologies: From Active Learning & Experimentation to Optimization & Personalization - Williams, Abbasi, Doshi-Velez
514 bc, Transfer and Multi-Task Learning: Trends and New Perspectives - Pentina, Lampert, Pan, Long, Hoffman, Sun, Saenko
515 a, Machine Learning and Interpretation in Neuroimaging (day 2) - Rish, Wehbe, Murphy, Langs, Cecchi, Grosse-Wentrup
515 bc, Bayesian Nonparametrics: The Next Generation - Broderick, Foti, Schein, Tank, Wallach, Williamson


Dec. 11, 2015

Optimization for Machine Learning (OPT2015)
Suvrit Sra, Alekh Agarwal, Leon Bottou, Sashank J. Reddi
510 ac, Fri Dec 11, 08:30 AM

I. INTRODUCTION

Dear NIPS Workshop Chairs,

We propose to organize the workshop OPT2015: Optimization for Machine Learning. As the eighth in its series, OPT 2015 builds on significant precedent established by OPT 2008--OPT 2014, all of which have been remarkably well-received NIPS workshops. The previous OPT workshops enjoyed packed (to overpacked) attendance, and this enthusiastic reception is an attestation to the great importance of optimization within machine learning. The intersection of OPT and ML has grown monotonically over the years, to the extent that many cutting-edge advances in optimization now arise from the ML community. The driving feature is the departure of algorithms from textbook approaches, in particular by paying attention to problem-specific structure and to deployability in practical (even industrial) big-data settings. This intimate relation of optimization with ML is the key motivation for our workshop. We wish to use OPT2015 as a platform to foster discussion, discovery, and dissemination of the state-of-the-art in optimization as relevant to machine learning. As in past years, the workshop will continue to bring luminaries from the field of optimization to share classical perspectives, as well as give a platform for thought leaders from machine learning to share exciting recent advances. To this end, our tentative invited speakers for this year are Elad Hazan, Guanghui Lan and Jorge Nocedal. Additionally, we hope to continue the tradition of high-quality contributed talks and posters.

OPT workshops have previously covered a variety of topics, such as frameworks for convex programs (D. Bertsekas), the intersection of ML and optimization, especially SVM training (S. Wright), large-scale learning via stochastic gradient methods and its tradeoffs (L. Bottou, N. Srebro), exploitation of structured sparsity (Vandenberghe), randomized methods for extremely large-scale convex optimization (A. Nemirovski), complexity-theoretic foundations of convex optimization (Y. Nesterov), distributed large-scale optimization (S. Boyd), asynchronous and sparsity-based stochastic gradient methods (B. Recht), algebraic techniques in machine learning (P. Parrilo), insights into nonconvex optimization (A. Lewis), sums-of-squares techniques (J. Lasserre), and optimization in the context of deep learning (Y. Bengio), among others. Several ideas propounded in these talks have become important research topics in ML and optimization, especially in the field of randomized algorithms and stochastic gradient methods. An edited book, "Optimization for Machine Learning" (S. Sra, S. Nowozin, and S. Wright; MIT Press, 2011), grew out of the first three OPT workshops and contains high-quality contributions from many of the speakers and attendees.

Much of the recent focus has been on large-scale first-order convex optimization algorithms for machine learning, both from a theoretical and methodological point of view. Covered topics included stochastic gradient algorithms, (accelerated) proximal algorithms, decomposition and coordinate descent algorithms, and parallel and distributed optimization. Theoretical and practical advances in these methods remain a topic of core interest to the workshop. Recent years have also seen interesting advances in non-convex optimization, such as a growing body of results on alternating minimization, tensor factorization, etc. We also do not wish to ignore the not-particularly-large-scale setting, where one does have time to wield substantial computational resources. In this setting, high-accuracy solutions and deep understanding of the lessons contained in the data are needed. Examples valuable to MLers may be the exploration of genetic and environmental data to identify risk factors for disease, or problems dealing with setups where the amount of observed data is not huge, but the mathematical model is complex. Consequently, we encourage optimization methods on manifolds, ML problems with differential-geometric antecedents, those using advanced algebraic techniques, and computational topology, for instance.
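To make the stochastic gradient theme above concrete, here is a minimal sketch of plain SGD on a synthetic least-squares problem; the problem size, step size, and iteration count are arbitrary illustrative choices, not taken from any OPT2015 material.

    import numpy as np

    rng = np.random.default_rng(0)

    # Synthetic least-squares problem: minimize f(w) = (1/2n) * ||Xw - y||^2.
    n, d = 1000, 10
    X = rng.normal(size=(n, d))
    w_true = rng.normal(size=d)
    y = X @ w_true + 0.1 * rng.normal(size=n)

    # Plain SGD: at each step, take a gradient step on a single random example.
    w = np.zeros(d)
    step = 0.01
    for t in range(20000):
        i = rng.integers(n)
        grad_i = (X[i] @ w - y[i]) * X[i]  # gradient of the i-th loss term
        w -= step * grad_i

    print("distance to w_true:", np.linalg.norm(w - w_true))

Each update touches a single example, which is exactly the property that makes such methods attractive in the big-data settings discussed above.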

Machine Learning For Healthcare (MLHC)
Theofanis Karaletsos, Rajesh Ranganath, Suchi Saria, David Sontag
510 bd, Fri Dec 11, 08:30 AM

Recent years have seen an unprecedented rise in the availability and size of collections of clinical data such as electronic health records. These rich data sources present opportunities to apply and develop machine learning methods to solve problems faced by clinicians and to usher in new forms of medical practice that would otherwise be infeasible. The aim of this workshop is to foster discussions between machine learning researchers and clinicians about how machine learning can be used to address fundamental problems in health care.

Of particular interest to this year's workshop is statistical modeling. The role of modeling in healthcare is two-fold. First, it provides clinicians with a tool to aid exploration of hypotheses in a data-driven way. Second, it furnishes evidence-based, clinically actionable predictions. Examples include machine learning of disease progression models, where patients and diseases are characterized by states that evolve over time, or dose-response models, where the treatment details involving complex and often combinatorial therapies can be inferred in a data-driven way to optimally treat individual patients. Such methods face many statistical challenges, such as accounting for confounding effects like socioeconomic backgrounds or genetic alterations in subpopulations. Causal models learned from large collections of patient records, coupled with detailed patient-specific data, enable precision medicine, wherein the models become in-silico testbeds for testing hypotheses for a single patient.

In this workshop we bring together clinicians and machine learning researchers working on healthcare solutions. The goal is to have a discussion to understand clinical needs and the technical challenges presented by those needs, including interpretable techniques which can adapt to noisy, dynamic environments and the biases inherent in data generated under the current standard of care.

Part of our workshop includes a clinician pitch: a general call disseminated to clinicians asking for a five-minute presentation of problems they are trying to solve based on empirical data. The problem presentations will be followed by a discussion between invited clinicians and attending ML researchers to help refine them and understand how machine learning will play a role in solving them. Finally, the pitch plays a secondary role of fostering new collaborations between machine learning researchers and clinicians: an important step for machine learning's role in healthcare.

We invite submissions of 2-page abstracts for poster contributions to the workshop and for short contributed talks. Topics of interest are: models for diseases and clinical data, temporal models, Markov decision processes for clinical decision support, multi-scale data integration, issues with missing data, uncertainty and uncertainty propagation, non-i.i.d. structure in the data, critique of models, causality, model biases, and transfer learning for healthcare.
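As a toy illustration of the disease progression models mentioned above (patients characterized by states that evolve over time), the sketch below simulates a cohort moving through an invented three-state Markov chain and recovers the transition matrix by maximum-likelihood counting; all states and probabilities are hypothetical.

    import numpy as np

    rng = np.random.default_rng(1)

    # Hypothetical 3-state progression model: healthy -> mild -> severe.
    P_true = np.array([[0.90, 0.09, 0.01],
                       [0.00, 0.85, 0.15],
                       [0.00, 0.00, 1.00]])

    # Simulate state trajectories for a cohort of patients.
    n_patients, horizon = 500, 20
    counts = np.zeros((3, 3))
    for _ in range(n_patients):
        s = 0  # every patient starts healthy
        for _ in range(horizon):
            s_next = rng.choice(3, p=P_true[s])
            counts[s, s_next] += 1
            s = s_next

    # Maximum-likelihood estimate: normalize transition counts row-wise.
    P_hat = counts / counts.sum(axis=1, keepdims=True)
    print(np.round(P_hat, 2))

Real disease-progression work must additionally handle the hidden states, irregular sampling, and confounding that the description above emphasizes; this sketch only shows the fully observed base case.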

Schedule

08:15 AM  Integrating Artificial Intelligence Learning into Emergency Care (Horng)
08:45 AM  Data Driven Phenotyping for Diseases (Doshi-Velez)
09:15 AM  Behavioral Analytics in Mental Healthcare (De)
11:00 AM  The risk of deploying unintelligible models in healthcare (Caruana)
11:45 AM  Extracting Information from Figures (Yu)
11:45 AM  Sensor Technology to Detect Pain in the Neonate (Dudding)
12:15 PM  Sudden Death in Epilepsy (Friedman)
02:00 PM  Building [Machine] Learning Healthcare Systems (Shah)
02:30 PM  Finding Sparse Features in Strongly Confounded Medical Data (Mandt, Wenzel)
03:00 PM  Segmentation and Labelling of Physiological Time Series Data (Adams)
03:30 PM  Uncertainty in Dynamic Matching (Dickerson)
04:00 PM  Machine Learning Applied to Birth Asphyxia (Onu)

ABC in Montreal
Ted Meeds, Michael Gutmann, Dennis Prangle, Jean-Michel Marin, Richard Everitt
511 a, Fri Dec 11, 08:30 AM

Approximate Bayesian computation (ABC) or likelihood-free (LF) methods have developed mostly beyond the radar of the machine learning community, but are important tools for a large and diverse segment of the scientific community. This is particularly true for systems and population biology, computational neuroscience, computer vision, healthcare sciences, but also many others.

Interaction between the ABC and machine learning communities has recently started and contributed to important advances. In general, however, there is still significant room for more intense interaction and collaboration. Our workshop aims at being a place for this to happen. The workshop will consist of invited and contributed talks, poster spotlights, and a poster session. Rather than a panel discussion, we will encourage open discussion between the speakers and the audience.
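For readers meeting ABC for the first time, a minimal rejection-sampling sketch conveys the likelihood-free idea; the Gaussian simulator, sample-mean summary statistic, uniform prior, and tolerance below are all illustrative choices.

    import numpy as np

    rng = np.random.default_rng(2)

    # Observed data from a model whose likelihood we pretend is intractable.
    theta_true = 1.7
    x_obs = rng.normal(theta_true, 1.0, size=100)
    s_obs = x_obs.mean()  # summary statistic

    # ABC rejection: draw theta from the prior, simulate data, and keep theta
    # whenever the simulated summary is within epsilon of the observed one.
    eps, accepted = 0.05, []
    for _ in range(100_000):
        theta = rng.uniform(-5, 5)              # prior draw
        x_sim = rng.normal(theta, 1.0, size=100)
        if abs(x_sim.mean() - s_obs) < eps:
            accepted.append(theta)

    post = np.array(accepted)
    print(f"{len(post)} accepted; posterior mean ~ {post.mean():.2f}")

The method only requires the ability to simulate from the model, never to evaluate its likelihood, which is what makes it attractive in the application areas listed above.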

Machine Learning for Spoken Language Understanding and Interactions
Asli Celikyilmaz, Milica Gasic, Dilek Hakkani-Tur
511 b, Fri Dec 11, 08:30 AM

The emergence of virtual personal assistants such as SIRI, Cortana, Echo, and Google Now is generating increasing interest in research in speech understanding and spoken interaction. However, whilst the ability of these agents to recognise conversational speech is maturing rapidly, their ability to understand and interact is still limited to a few specific domains, such as weather information, local businesses, and some simple chit-chat. Their conversational capabilities are not necessarily apparent to users. Interaction typically depends on handcrafted scripts and is often guided by simple commands. Deployed dialogue models do not fully make use of the large amount of data that these agents generate. Promising approaches that involve statistical models, big data analysis, representation of knowledge (hierarchical, relations, etc.), utilising and enriching semantic graphs with natural language components, multi-modality, etc. are being explored in multiple communities, such as natural language processing (NLP), speech processing, machine learning (ML), and information retrieval. However, we are still only scratching the surface in this field. The aim of this workshop, therefore, is to bring together researchers interested in understanding and interaction in conversational agents, to discuss the challenges and new and emerging topics in machine learning which might lead to richer and more natural human-computer interaction.

Obtaining meaning from human natural language is a complex process. The potential range of topics is vast, and even well-formed utterances can be syntactically and semantically ambiguous. Spontaneous conversational speech naturally contains grammatical errors, repetitions, disfluencies, partial words, and out-of-vocabulary words. Conducting intelligent conversation over multiple turns requires maintaining the dialogue state over time, dealing with errors that arise from the speech recogniser, determining an adequate dialogue strategy, estimating the quality of that strategy, and generating natural language responses. Over the years many different approaches and models have been proposed (e.g. syntactic and semantic analysis of spoken text, hybrid models that use speech processing components as features for semantic analysis, learning representations for spoken text, contextual models, statistical models of dialogue). These methods have drawn inspiration from machine learning solutions, e.g. sequence tagging, syntactic parsing, and language modelling, primarily because these tasks can be easily abstracted into machine learning formulations (e.g. structured prediction, dimensionality reduction, regression, classification, supervised or reinforcement learning). These representations have evolved into novel understanding models based on discriminative methods, Bayesian nonparametrics, neural networks, low rank/spectral techniques, and word/phrase/sentence level embeddings based on deep learning methods.

In dialogue modelling, methods based on partially observable Markov decision processes and reinforcement learning have enabled limited-domain dialogue models to be built that are trainable from data, robust to noise, and adaptable to changes in the user or domain. Following success in other areas, neural networks have also been applied to different aspects of dialogue modelling, yielding significant improvements. The problem remains, however, as to how to extend these models to exploit the huge datasets that users of virtual personal assistants generate, and thereby enable the richer and more reliable conversation that users expect. Problems in spoken language understanding and dialogue modelling are particularly appealing to those doing core ML research due to the high-dimensional nature of the spaces involved (both the data and the label spaces), the need to handle noise robustly, and the availability of large amounts of unstructured data. But there are many other areas within spoken language understanding and dialogue modelling for conversational systems where the ML community is less involved and which remain relatively unexplored, such as semantics, open-domain dialogue models, multi-modal dialogue input and output, emotion recognition, finding relational structures, discourse and pragmatics analysis, multi-human understanding (meetings) and summarization, and cross-lingual understanding. These areas continue to rely on linguistically motivated but imprecise heuristics which may benefit from new machine learning approaches.

The goal of this workshop is to bring together both applied and theoretical researchers in spoken/natural language processing and machine learning to facilitate the discussion of new frameworks that can help advance modern conversational systems. Some key questions we will address include (but are not limited to):

* Representation/Optimization: How can ML help provide novel representations and models to capture the structure of spoken natural language, especially considering spontaneous conversational speech? What speech and NLP problems could benefit from new inference/optimization techniques?

* Data: In speech and NLP we typically have large amounts of less useful background data and small amounts of very useful in-domain data. Are current ML algorithms sufficient to gracefully deal with this problem? For example, can we harness non-dialogue data to build dialogue models? While many speech and NLP problems depend mainly on static speech or text corpora, dialogue is unique in that the user provides an opportunity for learning on-line. Which non-intrusive methods can we use to engage the user in such a way that it leads to improvement of the dialogue models? How can we design new ML paradigms (e.g., bootstrapping, semi-supervised learning) to address the lack of annotated data in complex structured prediction problems such as knowledge extraction and semantics?

* Scalability: So far, ML-based dialogue systems have only tackled limited domains; how can we scale them to large open domains, leveraging the semantic web? How can we tackle "scalability bottlenecks" unique to natural language?

* Multi-lingual/Multi-human/Multi-modal conversation: Can adaptation methods be developed to build conversational understanding systems for low-resource languages without going through rigorous annotation processes? What technical challenges posed by multilinguality, lexical variation in social media, and nonstandard dialects are under-researched in ML? What ML methods are needed for structural understanding of multi-human conversations? What ML methods can we deploy to support multi-modal conversation?
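One recurring point above, maintaining a dialogue state in the face of speech-recogniser errors, can be sketched as a simple Bayesian belief update over user goals; the goals, confusion probabilities, and observation sequence below are invented for illustration.

    import numpy as np

    # Hypothetical user goals the system tracks a belief over.
    goals = ["weather", "restaurants", "music"]
    belief = np.full(3, 1.0 / 3.0)  # uniform prior over goals

    # P(ASR hypothesis | true goal): the recogniser is right 70% of the time
    # and confuses the goal with each alternative 15% of the time.
    conf = np.full((3, 3), 0.15) + 0.55 * np.eye(3)

    # Noisy ASR hypotheses over three user turns (indices into `goals`).
    observations = [1, 1, 0]
    for obs in observations:
        belief = belief * conf[:, obs]   # Bayes rule: prior times likelihood
        belief /= belief.sum()           # renormalize

    for g, p in zip(goals, belief):
        print(f"P(goal={g}) = {p:.3f}")

POMDP-based dialogue managers generalize this idea to richer state spaces and learned policies; the point here is only that a noisy observation never collapses the state to a single hypothesis.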

Schedule

09:00 AM  Evaluating Prerequisite Qualities For End-to-End Dialog Systems (Weston)
10:20 AM  Machine Learning For Conversational Systems (Heck, Deng, Pietquin, Mikolov)
03:00 PM  Best Three Paper Talks (Kadlec, Tafforeau, Chung)
04:20 PM  Challenges for Conversational Applications (Georgila, Thomson, Black, Riccardi)

Abstracts (1):

Abstract 3: Best Three Paper Talks in Machine Learning for Spoken Language Understanding and Interactions, Kadlec, Tafforeau, Chung, 03:00 PM

Paper 1: Detecting Interrogative Utterances with Recurrent Neural Networks; Junyoung Chung, Hany Hassan Awadalla, Jacob Devlin
Paper 2: Improved Deep Learning Baselines for Ubuntu Corpus Dialogs; Rudolf Kadlec, Martin Schmid, Jan Kleindienst
Paper 3: Lexical Embedding Adaptation for Open-Domain Spoken Language Understanding; J. Tafforeau, F. Bechet, B. Favre, T. Artieres


Nonparametric Methods for Large Scale Representation Learning
Andrew G Wilson, Alex Smola, Eric Xing
511 c, Fri Dec 11, 08:30 AM

In 2015, every minute of the day, users share hundreds of thousands of pictures, videos, tweets, reviews, and blog posts. More than ever before, we have access to massive datasets in almost every area of science and engineering, including genomics, robotics, and climate science. This wealth of information provides an unprecedented opportunity to automatically learn rich representations of data, which allows us to greatly improve performance in predictive tasks, but also provides a mechanism for scientific discovery. That is, by automatically learning expressive representations of data, versus carefully hand-crafting features, we can obtain a new theoretical understanding of our modelling problems. Recently, deep learning architectures have had success for such representation learning, particularly in computer vision and natural language processing. Expressive non-parametric methods also have great potential for large-scale structure discovery; indeed, these methods can be highly flexible, and have an information capacity that grows with the amount of available data.

However, there are practical challenges involved in developing non-parametric methods for large scale representation learning. Consider, for example, kernel methods. A kernel controls the generalisation properties of these methods. A well chosen kernel leads to impressive empirical performances. Difficulties arise when the kernel is a priori unknown and the number of datapoints is large. One must develop an expressive kernel learning approach, and scaling such an approach poses different challenges than scaling a standard kernel method. One faces additional computational constraints, and the need to retain significant model structure for expressing the rich information available in a large dataset. However, the need for expressive kernel learning on large datasets is especially great, since such datasets often provide more information to automatically learn an appropriate statistical representation.

This 1-day workshop is about non-parametric methods for large scale structure learning, including automatic pattern discovery, extrapolation, manifold learning, kernel learning, metric learning, data compression, feature extraction, trend filtering, and dimensionality reduction. Non-parametric methods include, for example, Gaussian processes, Dirichlet processes, Indian buffet processes, and support vector machines. We are particularly interested in developing scalable and expressive methods to derive new scientific insights from large datasets. A poster session, coffee breaks, and a panel-guided discussion will encourage interaction between attendees. This workshop aims to bring together researchers wishing to explore alternatives to neural networks for learning rich non-linear function classes, with an emphasis on nonparametric methods, representation learning and scalability. We wish to carefully review and enumerate modern approaches to these challenges, share insights into the underlying properties of these methods, and discuss future directions.
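As a concrete taste of the kernel-scaling challenge described above, here is a minimal random Fourier features sketch that approximates an RBF kernel with an explicit finite-dimensional feature map; the dimensions and bandwidth are arbitrary, and this is just one of many scaling devices relevant to the workshop.

    import numpy as np

    rng = np.random.default_rng(3)

    def rbf(x, y, gamma=0.5):
        return np.exp(-gamma * np.sum((x - y) ** 2))

    # Random Fourier features: z(x) such that z(x).z(y) ~ k(x, y).
    d, D, gamma = 5, 2000, 0.5
    W = rng.normal(scale=np.sqrt(2 * gamma), size=(D, d))
    b = rng.uniform(0, 2 * np.pi, size=D)

    def features(x):
        return np.sqrt(2.0 / D) * np.cos(W @ x + b)

    x, y = rng.normal(size=d), rng.normal(size=d)
    print("exact kernel:", rbf(x, y))
    print("RFF estimate:", features(x) @ features(y))

With an explicit feature map, kernel regression reduces to linear regression in D dimensions, sidestepping the n-by-n Gram matrix that makes exact kernel methods expensive at scale.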

Schedule

09:00 AM  Random Tensor Decompositions for Regression and Collaborative Filtering (Teh)
09:30 AM  Dirichlet-Hawkes Process with Applications to Clustering Continuous-Time Document Streams (Ahmed)
10:30 AM  Do Shallow Kernel Methods Match Deep Neural Networks (Sha)
02:30 PM  Graph Sparsification Approaches for Laplacian Smoothing (Sadhanala)
02:45 PM  Word, Graph, and Manifold Embedding from Markov Processes (Contributed) (Hashimoto)
03:00 PM  Learning from Rankings (Vert)
03:30 PM  Using Local Spectral Methods in Theory and in Practice (Mahoney)
04:30 PM  Sharp Analysis of Random Feature Expansions (Bach)
05:15 PM  Discussion Panel

Abstracts (1):

Abstract 9: Discussion Panel in Nonparametric Methods for Large Scale Representation Learning, 05:15 PM. Speakers unknown.

Learning Faster from Easy Data II
Tim van Erven, Wouter Koolen
511 d, Fri Dec 11, 08:30 AM

In both stochastic and online learning we have a good theoretical understanding of the most difficult learning tasks through worst-case or minimax analysis, and we have algorithms to match. Yet there are commonly occurring cases that are much easier than the worst case, where these methods are overly conservative, showing a large gap between the performance predicted by theory and observed in practice. Recent work has refined our theoretical understanding of the wide spectrum of easy cases, leading to the development of algorithms that are robust to the worst case, but can also automatically adapt to easier data and achieve faster rates whenever possible.

Examples of easier cases include (Tsybakov) margin conditions, low noise or variance, probabilistic Lipschitzness and empirical curvature of the loss (strong convexity, exp-concavity, mixability), as well as low-complexity decision boundaries and comparators, quantile bounds, and cases with few switches among few leaders. Adapting to such easy data often involves data-dependent bias-variance trade-offs through hyper-parameter learning, adaptive regularisation or exploration, or hypothesis testing to distinguish between easy and hard cases.

The last two years have seen many exciting new developments in the form of new desirable adaptivity targets, new algorithms and new analysis techniques. In this workshop we aim to bring together researchers and practitioners interested in adaptation to easy data. The key questions we will discuss are: Which are the data-dependent trade-offs encountered (bias-variance or other)? Can we identify commonalities across different problem domains in strategies that are being used to deal with these trade-offs? And what is the price for adaptivity (if any)? Both theoretical and empirical insights are welcomed.
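For reference, the worst-case baseline that adaptive methods improve on can be sketched in a few lines: Hedge (exponential weights) over a set of experts with the standard conservative learning rate. The synthetic losses, in which one expert is deliberately better ("easy data"), are invented for the example.

    import numpy as np

    rng = np.random.default_rng(4)

    # Hedge over K experts for T rounds with the worst-case tuned rate.
    K, T = 10, 5000
    eta = np.sqrt(8 * np.log(K) / T)
    losses = rng.uniform(size=(T, K))
    losses[:, 3] *= 0.6  # expert 3 is clearly better: "easy data"

    cum = np.zeros(K)
    learner_loss = 0.0
    for t in range(T):
        w = np.exp(-eta * (cum - cum.min()))  # shift by min for stability
        w /= w.sum()
        learner_loss += w @ losses[t]         # learner's expected loss
        cum += losses[t]

    print("learner loss:", round(learner_loss, 1))
    print("best expert :", round(cum.min(), 1))
    print("regret      :", round(learner_loss - cum.min(), 1))

The fixed eta above guarantees sqrt(T) regret in the worst case but cannot exploit the easy instance; the adaptive schemes studied at this workshop aim to keep the worst-case guarantee while converging much faster on runs like this one.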

Schedule

09:00 AM  Learning Faster from Easy Data II: Introduction (van Erven)
09:30 AM  Clustering Is Easy When... (Ben-David)
10:30 AM  Tsybakov Noise Adaptive Margin-Based Active Learning (Singh)
11:00 AM  Adaptive Online Learning (Foster)
02:30 PM  Easy Data (Grünwald)
03:30 PM  Optimal and Adaptive Algorithms for Online Boosting (Kale)
05:00 PM  Adaptive Regret Bounds for Non-Stochastic Bandits (Neu)
05:45 PM  Discussion Panel (van Erven, Koolen, Grünwald, Ben-David, Foster, Kale, Neu)

Abstracts (1):

Abstract 8: Discussion Panel in Learning Faster from Easy Data II, van Erven, Koolen, Grünwald, Ben-David, Foster, Kale, Neu, 05:45 PM. Speakers unknown.

Modelling and inference for dynamics on complex interaction networks: joining up machine learning and statistical physics
Manfred Opper, Yasser Roudi, Peter Sollich
511 e, Fri Dec 11, 08:30 AM

This workshop is co-sponsored by the European Network "NETADIS" (Statistical Physics Approaches to Networks Across Disciplines). See http://www.netadis.eu for further information and workshop details (NIPS 2015 tab).

Invited speakers

Jose Bento Ayres Pereira, Boston College
Alfredo Braunstein, Politecnico di Torino
Ramon Grima, University of Edinburgh
Jakob Macke, MPI Biological Cybernetics Tuebingen
Andrea Montanari, Stanford University
Graham Taylor, University of Guelph

Workshop overview

Inference and learning on large graphical models, i.e. large systems of simple probabilistic units linked by a complex network of interactions, is a classical topic in machine learning. Such systems are also an active research topic in the field of statistical physics.

The main interaction between statistical physics and machine learning has so far been in the area of analysing data sets without explicit temporal structure. Here methods of equilibrium statistical physics, developed for studying Boltzmann distributions on networks of nodes with e.g. pairwise interactions, are closely related to graphical model inference techniques; accordingly there has been much cross-fertilization leading to both conceptual insights and more efficient algorithms. Models can be learned from recorded experimental or other empirical data, but even when samples come from e.g. a time series this aspect of the data is typically ignored.

More recently, interest has shifted towards dynamical models. This shift has occurred for two main reasons:

(a) Most of the interesting systems for which statistical analysis techniques are required, e.g. networks of biological neurons, gene regulatory networks, protein-protein interaction networks, stock markets, exhibit very rich temporal or spatiotemporal dynamics; if this is ignored by focusing on stationary distributions alone this can lead to the loss of a significant amount of interesting information and possibly even qualitatively wrong conclusions.

(b) Current technological breakthroughs in collecting data from the complex systems referred to above are yielding ever increasing temporal resolution. This in turn allows in depth analyses of the fundamental temporal aspects of the function of the system, if combined with strong theoretical methods. It is widely accepted that these dynamical aspects are crucial for understanding the function of biological and financial systems, warranting the development of techniques for studying them.

In the past, the fields of machine learning and statistical physics have cross-fertilised each other significantly. E.g. the establishment of the relation between loopy belief propagation, message passing algorithms and the Bethe free energy formulation has stimulated a large amount of research in approximation techniques for inference and the corresponding equilibrium analysis of disordered systems in statistical physics.

It is the goal of the proposed workshop to bring together researchers from the fields of machine learning and statistical physics in order to discuss the new challenges originating from dynamical data. Such data are modelled using a variety of approaches such as dynamic belief networks, continuous time analogues of these – as often used for disordered spin systems in statistical physics –, coupled stochastic differential equations for continuous random variables, etc. The workshop will provide a forum for exploring possible synergies between the inference and learning approaches developed for the various models. The experience from joint advances in the equilibrium domain suggests that there is much unexplored scope for progress on dynamical data. Possible topics to be addressed will be:

Inference on state dynamics:
- efficient approximation of dynamics on a given network, filtering, smoothing
- inference with hidden nodes

- existing methods including dynamical belief propagation & expectation propagation, variational approximations, mean-field and Plefka approximations; relations between these, advantages, drawbacks
- alternative approaches

Learning model/network parameters:
- with/without hidden nodes

Learning network structure:
- going beyond correlation information

Abstracts of invited talks

Jose Bento: Learning Stochastic Differential Equations – Fundamental limits and efficient algorithms

Models based on stochastic differential equations (SDEs) play a crucial role in several domains of science and technology, ranging from chemistry to finance. In this talk I consider the problem of learning the drift coefficient of a p-dimensional stochastic differential equation from a sample path of length T. I assume that the drift is parametrized by a high-dimensional vector, and study the support recovery problem in the case where p is allowed to grow with T. In particular, I describe a general lower bound on the sample-complexity T by using a characterization of mutual information as time integral of conditional variance, due to Kadota, Zakai, and Ziv. For linear stochastic differential equations, the drift coefficient is parametrized by a p by p matrix which describes which degrees of freedom interact under the dynamics. In this case, I analyze an L1-regularized least-squares estimator and describe an upper bound on T that nearly matches the lower bound on specific classes of sparse matrices. I describe how this same algorithm can be used to learn non-linear SDEs and in addition show by means of a numerical experiment why one should expect the sample-complexity to be of the same order as that for linear SDEs.
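A toy numerical companion to the abstract above (not Bento's code): simulate a sparse linear SDE by Euler-Maruyama and fit the drift matrix with an L1-regularized least-squares estimator via a basic ISTA loop. The dimensions, sparsity pattern, regularization weight, and step sizes are all invented for the example.

    import numpy as np

    rng = np.random.default_rng(5)

    # Sparse linear SDE dX = A X dt + dW, simulated by Euler-Maruyama.
    p, dt, n_steps = 10, 0.01, 100_000
    A = -np.eye(p)                       # stable diagonal drift
    for _ in range(8):                   # plus a few sparse couplings
        i, j = rng.integers(p, size=2)
        if i != j:
            A[i, j] = 0.3
    X = np.zeros((n_steps, p))
    for t in range(n_steps - 1):
        X[t + 1] = X[t] + dt * (A @ X[t]) + np.sqrt(dt) * rng.normal(size=p)

    # L1-regularized least squares on the increments, solved by plain ISTA.
    dXdt = (X[1:] - X[:-1]) / dt         # noisy estimates of A @ X[t]
    Z = X[:-1]
    n = len(Z)
    C = Z.T @ Z / n                      # sufficient statistics
    B = dXdt.T @ Z / n
    lam = 0.05
    step = 1.0 / np.linalg.norm(C, 2)    # 1 / Lipschitz constant of gradient
    A_hat = np.zeros((p, p))
    for _ in range(500):
        A_hat -= step * (A_hat @ C - B)  # gradient step on the quadratic part
        A_hat = np.sign(A_hat) * np.maximum(np.abs(A_hat) - step * lam, 0.0)

    print("largest entrywise error in A_hat:", np.abs(A_hat - A).max())

The estimate approximately recovers the support of A here; the abstract's point is the precise sample-path length T required for such recovery as p grows.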

Alfredo Braunstein: Bayesian inference of cascades on networks

We present a method based on Belief Propagation to study a series of inference problems on discrete dynamical cascade models, based on partial and/or noisy observations of the cascades. The problems include the identification of the source, the discovery of undetected infected nodes, prediction of features of the future evolution, and the inference of the supporting network.

Ramon Grima: Exact and approximate solutions for spatial stochastic models of chemical systems

Stochastic effects in chemical reaction systems have been mostly studied via the chemical master equation, a non-spatial discrete stochastic formulation of chemical kinetics which assumes well-mixing and point-like interactions between molecules. These assumptions are in direct contrast with what experiments tell us about the nature of the intracellular environment, namely that diffusion plays a fundamental role in intracellular dynamics and that the environment itself is highly non-dilute (or crowded). I will here describe our recent work on obtaining (i) exact expressions for the solution of the reaction-diffusion master equation (RDME) and its crowded counterpart (cRDME) in equilibrium conditions and (ii) approximate expressions for the moments in non-equilibrium conditions. The solutions portray an emerging picture of the combined influence of diffusion and crowding on the stochastic properties of chemical reaction networks.

Jakob Macke: Correlations and signatures of criticality in neural population models

Large-scale recording methods make it possible to measure the statistics of neural population activity, and thereby to gain insights into the principles that govern the collective activity of neural ensembles. One hypothesis that has emerged from this approach is that neural populations are poised at a ‘thermodynamic critical point’, and that this has important functional consequences (Tkacik et al 2014). Support for this hypothesis has come from studies that computed the specific heat, a measure of global population statistics, for groups of neurons subsampled from population recordings. These studies have found two effects which—in physical systems—indicate a critical point: First, specific heat diverges with population size N. Second, when manipulating population statistics by introducing a ’temperature’ in analogy to statistical mechanics, the maximum heat moves towards unit temperature for large populations. What mechanisms can explain these observations? We show that both effects arise in a simple simulation of retinal population activity. They robustly appear across a range of parameters, including biologically implausible ones, and can be understood analytically in simple models. The specific heat grows with N whenever the (average) correlation is independent of N, which is always true when uniformly subsampling a large, correlated population. For weakly correlated populations, the rate of divergence of the specific heat is proportional to the correlation strength. Thus, if retinal population codes were optimized to maximize specific heat, then this would predict that they seek to increase correlations. This is incongruent with theories of efficient coding that make the opposite prediction. We find criticality in a simple and parsimonious model of retinal processing, without the need for fine-tuning or adaptation. This suggests that signatures of criticality might not require an optimized coding strategy, but rather arise as a consequence of sub-sampling a stimulus-driven neural population (Aitchison et al 2014).

Andrea Montanari: Information-theoretic bounds on learning network dynamics

How long should we observe the trajectory of a system before being able to characterize its underlying network dynamics? I will present a brief review of information-theoretic tools to establish lower bounds on the required length of observation.


I will illustrate the use of these tools with a few examples: linear and nonlinear stochastic differential equations, dynamical Bayesian networks and so on. For each of these examples, I will discuss whether the ultimate information limit has been achieved by practical algorithms or not.

Graham Taylor: Learning Multi-scale Temporal Dynamics with Recurrent Neural Networks

The last three years have seen an explosion of activity studying recurrent neural networks (RNNs), a generalization of feedforward neural networks which can map sequences to sequences. Training RNNs using backpropagation through time can be difficult, and was thought until recently to be hopeless due to the vanishing and exploding gradients encountered in training. Recent advances in optimization methods and architectures have led to impressive results in modeling speech, handwriting and language, and applications to other areas are emerging. In this talk, I will review some recent progress on RNNs and discuss our work on extending and improving the Clockwork RNN (Koutnik et al.), a simple yet powerful model that partitions its hidden units to model specific temporal scales. Our “Dense clockworks” are a shift-invariant form of the architecture which we show to be more efficient and effective than their predecessor. I will also describe a recent collaboration with Google in which we apply Dense clockworks to authenticating mobile phone users based on the movement of the device as captured by the accelerometer and gyroscope.
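A minimal sketch of the clockwork partitioning idea described in the talk: hidden blocks are assigned exponentially increasing periods, and each block is updated only on time steps divisible by its period. The sizes, periods, and random weights are illustrative, the period-based connection masking of the original model is omitted for brevity, and this is not the Dense clockwork variant itself.

    import numpy as np

    rng = np.random.default_rng(10)

    # 4 hidden blocks of 8 units with periods 1, 2, 4, 8.
    block, periods = 8, [1, 2, 4, 8]
    H = block * len(periods)
    Wh = rng.normal(scale=0.1, size=(H, H))
    Wx = rng.normal(scale=0.1, size=H)

    h = np.zeros(H)
    for t, x in enumerate(rng.normal(size=100)):
        h_new = np.tanh(Wh @ h + Wx * x)
        for i, p in enumerate(periods):
            if t % p == 0:               # update block i only on its ticks
                sl = slice(i * block, (i + 1) * block)
                h[sl] = h_new[sl]

    print("final hidden state norm:", np.linalg.norm(h))

Because slow blocks update rarely, they retain information over long horizons while fast blocks track recent inputs, which is the multi-scale behaviour the talk is about.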

Schedule

09:00 AM  Learning Stochastic Differential Equations (Bento)
10:30 AM  Correlations and Signatures of Criticality in Neural Population Models (Macke)
11:00 AM  Spotlight (Bachschmid-Romano, De Bacco, Bravi)
11:00 AM  Spotlight Part II (Bhattacharya, Doya, Gibberd, Dasgupta, Soudry)
02:30 PM  Information-theoretic bounds on learning network dynamics (Montanari)
03:00 PM  Bayesian Inference of Cascades on Networks (Braunstein)
05:00 PM  Exact and approximate solutions for spatial stochastic models of chemical systems (Grima)
05:30 PM  Learning Multi-scale Temporal Dynamics with Recurrent Neural Networks (Taylor)

Statistical Methods for Understanding Neural Systems
Allie Fletcher, Jakob H Macke, Ryan Adams, Jascha Sohl-Dickstein
511 f, Fri Dec 11, 08:30 AM

Schedule

8:15  Opening remarks and welcome
8:30  Surya Ganguli: Towards a theory of high dimensional, single trial neural data analysis: On the role of random projections and phase transitions
9:00  Katherine Heller: Translating between human & animal studies via Bayesian multi-task learning
9:30  Mitya Chklovskii: Similarity matching: A new theory of neural computation
10:00 Coffee break 1
10:30 Poster Session 1
11:00 Matthias Bethge: Let's compete—benchmarking models in neuroscience
11:30 Yoshua Bengio: Small Steps Towards Biologically Plausible Deep Learning
12:00 Lunch
2:30  Pulkit Agrawal: The Human Visual Hierarchy is Isomorphic to the Hierarchy learned by a Deep Convolutional Neural Network Trained for Object Recognition
3:00  Yann Lecun: Unsupervised Learning
3:30  Poster Session 2
4:00  Coffee break 2
4:30  Neil Lawrence: The Mechanistic Fallacy and Modelling how we Think
5:00  Panel (Deep Learning and neuroscience: What can brains tell us about massive computing and vice versa?): Yoshua Bengio, Matthias Bethge, Surya Ganguli, Konrad Kording, Yann Lecun, Neil Lawrence
6:00  Wrap up

Posters

- Pulkit Agrawal, Mark D. Lescroart, Dustin E. Stansbury, Jitendra Malik, & Jack L. Gallant: The Human Visual Hierarchy is Isomorphic to the Hierarchy learned by a Deep Convolutional Neural Network Trained for Object Recognition
- Christian Donner and Hideaki Shimazaki: Approximation methods for inferring time-varying interactions of a large neural population
- Alexey Dosovitskiy and Thomas Brox: Inverting Convolutional Networks with Convolutional Networks
- Johannes Friedrich, Daniel Soudry, Yu Mu, Jeremy Freeman, Misha Ahrens, and Liam Paninski: Fast Constrained Non-negative Matrix Factorization for Whole-Brain Calcium Imaging Data
- Amin Karbasi, Amir Hesam Salavati, and Martin Vetterli: Learning Network Structures from Firing Patterns
- Jesse A. Livezey, Gopala K. Anumanchipalli, Brian Cheung, Prabhat, Friedrich T. Sommer, Michael R. DeWeese, Kristofer E. Bouchard, and Edward F. Chang: Classifying spoken syllables from human sensorimotor cortex with deep networks
- Gonzalo Mena, Lauren Grosberg, Frederick Kellison-Linn, E.J. Chichilnisky, and Liam Paninski: Large-scale Multi Electrode Array Spike Sorting Algorithm Introducing Concurrent Recording and Stimulation
- Jonathan Platkiewicz and Asohan Amarasingham: Monosynaptic Connection Test for Pairwise Extracellular Spike Data
- Akshay Rangamani, Jacob Harer, Amit Sinha, Alik Widge, Emad Eskandar, Darin Dougherty, Ishita Basu, Sydney Cash, Angelique Paulk, Trac D. Tran, and Sang (Peter) Chin: Modeling Local Field Potentials with Recurrent Neural Networks
- Maja Rudolph and David Blei: The Dirichlet-Gamma Filter for Discovery of Neural Ensembles and their Temporal Dynamics

Workshop overview

Recent advances in neural recording technologies, including calcium imaging and high-density electrode arrays, have made it possible to simultaneously record neural activity from large populations of neurons for extended periods of time. These developments promise unprecedented insights into the collective dynamics of neural populations and thereby the underpinnings of brain-like computation. However, this new large-scale regime for neural data brings significant methodological challenges. This workshop seeks to explore the statistical methods and theoretical tools that will be necessary to study these data, build new models of neural dynamics, and increase our understanding of the underlying computation. We have invited researchers across a range of disciplines in statistics, applied physics, machine learning, and both theoretical and experimental neuroscience, with the goal of fostering interdisciplinary insights. We hope that active discussions among these groups can set in motion new collaborations and facilitate future breakthroughs on fundamental research problems.

The workshop will focus on three central questions:

a) How can we deal with incomplete data in a principled manner? In most experimental settings, even advanced neural recording methods can only sample a small fraction of all neurons that might be involved in a task, and the observations are often indirect and noisy. As a result, many recordings are from neurons that receive inputs from neurons that are not themselves directly observed, at least not over the same time period. How can we deal with this 'incomplete data' problem in a principled manner? How does this sparsity of recordings influence what we can and cannot infer about neural dynamics and mechanisms?

b) How can we incorporate existing models of neural dynamics into neural data analysis? Theoretical neuroscientists have intensely studied neural population dynamics for decades, resulting in a plethora of models of neural population dynamics. However, most analysis methods for neural data do not directly incorporate any models of neural dynamics, but rather build on generic methods for dimensionality reduction or time-series modelling. How can we incorporate existing models of neural dynamics? Conversely, how can we design neural data analysis methods such that they explicitly constrain models of neural dynamics?

c) What synergies are there between analyzing biological and artificial neural systems? The rise of ‘deep learning’ methods has shown that hard computational problems can be solved by machine learning algorithms that are built by cascading many nonlinear units. Although artificial neural systems are fully observable, it has proven challenging to provide a theoretical understanding of how they solve computational problems and which features of a neural network are critical for its performance. While such ‘deep networks’ differ from biological neural networks in many ways, they provide an interesting testing ground for evaluating strategies for understanding neural processing systems. Are there synergies between analysis methods for analyzing biological and artificial neural systems? Has the resurgence of deep learning resulted in new hypotheses or strategies for trying to understand biological neural networks?
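Question (b) notes that much current analysis builds on generic dimensionality reduction; the sketch below is exactly such a generic baseline, PCA on binned synthetic population activity driven by two slow latent factors (all sizes and rates invented).

    import numpy as np

    rng = np.random.default_rng(9)

    # Synthetic population: 50 neurons driven by 2 slow latent factors.
    n_neurons, n_bins, n_latents = 50, 2000, 2
    latents = np.cumsum(rng.normal(size=(n_bins, n_latents)), axis=0) * 0.05
    loading = rng.normal(size=(n_latents, n_neurons))
    rates = np.exp(1.0 + 0.1 * latents @ loading)
    spikes = rng.poisson(rates)  # binned spike counts

    # PCA via SVD of the mean-centred count matrix.
    Xc = spikes - spikes.mean(axis=0)
    _, sv, _ = np.linalg.svd(Xc, full_matrices=False)
    var = sv**2 / (sv**2).sum()
    print("variance explained by first 5 PCs:", np.round(var[:5], 3))

A dynamics-aware method, by contrast, would model how the latent factors evolve over time rather than treating each time bin as an independent sample, which is precisely the gap question (b) highlights.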

Probabilistic Integration
Michael A Osborne, Philipp Hennig
512 a, Fri Dec 11, 08:30 AM

Integration is the central numerical operation required for Bayesian machine learning (in the form of marginalization and conditioning). Sampling algorithms still abound in this area, although it has long been known that Monte Carlo methods are fundamentally sub-optimal. The challenges for the development of better performing integration methods are mostly algorithmic. Moreover, recent algorithms have begun to outperform MCMC and its siblings, in wall-clock time, on realistic problems from machine learning. The workshop will review the existing, by now quite strong, theoretical case against the use of random numbers for integration, discuss recent algorithmic developments, relationships between conceptual approaches, and highlight central research challenges going forward.

Among the questions to be addressed by the workshop are:

* How fast can a practical integral estimate on a deterministic function converge (polynomially, super-polynomially, not just “better than sqrt(N)”)?
* How are these rates related, precisely, to prior assumptions about the integrand, and to the design rules of the integrator?
* To which degree can the source code of an integration problem be parsed to choose informative priors?
* Are random numbers necessary and helpful for efficient multivariate integration, or are they a conceptual crutch that causes inefficiencies?
* What are the practical challenges in the design of efficient multivariate integration methods that use such prior information?

The workshop builds upon the growing field of probabilistic numerics, for which probabilistic integration is a core component. A community website for probabilistic numerics can be found at http://probabilistic-numerics.org.
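The theoretical case against random numbers can already be felt in one dimension: for a smooth integrand, plain Monte Carlo converges at the sqrt(N) rate, while even the humble trapezoid rule converges far faster. The integrand below is an arbitrary smooth example.

    import numpy as np

    rng = np.random.default_rng(6)

    def f(x):
        return np.exp(np.sin(3 * x))   # an arbitrary smooth integrand on [0, 1]

    def trapezoid(n):
        x = np.linspace(0.0, 1.0, n)
        y = f(x)
        return ((y[:-1] + y[1:]) / 2 * np.diff(x)).sum()

    truth = trapezoid(1_000_001)       # high-accuracy reference value

    for n in (10, 100, 1000, 10_000):
        mc = f(rng.uniform(size=n)).mean()   # plain Monte Carlo estimate
        tz = trapezoid(n)                    # deterministic rule, same budget
        print(f"n={n:>6}  MC error={abs(mc - truth):.1e}  "
              f"trapezoid error={abs(tz - truth):.1e}")

Bayesian quadrature pushes this further by encoding smoothness assumptions in a prior over the integrand, which is the direction several of the talks below explore.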

Schedule

09:10 AM  Can We Estimate a Constant? (Christian Robert)
09:40 AM  Theoretical Guarantees for Probabilistic Integration (François-Xavier Briol)
10:30 AM  Learning with Probabilities as Inputs, Using Kernels (Arthur Gretton)
11:00 AM  Bayesian Quadrature: Lessons Learned and Looking Forwards (Roman Garnett)
11:30 AM  Distilling Intractable Generative Models (George Papamakarios)
11:45 AM  Bayesian Lipschitz Constant Estimation and Quadrature (Jan Calliess)
02:30 PM  Convergence Rates of Kernel Quadrature Rules (Bach)
03:00 PM  Automatic Differentiation: The most criminally underused tool in probabilistic numerics (David Duvenaud)
03:30 PM  Optimization Monte Carlo (Max Welling)

Bounded Optimality and Rational Metareasoning
Samuel J Gershman, Falk Lieder, Tom Griffiths, Noah Goodman
512 bf, Fri Dec 11, 08:30 AM

Formal definitions of rationality are instrumental for understanding and designing intelligent systems. By specifying the optimal way to reason under the constraint of limited information, Bayesian rationality has enabled tremendous advances in machine learning and artificial intelligence, together with deep insights into human cognition and brain function. Bounded optimality (Horvitz, 1989; Russell & Wefald, 1991a) extends Bayesian rationality by taking into account two additional constraints: limited time and finite computational resources. Bounded optimality is a practical framework for designing the best AI system possible given the constraints of its limited-performance hardware (Russell & Subramanian, 1995), and provides a way to capture the time and resource constraints on human cognition. To adaptively allocate their finite computational resources, bounded agents may have to perform rational metareasoning (Russell & Wefald, 1991b), which corresponds to topics like cognitive control and metacognition studied in cognitive neuroscience and psychology.

Current research in cognitive science is leveraging bounded optimality and rational metareasoning to understand how the human mind can achieve so much with so little computation (Gershman, Horvitz, & Tenenbaum, in press; Vul, Goodman, Griffiths, & Tenenbaum, 2014), to develop and constrain process models of cognition (Griffiths, Lieder, & Goodman, 2015; Lewis, Howes, & Singh, 2014), to reevaluate the evidence for human irrationality, and to rethink heuristics and biases (Lieder, Griffiths, & Goodman, 2013; Lieder, Plunkett, et al., 2014). Rational metareasoning and bounded optimality also have interesting connections to neuroscience, including the top-down control of neural information processing (e.g., Shenhav, Botvinick, & Cohen, 2013) and neural coding (Gershman & Wilson, 2010).

This workshop brings together computer scientists working on bounded optimality and metareasoning with psychologists and neuroscientists reverse-engineering the computational principles that make the human brain incredibly resource-efficient. The goal of this workshop is to synthesize these different perspectives on bounded optimality, to promote interdisciplinary interactions and cross-fertilization, and to identify directions for future research.

References

1. Horvitz, E. (1987). Reasoning about Beliefs and Actions under Computational Resource Constraints. Third Workshop on Uncertainty in Artificial Intelligence, Seattle, Washington. Association for Uncertainty and Artificial Intelligence, pp. 429-444.
2. Russell, S. J., & Wefald, E. H. (1991a). Do the Right Thing: Studies in Limited Rationality. Cambridge, MA: MIT Press.
3. Russell, S. J., & Wefald, E. H. (1991b). Principles of Metareasoning. Artificial Intelligence, 49, 361-395.
4. Russell, S. J., & Subramanian, D. (1995). Provably bounded-optimal agents. Journal of Artificial Intelligence Research, 2, 575-609.
5. Gershman, S. J., Horvitz, E. J., & Tenenbaum, J. B. (2015). Computational rationality: A converging paradigm for intelligence in brains, minds and machines. Science, 349, 273-278.
6. Vul, E., Goodman, N., Griffiths, T. L., & Tenenbaum, J. B. (2014). One and done? Optimal decisions from very few samples. Cognitive Science, 38(4), 599-637.
7. Griffiths, T. L., Lieder, F., & Goodman, N. D. (2015). Rational use of cognitive resources: Levels of analysis between the computational and the algorithmic. Topics in Cognitive Science, 7(2), 217-229.
8. Lieder, F., Plunkett, D., Hamrick, J. B., Russell, S. J., Hay, N. J., & Griffiths, T. L. (2014). Algorithm Selection by Rational Metareasoning as a Model of Human Strategy Selection. Advances in Neural Information Processing Systems 27, pp. 2870-2878.
9. Lewis, R. L., Howes, A., & Singh, S. (2014). Computational Rationality: Linking Mechanism and Behavior Through Bounded Utility Maximization. Topics in Cognitive Science, 6(2), 279-311.
10. Lieder, F., Griffiths, T. L., & Goodman, N. D. (2013). Burn-in, bias, and the rationality of anchoring. Advances in Neural Information Processing Systems 26, 2690-2798.
11. Shenhav, A., Botvinick, M. M., & Cohen, J. D. (2013). The expected value of control: an integrative theory of anterior cingulate cortex function. Neuron, 79(2), 217-240.
12. Gershman, S., & Wilson, R. (2010). The neural costs of optimal control. Advances in Neural Information Processing Systems 23, 712-720.
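A toy numerical illustration of the computation/accuracy trade-off at the heart of bounded optimality, loosely in the spirit of the "one and done" analysis cited above (the option values, noise level, and per-sample deliberation cost are all invented): an agent chooses how many noisy evaluations to draw before picking the better of two options, and net utility peaks at a small number of samples.

    import numpy as np

    rng = np.random.default_rng(7)

    # Two options whose true values differ; each "thought" (sample) is a noisy
    # evaluation. More samples give better choices but cost deliberation time.
    v = np.array([0.0, 0.5])
    noise, cost_per_sample, trials = 2.0, 0.02, 20_000

    for k in (1, 2, 4, 8, 16, 32):
        # The mean of k noisy evaluations has standard deviation noise/sqrt(k).
        est = v + rng.normal(scale=noise / np.sqrt(k), size=(trials, 2))
        reward = v[np.argmax(est, axis=1)].mean()  # value of the chosen option
        print(f"k={k:>2}  net utility = {reward - cost_per_sample * k:+.3f}")

Past a few samples, the marginal gain in decision quality no longer pays for the extra computation, which is the bounded-optimal agent's reason to stop thinking early.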

Cognitive Computation: Integrating neural and symbolic approaches
Artur Garcez, Tarek R. Besold, Risto Miikkulainen, Gary Marcus
512 cg, Fri Dec 11, 08:30 AM

While early work on knowledge representation and inference was primarily symbolic, the corresponding approaches subsequently fell out of favor and were largely supplanted by connectionist methods. In this workshop, we will work to close the gap between the two paradigms, and aim to formulate a new unified approach that is inspired by our current understanding of human cognitive processing. This is important to help improve our understanding of Neural Information Processing and build better Machine Learning systems, including the reuse of knowledge learned in one application domain in analogous domains.

The workshop brings together world leaders in the fields of neural computation, logic and artificial intelligence, natural language understanding, cognitive science, and computational neuroscience. Over the two workshop days, their invited lectures will be complemented with presentations based on contributed papers and poster sessions, giving ample opportunity to interact and discuss the different perspectives and emerging approaches.

The workshop targets a single broad theme of general interest to the vast majority of the NIPS community, namely the study of translations and ways of integration between neural models and knowledge representation for the purpose of achieving an effective integration of learning and reasoning. Neural-symbolic computing is now an established topic of wider interest to NIPS, with topics that are relevant to almost everyone studying neural information processing. Some of the relevant keywords characterizing the event are: neural-symbolic computing; language processing; cognitive agents; multimodal learning; deep networks; symbol manipulation; variable binding; integration of learning and reasoning.
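One concrete bridge between the symbolic and neural views, and the subject of the "Four facts about TPRs" talk on the schedule below, is the tensor product representation: fillers are bound to roles by outer products and superposed into a single array. The sketch uses random orthonormal role vectors, an illustrative choice that makes unbinding exact.

    import numpy as np

    rng = np.random.default_rng(8)

    d = 16
    # Random orthonormal role vectors (via QR) and random filler vectors.
    roles, _ = np.linalg.qr(rng.normal(size=(d, d)))
    agent, patient = roles[:, 0], roles[:, 1]
    john, mary = rng.normal(size=d), rng.normal(size=d)

    # Bind fillers to roles with outer products and superpose into one tensor:
    # this single array encodes "John (agent), Mary (patient)".
    S = np.outer(john, agent) + np.outer(mary, patient)

    # Unbinding: because the roles are orthonormal, S @ role recovers a filler.
    print("recovered agent ~ John:", np.allclose(S @ agent, john))
    print("recovered patient ~ Mary:", np.allclose(S @ patient, mary))

The appeal for neural-symbolic integration is that the bound structure lives entirely in a vector space, so symbolic constituents remain recoverable from a distributed representation.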

Day 1 (Fri Dec 11)

08:40 AM  Neural-symbolic integration: Challenges, promises, perspectives, ideas (Besold)
09:00 AM  Sum-Product Networks and Tractable Markov Logic: An End-to-End Neural-Symbolic System (Domingos)
09:30 AM  Deep Symbolic Learning (Muggleton)
10:30 AM  Unifying Symbolic and Probabilistic Reasoning via Mixed Graphical Models (Dechter)
11:00 AM  Cognitive Foundations for Common-Sense Knowledge Representation and Reasoning (Tenenbaum)
11:30 AM  Discussion Panel with Morning Speakers (Day 1) (Domingos, Muggleton, Dechter, Tenenbaum)
03:30 PM  Reasoning with Memory Networks: Successes and Challenges (Bordes)
04:00 PM  Motivation (Guha)
05:00 PM  How Can We Direct Our Agents? (Wayne)
05:30 PM  Discussion Panel with Afternoon Speakers (Day 1) (Guha, Bordes, Wayne)

Day 2 (Sat Dec 12)

08:30 AM  Turing Computation with Recurrent Artificial Neural Networks (Carmantini)
09:00 AM  Lifted Relational Neural Networks (Sourek)
09:30 AM  Relational Knowledge Extraction from Neural Networks (Garcez)
10:30 AM  Lifelong Machine Learning and Reasoning (Silver)
11:00 AM  Dense Models and Reasoning (Witbrock)
11:30 AM  Discussion Panel with Morning Speakers (Day 2) (Carmantini, Sourek, Garcez, Silver, Witbrock)
03:30 PM  Combinatorial structures and processing in Neural Blackboard Architectures (van der Velde)
04:00 PM  Four facts about TPRs (Smolensky)
05:00 PM  Putting the "neural" back in neural networks (Marcus)
05:30 PM  Discussion Panel with Afternoon Speakers (Day 2) (van der Velde, Smolensky, Marcus)

Multimodal Machine Learning
LP Morency, Tadas Baltrusaitis, Aaron Courville, Kyunghyun Cho

512 dh, Fri Dec 11, 08:30 AM

Workshop Overview

Multimodal machine learning aims at building models that can process and relate information from multiple modalities. From the early research on audio-visual speech recognition to the recent explosion of interest in models mapping images to natural language, multimodal machine learning is a vibrant multi-disciplinary field of increasing importance and with extraordinary potential. Learning from paired multimodal sources offers the possibility of capturing correspondences between modalities and gaining an in-depth understanding of natural phenomena. Thus, multimodal data provide a means of reducing our dependence on the more standard supervised learning paradigm that is inherently limited by the availability of labeled examples. This research field brings some unique challenges for machine learning researchers, given the heterogeneity of the data and the complementarity often found between modalities.

This workshop will facilitate progress in multimodal machine learning by bringing together researchers from natural language processing, multimedia, computer vision, speech processing and machine learning to discuss the current challenges and identify the research infrastructure needed to enable a stronger multidisciplinary collaboration.

For keynote talk abstracts and MMML 2015 workshop proceedings: https://sites.google.com/site/multiml2015/

Oral presentation
- Listen, Attend, and Walk: Neural Mapping of Navigational Instructions to Action Sequences. Hongyuan Mei, Mohit Bansal, Matthew Walter

Oral spotlights
- An Analysis-By-Synthesis Approach to Multisensory Object Shape Perception. Goker Erdogan, Ilker Yildirim, Robert Jacobs
- Active Perception based on Multimodal Hierarchical Dirichlet Processes. Tadahiro Taniguchi, Toshiaki Takano, Ryo Yoshino
- Towards Deep Alignment of Multimodal Data. George Trigeorgis, Mihalis Nicolaou, Stefanos Zafeiriou, Bjorn Schuller
- Multimodal Transfer Deep Learning with an Application in Audio-Visual Recognition. Seungwhan Moon, Suyoun Kim, Haohan Wang

Posters
- Multimodal Convolutional Neural Networks for Matching Image and Sentence. Lin Ma, Zhengdong Lu, Lifeng Shang, Hang Li
- Group sparse factorization of multiple data views. Eemeli Leppäaho, Samuel Kaski
- Unveiling the Dreams of Word Embeddings: Towards Language-Driven Image Generation. Angeliki Lazaridou, Dat Tien Nguyen, Raffaella Bernardi, Marco Baroni
- Cross-Modal Attribute Recognition in Fashion. Susana Zoghbi, Geert Heyman, Juan Carlos Gomez Carranza, Marie-Francine Moens
- Multimodal Sparse Coding for Event Detection. Youngjune Gwon, William Campbell, Kevin Brady, Douglas Sturim, Miriam Cha, H. T. Kung
- Multimodal Symbolic Association using Parallel Multilayer Perceptron. Federico Raue, Sebastian Palacio, Thomas Breuel, Wonmin Byeon, Andreas Dengel, Marcus Liwicki
- Bridge Correlational Neural Networks for Multilingual Multimodal Representation Learning. Janarthanan Rajendran, Mitesh Khapra, Sarath Chandar, Balaraman Ravindran
- Multimodal Learning of Object Concepts and Word Meanings by Robots. Tatsuya Aoki, Takayuki Nagai, Joe Nishihara, Tomoaki Nakamura, Muhammad Attamimi
- Multi-task, Multi-Kernel Learning for Estimating Individual Wellbeing. Natasha Jaques, Sara Taylor, Akane Sano, Rosalind Picard
- Generating Images from Captions with Attention. Elman Mansimov, Emilio Parisotto, Jimmy Ba, Ruslan Salakhutdinov
- Manifold Alignment Determination. Andreas Damianou, Neil Lawrence, Carl Henrik Ek
- Accelerating Multimodal Sequence Retrieval with Convolutional Networks. Colin Raffel, Daniel P. W. Ellis
- Audio-Visual Fusion for Noise Robust Speech Recognition. Nagasrikanth Kallakuri, Ian Lane
- Learning Multimodal Semantic Models for Image Question Answering. Zichao Yang, Xiaodong He, Jianfeng Gao, Li Deng
- Greedy Vector-valued Multi-view Learning. Hachem Kadri, Stephane Ayache, Cecile Capponi, François-Xavier Dupé
- S2VT: Sequence to Sequence -- Video to Text. Subhashini Venugopalan, Marcus Rohrbach, Jeff Donahue, Raymond Mooney, Trevor Darrell, Kate Saenko
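One classical way to capture such correspondences between paired modalities is canonical correlation analysis (CCA). The toy sketch below is not drawn from any of the papers above; the two "views" and all dimensions are synthetic, invented purely to show the idea of recovering shared structure:

    import numpy as np

    rng = np.random.default_rng(0)

    # Paired "image" and "text" views generated from a shared 5-d latent factor.
    n, d_img, d_txt, k = 500, 40, 30, 5
    z = rng.normal(size=(n, k))
    img = z @ rng.normal(size=(k, d_img)) + 0.1 * rng.normal(size=(n, d_img))
    txt = z @ rng.normal(size=(k, d_txt)) + 0.1 * rng.normal(size=(n, d_txt))

    def whiten(X):
        # Orthonormal basis for the centered data (left singular vectors).
        U, _, _ = np.linalg.svd(X - X.mean(0), full_matrices=False)
        return U

    # Canonical correlations = singular values of the whitened cross-covariance.
    corr = np.linalg.svd(whiten(img).T @ whiten(txt), compute_uv=False)
    print("top canonical correlations:", np.round(corr[:6], 3))

On this synthetic data the first five correlations come out near 1 (the shared latent dimensions) and the rest drop off, which is exactly the cross-modal correspondence structure the description refers to.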


Schedule

09:00 AM  Introduction (Courville)
09:15 AM  Visual Question Answering (Batra)
10:00 AM  Listen, Attend and Walk: Neural Mapping of Navigational Instructions to Action Sequences (Walter)
10:30 AM  Accepted Orals and Spotlights (Erdogan, Trigeorgis, Moon, Taniguchi)
11:00 AM  Multimodal Transfer Deep Learning with Applications in Audio-Visual Recognition (Moon)
02:30 PM  Generating Natural-Language Video Descriptions using LSTM Recurrent Neural Networks (Mooney)
03:15 PM  Cross-Modality Distant Supervised Learning for Speech, Text, and Image Classification (Deng)
04:30 PM  Generating Images from Captions with Attention (Salakhutdinov)
05:15 PM  Automatic Cross-Media Event Schema Construction and Knowledge Population (Ji)

Machine Learning for (e-)Commerce
Esteban Arcaute, Mohammad Ghavamzadeh, Shie Mannor, Georgios Theocharous

512 e, Fri Dec 11, 08:30 AM

The goal of this workshop is to study the challenges in learning, evaluating, and mining of e-commerce and more classical commerce domains. As the largest commerce and e-commerce companies on the planet adopt machine learning technologies, it becomes increasingly clear that these domains present different challenges than classical machine learning problems. In this workshop we plan to focus on the problems more than on solutions. We will consider problems such as identifying dysfunctional items or collections in a website, off-policy evaluation of marketing strategies, personalization of the e-commerce experience, validation, sequential decisions, dynamic pricing, and others. Our main goal is to portray the main challenges of the field and to propose an industry-academia agreed collection of benchmark problems for theoretical study and experimental work.
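Off-policy evaluation, one of the problems listed above, is commonly approached with inverse-propensity scoring (IPS). The following minimal sketch uses entirely invented logging and target policies and a synthetic reward model; it is an illustration of the estimator, not any benchmark proposed by the workshop:

    import numpy as np

    rng = np.random.default_rng(0)
    n_actions, n_logs = 5, 100_000

    # Logged interactions: actions chosen by a known uniform logging policy.
    logging_probs = np.full(n_actions, 1.0 / n_actions)
    actions = rng.integers(0, n_actions, size=n_logs)
    rewards = rng.binomial(1, 0.1 + 0.1 * actions)   # synthetic reward model

    # A new target policy we would like to evaluate without deploying it.
    target_probs = np.array([0.05, 0.05, 0.1, 0.2, 0.6])

    # Inverse-propensity-scoring (importance-weighted) value estimate.
    weights = target_probs[actions] / logging_probs[actions]
    print("IPS estimate:", np.mean(weights * rewards))
    print("true value  :", np.dot(target_probs, 0.1 + 0.1 * np.arange(n_actions)))

The estimate is unbiased as long as the logging policy gives every action nonzero probability, which is one reason randomized logging is valuable in commerce systems.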

Advances in Approximate Bayesian Inference
Dustin Tran, Tamara Broderick, Stephan Mandt, James McInerney, Shakir Mohamed, Alp Kucukelbir, Matthew D. Hoffman, Neil Lawrence, David Blei

513 ab, Fri Dec 11, 08:30 AM

The ever-increasing size of data sets has resulted in an immense effort in Bayesian statistics to develop more expressive and scalable probabilistic models.


Inference remains a challenge and limits the use of these models in large-scale scientific and industrial applications. Asymptotically exact schemes such as Markov chain Monte Carlo (MCMC) are often slow to run and difficult to evaluate in finite time. Thus we must resort to approximate inference, which allows for more efficient run times and more reliable convergence diagnostics on large-scale and streaming data—without compromising on the complexity of these models. This workshop aims to bring together researchers and practitioners in order to discuss recent advances in approximate inference; we also aim to discuss the methodological and foundational issues in such techniques in order to consider future improvements. The resurgence of interest in approximate inference has furthered development in many techniques: for example, scalability, variance reduction, and preserving dependency in variational inference; divide and conquer techniques in expectation propagation; dimensionality reduction using random projections; and stochastic variants of Laplace approximation-based methods. Approximate inference techniques have clearly emerged as the preferred way to perform tractable Bayesian inference. Despite this interest, there remain significant trade-offs in speed, accuracy, generalizability, and learned model complexity. In this workshop, we will discuss how to rigorously characterize these trade-offs, as well as how they might be made more favourable. Moreover, we will address the adoption of these techniques in scientific communities, which could benefit from advice on their practical usage and from the development of relevant software packages.
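As a concrete illustration of the variational techniques mentioned above, the sketch below fits a Gaussian approximation to a posterior by stochastic gradient ascent on the ELBO with the reparameterization trick. The model (Gaussian likelihood with known unit variance, broad Gaussian prior) and every constant are invented for the example; the model is conjugate precisely so the exact posterior can be printed for comparison:

    import numpy as np

    rng = np.random.default_rng(1)
    data = rng.normal(2.0, 1.0, size=50)      # observations, likelihood N(z, 1)
    prior_var = 100.0                         # prior z ~ N(0, prior_var)

    # Variational family q(z) = N(mu, sigma^2); maximize the ELBO with
    # Monte Carlo gradients via the reparameterization z = mu + sigma * eps.
    mu, log_sigma = 0.0, 0.0
    lr, n_samples = 0.01, 8
    for step in range(2000):
        sigma = np.exp(log_sigma)
        eps = rng.normal(size=n_samples)
        z = mu + sigma * eps
        # d/dz log p(data, z) for this conjugate Gaussian model
        dlogp_dz = np.sum(data) - len(data) * z - z / prior_var
        grad_mu = np.mean(dlogp_dz)
        # entropy of q contributes +1 to the log_sigma gradient
        grad_log_sigma = np.mean(dlogp_dz * eps * sigma) + 1.0
        mu += lr * grad_mu
        log_sigma += lr * grad_log_sigma

    print(f"variational: N({mu:.3f}, {np.exp(log_sigma) ** 2:.4f})")
    var_exact = 1.0 / (len(data) + 1.0 / prior_var)
    print(f"exact      : N({var_exact * np.sum(data):.3f}, {var_exact:.4f})")

Because the variational family here contains the true posterior, the two lines agree closely; for non-conjugate models the same loop applies but the approximation gap becomes the interesting quantity.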

Deep Reinforcement Learning
Pieter Abbeel, John Schulman, Satinder Singh, David Silver


513 cd, Fri Dec 11, 08:30 AM Although the theory of reinforcement learning addresses an extremely general class of learning problems with a common mathematical formulation, its power has been limited by the need to develop task-specific feature representations. A paradigm shift is occurring as researchers figure out how to use deep neural networks as function approximators in reinforcement learning algorithms; this line of work has yielded remarkable empirical results in recent years. This workshop will bring together researchers working at the intersection of deep learning and reinforcement learning, and it will help researchers with expertise in one of these fields to learn about the other.
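To make "deep neural networks as function approximators in reinforcement learning" concrete, here is a minimal semi-gradient Q-learning sketch with a one-hidden-layer network. The five-state chain environment, network size, and learning rates are all invented for illustration and are not any speaker's method:

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy chain MDP: states 0..4, actions {0: left, 1: right}; reward on
    # reaching the rightmost state, which ends the episode.
    n_states, n_actions, gamma = 5, 2, 0.9

    def env_step(s, a):
        s2 = max(0, s - 1) if a == 0 else min(n_states - 1, s + 1)
        done = s2 == n_states - 1
        return s2, (1.0 if done else 0.0), done

    # One-hidden-layer Q-network over one-hot state encodings.
    H = 16
    W1 = rng.normal(0, 0.5, (H, n_states)); b1 = np.zeros(H)
    W2 = rng.normal(0, 0.5, (n_actions, H)); b2 = np.zeros(n_actions)

    def q_values(s):
        x = np.eye(n_states)[s]
        h = np.maximum(0.0, W1 @ x + b1)          # ReLU hidden layer
        return W2 @ h + b2, h, x

    lr, epsilon = 0.05, 0.2
    for episode in range(300):
        s = 0
        for _ in range(100):                      # cap episode length
            q, h, x = q_values(s)
            a = rng.integers(n_actions) if rng.random() < epsilon else int(np.argmax(q))
            s2, r, done = env_step(s, a)
            target = r if done else r + gamma * np.max(q_values(s2)[0])
            td = target - q[a]                    # semi-gradient TD error
            dh = td * W2[a] * (h > 0)             # backprop through chosen head
            W2[a] += lr * td * h; b2[a] += lr * td
            W1 += lr * np.outer(dh, x); b1 += lr * dh
            if done:
                break
            s = s2

    print("greedy actions:", [int(np.argmax(q_values(s)[0])) for s in range(n_states)])

After training, the greedy policy moves right toward the reward, showing the basic mechanics (network forward pass, TD target, gradient step) that deep RL systems scale up with replay buffers and target networks.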

Schedule

09:00 AM  Deep Learning for Real-Time Atari Game Play Using Offline Monte-Carlo Tree Search Planning (Lee)
09:30 AM  On General Problem Solving and How to Learn an Algorithm (Schmidhuber)
11:00 AM  The Deep Reinforcement Learning Boom (Mnih)
11:30 AM  Deep RL in Games Research (Tesauro)
12:00 PM  Osaro (Arel)
02:00 PM  Deep Robotic Learning (Levine)
02:30 PM  RL for DL (Bengio)
05:00 PM  Deep RL for Learning Machines: How to do Deep RL in the Real World (Riedmiller)
05:30 PM  Compressed Neural Networks for RL (Koutnik)

The 1st International Workshop "Feature Extraction: Modern Questions and Challenges"
Dmitry Storcheus, Sanjiv Kumar, Afshin Rostamizadeh

513 ef, Fri Dec 11, 08:30 AM

UPDATE: The workshop proceedings will be published in a special issue of the Journal of Machine Learning Research prior to the workshop date. For that reason, submissions are extended to 10 pages (excluding references and appendix) in JMLR format. The authors of accepted submissions will be asked to provide a camera-ready version within 7 days of acceptance notification.

The problem of extracting features from given data is of critical importance for the successful application of machine learning. Feature extraction, as usually understood, seeks an optimal transformation from raw data into features that can be used as an input for a learning algorithm. In recent times this problem has been attacked using a growing number of diverse techniques that originated in separate research communities: from PCA and LDA to manifold and metric learning. It is the goal of this workshop to provide a platform to exchange ideas and compare results across these techniques. The workshop will consist of three sessions, each


dedicated to a specific open problem in the area of feature extraction. The sessions will start with invited talks and conclude with panel discussions, where the audience will engage in debates with speakers and organizers. We welcome submissions from sub-areas such as general embedding techniques, metric learning, scalable nonlinear features, and deep neural networks. More often than not, studies in each of these areas do not compare or evaluate methods found in the other areas. It is the goal of this workshop to begin the discussions needed to remedy this. We encourage submissions to foster open discussions around such important questions, which include, but are not limited to:

1. Scalability. We have recently managed to scale up convex methods. Most remarkably, approximating kernel functions via random Fourier features has enabled kernel machines to match DNNs. That inspired many efficient feature extraction methods; for instance, Monte Carlo methods improved the results of Fourier features, and approximating polynomial kernels via explicit feature maps showed remarkable performance. What does it all mean for the prospects of convex scalable methods? Can they become state of the art in the near future? (A minimal sketch of the random-features idea appears after this list.)

2. Convex and non-convex feature extraction. While deep nets suffer from non-convexity and the lack of theoretical guarantees, kernel machines are convex and well studied mathematically. Thus, it is extremely tempting to resort to kernels in understanding neural nets. Can we shed more light on their connection?

3. Balance between extraction and classification stages. We often see in real-world applications (e.g. spam detection, audio filtering) that feature extraction is CPU-heavy compared to classification. The classic way to balance them was to sparsify the choice of features with L1 regularization. A promising alternative is to use trees of classifiers. However, this problem is NP-hard, so a number of relaxations have been suggested. Which relaxations are better, and will the tree-based approaches to the extraction/classification tradeoff become the state of the art?

4. Supervised vs. unsupervised. Can we understand which methods are most useful for particular settings and why?

5. Theory vs. practice. Certain methods are supported by significant theoretical guarantees, but how do these guarantees translate into performance in practice?
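For readers unfamiliar with the random-features idea referenced in question 1, here is a minimal sketch of random Fourier features in the style of Rahimi and Recht (2007); the data, kernel bandwidth, and feature count are arbitrary choices for illustration:

    import numpy as np

    rng = np.random.default_rng(0)

    def rff_features(X, n_features=500, gamma=1.0):
        # Random Fourier features approximating the RBF kernel
        # k(x, y) = exp(-gamma * ||x - y||^2): frequencies are drawn from
        # the kernel's spectral density, phases uniformly.
        d = X.shape[1]
        W = rng.normal(0.0, np.sqrt(2.0 * gamma), size=(d, n_features))
        b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
        return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

    X = rng.normal(size=(200, 10))
    Z = rff_features(X)
    K_exact = np.exp(-((X[:, None] - X[None, :]) ** 2).sum(-1))
    K_approx = Z @ Z.T
    print("max abs kernel error:", np.abs(K_exact - K_approx).max())

The point of the construction is that a linear model on Z approximates a kernel machine while training scales linearly in the number of examples, which is what makes the "convex scalable methods" question credible.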

Schedule

08:50 AM  Feature Extraction (Storcheus)
09:15 AM  Do Shallow Kernel Methods Match Deep Neural Networks? (Sha)
10:00 AM  Discovering Salient Features via Adaptively Chosen Comparisons (Zou)
02:50 PM  Deep Manifold Traversal (Weinberger)
03:35 PM  Explaining individual deep network predictions and measuring the quality of these explanations (Montavon)

Adaptive Data Analysis
Adam Smith, Aaron Roth, Vitaly Feldman, Moritz Hardt

514 a, Fri Dec 11, 08:30 AM

Adaptive data analysis is the increasingly common practice by which insights gathered from data are used to inform further analysis of the same data sets. This is common practice both in machine learning and in scientific research, in which data sets are shared and re-used across multiple studies. Unfortunately, most of the statistical inference theory used in empirical sciences to control false discovery rates, and in machine learning to avoid overfitting, assumes a fixed class of hypotheses to test, or family of functions to optimize over, selected independently of the data. If the set of analyses run is itself a function of the data, much of this theory becomes invalid, and indeed, has been blamed as one of the causes of the crisis of reproducibility in empirical science. Recently, there have been several exciting proposals for how to avoid overfitting and guarantee statistical validity even in general adaptive data analysis settings. The problem is important, and ripe for further advances. The goal of this workshop is to bring together members of different communities (from machine learning, statistics, and theoretical computer science) interested in solving this problem, to share recent results, to discuss promising directions for future research, and to foster collaborations. The workshop will consist of several sessions of invited talks, with a panel discussion among the speakers following each session.
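A small synthetic demonstration of why adaptivity invalidates the classical theory: selecting features and then evaluating them on the same data manufactures "signal" out of pure noise. Everything below is simulated, and the exact numbers are illustrative only:

    import numpy as np

    rng = np.random.default_rng(0)
    n, d = 1000, 5000
    X = rng.standard_normal((n, d))
    y = rng.choice([-1.0, 1.0], size=n)          # labels independent of X

    # Adaptive analysis: pick the features that look most predictive, then
    # build a classifier from them, reusing the SAME data both times.
    corr = X.T @ y / n
    selected = np.argsort(np.abs(corr))[-20:]
    scores = X[:, selected] @ np.sign(corr[selected])
    print("reused-data accuracy:", np.mean(np.sign(scores) == y))

    # Fresh data reveals there was never any signal to find.
    X_new = rng.standard_normal((n, d))
    y_new = rng.choice([-1.0, 1.0], size=n)
    scores_new = X_new[:, selected] @ np.sign(corr[selected])
    print("fresh-data accuracy :", np.mean(np.sign(scores_new) == y_new))

The reused-data accuracy lands well above 50% while the fresh-data accuracy stays at chance; the proposals discussed at this workshop aim to bound exactly this kind of adaptively induced bias.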

Time Series Workshop
Oren Anava, Azadeh Khaleghi, Vitaly Kuznetsov, Alexander Rakhlin

514 bc, Fri Dec 11, 08:30 AM

Data in the form of time-dependent sequential observations emerge in many key real-world problems, ranging from biological data and financial markets to weather forecasting and audio/video processing. However, despite the ubiquity of such data, the vast majority of learning algorithms have been primarily developed for the setting in which sample points are drawn i.i.d. from some possibly unknown fixed distribution. While there exist algorithms designed to handle non-i.i.d. data, these typically assume a specific parametric form for the data-generating distribution. Such assumptions may undermine the possibly complex nature of modern data, which can possess long-range dependency patterns that we now have the computing power to discern. At the other extreme, some online learning algorithms consider a non-stochastic framework without any distributional assumptions. However, such methods may fail to fully address the stochastic aspect of real-world time-series data.

The goal of this workshop is to bring together theoretical and applied researchers interested in the analysis of time series and the development of new algorithms to process sequential data. This includes algorithms for time series prediction, classification, clustering, anomaly and change point detection, correlation discovery, and dimensionality reduction, as well as a general theory for learning and comparing stochastic processes. We invite researchers from the related areas of batch and online learning, reinforcement learning, data analysis and statistics, econometrics, and many others to contribute to this workshop.

Schedule

09:00 AM  Learning Theory and Algorithms for Time Series (Mohri)
09:50 AM  Isotonic Hawkes Process (Wang)
02:30 PM  Bayesian Time Series: Structured Representations for Scalability (Fox)
03:10 PM  Sparse Adaptive Prior for Time Dependent Model Parameters (Dodge)
03:20 PM  Design of Covariance Functions using Inter-Domain Inducing Variables (Tobar)
03:40 PM  Markov GP for Scalable Expressive Online Bayesian Nonparametric Time Series Forecasting (Kom Samo)
04:30 PM  Between stochastic and adversarial: forecasting with online ARMA models (Mannor)
05:10 PM  Confidence intervals for the mixing time of a reversible Markov chain from a single sample path (Szepesvari)
05:25 PM  Wavelet Methods for Time Series (Gencay)
06:05 PM  Temporal Regularized Matrix Factorization (Yu)


Applying (machine) Learning to Experimental Physics (ALEPH) and «Flavours of Physics» challenge
Pavel Serdyukov, Andrey Ustyuzhanin, Marcin Chrząszcz, Francesco Dettori, Marc-Olivier Bettler

515 bc, Fri Dec 11, 08:30 AM

Experimental physics actively develops the frontiers of our knowledge of the Universe, ranging from macroscopic objects observed through telescopes to the micro-world of particle interactions. In each field of study, scientists go from raw measurements (celestial object spectra, or energies of particles detected inside collider detectors) to higher levels of representation that are more suitable for further analysis and for human perception. Each measurement can be used to support or refute certain theories, which compete for predictive power and completeness.

Many areas of experimental physics assimilated computational paradigms long ago: both simulators and semi-automatic data analysis techniques have been applied widely for decades. In particular, nonparametric classification and regression are now routinely used as parts of the reconstruction (inference) chain. More recently, state-of-the-art budgeted learning techniques have also started to be used for real-time event selection at the LHC. Nevertheless, most of these applications went largely unnoticed by the machine learning (ML) community.

Our primary goal is to bring the Physics and ML communities together to initiate discussions on Physics-motivated problems and applications in ML. The ML community is still largely untouched by the numerous learning challenges coming from Physics. We hope that as a result of this workshop (as well as of the «Flavours of Physics» challenge organized before the workshop, and of the new dataset shared in its scope, both to be discussed at the workshop), these problems will attract more attention from ML researchers.

Schedule

08:30 AM  Flavours of Physics Challenge (Ustyuzhanin)
09:00 AM  Wonders and Woes of ML Competitions (Hamner)
09:20 AM  Data Science at LHCb (Head)
11:10 AM  Open ML Problems in High Energy Physics (Whiteson)
02:00 PM  The HiggsML Story (Kégl)
02:40 PM  Deep Learning RNNaissance (Schmidhuber)
03:00 PM  Flavours of Physics: third place solution
03:30 PM  Identifying Tau to Three Muon Decay Events at the LHCb Using a Combination of Hand-Crafted and Automatic Feature Engineering and Ensemble Algorithms (Gargeya)
04:00 PM  Jet Images (de Oliveira)
04:20 PM  Output Correction in HEP Using DPGMM (Omari)
04:45 PM  Building a Robust Detector Algorithm (Tandon)
04:50 PM  An alternative to ABC for likelihood-free inference (Cranmer)
05:30 PM  Machine Learning in HEP (Williams)

Machine Learning and Interpretation in Neuroimaging (day 1)
Irina Rish, Leila Wehbe, Brian Murphy, Georg Langs, Guillermo Cecchi, Moritz Grosse-Wentrup

Room 515 a, Fri Dec 11, 08:30 AM

URL: https://sites.google.com/site/mliniworkshop2015/

Modern multivariate statistical methods have been increasingly applied to various problems in neuroimaging, including “mind reading”, “brain mapping”, clinical diagnosis and prognosis. Multivariate pattern analysis (MVPA) methods are designed to examine complex relationships between large-dimensional signals, such as brain MRI images, and an outcome of interest, such as the category of a stimulus, with a limited amount of data. The MVPA approach is in contrast with the classical mass-univariate (MUV) approach that treats each individual imaging measurement in isolation. Recent work in neuroscience has started to move away from conventional lab-based studies towards more naturalistic behavioral tasks (e.g. normal reading, movie watching), with mobile neuroimaging technologies (EEG, NIRS), and real-world applications (e.g. in psychiatry or education) that make use of other available data sources.

This trend presents challenges and opportunities for machine learning. Real-world applications typically involve much larger quantities of data, which can be continuously recorded in natural environments like the classroom, home or workplace. But this data is noisier due to the lower-spec hardware and less controlled environment. And gathering data from much broader swathes of the population, whether healthy or dealing with a condition, results in more uncontrolled variation.

ML techniques have already revolutionized the analysis of well-controlled lab data, and are even more necessary for these new applications. Richer stimuli and behavioral tasks provide opportunities for complex modeling (e.g. the psychological experience of perceiving and acting in a computer game), and non-lab contexts may provide much larger quantities of data.

As in previous editions in this series (2011, 2012, 2013, 2014) we will center this workshop around in-depth invited talks and two panel discussions. Original contributions will be considered for a small number of additional talks, but the majority will be hosted during extended poster sessions. As in previous years, this format will give ample opportunity for genuine discussion and exchange of ideas. Since this is an emerging area we will not be prescriptive on the precise topics within this theme. However, we envisage receiving submissions on topics including:

- modelling of more naturalistic stimuli, tasks and paradigms
- real-world applications, e.g. neurological disease, education
- neuroimaging outside the lab, mobile acquisition (EEG and NIRS)
- comparing informativity of lab and mobile neuro-data, trade-off between data quality and quantity
- ‘pervasive’ behavioural data gathered incidentally from personal computing devices (e.g. audio, movement, location, touch screen and typing inputs)
- multi-modal analysis of mental state inference from imaging and/or behavioral data
- machine learning and pattern recognition methodology
- linking machine learning, neuroimaging and neuroscience
- given recent advances of deep learning in image analysis and other applications, can neuroimaging analysis benefit from such approaches?
- given the enriched context, how does the brain's representation of an individual concept vary as a function of semantic context (e.g., a word in a sentence), and how are combinations of multiple individual concepts represented in the brain (e.g., sentence decoding)?


Dec. 12, 2015

Reasoning, Attention, Memory (RAM) Workshop
Jason E Weston, Sumit Chopra, Antoine Bordes

510 ac, Sat Dec 12, 08:30 AM

Motivation and Objective of the Workshop

In order to solve AI, a key component is the use of long term dependencies as well as short term context during inference, i.e., the interplay of reasoning, attention and memory. The machine learning community has had great success in the last decades at solving basic prediction tasks such as text classification, image annotation and speech recognition. However, solutions to deeper reasoning tasks have remained elusive. Until recently, most existing machine learning models have lacked an easy way to read and write to part of a (potentially very large) long-term memory component, and to combine this seamlessly with inference. To combine memory with reasoning, a model must learn how to access it, i.e. to perform *attention* over its memory. Within the last year or so, in part inspired by some earlier works [8, 9, 14, 15, 16, 18, 19], there has been some notable progress in these areas which this workshop addresses. Models developing notions of attention [12, 5, 6, 7, 20, 21] have shown positive results on a number of real-world tasks such as machine translation and image captioning. There has also been a surge in building models of computation which explore differing forms of explicit storage [1, 10, 11, 13, 17]. For example, recently it was shown how to learn a model to sort a small set of numbers [1] as well as a


host of other symbolic manipulation tasks. Another promising direction is work employing a large long-term memory for reading comprehension; the capability of somewhat deeper reasoning has been shown on synthetic data [2], and promising results are starting to appear on real data [3, 4]. In spite of this resurgence, the research into developing learning algorithms combining these components, and the analysis of those algorithms, is still in its infancy. The purpose of this workshop is to bring together researchers from diverse backgrounds to exchange ideas which could lead to addressing the various drawbacks associated with such models, leading to more interesting models in the quest for moving towards true AI. We thus plan to focus on addressing the following issues:

* How to decide what to write and what not to write in the memory.
* How to represent knowledge to be stored in memories.
* Types of memory (arrays, stacks, or stored within weights of model), when they should be used, and how they can be learnt.
* How to do fast retrieval of relevant knowledge from memories when the scale is huge.
* How to build hierarchical memories, e.g. employing multiscale notions of attention.
* How to build hierarchical reasoning, e.g. via composition of functions.
* How to incorporate forgetting/compression of information which is not important.
* How to properly evaluate reasoning models. Which tasks can have a proper coverage and also allow for unambiguous interpretation of systems' capabilities? Are artificial tasks a convenient way?
* Can we draw inspiration from how animal or human memories are stored and used?

The workshop will devote most of the time to invited speaker talks, contributed talks and panel discussion. In order to move away from a


mini-conference effect we will not have any posters. To encourage interaction, a webpage will be employed for real-time updates, also allowing people to post questions before or during the workshop, which will be asked at the end of talks or during the panel, or can be answered online. Please see our external page for more information: http://www.jaseweston.com/ram
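For readers new to the attention-over-memory operation at the heart of these models, here is a minimal sketch of soft (softmax) attention over memory slots; the dimensions and contents are invented, and real systems learn the memory and query embeddings end to end:

    import numpy as np

    rng = np.random.default_rng(0)
    d, n_slots = 64, 10

    memory = rng.normal(size=(n_slots, d))    # one vector per stored "fact"
    query = rng.normal(size=d)

    def softmax(x):
        e = np.exp(x - x.max())
        return e / e.sum()

    # Soft attention: the model "reads" a convex combination of memory slots,
    # weighted by similarity to the query. Because the read is differentiable,
    # the whole access mechanism can be trained by gradient descent.
    weights = softmax(memory @ query / np.sqrt(d))
    read_vector = weights @ memory
    print("attention weights:", np.round(weights, 3))

This single differentiable read is the building block behind the attention and memory-network models cited above; multi-hop reasoning stacks several such reads, feeding each read vector back into the next query.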

Machine Learning in Computational Biology
Nicolo Fusi, Anna Goldenberg, Sara Mostafavi, Gerald Quon, Oliver Stegle

510 bd, Sat Dec 12, 08:30 AM

The field of computational biology has seen dramatic growth over the past few years. A wide range of high-throughput technologies developed in the last decade now enable us to measure parts of a biological system at various resolutions—at the genome, epigenome, transcriptome, and proteome levels. These technologies are now being used to collect data for an increasingly diverse set of problems, ranging from classical problems such as predicting differentially regulated genes between time points and predicting subcellular localization of RNA and proteins, to models that explore complex mechanistic hypotheses bridging the gap between genetics and disease, population genetics and transcriptional regulation. Fully realizing the scientific and clinical potential of these data requires developing novel supervised and unsupervised learning methods that are scalable, can accommodate heterogeneity, are robust to systematic noise and confounding factors, and provide mechanistic insights. The goals of this workshop are to i) present emerging problems and innovative machine


learning techniques in computational biology, and ii) generate discussion on how to best model the intricacies of biological data and synthesize and interpret results in light of the current work in the field. We will invite several rising leaders from the biology/bioinformatics community who will present current research problems in computational biology and lead these discussions based on their own research and experiences. We will also have the usual rigorous screening of contributed talks on novel learning approaches in computational biology. We encourage contributions describing either progress on new bioinformatics problems or work on established problems using methods that are substantially different from established alternatives. Kernel methods, graphical models, feature selection, non-parametric models and other techniques applied to relevant bioinformatics problems would all be appropriate for the workshop. We are particularly keen on considering contributions related to the prediction of function from genotype, and those targeting data generated by novel technologies such as gene editing and single-cell genomics, though we will consider all submissions that highlight applications of machine learning to computational biology. The target audience is people with an interest in learning and its applications to relevant problems from the life sciences, including NIPS participants without any existing research link to computational biology.

Schedule

09:00 AM  Learning Deep Biological Architectures for Genomic Medicine (Frey)
09:45 AM  Multi-task deep neural network to predict CpG methylation profiles from low-coverage sequencing data (Angermueller)
10:05 AM  Large-Scale Sentence Clustering from Electronic Health Records for Genetic Associations in Cancer (Stark)
10:10 AM  Classifying Microscopy Images Using Convolutional Multiple Instance Learning (Kraus)
10:30 AM  Basset: Learning the regulatory code of the accessible genome with deep convolutional neural networks (Kelley)
10:50 AM  A probabilistic method for quantifying chromatin interactions (Mannerström)
11:35 AM  Detecting significant higher-order associations between genotype and phenotype while conditioning on covariates (Papaxanthos)
11:40 AM  Genome-wide modelling of transcription kinetics reveals patterns of RNA production delays (Honkela)
11:45 AM  Tensor decomposition and causal inference for multi-tissue gene expression experiments (Hore)
11:50 AM  Disease mechanism discovery by integrating exome and gene expression datasets in one graphical model of disease (Mezlini)
02:30 PM  Human Traits and Diseases (Peer)
03:15 PM  Bayesian Gaussian Process latent variable models for pseudotime inference in single-cell RNA-seq data (Campbell)
03:35 PM  In Silico Design of Synthetic Genes for Total Cell Translation Control: a Multi-output Gaussian Processes approach (Gonzalez)

Learning, Inference and Control of Multi-Agent Systems
Vicenç Gómez, Gerhard Neumann, Jonathan S Yedidia, Peter Stone

511 a, Sat Dec 12, 08:30 AM


In the next few years, traditional single-agent architectures will be more and more replaced by actual multi-agent systems with components that have increasing autonomy and computational power. This transformation has already started, with prominent examples such as power networks, where each node is now an active energy generator; robotic swarms of unmanned aerial vehicles; software agents that trade and negotiate on the Internet; and robot assistants that need to interact with other robots or humans. The number of agents in these systems can range from a few complex agents up to several hundred if not thousands of typically much simpler entities. Multi-agent systems show many beneficial properties such as robustness, scalability, parallelization and a larger number of tasks that can be achieved in comparison to centralized, single-agent architectures. However, the use of multi-agent architectures represents a major paradigm shift for systems design. In order to use such systems efficiently, effective approaches for planning, learning, inference and communication are required. The agents need to plan with their local view on the world and to coordinate at multiple levels. They also need to reason about the knowledge, observations and intentions of other agents, which can in turn be cooperative or adversarial. Multi-agent learning algorithms need to deal inherently with non-stationary environments and find valid policies for interacting with the other agents. Many of these requirements are inherently hard problems and computing their optimal solutions is intractable. Yet, problems can become tractable again by considering approximate solutions that exploit certain properties of a multi-agent system. Examples of such properties are sparse interactions that only occur between neighboring agents, or limited information for making decisions (bounded rationality).

Goal:


The fundamental challenges of this paradigm shift span many areas such as machine learning, robotics, game theory and complex networks. This workshop will serve as an inclusive forum for the discussion of ongoing or completed work on both theoretical and practical issues related to the learning, inference and control aspects of multi-agent systems.

Bayesian Optimization: Scalability and Flexibility
Bobak Shahriari, Ryan Adams, Nando de Freitas, Amar Shah, Roberto Calandra

511 b, Sat Dec 12, 08:30 AM

Bayesian optimization has emerged as an exciting subfield of machine learning that is concerned with the global optimization of noisy, black-box functions using probabilistic methods. Systems implementing Bayesian optimization techniques have been successfully used to solve difficult problems in a diverse set of applications. There have been many recent advances in the methodologies and theory underpinning Bayesian optimization that have extended the framework to new applications as well as provided greater insights into the behaviour of these algorithms. Bayesian optimization is now increasingly being used in industrial settings, providing new and interesting challenges that require new algorithms and theoretical insights. At last year’s NIPS workshop on Bayesian optimization the focus was on the intersection of “academia and industry”. Following up on this theme, the workshop this year will focus on scaling existing approaches to larger evaluation budgets, higher-dimensional search spaces, and more complex input spaces. While the computational complexity of common probabilistic regression


models used in Bayesian optimization has confined it to relatively low-dimensional problems and small evaluation budgets, there have, in recent years, been several advances in scaling these probabilistic models to more demanding application domains. Furthermore, many applications of Bayesian optimization only make sense when considering concurrent evaluations, which departs from the traditional, strictly sequential Bayesian optimization framework. Recent theoretical and practical efforts have addressed the mini-batch, or parallel, evaluation framework. The goal of this workshop is to bring together advances in scalable and flexible probabilistic modelling and batch exploration strategies to establish the state of the art in Bayesian optimization capabilities. Specifically, we will invite participants of the workshop to share their experiences and findings in applying Bayesian optimization at new scales and in new application domains. In addition, we wish to attract researchers from the broader scientific community in order to demonstrate the flexibility of Bayesian optimization and invite them to consider including it in their own experimental methodology. The key questions we will discuss are: how to successfully scale Bayesian optimization to large evaluation budgets? How to tackle high-dimensional or complex search spaces? How to apply Bayesian optimization in massive, distributed settings? The target audience for this workshop consists of both industrial and academic practitioners of Bayesian optimization as well as researchers working on theoretical advances in probabilistic global optimization. To this end we have invited many industrial users of Bayesian optimization to attend and speak at the workshop. We expect this exchange of industrial and academic knowledge will lead to a significant interchange of ideas and a clearer understanding of the challenges and successes of Bayesian optimization as a whole.


A further goal is to encourage collaboration between the diverse set of researchers involved in Bayesian optimization. This includes not only interchange between industrial and academic researchers, but also between the many different sub-fields of machine learning which make use of Bayesian optimization or its components. We are also reaching out to the wider global optimization and Bayesian inference communities for involvement.
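To make the framework concrete, here is a minimal sketch of the sequential Bayesian optimization loop: a Gaussian-process surrogate with an RBF kernel and the expected-improvement acquisition, run on an invented 1-D objective. It assumes NumPy and SciPy, and all constants (kernel length-scale, candidate grid, budget) are arbitrary illustrative choices, not any participant's system:

    import numpy as np
    from scipy.stats import norm

    def f(x):                                   # "expensive" black-box objective
        return -np.sin(3 * x) - x ** 2 + 0.7 * x

    def rbf(a, b, ls=0.3):
        return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ls ** 2)

    X = np.array([-0.9, 1.1])                   # initial design points
    y = f(X)
    candidates = np.linspace(-2, 2, 400)
    noise = 1e-6                                # jitter for numerical stability

    for it in range(10):
        K = rbf(X, X) + noise * np.eye(len(X))
        Ks = rbf(candidates, X)
        mu = Ks @ np.linalg.solve(K, y)         # GP posterior mean
        var = 1.0 - np.einsum('ij,ij->i', Ks @ np.linalg.inv(K), Ks)
        sd = np.sqrt(np.maximum(var, 1e-12))    # GP posterior std deviation
        # Expected improvement over the incumbent best observation
        best = y.max()
        z = (mu - best) / sd
        ei = (mu - best) * norm.cdf(z) + sd * norm.pdf(z)
        x_next = candidates[np.argmax(ei)]      # query where EI is largest
        X = np.append(X, x_next)
        y = np.append(y, f(x_next))

    print(f"best x = {X[np.argmax(y)]:.3f}, best f = {y.max():.3f}")

The scaling questions above arise precisely because the exact GP computations here cost cubic time in the number of evaluations and the acquisition is optimized over an explicit grid, neither of which survives large budgets or high-dimensional search spaces.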

Schedule

09:00 AM  Bayesian Optimization (Ghahramani, Shahriari)
09:30 AM  Safe Exploration for Bayesian Optimization (Krause)
10:30 AM  Scalable and Flexible Bayesian Optimization for Algorithm Configuration (Hutter)
02:30 PM  Bayesian Optimization and Embedded Learning Systems (Schneider)
03:00 PM  Applications of Bayesian Optimization to Systems (Deisenroth)
04:30 PM  Information based methods for Black-box Optimization (Hoffman)

Multiresolution methods for large-scale learning


Inderjit S Dhillon, Risi Kondor, Rob Nowak, Mike O'Neil, Nedelina Teneva

511 c, Sat Dec 12, 08:30 AM

There is a surge of new work at the intersection of multiresolution/multiscale methods and machine learning:

- Multiresolution (wavelets) on graphs is one of the hottest topics in harmonic analysis, with important implications for learning on graphs and semi-supervised learning.
- Hierarchical matrices (HODLR, H, H2 and HSS matrices), a very active area in numerical analysis, have also been shown to be effective in Gaussian process inference.
- Scattering networks are a major breakthrough, and combine ideas from wavelet analysis and deep learning.
- Multiscale graph models are ever more popular because they can capture important structures in real-world networks.
- Multiscale matrix decompositions and multiresolution matrix factorizations, mirroring some features of algebraic multigrid methods, are gaining traction in large-scale data applications.

The goal of this workshop is to bring together leading researchers from Harmonic Analysis, Signal Processing, Numerical Analysis, and Machine Learning, to explore the synergies between all the above lines of work.
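As a minimal concrete example of the multiresolution idea, the classical Haar wavelet transform splits a signal into coarse averages and detail coefficients at successive scales; the toy signal below is invented for illustration:

    import numpy as np

    def haar_decompose(signal):
        # One level of the Haar transform: pairwise averages (coarse part)
        # and pairwise differences (detail part), orthonormally scaled.
        s = np.asarray(signal, dtype=float)
        coarse = (s[0::2] + s[1::2]) / np.sqrt(2)
        detail = (s[0::2] - s[1::2]) / np.sqrt(2)
        return coarse, detail

    x = np.array([4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0])
    levels = []
    while len(x) > 1:
        x, d = haar_decompose(x)    # recurse on the coarse part
        levels.append(d)
    print("coarsest coefficient:", x)
    print("details, fine to coarse:", levels)

The graph-wavelet and multiresolution-matrix work discussed at this workshop generalizes exactly this average/difference hierarchy from regular 1-D signals to graphs and matrices.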

Schedule

09:00 AM  Random sampling of bandlimited signals on graphs (Vandergheynst)
09:40 AM  Multiresolution Matrix Factorization (Kondor)
10:30 AM  Beyond Nodes and Edges: Multiresolution Models of Complex Networks (Benson)
11:00 AM  Challenges in Multiresolution Methods for Graph-based Learning (Mahoney)
11:30 AM  Probabilistic Theory of Deep Learning (Baraniuk)
11:50 AM  Hierarchical Decomposition of Kernel Matrices (March)
02:00 PM  Multigrid-inspired Methods for Networks (Safro)
02:30 PM  Fast Direct Methods for Gaussian Processes (O'Neil)
03:00 PM  Approximating Gaussian Processes with H^2 Matrices (Garcke)
03:20 PM  A Multiresolution Approach for Tensor Factorization (Srivastava)
03:40 PM  Multiresolution analysis for the statistical analysis of incomplete rankings (Sibony)
04:30 PM  Structured Sparsity and convex optimization (Bach)
05:00 PM  Scaling Phenomena in Stochastic Topology (Mukherjee)

Machine Learning Systems
Alex Beutel, Tianqi Chen, Sameer Singh, Elaine Angelino, Markus Weimer, Joseph Gonzalez

511 d, Sat Dec 12, 08:30 AM

The broadening use of machine learning, the explosive growth in data, and the complexity of the large-scale learning systems required to analyze these data have together fueled interdisciplinary research at the intersection of Machine Learning and System design. Addressing these challenges demands a combination of the right abstractions -- for algorithms, data structures, and interfaces -- as well as scalable systems capable of addressing real world learning problems. At the same time, it is becoming increasingly clear that data-driven and learning-driven approaches provide natural and powerful solutions to building and managing complex modern systems. In total, the flow of ideas between these two communities continues to offer promising opportunities toward solving even larger problems. Designing systems for machine learning presents new challenges and opportunities over the design of traditional data processing systems. For example, what is the right abstraction for data consistency in the context of parallel, stochastic learning algorithms? What guarantees of fault tolerance are needed during distributed learning? The statistical nature of machine learning offers an opportunity for more efficient systems but requires revisiting many of the challenges addressed by the systems and database communities over the past few decades. Machine learning focused developments in


distributed learning platforms, programming languages, data structures, general purpose GPU programming, and a wide variety of other domains have had and will continue to have a large impact in both academia and industry. As the relationship between the machine learning and systems communities has grown stronger, new research in using machine learning tools to solve classic systems challenges has also grown. Specifically, as we develop larger and more complex systems and networks for storing, analyzing, serving, and interacting with data, machine learning offers promise for modeling system dynamics, detecting issues, and making intelligent, data-driven decisions within our systems. Machine learning techniques have begun to play critical roles in scheduling, system tuning, and network analysis. Through working with systems and databases researchers to solve systems challenges, machine learning researchers can both improve their own learning systems as well as impact the systems community and infrastructure at large.

This workshop aims to bring together researchers from the two communities and facilitate discussion on the challenges of Machine Learning Systems. This is a successor to the Big Learning workshop, which in past NIPS editions successfully focused on and brought attention to the need for scaling machine learning. Moving forward, this Machine Learning Systems workshop aims to address two major questions: (1) How should we design useful abstractions and build scalable systems to support large-scale machine learning? (2) How can we use machine learning in our systems to make them smarter and more efficient? (A toy illustration of the data-consistency question appears after the topic list below.)

Focal points for discussions and solicited submissions include but are not limited to:

- Systems for online and batch learning algorithms
- Systems for out-of-core machine learning
- Implementation studies of large-scale distributed learning algorithms --- challenges faced and lessons learned
- Database systems for Big Learning --- models and algorithms implemented, properties (fault tolerance, consistency, scalability, etc.), strengths and limitations
- Programming languages for machine learning
- Data driven systems --- learning for job scheduling, configuration tuning, straggler mitigation, network configuration, and security
- Systems for interactive machine learning
- Systems for serving machine learning models at scale
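The data-consistency question raised above can be made concrete with a lock-free ("Hogwild"-style) stochastic gradient sketch, in which several threads update shared parameters without synchronization. This toy least-squares example is illustrative only: in CPython the GIL serializes much of the work, so it mimics the unguarded access pattern rather than demonstrating real parallel speedups:

    import numpy as np
    import threading

    rng = np.random.default_rng(0)
    n, d = 5000, 20
    X = rng.standard_normal((n, d))
    w_true = rng.standard_normal(d)
    y = X @ w_true + 0.01 * rng.standard_normal(n)

    w = np.zeros(d)   # shared parameters, deliberately unguarded by any lock

    def worker(seed, steps=10_000, lr=0.01):
        local_rng = np.random.default_rng(seed)
        for _ in range(steps):
            i = local_rng.integers(n)
            grad = (X[i] @ w - y[i]) * X[i]   # stochastic gradient, one example
            w[:] -= lr * grad                 # lock-free, in-place update

    threads = [threading.Thread(target=worker, args=(s,)) for s in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print("parameter error:", np.linalg.norm(w - w_true))

Despite races between readers and writers, the error typically shrinks to near zero here; characterizing when and why such inconsistent updates still converge is exactly the kind of abstraction question the workshop poses.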

Schedule

08:50 AM  TensorFlow: A system for machine learning on heterogeneous systems (Dean)
11:20 AM  Intelligent Services (Gonzalez)
02:30 PM  Multiworld Testing (Bird)
03:10 PM  ML and Functional Programming (Brandewinder)
04:30 PM  Privacy in a Data Driven World (Geambasu)
05:10 PM  Scaling Machine Learning (Smola)
06:00 PM  Hardware Trends for High Performance Analytics

BigNeuro 2015: Making sense of big neural data
Eva Dyer, Joshua T Vogelstein, Konrad Koerding, Jeremy Freeman, Andreas S. Tolias

511 e, Sat Dec 12, 08:30 AM

Advances in optics, chemistry, and physics have revolutionized the development of experimental methods for measuring neural activity and structure. Some of the next generation methods for neural recording promise extremely large and detailed measurements of the brain's architecture and function. The goal of this workshop is to provide an open forum for the discussion of a number of important questions related to how machine learning can aid in the analysis of these next generation neural datasets. What are some of the new machine learning and analysis problems that will arise as new experimental methods come online? What are the right distributed and/or parallel processing computational models to use for these different datasets? What are the computational bottlenecks/challenges in analyzing these next generation datasets?

In the morning, the goal will be to discuss new experimental techniques and the computational issues associated with analyzing the datasets generated by these techniques. The morning portion of the workshop will be organized into three hour-long sessions. Each session will start with a 30 minute overview of an experimental method, presented by a leading experimentalist in this area. Afterwards, we will have a 20 minute follow up from a computational scientist that will highlight the computational challenges associated with the technique.

In the afternoon, the goal will be to delve deeper into the kinds of techniques that will be needed to make sense of the data described in the morning. To highlight two computational approaches that we


believe hold promise, we will have two 50-minute methods talks. These talks will be followed by a talk from a scientist with big-data experience outside of neuroscience, with the goal of thinking about organization, objectives, and pitfalls. Lastly, we will have plenty of time for free-form discussion and will hold a poster session (open call for poster submissions). We envision that this workshop will provide a forum for computational neuroscientists and data scientists to discuss the major challenges that we will face in analyzing big neural datasets over the next decade.

Schedule

09:00 AM  Methods overview: High-density electrical recordings (Schaefer)
09:30 AM  Computational discussion: High-density electrical recordings (Koerding)
10:20 AM  Methods overview: Studying the function and structure of microcircuits (Tolias)
11:20 AM  Methods overview: Light field microscopy (Andalman)
11:50 AM  Computational discussion: Challenges in analyzing large neuroimaging datasets (Sapiro)
12:20 PM  Spotlight (Huang, Gray Roncal, Goldstein)
02:00 PM  Lessons learned from big data projects in cosmology (Szalay)
03:00 PM  Sketching as a tool for numerical linear algebra (Woodruff)
04:30 PM  Low-dimensional inference with high-dimensional data (Baraniuk)

Extreme Classification 2015: Multi-class and Multi-label Learning in Extremely Large Label Spaces
Manik Varma, Moustapha M Cisse

511 f, Sat Dec 12, 08:00 AM

Extreme classification, where one needs to deal with multi-class and multi-label problems involving an extremely large number of labels, has opened up a new research frontier in machine learning. Many challenging applications, such as photo, video and tweet annotation and web page categorization, can benefit from being formulated as supervised learning tasks with millions of labels. Extreme classification can also lead to a fresh perspective on other learning problems such as ranking and recommendation by reformulating them as multi-class/label tasks where each item to be ranked or recommended is a separate label.

Extreme classification raises a number of interesting research questions, including those related to:

* Large scale learning and distributed and parallel training
* Log-time and log-space prediction, and prediction on a test-time budget (a toy sketch of log-time prediction with a label tree appears at the end of this overview)
* Label embedding and tree approaches
* Crowd sourcing, preference elicitation and other data gathering techniques
* Bandits, semi-supervised learning and other approaches for dealing with training set biases and label noise
* Bandits with an extremely large number of arms
* Fine-grained classification
* Zero shot learning and extensible output spaces
* Tackling label polysemy, synonymy and correlations
* Structured output prediction and multi-task learning
* Learning from highly imbalanced data
* Dealing with tail labels and learning from very few data points per label
* PU learning and learning from missing and incorrect labels
* Feature extraction, feature sharing, lazy feature evaluation, etc.
* Performance evaluation
* Statistical analysis and generalization bounds
* Applications to ranking, recommendation, knowledge graph construction and other domains

The workshop aims to bring together researchers interested in these areas to foster discussion and improve upon the state-of-the-art in extreme classification. We also aim to bring researchers from the recommender systems, information retrieval, data mining and computer vision communities to discuss real world application scenarios, evaluation metrics, best practices, etc. Several leading researchers will present invited talks detailing the latest advances in the field. We also seek extended abstracts presenting work in progress, which will be reviewed for acceptance as spotlight+poster or a talk. The workshop should be of interest to researchers in core supervised learning as well as application domains such as recommender systems, computer vision, computational advertising, information retrieval and natural language processing. We expect a healthy participation from both industry and academia.
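To illustrate the log-time prediction point from the list above, here is a toy label-tree sketch: labels sit at the leaves of a balanced binary tree, and one linear router per internal node (here untrained and randomly initialized, purely to show the cost structure) sends the input left or right, so prediction costs O(log L) dot products rather than O(L):

    import numpy as np

    rng = np.random.default_rng(0)

    L, d = 1024, 32                    # number of labels, feature dimension
    depth = int(np.log2(L))
    routers = rng.normal(size=(2 ** depth - 1, d))   # one hyperplane per node

    def predict(x):
        node = 0
        for _ in range(depth):
            go_right = routers[node] @ x > 0
            node = 2 * node + 1 + int(go_right)      # heap-style child index
        return node - (2 ** depth - 1)               # leaf index = label

    x = rng.normal(size=d)
    print("predicted label:", predict(x), "using", depth, "dot products")

Real label-tree methods learn both the tree structure and the routers from data; the hard research questions listed above (tail labels, label noise, imbalanced data) determine how well such trees route the right inputs to the right leaves.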

Quantum Machine Learning
Nathan Wiebe, Seth Lloyd

512 a, Sat Dec 12, 08:30 AM

Recent strides in quantum computing have raised the prospects that near-term quantum devices can expediently solve computationally intractable problems in simulation, optimization and machine learning. The opportunities that quantum computing raises for machine learning are hard to understate. It opens the possibility of dramatic speedups for machine learning tasks, richer models for data sets and more natural settings for learning and inference than classical computing affords. The goal of this workshop is, through a series of invited and contributed talks, to survey the major results in this new area and facilitate increased dialog between researchers within this field and the greater machine learning community. Our hope is that such discussion will not only help researchers to fully leverage the promise of quantum machine learning but also address deep fundamental issues such as the question of what learning means in a quantum environment or whether quantum phenomena like entanglement may play a role in modeling complex data sets.

Schedule

08:30 AM  Can Small Quantum Systems Learn? (Wiebe)
09:20 AM  Intro to Universal Deep Quantum Learning (Lloyd)
10:30 AM  Machine Learning as Rotations (Quantum Deep Learning) (Kapoor)
11:10 AM  Quantum models for non-physical data at the example of item recommendation (Stark)
11:50 AM  Quantum ML via Matrix Inversion (Rebentrost)
12:30 PM  Totally Corrective Boosting with Cardinality Penalization (Denchev)
12:45 PM  Quantum-Inspired Graph Matching (Rossi)
03:10 PM  Application of quantum annealing to Training of Deep Neural Networks (Adachi)
03:40 PM  Case Study towards deep learning (Perdomo-Ortiz)
04:30 PM  Quantum Boltzmann Machine (Amin)
05:10 PM  Fidelity-optimized quantum state estimation (Hen)
06:00 PM  Emerging Quantum Processors and why the Machine Learning Community should care (Neven)

Networks in the Social and Information Sciences
Edo M Airoldi, David S Choi, Aaron Clauset, Johan Ugander, Panos Toulis

512 bf, Sat Dec 12, 08:30 AM

Problems involving networks and massive network datasets motivate some of the most difficult and exciting inferential challenges in the social and information sciences. Modern network datasets in these areas represent complex relationships with rich information on vertex attributes, edge weights, multiple types of vertices and characteristics, all of which may be changing over time. These datasets are often enormous in size, detail, and heterogeneity, pushing the limits of existing inferential frameworks, while also requiring detailed domain knowledge in order to support useful inferences or predictions. Much progress has been made on developing rigorous tools for analyzing and modeling some types of large real-world social and information network datasets, but often this progress is distributed across disparate applied and theoretical domains. Network analysis is still a young and highly cross-disciplinary field, and the goal of this workshop is to promote cross-pollination between its constituent research communities. In particular, this workshop aims to bring together a diverse and cross-disciplinary set of researchers to discuss recent advances and future directions for developing new network methods in statistics and machine learning. By network methods, we broadly include those models and algorithms whose goal is to learn the patterns of interaction, flow of information, or propagation of effects in social and information systems. We also welcome empirical studies, particularly attempts to bridge observational methods and causal inference, and studies that combine learning, networks, and computational social science. We are also interested in research that unifies the study of both structure and content


in rich network datasets. While this research field is already broad and diverse, there are emerging signs of convergence, maturation, and increased methodological awareness. For example, in the study of information diffusion, social media and social network researchers are beginning to use rigorous tools to distinguish effects driven by social influence, homophily, or external processes — subjects historically of intense interest amongst statisticians and social scientists. Similarly, there is a growing statistics literature developing learning approaches to study topics popularized earlier within the physics community, including clustering in graphs, network evolution, and random-graph models. Finally, learning methods are increasingly used in highly complex application domains, such as large-scale knowledge graph construction and use, and massive social networks like Facebook and LinkedIn. These applications are stimulating new scientific and practical questions that often cut across disciplinary boundaries.

Cognitive Computation: Integrating neural and symbolic approaches Artur Garcez, Tarek R. Besold, Risto Miikkulainen, Gary Marcus 512 cg, Sat Dec 12, 08:30 AM

Learning and privacy with incomplete data and weak supervision Giorgio Patrini, Tony Jebara, Richard Nock, Dimitrios Kotzias, Felix Xinnan Yu 512 dh, Sat Dec 12, 08:30 AM


Can we learn to locate objects in images, only from the list of objects those images contain? Or the sentiment of a phrase in a review from the overall score? Can we tell who voted for Obama in 2012? Or which population strata are more likely to be infected by Ebola, only looking at geographical incidence and census data? Are large corporations able to infer sensitive traits of their customers such as sexual preferences, unemployment or ethnicity, only based on state-level statistics? In contrast, how can we publicly release data containing personal information to the research community, while guaranteeing that individuals’ sensitive information will not be compromised? How realistic is the idea of outsourcing machine-learning tasks without sharing datasets but only a few statistics sufficient for training? Despite their diversity, solutions to those problems can be surprisingly alike, as they all play with the same elements: variables without a clear one-to-one mapping, and the search for/the protection against models and statistics sufficient to recover the relevant variables. Aggregate statistics and obfuscated data are abundant, as they are released much more frequently than plain individual-level information; the latter are often too sensitive because of privacy constraints or business value, or too expensive to collect. Learning in those scenarios has been conceptualized, for example, by multiple instance learning, learning from label proportions, and learning from noisy labels, and it is common in a variety of application fields, such as computer vision, sentiment analysis and bioinformatics, whenever labels for single image patches, sentences or proteins are unknown, while higher-level supervision is possible. This problem is not limited to computer science, though. In fact, natural, social and medical


disciplines have studied the problem of inference from aggregates for a long time, including the so-called ecological inference in political science, econometrics and epidemiology, and the modifiable areal unit problem in spatial statistics. But as those approaches prove effective in practice, to the point that the available statistics reveal sensitive attributes with high accuracy, the question is turned around into a search for privacy guarantees. Traditional statistics has studied the problem of confidential data release. Research in k-anonymity, l-diversity and, more recently, differential privacy has proposed procedures to mask data in a way that one can trade off protection and usefulness for statistical analysis. Schedule

09:10 AM

Kamalika Chaudhuri

Chaudhuri

10:30 AM

Nando de Freitas

de Freitas

11:10 AM

Kentaro Minami: $(\varepsilon, \delta)$-differential privacy of Gibbs posteriors

Minami

11:30 AM

Yu-Xiang Wang: Learning with differential privacy: stability, learnability and the sufficiency and necessity of ERM principle

Wang

11:50 AM

Vishesh Karwa: Private posterior distributions from variational approximations

Karwa

02:30 PM

Wendy Tam Cho: Information Theoretic Research Designs

Tam Cho

03:20 PM

Fan Li: Alter-CNN: an approach for learning from label proportions with the application to ice-water classification

Li

03:40 PM

Tao Sun: Message passing for collective graphical models

Sun

04:30 PM

Max Ott: Confidential Computing

Ott
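As a concrete illustration of the learning-from-label-proportions setting described in the abstract above, the following is a minimal sketch, not drawn from any workshop material: it fits a logistic model using only synthetic bag-level label proportions, never the individual labels. The data, model, and squared-proportion loss are all assumptions chosen for demonstration.

import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: individual labels depend on 2 features, but the learner
# only ever sees per-bag label proportions (hypothetical setup).
n_bags, bag_size, d = 50, 40, 2
w_true = np.array([2.0, -1.0])
X = rng.normal(size=(n_bags, bag_size, d))
p = 1 / (1 + np.exp(-(X @ w_true)))      # individual P(y=1|x), never revealed
y = rng.binomial(1, p)                   # individual labels, never revealed
props = y.mean(axis=1)                   # only these aggregates are observed

# Fit w by matching predicted bag proportions to observed proportions.
w = np.zeros(d)
lr = 0.5
for _ in range(2000):
    q = 1 / (1 + np.exp(-(X @ w)))       # predicted individual probabilities
    pred_props = q.mean(axis=1)          # predicted bag proportions
    resid = pred_props - props
    # gradient of the mean predicted proportion w.r.t. w, per bag
    grad_props = np.einsum('bi,bij->bj', q * (1 - q), X) / bag_size
    grad = 2 * (resid[:, None] * grad_props).mean(axis=0)
    w -= lr * grad

print("recovered direction:", w / np.linalg.norm(w))
print("true direction:     ", w_true / np.linalg.norm(w_true))

Matching predicted to observed aggregates recovers the direction of the underlying individual-level model, which is the basic mechanism the abstract alludes to, both as a learning opportunity and as a privacy risk.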

Challenges in Machine Learning (CiML 2015): "Open Innovation" and "Coopetitions" Isabelle Guyon, Evelyne Viegas, Ben Hamner, Balázs Kégl 512 e, Sat Dec 12, 08:30 AM Challenges in Machine Learning have proven to be efficient and cost-effective ways to quickly bring to industry solutions that may have been confined to research. In addition, the playful nature of challenges naturally attracts students, making challenges a great teaching resource. Challenge participants range from undergraduate students to retirees, joining forces in a rewarding environment allowing them to learn, perform research, and demonstrate excellence. Therefore challenges can be used as a means of directing research, advancing the state-of-the-art or venturing into completely new domains.


Because challenges have become mainstream in the execution of Machine Learning projects, it has become increasingly important to regularly bring together workshop organizers, platform providers, and participants to discuss best practices in challenge organization and new methods and application opportunities to design high impact challenges. Following the success of last year's workshop (http://ciml.chalearn.org/), in which a fruitful exchange led to many innovations, we propose to reconvene and discuss the new avenues that have been explored and lay the basis for further developments. We are particularly interested in following progress made in two conceptually important directions: 1) Open innovation: Organization of contests in which data are made available and the contestants must both formalize and solve a problem (with some constraints), leaving more freedom for creativity, while making it more difficult for the organizers to objectively assess the results. 2) Coopetitions: Organization of contests encouraging both collaboration and competition, in an effort to make possible the contributions of many towards the grand goal of solving the overall problem; this poses to the organizers the problem of rewarding partial contributions. We also want to closely follow more technical, albeit important aspects: 3) Platforms: New developments including "code submission" (platforms and protocols permitting code submission, as opposed to result submission, allowing fairer standardized comparisons in terms of hardware utilization and easier reproducibility) and "worksheets" or "scripts" facilitating code sharing. 4) Sharing, dissemination, and recognition: Facilitate sharing resources, including data, means of data collection and annotation, challenge announcements, best practices, challenge templates, publication channels, etc.; creation of awards to recognize academic services rendered by the various actors of challenge organization.


Scalable Monte Carlo Methods for Bayesian Analysis of Big Data Babak Shahbaba, Yee Whye Teh, Max Welling, Arnaud Doucet, Christophe Andrieu, Sebastian J. Vollmer, Pierre Jacob 513 ab, Sat Dec 12, 08:30 AM

In recent years, there have been ever-increasing demands for data-intensive scientific research. Routine use of digital sensors, high-throughput experiments, and intensive computer simulations have created a data deluge imposing new challenges on scientific communities that attempt to process and analyze such data. This is especially challenging for scientific studies that involve Bayesian methods, which typically require computationally intensive Monte Carlo algorithms for their implementation. As a result, although Bayesian methods provide a robust and principled framework for analyzing data, their relatively high computational cost for Big Data problems has limited their application. The objective of this workshop is to discuss the advantages of Bayesian inference in the age of Big Data and to introduce new scalable Monte Carlo methods that address computational challenges in Bayesian analysis. This is a follow-up to our recent workshop on Bayesian Inference for Big Data (BIBiD 2015) at Oxford University (https://github.com/BigBayes/bigbayes.github.io/wiki/BIBiD-201 It will consist of invited and contributed talks, poster spotlights, and a poster session. There will be a panel discussion on "Bayesian inference for Big Data" at the end of the session. Topics of interest include (but are not limited to):

• Advantages of Bayesian methods in the age of Big Data
• Distributed/parallel Markov Chain Monte Carlo (MCMC)
• MCMC using mini-batches of data
• MCMC using surrogate functions
• MCMC using GPU computing
• Precomputing strategies
• MCMC and variational methods
• Geometric methods in sampling algorithms
• Hamiltonian Monte Carlo
• Sequential Monte Carlo

This workshop has been endorsed by ISBA. Young researchers participating in the workshop can apply for an ISBA special Travel Award.
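One of the listed topics, MCMC using mini-batches of data, can be illustrated with stochastic gradient Langevin dynamics (Welling and Teh, 2011), which replaces the full-data gradient of the log posterior with a rescaled mini-batch estimate plus Gaussian noise. The sketch below is a generic toy example, not taken from the workshop; the conjugate Gaussian model, step size, and batch size are assumptions for demonstration.

import numpy as np

rng = np.random.default_rng(1)

# Data: N observations from N(mu_true, 1); prior mu ~ N(0, 10^2).
N, mu_true = 100_000, 3.0
data = rng.normal(mu_true, 1.0, size=N)

def grad_log_post(mu, batch):
    """Unbiased estimate of the gradient of the log posterior from a mini-batch."""
    grad_prior = -mu / 10.0**2
    grad_lik = (N / len(batch)) * np.sum(batch - mu)   # rescaled mini-batch term
    return grad_prior + grad_lik

mu, eps, samples = 0.0, 1e-5, []
for t in range(5000):
    batch = data[rng.integers(0, N, size=100)]
    # Langevin update: half a gradient step plus injected noise of variance eps.
    mu += 0.5 * eps * grad_log_post(mu, batch) + rng.normal(0, np.sqrt(eps))
    if t > 1000:
        samples.append(mu)

print("posterior mean ~", np.mean(samples))   # should be close to mu_true

Each update touches only 100 of the 100,000 observations, which is precisely the source of the scalability the workshop description emphasizes.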

Non-convex Optimization for Machine Learning: Theory and Practice Anima Anandkumar, Niranjan Uma Naresh, Kamalika Chaudhuri, Percy Liang, Sewoong Oh 513 cd, Sat Dec 12, 08:30 AM Non-convex optimization is ubiquitous in machine learning. In general, reaching the global optima of these problems is NP-hard and, in practice, local search methods such as gradient descent can get stuck in spurious local optima and suffer from poor convergence. Over the last few years, tremendous progress has been made in establishing theoretical guarantees for many non-convex optimization problems. While there are worst-case instances which are computationally hard to solve, focus has shifted to characterizing transparent conditions under which problems are tractable. In many instances, these conditions turn out to be mild and natural for machine learning applications. One area of non-convex optimization which has attracted extensive interest is spectral learning. This involves finding spectral decompositions of matrices and tensors which correspond to moments of a


multivariate distribution. These algorithms are guaranteed to recover a consistent solution to the parameter estimation problem in many latent variable models such as topic admixture models, HMMs, ICA, and most recently, even non-linear models such as neural networks. In contrast to traditional algorithms like expectation maximization (EM), these algorithms come with polynomial computational and sample complexity guarantees. Analysis of these methods involves understanding the optimization landscape for tensor algebraic structures. As another example of guaranteed non-convex methods, there has been interest in the problem of dictionary learning, which involves expressing the observed data as a sparse combination of dictionary elements. Recent results have established that both the dictionary and the coefficients can be consistently recovered in the challenging overcomplete case, where the number of dictionary elements can exceed the input dimensionality. There is also interest in analyzing online algorithms for non-convex methods. Recent work has established that simple stochastic gradient descent (SGD) with appropriately added noise can escape saddle points and converge to a local optimum in bounded time for a large class of non-convex problems. This is especially important since non-convex problems usually suffer from an exponential number of saddle points. Finally, recent years have also seen novel applications of non-convex methods with rigorous guarantees. For example, many of these methods have shown great promise in diverse application domains such as natural language processing, social networks, health informatics, and biological sequence analysis. There are certainly many challenging open problems in the area of non-convex optimization.


While guarantees have been established in individual instances, there is no common unifying theme of what makes a non-convex problem tractable. Many challenging instances, such as optimization for training multi-layer neural networks or analyzing novel regularization techniques such as dropout for non-convex optimization, still remain wide open. On the practical side, conversations between theorists and practitioners can help identify what kind of conditions are reasonable for specific applications, and thus lead to the design of practically motivated algorithms for non-convex optimization with rigorous guarantees. This workshop will fill an important gap by bringing together researchers from disparate communities and bridging the gap between theoreticians and practitioners. To facilitate discussion between theorists and practitioners, we aim to make the workshop easily accessible to people currently unfamiliar with the intricate details of these methods. We plan to have an open problems session and a discussion session to spur further research in this area. There will also be an invited poster session from top active student researchers in the area to increase quality participation in the workshop.
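The claim above, that gradient descent with appropriately added noise can escape saddle points, can be seen in a toy example. The function, step size, and noise scale below are illustrative assumptions, not material from the workshop.

import numpy as np

rng = np.random.default_rng(2)

# f(x, y) = x^2 - y^2 + y^4/4 has a strict saddle at (0, 0)
# and global minima at (0, +sqrt(2)) and (0, -sqrt(2)).
def grad(v):
    x, y = v
    return np.array([2 * x, -2 * y + y**3])

for noise in (0.0, 0.1):
    v = np.array([0.5, 0.0])   # starts exactly on the saddle's stable manifold
    for _ in range(2000):
        g = grad(v) + noise * rng.normal(size=2)
        v -= 0.01 * g
    print(f"noise={noise}: converged to {np.round(v, 2)}")

# Noiseless gradient descent stalls at the saddle (0, 0); the noisy version
# is pushed off the stable manifold and reaches a genuine local minimum.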

Schedule

08:40 AM

Opening and Overview

Anandkumar

09:10 AM

Large-Scale Optimization for Deep Learning

LeCun

10:30 AM

When Your Big Data Seems Too Small

Valiant

11:00 AM

Convolutional Dictionary Learning through Tensor Factorization

Huang

02:30 PM

Provable algorithms for non-convex optimization

Arora

03:00 PM

Non-convex Optimization by Complexity Progression

03:30 PM

Computably Feasible Greedy Algorithms for Neural Nets --SPEAKER NOT REGISTERED

04:30 PM

Taking it Easy

Mobahi

05:00 PM

Spectral Algorithms for Learning HMMs and Tree HMMs for Epigenetics Data

Chen

Black box learning and inference Josh Tenenbaum, Jan-Willem van de Meent, Tejas Kulkarni, Ali Eslami, Brooks Paige, Frank Wood, Zoubin Ghahramani 513 ef, Sat Dec 12, 08:30 AM Probabilistic models have traditionally co-evolved with tailored algorithms for efficient learning and inference. One of the exciting developments of recent years has been the resurgence of black box methods, which make relatively few assumptions about the model structure, allowing application to broader model families.


In probabilistic programming systems, black box methods have greatly improved the capabilities of inference backends. Similarly, the design of connectionist models has been simplified by the development of black box frameworks for training arbitrary architectures. These innovations open up opportunities to design new classes of models that smoothly negotiate the transition from low-level features of the data to high-level structured representations that are interpretable and generalize well across examples. This workshop brings together developers of black box inference technologies, probabilistic programming systems, and connectionist computing frameworks. The goal is to formulate a shared understanding of how black box methods can enable advances in the design of intelligent learning systems. Topics of discussion will include:

* Black box techniques for gradient ascent, variational inference, Markov chain and sequential Monte Carlo.
* Implementation of black box techniques in probabilistic programming systems and computing frameworks for connectionist model families.
* Models that integrate top-down and bottom-up model representations to perform amortized inference: variational autoencoders, deep latent Gaussian models, restricted Boltzmann machines, neural network based proposals in MCMC.
* Applications to vision, speech, reinforcement learning, motor control, language learning.
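As a minimal illustration of the black box idea, the sketch below runs score-function (REINFORCE-style) variational inference against a model exposed only through its log joint density, in the spirit of black box variational inference. The conjugate Gaussian model, sample size, and learning rate are assumptions chosen for demonstration; this is not an implementation of any system discussed at the workshop.

import numpy as np

rng = np.random.default_rng(3)

# Model: z ~ N(0, 5^2), x_i | z ~ N(z, 1). The inference code below touches
# the model only through log_joint, which is what makes it "black box".
data = rng.normal(2.0, 1.0, size=50)

def log_joint(z):
    log_prior = -0.5 * z**2 / 25.0
    log_lik = -0.5 * np.sum((data[None, :] - z[:, None])**2, axis=1)
    return log_prior + log_lik

# Variational family q(z) = N(m, exp(log_s)^2), fit with score-function gradients.
m, log_s, lr = 0.0, 0.0, 1e-3
for _ in range(3000):
    s = np.exp(log_s)
    z = rng.normal(m, s, size=64)                  # samples from q
    log_q = -0.5 * ((z - m) / s)**2 - np.log(s)    # up to an additive constant
    f = log_joint(z) - log_q                       # instantaneous ELBO term
    d_m = (z - m) / s**2                           # grad of log q w.r.t. m
    d_log_s = ((z - m) / s)**2 - 1.0               # grad of log q w.r.t. log_s
    m += lr * np.mean(f * d_m)
    log_s += lr * np.mean(f * d_log_s)

print("q mean ~", m, " q std ~", np.exp(log_s))
# Should approach the exact Gaussian posterior (mean near the sample mean,
# std near 0.14 for this prior and likelihood).

Because only log_joint is queried, the same optimizer applies unchanged to any model a probabilistic program can express, which is the portability argument the description makes.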

Schedule

02:30 PM

Variational Auto-Encoders and Extensions

Kingma

02:50 PM

Automatic Differentiation Variational Inference in Stan

Kucukelbir

03:15 PM

Variational Gaussian Process

Tran

03:35 PM

Black Box Policy Search with Probabilistic Programs

van de Meent

04:30 PM

Importance Weighted Autoencoders

Salakhutdinov

Machine Learning From and For Adaptive User Technologies: From Active Learning & Experimentation to Optimization & Personalization Joseph Jay Williams, Yasin Abbasi, Finale Doshi-Velez 514 a, Sat Dec 12, 08:30 AM

***UP TO DATE SCHEDULE is at Website: tiny.cc/mlaihci or https://sites.google.com/site/mlaihci/*** (MLAIHCI – Machine Learning, Artificial Intelligence, Human-Computer Interaction)

TENTATIVE SCHEDULE (tiny.cc/mlaihci has UPDATED version)

8:50 Introductions

9:00 Michael Littman, Brown University: "Reinforcement Learning from users: New algorithms and frameworks"


10-10:30 Coffee Break

Machine Teaching

10:30 Jerry Zhu, University of Wisconsin Madison: "Machine Teaching as a Framework for Personalized Education"

Hoang M. Le, Yisong Yue, & Peter Carr. "Smooth Imitation Learning." [PDF]

11:45-1:30 Lunch

Embedding Algorithms in User Technologies

1:30 John Langford, Microsoft Research: "An Interactive Learning Platform for Making Decisions"

Neil Heffernan, Worcester Polytechnic Institute: "Enabling real-time evaluation of crowdsourced machine learning algorithms: Experimentation and Personalization in online math problems on ASSISTments.org"

3:00-4:00 Spotlights & Posters

4-4:30 Coffee Break

4:30 Ambuj Tewari, Huitian Lei, & Susan Murphy, University of Michigan: "From Ads to Interventions: Contextual Bandit Algorithms for Mobile Health" (NIH application to "Heartsteps")

5:30-6:30 Conclusions & Future Directions

PRESENTATIONS

Jerry Zhu, University of Wisconsin Madison: "Machine Teaching as a Framework for Personalized Education"

Michael Littman, Brown University: "Reinforcement Learning from users: New algorithms and frameworks"

John Langford, Microsoft Research: "An Interactive Learning Platform for Making Decisions"

Neil Heffernan, Worcester Polytechnic Institute: "Enabling real-time evaluation of crowdsourced machine learning algorithms: Experimentation and Personalization in online math problems on ASSISTments.org"

Ambuj Tewari, Huitian Lei, & Susan Murphy, University of Michigan: "From Ads to Interventions: Contextual Bandit Algorithms for Mobile Health" (NIH application to "Heartsteps")

Hoang M. Le, Yisong Yue, & Peter Carr. "Smooth Imitation Learning."

Bo Zhang. "Machine Teaching via Simulation Optimization."

He He, Paul Mineiro, & Nikos Karampatziakis. "Active Information Acquisition."

Adish Singla, Sebastian Tschiatschek, & Andreas Krause. "Adaptive Sampling for Noisy Submodular Maximization with Applications to Crowdsourced Image Collection Summarization."

Bei Peng, James MacGlashan, Robert Loftin, Michael L. Littman, David L. Roberts, & Matthew E. Taylor. "A Need for Speed: Adapting Agent Action Speed to Improve Performance of Task Learning from Turkers."

Wei Sun, Anshul Sheopuri, Ying Li, & Thales S. Teixeira. "Cognitive Advertisement Design via Dynamic Bayesian Network."

Stefanos Poulis & Sanjoy Dasgupta. "Interactive annotation with feature feedback: from theory to practice."

Jens Schreiter, Mona Eberts, Duy Nguyen-Tuong, & Marc Toussaint. "Safe Exploration for Active Learning with Gaussian Process Models."

Theja Tulabandhula. "Learning Personalized Optimal Control for Repeatedly Operated Systems."

WORKSHOP TOPICS

How can machine learning be embedded into user technologies to actively guide sampling of data and discovery through interventions, while also automatically optimizing and personalizing for user outcomes? For example:

> How do Massive Open Online Courses automatically improve over time to maximize student learning and enjoyment as data is collected?

> How can mobile apps minimize negative health behaviors like smoking and overeating by testing what interventions work for which people?

> What machine learning methods have been successful versus ineffective in industry applications that trade off exploration and exploitation, such as personalized search and recommendations?

A great deal of machine learning already addresses the 'big data' collected from people's use of online technologies, from well known Internet companies to novel settings like online education and health apps. But far less research has analyzed how algorithms can learn about the world more effectively by adapting online user technologies to make decisions about what data is collected, how experiments for causal discovery are conducted, and how to negotiate tradeoffs between exploration versus exploitation in real-time and with limited computational resources. Algorithms in real-world technologies must also


optimize outcomes for users by adapting and personalizing how technology interacts with people. Ideally, algorithms instantiated in adaptive technologies appropriately trade off exploration – drawing inferences and guiding their learning in real time to do more effective learning and discovery – with exploitation – immediately deploying what is dynamically learned from resource bounded computations to optimize user outcomes through technology personalization. Themes of the workshop will include:

> Generalizations and applications of reinforcement learning for real-time policy learning

> Contextual and multi-armed bandits for active experimentation and personalization

> Selecting and distributing interventions for causal discovery, optimal experimental design

> Active, Online, Sequential Machine Learning

> Interactive machine learning

Around the common theme of algorithms that learn from intervening and collecting data in real-world large-scale online technologies, the workshop will bring together researchers in machine learning as well as statistics, human-computer interaction, education, health, and cognitive science. This allows sharing of critical knowledge about how online technologies can be designed in a way that advances machine learning research, as well as extending the ties between applications of machine learning to online websites and services.
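As a small illustration of the bandit theme above (reduced to the non-contextual case for brevity), the following sketch uses Thompson sampling to adapt which of several hypothetical interventions a system deploys while it is still learning their effectiveness. The intervention setting and success rates are invented for demonstration.

import numpy as np

rng = np.random.default_rng(4)

# Hypothetical setting: 3 candidate interventions (e.g., hint styles in an
# online course), each with an unknown success probability.
true_rates = [0.30, 0.50, 0.45]
successes = np.ones(3)   # Beta(1, 1) prior for each arm
failures = np.ones(3)

picks = np.zeros(3, dtype=int)
for t in range(5000):
    # Thompson sampling: draw a plausible rate per arm, act greedily on draws.
    theta = rng.beta(successes, failures)
    arm = int(np.argmax(theta))
    reward = rng.random() < true_rates[arm]
    successes[arm] += reward
    failures[arm] += 1 - reward
    picks[arm] += 1

print("pulls per intervention:", picks)   # concentrates on the 0.50 arm
print("posterior means:", successes / (successes + failures))

The posterior sampling step is what mediates the exploration-exploitation trade-off the description emphasizes: uncertain arms are still tried occasionally, while the apparent best arm receives most of the traffic.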

Transfer and Multi-Task Learning: Trends and New Perspectives Anastasia Pentina, Christoph Lampert, Sinno Jialin Pan, Mingsheng Long, Judy Hoffman, Baochen Sun, Kate Saenko 514 bc, Sat Dec 12, 08:30 AM

NIPS 2015 Workshop book

This workshop aims to bring together researchers and practitioners from machine learning, computer vision, natural language processing and related fields to discuss and document recent advances in transfer and multi-task learning. This includes the main topics of transfer and multi-task learning, together with several related variants such as domain adaptation and dataset bias, and new discoveries and directions in deep learning based approaches. Transfer and multi-task learning methods aim to better exploit the available data during training and adapt previously learned knowledge to new domains or tasks. This mitigates the burden of human labeling for emerging applications and enables learning from very few labeled examples. In the past years there has been increasing activity in these areas, mainly driven by practical applications (e.g. object recognition, sentiment analysis) as well as state-of-the-art deep learning frameworks (e.g. CNN). Of the recently proposed solutions, most lack joint theoretical justifications, especially the deep learning based approaches. On the other hand, most of the existing theoretically justified approaches are rarely used in practice. This NIPS 2015 workshop will focus on closing the gap between theory and practice by providing an opportunity for researchers and practitioners to get together, to share ideas and debate current theories and empirical results. The goal is to promote a fruitful exchange of ideas across different communities, leading to global advancement of the field. Tentative topics:

New perspectives or theories on transfer and multi-task learning
Dataset bias and concept drift
Domain adaptation
Multi-task learning
Zero-shot or one-shot learning

Page 50 of 52

Generated Fri Feb 01, 2019

Feature based approaches
Instance based approaches
Deep architectures for transfer and multi-task learning
Transferability of deep representations
Transfer across different architectures, e.g. CNN to RNN
Transfer across different modalities, e.g. image to text
Transfer across different tasks, e.g. recognition and detection
Transfer from weakly labeled or noisy data, e.g. Web data
Transfer in practical settings, e.g. online, active, and large-scale learning
Innovative applications, e.g. machine translation, computational biology
Datasets, benchmarks, and open-source packages

Schedule

08:50 AM

Intro and Adapting Deep Networks Across Domains, Modalities, and Tasks

Darrell

09:00 AM

Learning Shared Representations in MDPs

Borsa

09:05 AM

On Weight Ratio Estimation for Covariate Shift

Urner

09:30 AM

The Benefit of Multitask Representation Learning

Pontil

10:30 AM

A Theory of Multiple Source Adaptation

Mohri

11:40 AM

Learning Representations for Unsupervised and Transfer Learning

Bengio

02:30 PM

Domain Adaptation for Binary Classification

Ben-David

03:00 PM

Multitask Generalized Eigenvalue Program

Wang

03:30 PM

Actor-Mimic

Parisotto

05:00 PM

Sharing the "How" (and not the "What")

Liang

05:30 PM

Transitive Transfer Learning

Yang

Machine Learning and Interpretation in Neuroimaging (day 1) Irina Rish, Leila Wehbe, Brian Murphy, Georg Langs, Guillermo Cecchi, Moritz Grosse-Wentrup 515 a, Sat Dec 12, 08:30 AM

Bayesian Nonparametrics: The Next Generation Tamara Broderick, Nick Foti, Aaron Schein, Alex Tank, Hanna Wallach, Sinead Williamson 515 bc, Sat Dec 12, 08:30 AM In theory, Bayesian nonparametric (BNP) methods are perfectly suited to the modern-day, large data sets that arise in the physical, natural, and social sciences, as well as in technology and the

humanities. By making use of infinite-dimensional mathematical structures, Bayesian nonparametric statistics allows the complexity of a learned model to grow as the size of a data set grows---exhibiting desirable Bayesian regularization properties for small data sets and allowing the practitioner to learn ever more from data sets as they become larger. This flexibility, however, presents both computational and modeling challenges. While there have been recent developments in accelerated inference for Bayesian nonparametric models, many approaches are not appropriate for large datasets. Further, while we have seen a growth in models for applied problems that move beyond the foundational Dirichlet and Gaussian processes, the widespread adoption of BNP methods has been limited in applied fields. In this workshop, we will address the modeling, theoretical, and computational challenges limiting adoption and how they can be circumvented. In particular, we will engage with applications specialists to better understand the best directions for BNP development as a tool for conducting applied research. We will explore computational tools for posterior inference algorithms that address the unique challenges of BNP methods including high/infinite-dimensionality and flexibility: e.g., MCMC, SMC, variational methods, and small-variance asymptotics to name a few. We will also consider the design and implementation of software to perform Bayesian nonparametric analyses, both for detailed use by experts in the field and for automatic use by researchers outside the field. This workshop will bring together core researchers in BNP across a number of fields (machine learning, statistics, engineering, applied mathematics, etc.) with researchers working in a variety of application domains. We aim to focus on the next generation of BNP research by highlighting the contributions of younger researchers in the BNP community. We


anticipate that participants will leave the workshop with (i) a foundation for understanding BNP methods, (ii) a perspective on recent advances in the field via a number of invited and contributed talks as well as poster presentations, and (iii) an idea of the challenges facing the field and future opportunities via talks and a panel discussion featuring experts both within and outside of the BNP community. Schedule

08:30 AM

Random Measure Priors and Tractability

Orbanz

09:00 AM

Non-standard approaches to nonparametric Bayes

Miller

09:30 AM

Edge-exchangeable graphs, sparsity, and power laws

Cai

09:45 AM

The general class of (sparse) random graphs arising from exchangeable point processes

Veitch

03:30 PM

Bayesian Cluster Analysis: Point Estimation and Credible Balls

Wade

04:00 PM

Mondrian Forests for Large-Scale Regression when Uncertainty Matters

Lakshminarayanan

04:15 PM

Parallel Markov Chain Monte Carlo for the Indian Buffet Process

Zhang

05:00 PM

Large Scale Topic Models & Nonparametric Bayesian Models

Ahmed

05:30 PM

Modeling the Dynamics of Online Learning Activity

Valera

06:00 PM

Looking for what you can't see: exploring subgroups using model-based clustering

Moser
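The growth of model complexity with data set size that the abstract describes can be made concrete with the Chinese restaurant process, the combinatorial structure underlying Dirichlet process clustering. The sketch below is illustrative only, not workshop material; the concentration parameter is an arbitrary choice.

import numpy as np

rng = np.random.default_rng(5)

def crp_table_counts(n, alpha):
    """Sample cluster ("table") sizes from a Chinese restaurant process."""
    counts = []
    for i in range(n):
        # Customer i joins existing table k with probability counts[k]/(i+alpha),
        # or starts a new table with probability alpha/(i+alpha).
        probs = np.array(counts + [alpha], dtype=float) / (i + alpha)
        k = rng.choice(len(probs), p=probs)
        if k == len(counts):
            counts.append(1)
        else:
            counts[k] += 1
    return counts

for n in (100, 1000, 10000):
    k = len(crp_table_counts(n, alpha=2.0))
    print(f"n={n:6d}: {k:3d} clusters (E[K] ~ alpha*log n ~ {2.0*np.log(n):.1f})")

The number of clusters grows roughly logarithmically with the number of observations, which is the sense in which a BNP model lets the practitioner "learn ever more from data sets as they become larger" while remaining regularized on small data.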
