Context and Compositionality in Biological and Artificial Neural Systems

NeurIPS 2019 Workshop, Vancouver, Canada

December 14th, 2019

Google Group


  1. Gary Marcus - Deep Understanding: The Next Challenge for AI (starts at 9:55)
  2. Gina Kuperberg - How probabilistic is language prediction in the brain? Insights from multimodal neuroimaging studies (starts at 54:36)
  3. Spotlight talks, incl. Paul Soulos - Uncovering the Compositional Structure of Vector Representations with Role Learning Networks (starts at 0:00), Robert Kim - Spiking Recurrent Networks as a Model to Probe Neuronal Timescales Specific to Working Memory (starts at 11:06), & Maxwell Nye - Learning Compositional Rules via Neural Program Synthesis (starts at 20:32)
  4. Tom Mitchell - Understanding Neural Processes: Beyond Where, and When, to How (starts at 30:14)
  5. Yoshua Bengio - Towards compositional understanding of the world by agent-based deep learning
  6. Ev Fedorenko - Composition as the Core Driver of the Human Language System
  7. Panel discussion incl. Paul Smolensky, Ev Fedorenko, Jacob Andreas, Kenton Lee, Gary Marcus, Yoshua Bengio, & Gina Kuperberg (chaired by Ted Willke)


The ability to integrate semantic information across narratives is fundamental to language understanding in both biological and artificial cognitive systems. In recent years, enormous strides have been made in NLP and machine learning to develop architectures and techniques that effectively capture this integration of context. The field has moved away from traditional bag-of-words approaches that ignore temporal ordering, and has instead embraced RNNs [1][2][3][4], temporal CNNs [5], and Transformers [6], which incorporate contextual information at varying timescales. While these architectures have led to state-of-the-art performance on many difficult language understanding tasks [7][8], it is unclear what representations these networks learn and how exactly they incorporate context. Interpreting these networks, systematically analyzing the advantages and disadvantages of different elements, such as gating or attention, and reflecting on the capacity of the networks across various timescales are open and important questions.

On the biological side, recent work in neuroscience suggests that areas in the brain are organized into a temporal hierarchy in which different areas are sensitive not only to specific semantic information [9] but also to the composition of information at different timescales [10][11]. Computational neuroscience has moved in the direction of leveraging deep learning to gain insights about the brain [12][13]. By answering questions on the underlying mechanisms and representational interpretability of these artificial networks, we can also expand our understanding of temporal hierarchies, memory, and capacity effects in the brain.

In this workshop, we aim to bring together researchers from machine learning, NLP, and neuroscience to explore and discuss how computational models should effectively capture the multi-timescale, context-dependent effects that seem essential for processes such as language understanding. We believe that this will lead both to a deeper understanding of biological language systems and to improved artificial systems that leverage these insights to better understand language.

Important Dates

Name                       Date
Paper Submission Deadline  September 18th, 2019, 23:59 Anywhere on Earth (updated; previously September 9th, 2019)
Final Decisions            September 30th, 2019
Camera Ready               November 15th, 2019
Workshop Date              December 14th, 2019


Date: Saturday, December 14th, 2019

Room: West 217 - 219

Time Event
08:00 AM Opening Remarks
08:15 AM Gary Marcus - Deep Understanding: The Next Challenge for AI [Slides]
09:00 AM Gina Kuperberg - How Probabilistic is Language Comprehension in the Brain? Insights from Multimodal Neuroimaging Studies [Slides]
09:45 AM Poster Session + Break
10:30 AM Paul Soulos - Uncovering the Compositional Structure of Vector Representations with Role Learning Networks [Slides]
10:40 AM Robert Kim - Spiking Recurrent Networks as a Model to Probe Neuronal Timescales Specific to Working Memory [Slides]
10:50 AM Maxwell Nye - Learning Compositional Rules via Neural Program Synthesis [Slides]
11:00 AM Tom Mitchell - Understanding Neural Processes: Getting Beyond Where and When, to How [Slides]
12:00 PM Poster Session + Lunch
02:00 PM Yoshua Bengio - Towards Compositional Understanding of the World by Agent-Based Deep Learning [Slides]
03:00 PM Ev Fedorenko - Composition as the Core Driver of the Human Language System
03:30 PM Break
04:00 PM Panel Discussion: Ev Fedorenko, Kenton Lee, Paul Smolensky [Slides], Jacob Andreas
05:30 PM Closing Remarks

Invited Speakers

Yoshua Bengio

MILA - University of Montreal

Gina Kuperberg

Tufts University and Massachusetts General Hospital

Tom Mitchell

Carnegie Mellon University

Gary Marcus

New York University and Robust.AI

Paul Smolensky

Johns Hopkins University and Microsoft Research

Jacob Andreas

Massachusetts Institute of Technology


Javier Turek

Intel Labs

Alex Huth

The University of Texas at Austin

Shailee Jain

The University of Texas at Austin

Chris Honey

Johns Hopkins University

Tal Linzen

Johns Hopkins University

Emma Strubell

Facebook and Carnegie Mellon University

Leila Wehbe

Carnegie Mellon University

Kyunghyun Cho

Facebook and New York University

Alan Yuille

Johns Hopkins University

Accepted Papers

  1. Self-Organization of Action Hierarchy and Compositionality by Reinforcement Learning with Recurrent Networks. Dongqi Han, Kenji Doya, Jun Tani
  2. Multi-Context Term Embeddings: the Use Case of Corpus-based Term Set Expansion. Jonathan Mamou, Oren Pereg, Moshe Wasserblat, Ido Dagan
  3. Learning Compositional Rules via Neural Program Synthesis. Maxwell Nye, Armando Solar-Lezama, Joshua Tenenbaum, Brenden Lake
  4. Spiking Recurrent Networks as a Model to Probe Neuronal Timescales Specific to Working Memory. Robert Kim, Terry Sejnowski
  5. Localizing Occluders with Compositional Convolutional Networks. Adam Kortylewski, Qing Liu, Huiyu Wang, Zhishuai Zhang, Alan Yuille
  6. Why Attention? Analyzing and Remedying BiLSTM Deficiency in Modeling Cross-Context for NER. Peng-Hsuan Li, Tsu-Jui Fu, Wei-yun Ma
  7. A crossover code for high-dimensional composition. Rich Pang
  8. Radically Compositional Cognitive Concepts. Toby St Clere Smithe
  9. A memory enhanced LSTM for modeling complex temporal dependencies. Sneha Aenugu
  10. Towards Generation of Visual Attention Map for Source Code. Takeshi D Itoh, Takatomi Kubo, Kiyoka Ikeda, Yuki Maruno, Yoshiharu Ikutani, Hideaki Hata, Kenichi Matsumoto, Kazushi Ikeda
  11. Long-Distance Dependencies don’t have to be Long: Simplifying through Provably (Approximately) Optimal Permutations. Rishi Bommasani
  12. A representational asymmetry for composition in the human left-middle temporal gyrus. Steven Frankland
  13. Uncovering the compositional structure of vector representations with Role Learning Networks. Paul Soulos, R. Thomas McCoy, Tal Linzen, Paul Smolensky
  14. Modelling the N400 brain potential as change in a probabilistic representation of meaning. Milena Rabovsky, Steven Hansen, James McClelland
  15. Interpreting and improving natural-language processing (in machines) with natural language-processing (in the brain). Mariya Toneva, Leila Wehbe
  16. Enhancing the Transformer with Explicit Relational Encoding for Math Problem Solving. Imanol Schlag, Paul Smolensky, Roland Fernandez, Nebojsa Jojic, Jianfeng Gao
  17. Inducing brain-relevant bias in natural language processing models. Dan Schwartz, Mariya Toneva, Leila Wehbe
  18. Sparse associative memory based on contextual code learning for disambiguating word senses. Max Raphael Sobroza Marques, Tales Marra, Deok-Hee Dufor, Claude Berrou
  19. Structured Sparsification of Gated Recurrent Neural Networks. Ekaterina Lobacheva, Nadezhda Chirkova, Aleksandr Markovich, Dmitry Vetrov
  20. Compositionality as Directional Consistency in Sequential Neural Networks. Najoung Kim, Tal Linzen
  21. Modelling Working Memory using Deep Recurrent Reinforcement Learning. Pravish Sainath, Pierre Bellec, Guillaume Lajoie
  22. Aging Memories Generate More Fluent Dialogue Responses with Memory Augmented Neural Networks. Omar U Florez, Erik Muller
  23. On Compositionality in Neural Machine Translation. Vikas Raunak, Vaibhav Kumar, Florian Metze, Jamie Callan
  24. Learning to Control Latent Representations on an External Memory for Few-Shot Learning. Omar U Florez, Erik Muller
  25. Decoding Affirmative and Negated Action-Related Sentences in the Brain with Distributional Semantic Models. Vesna Djokic

Call for Papers

Submit at:

We will consider the following (non-exhaustive) list of topics for contribution:

  • Contextual sequence processing in the human brain
  • Compositional representations
  • Systematic generalization in deep learning
  • Compositionality in human intelligence
  • Compositionality in natural language
  • Understanding composition and temporal processing in neural network models
  • New approaches to compositionality and temporal processing in language
  • Hierarchical representations of temporal information
  • Datasets for contextual sequence processing
  • Applications of compositional neural networks to real-world problems

Formatting Instructions: All submissions must be in PDF format. Submissions are limited to four content pages, including all figures and tables; additional pages containing only references are allowed. You must format your submission using the NeurIPS 2019 LaTeX style file. Submissions that violate the NeurIPS style (e.g., by decreasing margins or font sizes) or page limits may be rejected without further review. All submissions should be anonymous.

Accepted papers will be presented during a poster session, with spotlight oral presentations for exceptional submissions. The accepted papers will be made publicly available as non-archival reports, allowing future submissions to archival conferences or journals.

The review process is double-blind. We also welcome published papers that are within the scope of the workshop (without re-formatting). Already-published papers do not have to be anonymous. They are eligible for poster sessions and will only have a very light review process.

Poster Instructions

Posters should be no larger than 36 x 48 inches (W x H), or 90 x 122 cm, in portrait orientation. Posters should also be printed on lightweight paper and not laminated. They will be taped to the wall with special tabs that we will supply.

Please redirect questions and all future correspondence to

Program Committee

  • Abhijit Mahabal, Pinterest
  • Cassandra Jacobs, University of Wisconsin
  • Chris Baldassano, Columbia University
  • Cory Shain, The Ohio State University
  • Emily Mugler, Facebook
  • Evelina Fedorenko, Massachusetts Institute of Technology
  • Guangyu Robert Yang, Columbia University
  • Jascha Sohl-Dickstein, Google Brain
  • Jun Tani, Okinawa Institute of Science and Technology Graduate University
  • Katrin Erk, University of Texas
  • Kenji Doya, Okinawa Institute of Science and Technology
  • Liberty Hamilton, The University of Texas at Austin
  • Samuel Bowman, NYU and Google Research
  • Shimon Edelman, Cornell University
  • Srini Narayanan, Google AI Language
  • Vy Vo, Intel Labs




  1. S. Hochreiter and J. Schmidhuber. Long short-term memory. Neural Comput., 9(8):1735–1780, Nov. 1997.
  2. J. Chung, C. Gulcehre, K. Cho, and Y. Bengio. Gated feedback recurrent neural networks. In Proceedings of the 32nd International Conference on Machine Learning, volume 37 of Proceedings of Machine Learning Research, pages 2067–2075, Lille, France, 07–09 Jul 2015.
  3. S. Chandar, C. Sankar, E. Vorontsov, S. E. Kahou, and Y. Bengio. Towards non-saturating recurrent units for modelling long-term dependencies. arXiv preprint arXiv:1902.06704, 2019.
  4. J. Chung, S. Ahn, and Y. Bengio. Hierarchical multiscale recurrent neural networks. In ICLR, 2017.
  5. S. Bai, J. Z. Kolter, and V. Koltun. An empirical evaluation of generic convolutional and recurrent networks for sequence modeling. CoRR, abs/1803.01271, 2018.
  6. A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin. Attention is all you need. In I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, editors, Advances in Neural Information Processing Systems 30, pages 5998–6008. Curran Associates, Inc., 2017.
  7. M. Peters, M. Neumann, M. Iyyer, M. Gardner, C. Clark, K. Lee, and L. Zettlemoyer. Deep contextualized word representations. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pages 2227–2237, New Orleans, Louisiana, June 2018. Association for Computational Linguistics.
  8. J. Devlin, M. Chang, K. Lee, and K. Toutanova. BERT: pre-training of deep bidirectional transformers for language understanding. CoRR, abs/1810.04805, 2018.
  9. A. G. Huth, W. A. de Heer, T. L. Griffiths, F. E. Theunissen, and J. L. Gallant. Natural speech reveals the semantic maps that tile human cerebral cortex. Nature, 532(7600):453–458, 2016.
  10. C. Baldassano, J. Chen, A. Zadbood, J. W. Pillow, U. Hasson, and K. A. Norman. Discovering event structure in continuous narrative perception and memory. Neuron, 95(3):709–721.e5, 2017.
  11. K. D. Himberger, H.-Y. Chien, and C. J. Honey. Principles of temporal processing across the cortical hierarchy. Neuroscience, 389:161–174, 2018. Sensory Sequence Processing in the Brain.
  12. L. Wehbe, A. Vaswani, K. Knight, and T. Mitchell. Aligning context-based statistical models of language with brain activity during reading. In EMNLP, 2014.
  13. S. Jain and A. Huth. Incorporating context into language encoding models for fMRI. In Advances in Neural Information Processing Systems 31, pages 6628–6637, 2018.
  14. Y. Lerner, C. J. Honey, L. J. Silbert, and U. Hasson. Topographic mapping of a hierarchy of temporal receptive windows using a narrated story. Journal of Neuroscience, 31(8):2906–2915, 2011.