Big Data Neuroscience 2020 Workshop: Organized by the Advanced Computational Neuroscience Network (ACNN)

Recording of the 2020 Workshop

September 4, 2020


Date/Time: Friday, Sept 4, 2020, 1-5 PM US Eastern Time (GMT-4)
Registration (free): Registration Webform
Submit Title/Abstract: Enter your metadata in this GSheet if you would like to present a short 2-3-min rapid-fire (lightning) talk in one of the 2 sessions. Students, trainees, fellows and junior investigators are strongly encouraged to apply for a lightning talk by submitting all the required metadata.
URL to join the symposium: A ZOOM URL will be provided with your registration.


The Advanced Computational Neuroscience Network (ACNN) is a trans-institutional collaboration of many US universities in the Midwest region. ACNN strives to build a broad consensus on the core requirements, infrastructure, and components needed to develop a new generation of sustainable interdisciplinary Big Neuroscience research. ACNN forges new collaborations to establish standards using neuroscience-focused ontologies, incorporate provenance metadata management, aggregate tools, index resources and repositories, and curate and share validated pipeline workflows. Many ACNN partners strongly support the creation of a sustainable neuroscience community that can effectively address the challenges of neuroscience Big Data and leverage its unique resources and human capital. A particular strength of the ACNN is its existing array of neuroscience datasets, analytical tools, and provenance metadata platforms, which bring together common data sharing principles and interoperability platforms for the neuroscience research community.

ACNN 2020 (Virtual Symposium): Big Data Neuroscience
The 2020 annual ACNN symposium will take place virtually over ZOOM. This interdisciplinary and trans-institutional workshop will continue to promote the development of common practices and standardization to make it easier for neuroscience researchers to annotate and process data; to share data, tools and protocols; and to work with distributed high-performance computing environments. The symposium will feature keynote presentations from world-renowned speakers, hands-on demonstrations, rapid-fire community presentations, and an open-mic networking discussion. Each year, ACNN brings together members of the Midwest, national, and global neuroscience research communities to promote data reuse, aggregation, result validation and new discoveries in neuroscience.


1:00-1:30 PM Keynote 1 (Paul Thompson/USC) (20min + 10min discussion): ENIGMA, Big Data & the Human Brain: Imaging & Genomics of Brain Diseases in 100,000 Individuals from 45 Countries
1:30-1:55 PM Session 1 RAPID-FIRE demos/presentations 8x at 2-3-min ~ 30-min (Moderator: John Marcotte)
2:00-2:30 PM Hands-on tutorials (15-min each) ~ 30 min
2:30-3:00 PM Keynote 2 (Aina Puce/Indiana) (20-min + 10min discussion): Multimodal imaging datasets in single subjects: integrating MEEG data into
3:00-3:30 PM Hands-on tutorials (15-min each) ~ 30 min
3:30-3:55 PM Session 2 RAPID-FIRE demos/presentations 8x at 2-3-min ~ 30-min (Moderator: John Marcotte)
4:00-4:30 PM Keynote 3 (Tal Yarkoni/UT-Austin) (20min + 10min discussion): Towards large-scale synthesis and re-use of fMRI data: Neurosynth, NeuroScout and friends.
4:30-4:50 PM Keynote 4 (JB Poline/McGill University) (15min + 5min discussion): Necessary elements for the next generation of neuroscience infrastructures: the CONP and NeuroHub examples.
4:50 PM Q&A session with program officers (POs). Moderator: Lei Wang
NSF/NIH POs may present 2-5-min of ACNN-appropriate PA/Opportunities/Challenges.
5:15 PM Concluding Remarks. Adjourn.

2020 Symposium Evaluation Webform (all participants are strongly encouraged to provide constructive feedback).

Keynote Abstracts

Paul Thompson

ENIGMA, Big Data & the Human Brain: Imaging & Genomics of Brain Diseases in 100,000 Individuals from 45 Countries

Since 2009, the ENIGMA Consortium has published the largest genetic studies of the human brain, and the largest neuroimaging studies of 9 brain disorders, pooling data from 45 countries. Building on large-scale genetic studies yielding over 200 robustly replicated genetic loci associated with brain metrics (Grasby et al., Science 2020), ENIGMA conducted the largest neuroimaging studies to date in schizophrenia, bipolar disorder, major depressive disorder, post-traumatic stress disorder, substance use disorders, obsessive-compulsive disorder, ADHD, ASD, epilepsy, and 22q11.2 deletion syndrome, revealing factors that influence their onset, severity, and prognosis. We describe how consensus protocols evolved for MRI, DTI, resting state fMRI, and EEG, as well as clinical, genomic and epigenetic data, leading to vast worldwide studies of over 22 disorders and conditions. Innovations in data analysis include the use of generative adversarial networks, algorithmic fairness, and information theory to help pool data from many sites, and distributed computation and cooperative machine learning to help discover patterns in distributed datasets. We also describe new approaches to harmonize data for multi-site computations, and how the assumptions of the approaches affect the results. Finally, we cover how deep learning and data fusion methods may help to identify patterns in multimodal brain data distributed in biobanks across the world.

PMT is funded in part by NIH grants U54 EB020403, R01MH116147, R56AG058854, P41 EB015922, R01MH111671 and a Zenith Grant from the Alzheimer’s Association.

Tal Yarkoni

Towards large-scale synthesis and re-use of fMRI data: Neurosynth, NeuroScout and friends

Cognitive neuroscientists have been generating large amounts of functional MRI data for nearly three decades now. Despite the considerable cost of such data, most newly-acquired datasets are still used to produce only one or two publications before being archived forever. In this talk, I discuss a number of recent efforts to improve the efficiency of fMRI research by facilitating the public deposition, sharing, re-analysis, and synthesis of fMRI datasets. I introduce resources like NeuroVault, OpenNeuro, Neurosynth, and NeuroScout, and demonstrate how researchers can now conduct high-quality, scalable, efficient, and reproducible research on a wide range of questions involving the human brain, without having to acquire new fMRI data.

Aina Puce, IU

Multimodal imaging datasets in single subjects: integrating MEEG data into

Existing USA and French collaborators will expand the functionality and user base of a cloud-based platform devoted to storage, curation, analysis, sharing and publication of neuroimaging data. Currently, users interact with and analyze magnetic resonance imaging [MRI] based data, performing brain structure and function analyses. Here, I discuss expanding the platform to handle human neurophysiology for the first time – specifically magnetoencephalographic and electroencephalographic [MEEG] data. MEEG’s high temporal resolution enhances studies of brain function in ways that MRI-based brain activity data cannot. We will implement data analysis ‘Apps’ on the platform that will allow users to perform both basic and sophisticated brain network analyses by integrating MEEG and MRI-based data. MEEG ‘Apps’ will use 2 widely-used open source MEEG software suites – FieldTrip [MATLAB-based] and MNE Python [Python-based]. We have the endorsement of the developers of these packages and the expertise in our team to expand the platform’s functionality. We created 4 MEEG projects that will also make scientific gains in computational, systems and cognitive-social neuroscience. Project 1 will deal with basic MEEG (pre)processing, targeting new users of the platform. Project 2 will provide simulation tools for evaluating required statistical power in a MEG experiment prior to running the study, benefitting both entry-level and sophisticated users. Projects 3 and 4 target more mid-level and experienced MEEG scientists. Project 3 will provide tools for source modelling of MEEG data and multimodal datasets in single subjects [from data recordings made in both USA and French laboratories from the 4 Co-PIs]. Finally, Project 4 will integrate MEEG data with white matter tract data in the human brain [based on structural MRI and diffusion weighted imaging [DWI] data]. This integrative analysis has already been generated in our existing collaboration.

JB Poline

Necessary elements for the next generation of neuroscience infrastructures: the CONP and NeuroHub examples

While a large number of neuroscience infrastructures exist, few of them are adapted to big data, whether that is a single large resource such as the UK Biobank or a collection of smaller datasets such as OpenNeuro. In this talk, I will consider a "gap analysis" of neuroscience infrastructures, and will take two examples from McGill University (the Canadian Open Neuroscience Platform and the NeuroHub platform), as well as others, that address some of the identified gaps.

Rapid Fire Talk Slides

    1. Knutson
    2. Deng
    3. Pascucci
    4. Levitas
    5. Silva
    6. Schiavi
    7. Kruper
    8. Berto
    1. Lee
    2. Fischer
    3. Gao
    4. Vinci-Booher
    5. -
    6. Lander
    7. Shen
    8. Li


This event is free. You can register at this Registration Webform.