What if Speech Could Unlock Early Detection of Alzheimer’s Disease?

In the early stages, Alzheimer's symptoms can be subtle and mistaken for normal age-related cognitive changes. While many people and their families may not recognize the signs as indicative of a serious cognitive disorder, early diagnosis can result in better quality of life and more effective treatment for those with Alzheimer’s disease.

The key to unlocking early diagnosis of Alzheimer’s disease may lie in our speech. Ongoing research has shown that subtle changes in speech patterns may precede noticeable cognitive symptoms. Furthermore, the ubiquity of smartphones, tablets and smart-home devices offers the potential to capture speech data that could be used to predict or monitor cognitive decline passively and securely, without the need for a visit to the doctor’s office or invasive testing.

SpeechDx is working to enable speech biomarkers

SpeechDx is a pre-competitive initiative of leading researchers from academia and industry. Our mission is to facilitate the development of speech-based biomarkers to identify potential Alzheimer’s disease and related dementias (ADRD) early on. We are building a broad and deep dataset of speech data across multiple languages that is longitudinal and linked to individuals who are clinically well-characterized. This initiative will provide researchers with high-quality, harmonized voice and clinical data to develop early and accurate diagnostic technology for ADRD.

Our vision of speech biomarkers to transform Alzheimer’s care

By building the world’s largest repository of longitudinal, harmonized speech and clinical data, SpeechDx is working to overcome the challenges and realize the promise of speech biomarkers. We envision a future where we can illuminate the path to accurate early detection of ADRD and ultimately, better care for those with Alzheimer’s disease.

Contact us to join the initiative

SpeechDx is building the world’s largest dataset for developing and validating ADRD speech biomarkers. Contact us to join the partnership and gain access to SpeechDx data.

speechdxteam@alzdiscovery.org

About the SpeechDx Study

The SpeechDx study is a multi-site, observational study of 2,650 participants across several global clinical sites (details below). SpeechDx participants are diverse, well-characterized, and span the brain health spectrum. Each participant generously provides high-quality speech data every quarter over three years, and this speech data is paired with in-depth clinical data and harmonized across SpeechDx partner sites.

| Site | Principal Investigator | Primary Language | Cognitively Normal | Subjective Cognitive Decline | Mild Cognitive Impairment | Alzheimer's disease | Other (Frontotemporal degeneration, Parkinson's disease, not yet classified) | Total |
|---|---|---|---|---|---|---|---|---|
| Boston University | Dr. Rhoda Au | English | 27 | 74 | 50 | 22 | 27 | 200 |
| Emory University | Dr. Felicia Goldstein | English | 225 | | 113 | 45 | 67 | 450 |
| Barcelona Brain Health Initiative | Dr. Javier Solana Sanchez | Spanish, Catalan | 360 | 140 | | | | 500 |
| Barcelona Beta Brain Research Centre | Dr. Andreea Rădoi | Spanish, Catalan | 141 | 249 | 10 | 0 | 0 | 400 |
| Bogalusa Heart Study | Dr. Ileana De Anda-Duran | English | 300 | 80 | 20 | | | 400 |
| Ace Alzheimer Center | Dr. Sergi Valero Ventura | Spanish, Catalan | | 6 | 230 | 164 | | 400 |
| Other (TBA) | TBA | English | 150 | | 50 | 10 | 90 | 300 |
| All | | | 1203 | 549 | 473 | 241 | 184 | 2650 |
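
The enrollment figures in the table above can be checked for internal consistency with a few lines of Python. This is an illustrative sketch only: it treats blank cells as zero and assigns each site's counts to the diagnostic groups in the way that reproduces the row and column totals shown in the table. The variable names are hypothetical and not part of any SpeechDx tooling.

```python
# Planned enrollment per site, in column order:
# (cognitively normal, subjective cognitive decline,
#  mild cognitive impairment, Alzheimer's disease, other)
# Assumption: blank table cells are treated as 0.
ENROLLMENT = {
    "Boston University": (27, 74, 50, 22, 27),
    "Emory University": (225, 0, 113, 45, 67),
    "Barcelona Brain Health Initiative": (360, 140, 0, 0, 0),
    "Barcelona Beta Brain Research Centre": (141, 249, 10, 0, 0),
    "Bogalusa Heart Study": (300, 80, 20, 0, 0),
    "Ace Alzheimer Center": (0, 6, 230, 164, 0),
    "Other (TBA)": (150, 0, 50, 10, 90),
}

# Row totals (per site), column totals (per diagnostic group), grand total.
site_totals = {site: sum(counts) for site, counts in ENROLLMENT.items()}
group_totals = [sum(column) for column in zip(*ENROLLMENT.values())]
study_total = sum(site_totals.values())

print(site_totals)
print(group_totals)  # compare with the "All" row of the table
print(study_total)   # compare with the overall total of 2,650
```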

Data Collection and Harmonization

SpeechDx collects two types of data—speech data and clinical data. These two types of data are paired and harmonized across participating sites.

Speech data

Participants contribute up to three years of speech data quarterly via the SpeechDx app. The SpeechDx app battery comprises three baseline assessments and a total of nine speech-eliciting tasks:

Baseline assessments

  • Questionnaire (PHQ-8)
  • Sleepiness scale (Karolinska)
  • Vigilance assessment (PVT)

Speech-eliciting tasks

  • Picture description 1
  • Open-ended questions (3x)
  • Picture description 1 recall
  • Picture description 2
  • Story recall task
  • Storytelling task
  • Storytelling task recall
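
One way to picture the battery is as an ordered list of tasks with repeat counts. The sketch below is hypothetical and is not the SpeechDx app's actual data model: the `Task` class, its field names, and the ordering (which simply follows the lists above) are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    kind: str         # "baseline" or "speech" (hypothetical labels)
    repeats: int = 1  # how many times the task appears in one session


BATTERY = [
    Task("Questionnaire (PHQ-8)", "baseline"),
    Task("Sleepiness scale (Karolinska)", "baseline"),
    Task("Vigilance assessment (PVT)", "baseline"),
    Task("Picture description 1", "speech"),
    Task("Open-ended questions", "speech", repeats=3),
    Task("Picture description 1 recall", "speech"),
    Task("Picture description 2", "speech"),
    Task("Story recall task", "speech"),
    Task("Storytelling task", "speech"),
    Task("Storytelling task recall", "speech"),
]


def count(kind):
    """Count tasks of one kind, counting repeated tasks once per repeat."""
    return sum(task.repeats for task in BATTERY if task.kind == kind)


print(count("baseline"), count("speech"))
```

Counting the three open-ended questions separately gives the nine speech-eliciting tasks mentioned above, alongside three baseline assessments.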

Clinical data

SpeechDx participants are thoroughly characterized via regular clinical visits over the three-year duration of the study. Each SpeechDx participant undergoes thorough neuropsychological testing and MRI, and has a defined blood-biomarker amyloid status. Many participants additionally undergo PET imaging and have a defined cerebrospinal fluid status.

Data journey

The privacy of our participating subjects is paramount, and SpeechDx protects and deidentifies all participant data.

Speech data will be collected from each subject via a study-provided tablet once every quarter for the duration of the study. This data will be encrypted, but may contain identifiable voice data (for example, if a subject discloses their name or other personal information in the recording). To remove any personally identifiable information (PII), the encrypted voice data will be transferred to a secure backend server where any PII will be manually spliced out of the recording, resulting in pseudonymized voice data free of any PII that may inadvertently have been shared by the participant. This voice data will then be securely transferred to the AD Curation Studio hosted by the Alzheimer’s Disease Data Initiative (ADDI).
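
To make the splicing step concrete, the sketch below removes reviewer-flagged time intervals from a recording represented as a plain list of samples. It is a simplified illustration, not the SpeechDx backend: the function name, the interval format, and the idea that a human reviewer supplies start and end times in seconds are all assumptions.

```python
def splice_out(samples, sample_rate, flagged_intervals):
    """Return a copy of `samples` with the flagged time intervals removed.

    flagged_intervals is a list of (start_seconds, end_seconds) pairs that a
    human reviewer has marked as containing PII (hypothetical format).
    Overlapping or out-of-range intervals are handled gracefully.
    """
    kept = []
    cursor = 0  # index of the first sample not yet kept or discarded
    for start_s, end_s in sorted(flagged_intervals):
        start = max(cursor, int(start_s * sample_rate))
        end = min(len(samples), int(end_s * sample_rate))
        kept.extend(samples[cursor:start])  # keep audio before the interval
        cursor = max(cursor, end)           # skip past the flagged audio
    kept.extend(samples[cursor:])           # keep everything after the last one
    return kept


# Ten samples at 2 samples per second; the reviewer flagged seconds 1.0 to 2.0.
recording = list(range(10))
cleaned = splice_out(recording, sample_rate=2, flagged_intervals=[(1.0, 2.0)])
print(cleaned)
```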

Within the AD Curation Studio, clinical sites will also contribute their pseudonymized clinical data. This data will be paired with the corresponding voice data and harmonized across sites.
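
Conceptually, pairing amounts to joining the two pseudonymized record sets on a shared participant identifier. The sketch below illustrates this under that assumption. The field names (`participant_id`, `quarter`, `recording`, `site`, `group`) and the sample records are invented for the example and do not reflect the actual SpeechDx schema or any AD Curation Studio interface.

```python
def pair_records(speech_records, clinical_records):
    """Attach each participant's clinical record to their speech recordings.

    Returns (paired, unmatched_ids): paired rows merge both sources, and
    unmatched_ids lists participants with speech data but no clinical record.
    """
    clinical_by_id = {c["participant_id"]: c for c in clinical_records}
    paired, unmatched_ids = [], []
    for speech in speech_records:
        clinical = clinical_by_id.get(speech["participant_id"])
        if clinical is None:
            if speech["participant_id"] not in unmatched_ids:
                unmatched_ids.append(speech["participant_id"])
            continue
        paired.append({**speech, **clinical})
    return paired, unmatched_ids


# Invented, pseudonymized example records.
speech = [
    {"participant_id": "P001", "quarter": 1, "recording": "p001_q1.wav"},
    {"participant_id": "P001", "quarter": 2, "recording": "p001_q2.wav"},
    {"participant_id": "P002", "quarter": 1, "recording": "p002_q1.wav"},
]
clinical = [
    {"participant_id": "P001", "site": "Site A", "group": "Mild Cognitive Impairment"},
]

paired, unmatched = pair_records(speech, clinical)
print(len(paired), unmatched)
```

Reporting unmatched identifiers rather than silently dropping them is a design choice in this sketch; it makes gaps between the two data sources visible before harmonization.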

A committee will review and approve any proposed use of the data to ensure protection of this data and compliance with ethical and legal requirements. Approved researchers will receive access to a protected, controlled environment within the AD Curation Studio to access and analyze the full, harmonized SpeechDx dataset. Data cannot be downloaded or otherwise exported from the AD Curation Studio, and no audio reproduction software may be installed in the workspace.

Join Us to Become a SpeechDx Partner

Be part of groundbreaking Alzheimer's research! SpeechDx will make data available to qualified researchers and institutions that apply to join the SpeechDx initiative and wish to develop speech-based biomarker tools for ADRD.

Contact us to join the partnership and gain access to SpeechDx data: speechdxteam@alzdiscovery.org

Our SpeechDx Partners

FAQ

SpeechDx is a partnership of researchers from academia and industry dedicated to advancing speech biomarkers for early Alzheimer's disease diagnosis.

The goal of SpeechDx is to facilitate the development of accurate and early speech-related diagnostic technology for Alzheimer's disease and related dementias (ADRD).

SpeechDx is for researchers, companies and other entities interested in contributing to Alzheimer's disease diagnosis and research through speech biomarker development.

To participate in SpeechDx, researchers and companies can apply to access the SpeechDx dataset for the purpose of speech biomarker development. SpeechDx is not currently enrolling additional participants wishing to contribute data.

Applications will be assessed for alignment with SpeechDx’s mission and to ensure that all ethical requirements are met. Given the large philanthropic investment made for this study, the Alzheimer's Drug Discovery Foundation (ADDF) will negotiate investment terms, in line with its venture philanthropy model, in exchange for non-exclusive access to this dataset.

SpeechDx is not currently recruiting new subjects or participating sites, but we welcome you to get in touch with us for more information at speechdx@alzdiscovery.org.

SpeechDx members will have access to high-quality, longitudinal paired speech and clinical data from approximately 2,600 individuals, which are harmonized across global sites.

Patient data will be safeguarded, deidentified, and handled in accordance with ethical and privacy standards to ensure confidentiality. Speech data will be encrypted but may contain identifiable voice data. To remove any personally identifiable information (PII), the encrypted voice data will be transferred to a secure backend where any PII will be manually spliced out of the recording, resulting in pseudonymized voice data. This voice data will then be securely transferred to the AD Curation Studio hosted by the Alzheimer’s Disease Data Initiative (ADDI). Within the AD Curation Studio, clinical sites will also contribute their pseudonymized clinical data.

SpeechDx data cannot be used for purposes other than speech-based analyses, the specific purpose for which it was collected. Other use cases may violate the ethical conditions for data collection and therefore will not be considered. Further details about use of this dataset can be obtained through SpeechDx’s guidelines and approvals.

The first set of data will become available to SpeechDx partners in Q1 2025. Data will then be released in batches quarterly as longitudinal data collection continues until the end of the study (anticipated in Q4 2027).
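
Taken at face value, this schedule implies twelve quarterly releases. The short sketch below enumerates them. It assumes one batch per calendar quarter from Q1 2025 through Q4 2027 inclusive, which the schedule suggests but does not state explicitly; the function name is hypothetical.

```python
def quarterly_releases(start_year, start_quarter, end_year, end_quarter):
    """List quarter labels from the start quarter to the end quarter, inclusive."""
    labels = []
    year, quarter = start_year, start_quarter
    while (year, quarter) <= (end_year, end_quarter):
        labels.append(f"Q{quarter} {year}")
        quarter += 1
        if quarter > 4:  # roll over into the next calendar year
            year, quarter = year + 1, 1
    return labels


releases = quarterly_releases(2025, 1, 2027, 4)
print(len(releases), releases[0], releases[-1])  # 12 Q1 2025 Q4 2027
```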