DIGITAL VOICE CAPTURE
MANUAL FOR IN-PERSON
COGNITIVE TESTING
This manual is published by the ADRC Clinical Task Force Cognitive Working Group
and represents a collaboration between the Framingham Heart Study Brain Aging
Program at Boston University and the Indiana Alzheimer’s Disease Research Center
at Indiana University.
Sherral Devine, PhD
Cody Karjadi, MS
Hannah Craft, MPH
Last Updated: 9/14/2023
Table of Contents
Introduction
About this Manual
Part 1: Basic Digital Voice Capture
Language for IRB and Informed Consent
Selecting Audio Recording Equipment
Audio Setup and Calibration
Recording
Saving Unedited Audio Files
Part 2: Generating Analysis-Ready Files
Personally Identifiable Information
Types of PII to Flag
Procedure for Testers to Flag PII
Processing Audio Recording
Labeling and Silencing PII in Audacity
Using Labels to Save Cognitive Tests as Individual Files
Saving Processed Audio Files
Quality Control
Manual QC for PII
QC for File Labels and Locations
Appendix A: Frequently Asked Questions (FAQs)
Appendix B: Recording with the ZOOM H4N DVR
Appendix C: Digital Voice Recorder Alternatives
References
Introduction
As research on Alzheimer’s disease and related dementias (ADRD) centers on detecting
symptoms earlier in the insidious onset process, there is an increasingly pressing need to
develop methods for detecting them, presumably before the impact of the underlying
pathological changes is irreversible. While there have been great advances in developing
imaging and blood-based biomarkers of AD at the clinically pre-symptomatic level, the
presumption of “pre-symptomatic” is currently being driven by the tools used to detect them.
Measuring longitudinal changes in cognitive function is one of the core clinical indicators of
pathological onset.
Surprisingly, the research efforts to build more sensitive tools of cognitive function have not kept
pace with those for PET/MRI imaging or fluid (blood/CSF) biomarkers. Yet cognitive function
remains the primary outcome against which all these biomarkers as well as clinical trial
treatment impact are measured. Thus, the AD research community is investing in methods for
detecting AD pathology emergence at its earliest point but has not made concomitant
investment in methods for detecting AD-related cognitive symptoms that might be emerging in
parallel.
Digitally recording participant responses to neuropsychological tests is the
easiest, most cost-effective way to detect early changes in cognition. Speaking is a cognitively
complex task and thus embedded in the spoken responses are acoustic and linguistic features
that likely map onto the multiple cognitive domains implicated by neuropathological changes. As
our cognitive capabilities shift, we express them through vocal responses in subtle ways, such
as switching up word choices or sentence structures because of word finding problems,
pausing, hesitating, and shifting as memory, attention, and executive functions are
compromised.
Currently, there are no gold standards in methods for analyzing voice recordings, but just as
with blood-based biomarkers, there is a growing, albeit still limited, literature suggesting that
analysis of digital voice recordings as a method for differentiating those with and without
cognitive impairment is promising. Thus, to facilitate the opportunities of using digital voice as a
novel method for assessing cognition, we provide a manual of operations that describes how to
collect digital voice recordings for research purposes.
Rhoda Au, PhD
Co-Principal Investigator; Framingham Heart Study Brain Aging Program
Boston University Professor of Anatomy & Neurobiology, Neurology, Medicine & Epidemiology
About this Manual
This document is intended to guide Alzheimer’s Disease Research Centers (ADRCs) and other
interested groups in audio recording the UDS4 cognitive exam for research analysis. At this
time, digital voice capture is encouraged but not required.
This manual is divided into two parts, which give ADRCs two options for implementation.
Part 1 contains best-practice recommendations for basic digital voice capture. This involves
recording and storing a digital audio file of the UDS4 cognitive exam with a prescribed file
naming convention and data log for later research analysis. Centers can choose to only
implement Part 1 at this time, with the option to pursue Part 2 when more resources are
available.
Part 2 guides centers through generating analysis-ready files that can be shared with outside
investigators. It involves audio editing for deidentification, dividing cognitive tests into individual
files, and quality control measures. Part 2 is resource intensive, and we are currently exploring
ways to financially support implementation. In addition, the National Alzheimer’s Coordinating
Center (NACC) is presently developing options for receipt and dissemination of deidentified
digital speech files.
Part 1: Basic Digital Voice Capture
Language for IRB and Informed Consent
The Framingham Heart Study Brain Aging Program (FHS-BAP) records the consent process in
addition to the battery; however, centers can choose to record only the testing battery. The
excerpts below can help centers develop their own language to include in IRB and consent forms.
FHS-BAP includes the following statement in its IRB application.
Prior to beginning the consent process, participants will be informed that the
examination will be audio recorded. The audio recording will also be stored for
future data analysis. If the participant refuses, audio recording will not be done;
otherwise, recording will begin and the consent process will commence. The
consent process is being audio recorded to facilitate quality control, to ensure
appropriate consenting. Participants have the right to refuse being recorded at any
time during the exam.
The FHS-BAP IRB application also contains the following language specifically regarding virtual
visits.
Use of this method of interview is expected to have minimal risks. The risk involves
potential for breach of confidentiality. The risk is minimized by using a secure web
platform that is used by many hospitals and clinics for doctors to communicate with
patients, hence we believe it will be secure for this research interview. The visits
will be digitally recorded (1x participant/month will be video recorded via Zoom
recording and each participant will be audio recorded) and kept on FHS servers
similar to our audio recordings of in person cognitive testing. The video recordings
will be retained for short term storage, long enough for QC purposes, and will then
be destroyed/removed. The FHS forms are filled and stored at the main FHS
facilities along with all other FHS records. Participants can always decline to
answer any question or decline to complete any test even if the participant
consents and completes to the rest of the questions or test within the examinations.
Specifically regarding use of smartphone-based cognitive tests: Use of the
smartphone-based cognitive applications is expected to have minimal risks. The
smartphone assessments are generally shorter in length (approximately 45
minutes in total) as compared to the standard NP in-exam (about 60 minutes in
total), so participant burden should be reduced. Breaks will be provided between
tests in order to reduce potential fatigue, which is likely to be transient and have
very little overall impact. The risk involves potential loss of confidentiality. While
the investigators are confident in the secure nature of data storage at FHS, such
as locked filing cabinets for storing paper files and use of password protection and
encryption for storing electronic files in computer systems, the information stored
on the application developers' servers is out of the study team's control. However,
Linus Health and other application developers will be blinded to any identifiable
information using system-generated study IDs. Participants can always decline to
answer any question or decline to complete any test even if the participant
consents and completes to the rest of the questions or test within the smartphone-
based cognitive assessment.
The FHS-BAP Informed Consent Form contains the following statements about audio
recordings.
Within the description of what will happen in the study:
“This session will be recorded using a digital audio recorder. Recordings will be
analyzed in conjunction with other study information. We will also use recordings
to make sure that your responses are accurately documented.”
For virtual visits:
During Your Virtual Visit: You will be asked similar questions and administered
the same tests that you would encounter during an in-person neuropsychological
exam. The tests given virtually will be as close as possible to the tests given in
person, modified only for use virtually. Later on, you may be offered a second in-
person evaluation at your convenience, in your home. The tests will be audio and
video recorded for data integrity purposes and further analysis.
Within the Confidentiality section:
We will store electronic files in computer systems with password protection and
encryption. Access to these records is limited to authorized FHS staff. However,
we cannot guarantee complete confidentiality. The files will be kept indefinitely,
and there are no plans to destroy any of the records. Coded data and digital
recordings from all tests will be stored in a repository and will be shared with
qualifying investigators.
“…Coded digital (video/audio) recording information will be analyzed by qualifying
collaborators inside and outside of BU Medical Campus/BMC. Your name and
other personally identifying information will not be shared with these entities.
Selecting Audio Recording Equipment
The audio equipment selected by a center will greatly affect the quality of recordings. In this
section we have outlined the important details to consider in this selection process.
FHS-BAP uses the Zoom H4N recorder, which meets the criteria below. If your center also
chooses this recorder, see Appendix B for detailed instructions on its use. Additional recording
device recommendations can be found in Appendix C.
Factors to consider when choosing a recorder:
1. Portability (if off-site testing is done)
2. Compatibility with lab’s computers (Mac/PC/Linux)
3. Both AC Adapter and Battery options for power (rechargeable battery preferred)
4. Microphone: We recommend choosing a recording device that can lie on the table in
front of the participant and that has two microphones (for better sound quality than a
single mic) as part of the recorder.
Lapel mics, which clip to the participant’s shirt, may pick up too much rustling noise if the
participant moves. Headsets might be too cumbersome or uncomfortable. Hanging
(ceiling mounted) mics are not portable. If using external mics, avoid using condenser
microphones (they pick up more room reverb); use cardioid mics instead.
5. Recording capacity: Depending on lab conditions, determine whether the recorder can
collect multiple sessions before downloading the files to a computer, or whether it needs
to be downloaded after each testing session.
6. Sampling rate: The sampling rate is one measure of audio quality, expressed in Hertz
(Hz). The lowest acceptable sampling rate is 16000 Hz and we recommend capturing
audio at a sampling rate of 44100 Hz.
7. Format/encoding: We highly recommend collecting voice data in the WAV format with
LINEAR16 PCM encoding with at least 16000 Hz sampling rate. (For more detailed
information on audio recording format and encoding, see references 2-3.)
8. Quantity: If multiple testing sessions are scheduled simultaneously, testers will need
more than one recording device. If testers go off-site for sessions (e.g., to the participant’s
residence), they will need to take the device with them. Centers should consider
acquiring an extra device in case one is broken or lost.
9. Additional equipment: If the chosen recorder requires an SD card, the center should
ensure that multiple SD cards are purchased in case some are lost.
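The sampling rate and format criteria above can be checked automatically once a recording has been copied to a computer. The following is a minimal sketch in Python using only the standard library; it is an illustration, not part of the FHS-BAP workflow. The function name check_wav and the 16-bit minimum bit depth (inferred from "LINEAR16") are assumptions made for this example.

```python
import wave

MIN_RATE_HZ = 16000          # lowest acceptable sampling rate per this manual
RECOMMENDED_RATE_HZ = 44100  # recommended sampling rate per this manual
MIN_BITS = 16                # assumption: "LINEAR16" implies at least 16-bit samples


def check_wav(path):
    """Return a list of problems found in a WAV file (an empty list means OK)."""
    try:
        with wave.open(path, "rb") as wav:
            rate = wav.getframerate()
            bits = wav.getsampwidth() * 8
    except (wave.Error, EOFError) as exc:
        # Python's wave module only reads uncompressed PCM WAV files.
        return [f"not a readable PCM WAV file: {exc}"]

    problems = []
    if rate < MIN_RATE_HZ:
        problems.append(f"sampling rate {rate} Hz is below {MIN_RATE_HZ} Hz")
    if bits < MIN_BITS:
        problems.append(f"bit depth {bits} is below {MIN_BITS} bits")
    return problems
```

A center could run a check like this on the test recording from each room before sending it for review. A file recorded at 16000 Hz passes this check even though 44100 Hz is the recommended rate.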
Audio Setup and Calibration
Recorder placement and the sound quality of the testing room will greatly impact the recording
quality. Sound reverberating off bare walls and floors, humming equipment, external noises (e.g.,
loud colleagues, birds chirping, etc.), and microphone direction are some of the factors that can
diminish sound quality of voice recordings.
Sometimes these features are not under the tester’s control, and we must work with what we
have. This is especially true when recording is done in the participant’s home. However, if you
have the option to make changes to the testing environment, consider the following:
Ideal Testing Room
1. Small to medium size
2. Multiple soft surfaces like carpet, couches, pillows, etc.
3. Avoid rooms that have a lot of hard surfaces that will make sound bounce around, such
as windows, bare walls, and hard floors
4. Minimal exposure to external sounds (e.g., street noise, a conference room, loud
colleagues, ringing phones, plumbing, weather)
5. Turn off noisy things in the room (e.g., fan, phone, air conditioner, computer in overdrive)
6. Lay a towel or piece of cloth under the recorder
Placement of recorder
1. Point the mic(s) of the recorder toward the participant (and away from tester)
2. Place it in a location where it will be out of the way of testing (once you start recording,
you don’t want to be moving the recorder around)
3. If possible, place the recorder on furniture that is NOT the desk/table you are working on
(because sounds such as pages turning, bangs on the table, etc. get picked up), but be
sure it is close to the participant
The recording quality can be improved by sound-treating the testing room(s):
1. Floors
a. Carpet/rug
2. Ceiling/walls
a. Bass Traps
b. Acoustic Panels
c. Alternatively, use packing blankets or mattress foam
d. Or, do it yourself (DIY)
i. How to build a sound absorbing panel in 5 easy steps
ii. How to build your own acoustic panels
iii. Budget Audio Treatment
iv. Cheap Sound Treatment Tests in a Commercial Office
v. How to install acoustic foam without damaging your walls
vi. Tips for DIY
The quality of audio can also be affected if the participant and/or examiner are wearing masks.
We ask centers to track whether masks were worn for each testing session and if the
participant, examiner, or both were wearing a mask.
A test recording from every room currently used for testing should be sent to
FHS-BAP at Boston University to confirm the quality of sound recording for data extraction.
Email adcvoice@bu.edu for instructions on uploading test audio files. Please ensure that test
recordings contain no PII. Make sure that every tester is trained in how to use the recording
equipment and what to do with the audio recording after the testing session.
Recording
We recommend that centers record the entire UDS4 cognitive battery in a single recording,
because it might be distracting to the participant if the tester is repeatedly starting and stopping
the recorder. However, centers have the flexibility to record individual tests if they prefer to do
so. In a later section, we provide instructions on how to split a single recording into multiple
audio files so that each test is in a separate audio file.
Some tests are virtually silent, such as Trails, but we encourage centers to continue recording
even during these silences. Audio recordings can be used to analyze many different angles of
testing, so it can be useful to record during quiet tests to catch if people speak or make noise
during the test (such as verbalizing during Trails).
Saving Unedited Audio Files & Data Log
After testing, the examiner will download the recording from the recorder to a computer and
save it. All files should be saved with the naming convention:
[ADRC ID]_[Accession#].wav
It is imperative that centers maintain a data log with the following data variables for each
recording:
ADRC ID
Accession #
Visit date
Visit number
PTID/NACC ID
Cognitive tests (NACC Code)
Interviewer initials
Whether the participant and/or interviewer were wearing masks
o 1=interviewer
o 2=participant
o 3=both interviewer and participant
o 4=no mask
Where the recording took place
o 1=clinic
o 2=home
o 3=nursing home or assisted living
o 4=other
Name of the recording device used
If your center is completing Part 2 below, these additional variables should be tracked in the
data log:
PHI removed (Y/N)
Date of processing
Initials of processor
Program used
Quality check
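The naming convention and the coded data log variables above lend themselves to a small validation step when a row is entered. The sketch below is one possible way to do this in Python; the column names, the function names audio_file_name and make_log_row, and the choice to reject unknown codes are all assumptions made for illustration, and the mask and location codes are copied from the lists above.

```python
# Codes copied from the data log variable lists in this section.
MASK_CODES = {1: "interviewer", 2: "participant", 3: "both", 4: "no mask"}
LOCATION_CODES = {1: "clinic", 2: "home",
                  3: "nursing home or assisted living", 4: "other"}

# Hypothetical column names for the Part 1 data log variables.
LOG_COLUMNS = ["adrc_id", "accession", "visit_date", "visit_number", "ptid",
               "cognitive_tests", "interviewer_initials", "mask", "location",
               "device"]


def audio_file_name(adrc_id, accession):
    """Build a file name following [ADRC ID]_[Accession#].wav."""
    return f"{adrc_id}_{accession}.wav"


def make_log_row(**fields):
    """Validate one data log entry and return it as a dict with the file name added."""
    missing = [col for col in LOG_COLUMNS if col not in fields]
    if missing:
        raise ValueError(f"missing data log variables: {missing}")
    if fields["mask"] not in MASK_CODES:
        raise ValueError(f"mask code must be one of {sorted(MASK_CODES)}")
    if fields["location"] not in LOCATION_CODES:
        raise ValueError(f"location code must be one of {sorted(LOCATION_CODES)}")
    row = {col: fields[col] for col in LOG_COLUMNS}
    row["file_name"] = audio_file_name(fields["adrc_id"], fields["accession"])
    return row
```

The returned dictionary can be written out with csv.DictWriter or entered into whatever system the center already uses for its data log.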
Part 2: Generating Analysis-Ready Files
Personally Identifiable Information
Personally Identifiable Information (PII) is information that can be used to identify, locate, or
contact a single individual. It is essential that all PII is removed from the recording prior to
sharing with outside investigators. Examiners should try to avoid using a participant’s name
during testing; however, it is not uncommon for a participant to say something in the middle of
testing that would be considered PII and therefore must be removed.
Types of PII to Flag
This is a comprehensive list of the types of PII. Some types may occur more frequently during
testing than others.
1. 18 HIPAA identifiers (see reference 1) of the individual or of relatives, employers, or
household members of the individual:
a. Name (including maiden name)
b. All geographic subdivisions smaller than a state, including street address, city,
county, precinct or neighborhood area, ZIP code, and their equivalent geocodes.
c. All elements of dates (except year) for dates directly related to an individual:
i. Birth date
ii. Admission date
iii. Discharge date
iv. Date of death
v. All ages over 89 (as well as the year of birth for this age group)
d. Telephone numbers
e. Fax numbers
f. Email addresses
g. Social Security numbers
h. Medical Record numbers
i. Health plan beneficiary numbers
j. Account numbers
k. Certificate/license numbers
l. Vehicle identifiers (e.g., serial numbers, license plate numbers)
m. Device identifiers and serial numbers
n. Web URLs
o. Internet protocol (IP) address
p. Biometric identifiers, including finger and voice prints
q. Full face photographic images and any comparable images
r. Any other unique identifying number, characteristic, or code
2. Research-related Identifiers
a. Start-of-Exam Recorded Identifiers: Participant ID, tester ID, and date
3. Regional Identifiers
a. Schools attended
b. Place of work
c. City of birth
Procedure for Testers to Flag PII
PII may not occur often during the recorded testing. Testers can limit it by collecting all
participant-related information before beginning the recording and by avoiding speaking PII
themselves, for example by not addressing the participant by name.
It is extremely important that the examiner pay attention to every time PII is spoken by either
themselves or the participant. Upon speaking/hearing any such information, the examiner
should mark the active battery page at the time of the PII. If the examiner is not using paper for
the test (for example, if the center uses a computer or tablet instead), the center should agree
on a way to mark when PII is spoken. All examiners in a center should use the same way of
marking PII, in case the examiner who conducted the tests is not the same person who removes
PII from the recording.
Processing Audio Recording
Centers will use audio editing software to remove PII from each recording. Some centers
might have testers process the audio that they recorded, or they might have someone else
process the recordings using the tester’s notes where they flagged PII.
Centers can choose their preferred software as long as it can silence PII in such a way that it
cannot be reversed and can save audio files in the WAV format. After silencing PII in the
recording, the edited audio file should be saved with the “[ADRC ID]_[Accession#].wav” file
naming convention and corresponding details should be entered in the data log for sharing with
external investigators. (Do not save over the unedited audio file; save the edited audio as a new file.)
We recommend using the software Audacity, which is a free audio recording and editing
software for Windows, Mac, and Linux. Centers can download Audacity here:
https://www.audacityteam.org/download/
Audacity can be used to record audio with a connected microphone, or centers can use a
separate device to record and then download the recording to a computer that has the Audacity
program. To open an audio file in Audacity, you can drag-and-drop the file or open Audacity, go
to File > Open, and navigate to the appropriate file.
There are many online tutorials and instructional videos for Audacity users, so looking up your
questions online will usually yield a solution. For an introduction to the program, watch the
first 4 minutes of this video: https://www.audacityteam.org/download/ (after 4 minutes, the
video describes editing and noise reduction that you will not do for this project). We recommend
that new users become familiar with the Audacity program and practice the following steps with
sample audio before working on research recordings.
Labeling and Silencing PII in Audacity
Audacity has a Silence tool that you will use to remove Personally Identifiable Information (PII)
from the recording. To learn how to use the tool, watch this video
(https://youtu.be/VgI6PUNv0fY) and then follow the steps below.
You will follow these steps in conjunction with the steps in the next section (“Using Labels to
Save Cognitive Tests as Individual Files”), so it is important to read and practice all the steps
before working on recordings.
After opening the appropriate audio recording in Audacity, go to the drop-down menu at
the top of the program and find Tracks. Select Add New > Label Track. You will use
the label track to mark when PII occurred in each recording.
Listen to the audio to find the first occurrence of PII in the recording. If you click and drag
on the audio track, you can highlight a portion of audio; if you press play, it will only play
the highlighted section of audio. (This also works if you highlight part of the label track.)
You can use the Zoom tools to zoom in on the audio:
Once you find PII, click and drag on the label track until you have highlighted the area
with PII (you can press play to check). You can click and drag the start and end points of
the highlighted section and keep replaying the segment until you have isolated the PII.
Tip: It might be hard to avoid including words on either side of the PII. For example, if
someone says "Yep, my brother's name is John Smith and oh, um...", it might be hard not to
also capture "is" and "and" on either side of "John Smith", depending on how
fast they speak. We want to limit non-PII speech in the segments, but don’t spend too much
time trying to avoid capturing a word or two on either side of the segment. Overall, make
your best effort to limit non-PII speech.
Press ctrl + B or cmd + B. This will create a label at the location you have highlighted.
Type “PII”.
When you click on the “PII” label you made, it will highlight the selected audio. Click on
the Silence icon or use ctrl+L to silence the selected audio:
Repeat these steps with each segment of PII until all are labeled and silenced.
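For centers that prefer a scripted alternative to the Audacity Silence tool, the same idea (overwrite the flagged time ranges with silence and save the result as a new file) can be sketched in Python with the standard library. This is an illustration under stated assumptions, not the procedure used by FHS-BAP: the function name silence_ranges is hypothetical, the input is assumed to be an uncompressed PCM WAV file, and the time ranges are assumed to come from the tester's PII notes.

```python
import math
import os
import wave


def silence_ranges(src_path, dst_path, ranges_sec):
    """Copy src_path to dst_path with each (start, end) range in seconds silenced."""
    if os.path.abspath(src_path) == os.path.abspath(dst_path):
        raise ValueError("refusing to overwrite the unedited recording")

    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        audio = bytearray(src.readframes(params.nframes))

    frame_size = params.sampwidth * params.nchannels  # bytes per frame
    n_frames = len(audio) // frame_size
    # 8-bit WAV samples are unsigned (silence is 0x80); wider PCM is signed (0x00).
    fill = b"\x80" if params.sampwidth == 1 else b"\x00"

    for start_sec, end_sec in ranges_sec:
        # Round outward so the silenced span never ends up shorter than requested.
        first = max(0, math.floor(start_sec * params.framerate))
        last = min(n_frames, math.ceil(end_sec * params.framerate))
        if last > first:
            span = (last - first) * frame_size
            audio[first * frame_size:last * frame_size] = fill * span

    with wave.open(dst_path, "wb") as dst:
        dst.setparams(params)
        dst.writeframes(bytes(audio))
```

Because the original samples are replaced rather than attenuated, the silenced speech cannot be recovered from the output file. The unedited source file is left untouched.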
Using Labels to Save Cognitive Tests as Individual Files
Centers will most likely record all cognitive tests in a session as a single audio file. This is
optimal because starting and stopping the recording for each test could be distracting to the
participant. You can use Audacity to easily split and save each test as an individual audio file.
Here is a video explaining this feature (https://youtu.be/72ewbraagj8).
At the top of the Audacity program, find the drop-down menu for Tracks. Select Add
New > Label Track. (Centers will also use a label track to save timestamps of PII. This
step will create a second label track. It is important to use separate label tracks for
the two tasks.)
Find the beginning of the first cognitive test. Click on the second label track so the
vertical line is positioned before the first test begins.
Press ctrl + B or cmd + B. This will create a label at the location you have selected. (If it
creates a label on the PII label track instead of the cognitive test label track, it’s because
you need to click on the second label track before pressing ctrl + B or cmd + B.)
Type in [ADRC ID]_[Accession#] (this will eventually become the audio file name). Each
test should have a different Accession # with the corresponding test noted in the data log
using the NACC code.
Find the space between the end of the first test and the beginning of the second test.
Click on the second label track at that location and use ctrl + B or cmd + B to create
another label. Name it appropriately. Do this at the beginning of every test.
Saving Processed Audio Files
After the PII and cognitive tests have been labeled and the PII is silenced, you need to save the
labels and audio files separately. To do so, follow these steps:
To save the timestamps, click on File > Export > Export labels
Save the timestamp file (.txt is the default file type) in your center’s designated file
location with the [ADRC ID]_[Accession#] naming convention corresponding with the
audio file. The text file will contain timestamps for both PII and the cognitive tests.
Now it is time to save the audio files. Go to the label track that has labels for the
cognitive tests. On the far right side of the track, click on the black triangle next to “Label
Track” and select Move Track to Top:
At the top of the program, click on File > Export > Export Multiple:
The following menu will appear. Make sure you have these options selected:
Click Export. Several windows will pop up; click OK in each of them. The program will
save the audio as individual files.
Finally, save the Audacity project file in the center’s designated file location. To do this,
click on File > Save Project > Save Project As… Your center will use the Audacity
project file to conduct QC and make any necessary changes to the labels or audio.
All audio files should be stored in a secure location and backed up regularly.
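The exported label file can also be read by a script, which is useful for checking the split or for reproducing it outside Audacity. The sketch below assumes the label file is Audacity's tab-separated text export (start time, end time, and label text on each line, with times in seconds), that PII labels are named exactly "PII", and that every other label marks the start of a cognitive test. The function names are hypothetical and the code is an illustration rather than a replacement for Export Multiple.

```python
import os
import wave


def parse_labels(text):
    """Parse label-file text into a list of (start_sec, end_sec, name) tuples."""
    labels = []
    for line in text.splitlines():
        # Skip blank lines and the backslash-prefixed frequency-range lines
        # that Audacity may write for spectral selections.
        if not line.strip() or line.startswith("\\"):
            continue
        parts = line.split("\t")
        if len(parts) < 2:
            continue
        name = parts[2].strip() if len(parts) > 2 else ""
        labels.append((float(parts[0]), float(parts[1]), name))
    return labels


def segments_from_labels(labels, total_sec, pii_label="PII"):
    """Turn test-start labels into (name, start_sec, stop_sec) segments."""
    starts = sorted((start, name) for start, _end, name in labels
                    if name != pii_label)
    segments = []
    for i, (start, name) in enumerate(starts):
        # Each test runs until the next test label, or to the end of the audio.
        stop = starts[i + 1][0] if i + 1 < len(starts) else total_sec
        segments.append((name, start, stop))
    return segments


def split_wav(src_path, out_dir, segments):
    """Write one WAV file per segment, named after the segment's label."""
    written = []
    with wave.open(src_path, "rb") as src:
        rate = src.getframerate()
        total = src.getnframes()
        for name, start, stop in segments:
            first = min(total, int(start * rate))
            last = min(total, int(stop * rate))
            src.setpos(first)
            frames = src.readframes(last - first)
            out_path = os.path.join(out_dir, name + ".wav")
            with wave.open(out_path, "wb") as dst:
                dst.setnchannels(src.getnchannels())
                dst.setsampwidth(src.getsampwidth())
                dst.setframerate(rate)
                dst.writeframes(frames)
            written.append(out_path)
    return written
```

The split should be run on the audio in which PII has already been silenced, so that the individual test files never contain PII.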
Quality Control (QC)
The exact process for ensuring data quality is subject to each center’s protocol. We recommend
integrating efforts with the center’s existing QC procedures and documentation. Below we offer
suggested best practices that centers may want to adopt if feasible.
Centers should develop a system for tracking QC activities. This can be done using the center’s
preferred program such as REDCap or Excel. It should, at minimum, track the following
variables:
1. Participant ID
2. Tester ID
3. Recording date
4. Name of audio file
5. ID of person conducting QC
6. Date of QC activity
7. What type of QC is being done (as outlined below: Supervisor, Peer, Intra, Data
Integrity, etc.)
8. Whether the QC passed or failed
9. Why the QC failed, if applicable
We encourage centers to create a feedback loop for the QC process. This means that the
people who conducted testing and processed the audio (this might be the same person or
different people) are sent the QC results so they can tell if they made any mistakes. The testers
and audio processors can sign off on the QC to confirm that they reviewed any errors.
Manual QC for PII
Given the importance of maintaining the confidentiality of research participants, it is essential
that QC measures are implemented to ensure consistent and accurate removal of PII from the
recordings. While we provide a framework for QC below, we encourage centers to adapt and/or
develop a process that works best with their existing infrastructure. Our recommended format
includes three levels of QC: supervisor, peer, and self (intra).
Supervisor QC: A supervisor should regularly choose recordings at random to undergo QC.
We recommend selecting at least one recording from each tester every month. The supervisor should
listen to the entire recording to check that all PII has been labeled and silenced. If they find PII,
they should flag it and ask a staff member proficient with Audacity to label and silence the PII,
then save the audio file and timestamp (.txt file) as updated versions.
Peer QC: A Peer Reviewer should review 5-10% of the completed exams done by each tester
on a regular basis. The recordings should be chosen at random. The Peer Reviewer does not have to be a
trained NP tester; it just needs to be someone trained to listen for PII. The Peer Reviewer
should listen to the recordings and make sure all PII has been labeled and silenced. If they find
PII, they should flag it and ask a staff member proficient with Audacity to label and silence the
PII, then save the audio file and timestamp (.txt file) as updated versions.
Intra QC: Each person processing the audio should review their own test recordings, chosen at
random, to ensure all PII was accurately identified (we recommend reviewing one recording a
quarter). If they find PII, they should label and silence the PII, then save the audio file and
timestamp (.txt file) as updated versions.
Each recording that undergoes QC should be logged in the center’s designated QC tracking
system. On a regular basis (we recommend quarterly), the tracking system entries should be
reviewed to ensure that there are not (1) common pitfalls or (2) problems with the accuracy of
any particular examiner or person processing the audio. Any common pitfalls or differences of
opinion will be discussed with the team and steps taken to resolve them. If the person
processing the audio is not consistently accurate, they should be given feedback and additional
recordings from that person should be reviewed. The number of additional recordings that will
be reviewed will be decided upon by a supervisor based on the given circumstances.
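As one possible aid for the periodic review of the tracking system, the sketch below summarizes how often QC found missed PII for each person processing audio. The log layout (the "processor", "recording", and "missed_pii" fields) and the max_rate threshold are assumptions made for illustration; the manual leaves the tracking system and the follow-up decision to each center and its supervisors.

```python
from collections import defaultdict


def miss_rates(qc_log):
    """Summarize QC results per processor as (reviewed, with_missed_pii, rate)."""
    reviewed = defaultdict(int)
    missed = defaultdict(int)
    for entry in qc_log:
        reviewed[entry["processor"]] += 1
        if entry["missed_pii"]:
            missed[entry["processor"]] += 1
    return {
        name: (count, missed[name], missed[name] / count)
        for name, count in reviewed.items()
    }


def needs_follow_up(qc_log, max_rate=0.0):
    """List processors whose share of recordings with missed PII exceeds max_rate."""
    return sorted(
        name for name, (_, _, rate) in miss_rates(qc_log).items() if rate > max_rate
    )


# Hypothetical QC log entries, for illustration only.
qc_log = [
    {"processor": "proc_1", "recording": "rec_01", "missed_pii": False},
    {"processor": "proc_1", "recording": "rec_02", "missed_pii": False},
    {"processor": "proc_2", "recording": "rec_03", "missed_pii": True},
    {"processor": "proc_2", "recording": "rec_04", "missed_pii": False},
]
print(needs_follow_up(qc_log))
```

A summary like this only points the supervisor to where additional recordings may need review; it does not replace listening to them.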
QC for File Labels and Locations
We strongly recommend that centers implement a QC process that will ensure data files are
labeled correctly and located in the appropriate folders. This is essential because the recording
should contain no PII, which means the file name will be the only way to identify the recording.
Audio files must be labeled according to the prescribed naming convention ([ADRC
ID]_[Accession#].wav) that corresponds with the correct row in the data log. Without a correct
file name and match in the data log, it will be extremely difficult to align the voice data with
participant phenotypic data. The file name and location should correspond with the data log with
no discrepancies.
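A file-label check of this kind could be partly automated. The sketch below is a minimal example that assumes both the ADRC ID and the Accession# are plain alphanumeric strings and that the data log can be reduced to a set of (ADRC ID, Accession#) pairs; the pattern, function name, and sample values are hypothetical and should be adapted to the center's actual ID formats. It checks file names only, not folder locations.

```python
import re

# Assumption: both parts are alphanumeric; adjust to the center's real ID formats.
NAME_PATTERN = re.compile(
    r"^(?P<adrc_id>[A-Za-z0-9]+)_(?P<accession>[A-Za-z0-9]+)\.wav$"
)


def check_file_labels(file_names, data_log):
    """Compare audio file names with (adrc_id, accession) rows in the data log."""
    report = {"bad_name": [], "not_in_log": [], "missing_audio": []}
    seen = set()
    for name in file_names:
        match = NAME_PATTERN.match(name)
        if match is None:
            report["bad_name"].append(name)
            continue
        key = (match["adrc_id"], match["accession"])
        seen.add(key)
        if key not in data_log:
            report["not_in_log"].append(name)
    # Rows in the data log that have no matching audio file.
    report["missing_audio"] = sorted(set(data_log) - seen)
    return report


# Hypothetical IDs, for illustration only.
data_log = {("1001", "A1"), ("1002", "A2")}
files = ["1001_A1.wav", "1002-A2.wav", "1003_A3.wav"]
report = check_file_labels(files, data_log)
print(report)
```

An empty report (all three lists empty) means every file name follows the convention and matches exactly one row in the data log.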
Appendix A: Frequently Asked Questions (FAQs)
[inserts FAQs and answers here]
Appendix B: Recording with the ZOOM H4N DVR
Initial Set-up
1. Turn the power on by moving the power switch on the left panel of the device to
“ON”.
2. Press the menu button on the right side of the recorder.
3. Scroll to “SYSTEM” and enter it.
4. Enter “DATE/TIME” and set the date and time. The recorder uses military time.
Press “OK”.
5. Return to the menu and enter “REC”.
6. Change “REC FORMAT” to “WAV48kHz/24bit”. Exit out of the menu completely by
repeatedly pressing the menu button.
7. Change the recording level to 50 using the “REC LEVEL” rocker on the right side of
the recorder.
8. Set microphones to 120º.
9. Initial set-up is complete. You may turn off the device.
Loading the SD Card
1. Be sure the power is OFF when inserting or removing the SD card to avoid
destroying data.
2. Insert the SD card into the slot on the left panel of the device.
a. If “Format Card” appears on the display screen after inserting the card, it
means that the SD card has not been formatted in the H4n Pro device. To format
it, use the dial to select “YES”.
3. To check the remaining capacity of the SD card, press “MENU” and select “SD
CARD”. Select “REMAIN” which will then display the remaining capacity meter,
remaining space, and remaining recording time using the current settings.
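As a rough cross-check of the remaining recording time shown by the recorder, the storage needed for uncompressed WAV audio can be estimated from the recording settings. The sketch below uses the 48 kHz/24-bit stereo settings from this appendix; the 32 GB card size is a hypothetical example, and the estimate ignores file headers and file-system overhead.

```python
def wav_bytes_per_second(sample_rate_hz, bit_depth, channels):
    """Data rate of uncompressed PCM audio, ignoring the small file header."""
    return sample_rate_hz * (bit_depth // 8) * channels


def recording_hours(card_bytes, sample_rate_hz=48_000, bit_depth=24, channels=2):
    """Approximate hours of audio that fit in card_bytes of free space."""
    rate = wav_bytes_per_second(sample_rate_hz, bit_depth, channels)
    return card_bytes / rate / 3600


rate = wav_bytes_per_second(48_000, 24, 2)   # 288,000 bytes per second
per_hour_gb = rate * 3600 / 1e9              # about 1.04 GB per hour
hours_32gb = recording_hours(32e9)           # hypothetical 32 GB card
print(f"{rate} B/s, {per_hour_gb:.2f} GB/hour, {hours_32gb:.1f} h on 32 GB")
```

At these settings each hour of recording takes roughly 1 GB, which is useful when planning how often files need to be transferred off the card.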
Recording Instructions
1. The recorder should always have functioning batteries installed, regardless of
whether an AC adapter is being used.
2. Plug recorder into the AC adapter in the testing room (or, if testing elsewhere, have
an AC adapter with you and try to arrange the testing location so you can plug in the
recorder).
3. Turn the power on by moving the power switch on the left panel of the device to
“ON”.
4. Be sure the “Stereo Mode” indicator is lit.
5. Put the recorder in “Recording Standby Mode” by pressing the “REC” button.
a. Recording standby means the mic is on but is not yet recording.
b. The red light on the recorder blinks when in standby mode.
6. Confirm all settings are correct (recording level = 50; recording format =
WAV48kHz/24bit; microphones are at 120º). See Initial Set-up section above for
instructions.
7. Make sure that the MIC button is pushed on the front of the recorder (NOT the “1” or
“2” buttons); see the ZOOM H4N DVR Image below.
8. Start recording by pressing the Play/Pause [►/||] button. The time counter on the
screen will advance, the recording symbol [●] will appear next to it, and the red light will
stop blinking and remain on.
9. Record the following information: Participant ID, Date of testing, and Examiner ID.
10. Press the Play/Pause [►/||] button again to pause recording until ready to start
recording the participant.
11. Lay the recorder down with the head of the recorder pointed directly at where the
participant will be seated. In testing rooms, place the recorder on the file cabinet next to
the testing table, as close to the participant as possible. It’s absolutely essential to place
your paper holder on the opposite side of the table relative to the recorder, because the
recorder is very sensitive and paper shuffling will muddle audio. If you are not in a
testing room, try to arrange to place the recorder on a different surface, but still close to
the participant, so it does not pick up all the paper shuffling, table jarring, etc.
12. After the participant has been consented and has signed the consent form, you may
begin recording the examination. For our NP studies at FHS, however, we have IRB
approval to audio record the consent process itself. In this case, first tell the participant,
“We will be audio recording this session for analysis and quality control purposes” (or
something along those lines), then begin recording. NOTE: If the participant reports that
they do not want to be audio recorded, turn off the recorder, remove it from the table,
and proceed with consenting/testing (unrecorded).
13. Since you are currently in “Standby” mode, press the Play [►/||] button. Again, the
time counter on the screen will advance, the recording symbol [●] will appear next to it,
and the red light will stop blinking and remain on. MAKE SURE THIS IS ALL
HAPPENING BEFORE PROCEEDING WITH TESTING.
14. Optionally, you may now slide the power switch toward “HOLD” on the left panel of
the device to disable button operation during recording (although preferably you will not
be touching the recorder at any time during testing, so this should not be necessary).
15. After all testing is complete, stop the recording by pressing the stop button [■].
16. YOU MUST TURN OFF THE RECORDER BEFORE UNPLUGGING THE AC
ADAPTER OR ELSE THE RECORDING MAY BE LOST. (This is only true if the
batteries in your recorder are dead, but you should always follow this procedure to
ensure data is not lost.)
Although you are unlikely to need to play the recording back on the DVR device itself,
because you will be using the ELAN software, this can be done by pressing the [►/||]
button to play and the [■] button to stop.
To play an older recording back, press “MENU” then select “FILE” using the dial. Select
the file to play and press. Select “SELECT” and press. Press the [►/||] button to start
playback.
Using USB to Transfer Files
1. Connect device to computer with USB cable.
2. Press the “MENU” button on the right panel of the device.
3. Select “USB” using the dial and press.
4. Select “STORAGE” and press.
5. The device is now connected to the computer and the files can be transferred.
6. Save the file in the appropriate file location with the file naming convention for
unedited recordings.
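After transfer, the recording format can be confirmed from the file itself rather than from the recorder menu. The sketch below uses Python's standard wave module to read the header and compare it with the 48 kHz/24-bit settings from the Initial Set-up section. It assumes the recorder writes uncompressed PCM WAV that the wave module can open; the function names are illustrative.

```python
import wave

# Settings from the Initial Set-up section (REC FORMAT WAV48kHz/24bit).
EXPECTED = {"sample_rate_hz": 48000, "bit_depth": 24}


def read_wav_format(path):
    """Read basic format information from a WAV file header."""
    with wave.open(str(path), "rb") as wav:
        frame_rate = wav.getframerate()
        return {
            "sample_rate_hz": frame_rate,
            "bit_depth": wav.getsampwidth() * 8,
            "channels": wav.getnchannels(),
            "duration_s": wav.getnframes() / frame_rate,
        }


def format_problems(path, expected=EXPECTED):
    """Return a list of mismatches between the file and the expected settings."""
    actual = read_wav_format(path)
    return [
        f"{key}: expected {value}, found {actual[key]}"
        for key, value in expected.items()
        if actual[key] != value
    ]
```

Calling format_problems on a transferred file returns an empty list when the sample rate and bit depth match; any mismatch is described in plain text so it can be noted in the data log.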
Dividing or Deleting a File
It is unlikely you will need to use these features; however, in the rare case that it may be
necessary (e.g., two participants were accidentally recorded in the same file), follow
these directions:
1. Press the “MENU” button on the right panel of the device.
2. Select “FOLDER” using the dial and press.
3. Select a folder using the dial and press.
a. To divide a file at a desired position, select “DIVIDE” and press. Press to
start the playback and press again at the division point. Select “YES” to confirm
the divide.
b. To delete a file, select “DELETE” using the dial and press. Select “YES” to
confirm deleting. **Never delete files from the recorders until you are 100%
certain they are correctly stored on the N drive**
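One way to be certain a file is correctly stored before deleting it from the recorder is to compare the copy on the network drive with the original byte for byte. The sketch below does this with SHA-256 checksums from Python's standard library; the function names are illustrative, and the paths passed in would be the file on the recorder (mounted over USB) and its stored copy.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def safe_to_delete(recorder_file, stored_file):
    """True only if the stored copy exists and matches the recorder file exactly."""
    recorder_file, stored_file = Path(recorder_file), Path(stored_file)
    if not stored_file.is_file():
        return False
    if recorder_file.stat().st_size != stored_file.stat().st_size:
        return False
    return sha256_of(recorder_file) == sha256_of(stored_file)
```

The size comparison is a cheap first check; the checksum comparison then catches copies that are the right length but were corrupted during transfer.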
Battery Type
1. To display the remaining battery life when using batteries, press “MENU”.
2. Select “SYSTEM” using the dial and press.
3. Select “BATTERY” using the dial and press.
4. Select the battery type: Alkaline or Ni-MH.
Software Update
1. To download the most recent system software, the device with an SD card must be
connected to a computer with access to the internet.
2. Open the ZOOM website (http://www.zoom.co.jp).
3. Connect the H4n Pro to the computer with the USB cable.
4. Copy the downloaded software to the root directory of the SD card.
5. Disconnect the H4n Pro.
6. Turn it on while holding down the [►/||] button. Select “OK” when prompted to
upgrade the version.
Appendix C: Digital Voice Recorder Alternatives
This list was compiled by FHS-BAP and was last updated in June 2020. It can serve as a
resource for centers that are “shopping around” to find a recording device that best suits them.
Please note that some details may become outdated over time as device specifications and
models change.
Sony ICD-PX370 Mono Digital Voice Recorder with Built-In USB Voice Recorder
Record in MP3 audio quickly and easily.
MP3 files are compressed but require less memory, making them better for recording
long lectures or meetings.
Record up to 57 hrs of audio (MP3 128 kbps) with exceptional battery life that makes it
possible to record for long periods of time.
Transferring files to or from your computer is fast and convenient. Just plug the
ICD-PX370 straight into a free USB port for an immediate connection; no USB cable
needed.
Turn on Auto Voice Recording and the ICD-PX370 will optimize audio capture settings
for vocal frequencies. The result is a purer recording with reduced background noise.
And when you listen back to the recording, Clear Voice technology cleans up the signal
even more for improved clarity.
The 4 GB memory stores up to 59hrs35m of recording (MP3 128kbps stereo).
Choose from four 'scene' presets (music, meeting, interview, dictation) to optimize the
audio settings.
User Manual [PDF]
32GB Digital Voice Recorder, Homder Voice Activated Recorder
Dynamic noise reduction chip & dual microphones to capture sound clearly, gives you a
really clear and natural audio.
All recordings are named with a timestamp, convenient to find the file you are looking
for.
Built-in 32 GB flash memory stores up to 2,000+ hours of recordings.
Utilizes DSP digital & AGC noise reduction technology to enhance human speech
recordings and filter out background noise, giving a fuller, clearer, and warmer vocal
recording.
A single full charge (about 4 hrs) supports 60+ hrs of continuous use.
Multiple high-fidelity speakers ensure a crispy & loud enough playback even without
headphones.
Password function protects your files from unauthorized access.
EVISTR 16GB Digital Voice Recorder Voice Activated Recorder with Playback
Voice Activated Record
Reduce blank and whispering snippet
Voice Recorder USB Rechargeable
File name with Year, Month, Day, Hour, Seconds
Dynamic noise cancellation microphone, capture 1536kbps crystal clear audio
Voice Recorder MAC Compatible (WIN Compatible)
Easy to figure out, press REC: starts to record; press STOP, save the recordings safely.
Small Voice Recorders with A-B repeat, fast forward, rewind function during playback, a
helpful recorder for lectures, meetings, interviews, speeches, class
Voice Activated Recorder: set the AVR voice activated function, record only when the
teacher is speaking, reduce blank and whispering snippets, save space and time.
Record your appointments, meetings, interviews, speeches, and lectures easily.
Easy File Management: recordings with time stamp, easy to find out when you recorded,
what it recorded.
#1 best-seller on Amazon
User Manual [PDF]
Does not have Autosave feature
Do not shut down the device until you press STOP to confirm the file saved
properly, it will show “Saved!”
Do not shut down the device, while you are formatting, wait until it shows “format
completed”
16GB Digital Voice Activated Recorder - aiworth 1160 Hours Sound Audio Recorder Dictaphone
E36 voice recorder equipped with dual sensitive microphones and a professional recording
IC; supports up to 1536 kbps PCM recording and provides a very clear recorded voice.
Built-in 800 mAh rechargeable battery supports up to 45 hours of continuous
recording. The 16 GB flash memory can store up to 1160 hours of recordings and
can be expanded with a TF card of up to 32 GB (purchased separately). Also supports
voice activated recording.
The most user friendly voice recorder designed by aiworth, all operation buttons on the
front side, operational logic like smart phone.
With graphic user guide
Lifetime software update
Power-on password protection: 3-digit password, 8000 combinations; without the
password, no one can turn on the device and overhear your recorded files. After three
failed attempts, the device turns off automatically.
16 levels to adjust the play speed: play faster to jump to the exact point you want to
play back; play slower to hear every single word clearly.
User Manual [PDF]
Aomago 8GB Audio Recorder Mini Portable Tape Dictaphone with Playback, USB, MP3
This recorder upgraded its higher sensitive microphones, meaning that you can enjoy
premium quality sound.
Simple three-click recording, saving and playing, make it super user friendly.
Set the recorder to voice activated recording, catch the speaking words only.
A-B REPEAT FUNCTION: This is a great feature to help you study language, review
lessons from selected starting point A to ending point B. You don’t have to go back or
forward to listen to the words any more.
Easy transfer files: Voice recorder mac compatible. It supports recording files in MP3 or
WAV format. You can transfer files easily by connecting to a computer via supplied
Micro USB cable.
8GB MEMORY CAPACITY
High Quality (128 kbps) : 7680 Mins
Short Play (64 kbps) : 16920 Mins
Long Play (32 kbps) : 33120 Mins
7 EQ modes
Different languages
USB connection, for uploads and downloads
Battery life expectancy: up to 12 hours continuous recording
Warning:
Do not use the right “POWER” button to totally shut down your voice recorder, or it will
reset your voice recorder system time to default.
We suggest you press the PLAY/PAUSE button for two seconds to power off your voice
recorder, and next time you just need to press “PLAY/PAUSE” again to wake up your
voice recorder.
When Battery is almost exhausted or too weak, functions may be limited, please
recharge!
Charge time between 3 to 4 hours, turn on the voice recorder before charging.
Press the REC button while recording to pause or resume recording. The LED will flash
when the recording is paused.
User Guide [PDF]
Wohlman. Digital Voice Recorder 16GB 1536kbps Touch Screen High Recording Quality Noise
Reduction Easy Operation Auto Activation MP3 Voice Recorder
Clear recording with a resolution of 1536 Kbps and microphones with dynamic noise
reduction, higher bit rate, higher recording quality, crystal-clear recordings and MP3
player.
The built-in 180mAh battery can record 12 hours continuously. With 16 GB of internal
storage, you can save up to 145 hours of recordings or 1500 songs.
Automatic recording is possible with the preset time. Simply record and save with the
"REC / Save" button. Simple operations with touch buttons.
With the automatic voice recognition function, the recorder automatically starts recording
when the sound is recognized. Without sound, it will be in standby mode to reduce
recording capacity and power consumption. The detecting distance can reach up to
10m.
With the A-B repeat play function, the recording can play back within a certain period of
time. You can also fast forward and rewind during playback, which is useful for reviewing
lessons, meeting records, songs, interviews, etc. We offer a one year guarantee.
With the USB cable, you can easily transfer the files to the computer as well as delete
the files directly. Compatible with Windows and IOS systems.
The password setting secures your recording data
Tschisen V93 is embedded with AGC noise reduction design and will give you high
quality recordings.
Speech recognition automatically picks up detected sounds and stops recording when it
is quiet.
Olympus Voice Recorder WS-853 with 8GB, Voice Balancer, True Stereo Mic
High quality MP3 recording
USB Direct connect with battery charge function
8 GB internal memory
Micro SD card slot
Playback speed control 0.5X to 2.0X
The True Stereo Mic with two directional microphones positioned at a 90 degree layout,
enables highest quality recording with an authentic stereo experience.
By differentiating the position and the distance of the speakers in meetings and
conferences the recording is highly precise, letting you feel as if you are actually in the
recording scene.
Auto Mode function makes it easier for users by automatically adjusting the microphone
sensitivity according to the volume of the speaker. To set this function, simply select
'Auto' for the recording level from the menu.
The Simple Mode supports beginners by having the recorder display only the necessary
information in large font. It also limits the functions in the menu to those which are
frequently used.
For advanced users, the Normal Mode with full functionality is recommended.
The WS-853 can connect directly to a computer via the built-in USB connector. This
makes it possible to easily save data anytime, anywhere without the need to bring along
a USB cable. Furthermore, WS-853 is equipped with a protective cover to keep dust out
of the connectors.
The built in stand placed on the back of the body is carefully designed to reduce the
noise from the surface when the recorder is placed on a table. It works much like a
kickstand and allows users to read the menu without having to look down at the
recorder.
When recordings contain multiple speakers, the Voice Balancer makes smaller voices
louder and ensures that louder voices stay below a given level, providing playback
where everyone can be heard clearly. This comes in handy when recording sound
sources from multiple positions, such as at a meeting. The prominent noise produced
when amplifying small sounds is reduced. By eliminating the lower and higher frequency,
the voice is even more enhanced.
The noise-cancellation function powerfully reduces unwanted ambient noise such as air-
conditioner noise or projector fan noise enabling clear playback quality. The function is
very effective when playing back meeting recordings.
NYTIMES #2 pick
User Guide [PDF]
Sony ICDUX560BLK Digital Voice Recorder 1" Black
NYTimes #1 pick for voice recorders
Built in stereo microphone and voice operated recording
Three recording options: wide/stereo, narrow/focus and normal
Quick charge; up to 1 hour recording time, with 3 minute charge
Easy to use user interface and recording level indicator
Micro SD memory card slot, headphone jack & mic input. LCD backlight
Record in MP3/LPCM with a high-sensitivity S-Microphone
Up to 4 GB of built-in storage, expandable via MicroSD (SDHC/SDXC) cards
Focus and wide microphone modes to suit lectures or meetings
Direct USB built-in for easy connection to PC
FM radio to listen to or record radio broadcasts
Normal, focus, and wide-stereo recording provide you with the opportunity to record the
audio that you need to capture in any environment, while the slim and lightweight build
make it easy to take with you wherever you go and the easy to use up makes file
searching simple.
UX560 received the highest overall ratings from our panel of test listeners (nytimes). It
produces clear, understandable audio in the classroom, quiet office, and noisy coffee
shop settings. It also offers a better collection of features than the other models we
tested, with an easy-to-navigate menu system, a bright backlit screen, 39 hours of
recording time (in MP3 format), 27-hour battery life, voice-activated recording to pause
and restart after silences, and a pop-out USB 3.0 connector that lets you recharge the
recorder and transfer files to a computer easily. Like many of the other recorders we
looked at, it comes with an adequate amount of onboard storage (4 GB) but accepts
microSD cards, so you can record and store hundreds of hours of recorded audio should
you need it. The UX560 is also the slimmest recorder we tested; at 0.43 inch thick it can
easily fit in a shirt or pants pocket.
User Manual [PDF]
SONY PCM-D10
Reliable hi-res recordings of up to 192kHz/24-bit
3-way adjustable high-resolution 40K frequency response microphones
2 XLR-TRS combo jacks with 48V phantom power
Digital dual-path limiter function prevents distortion
Bluetooth capability for both remote control and playback via Sony's free REC Remote
app
More expensive than most ($500)
Commonly used by podcasters, radio, amateur film makers
Capture flawless Hi-Res sound anytime, anywhere with the pcm-d10 portable recorder.
Record professional sound with Hi-Res Audio at up to 192kHz/24-bit. Whether it's your
live music set, new podcast episode or breaking news report, the pcm-d10 unlocks a
new level of detail and texture. The three-way adjustable microphones adapt to your
situation, while the twin XLR/TRS combo jack lets you plug in your choice of input. High-
quality dual ADCs maximize S/N and independent analog volume dials give you precise
control of your inputs.
Tascam DR-05 recorders
The dual internal condenser microphones can handle anything from subtle to loud, with
sensitivity to capture every detail
A revamped layout means operations like recording, adjusting levels, deleting bad takes
and adding Markers are quick and easy
Uses only two AA batteries, but can record for an outstanding 17.5 hours; it can also be
powered by a USB mobile battery
Connect to a PC using USB Audio Interface Mode for voiceover work, live streaming,
podcasting and songwriting with studio-quality audio
Used in a study investigating pitch modulation in human mate choice
In this study the researchers used a sampling rate of 96 kHz and 24-bit amplitude
quantization. Recordings were stored onto microSDHC media cards as
uncompressed WAV files and later transferred to a laptop computer for editing
and analysis. This method allowed us to obtain high-quality, directional voice
recordings that would otherwise be difficult to obtain in a noisy environment using
a stationary microphone.
Acoustic editing and analysis were performed in Praat v. 6.0.21 [32]. Fragments
of silence, acute noise, non-verbal vocalizations (e.g. laughter) and multi-voicing
(e.g. the voice of the dating partner) were first manually removed from audio files.
Recordings were then segmented into multiple parts each corresponding to a
given participant and a single speed date. We further split each sound file into
three equal time segments (beginning, middle and end of the date; mean
segment duration 50.6 ± 23 s), resulting in a total of 726 voice clips for acoustic
analysis.
References
1. Guidance Regarding Methods for De-identification of Protected Health Information in
Accordance with the Health Insurance Portability and Accountability Act (HIPAA) Privacy
Rule (Office for Civil Rights) 7-8 (2012).
2. Best Practices. Google Cloud. Updated 2/10/2022.
https://cloud.google.com/speech-to-text/docs/best-practices
3. Introduction to audio encoding. Google Cloud. Updated 2/10/2022.
https://cloud.google.com/speech-to-text/docs/encoding