solution is to prepare a customized Android version with modified
audio drivers. A recent study [54] pointed out the security
risks of Android device driver customizations. Android inherits
the driver management model of Linux, in which devices are
exposed as files under /dev (or /sys). Zhou et al. found that the
permission settings of certain important device files become
unprotected during customization, so an unauthorized app can
access sensitive devices and, through them, user data. By the same
reasoning, a similar vulnerability could occur in audio drivers
(/dev/snd), but we have not found any case of unprotected write
privileges on our test devices.
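The check described above, looking for device files whose
permission bits leave them open to any app, can be sketched as
follows. This is a minimal illustration and not the tooling used
in [54] or in our tests: the function name world_accessible_nodes
is hypothetical, and it assumes a Python interpreter with read
access to the directory tree being inspected (for example a
mounted system image).

```python
import os
import stat


def world_accessible_nodes(root, mask=stat.S_IWOTH):
    """Return (path, mode string) pairs for entries under root whose
    permission bits intersect mask (default: world-writable).

    Hypothetical helper for illustration only.
    """
    findings = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # do not follow symbolic links
            except OSError:
                continue  # entry vanished or is unreadable
            if stat.S_ISLNK(st.st_mode):
                continue  # link permission bits are not meaningful
            if st.st_mode & mask:
                findings.append((path, stat.filemode(st.st_mode)))
    return sorted(findings)
```

Calling world_accessible_nodes("/dev/snd") would list audio device
nodes that any user can write to; an empty list corresponds to the
outcome reported above for our test devices.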
Another approach to soundless attacks is high-frequency sound
(e.g., above 20 kHz), which the phone can play but humans can
hardly hear [42]. Unfortunately (fortunately for security), Google
Voice Search only accepts frequencies in the normal range of human
speech and filters out other ranges, so the idea of
high-frequency sound is impracticable.
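To make the high-frequency idea concrete, the sketch below
synthesizes a 21 kHz tone and measures its energy with the
Goertzel algorithm. It is only an illustration under stated
assumptions: the helper names are hypothetical, and the speech
band limits (300 to 3400 Hz, a telephony-style range) are
placeholders, not the range that Google Voice Search actually
accepts, which is not specified here.

```python
import math


def sine_tone(freq_hz, sample_rate_hz, n_samples, amplitude=1.0):
    """Return n_samples of a sine wave at freq_hz."""
    if freq_hz >= sample_rate_hz / 2:
        raise ValueError("frequency must be below the Nyquist limit")
    step = 2.0 * math.pi * freq_hz / sample_rate_hz
    return [amplitude * math.sin(step * n) for n in range(n_samples)]


def goertzel_power(samples, freq_hz, sample_rate_hz):
    """Estimate the squared DFT magnitude near freq_hz (Goertzel)."""
    n = len(samples)
    k = round(n * freq_hz / sample_rate_hz)  # nearest DFT bin
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2


def in_assumed_speech_band(freq_hz, low_hz=300.0, high_hz=3400.0):
    """Placeholder band check; the limits are assumptions."""
    return low_hz <= freq_hz <= high_hz
```

For a tone produced by sine_tone(21000, 48000, 4800), the energy
is concentrated at 21 kHz, and in_assumed_speech_band(21000)
returns False, so a recognizer that keeps only such a band would
discard the command.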
Since Google Voice Search is an Internet-based service, we
also tried to analyze the feasibility of connection hijacking and
data packet tampering. Our tests showed two difficulties: the
connection is protected by TLS, and the transmitted voice data are
compressed with an unidentified encoding algorithm.
Quiet vs. Noisy. GVS-Attack is designed to be launched in a quiet
environment, where the volume of voice commands can be very low
and still be recognized by Google Voice Search. We also tested
the attack in noisy environments, such as on the subway and in a
canteen. There the expectation is reversed: the volume of voice
commands could be very loud and still be hidden in the background
noise. However, test results showed that background noise
(especially human voices) reduced the accuracy of speech
recognition to some degree. In addition, context-aware analysis
becomes more complex and may need the RECORD_AUDIO permission.
Scope of Attack. Since Google Services Framework is pre-installed
on nearly all brands of Android devices, most Android devices can
be affected by GVS-Attack, especially those running Android 4.1
or higher. Similar attacks could be launched against voice
assistant apps that use the Google Speech-to-Text (STT) service.
Even apps using independent speech recognition engines, such as
Samsung S Voice, should be reviewed carefully. Furthermore,
similar attacks may occur on iOS, Windows Phone, and other smart
devices supporting voice control. However, lacking experimental
devices, we have not yet tested them and leave them for future
research.
Limitation. We did not carry out a user study on the minimum
usable sound volume (Section 4.3), because noise-induced
nocturnal awakening is affected by many other factors, such as
sleep stage, age, gender, smoking, etc. [31]. Context-aware
information analysis (Section 4.2) may be affected by hardware
configurations. For example, high memory usage may simply result
from a small RAM capacity, and light sensors and accelerometers
from different vendors may have different precision. It is
therefore difficult to create a general detector that handles all
cases.
7. RELATED WORK
Sensor-based Attacks. On mobile platforms, sensor-based attacks
have been designed and analyzed in several previous papers
[46, 40, 13, 10, 47, 49, 30]. One typical example is
Soundcomber [46]. Schlegel et al. designed a Trojan with
few and innocuous permissions that can extract targeted private
information (such as credit card numbers and PINs) from the audio
sensor of the phone. They also showed that smartphone-based
malware can easily achieve targeted, context-aware information
discovery from sound recordings. In another research project,
through completely opportunistic use of the phone’s camera and
other sensors, PlaceRaider [49] can construct three-dimensional
models of indoor environments and steal virtual objects.
The work of [40] showed that accelerometer readings are a
powerful side channel that can be used to extract entire sequences
of text entered on a smartphone touchscreen keyboard. The
side-channel attack in [47] uses the video camera and microphone
to infer PINs entered on a number-only soft keyboard on a
smartphone. The microphone is used to detect touch events, while
the camera is used to estimate the smartphone’s orientation and
correlate it with the position of the digit tapped by the user. The
work of [30] studied environmental sensor-based covert channels
in mobile malware, where out-of-band command and control channels
can be based on acoustic, light, magnetic, and vibrational
signaling. That research resembles our GVS-Attack with respect to
voice command transmission. The differences are that, in
GVS-Attack, the command receiver (Google Voice Search) is not part
of the malware (VoicEmployer) and the RECORD_AUDIO permission
is not needed, which makes our attack scheme more insidious.
Beyond that, sensors can be used for fingerprinting devices.
The work of [20] showed that smartphone accelerometers possess
unique fingerprints, which can be used to track users. Similar
fingerprinting methods based on the microphone and speaker were
designed in [18, 57]. By playing back and recording audio
samples, they could uniquely identify an individual device through
sound feature analysis.
Inter-Application Communication. In [15], Chin et al. focused
on Intent-based attack surfaces. They showed that unauthorized
Intent receipt can leak user information: data can be stolen by
eavesdroppers and permissions can be accidentally transferred
between apps. Another attack type is Intent spoofing, in which a
malicious app sends an Intent to an exported component. If the
victim app takes some action upon receipt of such an Intent, the
attack can trigger that action. In their follow-up work [33],
Kantola et al. proposed modifications to the Android platform to
detect and protect inter-application messages that should have
been intra-application messages. The goal is to automatically
reduce attack surfaces in legacy apps.
Application Analysis. Android permission-based security has
been analyzed from many aspects. For example, permission
specification and least-privilege security problems were studied
in [22] and [9], both of which designed static analysis tools to
detect over-privilege problems. Permission escalation and leakage
problems are also active research topics; related research
includes [24, 19, 12, 14, 28]. Permission-based behavioral
footprinting was used to detect known malware in Android markets
at large scale in [56]. The work of [52] noticed the problem of
pre-installed apps: Wu et al. found that many of those apps were
overly privileged because of vendor customizations.
For system-code-level analysis, dynamic analysis is an
efficient solution. TaintDroid [21] is a system-wide dynamic taint
tracking and analysis system capable of simultaneously tracking
multiple sources of sensitive data. It automatically labels (taints)
data from privacy-sensitive sources and transitively applies labels
as sensitive data propagates through program variables, files, and
interprocess messages. DroidScope [53] is an emulation-based
Android malware analysis engine that can be used to analyze the Java
and native components of Android apps. Unlike current desktop