Siri’s Voice Could Carry Sensitive Info to An Attacker

Researchers draw attention to steganography-based attacks
 
Researchers from Italy and Poland managed to devise an attack that uses Siri's audio stream to Apple’s server to deliver to an attacker private info stolen from the victim’s iPhone.
Exfiltrating the information is done by embedding it in the voice traffic, which is then intercepted by the threat actor through a man-in-the-middle technique. Basically, it is steganography, the method for concealing data in a container that preserves its properties intact.

Secret data hidden in the voice stream

Luca Caviglione (National Research Council of Italy) and Wojciech Mazurczyk (Warsaw University of Technology) called the attack iStegSiri and described how it would function in a paper at Arxiv.

The two researchers observed that when Siri receives a voice command from its owner it sends it to Apple cloud, which delivers the answer as a textual representation.

If secret bits with sensitive info are integrated in the audio sent to the server and then the communication is intercepted, an attacker could extract the extra info available in the stream and decode it.

“Specifically, iStegSiri is based on three steps. In step 1, the secret message is converted to an audio sequence based on the proper alternation of voice and silence. In step 2, the sound pattern is provided to Siri as the input via the internal microphone. Consequently, the device will produce traffic toward the remote server that requires audio-to-text conversion," the duo wrote in the paper.

"In step 3, the recipient of the secret communication passively inspects the conversation and, by observing a specific set of features, applies a decoding scheme to extract the secret information,” Caviglione and Mazurczyk added.

Steganography-based attacks are not unlikely to happen

The researchers said that algorithms used for synchronization, latency reduction and packetization delay did not allow manipulating the entire traffic so that alteration could pass unnoticed; in order to achieve their goal they split it into several components that represented pause and talk periods.

By doing so, they were able to encode 1 and 0 in the data flow, and thus transmit pieces of information covertly. They said that the iStegSiri attack could deliver secret data at a rate of about 0.5 bytes per second. Applied to a card number, about two minutes would be required.

This type of attack is only a demonstration intended for the security industry to take the precautions for preventing a compromise that could be perpetrated in this manner.

Caviglione and Mazurczyk draw attention to the fact that for iStegSiri to be successful the iOS device needs to be jailbroken, for access to Siri’s internal functions, and the possibility to intercept the traffic sent by the personal assistant to the server.

Siri is not the only piece of software where steganography could be applied to steal information. It could be replaced by any products functioning the a similar way, such as Shazam and Google Voice.

Comments