# Audio API for Soundscape
This document describes the API for a library that can be exposed to the main Kotlin code in the Soundscape app. On Android this library will be written in C++ so that it can directly call the Oboe and Steam Audio libraries, which allow audio to be positioned in 3D space with HRTF.
## iOS Soundscape audio classes
The iOS app has a host of audio classes to support playback of beacons, text to speech and other sounds.
```mermaid
classDiagram
    class SoundBase{
        +SoundType type
        +Int layerCount
    }
    class Sound{
        nextBuffer(Int for_layer)
    }
    class DynamicSound{
        +AVAudioFormat commonFormat
        +BeaconAccents introAsset
        +BeaconAccents outroAsset
        asset(CLLocationDirection userHeading, CLLocation userLocation)
        buffer(AssetType asset)
    }
    class TTSSound{
        TTSSound()
    }
    class LayeredSound{
        LayeredSound()
    }
    class SynchronouslyGeneratedSound{
        SynchronouslyGeneratedSound()
    }
    class GenericSound{
        GenericSound()
    }
    class GlyphSound{
        GlyphSound()
    }
    SoundBase <|-- Sound
    SoundBase <|-- DynamicSound
    Sound <|-- TTSSound
    Sound <|-- LayeredSound
    Sound <|-- SynchronouslyGeneratedSound
    Sound <|-- ConcatenatedSound
    DynamicSound <|-- BeaconSound
    SynchronouslyGeneratedSound <|-- GenericSound
    GenericSound <|-- GlyphSound
    class AudioPlayer{
        AudioPlayerIdentifier id
        SoundBase sound
        AudioPlayerState state
        Bool isPlaying
        Bool is3D
        Float volume
        prepare(AVAudioEngine engine)
        updateConnectionState(AudioPlayerConnectionState state)
        play(Heading userHeading, CLLocation userLocation)
        resumeIfNecessary()
        stop()
    }
    AudioPlayer <|-- BaseAudioPlayer
    BaseAudioPlayer <|-- ContinuousAudioPlayer
    BaseAudioPlayer <|-- DiscreteAudioPlayerDelegate
    AudioPlayer <|-- DynamicAudioPlayer
    AVAudioPlayer <|-- FadeableAudioPlayer
```
- GlyphSounds are short audio files with specific meanings for the user, e.g. the app going offline/online, or entering/leaving a waypoint. I find them fairly hard to distinguish, but perhaps with use and practice their meanings become obvious.
- LayeredSound seems to be for mixing sounds together; each layer can have its own EQ parameters.
- ConcatenatedSound simply joins sounds together so that they play one after the other.
- BeaconSounds vary with the user's heading and location.
These classes can be used together to build complex sounds. An example playback from the Swift code:
```swift
// Create a GlyphSound of the type poiSense
let earcon = GlyphSound(.poiSense, compass: direction)
// Create a text-to-speech sound
let tts = TTSSound(approach, compass: direction)
// Concatenate them
let ttsSound = ConcatenatedSound(earcon, tts)
// And layer with a travelEnd sound
let layered = LayeredSound(ttsSound, GlyphSound(.travelEnd, compass: direction))
```
## Audio Engine Kotlin classes
There’s a great video showing how to map Kotlin classes onto C++. The Kotlin class simply holds a Long which is the pointer to the C++ object; this is passed into C wrapper functions which forward the calls on to the C++ object. The other useful parts are the use of the synchronized keyword and the way that the Kotlin member functions are defined. With this knowledge we can design Kotlin classes for the audio engine. The current classes we have in C++ from the proof of concept are:
The only thing that Kotlin needs to be able to do is create and destroy Beacons. An AudioEngine class to wrap this behaviour up with audio initialization and destruction makes sense.
```kotlin
interface AudioEngine {
    fun createBeacon(latitude: Double, longitude: Double): Long
    fun createTextToSpeech(latitude: Double, longitude: Double, text: String): Long
    fun updateGeometry(listenerLatitude: Double, listenerLongitude: Double, listenerHeading: Double)
    fun setBeaconType(beaconType: Int)
}
```
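The pointer-as-Long pattern described above can be sketched in Kotlin as follows. This is only an illustration: the names (`Beacon`, `FakeNativeLib`, `createBeacon`, `destroyBeacon`) are hypothetical, and the native side is simulated in pure Kotlin so the example is self-contained. In the real app these calls would be `external` JNI functions implemented by the C++ wrapper, loaded via `System.loadLibrary`.

```kotlin
// Simulated native layer standing in for the C wrapper functions that would
// forward calls to the C++ objects. In the real library these would be
// `external fun` declarations backed by JNI.
private object FakeNativeLib {
    private var nextHandle = 1L
    private val beacons = mutableMapOf<Long, Pair<Double, Double>>()

    fun createBeacon(latitude: Double, longitude: Double): Long {
        val handle = nextHandle++
        beacons[handle] = latitude to longitude
        return handle
    }

    fun destroyBeacon(handle: Long) {
        beacons.remove(handle)
    }
}

// The Kotlin class holds a Long which stands in for the pointer to the
// C++ object, exactly as in the pattern described above.
class Beacon(latitude: Double, longitude: Double) {
    private var nativeHandle: Long = FakeNativeLib.createBeacon(latitude, longitude)

    val isDestroyed: Boolean
        get() = synchronized(this) { nativeHandle == 0L }

    // synchronized guards against the handle being freed twice from
    // different threads; after destruction the handle is zeroed.
    fun destroy() = synchronized(this) {
        if (nativeHandle != 0L) {
            FakeNativeLib.destroyBeacon(nativeHandle)
            nativeHandle = 0L
        }
    }
}

fun main() {
    val beacon = Beacon(55.9533, -3.1883)
    println(beacon.isDestroyed) // false
    beacon.destroy()
    println(beacon.isDestroyed) // true
    beacon.destroy()            // safe no-op on an already-destroyed beacon
}
```

The same handle-holding shape would apply to the text-to-speech objects returned by `createTextToSpeech`, with `AudioEngine` owning the lifetime of the native engine itself.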