3. Implementation

The most important new features that have been implemented in the recent version of Sweep are an advanced form of scrubbing which models the physics of a turntable for playing vinyl records, and improvements in the visual synchronisation to depict application latency. This section introduces the implementation of Sweep's waveform visualisation and the recent improvements.

3.1. Visualisation

Sweep 0.1 improved on the visual representation of audio data by combining a display of the waveform peak with an overlay of the average value. Together these give the user a notion of both the overall loudness and the dynamic shape of the sound. Additionally, a 3D bevel effect was applied to the waveform rendering; by emphasising the differences in peak values, it accentuates pitch differences at various zoom levels and gives a rough indication of sonic texture. Although it performs no complex analysis, this often provides just enough extra visual texture to distinguish between simple instrumental and vocal portions of a recording. An example of Sweep's waveform rendering is shown in Figure 1.

Figure 1. Screenshot of waveform view in Sweep
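
The combined peak and average display can be illustrated with a minimal sketch. It is not Sweep's rendering code: the function name summarise_columns is hypothetical, and taking the "average" to be the mean absolute sample value is an assumption made for illustration.

```python
def summarise_columns(samples, columns):
    """Reduce a list of samples to one (peak, average) pair per display column.

    Assumption: "average" is the mean absolute value of the samples that
    fall into the column; the peak is the largest absolute value.
    """
    if columns <= 0:
        raise ValueError("columns must be positive")
    n = len(samples)
    summary = []
    for col in range(columns):
        # Integer arithmetic splits the samples into near-equal blocks.
        start = col * n // columns
        end = (col + 1) * n // columns
        block = samples[start:end]
        if not block:
            # More columns than samples: draw nothing in this column.
            summary.append((0.0, 0.0))
            continue
        magnitudes = [abs(s) for s in block]
        peak = max(magnitudes)
        average = sum(magnitudes) / len(magnitudes)
        summary.append((peak, average))
    return summary
```

A renderer could draw the peak as the outer extent of each column and the average as the overlay inside it; zooming then only changes the number of samples that fall into each column.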

3.2. Scrubbing

The major addition to Sweep's usability was the implementation of interactive scrubbing. Scrubbing in a digital media editor allows the user to locate specific items of interest or jump directly to specific points in time by interacting directly with a timeline.

Sweep features a number of innovative, complementary scrubbing methods.

Sweep's scrubbing was modelled on the quality of interaction available when working with tape reels and vinyl records. Vinyl records are such a directly responsive format that a skilled user, such as a professional disc jockey, is able to use them to quickly cue and mix together songs. In some musical genres such as hip-hop, the skilled practitioner incorporates the audible scanning of the record under finger-tip control into the music, in an artform known as turntablism. This advanced level of interactivity was used as a benchmark: if a digital audio editor could be created with such direct responsiveness that it could be used artistically, it would surely provide a much-needed usability boost to the more mundane task of editing. In turn, this introduces the possibility of easily editing the sounds that are used in performance, which is of course impractical with vinyl.

The audible characteristics of vinyl, especially when played on the turntable of a professional disc jockey, are subtly different and inherently more pleasing than the simple fast playback of a tape reel. Three contributing factors are wear on the record groove, non-linear filtering introduced by forced motion of the stylus, and controlled momentum of the turntable under the action of a slipmat.

Firstly, the "smoother" sound of vinyl is partly due to physical wear introduced by contact of the stylus each time a record is played, such that over time the groove is widened and high-frequency details are smoothed over. This is a general trait of vinyl records and introduces a constant distortion of the sound, so it is not desirable to model it explicitly in a digital audio editor, as this would misrepresent the audio data during editing.

Secondly, the physics of moving a stylus quickly through the groove of a vinyl record introduces a complex filtering. The microscopic shape of a record groove is depicted in Figure 2, with stereo channels encoded as horizontal and vertical variations. Under forced motion the stylus's increased momentum causes it to skip over the high-frequency details encoded in the groove. This filtering removes much of the annoying high-frequency content that is introduced by the increase in playback speed. Although the actual filtering introduced by a stylus on vinyl is non-linear and would be costly to implement in software, it is usefully approximated by the application of a simple lowpass filter.

Figure 2. Cutaway diagram of vinyl groove.
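
The lowpass approximation can be sketched as follows. The text does not specify the filter used in Sweep, so this example assumes a one-pole lowpass whose cutoff falls in inverse proportion to the scrub speed; the function names and the default values of sample_rate and base_cutoff_hz are hypothetical.

```python
import math


def one_pole_coefficient(cutoff_hz, sample_rate):
    """Smoothing coefficient of a one-pole lowpass for the given cutoff."""
    return 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)


def scrub_lowpass(samples, speed, sample_rate=44100.0, base_cutoff_hz=16000.0):
    """Filter a block of already speed-shifted output samples.

    Assumption: the cutoff is divided by the playback speed once the
    speed exceeds normal (1.0), so faster scrubbing is filtered harder.
    """
    cutoff = base_cutoff_hz / max(1.0, abs(speed))
    alpha = one_pole_coefficient(cutoff, sample_rate)
    out = []
    state = 0.0
    for x in samples:
        # Move the filter state a fraction alpha of the way to the input.
        state += alpha * (x - state)
        out.append(state)
    return out
```

At normal speed the filter is nearly transparent, while at eight times normal speed the cutoff in this sketch drops to 2 kHz, removing most of the content that the speed-up pushes into the upper range.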

Lastly, the weight of a turntable provides a fair amount of momentum, such that when a record is sped up by the disc jockey's finger, it takes some time to slow down to the drive speed of the turntable. This momentum also provides a more subtle smoothing of the record's motion, such that any sudden changes invoked by the disc jockey produce a somewhat less marked change in the record's playback. A similar amount of momentum was modelled in Sweep's scrub tool, such that if desired the cursor can be thrown back and forth along the waveform display, and such that sudden changes in direction and speed are smoothed over to provide non-jerky responsiveness.
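
The effect of momentum can be illustrated with a first-order lag on the playback velocity. This is a sketch under stated assumptions rather than Sweep's scrub code: the names step_platter and simulate_throw, the time constant, and the step size are all hypothetical.

```python
def step_platter(velocity, target_velocity, dt, time_constant=0.25):
    """Advance the platter velocity by one time step of length dt.

    The velocity relaxes toward target_velocity (the hand speed while
    scrubbing, or the drive speed once released) as a first-order lag.
    """
    blend = dt / (time_constant + dt)
    return velocity + blend * (target_velocity - velocity)


def simulate_throw(initial_velocity, drive_velocity=1.0, dt=0.01, steps=200):
    """Return (position, velocity) pairs after the record is thrown.

    Velocities are in multiples of normal playback speed and positions
    in seconds of audio.
    """
    position = 0.0
    velocity = initial_velocity
    trace = []
    for _ in range(steps):
        velocity = step_platter(velocity, drive_velocity, dt)
        position += velocity * dt
        trace.append((position, velocity))
    return trace
```

Calling simulate_throw(4.0) models a record thrown forward at four times normal speed: the velocity decays smoothly toward the drive speed instead of snapping back. Feeding the same step function with a rapidly changing target velocity gives the smoothing of sudden changes in direction and speed.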

3.3. Monitoring playback latency

Recent efforts have vastly improved the ability of the Linux kernel to schedule interactive events, including low latency work by Andrew Morton and Ingo Molnar, and Montavista's work on kernel preemption maintained by Robert Love. This work has been so effective that with a properly tuned kernel the latency introduced by audio buffering can be reduced to the vicinity of 1 ms. However, this currently requires some configuration on the user's part, and is specific to Linux. It is also important to realise that the latency perceived by a user is introduced not only by the kernel but also by the application, and it is the application's responsibility to take the total latency into account when synchronising audio with visuals. The basic configuration window for selecting the amount of device buffering requested by Sweep is shown in Figure 3.

Figure 3. Sweep's device buffering configuration.

For the sake of portability and acceptable behaviour when running stock kernels, it was necessary in Sweep to introduce some visual feedback of the delay caused by device buffering. During playback, Sweep displays two cursors simultaneously, as shown in Figure 4: the white cursor to the right is under the user's control, and can be moved by the transport controls and the scrub tool; the green cursor to the left always displays the position of the audio that can currently be heard. Hence if the user scans or scrubs through the file, the white cursor is moved immediately but the green cursor may lag slightly due to buffering in the audio device, and due to motion smoothing introduced by the modelling of momentum. Thus the user has a true representation of their influence over the playback position, and is not misled by contradictory audio and visuals. This also provides an obvious visual representation of the application latency, which is otherwise a fairly abstract concept.

Figure 4. Sweep's cursors: playback (left) and user (right)
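
The relationship between the two cursors can be sketched with a simple queue model. It is an illustration only: the class CursorTracker, its block-based view of the audio device, and the function buffer_latency_seconds are assumptions, not Sweep's implementation.

```python
from collections import deque


def buffer_latency_seconds(buffer_frames, sample_rate):
    """Delay contributed by buffer_frames of queued audio."""
    return buffer_frames / sample_rate


class CursorTracker:
    """Track the user cursor and the lagging playback cursor.

    Assumption: the device holds buffer_blocks blocks of audio, so the
    block heard now is the one written buffer_blocks writes ago.
    """

    def __init__(self, buffer_blocks):
        self.buffer_blocks = buffer_blocks
        self.queue = deque()
        self.user_cursor = 0   # where the user has placed playback
        self.play_cursor = 0   # where the audible audio comes from

    def write_block(self, source_frame):
        """Queue one block for output that starts at source_frame."""
        self.user_cursor = source_frame
        self.queue.append(source_frame)
        if len(self.queue) > self.buffer_blocks:
            # The oldest queued block has reached the loudspeaker.
            self.play_cursor = self.queue.popleft()
```

In this model, if the user jumps from frame 200 to frame 5000 the user cursor moves at once, while the playback cursor reaches 5000 only after the blocks already queued have been heard. With 1024 frames buffered at 44100 Hz, buffer_latency_seconds gives a delay of roughly 23 ms.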