Hard-Disk Recording and Editing of Digital Audio. Presented at the 89th AES Convention, September 21-25 1990, Preprint Number 3006 (K-6)
Whither Dither: Experience with High-Order Dithering Algorithms in the Studio. (with Julia C. Wen) Presented at the 95th AES Convention, October 7-10 1993, Preprint Number 3747 (B3-AM-3)
Breaking the Sound Barrier: Mastering at 96 kHz and Beyond. Presented at the 101st AES Convention, November 8-11 1996, Preprint Number 4357 (I-2)
Music Recording in the Age of Multi-Channel. Presented at the 103rd AES Convention, September 26-29 1997, Preprint Number 4623 (F-5)
Towards a Rational Basis for Multichannel Music Recording. (with Jack H. Vad) Presented at the 104th AES Convention, May 16-19 1998
A Native Stereo Editing System for Direct-Stream Digital. (with Ayataka Nishio and Yasuhiro Ogura) Presented at the 104th AES Convention, May 16-19 1998
48-Bit Integer Processing Beats 32-Bit Floating-Point for Professional Audio Applications. Presented at the 107th AES Convention, September 24-27 1999, Preprint Number 5038 (L-3)
For my older papers, I don't have machine-readable copies.
The Synthesis of Complex Audio Spectra by Means of Discrete Summation Formulas Journal of the Audio Engineering Society, Volume 24, Number 9, November 1976, pp717-727. "A new family of economical and versatile synthesis techniques has been discovered, which provide a means of controlling the spectra of audio signals, that has capabilities and control similar to those of Chowning's frequency modulation technique. The advantages of the current methods over frequency modulation synthesis are that the signal can be exactly limited to a specified number of partials, and that 'one-sided' spectra can be conveniently synthesized." That was the abstract of the paper.
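The closed form at the heart of that technique is small enough to sketch. The code below uses a common statement of the discrete summation formula (the symbols theta, beta, and a are my labels, not necessarily the paper's notation): a sum of exponentially weighted partials collapses into one closed-form expression, which is what makes the method economical and exactly band-limited.

```python
import math

def dsf(theta, beta, a, n):
    """Closed-form evaluation of sum_{k=0}^{n-1} a^k * sin(theta + k*beta),
    i.e. n partials at frequencies theta, theta+beta, ... with amplitudes
    falling off as a^k. One division instead of n oscillators."""
    num = (math.sin(theta)
           - a * math.sin(theta - beta)
           - a**n * (math.sin(theta + n * beta)
                     - a * math.sin(theta + (n - 1) * beta)))
    den = 1.0 + a * a - 2.0 * a * math.cos(beta)
    return num / den

def dsf_direct(theta, beta, a, n):
    """Brute-force reference: sum the partials one at a time."""
    return sum(a**k * math.sin(theta + k * beta) for k in range(n))
```

Because the partial count n appears explicitly, the spectrum can be cut off exactly at a specified number of partials, which is the band-limiting property the abstract claims over FM.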
Linear-Phase Bandsplitting: Theory and Applications (with Mark Berger)
Presented at the 76th Convention of the Audio Engineering Society, October 8-11, 1984,
New York, Preprint 2132 (session A-1)
"There are a number of applications for banks of bandpass filters in professional audio
studios, both for film and music production. In this paper, we explore digital techniques
for bandsplitting that have the property that the spectrum may be separated into a number
of bands such that when these bands are added back together, the result is a pure delay. There
need be no amplitude or phase distortion other than delay. This allows such applications
as linear-phase graphic equalizers, multi-band noise gates, and many other improvements over
conventional studio equipment. These algorithms have been implemented on a large-scale
audio signal processor and run in real time. They are currently being used in major motion pictures."
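The complementary-band idea in that abstract can be sketched in a few lines: design a symmetric (hence linear-phase) FIR lowpass, then form the highpass as "pure delay minus lowpass," so the two bands sum back to an exact delay with no amplitude or phase distortion. This is a minimal two-band illustration under my own naming and a simple windowed-sinc design, not the paper's actual filter bank.

```python
import math

def hann_lowpass(num_taps, cutoff):
    """Symmetric (linear-phase) FIR lowpass via windowed sinc.
    cutoff is in cycles/sample (0..0.5); num_taps must be odd."""
    m = num_taps - 1
    h = []
    for k in range(num_taps):
        x = k - m / 2.0
        sinc = 2 * cutoff if x == 0 else math.sin(2 * math.pi * cutoff * x) / (math.pi * x)
        w = 0.5 - 0.5 * math.cos(2 * math.pi * k / m)  # Hann window
        h.append(sinc * w)
    return h

def complementary_highpass(h):
    """Highpass defined as (delayed impulse) minus the lowpass, so that
    lowpass + highpass is exactly a delay of (len(h)-1)/2 samples."""
    g = [-c for c in h]
    g[(len(h) - 1) // 2] += 1.0
    return g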
The Use of the Phase Vocoder in Computer Music Applications Journal of the Audio Engineering Society, Volume 26, Number 1/2, January/February 1978, pp42-45. This paper is one of the first (maybe the absolute first) to show how to use the short-term Fourier transform as a method of analyzing and synthesizing musical sound, but with the signal-processing rigor necessary to make the system an identity in the absence of modification. The most ignored contribution, and the one I consider probably the most important, is the technique for unwrapping the time-variant phase. Equation (9) represents a largely foolproof unwrapping method that involves no heuristics. This paper led to much of the subsequent work by Dolson and others, who have extended and refined the method for time and frequency modification of high-quality musical sound.
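The flavor of heuristic-free unwrapping can be sketched as follows. This is the now-standard phase-vocoder computation, written in my own names rather than as a transcription of the paper's Equation (9): subtract each bin's expected phase advance over the hop, take the principal value of what remains, and convert back to frequency.

```python
import math

def bin_inst_freq(phase_prev, phase_curr, bin_index, fft_size, hop):
    """Estimate a bin's instantaneous frequency (cycles/sample) from the
    phases measured in two successive STFT frames spaced hop samples apart."""
    # nominal phase advance of this bin's center frequency over one hop
    expected = 2.0 * math.pi * bin_index * hop / fft_size
    deviation = (phase_curr - phase_prev) - expected
    # principal value: wrap the deviation into (-pi, pi] -- no heuristics
    deviation = (deviation + math.pi) % (2.0 * math.pi) - math.pi
    return bin_index / fft_size + deviation / (2.0 * math.pi * hop)
```

As long as the true frequency sits within half a wrap of the bin center over one hop, the estimate is exact, which is what makes the scheme foolproof in practice.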
Signal Processing Aspects of Computer Music: A Survey Proceedings of the IEEE, Volume 65, Number 8, August 1977, pp1108-1137. This was an invited paper. Larry Rabiner invited me to write and submit this paper. It still stands as a reasonable survey of signal processing in music. It is interesting that synthesis is so little used today, whereas recording and playback (i.e., sampling) is so common. I guess it's a lot easier. Missing from this paper is any discussion of processing of the signal (aside from analysis). The computation for any interesting processing, except maybe reverberation, was so expensive at that time that we were not able to do much of it.
About This Reverberation Business
Computer Music Journal, Volume 3, Number 2, June 1979. This is a somewhat rambling
random walk through some investigations into room reverberation. I had originally
submitted it to the Journal of the Acoustical Society of America (JASA). I got a
scathing review back that I swear was longer than the paper. The reviewer complained
that it was in "an antiquated discursive style." Yeah, that's probably correct.
The reviewer differed with me on several technical points. I thought about it a while
and concluded that the reviewer missed the point and didn't know what he was talking
about, and in at least one area was flat wrong. Rather than try to fight with the
reviewer, I sent it to CMJ, which was quite happy to publish it the way I wrote it.
On the Segmentation and Analysis of Continuous Musical Sound by Digital Computer Center for Computer Research in Music and Acoustics, Department of Music, Stanford University, Report No. STAN-M-3, May 1975. This is my doctoral dissertation at Stanford. My thesis advisor was Alan Kay. This is generally cited as the seminal work on transcription of music. That is, you play a piece (a duet, in this case) into the computer and some time later, it prints out a score. This proved very difficult, especially with 1975 computer hardware (a DEC PDP-10).
On the Transcription of Musical Sound by Computer Computer Music Journal 1977. This is a reprint of a conference paper at the Japan computer conference earlier that year. I was unable to attend the conference (lack of travel budget). CMJ was kind enough to reprint the article later. This is a relatively brief summary of the work in my doctoral dissertation (above).
The Use of Linear Prediction of Speech in Computer Music Applications Journal of the Audio Engineering Society, March 1979, Volume 27, Number 3, pp134-140. This paper was about the base technology for my early computer pieces "Perfect Days" and "Lions are Growing". This was building on the work of Charles Dodge, Tracy Petersen, and many others. It turns out to be quite difficult to synthesize speech, even given a recording that you use as a template. I got to revive these techniques when I did my piece "The Man in the Mangroves Counts to Sleep", but with modern computing techniques. In this paper, I was boasting that it "only" took 45 minutes to synthesize 30 seconds of sound(!). Note that many of the techniques outlined in this paper are still used today.
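For the curious, the core of linear-prediction analysis is small enough to sketch. The following is generic autocorrelation-method LPC via the Levinson-Durbin recursion, with my own function names; the real analysis/synthesis pipeline involved windowing, frame-by-frame analysis, pitch and voicing decisions, and much more.

```python
def lpc_coeffs(signal, order):
    """Linear-prediction coefficients by the autocorrelation method with
    Levinson-Durbin. Returns a such that s[n] ~ sum_i a[i-1] * s[n-i]."""
    n = len(signal)
    # autocorrelation at lags 0..order
    r = [sum(signal[i] * signal[i + k] for i in range(n - k))
         for k in range(order + 1)]
    a = [0.0] * order
    err = r[0]
    for m in range(order):
        acc = r[m + 1] - sum(a[i] * r[m - i] for i in range(m))
        k = acc / err                      # reflection coefficient
        new_a = a[:]
        new_a[m] = k
        for i in range(m):                 # update lower-order coefficients
            new_a[i] = a[i] - k * a[m - 1 - i]
        a = new_a
        err *= (1.0 - k * k)               # shrinking prediction error
    return a
```

Fed a signal that really was generated by an all-pole recursion, this recovers the recursion's coefficients, which is exactly the property that lets LPC model the vocal tract well enough to resynthesize speech.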
The Use of Prime Residues as a Block Erasure Code with Linear Decoding Time Worldcom 2008. I include this paper mostly because I wanted to point out that I do things besides audio sometimes, and also to strongly advise engineers (audio or otherwise) to take all the math they can stomach. The innovation in this paper makes use of number theory, group theory, and statistical communication theory. I would probably not have done it without all that background. From the abstract: ". . . prime residue encoding forms a non-systematic block erasure code that is asymptotically MDS (maximum distance separable) as the word size is increased. The uses for this code include digital fountain implementation, efficient payload distribution for digital watermarking, and more." This code is related to Luby codes and "Tornado" codes. The big advantage is that the signal can be reconstructed from any N packets. If you miss one, you just copy one from your neighbor, even if it was encoded with a different set of primes.
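A toy sketch of the underlying residue idea (nothing like the paper's full construction, which adds the machinery needed to make the code asymptotically MDS): encode an integer as its residues modulo several primes, one residue per packet, and rebuild it via the Chinese Remainder Theorem from any subset of packets whose primes multiply out to more than the message.

```python
def encode(message, primes):
    """Encode an integer as its residues modulo a list of primes.
    Each (prime, residue) pair becomes one packet."""
    return [(p, message % p) for p in primes]

def decode(packets):
    """Reconstruct the message from any packets whose prime product
    exceeds it, by incremental Chinese Remainder combination."""
    x, modulus = 0, 1
    for p, r in packets:
        # extend x so that x = r (mod p) while keeping x mod modulus fixed
        t = ((r - x) * pow(modulus, -1, p)) % p
        x += modulus * t
        modulus *= p
    return x
```

Any sufficiently large subset works, which is the "reconstruct from any N packets" property; `pow(modulus, -1, p)` (Python 3.8+) supplies the modular inverse.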
A Note on the Implementation of Audio Processing by Short-Term Fourier Transform
"Short-term Fourier Transform (STFT) forms the backbone of a great deal of modern digital audio processing. A number of published implementations of this process exhibit time-aliasing distortion. This paper reiterates the requirements for alias-free processing and offers a novel method of reducing aliasing."
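One prerequisite for alias-free STFT processing is that the hop-shifted analysis/synthesis windows overlap-add to a constant (the COLA condition). This generic check (standard practice, not the paper's novel method) demonstrates the property for a periodic Hann window:

```python
import math

def hann(n):
    """Periodic Hann window of length n (the periodic form is the one
    that overlap-adds exactly)."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * k / n) for k in range(n)]

def overlap_add_sum(window, hop):
    """Sum of all hop-shifted copies of the window over one hop period.
    A constant result means unmodified analysis/resynthesis with this
    window and hop reconstructs the input exactly."""
    n = len(window)
    return [sum(window[t + s * hop] for s in range(n // hop))
            for t in range(hop)]
```

A periodic Hann window satisfies COLA at hops of N/2, N/4, and so on; violating the condition is one of the ways published implementations pick up the distortion the abstract complains about.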