Composing Speech:  

Investigation and Application of Musical 
Expression Embedded in Spoken Language  

 
Marc Du Plessis 
1043671 

 
Dissertation submitted in partial fulfilment of the requirements  
for the degree of Master of Music 

in the Wits School of Arts, Faculty of Humanities,  
University of the Witwatersrand 

 
Johannesburg, March 2024 

 
Acknowledgments 
 

This dissertation has been a challenging and rewarding learning experience which I could not 

have completed without the help of my two supervisors, Dr Cameron Harris and Dr Jonathan 

Crossley. Their constructive critique has equipped me with the skills to write this dissertation 

and compose the supporting music in a manner that leaves me feeling proud and accomplished.   

I would like to say a special word of thanks to my partner Christy, my parents, Charl and 

Nicole, and my friends who have supported me in my passion and given me the help and 

guidance needed. 

  

Declaration 
 

I declare that this research thesis is my own unaided work. It is submitted for the degree of 

Master of Music at the University of the Witwatersrand, Johannesburg. It has not been 

submitted before for any other degree or examination at any other university. 

 
Marc Du Plessis 

 
12 March 2024  



 
Abstract 
 

This dissertation explores the musical potential of emotive expression in cut-up speech sounds. 

Cut-up is a twentieth-century technique with roots in Dadaism in which one cuts “pre-existing 

material into radical juxtapositions” (BBC, 2015), made popular in literature by William 

Burroughs in the 1950s and 60s. Speech is used primarily to communicate information relating 

to the world around us, but it operates sonically. Therefore, it has inherent parameters that can 

be manipulated to inform how information is received. The ability to manipulate the inherent 

sonic parameters of speech is one way in which it can be emotively coded. Sung vocals with 

lyrical content in music differ from speech in that the roles of information communication and 

the manipulation of the sonic parameters are reversed. Where speech relies on the manipulation 

of sonic parameters to augment or diminish the information being conveyed, sung vocals that 

utilise lyrical content rely on the semantic content to augment or diminish the sonic 

characteristics of the voice. Sung vocals could therefore be thought of as sonic utterances that 

are semantically coded. These inherent parameters are shared by music as it also operates in 

the sonic realm. The researcher used electronic music production techniques to isolate the 

shared parameters between music and speech (pitch, rhythm, timbre, and dynamics), and 

composed expressive, accessible, and engaging musical works based on these parameters. 

Digital music technology can explore the limitations of sonic expression because it can

manipulate recorded sound waves. Therefore, it equipped the researcher with

the necessary tools to manipulate cut-up speech sounds with compositional intent.  

 
The objective of this research was to compose musical works that drew from popular music 

styles, with an aesthetic focus on rich, timbrally expressive vocal material created from 

recordings of speech, to understand the expressive capabilities of the chosen raw material 

(speech sounds). The methodological procedure was to record speech from various sources, 

edit (cut-up) the phrases to create brief clips that were divorced from semantic signification, 

present the edited clips to an audience, and analyse their responses. The researcher used the 

insights from this analytical process to inform the use of the same speech sounds in the 

compositional practice. The researcher presented 26 examples (brief composed cut-ups of 

speech sounds) to 45 participants in a survey group and eight South African music industry 

professionals in one-on-one interviews. The responses yielded qualitative data that was 

analysed using thematic coding, followed by statistical analysis using Spearman’s rank 



correlation (1904). The results provided vague answers to the primary research questions, but 

ultimately supplied the researcher with various qualitative interpretations of how the speech 

sounds expressed meaning in a cut-up context. This informed the researcher’s creative practice 

in the musical application of cut-up speech.  

 
Although the interpretation of the qualitative data did not result in definitive answers to the 

research questions, the aim of this research, to explore the musical application of emotive

expression in speech, was achieved. The understanding that a listener experiences music in an

inter-subjective and inter-contextual manner, combined with the expressive nature of the raw 

materials, liberated the researcher to compose expressive music without the need to know each 

listener’s subjective experience of expression.  

 
Musical works: 
https://drive.google.com/drive/folders/1gdVt2qVFIa4I0CnfkoWeOAVBEylDqR-h?usp=drive_link  

 
Unprocessed Examples (Original Sound Edits): 
https://drive.google.com/drive/folders/1Z48UKr37gaqX0TIiaxuhYVOGr5wWEshF?usp=drive_link  

 
Processed Examples (Survey and Interview Edits): 
https://drive.google.com/drive/folders/1_Jqi58y12FVrqOixsQQZpj1XqIaApYZP?usp=drive_link  

 
Ethics clearance number:  

H21/10/07 

  

Table of Contents 

 
Acknowledgments
Declaration
Abstract
Table of Contents
Index of Tables and Figures

1. Introduction
    1.1 Aim
    1.2 Rationale

2. Literature Review
    2.1 Accessibility in Music
    2.2 Music and Language
    2.3 Cut-Up
    2.4 Artistic Research
    2.5 Popular Music and Composition
    2.6 Qualitative Research

3. Methodology
    3.1 Initial Research
    3.2 Creating Examples
    3.3 Designing a Questionnaire
    3.4 Qualitative Data Gathering
    3.5 Data Capturing
    3.6 Transcription
    3.7 Thematic Coding and Analysis
    3.8 Quantitative Analysis
    3.9 Compositional Methodology

4. Presentation and Discussion of Results
    4.1 NVivo
    4.2 Participant Survey Results
    4.3 Coding
    4.4 Expert Interview Results
    4.5 Likert Scale Questions
    4.6 Phrases
    4.7 Correlation charts
    4.8 Intelligent label charts

5. Exploring the Data in Compositional Practice

6. Presentation, Analysis, and Commentary of Music
    6.1 Midnight’s Children
    6.2 The Weary Traveller
    6.3 Confusion Kills
    6.4 Inner Piece
    6.5 Undesirable
    6.6 Day Drifter
    6.7 Oblivious
    6.8 Melancholy Folly
    6.9 Stubborn
    6.10 Do You Understand?
    6.11 Whatever It Takes

7. Conclusion

References

Appendices
 
  

Index of Tables and Figures  
 

Tables 
Chapter 3 
Table 3.1: Demographic Description of Sources
Table 3.2: Parameters for Defining Examples
Table 3.3: Breakdown of Examples
Table 3.4: Interpretation of Correlation Coefficient Values
 

Chapter 4 
Table 4.1: Participant Survey Respondents' Age Range
Table 4.2: Participant Survey Respondents' Gender
Table 4.3: Participant Survey Respondents' Home Language
Table 4.4: Participant Survey Respondents and All Codes
Table 4.5: Participant Survey Respondents and "Speech"
Table 4.6: Expert Respondents’ Gender
Table 4.7: Expert Respondents and All Codes
Table 4.8: Participant Survey Respondents' Responses to Happy Likert Question
Table 4.9: Expert Respondents' Responses to Happy Likert Question
Table 4.10: Participant Survey Respondents' Responses to Fast Likert Question
Table 4.11: Expert Respondents' Responses to Fast Likert Question
Table 4.12: Participant Survey Respondents' Responses to Phrase
Table 4.13: Expert Respondents' Responses to Phrase
Table 4.14: SPSS Spearman’s Rank Correlation Coefficient Table for the Most Common Participant Survey and Expert Respondent Thematic Codes
Table 4.15: Percentage of Significant Correlations Between Examples based on Shared Characteristics
Table 4.16: Breakdown of SPSS Spearman's Rank Correlation Coefficient for all Looped Examples
 

Chapter 6 
Table 6.1: “Midnight’s Children” Formal Structure
Table 6.2: “The Weary Traveller” Formal Structure
Table 6.3: “Confusion Kills” Formal Structure
Table 6.4: “Inner Piece” Formal Structure
Table 6.5: “Undesirable” Formal Structure
Table 6.6: “Day Drifter” Formal Structure
Table 6.7: “Oblivious” Formal Structure
Table 6.8: “Melancholy Folly” Formal Structure
Table 6.9: “Stubborn” Formal Structure
Table 6.10: “Do You Understand?” Formal Structure
Table 6.11: “Whatever It Takes” Formal Structure



Figures 
Chapter 2 
Figure 2.1: "From Musical Practice to Artistic Research" (Crispin 2015: 58)
 

Chapter 6 
Figure 6.3.1: “Confusion Kills” Spectrograph

Figure 6.4.1: “Inner Piece” Spectrograph

 

1. Introduction 
 
1.1 Aim 
 

“Now listen to this.” The words were smudged together. They snarled and whined and 

barked. It was as if the words themselves were called in question and forced to give up their 

hidden meanings (Burroughs 1962: 21). 

Speech and music share expressive parameters such as pitch, rhythm, timbre, and dynamics. 

This research explores three primary questions to determine if one’s ability to interpret emotive 

meaning in both mediums is directly linked to this phenomenon. These questions were explored 

through utilising a mixed methodology of qualitative and quantitative analysis, and through the 

creative act of composition. Through the latter, the researcher explored the extent to which the 

expressive parameters of speech can be utilised to create musical material. By divorcing speech 

sounds from their original syntactic structure and utilising them in the creation of musical 

works, the researcher composed music that is primarily emotive, but also accessible insofar

as a general listener may be able to engage with, and interpret meaning from, the music. The

music is presented in the form of an eleven-track electronic album which borrows aesthetically 

and structurally from popular music. The musical works were inspired by Salman Rushdie’s 

Midnight’s Children (1981), in which the protagonist is born with the gift to hear the thoughts 

of his fellow characters. As such, each track in the album embodies a metaphysical representation

of fictional characters imagined by the researcher.  

 
This research interrogates three research questions in relation to the emotive expression of 

English speech sounds in a cut-up form. 

 
(1) Do speech and music share parameters that express emotive meaning in an analogous 

manner?  

 
Both music and speech are sonic-based mediums of communication and make use of pitch, 

rhythm, timbre, and dynamics: is our emotive experience of both mediums analogous? 

Literature pertaining to the relationship between language and music will be reviewed and used 

to analyse the qualitative results of this question discussed in Chapter 4. 

 

(2) How does the act of composing with cut-up speech sounds change the emotive meaning 

for the receiver?  

 
This research aims to investigate whether the emotive signification of speech is altered by 

rearranging its syntax based on formal musical structures. In order to derive data-driven insight 

into this question, the researcher created a listening experiment that was presented to two 

groups of participants. One comprised university music students and non-music specialist 

members of the public, and the other involved one-on-one interviews with South African music 

industry experts. This question was also considered by the researcher in his own reflection on 

his musical works. The technique of rearranging speech sounds was inspired by William 

Burroughs, an American writer who helped popularise the styles of writing and literature 

known as “cut-up” in The Nova Trilogy (commonly referred to as The Cut-up Trilogy), a series 

of three novels written between 1961 and 1964. The novels were constructed: 
using the ‘cut-up’ method in which existing texts, including Burroughs’ own writing and/or 
writings by other authors, were physically cut into pieces of variable length and re-assembled 
in random order to generate unexpected juxtapositions and new syntactic relationships (Murphy 
2002).  

The researcher was initially introduced to Burroughs’ work through the song “Fire is Coming” 

by Flying Lotus featuring David Lynch (2019). The song features cut-up-style lyrics narrated 

in a Burroughs-esque voice by David Lynch. Burroughs’ writing subsequently became a source 

of inspiration as the technique and resulting experience of the writing style felt analogous to 

the practice of composition. Like Burroughs, the researcher cut existing material (recordings 

of speech) and reassembled the pieces to create new syntactic relationships between the sonic 

artifacts. The researcher’s approach differed from Burroughs’ in that the cut-ups were created 

from a single phrase of recorded speech, by removing utterances from the original phrases. 

This process resulted in new syntactic relationships between the remaining speech utterances. 

These cut-ups became the raw materials used for the album tracks. Burroughs’ application of 

the cut-up technique, on the other hand, joined strings of words, sentences, and occasionally 

passages, with those from other sections or source material. 
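
The researcher's subtractive form of cut-up, described above, can be illustrated with a minimal Python sketch. It is an illustration only and not the workflow used to produce the album: the function name `cut_up`, the word-level segmentation, and the sample phrase are hypothetical stand-ins for audio clips edited in a digital audio workstation.

```python
def cut_up(utterances, remove_indices):
    """Return the utterances that remain after deleting the chosen positions.

    The original order is preserved, so any new adjacency between
    utterances arises only from what was removed.
    """
    drop = set(remove_indices)
    for i in drop:
        if not 0 <= i < len(utterances):
            raise IndexError(f"no utterance at position {i}")
    return [u for i, u in enumerate(utterances) if i not in drop]


# Hypothetical phrase split into word-level utterances for illustration;
# the dissertation works with recorded audio, not text.
phrase = ["oh", "no", "that's", "terrible"]
remaining = cut_up(phrase, remove_indices=[1, 2])
# "oh" and "terrible" are now adjacent, a relationship absent from the source.
```

Because nothing is imported from other phrases, every resulting cut-up remains a subset of a single recorded phrase, which is the point of difference from Burroughs' method.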

 
(3) Can a composer extract and harness the speech sounds which allow one to signify 

emotive meaning through spoken language?  

 
This question investigates the problem of signification. Is it possible to isolate and harness the 

speech sounds that signify emotive expression from the audio of recorded speech? By 



divorcing linguistic meaning from speech, is the emotive component of spoken language more 

easily identifiable? One of the methods for investigating this question was to create a series of 

26 brief audio examples that exhibited the cut-up method and present these examples to 

participants, both in a group survey and one-on-one interviews. The data gathered were

analysed using qualitative thematic coding as described by Virginia Braun and Victoria Clarke 

in Successful Qualitative Research (2013) and quantitative data analysis using Spearman’s 

Rank Correlation Coefficient to explore potential connections between participants’ responses. 
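
As a rough illustration of the quantitative step, Spearman's coefficient can be computed as Pearson's correlation applied to the ranks of two variables. The sketch below uses only the Python standard library; the function names and the code counts are hypothetical and are not drawn from the study's data. The dissertation's own tables list the coefficients as SPSS output.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of tied values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson's r computed on the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences of at least 2 values")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical data: how often two thematic codes were applied to six examples.
code_a_counts = [12, 7, 3, 9, 15, 1]
code_b_counts = [10, 8, 2, 9, 14, 4]
rho = spearman_rho(code_a_counts, code_b_counts)
```

A coefficient near +1 or -1 indicates that the two codes rise and fall together (or in opposition) across the examples, while a value near 0 indicates no monotonic relationship.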

 
This dissertation and its accompanying album should be viewed as the culmination of process-based

artistic research (research through art and design; Frayling 1993: 5), with each operating

iteratively towards the same end: namely, to investigate the extent to which music and speech

communicate emotive meaning in an analogous manner. The nature of the relationship between 

language and music has been considered at length for centuries. Scholar Michael Davis 

discusses Jean-Jacques Rousseau’s (1712-1778) interpretation of the relationship between

language and music in his paper “The Music of Reason in Rousseau’s ‘Essay on the Origin of 

Languages’” (2012). In this paper Davis explains that Rousseau’s approach stems from the 

notion that language developed because humans experience one another as subjects rather than 

objects (2012: 392). Subjects require a logical system (language) through which they can 

interact while objects are able to rely on sensations communicated through imitative 

expressions such as painting or music (Davis 2012: 392-393). In this interpretation humans are 

subjects that can perceive themselves in the world, and therefore require methods of 

communication (language) that allow them to interact with other humans about their 

experiences and the external world around them. In Rousseau’s view (as described by Davis) 

objects include animals and rely on instinct and feeling to navigate the world, sensations that 

are imitated in artistic works. 

 
This dissertation does not conform to Rousseau’s interpretation of the relationship between 

music and language. Rousseau implies a binary separation between musical, or artistic, 

expression and language, presenting a problematic perspective of people who are, in 

Rousseau’s view, capable of subjective and objective experiences. Instead, the researcher has 

chosen to approach the relationship of music and language as two necessary modes of 

communication that are connected through their nature as temporal sonic mediums and work 

together in both forms to convey information about the human experience. 

 

This project emerged from a desire to understand which elements of spoken language are 

analogous to modes of musical expression and explore the ways in which the emotive content 

of spoken language could be used in a musical context. Musical and linguistic expression are 

continuously combined in popular music, and the researcher wished to investigate this 

phenomenon by examining the role of the emotive delivery of cut-up speech sounds and how 

they can impact the overall expression of a musical work. 

 
The album composed in conjunction with this dissertation makes use of instrumentation limited 

to cut-up speech sounds and drum kit. The drums are the musical instrument through which 

this researcher artistically expresses himself. The voice was of compositional interest because 

it is the primary instrument through which most people communicate with one another. Our 

conceptual lives, and consequently, our perceptual lives, are born, grow, and develop through 

our experience with language (Langer 1942: 126). As a result, one can communicate and 

cultivate relationships with others. The combination of these two elements appeared to suggest 

a compositional framework through which to explore conceptual and creative expression. 

 
The approach of incorporating live drums and speech in the context of pop-influenced 

electronic music has been previously explored by American drummer Zach Danziger.

Danziger used hybrid acoustic and electronic drum kits to perform in Wednesday Night Titans, 

a duo comprising Danziger and bassist Kevin Scott. Their music combines 1980s wrestling 

videos, a hybrid drum kit setup of live drums and electronic triggers and sample pads, with live 

bass. The result is a multimedia performance that allows them to improvise with the audio 

(speech) and visuals of the wrestling clips using both acoustic and electronic techniques in a 

live context.  

 
1.2 Rationale 
 

The researcher proposes that without intent and an understanding that the tools at our disposal 

are a means to an end, we give the tool the agency in our work. A tool carries out a particular 

function and when used without creative intent, will continue to produce a predictable outcome. 

Electronic music technology is no different. Technology and music have a

symbiotic relationship. Since the invention of the first musical instruments, technology has 

been used to create new sounds and develop new instrumentation to satisfy artistic conceptions. 

The documentation of music through music notation, the enhanced quality of the sound 



produced during a performance through architecture (controlling the acoustic qualities of a 

space), and the ability to record music, are all examples of technological developments that 

allowed music to be experienced in different ways. These developments simultaneously laid 

the groundwork for new conceptions of music. Our interaction with technology as it relates to 

music production and distribution can be viewed as dialectic because our exploitation of the 

available technology is often restricted to the functionality of whatever tool we happen to be 

using. The commonplace use of musical technology could imply some acceptance of the

restrictions of the tools at our disposal. However, these restrictions are not fixed, as artistic and 

technological experimentation is continually redefining the boundaries of what is possible, and 

what is accepted.  

 
A recorded sound can be changed by the medium through which it is played back. For example,

magnetic tape allows sounds to be played slower or faster depending on the speed setting of

the tape machine. Vinyl records offer a similar possibility: they can be played at

standardised speeds (33⅓, 45, and 78 revolutions per minute) or physically manipulated to

alter the playback speed. This concept

highlights a key artistic consideration when exploring and creating electronic music. There 

needs to be artistic intent governing how technology is used to create art, otherwise the result 

could lack the depth required to be truly meaningful. In its current guise electronic technology 

has expedited the creation, distribution, and consumption of music. The researcher’s decision 

to compose a popular music album using electronic music techniques creatively challenges the 

accepted restrictions of the medium. The adoption of formal aspects such as structure and 

instrumentation was achieved with an emphasis on presenting thematic material in an 

intentional and meaningful way. 

 
Communication, how humans share ideas and information, is at the forefront of this 

dissertation’s rationale, along with the desire to explore how communication adapts in relation 

to its medium. Language and speech were the natural next step, while the medium of electronic 

music allowed for flexibility and freedom in the exploration of compositional methods that 

would not limit the creative output. When we communicate with one another we are engaging 

in the transfer of information. This information can take the form of conceptual ideas, data, 

artistic expressions, or emotive expressions (amongst others) and by participating in the 

exchange of information we develop our understanding of the people and world around us.  

 

Linear and Interactive Communication 

There are two modes of communication called linear and interactive communication (Bock 

2014: 36-37). Linear communication refers to a one-way sending of information to a receiver 

in which a sender encodes (speaks) a message to a receiver who then decodes the information 

and, hopefully, understands what is being communicated (36). In interactive communication 

we are constantly aware of the receiver and adjust the encoded messages to communicate in a 

particular environment or with a particular person more effectively (37). Additionally, there is

a semiotic element to the way we communicate. Languages are a system of signs that signify 

depending on the context in which we use them, but it is important to note that “signs are made 

and remade not used and sent” (45). This phenomenon happens as the result of framing, “the

mental filters we use to interpret an interaction or text” (41), and interpretation (45). This

phenomenon can be applied to our engagement with music. On the surface, our interaction with 

music seems to be a linear form of communication: we are presented with an encoded message, 

and we decode and interpret meaning from what we hear. Just as language signs are made and

remade depending on the framing or context, so too are musical signs. This concept greatly 

interests the researcher as a composer and influenced the creative aspect of this project. It is 

the researcher’s goal to communicate emotive expression with the people that hear this 

project’s music. The idea that through the reframing and the continuous remaking of signs, one 

can have novel and meaningful musical conversations, encourages the notion that a reframing 

of the “signs” of electronic music production can broaden the artistic scope of the medium. 

 
Language can often fall short when trying to articulate the intricate nature of our sentient life, 

as our emotions can be complex and abstract, and not easily put into words (Langer 1942: 48). 

On the other hand, music excels in representing a connotative, symbolic, and subjective 

experience due to the two phenomena’s (music and sentient life) “similarity of logical form” 

(28). This proposes that both music and one’s felt emotions convey meaning in the same way. 

The meaning of music lies in the receiver’s subjective feeling, just as it does with one’s 

emotional experience. Although music and language function differently in conveying 

meaning (Adorno 2002: 113), speech may be implementing some of the same parameters that 

characterise musical expression to convey emotions and connotative meaning. Imagine hearing 

someone say, “Oh no, that’s terrible” in an earnest and caring tone compared to a sarcastic or 

apathetic tone. The sentence and person are the same, yet one may make you feel comforted 

and understood, while the other may make you feel unheard or ridiculed. This example 

highlights the effect of our articulation on communication. The researcher suggests that the 



reason we can change the way we speak, and subsequently evoke a different response from the 

people we communicate with, is because we are able to control the musical parameters of our 

speech patterns (pitch, rhythm, timbre, and dynamics).  

 
According to philosopher and musicologist Theodor Adorno, music does not form a system of 

signs that could facilitate a true abstraction of what is experienced (2002: 113). However, much 

of contemporary popular music includes lyrics, which function as a vehicle for conceptual and 

linguistic information. “Language is conception, and conception is the frame of perception” 

(Langer 1942: 126). The voice as an instrument of communication is not merely a vehicle for 

conveying a system of signs that signify conceptual ideas. Public speakers, educators, 

politicians, and radio and podcast presenters are not just able to convey information in an 

eloquent manner; they are also able to captivate their audience or evoke an emotional response from it.

For the researcher as a composer, this phenomenon presented an opportunity to use cut-up speech sounds and the

musical parameters of speech to affect emotive expression in the creation of musical works.  

 
Historically, music and language have developed similarly by incorporating form in a 

disciplined manner (Langer 1942: 216). Music and language, spoken and written, both rely on 

form to eloquently express their “import” (content) (ibid.), but what happens when the form of 

language and its import obey the grammar and syntax of a specific musical expression? In the 

same way that grammar functions in language using rules to govern how words and sentences 

are constructed, music can also be understood as having rules that facilitate the construction of 

melodic and harmonic phrases and form dependent on the place, time, and community in which 

the music has been created. The compositions have been created as a way of exploring how the 

emotive components of cut-up speech sounds, largely divorced from semantic meaning, behave

in a musical context.  

  

2. Literature Review 
 
This chapter presents the literature used by this research as it pertains to the three research 

questions. The questions are each multifaceted and could be approached in a variety of ways. 

For a composer whose aim is to create musical works informed by their research, these

questions were regarded as avenues for creative practice. An investigation into the literature 

below has been used to inform analysis and creative practice. Thus, it has enabled the 

researcher to consider how one might address the research questions from a compositional 

perspective.  

 
The literature has been divided into the various foci that the researcher explores. Section 1 is 

concerned with musical accessibility, specifically outlining composer and writer Jochen 

Eisentraut’s (2013) and composer and scholar Dan Dediu’s (2012) ideas that there are certain

practices and qualities of accessible music that allow for easier engagement on the part of a 

listener. Section 2 is concerned with the relationship between language and music. This project 

makes use of speech sounds in its compositional practice and as a result, the idea that language 

and music share parameters has been explored in the literature. Section 3 is focused on the 

creative application of the cut-up method used by writer William Burroughs. Section 4 deals

with artistic research, a field that places emphasis on the production of knowledge through 

process-based creative practice. Section 5 discusses the expressive capabilities of popular 

music as it pertains to the aesthetic of the researcher’s musical works. The creative output of 

this research takes the form of eleven electronic music tracks that conform to the format of a 

popular music album. Section 6 briefly introduces the qualitative and quantitative analysis 

methods that will be discussed in Chapter 3 as this project uses data gathered from respondents 

to inform its compositional output.  

 
2.1 Accessibility in Music 
 
This project centres the idea of accessibility around the aspects of a piece of music which a 

listener can recognise and to which they can relate: those characteristics of a piece of music

that sound or seem familiar to a listener. Composer and scholar Dan Dediu provides a definition 

for accessibility: “a phenomenon born of the relationship of communication between sender 

and receiver, by which the sent information has direct access to the mental space of the receiver 

of that information” (2012: 53).  



 
Composer and writer Jochen Eisentraut created a typology for exploring accessibility in various 

musical situations. It is challenging to define a specific quality that music must possess to be 

received accessibly as “in considering musical accessibility we are looking at interactions 

between human beings and music (often via some media) and as such they involve an intricate 

interplay of musical structures, cultural constructs and psychology” (Eisentraut 2013: 11-12). 

For music to be accessible within the boundaries of Eisentraut’s proposal, it not only needs to 

be physically accessible to people, but it also needs to interact with culture for human beings 

to be receptive to its meaning. The researcher is interested in how structures of music, timbral 

qualities in the sound sources used, and/or aesthetic choices made by the composer may allow 

the music created in this research to be received in a more accessible manner. 

 
Eisentraut’s typology divides musical accessibility into three levels: (1) contact, (2) reception, 

and (3) participation (2013: 15). The levels, although specific in their focus, are not to be 

thought of as isolated and self-contained, as every situation can be analysed from each 

perspective (Eisentraut 2013: 2). Contact describes how music needs to be experienced by a 

listener in a physical manner (21). If a listener never hears a piece of music, any discussion 

about accessibility is immediately halted. If someone wants to listen to music, they need ways 

to access it; these ways naturally change and evolve in conjunction with culture and 

technology, and they are influenced by socio-economic and geographical factors (21). 

Reception can be interpreted from a variety of perspectives: does the music communicate 

emotive meaning or accompany an activity in one’s life and/or is one able to consciously 

interpret formal structures within the music that would facilitate a deeper or more complex 

understanding of what is being heard (22)? However, one does not need to be able to 

consciously interpret and evaluate musical form, or its technical attributes to receive input from 

music, because music can function as a regulator (relaxing music after a stressful experience), 

a confidant (music that helps one externalise their feelings), or a distraction from one’s 

emotional experience (music that accompanies a mundane activity) (22).

 
Level 2 accessibility is a process of engagement that deals with aesthetics, individual 

psychology, and preference, and it is also subject to change in accordance with the lives of the 

individuals that experience it (Eisentraut 2013: 1, 9).  

 

Level 3 accessibility deals with participation: “can I play an active part in that particular 

musical sphere?” (Eisentraut 2013: 15). Playing a musical instrument and learning to perform 

music falls into this level of accessibility. Socially interacting with a musical scene by dressing 

in a particular way or making friends with people who share the same musical taste also forms 

part of participation in music. “If we are keen enough on some musical genre, will we be able 

to become part of it in some way” (1). For a person to participate in some way with music or a 

musical social group, levels 1 and 2 are regarded as a prerequisite because without contact with 

the music or an affinity for it, it is unlikely any meaningful participation can take place (23). 

According to Eisentraut, engaging at the level of participation creates the potential for deeper

understanding and appreciation (45).

 
Eisentraut’s book, although providing a comprehensive examination of the accessibility of 

music, is more a study of the species of musical accessibility than it is a prescriptive guide to 

accessible music (Braae 2014: 395). His aim is to locate accessibility and make sense of its 

function within the discourse, but this dissertation wishes to focus on how one may use 

accessibility as a creative tool to communicate with a receiver. One critique of Eisentraut’s 

book is that he uses dualism to explain a concept that he states is pluralistic and multifaceted, 

which weakens his conceptual framework for accessibility. Anthropologist Evangelos 

Chrysagis suggests that the field would benefit more from an ethnographic approach to musical 

discourses (Chrysagis 2014: 369).  

 
If a musician desires to consider the accessibility of their music, they risk creating a listening 

environment that does not present any new information (Dediu 2012: 50), in favour of a 

rearrangement of previously experienced musical phenomena. This leads to music that appears 

pleasant and confirms our established taste but does not allow us to gain any new perspectives 

apart from the potential of the lyrical content. Conversely, music that contains none of these 

accessible characteristics, although containing only new or novel material, has a potentially 

harder task of conveying expression to a larger audience (if that is the intention) due to a lack 

of grounding in an established musical environment.  

 
If we consider music to be a medium of communication (Dediu 2012), then, like language, 

music must have certain recurring phrases, syntax or grammar that are both common and 

commonly understood. These require little to no effort on the part of the recipient to decipher their

meaning. Accessibility in music with the intention to convey an artistic message in a new or 



novel way needs to consider the medium (firstly, music, and secondly, stylistic or genre-based

tropes) and the message (what is new and how does it augment the expectations of the 

listener?). 

 
Dediu approaches the topic of accessibility in music from the perspective of a composer and 

musicologist. He explains the phenomenon of accessibility as both a relational phenomenon 

and something to be built. Accessibility is achieved in the act of discovery on the part of a 

receiver rather than something that a work of art inherently possesses (2012: 60). It is Dediu’s 

belief that music encompasses five roles for the receiver: psychological, social, cultural, 

ideological, and moral. The idea of accessibility is crucial because, for each role to be

successfully realised, understanding is required (49). On this point Dediu draws from Russian 

linguist I. M. Lotman and his concept of mental space, “the informational reservoir of a given 

individual at a certain moment in time, containing the totality of the information they hold” 

(50). For communication to occur between sender and receiver, there need to be common 

elements in both of their mental spaces. Without these, communication breaks down and no 

information is transferred. However, total overlap of elements in the mental space of the sender 

and receiver results in “communication without content” (50). 

 
So, is this proposing that for an artistic work to be accessible it should have an X value of 

accessible content and a Y value of new and novel information? Even if such a process were

developed and used to create art, the outcome would most certainly fall short of “perfect” 

accessibility, in which every receiver was able to receive the desired new information. As 

Eisentraut states, accessibility operates on an individual level between receiver and artwork. 

Considering this, along with the understanding that cultural context is extremely variable, the 

scope of musical accessibility becomes evident. When addressing the idea that objects (in this 

case works of art) are accessible, Dediu has this to say: “an artistic object cannot actually be 

accessible, but only have in its structure elements that, in a certain context and in relation to a 

certain receiver, can act as catalysts for communication and help the emergence of the relational 

phenomenon of accessibility” (2012: 59). 

 
2.2 Music and Language 
 

It is beyond the scope of this research to detail a linguistic history and analysis of the various 

ways in which language functions to convey meaning, or to provide a definitive method for 



understanding language. However, the musical source materials used in this project are cut-up 

speech sounds. Therefore, an investigation of language and music’s ability to express meaning 

allowed the researcher to understand how these cut-up speech sounds could be utilised for 

composition. “Both speech and music are noted to be ‘particulate’ systems, in which a set of 

discrete elements of little inherent meaning are combined to form structure with a great 

diversity of meanings” (Dilley 2009: 535). This section will discuss philosophies surrounding 

musical expression and unpack linguistic definitions that have informed the research. 

 
If one considers the notion that music is a system that can signify meaning (Dilley 2009: 535), 

then it is logical to ask: what kind of meaning does music signify? Susanne Langer, a mid-

twentieth-century philosopher, believed that music conveys meaning because it can be

understood as “a tonal analogue of emotive life” (Langer 1953: 27). In other words, music is 

experienced in the same way that one “feels” the reality of sentience. Music theorist Raymond

Monelle, reviewing Langer’s conceptualisation of musical expression, states that one’s sentient

experience, often complex and abstract, is constantly in flux and can be challenging to

conceptualise using language (Monelle 1992: 9). Music expresses, not in the sense that it 

directly conveys emotions experienced by the listener, but rather it is felt and understood as 

the “expression of an idea … it is a symbol by virtue of being felt as quality rather than 

recognised as a function” (8). Langer’s “felt” idea is not limited to emotional information (e.g., 

happiness or sadness), rather, art can convey any feeling that one can experience, from our 

sense of touch and the feeling of pain or euphoria to abstract emotions and thoughts (Reece 

1977: 45).  

 
What, then, makes a piece of music, or any other work of art, successful? According to Langer, 

the task that every artist faces is to effectively, through symbolism, convey an individual and 

subjective experience objectively (Langer 1964: 381). Following this proposal, an artist would 

use the knowledge of their medium to create an artistic object that acts as a representation of 

their subjective experience. The artistic object, however, cannot converse with a receiver. Its 

meaning is encoded by the artist through their ability to manipulate the medium. Receivers 

interact with the artistic object from their own subjective perspective and interpret meaning 

based on their knowledge of the symbolism. The successful transmission of a message between

the artist and the receiver, as per Langer’s interpretation, is then dependent on how well the 

transformation has been constructed.  

 

The role of symbolism is to represent concepts and ideas (Langer 1953: 26). Signs refer to or 

suggest specific entities, the way that a noun functions in language. However, symbols can 

express ideas to a receiver, and often express a multitude of ideas to a multitude of receivers 

based on context and discourse (26). Therefore, one could choose to interpret the success of an 

artistic work in its ability to move beyond signification into the world of symbolism. This 

statement presents a pertinent discussion in relation to the researcher’s third primary research 

question: Can a composer extract and harness the speech sounds which allow one to signify 

emotive meaning through spoken language? If the success of an artistic work requires it to live 

in the world of symbolism, inside which a multitude of ideas can be expressed to a multitude 

of receivers irrespective of cultural context or discourse, then supposing a composer were

able to extract that which signifies emotive meaning through language, would the music be a 

successful work of art?  

 
Musicologist Simon Emmerson’s insight speaks to Langer’s idea that music can communicate 

a felt or experienced version of our sentient realities. Emmerson’s idea of “image” presents the 

concept that the sounds used by a composer can evoke in the listener an “image” (Emmerson

1986: 17) that is not simply the representation of the sonic stimulus, but an “image” or idea 

“lying somewhere between true synaesthesia with visual image and a more ambiguous complex 

of auditory, visual and emotional stimuli” (17). The listener’s cognition of sound, which is 

often spatially separated from its source, creates an “image” of expression. A sound can

transport one to a nostalgic memory or a location from one’s past, trigger the recollection of

other sounds, or, as Emmerson explains, evoke an “image” containing each of these experiences. In

The Relation of Language to Materials (1986), Emmerson was not focused on establishing a 

definition of musical expression, or even on fully uncovering what “images” are evoked by music,

but rather on the relationship between musical composition and the perceived “image” (17).

 
It is important to understand, not just the way music is potentially able to express meaning, but 

how the relationship between music and language is reflected in literature. In this regard, four 

main concepts are commonly used: music as language, language in music, music in language, 

and language about music (Feld and Fox 1994: 26-27). Music as language details the view that 

music can be approached using linguistic models of analysis and expression (26). Music in this 

category is viewed “as an autonomous formal domain, abstractable as hierarchical structure or 

cognitive process” (26). The structures can be interpreted using linguistic organisational ideas 

like grammar and syntax, encompassing morphology, the study of how words are formed from 



morphemes, and phonology, the study of the sound patterns used in language (26). This approach

to understanding music falls in line with the constructivist, structuralist linguistic models that 

hold the belief that languages are autonomous formal domains. Regarding music as language, 

gesture should not be overlooked. Language is produced by physical action: the vibration of 

one’s vocal cords, the shape of one’s mouth and lips, and often with facial expression and other 

bodily gestures like hand movement and changes in posture. Music also relies on gesture for 

its production: acoustic, electric, and digital instruments all rely on gesture to produce sound. 

 
Language in music investigates the “intertwining of language and music in verbal art, song 

texts, and musical performance” (Feld and Fox 1994: 27). Prosody, the patterning of stresses 

and intonation used in language, and paralanguage (vocal timbre, the use of dynamic 

expression, and pacing/tempo) are contained within the realm of music in language (27). 

Lastly, a focus on language about music investigates the use of language within discourses that 

surround musical aesthetics (27). 

 
By examining the parameters shared by music and speech such as pitch, rhythm, and timbre, 

one could hypothesize that the emotive information interpreted in speech is a result of the same 

“felt” expression of an idea. It is not the linguistic content alone that facilitates our 

understanding of someone’s emotional state during a conversation, but its symbiotic 

partnership with features of the mode of delivery.  

 
All these categories that consider the relationship between music and language, whether 

analogous to linguistic models or the language about musical discourse, need to be 

contextualised and understood from the perspective of being firmly rooted in western ideas of 

musical form and linguistic thought. Even Langer’s interpretation relies on western 

conceptualisation of sentient experience. Although music and language are universals in the 

sense that every culture uses them, the theoretical understanding of how they function is far 

from universal (Feld and Fox 1994: 28). This research does not claim that its findings are

universal truths of musical and linguistic meaning, but explores how the results could be

applied to creative practice. 

 

 
2.3 Cut-Up 
 
The cut-up technique was used and popularised by William Burroughs, an American writer 

who “was widely recognized as one of the most politically trenchant, culturally influential and 

innovative artists of the twentieth century” (Burroughs 2014: i). Burroughs used forms of the 

technique in his writing. The Soft Machine (1961), The Ticket That Exploded (1962), and Nova 

Express (1964), referred to as The Nova Trilogy or The Cut-Up Trilogy, are three novel-style 

works that Burroughs wrote using the cut-up technique.  

 
Applying the cut-up technique to audio is something Burroughs was aware of, explaining that 

Brion Gysin, the “rediscoverer” of the cut-up technique, had “pointed out that the cut-up 

method could be carried much further on tape recorders” (Odier 1970: 28). Burroughs’ use of 

the method in his critically acclaimed The Nova Trilogy (1961-64) means that his name has 

become intrinsically associated with it, but he was always quick to try and explain that Gysin 

was the inventor of the idea (Ryan 2022). Burroughs explains that tape recorders offer a variety 

of ways in which sounds can be manipulated that are impossible to achieve in writing: “effects 

of simultaneity, echoes, speed-ups, slow-downs, playing three tracks at once, and so forth” 

(Odier 1970: 29). Burroughs’ use of the cut-up technique in his writing influenced the 

compositional approach and aesthetic of this research. 

 
Burroughs’ cut-up method was fluid in the sense that he employed cut-ups to varying degrees 

in his creative process: by using parts of cut-up text as the inspiration for larger, more linear ideas

and discarding the experiments, or by making use of actual sentences composed of cut-up

material (Odier 1970: 29). It was, by his accounts, both a conscious and unconscious endeavour 

in the sense that he was in control of the material he proceeded to cut up (although this can be

a random act as well) and whichever method he used to rearrange the material, but unconscious 

of what the result would be (30). The success or failure of a cut-up would be determined upon 

an examination of the finished experiment and not based on the method chosen to achieve the 

result (30). 

 
“It is hoped that the extension of cut-up techniques will lead to more precise verbal experiments 

closing this gap and giving a whole new dimension to writing” (Odier 1970: 27). Burroughs

was hopeful that by extending the cut-up method into the realms of music (audio), film and 



photography (visual), mediums less limited than the written word, it could help elevate the 

creative possibilities of writing. Early editing techniques in film, such as cross-cutting, may

have been the inspiration for the adoption of similar techniques in other mediums. By

manipulating recordings of human speech, the researcher intends to explore the possibilities 

that the cut-up technique presents to investigate whether the sonic information in cut-up speech 

sounds aids one’s understanding of signified emotive meaning in language.

 
Cut-ups should not be thought of as a relic of the past, either. In our virtual lives, many of us 

experience examples of cut-up on a regular basis. Short-form videos on Instagram and TikTok

employ a technique known as stitching in which a user can edit video and audio content with 

existing content on the platform to create their own versions of the original. In fact, Burroughs 

was fully aware of how powerful a tool cut-up could be in the entertainment industry, stating, 

“you want the widest possible circulation for your cut-up video tapes. Cut-up techniques could

swamp the mass media with total illusion” (Odier 1970: 181). 

 
2.4 Artistic Research 
 
Artistic research, like any research tradition, is rooted in the desire to ask and answer questions, 

solve problems, ascertain new knowledge, and add to the wealth of knowledge that precedes it 

(Nelson 2013: 3). However, the focus of artistic research is that there exists “a special mode of 

functioning as an artist that goes beyond the natural and intuitive enquiring of the artistic mind 

and encompasses something of the more systematic methods and explicitly articulated 

objectives of research” (Crispin 2015: 56).  

 
The artist-researcher should try to find a method that allows them to exist between the realms

of the purely intuitive and “subjective” exploration and that of “objective” or scientific research 

(Crispin 2015: 57; Young 2015: 153). One must be able to exist and act as an insider to the 

creative practice, i.e., the person or participant making the work, and step away from that 

perspective and occupy the position of observer or critic. Navigation between these two modes 

is crucial to an informed artistic research methodology (Suoranta, Hannula, Vadén 2014: 16). 

The researcher’s experience of engaging with existing literature and the subsequent analysis of 

the results derived from the participant responses informed how the creative practice 

approached the use of cut-up speech sounds. Similarly, the insight gained from the researcher’s 



creative practice acted as a catalyst for the pursuit of research that could help to conceptualise 

their experience.  

 
Scholar Estelle Barrett shares Crispin’s view, stating that both explicit and tacit knowledge

are required by practitioners of creative arts research (Barrett 2007: 4). Barrett’s belief in the 

validity of creative arts research comes, in part, from its ability to expose new or marginalised 

“realities” outside of traditional research scopes (4). Artistic research is subjectively motivated, 

intuitive, and due to the nature of creative practices, multidisciplinary, with insight and 

inspiration drawn from many disparate sources.  

 
The artist-researcher is positioned to pose research questions that would not

occur to the scientific researcher due to the process-based nature of artistic enquiry, resulting 

in the production of knowledge that can produce insight into creative practice (Crispin 2015: 

60). However, the question of successful outcomes is always present when discussing artistic 

practice as research. As Barrett writes: “because of the complex experimental, material and

social processes through which artistic production occurs and is subsequently taken up, it is 

not always possible to quantify outcomes of studio production” (2007: 3). To ensure that one 

can interpret the results of artistic research, it is of vital importance that the research question(s) 

be focused to clearly convey that which the researcher is trying to investigate (Crispin 2015: 

61). The result of the artistic research, apart from a written thesis, is the practice or creative 

output of the research. This is vital to artistic research as the submitted practice is “evidence of 

research inquiry” (Nelson 2013: 9). Figure 2.1 is a visual representation of the process by which

musical practice moves from being the core of musical art-making to a process viewed as viable 

research activity (Crispin 2015: 58). Musical practice is the act of performing or creating a 

piece of music. Informed musical practice is the act of acquiring information that can be used 

to inform a musical performance or inform the choices made in the creation of a piece of music. 

This information could relate to the historical accuracy of a performance (how would a piece 

have been performed at the time of its creation), or the stylistic/aesthetic choices made by a 

composer working within a particular genre. Informed, reflective musical practice is the 

process of reflecting on how the performance or compositional choices, made based on 

contextual information, influence the music. Lastly, research in-and-through musical practice 

(artistic research) is the meticulous process of research into the various processes of making 

music and situating these processes within a methodological framework (57-58). Artistic 

research should encompass all the categories from which it extends.   



 
Figure 2.1: "From Musical Practice to Artistic Research" (Crispin 2015: 58) 

 
The role of theory is vitally important to the artistic researcher as it can be used to derive the 

tools to investigate conceptual spaces (the artistic world created by the practitioner) (Young 

2015: 163). The process of working within the subjective world of taste, intuition, and feeling, 

and the objective world of theory and research, facilitates progressive results in the creation of 

artistic works. The conceptual space, born from a subjective creative practice, can be navigated 

and conceptualised using theoretical tools (music theory, colour theory, conceptual 

frameworks such as Denis Smalley’s spectromorphology (Smalley 1997: 107)), providing

structure and organisation to a possibly disconnected artistic discourse.  

 
2.5 Popular Music and Composition 
 

This section presents literature pertaining to popular music as the researcher’s own musical 

works have drawn from selected aesthetics and structural characteristics from the popular 

music discourse. The writings of scholar Keith Negus and philosopher Theodore Gracyk

outline the conceptual frameworks as they analyse narrative and timbre respectively in popular 

music. This section also presents the researcher’s musical influences: the aesthetic and/or 

conceptual approaches that inspired his musical works.  

 

Negus’s paper “Narrative, Interpretation, and the Popular Song” (2012) posits that within the 

realm of popular music, “even if the inspiration is implicit or unacknowledged, songs are heard 

alongside and in relation to other songs (by songwriter and listener alike)” (370). Negus implies 

that part of the meaning behind a song is in the listeners’ (including the songwriter’s) subjective 

position. Their interpretation is influenced by their prior interaction with other songs. To 

analyse meaning in popular music, Negus suggests that one should approach from an inter-

subjective and inter-contextual perspective (2012: 368). Inter-subjectivity means that an idea, 

belief, or understanding of something is shared between two or more people. In the context of 

music, inter-subjectivity can be thought to represent a shared musical experience or a shared 

interpretation of musical meaning. Musical inter-contextuality refers to the shared context of a 

musical event.  

 
The relevance of Negus’s approach for the researcher is that it focuses on the way popular 

music can express narrative meaning. In a narrative approach to musical analysis, one

experiences an account of various connected musical events, and one’s interpretation of the

relationship between these musical events is what expresses meaning. Narrative is far more 

accessible when lyrical content is present in musical works because one is more able to connect 

semantic events into a logical narrative. However, music without lyrics still presents events 

that can be interpreted narratively, albeit in a more ambiguous manner. Negus speaks of the 

“poetic possibilities of ambiguity” (2012: 379), in which popular music’s ambiguous narrative 

expression contributes to its inter-subjective and inter-contextual emotive expression. He uses 

the example of Randy Newman’s “You’ve Got a Friend in Me” (1995) from the Pixar film Toy 

Story (1995). In his interpretation, Negus explains that in the context of the film the song 

describes the friendship between a boy and his toy. The context of the film makes the semantic 

content of the lyrics and musical events very clear, but if a listener were to listen to the song in 

a separate context, then the ambiguous nature of the events presented in the song makes them applicable

to other relationships (such as personal friendships). The lyrics limit infinite interpretations of 

the song’s meaning, but this illustrates the powerful effect that context and subjectivity have 

on the listener’s experience.  

 
From the perspective of a composer, the understanding that narrative is a crucial part of popular 

music’s expression aided the researcher in conceptualising his musical works. The

researcher’s approach to each of the eleven tracks in the album created was to develop a 

metaphysical narrative representing his experience with numerous characters. The tracks were 



composed with the idea that each musical event would be interpreted (either consciously or 

unconsciously) by a listener and experienced as a recollection of an interaction. Negus’s 

description of music’s ambiguity and his advocacy of an inter-subjective and inter-contextual

approach intertwine with the second research question (How does the act of

composing with cut-up speech sounds change the emotive meaning for the receiver?). The

ambiguity or absence of semantic information in the cut-up speech sounds opens the interpretation

of the music to varying inter-subjective and inter-contextual experiences.  

 
Having focused on the conceptual analysis of popular music, what follows is a discussion of 

the composite elements of popular music and the importance of timbre as a significant 

contributing factor of the emotional experience of a listener. Gracyk’s argument in his paper 

“Sound and Vision: Colour in Visual Art and Popular Music” (2003) is that “technological 

developments enhancing the role of timbre in musical arrangements make timbre highly 

analogous to visual colour in modern painting” (49). This process has allowed timbre to act at 

a level equal to parameters that are conventionally thought of as defining formal structure 

(melody, harmony, form, orchestration) in conveying emotive expression in popular music 

(52). Gracyk compares musical timbre to the use of colour in visual art, stating that historically

the two elements have been regarded as secondary to the structural elements of composition in 

both mediums. In relation to rock music, he states that the structural clarity exhibited in many 

of the works in the genre is in service of the true expressive resources of the music: song 

writing, production, and tone colour (50).  

 
The researcher’s adoption of a popular music aesthetic was influenced by the desire to 

emphasise timbre as the core expressive parameter in the music. Although both speech and 

music share the structural parameters of pitch (by extension, melody) and rhythm that can be 

organised according to linguistic or music syntaxes, the timbre of the cut-up speech sounds 

presented a rich resource with which to express emotive meaning in a musical context. In 

experimenting with this idea in an accessible musical context, a popular music approach to 

structure facilitated an experience that could foreground timbre within established musical 

structures. 

 
Some of the researcher’s musical influences use the ideas expressed above to great musical 

effect. It must be noted here that any discussion of popular music places one in the realm of 

subjective interpretation and that Negus states, paraphrasing Chris Kennett (2008), that 



“personal listening is all that scholars can legitimately offer” (2012: 377). Considering this, the 

researcher discusses the artists below only in service of providing context for the aesthetic 

choices of his own musical works.  

 
Nine Inch Nails’ “Hurt” (1994), the final track of the album The Downward Spiral, through 

production and intentional performance, draws attention to Trent Reznor’s intimate and 

strained vocal timbre in delivering a visceral and expressive account. Supported by diatonic 

melodic phrasing and a stable and unwavering rhythmic foundation (among other structural 

elements), the timbre of the voice acts as one of the key elements expressing emotive meaning.  

Aesthetically, Trent Reznor (founder and principal songwriter of Nine Inch Nails) has opted 

for an unambiguous approach to the structural organisation of his music, in service of timbrally 

expressive orchestration. This approach results in music that presents a listener with enough 

established and common musical context that one’s attention can be drawn to production, 

timbre, and narrative of the music. 

 
Amon Tobin, the Brazilian electronic music producer, though far more experimental in his 

approach to sound design/timbre and arrangement, was significantly influential in the 

researcher’s musical work. Tobin’s music served as an aesthetic touchstone throughout the 

compositional process (specifically Out From Out Where (2002)) because his music is novel, 

exciting, dense, and presented in a way that does not immediately alienate a listener. He “plays” 

with, stretches, and warps the structural components of popular music in a way that challenges 

or reframes one’s understanding of their function within the music. Tobin has also made use 

of cut-up vocal fragments in his music. “Verbal” (feat. MC Decimal R.) (2002) from

Tobin’s Out From Out Where (2002) uses cut-up vocals as the lead element of the piece. His 

2011 album ISAM is a collection of thirteen tracks that were created from sound recordings of

his child’s voice, manipulated to varying degrees to create timbrally rich musical environments. 

The similarity of source material that Tobin uses in these pieces influenced how the researcher 

treated the cut-up speech sounds. The tracks on ISAM (2011) often obscure the voice to the 

extent that it is challenging to perceive the original source sounds, while “Verbal” presents the 

vocal cut-up in a direct and obvious way so that the listener is immediately aware of the source 

material. The researcher experimented with the degree to which the cut-up speech sounds could 

be manipulated to find expressive possibilities within the material but was careful to retain the 

vocal identity. 

 

Along with cut-up speech sounds, drums were the only other instrument used to create the 

musical works for this research. As a result, the researcher’s approach to the instrument was 

similarly considered. One influence was Nate Smith’s Pocket Change (2018), a solo drum 

album that establishes and develops thematic variation in the context of established pop, funk, 

and jazz-influenced, groove-based drumming. Smith’s variation of repetitive patterns within

a groove context was adopted by the researcher in the composition of his drum parts for the 

album. Subtle changes to the thematic content of groove-based drumming gave the music a 

stable and predictable foundation that also propelled its development. 

 
JoJo Mayer’s work with his electronic band Nerve created an environment in which acoustic 

drums and electronic instrumentation coexist. Mayer and Jack Quartet’s performance of Don 

Li’s “Different Zones” (2021) makes use of drum kit, string quartet, and recorded playback of 

Mayer’s speech. Speech phrases are repetitively played while the drums and string quartet 

metrically modulate the musical accompaniment to alter the listener’s perception of the music.

This performance, although not only for drum kit and speech, acted as a reference that 

demonstrated a specific approach in which the drum kit could be used to accompany recorded 

speech and how compositional arrangement could be used to alter the perception of repeated 

phrases.  

 
Radiohead was a further aesthetic influence on the researcher’s music. The band balances

experimentation with established pop tropes and has also made use

of the cut-up technique in both writing lyrical content and vocal arrangement. “Everything In 

Its Right Place” from their album Kid A (2000) starts with Thom Yorke’s vocals cut-up and 

layered over a synthesized melodic pad. The remainder of the piece contains sung vocal

phrases derived from lyric cut-ups while cut-up vocals move to the background and provide 

rhythmic development and timbral contrast to lead vocals and instrumental accompaniment. 

The influences described in this section (among others) illuminated potential avenues of 

creative exploration during the compositional process, some of which were conceptual and 

bear little resemblance to the original artists, while other influences expose themselves in more 

obvious and direct references.  

 

2.6 Qualitative Research 
  
This research gathered qualitative data from participants to answer the three primary research 

questions. Questionnaires were used to facilitate group responses and one-on-one interviews 

were conducted to gather more nuanced and detailed qualitative data. All the data gathered for 

this research was organised and analysed using the process of thematic coding, outlined by 

psychologists Virginia Braun and Victoria Clarke (2013), to locate common themes found in 

the data and extrapolate associations and trends in the participant responses.  Chapter 3 details 

how thematic coding was used to interpret the responses of participants and explains the 

quantitative methods used to investigate the responses for instances of significant correlation.  

 
The process of coding is commonly used in analysing qualitative data. It describes the process 

of relating the data that has been captured to one’s initial research questions or objectives by 

identifying aspects that are relevant (Braun and Clarke 2013: 206). However, it is not always 

immediately evident to the researcher which aspects of the data are of relevance. Braun and 

Clarke identify two categories of thematic coding: (1) selective coding and (2) complete coding 

(206). Selective coding is the process of finding instances that support the researcher’s 

theoretical and analytical goals. The instances that pique interest but are not directly relevant 

to the initial topic of the research are not coded. In the case of selective coding the researcher 

is fully aware of the types of responses or data that will be useful for their research aims. 

Complete coding requires the researcher to code the entire data set, taking note of everything 

that may be of interest. Once this process is complete, the researcher can be more selective 

(206). Complete coding ensures that the entire dataset has been explored for any useful insight, 

allowing the researcher the flexibility to adjust based on the information the data contains.  

 
The codes created during the process of selective and complete coding are named using “a 

word or brief phrase that captures the essence of why you think a particular bit of data may be 

useful” (Braun and Clarke 2013: 207). The codes used to group various instances of data 

together can be data-driven or researcher-driven, depending on the goals of the research being 

undertaken. Data-driven coding requires the researcher to use semantic codes derived from the 

responses in the data because they act as a summary of explicit information and mirror the 

language used by participants in their responses (207). Researcher-driven codes relate to the 

researcher’s own conceptual and theoretical frameworks and function as references to implicit 

information in the data (207). Researcher-driven coding can be susceptible to confirmation bias 



in that the researcher is analysing the data with the intention of categorising the results based 

on previous theoretical frameworks, which can result in a skewed representation of the 

responses. Thus, the researcher chose to implement a data-driven coding approach which, 

while still subjective and vulnerable to the researcher’s own bias, designs data categories based 

on the information in the responses rather than preconceived theoretical perspectives.  
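
The data-driven approach described above can be illustrated with a minimal Python sketch. The participant labels, the codes, and the function name `tally_codes` are hypothetical placeholders invented for illustration and are not drawn from the study's data. The sketch only shows how codes that mirror participants' own wording might be counted across a data set before themes are identified.

```python
from collections import Counter

# Hypothetical coded responses: each response has been read by the researcher
# and tagged with short, data-driven codes that echo the participant's wording.
# These entries are illustrative placeholders, not data from the study.
coded_responses = [
    {"participant": "P1", "codes": ["robotic", "anxious"]},
    {"participant": "P2", "codes": ["anxious", "stuttering"]},
    {"participant": "P3", "codes": ["robotic"]},
]


def tally_codes(responses):
    """Count how many responses each code appears in."""
    counts = Counter()
    for response in responses:
        # set() ensures a code is counted at most once per response.
        counts.update(set(response["codes"]))
    return counts


code_counts = tally_codes(coded_responses)
for code, count in code_counts.most_common():
    print(f"{code}: {count}")
```

Counting each code once per response, rather than once per mention, keeps a single talkative participant from inflating the apparent prevalence of a code. This is one possible design choice among several.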

 
There are three common theoretical approaches to using research interviews as a method for 

gathering qualitative data. The first and second perspectives, neopositivism and romanticism,

described by scholars Sandy Qu and John Dumay in their paper “The Qualitative Research

Interview” (2011), are representative of more historically established views, whilst the third

perspective, localism, looks to disrupt conventional theoretical approaches (Qu and Dumay 

2011: 240-241). The neopositivist interview approach treats the accounts of the interviewee, 

regarded in this approach as a knowledgeable truth teller, as an objective transfer of knowledge, 

and considers the interview as a tool for obtaining data (241). The potential limitation of this 

view is that context and the interviewee are removed from the equation. Only the information 

is considered, and it is considered the truth, spoken by someone who can explain potentially 

complex or sensitive ideas accurately and logically in an eloquent manner (Koven 2014: 503).  

 
The romanticism interview approach states that “interviews are best understood as speech 

events” (Koven 2014: 501). They take place in a location, with participants that are dynamic 

and multifaceted. The romanticism approach regards the interview process as an encounter 

between interviewer (an empathetic listener) and interviewee (a participant), the accounts of 

which are viewed as “a pipeline of knowledge mirroring interior and exterior reality leading to 

in-depth shared understanding” (Qu and Dumay 2011: 241). The romanticism approach is often 

favoured for ethnographic interviews, while the neopositivist approach is considered more 

scientific and context-free in nature.   

 
A localist interview method allows a researcher to flexibly approach complex issues from 

various perspectives (Qu and Dumay 2011: 242) and produces “situational accounts that must 

be understood in their own social context” (241). Localism resists the idea that interviews are 

situations removed from social context and places emphasis on interpreting the narrative of an

interview as an account of a phenomenon that is situated in an empirical setting (242). The 

localist approach is considered to occupy the borders between structured, semi-structured, and 

unstructured interview styles and has the flexibility to function within each of these situations. 



3. Methodology 
 
This chapter details the chronological methods chosen during the research process; the 

subheadings reflect this progressive process. These methods were chosen to answer this 

project’s primary research questions through the codification of participant responses to 

controlled voice recordings with the aim of creating a taxonomy of vocal sounds which could 

be used as musical material. A mixed-method approach that draws from artistic research, 

qualitative data capture, and quantitative statistical analysis was used.  

 
As outlined in the introductory chapter, the researcher aimed to test and answer these questions 

through the process of musical composition utilising fragmented, digitally recorded speech

audio. The researcher chose to gather and analyse data collected from participant responses to 

incorporate the perspective of the listener in artistic practice and ground the conclusions of the 

research in a mixed-method approach.  

 
What follows is a brief description of each subsection contained within this methodology. It is 

presented as such to give the reader a broad overview of the methodological process before  

explaining the application of each method. 

 
Initial Research: Initial research phase and context surrounding linguistic literature. 

Creating Examples: The creation of audio examples. Explanation of each audio example and 

its rationale.  

Designing A Questionnaire: Methodology and rationale for the use of questionnaires. 

Qualitative Data Gathering: How and why qualitative data was gathered from participants. 

Data Capture and Transcription: Data processing for analysis. 

Thematic Analysis and Coding: Methodologies used to analyse the qualitative data.  

Quantitative Analysis: Analysing the qualitative data using quantitative methods.  

Compositional Methodology: Explores this project’s compositional process.  

 
3.1 Initial Research 
 
This research sought to understand how the speech sounds used in language communicate 

emotive meaning. What are the formal elements of one’s speech and how do they combine to 

express emotive meaning?  



 
The field of linguistics was important in the initial research, focusing on phonetics, phonology,

and morphology. An understanding of the phonetic sounds of the voice allows for the isolation 

of speech sounds based on established linguistic definitions. Hypothetically, any conclusion 

derived from the data gathered by this project could be applied to other

compositions utilising cut-up speech sounds.  

 
The digital recordings used by the researcher in the initial phase of experimentation comprised 

brief clips of speech describing the researcher’s surroundings, emotive readings from books, 

and expressive phrases attempting to embody feelings including anger, joy, and sadness. The 

editing process started by identifying phonetic sounds such as vowels and consonants to create 

audio clips that exhibited one “type” of vocal sound. Some of the experiments contained only 

the consonant part of a phrase, others were limited to only vowels, and some contained a 

combination of both.  

The aim of creating cut-up speech sounds was to compose clips that utilised the phonetic 

qualities inherent in the recordings to explore the timbral characteristics of the phonetic sounds 

in a musical context. The clips containing consonants appeared to have a percussive and 

somewhat jarring quality/timbre, while the clips containing only vowels appeared to have a 

more melodic and fluid timbre. The clips containing a combination of vowels and consonants 

seemed to create the feeling of interplay between the two timbres.  

 
The researcher’s initial listening experience was paradoxical in nature. On the one hand, the 

sounds were clearly vocal utterances that held onto enough sonic information to “hint” at 

semantic meaning, and on the other hand, the sounds were somewhat jarring due to their 

fragmented nature and resisted the superimposition of meaning. The juxtaposition was a 

successful result because, from a compositional viewpoint, the fragmented sounds appeared 

familiar enough to be understood as vocal sounds but had the potential to draw one deeper into 

the experience. In the attempt to ascertain any semantic or emotive meaning, the listener would 

have to “fill in the gaps”, thus encouraging interaction.  

 
The subjective nature of the insight gleaned from the initial experimentation is not to be 

overlooked. It is possible that a different listener, one that was not directly involved in the 

ideation, recording, and creative act of producing the vocal cut-up audio, may have reacted 

differently when exposed to the audio. Therefore, to test whether the researcher’s interpretation 



would be one shared by others, the project set about designing two situations in which 

qualitative responses to the cut-ups could be gathered through group surveys and one-on-one 

interviews respectively. The next section of this methodology explains how the cut-up voice 

recordings, referred to henceforth as “examples”, were created. 

 
3.2 Creating Examples 
 
Building on the experimentation described in the previous section, the researcher created a 

series of cut-up examples to be presented to participants to record and analyse their responses. 

By analysing responses, the researcher set out to create a musical taxonomy of the cut-up 

speech sounds that could be used to compose musical works.  

 
The examples were named for the chronological manner in which they would be played for a 

survey group (Example 1-26). The artistic methodology that was used to create these examples 

was influenced by William Burroughs and his use of the “cut-up technique” in his series of 

novels titled The Nova Trilogy (1961-1964).   

 
Twenty-six examples were created. The number of examples was influenced by the duration of the focus

group sessions; the first session was to last one hour, and the participants would need adequate 

time to complete their responses. Each example was allocated two minutes, totalling 52 

minutes of response time and roughly ten minutes for an intermission. The audio was recorded 

for the sole purpose of this project. The examples were created from recordings of five people 

of varying gender and ethnicity. The source material was recorded under conditions as similar as

possible, with the intention of minimising discrepancies in recording quality from

source to source.

 
Table 3.1: Demographic Description of Sources 

Source | Age | Gender | Ethnicity | Home Language
A | 24 | Female | Coloured | English
B | 25 | Male | Black | English
C | 25 | Male | White | English
D | 22 | Male | Black | Sesotho
E | 25 | Female | White | English

 

Sources A, C, and E were recorded in the researcher’s home studio with the aid of a vocal 

booth to minimise ambient noise in the recordings. Sources B and D, for logistical reasons, 

were unable to record in the same location. However, a brief outlining the recording setup (mic 

placement and performance notes) and content was supplied. Each source was asked to record 

a series of phrases in three emotive tones: happy, sad, and neutral. Suggestions were given for 

phrases, but each source was instructed to supply their own additional phrases if they wished, 

provided the emotive expression was in line with the above.

 
The speech audio was edited by removing one or more parts from the recording, resulting in a 

fragmented cut-up. Eight parameters defined the characteristics of the examples: pitch; timbre; 

attack; proximity; rhythm; articulation; contour; and emotion. 

 
Table 3.2: Parameters for Defining Examples 

Parameter | Definition | Implementation
1. Pitch | Is the example high-pitched, low-pitched, or neutral? | Cut-ups were edited in such a way as to draw attention to the pitch of the sounds.
2. Timbre | What is the “colour” of the sound? | Digital processing effects were used when necessary to alter the timbre.
3. Attack | Refers to how the initial transient of the sound wave unfolds over time. | What parts of the words were removed?
4. Proximity | Where is the sound spatially located in relation to the listener? | Source material: proximity to the microphone. Digital processing: delay.
5. Rhythm | Does the example employ a steady or uneven rhythm? | Choosing where cuts were made to create a frantic, steady, or stagnant rhythm.
6. Delivery | In what way was the speech spoken? Whisper, shout, etc. | Using sources that exhibit a variety of articulation.
7. Contour | How does the pitch change over time? | Create examples that exhibit a range of contours. Low-to-high, high-to-low, monotone, etc.
8. Emotion | What is the emotional import of the source sound? | Editing and processing in such a way as to either diminish or augment the original emotion of the speaker’s delivery.

 
Although a certain parameter was employed to guide the editing and processing of each 

example, speech audio inherently exhibits a number, if not all, of these parameters. The 

researcher is aware that the participants may not be drawn to, or compelled to respond to, the 

focal parameter of each example. 

Table 3.3: Breakdown of Examples¹

Example | Speaker | Original Phrase | Edited Phrase (spelt phonetically)² | Emotion | Parameter
1 | A | “Aw, are you sure there is nothing we can do?” | “aw-y-u-uh-e-an-oo” | Sombre | Pitch
2 | C | “Yeah, I understand. I just never thought it would happen like this.” | “ea-un-an-sne-ou-ha-en-ts” | Sombre | Timbre
3 | D | “Who do you think you are?” | “Ho-d-y-th-nk-y-ah” | Angry | Proximity
4 | B | “You just don’t understand.” | “na-ts-nu-ts-uo” | Neutral | Attack
5 | A | “I swear to God, if I have to ask you one more time!” | “i-swe-t-go-f-i-ve-t-k-o-mo-t” | Angry | Rhythm
6 | C | “Don’t you ever say I didn’t care. How dare you.” | “uo-ea-e-in-d-dy-a-sa-oy-tn” | Angry | Articulation
7 | E | “No ways, it’s been such a long time.” | “n-wa-s-it-s-een-uh-on-I” | Happy | Contour
8 | B | “You just don’t understand.” | “u-on-u-n-er-an” | Angry | Emotion
9 | D | “Ke motlotlo ka wena monna.” – I’m proud of you, man. | “ke-tl-tlo-ka-we-mo” | Happy | Attack
10 | B | “You just don’t understand.” | “dan-e-un-nd-u-nd-st-ju-ne-an-un-dst-a-d” | Happy | Pitch
11 | A | “Yay! I’m so happy for you.” | “ou-fy-pa-ay-y” | Happy | Articulation
12 | E | “Can I tell you a secret?” | “ca-te-se-cr-t” |  | Proximity
13 | E | “Yes, I understand. I just never thought it would happen like this.” | “sec-klou-is-men-ts-ch-na-nus-at-ts-du-ah-a-si” | Sombre | Emotion
14 | C | “Bag of bones that the fastest hand punctuated with fanfares interspersed even bleaker state of gloom although how could anyone console him.” | “ba-bo-st-at-th-fa-tst-a-ctu-ith-f-e-s-per-ev-n-eak-r-sta-of-gl-thou-ow-could-y-c-sole” | Neutral | Rhythm
15 | C | “Walking into the night, falling in the pool and yelling for help, but then you realise you are sleeping … f*#k.” | “wa-hat-oo-hel-ing-fu” | Neutral | Timbre
16 | A | “Oh my gosh, that’s so exciting!” | “o-y-o-a-o-e-i-ing” | Happy | Contour
17 (looped) | D | “I’m proud of you, man.” | “ou-o-yu-ma” | Happy | Emotion
18 | C | “I asked you to do it and now look what’s happened! This could all have been avoided!” | “i-ah-d-u-t-now-ook-wh-ts-hp-is-c-d-a-b-n-a-vo” | Angry | Articulation
19 | B | “You just don’t understand.” | “d-st-e-an-ts-ow-ts-oy” | Sad | Rhythm
20 | E | “Yes! Go! Woohoo!” | “ye-wo-go-hoo” | Happy | Attack
21 (looped) | D | “Ke masoabi haholo ho utloa seo.” – I’m so sorry to hear that. | “em-oa-o-tl-se” | Sad | Proximity
22 | B | “You just don’t understand.” | “s-e-an-st-a-an-s-e-an-st-st-a-a-an” | Happy | Timbre
23 | C | “Bag of bones that the fastest hand punctuated with fanfares interspersed even bleaker state of gloom although how could anyone console him.” | Unintelligible | Neutral | Pitch
24 | D | “Ha ke tsebe, na u lekile ho mo letsetsa.” – I don’t know, have you tried calling him? | “ha-ets-a-u-omo-ets” | Neutral | Articulation
25 (looped) | A | “mm mm, no, uh ah, we’re not doing that here!” | “m-ah-m-e-w-n-he-oh” | Angry | Emotion
26 | E | “I can’t believe it.” | Unintelligible | Happy | Contour

¹ See appendix A1.1 – A1.26 for a comprehensive breakdown of the artistic intention and processing applied to each example.
² https://drive.google.com/drive/folders/1Z48UKr37gaqX0TIiaxuhYVOGr5wWEshF?usp=drive_link

 
The phrases were chosen based primarily on whether the emotive delivery was convincing. 

The examples were assigned a parameter for the editing process based on the inherent attributes 

of the recordings. The nature of the selection process was subjective, and the researcher 

understands that the interpretation of the original recordings may not be consistent with other 

listeners. Examples 9, 21, and 24 use phrases spoken in Sesotho. The reason for doing so was 



to determine whether participants would be able to identify cut-up speech sounds of a non-

English language. 

 
The recordings were edited using Logic Pro X by selecting clips of the audio and isolating 

them using the software’s cutting tool. The audio clips that remained were edited using Logic’s 

fade tool to ensure that the beginning and end of each small clip were heard without any clicks

or pops which would emphasize where the audio was cut. 
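
The general principle behind this editing step can be sketched in Python. This is not a description of how Logic Pro X implements its cutting or fade tools. It is a minimal illustration, under assumed values, of why a short fade at each edge of a clip removes the abrupt jump in amplitude that is heard as a click. The function names `apply_fades` and `cut_clip`, the linear fade shape, and the sample values are all hypothetical.

```python
def apply_fades(samples, fade_length):
    """Return a copy of `samples` with a linear fade-in and fade-out.

    `samples` is a list of floats (one audio channel); `fade_length` is the
    number of samples over which each fade ramps.
    """
    faded = list(samples)
    # Never let the two fades overlap on very short clips.
    n = min(fade_length, len(faded) // 2)
    for i in range(n):
        gain = i / n  # ramps from 0.0 towards 1.0
        faded[i] *= gain        # fade-in at the start of the clip
        faded[-1 - i] *= gain   # fade-out at the end of the clip
    return faded


def cut_clip(samples, start, end, fade_length=4):
    """Isolate samples[start:end] and fade its edges to avoid clicks."""
    return apply_fades(samples[start:end], fade_length)


# A constant signal makes the shape of the fades easy to see.
recording = [1.0] * 20
clip = cut_clip(recording, start=5, end=15, fade_length=4)
print(clip)
```

In practice a fade would span a few milliseconds of audio rather than four samples; the short length here is chosen only so that the ramp is visible in the printed output.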

 
3.3 Designing a Questionnaire 
 
“Questionnaires offer an objective means of collecting information about people’s knowledge, 

beliefs, attitudes, and behaviours” (Boynton and Greenhalgh 2004: 1312). A questionnaire was 

created for the participants to supply feedback in the form of qualitative data. It takes the form 

of a standardised questionnaire, meaning each participant was given the same version of the 

questionnaire and exposed to the same stimulus (1313). Each example required the participants 

to answer four questions: (1) How would you describe this example? (2) Was this example 

happy? (3) Was this example fast? (4) Did you hear a word or a phrase?  

 
(1) How would you describe this example? 

This question was designed to give the participants an opportunity to respond in an open-ended 

manner, allowing for free responses (Boynton and Greenhalgh 2004: 1313). By providing an 

open-ended question to the participant for each example, the research aims to see how people 

respond to the audio without prompting. This question occurs first of the four questions because 

the researcher wished to gather the participant’s initial response in as much detail as possible. 

If placed in a different order the participant may not have had enough time to provide their full 

insight. This question supplied the bulk of the qualitative data gathered from the participant 

responses. 

 
(2) Was this example happy? 

This was one of two Likert scale questions asked for each example. A Likert scale is “a rating 

system, designed to measure people’s attitudes, opinions, or perceptions” (Jamieson 2023) and 

is commonly used to gather data for social and educational research (ibid.). The Likert scale 

offered five responses to choose from: Strongly Disagree; Disagree; Neutral; Agree; Strongly 

Agree. Both question two and question three are rating scales and therefore produce data that 



allows for quantitative statistical analysis (Boynton and Greenhalgh 2004: 1313). The rationale

for using a Likert scale was to derive quantitative data that may reveal the extent to which a 

participant agreed with the assertion that the given example was “happy”. A five-point Likert 

scale is the most prevalent variation of the scale, as a larger number of points, such as seven, 

has been shown to discourage participants from selecting the extreme positive or negative options,

and a four- or six-point scale produces a broad positive or negative response when considering 

the entire dataset (Jamieson 2023). The prompt, “happy”, asks the listener to focus on the way 

the audio is delivered and any emotive information present.  

 
(3) Was this example fast? 

The Likert scale for this question was used to measure how the participants responded to

the question of whether the given example was “fast”. The rationale for using “fast” was to 

draw the listener’s attention to a parameter of the sound, not the semantic or emotive meaning 

of the utterances heard. The Likert scale offered five possible responses to the question: 

Strongly Disagree; Disagree; Neutral; Agree; Strongly Agree. 

 
The Likert scales used for questions two and three supplied the researcher with a means to 

establish a central tendency for a given example by calculating the mean and mode of the 

derived data (Jamieson 2023). A Likert scale facilitates this kind of analysis because the five 

points of the scale are often given ordinal values from 1 to 5 (Jamieson 2023). An 

ordinal level of measurement means that the numbers have directionality: 1 is more negative 

or positive than 2, and so on (Jamieson 2023). The researcher does not treat the intervals as 

equal, as doing so implies the assumption that a response of 4 is twice as negative or positive 

as a response of 2 (Jamieson 2023). 
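
The calculation of central tendency described above can be sketched as follows. This is a minimal illustration only: the responses are hypothetical and are not drawn from the study's data, the function name `central_tendency` is invented for the example, and the coding of Strongly Disagree as 1 through to Strongly Agree as 5 is an assumed direction for the 1 to 5 values.

```python
from statistics import mean, multimode

# Assumed coding direction: Strongly Disagree = 1 ... Strongly Agree = 5.
LIKERT_CODES = {
    "Strongly Disagree": 1,
    "Disagree": 2,
    "Neutral": 3,
    "Agree": 4,
    "Strongly Agree": 5,
}


def central_tendency(responses):
    """Return the mean and mode(s) of a list of Likert responses."""
    codes = [LIKERT_CODES[response] for response in responses]
    # multimode returns every most-frequent value, so ties are not hidden.
    return {"mean": mean(codes), "modes": multimode(codes)}


# Hypothetical responses to "Was this example happy?" for one audio example.
example_responses = ["Agree", "Agree", "Neutral", "Strongly Agree", "Disagree"]
summary = central_tendency(example_responses)
print(summary)
```

The mode is reported as a list because an example may have more than one most frequent response. The mean is included because the text reports it, although it should be read with the caution about unequal intervals noted above.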

 
(4) Did you hear a word or phrase? 

The last of the questions dealt with whether a participant was able to hear what sounded like a 

word or a phrase in the example. The question was presented in two parts: (1) Did you hear a 

word or phrase? (2) If so, what did you hear? If the participant heard a word or phrase, then 

they would be able to write it out on a separate line. If a word or a phrase was heard, the 

subsequent analysis could aim to discover the cause of such a phenomenon and draw on the 

analysis to create similar sonic events in the researcher’s musical works. Part one of question 

four is a statement-based question in which the participant is asked to respond “yes” or “no” 

(Boynton and Greenhalgh 2004: 1313), while part two of question four is an open-ended 


question that allowed the participant to respond freely, similar to question one. Hearing the audio 

in a looped fashion when creating the examples, the researcher was able to interpret words and, 

occasionally, phrases that were not present in the original recordings. The question was 

included in the questionnaire to determine whether participants would respond in a similar 

manner based on brief exposure to the examples.  

 
3.4 Qualitative Data Gathering  
 
To test whether the emotive meaning of a recorded phrase withstood the rearrangement of its 

syntax (How does the act of composing with cut-up speech sounds change the emotive meaning 

for the receiver?), the researcher designed two contextual situations in which participant 

information could be gathered. Once gathered, the qualitative data could be analysed using 

qualitative and quantitative analysis methods to determine the nature of the participants’ 

experience. Should common themes be interpreted from the participant responses, a taxonomy 

could potentially be created and used by the researcher in their compositional practice. The 

degree to which the data could be used to create a musical taxonomy of speech sounds would 

depend on whether the responses from the participant survey and the expert interviews 

displayed significant correlation across all 26 examples.  

 
The survey data was gathered from two iterations of group listening. In both sessions the 

participants were played the 26 audio examples numerous times (the variation in time is 

discussed below), with breaks between playbacks for the participants to write their 

responses. The first iteration of the participant survey comprised non-music expert participants 

from the public (21 of 45 participants), while the second iteration was conducted with first-

year music students (24 of 45 participants), meaning that 53% of the sample set are regarded 

as having musical training. This bias is discussed in relation to the results (see p