Composing Excitement: Understanding the Visual Rhetoric of the Voice

Reality TV has been increasingly studied by media and cultural studies scholars and practitioners, becoming a subject of critique across the disciplines of communication. While many reality TV programs have been analyzed for racism, sexism, and the cultivation of violence, little has been done to explore the visual compositions of this genre in terms of the rhetoric of excitement or audience stimulation. Using a data-driven approach, this article examines the visual formats and presentational structures employed by a popular reality franchise vocal competition show, The Voice, as produced in China, America, and the United Kingdom. Through a comparative content analysis, the author examines the 2014 finale performances of each version of the show, focusing on their stage design, cinematography, and screen graphics. This study finds distinctions in the show's visual mechanism across the three countries, suggesting important insights into the diversity of visual characterization and the functions of visual rhetoric in a modern TV talent show.


Introduction
Candid Camera (1948) is often credited as reality TV's first practitioner. Since TV's golden age, and with the rise of MTV, reality TV has firmly entered mainstream popular culture.
In recent years, reality TV has given birth to a notable subset of sensational programming: the competition-based talent show. This newer format involves the elimination of one or more participants per episode, a panel of judges, and the concept of immunity from scripted elimination, leaving the audience in suspense over which contestant will stay or be defeated in the coming episodes. Evidently, this particular genre relies heavily on visual manipulation, from stage to screen, to stimulate audience excitement and prolong viewers' adherence to the shows.

The Voice Franchise
The Voice is a reality TV singing competition based on the original The Voice of Holland, created by John de Mol with Dutch singer Roel van Velzen and first aired in 2010.
The concept of the show is to find and "feature the country's best unknown artists" (NBC, 2014), age 15 and over, from public auditions. The show's format consists of three segments: blind auditions, battle rounds, and live shows. During the blind auditions, all judges (known as coaches) are seated with their backs to the stage. As the contestant performs, coaches who like what they hear and want to mentor the performer in the next stage push a button to turn around. This format presumably eliminates a coach's bias toward the appearance or personality of the performer. In the battle rounds, the advancing artists are mentored and developed by their respective coaches, and each contestant battles against another member of the same team while the coach decides who should stay in the competition.
This continues to the final, live shows, where the surviving contestants perform in front of the coaches, an audience, and a live broadcast. In the finale, the coaches have a 50-50 say with the audience and public votes.
One may wonder why this study focuses on The Voice franchise rather than other reality singing competitions, namely the Idol and Got Talent franchises. Based on IMDb's ranking, The Voice (USA) has the highest ratings in the US when compared to American Idol and America's Got Talent, while in the UK, Britain's Got Talent and The Voice UK received equal viewership. In the East, distinctions are harder to make between Chinese Idol, The Voice of China, and China's Got Talent. Given the available ratings, The Voice does match up with its predecessors as a new scheme of reality singing competition with its blind audition format. As the current study is interested in how the currently most popular reality singing show employs visual rhetoric, and given the diversity of its audience, The Voice franchise provides the most fitting dataset for the intentions of this study.

The Rhetoric and Visuality of Reality Shows
A few common questions that someone studying popular TV often gets are these: Why study them in the first place? What can we learn from studying a genre in which buzz and hysteria are deliberately constructed and heightened? How could such artificial culture, so mass-produced and consumed, possibly be considered worthy of intellectual analysis?
In some ways, these questions themselves imply the importance of investigating mass culture. By examining reality TV shows through a critical lens, we learn about our society at large, understanding how our social lives may be affected by the dominant class, those who are in the position of power to "control events and meanings" (Brummett, 2011, p. 4). In his book, Rhetoric in Popular Culture, Barry Brummett discusses how TV first and foremost serves as a commercial enterprise where advertisements are blended into its regular programming. As ads and TV programs increasingly employ similar formats, such as those of music videos and reality shows, any clear distinctions between the two are now blurred.
Eventually, the selling of commodities becomes gradually less separable from TV programming (Brummett, 2011, p. 199). Given such an intrinsic motive, an analysis or criticism of reality TV uncovers the underlying intentions of its visual composition in the service of commodification while exploring some central characteristics of reality programming.
Further, such analysis allows us to enter into the realm of visual culture studies, one that scrutinizes contemporary culture and "visual events in which information, meaning or pleasure sought by the public in an interface with visual technology" (Mirzoeff, 2002, p. 3).
By deliberately unpacking the visual compositions and design of effects in a reality TV show, we may arrive at a better understanding of the rhetorical construction of perception and the plethora of ways in which we experience reality as it is manipulated by visual tools and technologies.

Cultural Considerations
As W.J.T. Mitchell (1986, 1994) argues, images and language are intricately intertwined and the meanings of language are grounded in cultural reality, so we should evaluate cultural significance when studying visuals. In their pivotal work, Reading Images: The Grammar of Visual Design, Gunther Kress and Theo van Leeuwen (2006) argue that both language and visual communication are semiotic processes rooted in cultures specified by a society (p. 19). In other words, cultural perspectives are unique and crucial to consider in any design of visual stimulation. Stuart Hall, often regarded as the father of cultural studies, puts it aptly:
Culture, it is argued, is not so much a set of things - novels and painting or TV programmes or comics - as a process, a set of practices. Primarily, culture is concerned with the production and exchange of meanings - the 'giving and taking of meaning' - between the members of a society or group … Thus culture depends on its participants interpreting meaningfully what is around them, and 'making sense' of the world, in broadly similar ways. (Hall, 1997, p. 2)
Producers of mass, popular culture must hence work within "rigidly defined conventions," and adhere to the "rigidly defined values and beliefs of the social institution within which their work is produced and circulated" (Kress & Van Leeuwen, 2011, p. 115). To be critical consumers, viewers of reality TV must at least recognize these communicative intentions and the values and beliefs for what they are. This study serves as a springboard toward that ideal. Specifically, the author is interested in the following research questions:
3. Screen effects: On-screen graphics and animations.

Samples Selection
The author chooses to study versions of The Voice from three countries with the aim of discovering the diversity of visual rhetoric between western and eastern cultures. The author fully acknowledges that the selection of these specific versions of the show may be a limitation of the study. Nonetheless, given the size of the countries that these versions of The Voice represent and the viewership of these shows, the author is confident that the selected versions offer a platform for understanding the generic visual practices employed in the West and the East.
All visual data of this study are collected from the earliest season finale of each version of the show in 2014. To ensure consistency across all three versions of the show, this study focuses only on the visual composition and screen effects during a singer's performance in the finale episodes.
Online Journal of Communication and Media Technologies Volume: 7 -Issue: 2 April -2017

Primary Findings
After carefully watching all selected finale episodes, the author summarized the visual format and presentational structure of each version of the show.

Cinematography
To consider the patterns in camera work across the three The Voice productions, the author observed all the video clips selected for this study, marking the frequency of each basic camera shot and transition. The following tables display the instances of each camera action:

Discussion of Findings: The (Visual) Rhetoric of Excitement

From Stage to Screen: Editing as Creating
Thompson and Bowen, in their Grammar of the Edit, confirm that the editor is one of the last people in the creative chain of film production, the one who puts together the footage in a way that "makes sense, tells a story, gives information and/or entertains" (2009, p. 53). Thus, the editor is a storyteller who uses his or her technical knowledge and editing skills to craft a piece of rhetoric. In fact, the editor, as Thompson and Bowen contend, is responsible for making sure that the types of edits "fall within the accepted grammar of the program's genre," because when the editing style "falls outside the traditional audience's understanding," the program may not be well received (p. 108). Such responsibility reinforces my argument that the editor is himself or herself a creator of the show. What the audience sees on the TV screen is carefully crafted by the editor to achieve certain rhetorical functions, i.e., to narrate, to inform, or to entertain. This serves as the guiding lens for the rest of this analysis as the author seeks connections between the visual features in the three The Voice shows and their audiences' "traditional understanding" of culture and reality TV.
The first research question of this study seeks to identify key, distinctive visual features across all three of The Voice productions. In order to understand how these similarities and variances are possibly results of cultural differences, the author now turns to the second research question of this study:
RQ 2: What are some noticeable patterns of visual rhetoric across the three versions of The Voice and how are they tied to their respective cultural realities?

Red, White, and Blue
In "The Rhetoric of Black, White, and Red: Reasonability and Aesthetics to Persuade with Color," Jose Luis Caivano and Mabel Lopez (2003) discuss the use of colors in specific cultural contexts as a tool for persuasion. They propose that color can be "a privileged element" that works as "proofs" in persuasive type (p. 341). As observed and mentioned in the previous sections, the colors red, white, and blue seem to be the primary colors used in the visual composition of the three The Voice shows studied here. Such use is motivated by the desire of the producers and editors to relate to their audiences through national identification.
Turning to the national flags of America, the UK, and China, it becomes clear how red, white, and blue get woven into the shows. In both the US and the UK, the red-white-blue triad is used on the countries' respective flags. On the Chinese flag, by contrast, red and yellow are the colors used. This explains why blue is added as another primary color in The Voice UK and The Voice USA but not in The Voice of China, as the main audiences of the US and UK versions may be more closely attached to the red-white-blue triad than their Chinese counterparts. The manipulation of the colors in the shows is thus a way of keeping the audience engaged by appealing to their underlying (or even subconscious) identification with their national colors.

Close-up: Direct Address and Cultural Intimacy
Kress and van Leeuwen (2006) state that the relation between the producer and the audience is "represented rather than enacted" (p. 116). In the current case, the relations between the producer, the audience, and the performer are represented through the visual configuration of the screen. The home viewer is "imaginarily" put into contact with the performer through screen compositions that create "direct address" (p. 117). Having tallied all instances of camera shots and transitions in the three shows studied in this project, the author has found a gradual increase in close-up shots moving from the Chinese to the British and finally to the American version. Since westerners such as Americans and British people tend to look directly into each other's eyes when talking, conveying spontaneity, informality, and equality in exchanges (Carteret, 2011), the use of close-up shots was more frequent in The Voice UK and The Voice USA than in The Voice of China. These close-up shots may signify a strong contact with the performer, as the viewer is invited into a relation of affinity through a clear view of the performer's facial expressions and minute nonverbal gestures. On the other hand, as eastern practices tend to emphasize respect and humility through downcast and indirect eye contact (Carteret, 2011), fewer close-up shots were used in The Voice of China, moderating viewers' direct confrontation with the performer.
The notion of social distance, drawn from Edward Hall (1966), is also discussed in Kress and van Leeuwen's deliberation of the composition of the image frame. In a nutshell, close-up shots may perform more functions than just reinforcing social norms of intimacy: they can be rhetorically manipulated to insinuate different levels of acquaintance between the viewer and the performer, i.e., stranger, friend, or even the self. In addition, patterns of distance can also be used "to signify respect for authorities" (Kress & van Leeuwen, 2006, p. 126). By simply adjusting the fill of the screen, the sense of the personal and the impersonal, as well as the emotive and the detached, can be engineered. Yet, looking back at the tables of instances in the findings section, it does not seem that the producers have maneuvered the types of camera shot to achieve such goals, as the results show a uniform use of all shot types and transitions among all performers within each show.

Viewer Involvement and Perception of Reality
Continuing the discussion of viewer engagement, media scholars have argued that reality TV programs invite their audience to perceive the televised show as actual reality and to become involved with the show through increased identification with the performer and the show space (Godlewski & Perse, 2010). According to Kress and van Leeuwen (2006), the camera angle plays a role in differentiating viewer detachment from engagement with an image object (p. 136). They argue that a frontal angle tends to invite the viewer to see from the photographer's perspective, encoding the image-producer as involved with the represented participants. In this study, the author adds that the camera's shift of focus to the judges, the band, and the audience during the performances also increases the viewer's identification with the performer. In all three versions of The Voice, the camera focus was observed to shift to highlight the band in action, the judges' reactions, and the audience's emotions during the performances. Such framings may create a pseudo-reality among the viewers, as though they were there with the performers, seeing the para-actions from a first-person point of view. The result is a feeling of identification. The viewers are likely to believe that they are with the performer and that what they see is the actual portrayal of the performance, even though they are not watching a live performance. The danger in this, of course, is that if the viewers assume they can identify with the performer, they may forget that what they are watching could be a staged and edited event. The viewers may also forget the role of the producers and editors in selecting and broadcasting only certain segments of the shows. Nonetheless, when the three versions of the show are compared, there is no observable difference in the choices of focus on the live band, judges, and audience during the shows. This suggests that such framing may not be culturally influenced.

Implications and Conclusion
As the findings of this study suggest, reality shows like The Voice are carefully crafted for rhetorical purposes, driven by specific cultural realities. Producers and editors present images in ways that facilitate preferred viewings of the show, whether through stage setup, camera shots and transitions, or screen graphics. Several differences are observed between western and eastern framing practices, and this reinforces the importance of acknowledging the notion of audience in visual rhetoric. For future studies, critics of visual rhetoric should identify the kind of cultural logic or rationale behind the structure that orders images, for this can tell us a lot about how the images are rhetorical. Critics need to ask what sort of visual reality is being created and how the rules of that reality would affect its audience.
For filmmakers, editors, and producers, this study urges those who are in the position of creating visual artifacts to acknowledge that, inevitably, their cultural values and practices get woven into the production process. Such realization should help them gain a critical distance between themselves and the product, and to reflect critically on how their work affects the audience's involvement and perception of reality. In conclusion, this study has revealed how TV as a visual technology cultivates a sense of reality; and reality TV, as a popular genre, rides on such convenience to create adherence from its audience by manipulating the visual presentations of the shows. The study has noted that:
• Overall, the UK and US versions of The Voice (western) tend to be more minimalistic than the Chinese version (eastern).
• The selling of commodities becomes increasingly inseparable from the content of TV, and of these shows specifically, as evidenced by the screen graphics that invite audience participation and the presence of the sponsor's brand name on the physical stage.
• Colors remain an important visual feature that cultivates familiarity among a specific group of audience, as they may identify themselves with the colors they are used to seeing or feel attached to.
Yet, all studies have limitations. Some limitations pertaining to this study include the narrow selection of samples, due to the lack of access to the complete series of the shows, and the lack of prior studies in this area to provide a validated methodology and/or theoretical framework for research. The latter, however, presents this study as an important springboard to further research in this area. Additionally, since this study has relied solely on observations recorded by a single independent researcher, the data may contain biases arising from selective viewing and from the author's own cultural values. This calls for a multi-coder research design in future studies to increase the reliability of the data and findings.
Notwithstanding these shortcomings, this study has identified several distinctions in The Voice's visual mechanism across China, the US, and the UK, suggesting new understandings of the diversity in frame composition and the functions of visual rhetoric in a modern reality TV franchise.
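As one way to picture the multi-coder design proposed in the limitations above, the following is a minimal sketch, not the procedure used in this study. It assumes two coders have each labelled the same sequence of shots, tallies the frequency of each shot type, and computes Cohen's kappa as a chance-corrected measure of their agreement. The function name, the shot labels, and the sample data are all hypothetical and are not drawn from the study's findings.

```python
from collections import Counter


def cohen_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders who labelled the same sequence of shots."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must label the same non-empty set of shots")
    n = len(coder_a)
    # Observed agreement: share of shots given the same label by both coders.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Agreement expected by chance, from each coder's label frequencies.
    freq_a = Counter(coder_a)
    freq_b = Counter(coder_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        # Both coders used one identical label throughout; agreement is total.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical labels for six shots (illustrative only, not the study's data).
coder_a = ["close-up", "medium", "long", "close-up", "medium", "close-up"]
coder_b = ["close-up", "medium", "long", "medium", "medium", "close-up"]

shot_counts = Counter(coder_a)          # frequency of each shot type for coder A
kappa = cohen_kappa(coder_a, coder_b)   # chance-corrected agreement
print(shot_counts, round(kappa, 3))
```

A kappa close to 1 would indicate that the two coders classify shots almost identically, while a value near 0 would indicate agreement no better than chance; which threshold counts as acceptable is a design decision left to the researchers.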