Since 2002, every time Academy Award nominating season rolls around, it is guaranteed that journalists will once again raise the question, “When Will a Motion Capture Actor Win an Oscar?” (Hart). This discussion began when New Line Cinema, the company that released Peter Jackson’s The Lord Of The Rings: The Two Towers, made a concerted effort to garner a Best Supporting Actor award nomination for Andy Serkis, the British actor who played Gollum in the film–a nomination that was not forthcoming. Twentieth Century Fox would repeat the gesture in both 2012 and 2014, again on behalf of Serkis, this time for his performances as the ape Caesar in Rise Of The Planet Of The Apes and Dawn Of The Planet Of The Apes. Again, there were no nominations, even though critics and audiences alike have praised all of Serkis’s performances as Gollum and Caesar, for which he has won other awards.

Before proceeding any further, I want to address the fact that I just described Serkis, who has become the most famous actor to work extensively with performance capture as well as a staunch advocate for the practice, as having played Gollum and given a performance as Caesar. Serkis has said, “Reviewers have a strange way of describing my performances. […] They’ll say things like, ‘Serkis lent his voice to’ or ‘inspired the emotions’ or ‘lent his movements to’ or ‘emotionally retained the backbone of,’ as opposed to ‘performed the role.’” 1

A film actor myself, I choose to pay Serkis the respect of avoiding such circumlocutions and describing him simply as an actor who plays roles through performance capture, even though the status of motion capture performance as acting is precisely the question at issue.

“The Gollum Problem” is how Derek Burrill summarizes the questions raised about performance capture and film acting and the reluctance of the Academy of Motion Picture Arts and Sciences (AMPAS, the body that awards the Oscar), to recognize what Serkis and others do as acting. He captures the debate in a series of questions: “Was Serkis present enough in the performance? At what point is something too digitized? If something is partially digitized, what of its ontology, its presence? Can someone (or something) perform, in the traditional sense, in the digital?” 2

These are far-reaching questions that go well beyond the realm of film acting, to the general status of performance in the Digital Age. Here, I will restrict myself to examining the problematic status of performance capture in film acting primarily through Burrill’s first and third questions–those concerned with presence and ontology. My point of departure is yet another question: Why is it that the voting membership of the Academy, made up of film industry professionals of all types, is apparently unwilling to nominate actors who perform through performance capture?

Note that there is nothing in the Academy’s rules governing the Oscars to prevent actors working in performance capture from being nominated for acting awards. In fact, when the Academy modified its definition of an animated film in 2010, it specifically included new language to the effect that “motion capture by itself is not an animation technique.” 3 This clause can be interpreted legalistically as a way for the Academy to shift films using performance capture out of the animation category, thus paving the way for such performances to be considered in the acting categories.

It is the case, however, as an LA Times study of 2012 revealed, that the nearly 6,000 voting members of the Academy

“are markedly less diverse than the moviegoing public, and even more monolithic than many in the film industry may suspect. Oscar voters are nearly 94% Caucasian and 77% male, The Times found. Blacks are about 2% of the academy, and Latinos are less than 2%. Oscar voters have a median age of 62, the study showed. People younger than 50 constitute just 14% of the membership.” 4

It would not be surprising if this group were to prove to be conservative in its understanding of what film acting is, and what kinds of things that show up on screen should be eligible to be awarded as such. However, the fact that the Academy’s membership is dominated by aging white men in no way explains exactly why performance capture may be objectionable from their presumed conservative standpoint.

Discussing “the Gollum Problem” in 2003, Ivan Askwith suggests that Serkis was not nominated because Academy voters could not see Serkis in Gollum, just as they could not see John Hurt in the title character of The Elephant Man, a performance with which Serkis’s are frequently compared.

Although nominated for best actor, John Hurt did not win in 1980, a loss he has partly attributed in recent interviews to the fact that the audience could not recognize him at all. It seems almost certain that Serkis, had he been nominated, would face the same problem.

Defining the problem in terms of the actor’s recognizability, Askwith frames it as a problem of iconicity, to use a semiotic term. Recall C. S. Peirce’s definition of an iconic sign or icon:

“Most icons, if not all, are likenesses of their objects. A photograph is an icon” 5 (emphasis in original).

Gollum is not a likeness of Serkis–they look almost nothing alike. Bert States suggests that we need to be able to perceive the actor in the character in order to appreciate the performer’s artistry:

“We always recognize Olivier in Hamlet or Olivier behind the dark paint of Othello. […] When the artist in the actor comes forth, we are reacting to the actor’s particular way of doing his role.” 6

This process is short-circuited, however, if we cannot recognize Serkis in Gollum, leading to a situation in which the performance is not perceived as creditworthy and therefore not eligible for an award. I use the term creditworthy in the way Stan Godlovitch does when talking about musical performance, to mean that we as audience can “read back from what [we perceive of the performance] to the [performer’s] creditworthiness and hence skilfulness.” 7

Gollum’s lack of iconic resemblance to Serkis seriously inhibits this process of reading back, and therefore appreciation of the actorly skill Serkis brought to his performance.

However, iconicity is not the only semiotic dimension of the Gollum Problem. I shall argue here that although it is reasonable to characterize the actor as an iconic sign for the character he or she plays, there is a deeply embedded historical discourse around film acting that sees the actor’s gestures and, especially, facial expressions on screen not primarily as iconic signs for the character’s gestures and expressions, but as indexical signs of the actor’s interiority. In what follows, I will investigate the indexicality of film acting, first by examining the relationship of the Gollum Problem to current debates over the ontology of digital imagery, then by contextualizing it in relation to entrenched assumptions about the nature of film acting. First, however, I will offer a brief overview of performance capture as a technology and process.

What is Performance Capture?

Commentators regularly trace the history of motion capture back to the studies of animal and human motion through sequential photographs undertaken by Eadweard Muybridge in the last quarter of the 19th Century. Fast-forwarding to the motion picture era, another important precedent for motion capture is the rotoscope, a device invented in 1915 by the celebrated animator Max Fleischer, whose studio produced both Betty Boop and Popeye cartoons. The rotoscope allowed animators to draw over filmed performers by hand to transform them into cartoon characters, as the Fleischer Studio did with band-leader Cab Calloway, who appears as a ghostly dancing walrus in the 1932 animated Betty Boop short, Minnie The Moocher. This technique is still in use.

Although the field began to develop rapidly in the late 1980s, electronic motion capture actually began in 1967 with Lee Harrison III’s Scanimate. For the first time, a performer wearing a “data suit” could directly control the movements of an animated figure on a television screen through her own movements. 8 This remains the basic procedure of motion capture to this day: a performer is outfitted with sensors at a number of points on the body. As the performer moves, high-resolution digital cameras track the displacement of the sensors and relay this information to a computer. The data can then be used to animate an onscreen CGI (Computer Generated Imagery) figure created by designers and animators, as was the case for Serkis’s portrayals of Gollum, Caesar, King Kong in Peter Jackson’s 2005 film, and Capt. Haddock in Steven Spielberg’s The Adventures Of Tintin. Tom Hanks portrayed many of the characters in The Polar Express through motion capture and CGI. It is the combination of motion capture data extracted from an actor’s performance and CGI that allows the screen to be populated not only with fantastic creatures but also with digital doubles that resemble actors or with huge crowds of extras that exist only as data. A digital double performed some of Val Kilmer’s stunts in Batman Forever; the assembled masses in Ridley Scott’s Gladiator were constructed from motion capture data taken from 2,000 human performers.

Originally, motion capture systems captured only full-body movements. Around 2005, motion capture systems that allowed the capturing of facial expressions came into use. At this point, motion capture (or MoCap) was renamed as “performance capture” (or PerfCap) because the sense was that the technology could now engage with a fuller range of an actor’s expressive capabilities. Because motion capture still does not deal well with hands and feet, a sub-specialty has evolved called “hand-overing” (the analogy is with voice-over) in which data captured separately from hand movements is added to full-body motion capture.

An important innovation in performance capture occurred between The Two Towers (2002) and James Cameron’s Avatar (2009), another film to make extensive use of the technology. Serkis’s first portrayal of Gollum entailed

“a three-stage process. […] First, Serkis would perform his scenes on set with the other actors, the same scenes would then be performed again without Serkis, and Serkis would finally replicate these same scenes back in the WETA studio while wearing his motion-capture suit and markers.” 9

(Weta Digital is the New Zealand effects company founded by Jackson that is currently at the cutting edge of PerfCap technology.) Ultimately, the performance seen on screen was constructed from the data collected during the studio sessions and added into the film. Technological innovations developed for Avatar made it possible to largely eliminate the second and third stages since enough motion capture data could be extracted from the actors when they were performing with one another to go straight into animating their virtual characters. This streamlined process is the one used for the recent Planet Of The Apes movies and has strengthened the arguments Serkis and others make to defend the idea that performances created through performance capture should be understood simply as acting and, therefore, worthy of Oscar consideration.

Digital Images, Motion Capture, and the Question of Indexicality

The issues raised by performance capture overlap with questions raised about the general differences between analog (chemical) photography and digital photography. One of the chief debates in this area surrounds the question of indexicality. Whereas Peirce defines icons as referring to their objects through resemblance, he defines the index as “A sign […] which refers to its object not so much because of any similarity […] nor [by association] […] as because it is in dynamical (including spatial) connection […] with the individual object.” 10

Analog photography is thus both iconic and indexical in that photographs resemble their subjects, they derive from a spatial relationship between those subjects and the camera, and they point to the subjects’ existence in the real world. In order for something to be photographed, it has to have been present before the camera. A painting of a unicorn, for example, is iconic but not indexical because we recognize it as a likeness of a unicorn, but it does not point to a real entity that actually posed for the painting.

D. N. Rodowick argues that whereas we perceive analog photography as “the automatic transcription of a past state of affairs” and therefore as indexical, with digital photography

“… the logic through which indexicality is effected changes fundamentally. Digital capture involves a discontinuous process of transcoding: converting a nonquantifiable image into an abstract or mathematical notation. In digital capture, the indexical link to physical reality is weakened, because light must be converted into an abstract symbolic structure independent of and discontinuous with physical space and time.” 11

Others argue, however, that indexicality persists in digital images. Tanine Allison, in one of the very few academic articles on performance capture published to date, stakes out this position: “I consider motion capture to be an example of ‘digital indexicality,’ a blend of computer-generated images and material recorded from reality.” 12

Allison appears to have borrowed the concept of “digital indexicality” from Philip Rosen. Without disputing the ontological distinctions between analog and digital photography emphasized by Rodowick and others, Rosen argues that digital images can nevertheless function in the same manner as analog photographs by asking,

“How is it that a digital camera can be sold not as a displacement, but as a replacement for a conventional still camera? How is it that, in the appropriate context, a digital photograph may take on the functions of an indexical photograph, such as family testament or photo-journalism? The answer is that the digital camera does not necessarily exclude all operations of a photochemical camera; it even uses a lens to gather light. […] [I]t retains indexical import as a light-sensing device. […] This means that the digital camera is a machine for configuring “pure data,” according to a certain range of prior pictorial norms, namely those identified with photography. But this does not make it nonindexical.” 13

By shifting the ground of the debate from ontology to function, Rosen is able to make a persuasive case that we habitually use digital media to replicate the indexicality of analog media. But this argument is hard to sustain in the context of performance capture.

At the level of the technology, it is significant that so-called motion capture “cameras” are not “light-sensing devices” in the same sense that photo-chemical or even digital still cameras are. As described by Qualisys, a manufacturer of motion capture equipment, these cameras “emit infrared light onto the markers [on the performer’s body] that reflect the light back to the camera sensor. This information is then used to calculate the position of targets with high spatial resolution” (“Motion Capture”). In order to have enough data to reconstruct movement in three dimensions, each motion capture subject has to be shot from at least three different angles simultaneously. A motion capture camera is not, then, a machine for configuring “pure data”—it is a machine for generating the pure data from which the performer’s movement can only be configured by using another data-driven machine, a computer. In short, the motion capture camera does not capture anything that resembles an image in itself, as the digital camera can still be argued to do. As David Saltz observes,

“Motion capture simply records the changing positions of a discrete set of points on the performer’s body. […] The real product of a motion capture session is simply a long list of numbers.” 14

Describing motion capture as “a performance extractor,” Saltz further points out that

“motion capture, in effect, captures verbs without subjects, performance without identity.” 15

Whereas an analog film or video recording of a performer in motion both iconically represents that person and indexically points back to the person whose movements were recorded, motion capture captures only the motion, not the person, and therefore does neither. Rather, it renders what were once movements produced by a specific subject as abstract information. This data set only becomes an image when the data is used to animate a figure on a screen, and this figure, understood as a sign, does not point directly back to reality. This image neither is, nor even can be perceived to be, an “automatic transcription” of some real occurrence.

The effect Rodowick ascribes to digital photography, that of weakening “the indexical link to physical reality” is thus exacerbated in the case of performance capture. Commenting on Peirce’s definition of the index, T. L. Short observes that the

“relation of index to object depends on the existence of the latter. As [Peirce] wrote in 1902, ‘An index is a sign which would, at once, lose the character which makes it a sign if its object were removed.’” 16

This removal of the object of the sign is precisely what happens through performance capture—the performer, whose actions would normally ground the sign in reality, disappears, leaving behind only a trail of numerical data. The indexical link is thus damaged beyond repair. As an indexical sign, the only object to which a figure on screen created through performance capture could possibly point is the motion capture data set.

The Indexicality of Film Acting

The disappearance of the performing subject in performance capture is even more problematic when considered in the context of film acting. Actors on screen are iconic signs for the characters they portray, of course. But there is a deeply entrenched historical discourse around film acting that defines how it is understood among people who work in the medium. In the terms of this discourse, actors on screen are indexical signs that point to the actor’s own interiority, to the thoughts and emotions that underpin the actor’s work. This discourse can be traced from the early days of film theory up to present day acting pedagogy.

Both Béla Balázs and Siegfried Kracauer, notable names in the history of film theory, contributed to the discourse in question through their joint emphases on the camera’s ability to do more than reproduce the surface of reality. According to both, the camera penetrates below the surface to reveal hidden truths, including those lurking in the psyches of actors.

“Close-ups are often dramatic revelations of what is really happening under the surface of appearances. […] They show the faces of things and those expressions on them which are significant because they are reflected expressions of our own subconscious feeling.” 17

Balázs famously described the power of the camera by saying,

The language of the face cannot be suppressed or controlled. However disciplined and practisedly hypocritical a face may be, in the enlarging close-up we see even that it is concealing something, that it is looking a lie. For such things have their own specific expressions superposed on the feigned one. It is much easier to lie in words than with the face and the film has proved it beyond doubt. 18

Balázs implies that since no one has the ability to completely control facial expressions and what they communicate, film actors must be able to perform in ways that will not be unmasked as false by the camera. Kracauer expresses a similar idea:

The film actor is less independent of his physique than the stage actor, whose face never fills the whole field of vision. The camera . . . reveals the delicate interplay between physical and psychological traits, outer movements and inner changes. [M]ost of these correspondences materialize unconsciously. 19

More directly than Balázs, Kracauer proposes that since the camera reveals connections between the film actor’s interiority and an external appearance that the actor does not fully control consciously, the actor’s primary concern is with achieving an inner state that will result in the appropriate external appearance. This suggests, in turn, that the actor’s facial expressions and emotive gestures on screen are indexical signs pointing to the actor’s internal psychological or emotional state.

Although Balázs and Kracauer both published their definitive works of film theory after the Second World War, their formative experiences of cinema occurred in the 1920s in Germany. As Cynthia Baron has shown, with the arrival of sound in film, the Hollywood studios became increasingly interested in training actors, resulting in the establishment of a number of acting schools around Los Angeles in the 1930s and 1940s. The studios actually ran some of them, while others were independent but nevertheless served the film industry. The principles taught in these schools derived largely from those imparted by the first generation of European and American actors and acting teachers to bring versions of the Stanislavskian approach to the United States, such as expatriates like Maria Ouspenskaya and Richard Boleslavsky, and figures associated with the Group Theater like Morris Carnovsky (a founder of one such school, the Actors’ Laboratory in Los Angeles, in 1941) and Stella Adler.

The precepts governing the way film acting was taught in these schools often reiterate assertions concerning the nature of the apparatus found in early film theory. For example, Josephine Dillon begins her book Modern Acting: A Guide For Stage, Screen And Radio (1940), which became a standard text, by discussing the implications of the close-up for acting and the consequent importance of the actor’s eyes. 20 Lillian Burns, who coached actors for MGM, describes the camera as “a truth machine.” 21 Comparing acting on stage with acting on film in a 1949 essay for the magazine Theater Arts, the actor Hume Cronyn, who was associated with the Actors’ Laboratory, states,

“the camera will often reflect what a man thinks, without the degree of demonstration required in the theater.” 22

One of the main acting techniques advocated by many of these teachers was the inner monologue, a silently verbalized stream of the character’s fictional consciousness. Dillon, in her book, describes the value of this technique to the film actor:

Since the expressions of the eyes in acting are to represent the emotions of the part played, the actor should, in studying the part, improvise the probable mental conversations of the person portrayed, and memorize them as carefully as the written dialogue. Then the eye expressions will be right; they will be convincing or believable. [A]lso, if the actor is thinking actual words during a close-up in which there is no spoken dialogue, the timing of his glance and of the changes of expression in his eyes will be perfect. 23

In other words, the actor’s external expressions arise from maintaining the appropriate internal state. Or, as Burns so succinctly puts it,

“You cannot say ‘dog’ and think ‘cat’ because ‘meow’ will come out if you do.” 24

In this way, the aspects of the film actor’s performance perceptible to the audience serve as indexical signs pointing to a process taking place inside the actor that gives rise, perhaps unconsciously and beyond the actor’s full control, to the facial expressions and gestures that render the character.

This way of thinking about film acting persists to the present day, particularly in the context of actor training. Jeremiah Comey declares in the pages of The Art Of Film Acting, a textbook published in 2002, that

“the camera sees everything and demands absolute truth. […] The camera sees what you truly are at the moment.” 25

He elaborates:

All acting in film takes place in close-up . . . where the camera sees your emotions. You can’t hide anything because the camera sees everything—fear, happiness, anxiety, lack of confidence, nervousness, whatever is going on inside you. On film you cannot “act” in love, you have to be in love. 26

I can attest from my own current experience as a film actor that one strong belief remains dominant: if the actor is feeling the right thing, the camera will pick it up more or less automatically. I have heard both directors and casting people regularly say things along the lines of,

“If you are feeling it, I will see it.”

It is clear that this understanding of film acting, in which the surface manifestations of the actor’s work are indexical signs pointing to what is happening in the actor’s interior in the moment of performance, challenges those who would argue for seeing performance capture as acting. The reason is self-evident: we as audience cannot read back from the gestures and expressions of Gollum or the ape Caesar to their internal states since CGI figures possess no interiority: they are simply particular ways of displaying data on a screen. In defending performance capture as a kind of acting, it therefore becomes necessary to argue that the actor’s expressions and gestures can be successfully captured and transferred to the CGI figure, to be read either as indexing this figure’s presumed emotional state or pointing back to the actor whose performance was captured. As Allison points out, this is precisely the tack taken by Serkis and Jackson in their attempt to frame Serkis’s performance as King Kong as acting:

Publicity materials pretend the translation of bodily movement from Serkis to the Kong digital model is a seamless, almost magical process, a supernatural translation from one body into the other. […] Jackson and Serkis’ comments about Kong acknowledge the character’s computer-generated exterior, but insist that Serkis’ performance “carries over” through the technology. For them, motion capture provides not only a record of Serkis’ movements that are transferred to the Kong digital puppet, but also a way to “capture” Serkis’ emotion and the psychological aspects of his performance. 27

Although Allison acknowledges through her use of the word “pretend” that the position adopted by Jackson and Serkis is problematic, she goes on to advance a similar argument by saying,

The character of Kong stands as both an instantiation of the indexical that takes the form of animation (Serkis’ recorded movement animates Kong) and an instantiation of animation that takes the form of the indexical (a digitally constructed creature that literally takes the shape of Serkis’ body in motion). 28

The difficulty is that Kong does not “literally take the shape of Serkis’s body in motion” and, in fact, could not possibly do so. This is not because of some deficit in the technology that will eventually be overcome. Rather, it has to do with the difficulty of transferring movements across entities with different physiognomies. As Marc Boucher points out in an essay on the use of motion capture in dance, since

“a giraffe does not move like an anteater, mapping motion data from one onto the other results in something that is physically impossible because of the extreme anatomical and morphological variance.” 29

Even though apes and humans are physically much more similar to one another than are a giraffe and an anteater, the problem persists, especially when it comes to the subtleties of facial expression. An article on the technical dimensions of performance capture in Dawn Of The Planet Of The Apes describes the process of constructing the facial expressions of the CGI apes as one of reworking the actors’ expressions for the structure of an ape’s face–a laborious process that is partly achieved automatically by software and partly manually by animators:

the capture is stabilized, and then translated not only from human to ape facial shape, but also into a system that allows the animators to adjust lip sync, correct expressions and better match the acting choices of the MoCap artists when shown on an Ape face. […] It is also possible for certain combinations of blend shapes or expression to be created by the data and or the hand animation. These are not character achievable or “off-model.” When this occurs a corrective shape is automatically triggered that avoids the bad combination and produces a substitute expression. 30

In other words, the expressions we see on the screen apes’ faces are not identical to those Serkis and the other performers made when captured, but had to be adjusted to fit the shape of an ape’s face. Furthermore, if an expression cannot be made to work on the CGI figure, a different one, generated by the software, is automatically used.

This points to the other major problem with the argument that CGI gives us direct access to gestures and expressions captured from live performers: it completely elides the contributions of the animators and designers who intervene in the process by using the motion capture data to create the characters we actually see. There are, for example, many compelling “before and after” style video clips that place footage of actors in motion capture suits side-by-side with the corresponding scene from the finished film. Such presentations make it seem as if the transfer is fairly direct, but only by leaving out everything that had to happen to get from one state to the other. Serkis himself has been roundly criticized for pronouncements in favor of seeing performance capture as acting that often seem to either ignore or denigrate the animators’ work. The intervention of animators between the actors whose performance is captured and the figure we ultimately see on screen considerably muddies the question of indexicality. Precisely what does this figure point to: its own fictional interiority? That of the actor? Or the work of the animators who actually created it? Commentators like Allison would like to have it all ways, but this is not viable in relation to a tradition of acting that insists that the facial expressions of the entity we have seen on screen in close-up are indexical signs for processes taking place within the same entity at the time of filming.


I trust it is clear that it has not been my purpose here to take a stand on the question of whether or not the Academy should consider granting Oscars to performance capture actors, a question on which I remain ambivalent. I have used the Gollum Problem as a heuristic for attempting to think through the question of why the idea that such performances should be accepted unproblematically as instances of screen acting is controversial. I have pursued this inquiry along two lines, both having to do with the indexicality of the screen image. Following the first line, I suggested that although digital photography has put the ontology of the image into question, arguments that digital images can at least function indexically are credible. Such arguments do not, however, extend to the context of motion capture, where the technology undermines indexicality by divorcing performance (in the form of movement) from the materiality, identity, and presence of the performer.

Following the second line, I proposed that film acting today is defined by a deeply entrenched discourse whose history can be traced through both film theory and practice, a discourse that posits that the gestures and, especially, facial expressions we see actors make on screen are indexical signs for processes taking place within their minds and psyches–whether those processes are understood as spontaneous emotional reactions or constructed inner monologues. Inasmuch as CGI figures possess no interiority, they cannot be perceived this way, and arguments that the expressions of the actor whose performance was captured and the interior states to which they point simply transfer to the CGI figure are highly questionable in light of the design decisions and technical routines involved in bringing the two together into a screen performance.

I am not suggesting that the discomfort around the idea that performance capture in film should be seen as acting, of which the Academy’s reluctance is symptomatic, is necessarily permanent. I think it is quite possible that in the not-too-distant future we as audiences, and perhaps even the Academy, will find a way of understanding and classifying such performances. However, Andrew Darley describes the current situation accurately:

A certain sense of ambiguity haunts the imagery of such films. This is related to uncertainty as to the status of the imagery in terms of origination: is it cartoon animation, three-dimensional (puppet) animation, live action or, perhaps, a combination of all three? One becomes fascinated with the imagery precisely because of this uncertainty, seduced by the ways it recasts, amalgamates and confuses familiar techniques and forms. 31

There is pleasure in this seductive ambiguity, but it also makes it impossible for us to read back from what we see on screen to the creditworthy actions of those who created it. Returning to one of Burrill’s foundational questions–“Can someone (or something) perform, in the traditional sense, in the digital?”–the answer, at least where performance capture and film acting are concerned, is “No.”

When we do reach the point of creating a category for motion capture/CGI performances, I suspect we may not perceive them as a species of screen acting at all. Saltz suggests that “motion capture is, at root, a form of virtual puppetry” 32 and he is not alone in characterizing it in this way. David J. Sturman, in an essay that chronicles early uses of motion capture for entertainment purposes, describes it as “computer puppetry” and notes that puppeteers make better performers in this medium than actors or mimes. 33 Even Serkis has said that thinking of his relationship to Gollum as that of a puppeteer to his puppet made it easier for him to perform using motion capture. 34 Since AMPAS does not give awards for puppetry, and performers like Serkis would much prefer to think of themselves as actors rather than puppeteers, this characterization of perfcap is probably not going to get much traction in Hollywood. Arguably, it nevertheless provides a more accurate account of the process than seeing these performances as constituting a variety of screen acting.

Comparing CGI figures like those in the Planet Of The Apes movies with traditional physical puppets, Donald Crafton writes,

“such films use […] motion capture to choreograph puppets (or puppetlike simulations) without wires or other visible human agency so artfully that they seem to be animals performing on their own. […] In their ability to affect audiences, these puppet figures have their own agency.” 35

This claim is part of Crafton’s larger argument that audiences experience animated figures on screen not as performances by animators but as performers in their own right–a claim that applies equally to performances like Serkis’s and to the classic animation Crafton primarily discusses.

Although I, too, wish to emphasize the autonomy of perfcap figures, the idea that we perceive these figures on screen not primarily as performances by actors or animators but as performers themselves, I will couch the point in different terms. In his book The Play Of Nature, the philosopher Robert Crease offers an extended analogy between scientific experimentation in laboratories and theatrical performances as a means of developing a philosophical account of experimentation. Scientific theories, in his view, are analogous to play scripts in that both are merely “abstract testimonials to the possibility of the presence of phenomena.” 36 In the case of theater, the phenomenon is the world of the play populated by characters described in a script. In the case of experimentation, the phenomena are scientific events hypothesized in theory. In both cases, the object of the performance, whether on stage or in a laboratory, is to bring the phenomenon forth, to make it present and perceptible to what Crease repeatedly identifies as “a suitably prepared” audience. 37

Perfcap CGI figures bridge Crease’s two categories of phenomena. On the one hand, they are characters, inhabitants of fictional worlds described in dramatic texts (film scripts). On the other hand, while they are not scientific phenomena strictly speaking, they are technological phenomena of some significance. Coaxing them into existence requires just as much theory, apparatus, and personnel as the staging of any major scientific breakthrough, such as the identification of a particle that may be the “Higgs boson” at the CERN (Conseil Européen pour la Recherche Nucléaire) laboratory in Switzerland in 2012. Crease’s view of performance is expansive: beyond the immediate procedures and products on view to an audience, he emphasizes the significance of production, which he defines as all of the decisions, materials, locations, institutions, funds, and people involved in revealing the phenomenon. From this perspective, the perfcap figures we see on screen do not appear as performances by individual actors but, rather, as techno-dramatic phenomena brought forth through a complex process of production that includes actors, writers, designers, animators, and many others, working together. The resulting phenomena appear to their audiences as largely autonomous figures with their own agency, as Crafton suggests, just as the audience for the Higgs particle perceives it as existing independently of the production process that made it perceptible to them. 38

This article originally appeared in Archee and has been reposted with permission.


1 Quoted in Hart, Hugh. “When Will a Motion Capture Actor Win an Oscar?,” 24 January 2012.

2 Burrill, Derek Alexander. “Out of the Box: Performance, Drama, and Interactive Software,” Modern Drama, vol. 48, no. 3, Autumn 2005, pp. 492-512, p. 492.

3 Academy of Motion Picture Arts and Sciences. 87th Annual Academy Awards of Merit, Los Angeles, Academy of Motion Picture Arts and Sciences, n. d., p. 7.

4 Horn, John, Nicole Sperling and Doug Smith. “Unmasking the Academy: Oscar Voters Overwhelmingly White, Male,” LA, 19 February 2012.

5 Quoted in Short, T. L. Peirce’s Theory of Signs. Cambridge: Cambridge University Press, 2007, p. 215.

6 States, Bert O. “The Actor’s Presence: Three Phenomenal Modes” in Zarrilli, Phillip (ed.), Acting (Re)Considered : A Theoretical and Practical Guide. New York: Routledge, 2002, pp. 23-39, p. 26.

7 Godlovitch, Stan. Musical Performance: A Philosophical Study. New York: Routledge, 1998, p. 26.

8 Sturman, David J. “Computer Puppetry,” IEEE Computer Graphics and Applications, January / February 1998, pp. 38-45, p. 38.

9 Romano, Nick. “How’d They Do That? A Brief Visual History of Motion-Capture Performance on Film,” 14 July 2014.

10 Quoted in Short, T. L. op. cit., p. 219.

11 Rodowick, D. N. The Virtual Life of Film. Cambridge: Harvard University Press, 2007, p. 117.

12 Allison, Tanine. “More than a Man in a Monkey Suit: Andy Serkis, Motion Capture, and Digital Realism,” Quarterly Review of Film and Video, vol. 28, 2011, pp. 325-341, p. 326.

13 Rosen, Philip. Change Mummified: Cinema, Historicity, Theory. Minneapolis: University of Minnesota Press, 2001, p. 308.

14 Saltz, David. “The Ontology of Motion Capture,” unpublished text presented at the Annual Meeting of the American Society of Aesthetics, January 2003, p. 3.

15 Ibid., p. 10.

16 Short, T. L. op. cit., p. 219.

17 Balázs, Béla. Theory of the Film: Character and Growth of a New Art, trans. Edith Bone. London: Dennis Dobson, 1952, p. 56.

18 Ibid., p. 63.

19 Kracauer, Siegfried. “Remarks on the Actor” in Robertson Wojcik, Pamela (ed.), Movie Acting: The Film Reader. New York: Routledge, 2004, pp. 19-27, p. 21.

20 Dillon, Josephine. Modern Acting: A Guide for Stage, Screen and Radio. New York: Prentice-Hall, 1940, p. 3.

21 Quoted in Baron, Cynthia. “Crafting Film Performances: Acting in the Hollywood Studio Era” in Robertson Wojcik, Pamela (ed.), Movie Acting: The Film Reader. New York: Routledge, 2004, pp. 83-94, p. 88.

22 Cronyn, Hume. “Notes on Film Acting” in Senelick, Laurence (ed.), Theater Arts on Acting. New York: Routledge, 2008, pp. 370-376, p. 373.

23 Dillon, Josephine. op. cit., p. 9.

24 Quoted in Baron, Cynthia. op. cit., p. 88.

25 Comey, Jeremiah. The Art of Film Acting: A Guide for Actors and Directors. Oxford: Focal Press, 2002, p. 14.

26 Ibid., p. 18.

27 Allison, Tanine. op. cit., p. 333.

28 Ibid., p. 335.

29 Boucher, Marc. “Virtual Dance and Motion-Capture,” Contemporary Aesthetics, vol. 9, 2011.

30 Seymour, Mike. “Ape Acting,” FX Guide, 17 July 2014.

31 Darley, Andrew. Visual Digital Culture: Surface Play and Spectacle in New Media Genres, New York: Routledge, 2000, p. 84.

32 Saltz, David. op. cit., p. 12.

33 Sturman, David J. op. cit., p. 42.

34 Searls, Collette. “Unholy Alliances and Harmonious Hybrids: New Fusions in Puppetry and Animation” in Posner, Dassia N., Claudia Orenstein and John Bell (eds.), The Routledge Companion to Puppetry and Material Performance. Abingdon: Routledge, 2014, pp. 294-307, p. 304.

35 Crafton, Donald. Shadow of a Mouse: Performance, Belief, and World-making in Animation. Berkeley: University of California Press, 2012, p. 68.

36 Crease, Robert P. The Play of Nature: Experimentation as Performance. Bloomington: Indiana University Press, 1993, p. 161.

37 Ibid., p. 158.

38 A first version of this text was published in PAJ: A Journal of Performance and Art, vol. 39, no. 3, September 2017 (PAJ 117), pp. 7-23.


Academy of Motion Picture Arts and Sciences. 87th Annual Academy Awards of Merit, Los Angeles, Academy of Motion Picture Arts and Sciences, n. d.

Allison, Tanine. “More than a Man in a Monkey Suit: Andy Serkis, Motion Capture, and Digital Realism,” Quarterly Review of Film and Video, vol. 28, 2011, pp. 325-341.

Askwith, Ivan. “Gollum Dissed by the Oscars?,” 18 February 2003.

Balázs, Béla. Theory of the Film: Character and Growth of a New Art, trans. Edith Bone. London: Dennis Dobson, 1952.

Baron, Cynthia. “Crafting Film Performances: Acting in the Hollywood Studio Era” in Robertson Wojcik, Pamela (ed.), Movie Acting: The Film Reader. New York: Routledge, 2004, pp. 83-94.

Boucher, Marc. “Virtual Dance and Motion-Capture,” Contemporary Aesthetics, vol. 9, 2011, online.

Burrill, Derek Alexander. “Out of the Box: Performance, Drama, and Interactive Software,” Modern Drama, vol. 48, no. 3, Autumn 2005, pp. 492-512.

Comey, Jeremiah. The Art of Film Acting: A Guide for Actors and Directors. Oxford: Focal Press, 2002.

Crafton, Donald. Shadow of a Mouse: Performance, Belief, and World-making in Animation. Berkeley: University of California Press, 2012.

Crease, Robert P. The Play of Nature: Experimentation as Performance. Bloomington: Indiana University Press, 1993.

Cronyn, Hume. “Notes on Film Acting” in Senelick, Laurence (ed.), Theater Arts on Acting. New York: Routledge, 2008, pp. 370-376.

Darley, Andrew. Visual Digital Culture: Surface Play and Spectacle in New Media Genres. New York: Routledge, 2000.

Dillon, Josephine. Modern Acting: A Guide for Stage, Screen and Radio. New York: Prentice-Hall, 1940.

Godlovitch, Stan. Musical Performance: A Philosophical Study. New York: Routledge, 1998.

Hart, Hugh. “When Will a Motion Capture Actor Win an Oscar?,” 24 January 2012, online.

Horn, John, Nicole Sperling and Doug Smith. “Unmasking the Academy: Oscar Voters Overwhelmingly White, Male,” LA, 19 February 2012, online.

Kracauer, Siegfried. “Remarks on the Actor” in Robertson Wojcik, Pamela (ed.), Movie Acting: The Film Reader. New York: Routledge, 2004, pp. 19-27.

“Motion Capture Technology,” n.d., online.

Rodowick, D. N. The Virtual Life of Film. Cambridge: Harvard University Press, 2007.

Romano, Nick. “How’d They Do That? A Brief Visual History of Motion-Capture Performance on Film,” 14 July 2014, online.

Rosen, Philip. Change Mummified: Cinema, Historicity, Theory. Minneapolis: University of Minnesota Press, 2001.

Saltz, David. “The Ontology of Motion Capture,” unpublished text presented at the Annual Meeting of the American Society of Aesthetics, January 2003.

Searls, Collette. “Unholy Alliances and Harmonious Hybrids: New Fusions in Puppetry and Animation” in Posner, Dassia N., Claudia Orenstein and John Bell (eds.), The Routledge Companion to Puppetry and Material Performance. Abingdon: Routledge, 2014, pp. 294-307.

Seymour, Mike. “Ape Acting,” FX Guide, 17 July 2014, online.

Short, T. L. Peirce’s Theory of Signs. Cambridge: Cambridge University Press, 2007.

States, Bert O. “The Actor’s Presence: Three Phenomenal Modes” in Zarrilli, Phillip (ed.), Acting (Re)Considered : A Theoretical and Practical Guide. New York: Routledge, 2002, pp. 23-39.

Sturman, David J. “Computer Puppetry,” IEEE Computer Graphics and Applications, January / February 1998, pp. 38-45.

This post was written by the author in their personal capacity. The opinions expressed in this article are the author’s own and do not reflect the view of The Theatre Times, their staff or collaborators.

This post was written by Philip Auslander.
