Tales from the Tech Retreat 2017 — Taking Stock of VR

by James Mathers
Cinematographer and Founder of the Digital Cinema Society
(Excerpted from the February 2017 Digital Cinema Society eNewsletter)

In a tradition of several years now, I have devoted my January eNewsletter to covering the Consumer Electronics Show as it relates to content creators.  Then for our February edition I like to tell you about what I discovered at the HPA Tech Retreat, from which I just returned; this year will be no different.  However, I took in so much at this year’s Retreat, from VR to Cloud Workflows, that I’m going to have to break it up.  So, for this essay I’m going to concentrate on VR.

Although the organization formally changed its name last year from the Hollywood Post Alliance to the Hollywood Professional Association to reflect its expanding role in content creation, the acronym and the format of the Retreat remain the same.  It is a very high-level gathering that brings together a core group of technology leaders to share with each other the challenges, solutions, and innovations they’re working on.  This proves valuable in helping to keep technology moving forward on a more productive path.  The week-long Retreat has been held for many years in the Palm Springs area of Southern California, about a two-hour drive east of Los Angeles.

If there was an overarching theme this year, I would say it was to take stock of disruptive technologies.  Please know that I don’t mean to imply any negative connotation when I say “disruptive”.  Although it may not have been welcomed by blacksmiths and saddle makers, if Henry Ford had not disrupted the transportation industry, we might all still be riding around today in horse-drawn carriages.  It is the way of progress and innovation, and as Entertainment Industry Professionals, we need to keep abreast of these changes or get left behind at the station.

One technology that got a lot of attention this year was VR.  In fact, a daylong session was devoted to it.  Organized by VR Pioneer and longtime DCS member Lucas Wilson, Founder and CEO at SuperSphereVR, and Marcy Jastrow, who heads immersive media efforts at the Technicolor Experience Center, the seminar brought together Content Creators and Service Providers to give an idea of where VR stands today.  DCS members Alan Lasky, Director of Studio Product Development at VR production company 8i, and Phil Lelyveld, VR/AR Initiative Program Lead at the Entertainment Technology Center at USC, were among the many distinguished panelists.

At this point, there is still a lot of confusion in the marketplace, from basic terminology to how to effectively tell stories in this new medium and, perhaps most importantly, how to monetize the efforts.  That said, an Industry gathering such as this, with a lineup of panelists who are actually in the trenches every day trying to make this all work, is just what the doctor ordered.  Here are some of my take-aways from the session:

Let’s start with terms; I’m relatively new to VR, so there were quite a few that I hadn’t previously heard, or that may have been thrown around without my having a very clear idea of what they mean.

“6 DoF”, or “6 Degrees of Freedom”, refers to the freedom of movement available to the viewer in 3D space.  The viewer is free to change position forward/backward, up/down, and left/right, combined with rotation about each axis, often termed pitch, yaw, and roll.  To use a first-person shooter video game as an example, a game that provides forwards/backwards, slide left/right, up/down (jump/crouch/lie), yaw (turn left/right), and pitch (look up/down) offers five degrees of freedom; if it also allows leaning control, then it would be considered to offer 6 DoF.
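For readers who think in code, here is a minimal Python sketch of the six axes just described.  The class name `Pose6DoF`, the axis labels, and the helper `count_dof` are my own hypothetical names for illustration; they are not taken from any VR toolkit.

```python
from dataclasses import dataclass

# The six axes named in the text: three translations, three rotations.
TRANSLATION_AXES = ("forward_back", "up_down", "left_right")
ROTATION_AXES = ("pitch", "yaw", "roll")


@dataclass
class Pose6DoF:
    """Hypothetical viewer pose: a position plus an orientation."""
    forward_back: float = 0.0  # position, arbitrary units
    up_down: float = 0.0
    left_right: float = 0.0
    pitch: float = 0.0  # rotation in degrees (look up/down)
    yaw: float = 0.0    # rotation in degrees (turn left/right)
    roll: float = 0.0   # rotation in degrees (tilt side to side)


def count_dof(tracked_axes):
    """Count how many of the six axes an experience actually tracks."""
    all_axes = set(TRANSLATION_AXES) | set(ROTATION_AXES)
    return len(all_axes & set(tracked_axes))


# A viewer who can only look around from a fixed spot: rotation only.
look_around_only = count_dof(ROTATION_AXES)
# A viewer who can also move through the space.
full_movement = count_dof(TRANSLATION_AXES + ROTATION_AXES)
print(look_around_only, full_movement)  # prints: 3 6
```

An experience that tracks only head rotation would count as three degrees of freedom in this sketch, while one that also tracks position would count as six.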

“Agency,” or “sense of agency” (SA), refers to the sense of control the viewer feels in initiating, executing, and controlling their actions in a VR space.  Can they simply look around from a static POV, or, with a better sense of agency, will their movements and actions be reflected and reacted to in the VR environment, as in the aforementioned shooter game?

“Equirectangular View” is when you take a 360 degree image and lay it out in a flat two dimensional projection, much like a map of the world laid out in a Mercator projection, compared to a spherical globe.  This flat view is how VR images are digitized; a graphics processing unit (GPU) then renders this data into 360 VR, allowing the viewer to look around as if through a window within the panoramic 360°x180° field of view.
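To make the flat layout concrete, here is a small Python sketch of how a viewing direction could map to a pixel in a 360°x180° equirectangular frame.  The function name `direction_to_pixel` and the conventions (yaw 0 at the center of the image, pitch +90 at the top edge) are assumptions chosen for illustration; real players may use different conventions.

```python
def direction_to_pixel(yaw_deg, pitch_deg, width, height):
    """Map a viewing direction to (u, v) pixel coordinates.

    Assumed conventions (hypothetical, for illustration only):
      yaw   in [-180, 180], 0 is the horizontal center of the image
      pitch in [-90, 90],  +90 (straight up) is the top row
    """
    # 360 degrees of yaw are spread evenly across the image width.
    u = (yaw_deg / 360.0 + 0.5) * width
    # 180 degrees of pitch are spread evenly across the image height.
    v = (0.5 - pitch_deg / 180.0) * height
    return u, v


# Looking straight ahead lands in the middle of a 4096x2048 frame.
print(direction_to_pixel(0, 0, 4096, 2048))   # prints: (2048.0, 1024.0)
# Turning 90 degrees to the right moves three quarters of the way across.
print(direction_to_pixel(90, 0, 4096, 2048))  # prints: (3072.0, 1024.0)
```

Because every degree of yaw and pitch gets the same number of pixels, the frame is always twice as wide as it is tall, and areas near the poles are stretched, much as polar regions are on a flat world map.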

"Room Scale," defines the area you can physically move inside the VR experience.  As opposed to a view where you could only spin around and see 360 from one spot, a Room Scale experience would allow you to move around within a space of say, 10’x10’, (depending on the setup), and your movements would be tracked for interactive changes in perspective within that space.

“Location Based” vs “Mobile” VR experience.  Practically everyone has a mobile enabled VR player in their pocket these days; of course, I’m talking about your smart phone.  You can go anywhere that you get a signal to view or download streaming VR.  Location Based, on the other hand, refers to installations set up with high-powered computers and some kind of tethered head mounted display.  An example I’m eager to experience that has recently been set up across the street from The Grove shopping center in Los Angeles is the IMAX Experience Center.  They currently sell tickets for admission to 8 different experiences lasting from about 10 to 30 minutes each.  It can be a solo experience, such as The Walk, which is based on the movie about a tight rope walker on a wire stretched between two skyscrapers, or you can join your friends in virtual combat situations playing RAW DATA or soaring over Paris in EAGLE FLIGHT MULTI-PLAYER.

“Volumetric Capture” is not simple to explain, but let me try in relation to VR.  It’s basically a technique used to capture the geometry of a subject in multiple dimensions so that when the viewer moves around in the VR environment, their POV can move around the subject accordingly.  It is extremely computationally intensive, and probably not completely practical at this point, but once perfected, it would allow unfettered access within the VR space so that the viewer could literally circle an object with the correct POV from every angle along the way.  Marrying such images with photo-realistic computer graphics presents many great options for storytelling.

Such image capture can be accomplished in many ways.  Some systems surround a subject with large numbers of cameras, then stitch those images together.  The company 8i, for example, uses such methods to create, mix, and experience photo-realistic human images they refer to as holograms.  A character’s live performance can be captured this way, as opposed to doing a 3D full body scan of the performer and then using motion capture to later animate the character.  There are also more computational approaches to shooting live action instead of such optical processes.  The Lytro cinema camera, for instance, works to capture information about the light rays traveling through the space.  Its light field sensor uses an array of micro-lenses placed in front of an otherwise conventional sensor to record the light’s intensity, color, and directional information.

“Omnidirectional Stereo Capture” -  This is when almost all points surrounding the capture device are seen by multiple cameras in order to create a 360 spherical, as well as stereoscopic, view.  These systems tie together anywhere from 8 to 24 separate images so that every point in the field is covered by at least two or three cameras.  Examples are the Nokia OZO, the Jaunt ONE, and the Facebook Surround 360 camera systems.

“Spherical Audio” refers to the ability to capture and play back sound as if it is coming from differing directions.  I had never before heard the term “Head Related Transfer Function” or “HRTF.”  It refers to the way we can tell which direction a sound comes from, and how far away it might be, based on the fraction-of-a-second differences as various parts of the sound wave reach our ears.  A Spherical Sound mix can digitally alter the sound files to make the listener believe the sound is coming from various directions, which can make for a valuable tool in “Attention Directing”.
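To give a feel for how small that fraction of a second is, here is a rough Python sketch of just one ingredient of directional hearing: the difference in arrival time between the two ears.  It uses a simple path-difference approximation rather than a measured HRTF, and the ear spacing of 0.18 m and speed of sound of 343 m/s are assumed round numbers, not figures from the seminar.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed: dry air at roughly room temperature
EAR_SPACING_M = 0.18        # assumed: rough distance between the ears


def arrival_time_difference(angle_deg):
    """Approximate ear-to-ear arrival delay, in seconds.

    angle_deg is the direction of the sound source measured from
    straight ahead (0) toward one side (90).  This is a simplified
    path-difference model, not a full HRTF.
    """
    extra_path = EAR_SPACING_M * math.sin(math.radians(angle_deg))
    return extra_path / SPEED_OF_SOUND_M_S


for angle in (0, 30, 90):
    delay_ms = arrival_time_difference(angle) * 1000.0
    print(f"{angle:>2} degrees: {delay_ms:.2f} ms")
```

Under these assumptions, even a sound coming from directly to one side arrives at the far ear only about half a millisecond late, which is the scale of cue a Spherical Sound mix has to reproduce.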

So what is “Attention Directing”?  It is any number of methods that VR storytellers might use to guide the viewer to the point in the VR environment where they need to be looking to follow the story.  Traditional techniques of cutting, shallow depth-of-field, and close-ups don’t always work so well in VR, so it has been quite a challenge that VR content creators are currently trying to deal with.  It’s almost as if a whole new cinematic language needs to be developed to communicate effectively in VR.  In short, we need to figure out new ways to direct the viewer’s attention so they can follow the story the content creator is trying to tell.

“Nodal” VR Capture – This is probably the most basic method of shooting VR and one that can give the most control over such factors as lighting and audio recording.  It involves dividing the 360-degree field of view into sections, then shooting locked off shots from a common nodal (or center) point to cover every angle emanating from that spot.  For example, if the 360 view was broken into quadrants, which is common, then the camera would need to be rotated to capture four separate shots, each with a 90 degree field of view, so that they can be combined in post (or “stitched”) to create a single 360-degree shot.  Of course, it means you have to shoot any action four times over, but breaking the coverage area up like this allows each section to be lit, and mics to be placed, for the best effect from an area off-camera.  Thus the challenge of hiding crew and equipment is alleviated compared to shooting live 360.
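The quadrant arithmetic generalizes easily.  Here is a minimal Python sketch that works out how many locked off shots are needed for a given field of view and where to point the camera for each one.  The function names and the optional `overlap_deg` parameter are hypothetical; the overlap between neighboring shots is my own assumption about what a stitch might need, not something specified at the seminar.

```python
import math


def nodal_shot_count(fov_deg, overlap_deg=0.0):
    """Number of shots needed to cover a full 360 degrees.

    overlap_deg is an assumed amount of each shot's field of view
    shared with its neighbor to give the stitch something to match.
    """
    usable = fov_deg - overlap_deg
    if usable <= 0:
        raise ValueError("overlap must be smaller than the field of view")
    return math.ceil(360.0 / usable)


def nodal_headings(fov_deg, overlap_deg=0.0):
    """Evenly spaced camera headings, in degrees, one per shot."""
    count = nodal_shot_count(fov_deg, overlap_deg)
    return [i * 360.0 / count for i in range(count)]


# The quadrant example from the text: four 90 degree shots.
print(nodal_shot_count(90))                  # prints: 4
print(nodal_headings(90))                    # prints: [0.0, 90.0, 180.0, 270.0]
# Reserving 10 degrees of each shot for overlap adds a fifth shot.
print(nodal_shot_count(90, overlap_deg=10))  # prints: 5
```

Each extra shot is another pass at the action, so the trade-off between a wider lens and more lighting control per section shows up directly in the shot count.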

It is also possible to shoot stereo via this nodal technique simply by using two cameras at a time in a typical stereo rig.  This Nodal Stereo method actually has another advantage over the previously described Omnidirectional Stereo Capture.  Since all points in the field of view are seen by at least two or three cameras at once in Omnidirectional Stereo Capture, there are areas close to the camera where a subject might appear in more than one place, even after stitching the overlapping images.  This causes all kinds of complications and is the reason why cameras like the OZO, Jaunt ONE, and Facebook Surround 360 warn against getting the subject too close to the camera.  This close limit is about 4 to 12 feet depending on the rig.

I’m sure there are many more terms that need to be understood in VR, and the handful I’ve offered here are just a start, but now allow me to relate some of the other info gleaned from this VR seminar.

“Show Me The Money” – Although there is a strong demand for quality content, nobody has locked onto exactly how to monetize it.  Vendors are experimenting with charging admissions for VR experiences, such as the new IMAX venue in L.A.  In fact, IMAX has committed $50 million to a production fund for various VR content, so they are now serious players in the game.  However, with the still very limited number of venues available, it could take a long while to recoup that kind of investment.

There are also advertising based models, which is likely where Facebook is looking to cash in on their $2 Billion plus investment in Oculus and other VR efforts.  Attracting viewers to the portal, then showing them ads before the content, is one approach.  This method has advantages for VR professionals because more work will be generated in creating those VR ads.

A DCS supporter that is entering into the VR Advertising fray is Adobe, who recently debuted a project for advertising in VR at the Mobile World Congress in Barcelona.  The prototype project is focused on theater-style viewing of 2-D videos in a VR environment, similar to the way Netflix, HBO and Hulu make their existing catalogs available on headsets.  Tools to facilitate such efforts will eventually be made part of the Creative Cloud content creation suite, which already makes it possible to edit VR video in Premiere Pro.

Subscription based models are also being floated, but it will take an abundance of good content to build a consistent audience and keep them coming back.  Xbox is rolling out a Netflix-like subscription service called Xbox Game Pass which could conceivably also service VR content.  It is said to offer unlimited access to over one hundred Xbox One and Xbox 360 titles as part of a monthly subscription of $9.99.  Meanwhile, at a similar cost, Sony’s PlayStation Now is a subscription service that lets you stream hundreds of PS3 games to your PS4 and Windows PC.  Although currently concentrated on games, these systems should be able to adapt to VR.

The real hold up to the adoption of all these models is a lack of quality content.  It’s that same old chicken or the egg problem that has plagued other new entertainment technologies.  Why should a consumer spend up to $500 on a high-end headset system on top of having to invest in a powerful computer if there is not already quality content to draw them in?  In fact, the sales of several new tethered hardware systems that came out with much fanfare during the 2016 holiday season have been pretty disappointing.

Although it is not nearly up to the playback standards of a tethered system, most experts on the panels agreed that mobile is where VR will first come to the fore.  With billions of phones already in consumers’ pockets, and only a simple apparatus (even cardboard) necessary to turn them into VR playback devices, this is the mass market that will drive adoption.  Perhaps the other modes of VR will follow along as the technology matures and the systems become more affordable.  For example, as the processing power of smart phones increases, they could potentially be used to power tethered headsets.  Better yet, as wireless technology improves, a phone that might be in your pocket could feed a high quality headset wirelessly, with no tether at all.

A method for VR to create added value that is already seeing some success is to use the content as a promotional tool.  An example is when companion VR experiences are released along with major motion pictures in order to promote the movie.  The Martian and Wild are examples from Fox Studios.  It’s relatively easy for studios to leverage their assets and simultaneously produce such content when they already have the actors, sets, and other elements of production available for a project.

Another concept I became aware of at the seminar is that VR is a “Data Hog.”  When 360 capture systems require upwards of 24 digital cameras shooting 30 to 60 fps, at a minimum of HD resolution (and many times higher), it requires one heck of a lot of storage.  The Jaunt system, for example, requires 200 GB per minute to record its uncompressed output.  I’m sure it makes my friends at OWC and other storage providers very happy, but it is another major challenge for content creators.
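To put those figures in perspective, here is a back-of-the-envelope Python sketch.  The first function simply scales the quoted 200 GB per minute figure; the second estimates a raw, uncompressed data rate from camera count, resolution, and frame rate, assuming 3 bytes per pixel (8-bit RGB), which is my own simplifying assumption and not a specification of any particular rig.

```python
GB = 1_000_000_000  # decimal gigabyte, in bytes


def storage_gb(minutes, gb_per_minute=200):
    """Storage needed for a recording, using the quoted per-minute rate."""
    return minutes * gb_per_minute


def raw_bytes_per_second(cameras, width, height, fps, bytes_per_pixel=3):
    """Uncompressed data rate for a multi-camera rig.

    bytes_per_pixel=3 assumes 8-bit RGB; real sensors and codecs differ.
    """
    return cameras * width * height * bytes_per_pixel * fps


# One hour of footage at the quoted 200 GB per minute.
print(storage_gb(60))  # prints: 12000  (GB, i.e. 12 TB)

# A hypothetical 24-camera rig at 1920x1080 and 60 fps.
rate = raw_bytes_per_second(24, 1920, 1080, 60)
print(rate)                       # prints: 8957952000  (bytes per second)
print(round(rate * 60 / GB, 1))   # prints: 537.5  (GB per minute)
```

The hypothetical rig lands in the same order of magnitude as the quoted figure, which suggests why a single shooting day can run to many terabytes.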

It’s not just a question of where to put it.  It is also a matter of efficiently transporting all that data around.  Taking advantage of cloud-based workflows would seem to be the answer, but bandwidth will have to improve before the problem is really solved.  Once that happens, data can immediately be uploaded to VR stitching and rendering services so that VR content creators can view Dailies instead of VR Weeklies or Monthlies.
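The bandwidth problem is easy to quantify.  Here is a small Python sketch estimating how long it would take to upload footage recorded at the 200 GB per minute rate quoted above.  The link speeds are assumed example values, and the calculation ignores protocol overhead and any compression, so real transfers would differ.

```python
def upload_seconds(size_gb, link_gbps):
    """Idealized transfer time: gigabytes over a link rated in gigabits/s.

    Assumes decimal units (1 GB = 8 gigabits) and a fully used link.
    """
    return size_gb * 8 / link_gbps


one_minute_of_footage_gb = 200  # the per-minute figure quoted above

for link_gbps in (0.1, 1.0, 10.0):  # assumed example link speeds
    minutes = upload_seconds(one_minute_of_footage_gb, link_gbps) / 60
    print(f"{link_gbps:>4} Gbps: {minutes:.1f} minutes per minute of footage")
```

Under these assumptions, even a dedicated 1 Gbps connection needs nearly half an hour to move a single minute of uncompressed footage, which is why Dailies remain out of reach for many VR shoots.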

Speeding up the stitching and rendering processes is a critical forward step for VR.  In the meantime, systems like the OZO, which can transmit a live proxy 360 degree image, will have a great advantage so that content creators can see in real time what they’re recording.  Additionally, a solution I saw at the Tech Retreat is from another DCS supporter, Teradek.  Their Sphere system can work with a variety of cameras to live stitch and composite VR signals, then stream them for monitoring and control purposes to mobile devices such as an iPad.  The Sphere stitching engine doesn’t require a PC and can internally composite up to 8 1080p cameras.  It also has some color correction abilities to manually or automatically help to match the cameras, as well as the capability to record spatial sound.  Transmitting and recording the signal remotely can also help when you’re seeing everywhere and fighting to keep gear out of the shot.

Here are a couple of last tips I noted for improving VR content creation.  One is to avoid flares, especially when shooting Omnidirectional Stereo, because you’re using several separate cameras and each camera sees the flare slightly differently, making it almost impossible to create a smooth composite.  The other tip is to always keep the camera parallel to the horizon if you don’t want to engender motion sickness in your audience.

I don’t know where else you could find all this information gathered in one place at one time, but I certainly do appreciate the folks who put this excellent event together.  If we’re to be successful, we’ve got to work together, not only to figure out how to make compelling VR content, but also how to monetize it.   Educational efforts like the HPA Tech Retreat, and especially the VR symposium, are a great step in the right direction.