Zhihao Hong; Workers Software program Engineer | Emma Adams; Sr. Software program Engineer | Jeremy Muhia; Sr. Software program Engineer | Blossom Yin; Software program Engineer II | Melissa He; Sr. Software program Engineer |
Video content material has emerged as a well-liked format for folks to find inspirations at Pinterest. On this weblog submit, we are going to define latest enhancements made to the Adaptive Bitrate (ABR) video efficiency, in addition to its optimistic impression on consumer engagement.
Phrases:
- ABR: An acronym for Adaptive Bitrate (ABR) Streaming protocol.
- HLS: HTTP dwell streaming (HLS) is an ABR protocol developed by Apple and supported each dwell and on-demand streaming.
- DASH: Dynamic Adaptive Streaming over HTTP (DASH) is one other ABR protocol that works equally to HLS.
ABR Streaming is a broadly adopted protocol within the business for delivering video content material. This technique includes encoding the content material at a number of bitrates and resolutions, leading to a number of rendition variations of the identical video. Throughout playback, gamers improve the consumer expertise by deciding on the absolute best high quality and dynamically adjusting it primarily based on community circumstances.
At Pinterest, we make the most of HLS and DASH for video streaming on iOS and Android platforms, respectively. HLS is supported via iOS’s AVPlayer, whereas DASH is supported by Android’s ExoPlayer. At the moment, HLS accounts for roughly 70% of video playback classes on iOS apps, and DASH accounts for round 55% of video classes on Android.
Startup latency is a key metric for evaluating video efficiency at Pinterest. Subsequently, we are going to first study the steps of beginning an ABR video. To provoke playback, shoppers should receive each the manifest information and partial media information from the CDN via community requests. Manifest information are usually small-size information however present important metadata about video streams, together with their resolutions, bitrates, and many others. In distinction, the precise video and audio content material is encoded within the media information. Downloading sources in each file varieties takes time and is the first contributor to customers’ perceived latency.
Our work goals to cut back latency in manifest loading (highlighted in Determine 1), thereby enhancing total video startup efficiency. Though manifest information are small in dimension, downloading them provides a non-trivial quantity of overhead on video startup. That is significantly noticeable with HLS, the place every video comprises a number of manifests, leading to a number of community spherical journeys. As illustrated in Determine 1, after receiving video URLs from the API endpoint, gamers start the streaming course of by downloading the principle manifest playlist from the CDN. The content material inside the principle manifest gives URLs for the rendition-level manifest playlists, prompting gamers to ship subsequent requests to obtain them. Lastly, primarily based on community circumstances, gamers obtain the suitable media information to begin playback. Buying manifests is a prerequisite for downloading media bytes, and decreasing this latency results in sooner loading instances and a greater viewing expertise. Within the subsequent part, we are going to share our answer to this downside.
At a excessive degree, our answer eliminates the spherical journey latency related to fetching a number of manifests by embedding them within the API response. When shoppers request Pin metadata via endpoints, the manifest file bytes are serialized and built-in into the response payload together with different metadata. Throughout playback, gamers can swiftly entry the manifest data domestically, enabling them to proceed on to the ultimate media obtain stage. This technique permits gamers to bypass the method of downloading manifests, thereby decreasing video startup latency.
Now let’s dive into a few of the implementation particulars and challenges we discovered through the course of.
One of many main hurdles we confronted was the overhead imposed on the API endpoint. To course of API requests, the backend is required to retrieve video manifest information, inflicting an increase within the total latency of API responses. This challenge is especially outstanding for HLS, the place quite a few manifest playlists are wanted for every video, leading to a major enhance in latency resulting from a number of community calls. Though an preliminary try to parallelize community calls offered some aid, the latency regression continued.
We efficiently tackled this challenge by incorporating a MemCache layer into the manifest serving course of. Memcache gives considerably decrease latency than community calls when the cache is hit and is efficient for platforms like Pinterest, the place fashionable content material is constantly served to varied shoppers, leading to a excessive cache hit price. Following the implementation of Memcache, API overhead was successfully managed.
We additionally performed iterations on backend mechanisms for retrieving manifests, evaluating fetching from CDN or fetching straight from origins: S3. We finally landed S3 fetching because the optimum answer. In distinction to CDN fetching, S3 fetching presents higher efficiency and value effectivity.
After a number of iterations, the ultimate backend circulate seems to be like this:
AVPlayer
For the gamers’ aspect, we make the most of the AVAssetResourceLoaderDelegate APIs on iOS to customise the loading strategy of manifest information. These APIs allow functions to include their very own logic for managing useful resource loading.
Determine 4 illustrates the manifest loading course of between AVPlayer and our code. Manifest information are delivered from the backend and saved on the consumer. When playback is requested, AVPlayer sends a number of AVAssetResourceLoadingRequests to acquire details about the manifest knowledge. In response to those requests, we find the serialized knowledge from the API response and provide them accordingly. The loading requests for media information are redirected to the CDN deal with as regular.
ExoPlayer
Android employs a comparable strategy by consolidating the loading requests at ExoPlayer. The method begins with getting the contents of the .mpd manifest file from the API. These contents are saved within the tag property of MediaItem.LocalConfiguration.
// MediaItemTag.kt
knowledge class MediaItemTag(
val manifest: String?,
)// VideoManager.kt
val mediaItemTag = MediaItemTag(
manifest = apiResponse.dashManifest,
)
val mediaItem = MediaItem.Builder()
.setTag(mediaItemTag)
.setUri(...) // instance: "https://instance.com/some/video/url.mpd"
.construct()
After the MediaItem is about and we’re able to name put together, the notable customization begins when ExoPlayer creates a MediaSource. Particularly, Android implements the MediaSource.Factory interface, which specifies a createMediaSource method.
In createMediaSource, we will retrieve the DASH manifest that was saved within the tag property from the MediaItem and remodel it into an ExoPlayer primitive utilizing DashManifestParser:
// CustomMediaSourceFactory.kt
override enjoyable createMediaSource(mediaItem: MediaItem): MediaSource
val mediaItemTag = mediaItem.localConfiguration!!.tag
val manifest = (mediaItemTag as MediaItemTag).manifestval dashManifest = DashManifestParser()
.parse(
mediaItem.localConfiguration!!.uri,
manifest.byteInputStream()
)
With the contents of the manifest obtainable in reminiscence, using DashMediaSource.Factory’s API for side-loading a manifest is the final step required:
class PinterestMediaSourceFactory(
personal val dataSourceFactory: DataSource.Manufacturing facility,
): MediaSource.Manufacturing facility override enjoyable createMediaSource(mediaItem: MediaItem): MediaSource
val dashChunkSourceFactory = DefaultDashChunkSource
.Manufacturing facility(dataSourceFactory)
val dashMediaSourceFactory = DashMediaSource
.Manufacturing facility(
/* chunkSourceFactory = */ dashChunkSourceFactory,
/* manifestDataSourceFactory = */ null,
)
return dashMediaSourceFactory.createMediaSource(
dashManifest,
mediaItem
)
Now, as an alternative of getting to first fetch the .mpd file from the CDN, ExoPlayer can skip that request and instantly start fetching media.
As of this writing, each Android and iOS platforms have absolutely carried out the above answer. This has led to a major enchancment in video efficiency metrics, significantly in startup latency. Consequently, consumer engagement on Pinterest was boosted because of the sooner viewing expertise.
With the flexibility to control the manifest loading course of, shoppers could make native changes to attain extra fine-grained video high quality management primarily based on the UI floor. By eradicating undesirable bitrate renditions from the unique manifest file earlier than offering it to the gamers, we will restrict the variety of bitrate renditions obtainable for the participant, thereby gaining extra management over playback high quality. As an illustration, on a big UI floor reminiscent of full display, high-quality video is extra preferable. By eradicating the decrease bitrate renditions from the unique manifest, we will be sure that gamers solely play high-quality video with out getting ready a number of units of manifests at backend.
We want to lengthen our honest gratitude to Liang Ma and Sterling Li for his or her important contributions to this venture and distinctive technical management. Their experience and dedication have been instrumental in overcoming quite a few challenges and driving the venture to success. We’re deeply appreciative of their efforts and the optimistic impression they’ve had on this initiative.
To study extra about engineering at Pinterest, take a look at the remainder of our Engineering Weblog and go to our Pinterest Labs website. To discover and apply to open roles, go to our Careers web page.