The Video Experiment At OLX, Part 1 — Why and How
With the ever-increasing adoption of smartphones and falling data costs, video traffic has grown immensely over the past couple of years.
It’s estimated that by 2022, 82 percent of the global internet traffic will come from video streaming and downloads (Cisco, 2019).
With most cities in India supporting 4G today, internet bandwidth is no longer a challenge. Hence, we decided to leverage this opportunity and enable video as a media attachment in our current ad posting flow.
What this means is that a seller will be able to record and upload a demonstration of the product they wish to sell on the platform. As a buyer, when I watch a video showing a working demo of the product I am planning to buy, I develop some confidence (or trust) in the product's condition and the seller's intent.
In this series of posts, we’ll tell you how we added video upload as a step in the current ad posting flow. In the first part, we cover:
- Backend Architecture
- Components Involved
There are ~400K ads (successfully) added to the platform every day, and there is also a definite drop-off in the funnel right before the user reaches the “Post Now” button. In this situation, adding a video upload step could backfire if it becomes a bottleneck in the funnel. At the same time, video upload duration depends on multiple factors: camera quality (resolution), frame rate, the computation power of the recording device (for encoding), which varies widely across Android devices, and finally the internet bandwidth. Also, adding a new step to a conventional system where our users are habituated to posting ads in a predictable manner might break their momentum and fall victim to what is called “resistance to change”.
We also had to make this modular: it is still an experiment, so we did not want to move major blocks in a currently stable platform until we were sure about what works best for our users.
There were two parts to integrating video as a feature on the platform: one on the seller side, where inventory (video) is created on the platform, and a second where a user consumes the video as a “buyer”. For the first phase, we planned to understand the user journey as a seller.
We will run an A/B test to expose the feature to our user base using Laquesis (our internal A/B testing tool) and decide which flow the user navigates to on a click of the “Sell” button. There are three variants in our experiment, all of which share the same basic sequence:
- Recording video of the product
- Review and select images extracted from the video (since images are mandatory)
- Fill in the remaining attributes based on the ad category (the video uploads in the background)
- “Post Now”
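Laquesis is internal to OLX, so as a purely illustrative sketch, here is how deterministic variant bucketing for an experiment like this is commonly done: hash a stable user identifier together with the experiment name and map it onto the variant list. The experiment name and variant labels below are hypothetical.

```python
import hashlib

VARIANTS = ["A", "B", "C"]  # the three seller-flow variants

def assign_variant(user_id: str, experiment: str = "video_posting") -> str:
    """Deterministically bucket a user into one of the variants.

    The same user always lands in the same variant, so their flow is
    stable across sessions without storing any assignment server-side.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    return VARIANTS[int(digest, 16) % len(VARIANTS)]
```

Because the bucketing is a pure function of (experiment, user), rolling the experiment out to more users only requires changing the exposure check, not re-assigning anyone.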
Let’s now talk about the components involved in implementing this seller journey.
Mantis (Backend Media Server)
The Media Server has 4 primary responsibilities:
1. Video Upload
By selecting S3 as our data store for videos, we leverage the S3 multipart URL signing mechanism to let clients write directly to S3, bypassing our app servers and saving not only an extra hop but also infrastructure resources (the game changer). Our backend system exposes a single endpoint for app clients that generates a signed URL with a defined expiry time, which can be used to write a specific object (.mp4 file) directly to S3 at a specific path, with multipart support to make the uploads fail-safe. In addition, we use the S3 Transfer Acceleration feature to route video packets to our central bucket via the closest edge server over the AWS backbone network. Multipart upload in combination with Transfer Acceleration proved to be the lowest-latency solution for media upload.
2. Video Processing
Video Processing is a broad term that includes:
2.1 Transcoding
When a video is recorded at the source and stored in mp4 format, it is already a compressed version of the raw footage, depending on the encoding format the recording device supports (mostly H.264). Hence, the size of the video depends not just on its length but also on its format (since encoders exploit spatial and temporal redundancy). Transcoding means accepting an already compressed (encoded) version of the source file, decompressing it (into a raw version), and then compressing it again based on the output profile (format and resolution). This is a highly CPU-intensive job.
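The decompress-then-recompress step is typically driven by a tool like ffmpeg. As an illustrative sketch (the filenames, target heights, and bitrates are assumptions, not our production profile), a small helper can build one ffmpeg invocation per output rendition:

```python
def build_transcode_cmd(src: str, dst: str, height: int, bitrate: str) -> list[str]:
    """Build an ffmpeg command that decodes `src` and re-encodes it to
    H.264 at the given height, preserving the aspect ratio."""
    return [
        "ffmpeg", "-y", "-i", src,
        "-c:v", "libx264",            # re-encode video with x264
        "-vf", f"scale=-2:{height}",  # -2 keeps the width even and proportional
        "-b:v", bitrate,              # target video bitrate for this rendition
        "-c:a", "aac",
        dst,
    ]

cmd = build_transcode_cmd("input.mp4", "out_540p.mp4", 540, "1200k")
```

Running one such command per target resolution produces the set of renditions that the later packaging step turns into a multi-bitrate stream.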
2.2 Watermark Insertion
The OLX watermark is inserted into the output feed at a specific position ((x, y) coordinates) derived from the resolution of the output profile. Our HLS output is multi-bitrate: 720p, 540p, 480p, and 360p.
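As a sketch of deriving the watermark position from the output resolution, the helper below builds an ffmpeg `overlay` filter expression that pins the watermark to the bottom-right corner with a margin proportional to the frame height (the 2% margin fraction and corner placement are assumptions for illustration; `W`/`H` and `w`/`h` are ffmpeg's built-in variables for the video and watermark dimensions):

```python
def overlay_filter(width: int, height: int, margin_frac: float = 0.02) -> str:
    """Return an ffmpeg overlay expression placing the watermark in the
    bottom-right corner, with a resolution-dependent margin."""
    margin = int(height * margin_frac)
    return f"overlay=W-w-{margin}:H-h-{margin}"
```

Because the margin scales with the rendition's height, the watermark occupies a visually consistent spot across the 720p, 540p, 480p, and 360p outputs.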
2.3 HLS Packaging
The resultant feed is then packaged into multiple chunks and indexed in a playlist file that makes the feed streamable. This lets the output video start playing immediately after the first chunk is downloaded, while subsequent chunks download in the background to build a buffer. We selected HLS as the output format since it is supported by the majority of platforms (Android, iOS, and most browsers).
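To make the playlist indexing concrete, an HLS master playlist for a four-rendition output could look like the sketch below; the bandwidth values and paths are illustrative, not our actual profile:

```
#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=2800000,RESOLUTION=1280x720
720p/index.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=1400000,RESOLUTION=960x540
540p/index.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=1000000,RESOLUTION=854x480
480p/index.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=600000,RESOLUTION=640x360
360p/index.m3u8
```

Each variant playlist then lists the chunked segments for that rendition, which is what lets the player start as soon as the first segment arrives.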
3. Smooth Streaming
This part takes care of delivering video to the end user in the most optimised way for a smooth experience with minimal buffering. We use a CDN for last-mile delivery to make sure video segments are downloaded from the geographically closest edge server. HLS as an output profile supports Adaptive Bitrate (ABR) streaming to handle low-bandwidth issues: the player at the client end automatically switches to lower-bitrate segments in poor network conditions, minimising buffering (the time spent downloading segments while there are no existing segments left to play).
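The core of an ABR switch-down decision can be sketched as picking the highest-bitrate rendition that fits within the measured throughput, with a safety margin. The bandwidth figures and the 0.8 safety factor below are illustrative assumptions; a real player reads the bandwidths from the master playlist and continuously re-measures throughput.

```python
# Illustrative per-rendition bandwidths in bits/s, highest first.
RENDITIONS = [(2_800_000, "720p"), (1_400_000, "540p"),
              (1_000_000, "480p"), (600_000, "360p")]

def pick_rendition(measured_bps: float, safety: float = 0.8) -> str:
    """Pick the highest-bitrate rendition that fits the measured throughput,
    keeping a safety margin; fall back to the lowest rendition."""
    budget = measured_bps * safety
    for bandwidth, name in RENDITIONS:
        if bandwidth <= budget:
            return name
    return RENDITIONS[-1][1]
```

Falling back to the lowest rendition rather than stalling is what keeps playback continuous on poor connections.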
4. Video Lifecycle Management
System maintenance is a key aspect here, as we need to make sure that the data being created is archived on a fixed cycle. This keeps our storage servers from bloating and keeps a tab on our storage costs. We run scheduled Lambda cleanup jobs to avoid provisioning dedicated VMs and save further on costs.
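The selection logic inside such a cleanup job can be sketched as a pure function over object listings; the 90-day retention window is an assumption for illustration, and in the real job the (key, last-modified) pairs would come from an S3 listing, with the returned keys archived or deleted afterwards:

```python
from datetime import datetime, timedelta, timezone

def keys_to_archive(objects, retention_days=90, now=None):
    """Given (key, last_modified) pairs, return the keys that are older
    than the retention window and therefore due for archival."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=retention_days)
    return [key for key, last_modified in objects if last_modified < cutoff]
```

Keeping the age check separate from the S3 calls makes the lambda's core logic trivially unit-testable.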
In the next part, we will cover another critical aspect of the solution: the client library responsible for recording videos and delivering them smoothly. We also needed to pick a winner among the three user-flow variants and conclude the experiment. This led to building another tool, the Video Analysis Tool (VAT), used to review and measure the videos being added to the platform on parameters like quality, relevance, and content (audio and video).
The Video Experiment At OLX, Part 1 — Why and How was originally published in OLX Group Engineering on Medium.