- Can we leverage existing cloud infrastructure from Amazon, Microsoft, or Google, or are we expected to build everything ourselves?
- We can expect the answer to be that we can leverage it, because building everything from scratch is unrealistic for most companies.
- Cost optimization at the CDN level
- Definitely worth discussing, because CDN traffic costs a lot.
-
- (Optional) Recommendation system based on your preferences (immediate changes, scheduled changes)
- bitrates & transcoding questions
-
- (Optional) Prepared set of 100 videos on the main screen
- (Optional) Offline streaming?
- Switching between bitrates for a smooth user experience?
-
(Optional, it leads us to another system design) Live streaming: It refers to the process of recording and broadcasting a video in real time. The notable differences are:
- Live streaming has a stricter latency requirement (lower latency is needed), so it might need a different streaming protocol.
- Live streaming has a lower requirement for parallelism because small chunks of data are already processed in real-time.
- Live streaming requires different sets of error handling. Any error handling that takes too much time is not acceptable.
- Ability to upload videos fast
- Smooth video streaming
- Ability to change video quality
- Low infrastructure cost
- High availability, scalability, and reliability requirements
- Clients supported: mobile apps, web browsers, and smart TVs
- Assume the product has 5 million daily active users (DAU).
- Users watch 5 videos per day.
- 10% of users upload 1 video per day.
- Assume the average video size is 300 MB.
- Total daily storage space needed: 5 million * 10% * 300 MB = 150 TB
- CDN cost.
- When a cloud CDN serves a video, you are charged for data transferred out of the CDN.
- Let us use Amazon’s CDN CloudFront for cost estimation. Assume 100% of traffic is served from the United States. The average cost per GB is $0.02. For simplicity, we only calculate the cost of video streaming.
- 5 million * 5 videos * 0.3GB * $0.02 = $150,000 per day.
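The back-of-the-envelope numbers above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: the constants are the assumptions listed in the notes, the function names are made up for illustration, and decimal units (1 TB = 1,000,000 MB, 1 GB = 1,000 MB) are assumed.

```python
# Assumptions taken from the estimation bullets above.
DAU = 5_000_000                 # daily active users
VIDEOS_WATCHED_PER_USER = 5     # videos watched per user per day
UPLOAD_PERCENT = 10             # percent of users who upload 1 video per day
AVG_VIDEO_SIZE_MB = 300         # average video size
CDN_COST_PER_GB_USD = 0.02      # assumed average CDN price per GB


def daily_storage_tb(dau=DAU, upload_percent=UPLOAD_PERCENT,
                     size_mb=AVG_VIDEO_SIZE_MB):
    """Storage added per day, in TB (decimal units)."""
    uploads_per_day = dau * upload_percent // 100
    return uploads_per_day * size_mb / 1_000_000


def daily_cdn_cost_usd(dau=DAU, videos=VIDEOS_WATCHED_PER_USER,
                       size_mb=AVG_VIDEO_SIZE_MB,
                       cost_per_gb=CDN_COST_PER_GB_USD):
    """Streaming cost per day, assuming every view transfers the whole video."""
    gb_transferred = dau * videos * size_mb / 1000
    return gb_transferred * cost_per_gb


print(f"storage per day: {daily_storage_tb():.0f} TB")
print(f"CDN cost per day: ${daily_cdn_cost_usd():,.0f}")
```

Changing a single constant (for example the price per GB) shows quickly how sensitive the daily bill is to each assumption.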
- CDN and blob storage are the cloud services we will leverage.
- Load balancer: A load balancer evenly distributes requests among API servers.
- API servers: All user requests go through API servers except video streaming.
- Metadata DB: Video metadata are stored in Metadata DB. It is sharded and replicated to meet performance and high availability requirements.
- Metadata cache: For better performance, video metadata and user objects are cached.
- Original storage: A blob storage system is used to store original videos. Wikipedia describes blob storage as follows: “A Binary Large Object (BLOB) is a collection of binary data stored as a single entity in a database management system” [6].
- Transcoding servers: Video transcoding is also called video encoding. It is the process of converting a video format to other formats (MPEG, HLS, etc.), which provide the best video streams possible for different devices and bandwidth capabilities.
- Transcoded storage: It is a blob storage that stores transcoded video files.
- CDN: Videos are cached in CDN. When you click the play button, a video is streamed from the CDN.
- Completion queue: It is a message queue that stores information about video transcoding completion events.
- Completion handler: This consists of a set of workers that pull event data from the completion queue and update the metadata cache and database.
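The completion queue and completion handler described above can be sketched as a simple worker loop. This is a minimal illustration, not a production design: plain dicts stand in for the Metadata DB and the metadata cache, `queue.Queue` stands in for a real message queue, and the event fields (`video_id`, `renditions`) are hypothetical.

```python
import queue


def run_completion_handler(completion_queue, metadata_db, metadata_cache):
    """Drain the completion queue and mark each transcoded video as ready.

    Returns the number of events processed.
    """
    processed = 0
    while True:
        try:
            event = completion_queue.get_nowait()
        except queue.Empty:
            return processed
        video_id = event["video_id"]
        # Update the database first, then refresh the cache from it.
        record = metadata_db.setdefault(video_id, {})
        record["status"] = "ready"
        record["renditions"] = event["renditions"]
        metadata_cache[video_id] = dict(record)
        processed += 1


# Usage with in-memory stand-ins.
events = queue.Queue()
events.put({"video_id": "v1", "renditions": ["360p", "720p"]})
db = {"v1": {"title": "demo", "status": "processing"}}
cache = {}
run_completion_handler(events, db, cache)
print(db["v1"]["status"], cache["v1"]["renditions"])
```

Writing to the database before the cache is one possible ordering; it keeps the cache from holding data that was never persisted.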
Transcoding is computationally expensive and time-consuming.
Meta uses a DAG (Directed Acyclic Graph) programming model, which defines tasks in stages so they can be executed in parallel or sequentially.
- Inspection: Make sure videos have good quality and are not malformed.
- Video encodings: Videos are converted to support different resolutions, codecs, bitrates, etc. Figure 14-9 shows an example of encoded video files.
- Thumbnail: Thumbnails can either be uploaded by a user or automatically generated by the system.
- Watermark: An image overlay on top of the video that contains identifying information about it.
Preprocessor
- Video splitting. The video stream is split, or further split, into smaller chunks aligned to Group of Pictures (GOP) boundaries. A GOP is a group/chunk of frames arranged in a specific order. Each chunk is an independently playable unit, usually a few seconds in length.
- Some old mobile devices or browsers might not support video splitting. The preprocessor splits videos by GOP alignment for old clients.
- DAG generation. The processor generates the DAG based on configuration files that client programmers write. Figure 14-12 is a simplified DAG representation which has 2 nodes and 1 edge.
- Cache video segments. The preprocessor stores video segments and metadata in temporary storage for resiliency, so that a failed step can be retried from the stored data.
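To make the GOP idea concrete, here is a small sketch of GOP-aligned splitting. It assumes each frame is represented only by its type (`"I"`, `"P"`, or `"B"`) and that every GOP starts at an I-frame (keyframe); the function name and the frame representation are illustrative, not part of any real encoder API.

```python
def split_into_gops(frame_types):
    """Split a sequence of frame types into GOPs.

    A new GOP starts at every I-frame, so each chunk begins with a
    keyframe and can be decoded on its own.
    """
    gops = []
    for frame in frame_types:
        # Start a new chunk on a keyframe (or at the very first frame).
        if frame == "I" or not gops:
            gops.append([])
        gops[-1].append(frame)
    return gops


frames = list("IPPBIPPIBP")
for index, gop in enumerate(split_into_gops(frames)):
    print(index, "".join(gop))
```

Because every chunk starts with a keyframe, the chunks can be handed to different workers and encoded independently.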
DAG Scheduler
Idea: split the video processing into independent tasks.
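One way to picture what a DAG scheduler does is to group tasks into stages, where every task in a stage depends only on earlier stages and can therefore run in parallel with its stage peers. The sketch below uses Python's standard `graphlib.TopologicalSorter` (Python 3.9+); the task names and the dependency graph are hypothetical examples, not the actual pipeline.

```python
from graphlib import TopologicalSorter


def schedule_stages(dag):
    """Group tasks into stages.

    `dag` maps each task to the set of tasks it depends on.
    Tasks within one stage have no dependencies on each other.
    """
    sorter = TopologicalSorter(dag)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        stages.append(ready)
        sorter.done(*ready)
    return stages


# Hypothetical pipeline: split first, then three independent tasks, then merge.
pipeline = {
    "video_encoding": {"split"},
    "audio_encoding": {"split"},
    "thumbnail": {"split"},
    "merge": {"video_encoding", "audio_encoding", "thumbnail"},
}
for number, stage in enumerate(schedule_stages(pipeline), start=1):
    print(number, stage)
```

Each stage would then be handed to the resource manager, which decides where the individual tasks actually run.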
Resource Manager. Also known as Task Scheduler
- Manages resource allocation and task priorities.
- Finds the optimal worker for a given task.
- Manages the job queue.
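The three responsibilities above can be sketched with a priority queue plus a worker-selection rule. Everything here is an assumption for illustration: "optimal" is taken to mean the worker with the fewest free cores that still fits the task (best fit), lower numbers mean higher priority, and the class and method names are invented.

```python
import heapq
import itertools


class ResourceManager:
    def __init__(self, workers):
        # workers: worker name -> number of free CPU cores (hypothetical metric)
        self.free_cores = dict(workers)
        self._queue = []                   # heap of (priority, seq, task, cores)
        self._seq = itertools.count()      # tie-breaker keeps FIFO order

    def submit(self, task_name, priority, cores_needed):
        """Add a task to the job queue; a lower priority value runs first."""
        heapq.heappush(
            self._queue, (priority, next(self._seq), task_name, cores_needed)
        )

    def dispatch(self):
        """Assign the highest-priority task to the best-fitting worker.

        Returns (task_name, worker_name), or None if the queue is empty
        or no worker currently has enough free cores.
        """
        if not self._queue:
            return None
        _, _, task_name, cores = self._queue[0]
        candidates = [
            (free, name) for name, free in self.free_cores.items() if free >= cores
        ]
        if not candidates:
            return None
        _, worker = min(candidates)  # best fit: least spare capacity
        heapq.heappop(self._queue)
        self.free_cores[worker] -= cores
        return task_name, worker


manager = ResourceManager({"worker-1": 8, "worker-2": 2})
manager.submit("thumbnail", priority=2, cores_needed=1)
manager.submit("encode_1080p", priority=1, cores_needed=4)
print(manager.dispatch())
print(manager.dispatch())
```

A real resource manager would also release cores when a task finishes and track running tasks; those parts are left out to keep the sketch short.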
Encoded Video
Is the final output of the encoding pipeline.
- Upload videos to a location close to the end user
- High Parallelism
Protocols
- MPEG-DASH. MPEG stands for "Moving Picture Experts Group" and DASH stands for "Dynamic Adaptive Streaming over HTTP".
- Apple HLS. HLS stands for "HTTP Live Streaming".
- Microsoft Smooth Streaming
- Adobe HTTP Dynamic Streaming (HDS)
Naive video streaming diagram:
Recommendation: They all support different video encodings. You have to choose the right streaming protocol for your business case.
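Adaptive streaming protocols let the player switch between bitrates as network conditions change. A minimal sketch of the client-side decision follows; the rendition ladder, the bitrate values, and the 0.8 safety factor are all hypothetical, and real players use more elaborate heuristics (buffer level, throughput history).

```python
# Hypothetical rendition ladder: name -> required bitrate in kbps.
RENDITIONS_KBPS = {"360p": 800, "480p": 1400, "720p": 2800, "1080p": 5000}


def pick_rendition(measured_kbps, renditions=RENDITIONS_KBPS, safety=0.8):
    """Pick the highest rendition that fits within a safety margin.

    Falls back to the lowest rendition when nothing fits, so playback
    can continue on a slow connection.
    """
    budget = measured_kbps * safety
    affordable = [
        (kbps, name) for name, kbps in renditions.items() if kbps <= budget
    ]
    if not affordable:
        return min(renditions, key=renditions.get)
    return max(affordable)[1]


for bandwidth in (500, 2000, 4000, 10000):
    print(bandwidth, "kbps ->", pick_rendition(bandwidth))
```

The player would re-run this choice for each new segment, which is what makes switching quality mid-playback possible.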
- For less popular content, we may not need to store many encoded video versions. Short videos can be encoded on-demand.
- Some videos are popular only in certain regions. There is no need to distribute these videos to other regions.
- Build your own CDN like Netflix and partner with Internet Service Providers (ISPs). Building your own CDN is a giant project; however, it could make sense for large streaming companies. An ISP can be Comcast, AT&T, Verizon, or another internet provider.
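The first two cost ideas above can be expressed as simple policy functions. This is only a sketch: the thresholds (1,000 daily views, 60 seconds, 5% regional share) and the function names are invented for illustration and would need tuning against real traffic data.

```python
def encoding_plan(daily_views, duration_s,
                  popular_threshold=1000, short_threshold_s=60):
    """Decide how much encoding work to do up front for a video."""
    if daily_views >= popular_threshold:
        return "pre-encode all renditions"
    if duration_s <= short_threshold_s:
        return "encode on demand"
    return "pre-encode a reduced set of renditions"


def cdn_regions(views_by_region, min_share=0.05):
    """Return the regions that account for enough views to justify CDN distribution."""
    total = sum(views_by_region.values())
    if total == 0:
        return []
    return sorted(
        region for region, views in views_by_region.items()
        if views / total >= min_share
    )


print(encoding_plan(daily_views=20, duration_s=45))
print(cdn_regions({"US": 900, "EU": 95, "APAC": 5}))
```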