What Makes TikTok’s Algorithms So Effective?

On Tuesday, the newly elected U.S. President Donald J. Trump signed an executive order that gave TikTok a 75-day reprieve from being shut down in the U.S. by the Justice Department, much to the delight of fervent TikTok users.

But, during the signing, the President made his intention clear: TikTok would have to sell half of itself to the U.S., by way of a U.S.-owned entity of some sort, to remain operating in the U.S.

The transactionally minded President then added, as an aside, “Every rich person has called me about TikTok.”

Half can be tricky, though. You want to make sure you get the right half. Otherwise, you’ll end up with the empty husk of a media service, à la Friendster.

The company now has about 170 million users in the U.S. But that audience can vanish quickly. You want the part that keeps users logging in. The valuable part is the algorithm that runs the recommendation service.

TikTok’s massive user base is a testament to its addictive nature. The key to keeping users engaged? Its powerful algorithms. These algorithms drive the recommendation system, constantly feeding users a stream of content tailored to their interests.

As University of Zurich researchers Maximilian Boeker and Aleksandra Urman noted in their study, “An Empirical Investigation of Personalization Factors on TikTok,” the platform’s recommendation system is arguably its most important success driver.

TikTok and its parent company, the Beijing-based ByteDance, have been tight-lipped when it comes to the design and operation of the algorithms that feed their users content.

Those willing to dive into a research paper from ByteDance engineers and other researchers may find some hints about how TikTok keeps its users coming back.

Unveiling the Monolith

The paper, “Monolith: Real Time Recommendation System With Collisionless Embedding Table,” presented at the 2022 ACM Conference on Recommender Systems (RecSys), offers valuable insights. While not claiming to describe TikTok’s exact algorithms, it reveals ByteDance engineers’ approach to designing a highly effective recommendation system.

The paper details the “unorthodox” trade-offs the researchers made that led to significant performance improvements, resulting in a recommendation system called “Monolith” that consistently outperforms other systems with the same memory usage, the researchers asserted.

Former TikTok engineer Arman Khondker pointed to this paper as being fundamental to understanding TikTok’s approach. Khondker hailed the TikTok algorithm as “years ahead of the competition” and “without question, the most valuable piece of software in existence.” Elon Musk himself responded to this claim with a succinct “For now.”

The Goals of TikTok’s Algorithm

A 2021 internal TikTok document obtained by the New York Times revealed the four main goals of the company’s algorithm: user value, long-term user value, creator value, and platform value. Essentially, TikTok prioritizes keeping users engaged and spending time on the platform.

The algorithm considers various factors, including likes, comments, and video watch time, to determine which content to show users. It also aims to diversify recommendations to prevent users from getting bored and losing interest.

For analysts familiar with user retention, it all looked like pretty routine stuff. “Totally reasonable, but traditional stuff,” quipped Julian McAuley, a professor of computer science at the University of California San Diego.

So perhaps the true power of the TikTok algorithms comes not only from the user analysis but also from the considerable speed at which they are executed.

The Power of Real-Time Feedback

The “Monolith” paper highlights the challenges of building recommendation systems that can keep up with users’ rapidly changing preferences.

In a nutshell, the job of a recommendation engine is to predict users’ interests and future behavior, using the latest interactions as the primary input for training the model.

To do this, traditional systems often rely on complex models that are slow to adapt to new data.

The conventional wisdom for building such systems has been to maintain individual models for each task. “Predicting clicks” would get one model, and “predicting watch time” would get another model. Analysis is usually done by batch processing, slowing the rate at which the system learns about the user. The models can’t interact with customer feedback in real time.

Deep learning frameworks like PyTorch and TensorFlow, built for general usage, aren’t really geared for the urgent production demands of online recommendations. TensorFlow separates training from inference, which cuts the model off from the latest input from users. As a result, any competitive system design has to build all sorts of workarounds to jam these batch frameworks into real-time systems.

Monolith, on the other hand, utilizes a single model for all tasks and incorporates real-time feedback through online training. This allows the system to quickly learn and adapt to users’ evolving interests.
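To make the contrast with batch training concrete, here is a minimal sketch of online training, assuming a toy logistic-regression click model that is updated one example at a time. The class name `OnlineClickModel`, the learning rate and the feature IDs are illustrative assumptions and are not taken from the Monolith paper, which describes a far larger deep learning system.

```python
import math


class OnlineClickModel:
    """Toy logistic regression over sparse feature IDs, trained per example."""

    def __init__(self, learning_rate=0.1):
        self.learning_rate = learning_rate
        self.weights = {}  # feature ID -> weight; unseen IDs count as 0.0

    def predict(self, features):
        """Return the estimated click probability for a list of feature IDs."""
        score = sum(self.weights.get(f, 0.0) for f in features)
        return 1.0 / (1.0 + math.exp(-score))

    def update(self, features, clicked):
        """Apply one gradient step as soon as a single piece of feedback arrives."""
        error = self.predict(features) - (1.0 if clicked else 0.0)
        for f in features:
            self.weights[f] = self.weights.get(f, 0.0) - self.learning_rate * error


model = OnlineClickModel()
features = ["user:42", "topic:cooking"]  # hypothetical feature IDs
before = model.predict(features)
for _ in range(20):
    # Each interaction changes the model immediately; no batch job is needed.
    model.update(features, clicked=True)
after = model.predict(features)
print(f"before={before:.2f} after={after:.2f}")
```

The point of the sketch is only the update cadence: the prediction for this user shifts after every interaction rather than after the next scheduled retraining run.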

The paper describes how Kafka can be used to log actions of users, and, in a parallel operation, log features. An Apache Flink job “concatenates features with labels from user actions and produces training examples, which are then written to a Kafka queue. The queue for training examples is consumed by both online training and batch training.”
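The join step quoted above can be pictured with a small in-memory stand-in. The sketch below does not use Kafka or Flink. It assumes two plain Python lists in place of the feature log and the action log, a hypothetical `request_id` key shared by both, and a hypothetical `join_examples` helper. Treating a feature record with no matching action as a negative example is also an assumption made for illustration.

```python
def join_examples(feature_log, action_log):
    """Pair each logged feature record with the user action that followed it.

    Both logs are assumed to share a request_id key (hypothetical).
    Feature records with no matching action are labeled as negatives.
    """
    actions_by_request = {a["request_id"]: a["action"] for a in action_log}
    examples = []
    for record in feature_log:
        action = actions_by_request.get(record["request_id"])
        examples.append({
            "features": record["features"],
            "label": 1 if action == "like" else 0,
        })
    return examples


feature_log = [
    {"request_id": "r1", "features": ["user:42", "video:7"]},
    {"request_id": "r2", "features": ["user:42", "video:9"]},
]
action_log = [{"request_id": "r1", "action": "like"}]

training_queue = join_examples(feature_log, action_log)
# As in the quoted design, the same queue of examples feeds both consumers.
online_training_input = list(training_queue)
batch_training_input = list(training_queue)
print(training_queue)
```

In a real streaming system the two logs arrive at different times, so the join would need buffering and timeouts that this sketch leaves out.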

[Figure: Monolith architecture. The Monolith architecture emphasizes faster online training.]

‘Doomscrolling’ and Table Size

Another challenge for recommendation systems is dealing with “sparse features” and “concept drift.” Sparse features mean that users are only interested in a small subset of content, while concept drift refers to the tendency for users’ interests to change over time as they scroll through an endless stream of videos.

“The same user interested in one topic could shift their zeal every next minute,” the paper states.

If you were to put everything into the embedding table, it would be too large to fit into memory. And traditional fixed-size models don’t take kindly to being enlarged, which would have to happen constantly as new users came aboard.

Monolith tackles these issues by:

  • Rapidly incorporating user feedback into the training process.
  • Keeping the data table manageable through a “collisionless hash table” and fancy feature eviction mechanisms.

Meeting these objectives ensures that the system can keep up with users’ changing preferences without being overwhelmed by the sheer volume of data.

To mash the wide array of sparse features into computer memory, the researchers advocated using a Cuckoo Hashmap design, which minimizes collisions, that is, cases where two keys inadvertently occupy the same space.
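For readers unfamiliar with the idea, here is a toy version of cuckoo hashing in general. It is a sketch and not ByteDance’s implementation: the class name `CuckooHashMap`, the salted use of Python’s built-in `hash()`, the kick limit and the doubling strategy are all assumptions chosen for brevity.

```python
class CuckooHashMap:
    """Toy cuckoo hash map: two tables, one candidate slot per key in each."""

    MAX_KICKS = 32  # give up and grow after this many displacements

    def __init__(self, capacity=8):
        self.capacity = capacity
        self.tables = [[None] * capacity, [None] * capacity]

    def _slot(self, table_index, key):
        # Two different hash functions, derived by salting Python's hash().
        return hash((table_index, key)) % self.capacity

    def get(self, key, default=None):
        # A lookup inspects at most two slots.
        for t in (0, 1):
            entry = self.tables[t][self._slot(t, key)]
            if entry is not None and entry[0] == key:
                return entry[1]
        return default

    def put(self, key, value):
        # Overwrite in place if the key is already stored.
        for t in (0, 1):
            slot = self._slot(t, key)
            entry = self.tables[t][slot]
            if entry is not None and entry[0] == key:
                self.tables[t][slot] = (key, value)
                return
        entry, t = (key, value), 0
        for _ in range(self.MAX_KICKS):
            slot = self._slot(t, entry[0])
            # Place the entry and pick up whatever occupied the slot before.
            entry, self.tables[t][slot] = self.tables[t][slot], entry
            if entry is None:
                return
            t = 1 - t  # the displaced entry tries its slot in the other table
        # Too many displacements: grow both tables, then place the leftover.
        self._grow()
        self.put(entry[0], entry[1])

    def _grow(self):
        old = [e for table in self.tables for e in table if e is not None]
        self.capacity *= 2
        self.tables = [[None] * self.capacity, [None] * self.capacity]
        for k, v in old:
            self.put(k, v)


table = CuckooHashMap()
for feature_id in range(100):
    table.put(feature_id, [0.0, 0.0])  # placeholder embedding vectors
table.put(7, [1.0, 2.0])
print(table.get(7), table.get(1000))
```

Because every key keeps its own slot, no two feature IDs end up sharing one embedding, which is what happens when IDs are simply hashed into a fixed-size table.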

To further reduce memory usage, seldom-used IDs are trimmed away.
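One way such trimming could work is sketched below, assuming two rules: an ID only gets an embedding after it has been seen a minimum number of times, and an admitted ID is dropped once it has been idle for too long. The class name `EvictingEmbeddingTable`, the thresholds and the step counter are hypothetical, and the real system’s policy may differ.

```python
class EvictingEmbeddingTable:
    """Toy embedding store that admits only frequent IDs and drops stale ones."""

    def __init__(self, min_count=3, max_idle_steps=100, dim=4):
        self.min_count = min_count            # hypothetical admission threshold
        self.max_idle_steps = max_idle_steps  # hypothetical expiry window
        self.dim = dim
        self.counts = {}      # occurrences seen before admission
        self.embeddings = {}  # admitted IDs only
        self.last_seen = {}   # step at which each admitted ID was last used

    def observe(self, feature_id, step):
        """Record one occurrence; return the embedding, or None if not admitted."""
        if feature_id not in self.embeddings:
            self.counts[feature_id] = self.counts.get(feature_id, 0) + 1
            if self.counts[feature_id] < self.min_count:
                return None
            # Seen often enough: allocate an embedding and stop counting.
            del self.counts[feature_id]
            self.embeddings[feature_id] = [0.0] * self.dim
        self.last_seen[feature_id] = step
        return self.embeddings[feature_id]

    def evict_stale(self, step):
        """Remove admitted IDs that have been idle longer than max_idle_steps."""
        stale = [f for f, seen in self.last_seen.items()
                 if step - seen > self.max_idle_steps]
        for f in stale:
            del self.embeddings[f]
            del self.last_seen[f]
        return stale


table = EvictingEmbeddingTable(min_count=3, max_idle_steps=100)
first = table.observe("video:7", step=1)
table.observe("video:7", step=2)
third = table.observe("video:7", step=3)
table.observe("video:9", step=150)  # seen once, so never admitted
evicted = table.evict_stale(step=200)
print(first, third, evicted, len(table.embeddings))
```

The trade-off in this sketch is that rarely seen IDs get no personalized embedding at all, in exchange for a table that stays bounded by what users are actually interacting with.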

Personalized Content Fast

In essence, Monolith represents a shift away from the traditional microservices approach to building systems, opting instead for a more unified, monolithic architecture.

While ByteDance has not confirmed whether the Monolith architecture is used in TikTok (or the Chinese version, Douyin), all the company’s services depend on the ability to deliver personalized content at lightning speed.

TikTok enjoys nearly 2 billion users, along with its Chinese version Douyin, operated as an independent entity just for China. The company also runs Toutiao, a news and personal content aggregator app. There’s a long-form video platform, Xigua Video, as well as a gaming division, a hosted recommendation service (BytePlus Recommend), an enterprise collaboration suite (Lark) and the popular CapCut video editing tool and hosted service.

On TikTok, there is almost no work needed on the part of the user to find compelling content, observed researcher Zhengwei Zhao of Sun Yat-sen University in China. As soon as the user starts swiping, the app starts learning.

The more frequently someone uses TikTok, “the more accurate the algorithm will be.”

The post What Makes TikTok’s Algorithms So Effective? appeared first on The New Stack.

TikTok's recommendation system is incredibly good at understanding what users want — so good that it's the envy of tech titans like Elon Musk. But what makes TikTok tick?