Avatars and Identity in the Metaverse, Part 1

In Sanskrit, avatar (अवतार) refers to “an incarnation in human form.” In Roblox, few things reflect a user’s identity more directly than their avatar. As we’ll discover, there is no “standard” Roblox user, and the fantastical aesthetic variety in our users’ avatars directly reflects the diversity of the user base itself.

Characterizing Avatars (Methodology)

If we’re interested in aesthetic diversity, we need to start by characterizing avatar aesthetics. The most natural place to look is the 2D avatar thumbnail that often represents users to one another. For aesthetic analysis, we need to turn this thumbnail into a semantically meaningful numerical representation. There are many ways to reduce an image’s dimensionality; here are a few that we can try.

1. The simplest approach: directly apply principal component analysis (PCA) to the flattened thumbnail images. To evaluate the “quality” of the reduction, we visualize thumbnails on the extremes of the principal components (PCs). We can see that while the first PC distinguishes between interpretable types of avatars, the twelfth is too broad to be meaningful.

PC 1 (14.3% of variance explained):

PC 12 (1.5% of variance explained):

2. Almost as simple: we can pass each thumbnail through an off-the-shelf pretrained image classification network (ResNet-18), take the output of its last hidden layer as an embedding, and evaluate embedding quality by clustering the embeddings. Observe how ResNet captures color information very effectively (see all the blue shoes in the second cluster) but sometimes fails to encode shape information (see the first cluster).

Samples of thumbnails from 2 clusters are shown below:

3. To get a visual read on cohesiveness, we can apply UMAP to reduce the image classification embeddings all the way down to 2 dimensions. While there do seem to be discernible clusters, the large blob of points in the bottom right looks suspicious. Rightly so: samples from that megacluster are visually incohesive.

2D embedding plot:

Samples from the megacluster in the 2D embedded space:

4. Train a small custom variational autoencoder (VAE) directly on the thumbnail data. Ideally, this captures the unique aesthetic variation in Roblox avatars better than a general-purpose image classifier does. (A cute aside: K-means is particularly appropriate for clustering these embeddings, as its implicit assumption of normally distributed clusters matches up with the VAE’s Gaussian latent variable posterior.)

While there are metrics that attempt to quantify the benefits of different approaches, practical use cases for unsupervised learning often come down to subjective judgment. Anecdotally, we find the most success with approach 4, the custom VAE.
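To make the first approach concrete, here is a minimal sketch of PCA on flattened thumbnails, written with NumPy's SVD. It is an illustration under stated assumptions, not the code used for this post: the function names `pca_thumbnails` and `extremes` are hypothetical, and the input is random pixel data standing in for real avatar thumbnails.

```python
import numpy as np

def pca_thumbnails(images, n_components=12):
    """Project flattened images onto their top principal components.

    images: array of shape (n_images, height, width, channels).
    Returns (scores, components, explained_variance_ratio).
    """
    n = images.shape[0]
    flat = images.reshape(n, -1).astype(np.float64)
    centered = flat - flat.mean(axis=0)
    # Thin SVD of the centered data: rows of vt are the principal directions,
    # and singular values come back sorted from largest to smallest.
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    k = min(n_components, len(s))
    scores = u[:, :k] * s[:k]
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, vt[:k], explained[:k]

def extremes(scores, component, count=3):
    """Indices of the images with the lowest and highest score on one PC."""
    order = np.argsort(scores[:, component])
    return order[:count], order[-count:]

# Hypothetical stand-in data: random pixels, NOT real avatar thumbnails.
rng = np.random.default_rng(0)
fake_thumbnails = rng.random((50, 16, 16, 3))

scores, components, explained = pca_thumbnails(fake_thumbnails, n_components=12)
low, high = extremes(scores, component=0)  # candidates to eyeball for PC 1
```

Viewing the images indexed by `low` and `high` side by side is one way to do the "extremes of the principal components" check described in approach 1; `explained` holds the fraction of variance captured by each component.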

The Avatar Manifold

Using the VAE, we can transform the thumbnails into succinct 64-dimensional vectors for clustering. Here are some examples of the VAE + K-means clusters from a 20-way clustering:

Some very customized avatars in one cluster:

Tall and thin avatars, which we call “Rthro,” in another cluster:

Big and blocky avatars, which we call “Blocky,” in this cluster:

Default avatars here:

Lightly customized in-between Rthro and Blocky body type in this one:

Dark Angels of Roblox

“Look Over There!”

The Black Cube

I Believe I Can Fly

The consistency of the clusters across multiple runs, random initializations, and choices of k suggests that avatars naturally fall into distinct (albeit fuzzy) categories. On the extremes of contour, we have the old-fashioned, square-bodied “Blocky” characters opposite the tall, thin, more lifelike “Rthro” avatars. We also find a number of default avatars, which users have not edited since joining Roblox (cluster 4 above). In between, there’s everything from “goth ninjas” to “going clubbing.”
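One way to put a number on that kind of consistency is to rerun K-means with different random initializations and measure how often two runs agree on whether a pair of points belongs together. The sketch below does this with a plain Lloyd's-algorithm K-means and the Rand index. The post does not say how consistency was actually assessed, so the metric, the helper names (`assign`, `kmeans`, `rand_index`), and the synthetic 64-dimensional data are all assumptions made for illustration.

```python
from itertools import combinations

import numpy as np

def assign(x, centers):
    """Label each row of x with the index of its nearest center."""
    dists = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)

def kmeans(x, k, seed, n_iter=50):
    """Plain Lloyd's algorithm with random-point initialization."""
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), size=k, replace=False)].copy()
    for _ in range(n_iter):
        labels = assign(x, centers)
        for j in range(k):
            members = x[labels == j]
            if len(members) > 0:  # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)
    return assign(x, centers)

def rand_index(a, b):
    """Fraction of point pairs that two labelings treat the same way
    (both put the pair together, or both keep it apart)."""
    same_a = a[:, None] == a[None, :]
    same_b = b[:, None] == b[None, :]
    upper = np.triu_indices(len(a), k=1)
    return float(np.mean(same_a[upper] == same_b[upper]))

# Hypothetical stand-in for VAE embeddings: three synthetic 64-d blobs.
rng = np.random.default_rng(0)
true_centers = rng.normal(scale=5.0, size=(3, 64))
embeddings = np.concatenate(
    [c + rng.normal(size=(40, 64)) for c in true_centers]
)

# Cluster with five different seeds and average the pairwise agreement.
runs = [kmeans(embeddings, k=3, seed=s) for s in range(5)]
mean_agreement = float(
    np.mean([rand_index(a, b) for a, b in combinations(runs, 2)])
)
```

The Rand index ignores how clusters are numbered, which matters here because K-means assigns cluster IDs arbitrarily on each run. A value near 1 means the runs found essentially the same grouping.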

Identity through Avatar

How do these aesthetic clusters relate to our users themselves?

The easiest place to start is user behavior on the platform. We plot four engagement indicators by cluster: avatar edits in the last month, account age in weeks, total seconds of playtime, and one-month retention. The four resulting graphs show dramatic variation across clusters. Users with heavily customized avatars tend to be the most engaged and the most frequently retained, while users with less customized avatars tend to be less engaged.
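A per-cluster comparison like this boils down to grouping users by cluster and averaging each indicator. The following is a minimal sketch using only the Python standard library. The field names and every value in `users` are made-up placeholders for illustration, not Roblox data or the schema used for the actual analysis.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows: one dict per user, with an assigned avatar cluster.
# All values are invented placeholders, not real measurements.
users = [
    {"cluster": 0, "edits_last_month": 14, "account_age_weeks": 120,
     "playtime_seconds": 90_000, "retained": True},
    {"cluster": 0, "edits_last_month": 10, "account_age_weeks": 80,
     "playtime_seconds": 70_000, "retained": True},
    {"cluster": 4, "edits_last_month": 0, "account_age_weeks": 2,
     "playtime_seconds": 1_200, "retained": False},
    {"cluster": 4, "edits_last_month": 0, "account_age_weeks": 4,
     "playtime_seconds": 2_800, "retained": True},
]

METRICS = ("edits_last_month", "account_age_weeks",
           "playtime_seconds", "retained")

def summarize_by_cluster(rows):
    """Average each engagement metric within each cluster.

    Booleans are cast to floats, so the mean of `retained`
    is the retention rate for that cluster.
    """
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["cluster"]].append(row)

    summary = {}
    for cluster, members in sorted(grouped.items()):
        stats = {m: mean(float(r[m]) for r in members) for m in METRICS}
        stats["n_users"] = len(members)
        summary[cluster] = stats
    return summary

summary = summarize_by_cluster(users)
```

Each entry of `summary` then supplies one bar (or point) per graph, keyed by cluster.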

There are two opposite causal interpretations of this. One is that users who edit their avatar become more engaged with Roblox as a result. The other is that users who are already invested in Roblox tend to pour more effort into their avatars as time goes on. There’s been great work by others at Roblox determining which interpretation to believe.

Regardless of causality, we see that two aspects of on-platform identity (aesthetic representation and level of engagement) are closely intertwined. What about off-platform identity, though? How do our users’ real-life identifiers (age, geography, gender, etc.) intersect with their Roblox identities? Check out Part 2 of this blog post to find out!

Nameer Hirschkind is a Data Science Engineer at Roblox. He works on the Avatar Shop to ensure its economy is healthy and thriving.

©2021 Roblox Corporation. Roblox, the Roblox logo and Powering Imagination are among our registered and unregistered trademarks in the U.S. and other countries.