Analyzing 4,500 tweets from Amazon’s most famous celebrity employee

Curious to know what resonates with global thought leaders’ audiences on social media? Resident Data Evangelist Michelangiolo Mazzeschi and Content Marketer Andrés Slaughter have analyzed the text and subject matter of 4,500 of Allie K. Miller’s tweets.

  • *Allie K. Miller* is the Global Head of Machine Learning BD, Startups, and Venture Capital at AWS.

Having an audience is essential, but understanding what they like and don’t like is invaluable. Posting without that insight is similar to burning money (we know you aren’t that crazy).

While you might understand the value of a follower and have tens of thousands, you might not always know how to gather those valuable insights that result in actionable steps, i.e., what to tweet next and how to maximize follower engagement.

Allie K. Miller has captured one of the largest followings on LinkedIn and is the most followed woman in the Machine Learning and AI space on social media, period.

She’s proactive and driven, to the point of achieving a double-major MBA from The Wharton School and becoming the youngest-ever woman to build an artificial intelligence product at IBM, whilst winning LinkedIn Top Voice three years in a row.

What sets her apart from her peers is her relatable nature. She closes the gap between influencer and the average person so well that it’s hard to tell she’s as accomplished as she is.

Her most frequent content on social media tends to be social interactions, whether at a conference, committee or running into a fan on the street. This is how she’s fostered a community of like-minded individuals who love to see the blend of professionalism and charisma that becomes infectious even over a phone screen.

If you would like to see the Allie K. Miller Tweet Subject Clusters in the dashboard while reading along, then you can view the Clustering Dashboard here.

What makes a successful Allie K. Miller Tweet

  • Allie K. Miller’s audience LOVES artificial intelligence, machine learning, AWS, collaborations with notable figures in the AI space, celebrity interviews, and women in tech.
  • Elon Musk, Boston Dynamics, burritos, and emails are subjects to avoid if you want high engagement.
  • Women in tech and women in science were the most consistently tweeted well-performing subjects on her profile: 39 tweets relating to the subject matter received over 1,300 likes and nearly 200 retweets.

As a whole, the subject matter received 2,632 likes across 92 tweets.

Similarly, conference attendance and San Francisco also performed well as tweet subjects, receiving nearly 2,000 likes.

Some of the most consistently tweeted subjects that do the worst in terms of overall like, retweet, and reply counts are:

  • Public @’s and replies to her content
  • Follower engagement
  • Tweets relating to Twitter

The Good

  • Tweets that achieved the highest number of likes were based around meeting celebrities (12,381 likes in total).
  • The most frequently engaged content involved human interactions, social meetups, and mentoring (371+ likes).
  • Airbnb and TikTok were subjects that also resonated with her audience, receiving 690 and 768 likes, respectively.
  • Hypothesis: I believe this is due to the broad relatability of celebrities. Most of her audience can instantly recognise the celebrities mentioned, and their near-mythical status means that any interaction documented online lets a follower appreciate the connection, however distant it is from them. Followers can then relate to the feeling Allie shared in her most viral celebrity tweets. I also infer that time-sensitive content about trending topics tends to do well, as was the case with Airbnb and TikTok.
  • Specifically, content mentioning Robert Downey Jr. and Mark Hamill achieved the most likes.
  • These are the two most-liked tweets (#1 and #2) of her entire 4,500-tweet history. What they have in common is their subject: a notable celebrity.
  • Similarly, summaries of professional content achieved hundreds of likes.
  • When Allie compiles relevant and useful guidebooks or links to resources in one tweet, its potential for virality skyrockets as its usefulness increases tenfold.

Hypothesis: This particular tweet does better than other tweets on the same subject matter because Allie has pinned it to her profile, which gives it more impressions and visibility among non-followers.

  • These tweet subjects earn the most retweets:
    • Artificial Intelligence (229 retweets)
    • Women In Tech (187 retweets)
  • RE:MARS content achieved a somewhat similar retweet count.
  • Content that achieved the most replies focused on guidebooks and scheduling.
    • As a subject, this content ranked #15 in overall likes, but some of its individual tweets, such as the aforementioned one, ranked in the top ten.

The Bad

  • The least engaging content involved these topics:
    • Emails (14 likes)
      • Hypothesis: These are conversational tweet replies, so they lack the excitement and substance that her audience responds to.
    • Elon Musk (16 likes)
      • Hypothesis: Vague remarks about exciting topics don’t seem to resonate with her audience, who expect in-depth, detailed insights on any topic.
    • Boston Dynamics (16 likes)
      • Hypothesis: Boston Dynamics as a topic is either not interesting enough for her audience or too conversational to offer any insight. A filler tweet such as this mainly keeps the profile active.
    • Sunsprite (17 likes)
  • Overall, the topics that resonated the least were either too controversial or not explicitly related to the AI/ML space.

The Ugly

  • The least tweeted subjects include:
    • 1st – Random – Inner thoughts
      • Three tweets total
    • 2nd – Finsta – Initials and naming
      • Five tweets total
    • 3rd – Japan – Currency
      • Seven tweets total
    • 4th – Supervised vs. Unsupervised learning
      • Seven tweets total
    • 5th – Blocked
      • Seven tweets total


Compiled from more than 4,500 tweets taken from Allie K. Miller’s Twitter profile. These were exported into a CSV file, uploaded as a dataset to the Relevance AI platform, and then vectorized and clustered.
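As a rough illustration of the CSV step, here is a minimal sketch of loading such an export with Python's standard library. The column names (`tweet_id`, `text`, `likes`, `retweets`, `replies`) and the sample rows are assumptions made up for the example; the real export's schema is not described in this write-up.

```python
import csv
import io

# Hypothetical sample export. Column names and rows are illustrative
# assumptions, not the schema of the real dataset.
SAMPLE_CSV = """tweet_id,text,likes,retweets,replies
1,"Excited to speak about women in tech today!",120,15,4
2,"New guide: getting started with machine learning",85,22,9
3,"@someone thanks for the email",3,0,1
"""

def load_tweets(handle):
    """Parse tweet rows from a CSV file handle, casting engagement counts to int."""
    rows = []
    for row in csv.DictReader(handle):
        rows.append({
            "tweet_id": row["tweet_id"],
            "text": row["text"].strip(),
            "likes": int(row["likes"]),
            "retweets": int(row["retweets"]),
            "replies": int(row["replies"]),
        })
    return rows

# io.StringIO stands in for an opened CSV file here.
tweets = load_tweets(io.StringIO(SAMPLE_CSV))
print(len(tweets), "tweets loaded;", sum(t["likes"] for t in tweets), "likes in total")
```

With a real file, `open("tweets.csv", newline="", encoding="utf-8")` would take the place of the `io.StringIO` handle (the file name is again a placeholder).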

Technical Write up

Using the all-MiniLM-L6-v2 encoder, we converted the 4,500 tweets into 384-dimensional vectors (the output size of that model). Afterward, we applied a clustering algorithm (K-means) with a total of 240 clusters to group the tweets that were most similar to each other.
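To make the clustering step concrete, here is a minimal pure-Python sketch of K-means. It is not the Relevance AI implementation: it runs on six made-up 2-D vectors with k=2, where the analysis above used sentence embeddings and 240 clusters, and in practice a library implementation such as scikit-learn's `KMeans` would be the usual choice.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean_vector(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def kmeans(vectors, k, iterations=50, seed=0):
    """Plain K-means: assign each vector to its nearest centroid, then recompute centroids."""
    rng = random.Random(seed)
    # Initialise centroids from k distinct input vectors.
    centroids = [list(v) for v in rng.sample(vectors, k)]
    assignments = [None] * len(vectors)
    for _ in range(iterations):
        new_assignments = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        if new_assignments == assignments:
            break  # converged: no vector changed cluster
        assignments = new_assignments
        for c in range(k):
            members = [v for v, a in zip(vectors, assignments) if a == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = mean_vector(members)
    return assignments, centroids

# Toy stand-ins for tweet embeddings: two well-separated groups.
toy_vectors = [
    [0.0, 0.1], [0.1, 0.0], [0.05, 0.05],
    [5.0, 5.1], [5.1, 5.0], [4.9, 5.0],
]
labels, centroids = kmeans(toy_vectors, k=2)
print(labels)
```

The cluster ids themselves are arbitrary; what matters is that the first three vectors share one label and the last three share the other.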

Because each tweet came with additional features alongside its text, we used this extra data to get the most out of the Relevance AI clustering application and display additional information for each cluster.


The clustering app shows the representative samples for every cluster, along with all the corresponding metrics, which can be used as additional insights.
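One plausible way to compute such a summary is sketched below: for each cluster, count the tweets, total the likes, and pick the tweet whose vector lies closest to the cluster centroid as the representative sample. The field names, toy vectors, and the "closest to centroid" rule are assumptions for illustration; the write-up does not say how the app selects its representatives.

```python
from collections import defaultdict

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def summarize_clusters(tweets, top_n=1):
    """Per cluster: tweet count, total likes, and the tweets nearest the centroid."""
    groups = defaultdict(list)
    for tweet in tweets:
        groups[tweet["cluster"]].append(tweet)

    summary = {}
    for cluster_id, members in groups.items():
        dims = len(members[0]["vector"])
        centroid = [
            sum(m["vector"][d] for m in members) / len(members)
            for d in range(dims)
        ]
        # Rank members by distance to the centroid (closest first).
        ranked = sorted(members, key=lambda m: squared_distance(m["vector"], centroid))
        summary[cluster_id] = {
            "tweet_count": len(members),
            "total_likes": sum(m["likes"] for m in members),
            "representatives": [m["text"] for m in ranked[:top_n]],
        }
    return summary

# Made-up rows: texts, vectors, and like counts are illustrative only.
toy_tweets = [
    {"text": "Women in tech panel", "vector": [0.0, 0.0], "likes": 40, "cluster": 0},
    {"text": "Mentoring women in STEM", "vector": [0.2, 0.0], "likes": 60, "cluster": 0},
    {"text": "Women in science day", "vector": [1.0, 0.0], "likes": 20, "cluster": 0},
    {"text": "Reply about an email", "vector": [5.0, 5.0], "likes": 3, "cluster": 1},
    {"text": "Another short reply", "vector": [5.0, 6.0], "likes": 5, "cluster": 1},
]
summary = summarize_clusters(toy_tweets)
for cluster_id, stats in sorted(summary.items()):
    print(cluster_id, stats)
```

Ranking clusters by `total_likes` or by likes per tweet then gives the kind of "best" and "worst" subject lists reported earlier in this article.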

Using textual data, we can extract zero-shot labels to build a WordCloud of the most significant words for every cluster, which makes it much easier for us to identify the content of each cluster.
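The sketch below shows one simple way to find the most significant words per cluster for a word cloud. Note that it swaps in a different technique: instead of zero-shot labels, it scores each word by its frequency inside the cluster, weighted down when the word also appears in other clusters (a TF-IDF-style score computed across clusters). The stopword list, cluster names, and sample texts are all made up for the example.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list; a real one would be much longer.
STOPWORDS = {"the", "a", "an", "in", "to", "of", "and", "for", "is", "on", "at", "with"}

def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stopwords."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]

def top_words_per_cluster(cluster_texts, top_n=3):
    """Score words by in-cluster count times a smoothed inverse cluster frequency."""
    counts = {
        cluster_id: Counter(w for text in texts for w in tokenize(text))
        for cluster_id, texts in cluster_texts.items()
    }
    n_clusters = len(counts)
    # Number of clusters in which each word appears at least once.
    cluster_freq = Counter()
    for counter in counts.values():
        cluster_freq.update(counter.keys())

    result = {}
    for cluster_id, counter in counts.items():
        scores = {
            word: tf * (math.log((1 + n_clusters) / (1 + cluster_freq[word])) + 1)
            for word, tf in counter.items()
        }
        # Sort by score (highest first), breaking ties alphabetically.
        ranked = sorted(scores, key=lambda w: (-scores[w], w))
        result[cluster_id] = ranked[:top_n]
    return result

toy_clusters = {
    "women_in_tech": [
        "Proud to mentor women in tech",
        "Women in tech panel today",
        "Great panel on AI",
    ],
    "email": [
        "Too many email threads today",
        "Email reply about AI",
    ],
}
top_words = top_words_per_cluster(toy_clusters)
print(top_words)
```

The resulting word lists (or the underlying scores) could then be fed to any word cloud renderer as word weights.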


The data behind the Allie K. Miller Clustering Dashboard was scraped from Twitter. To clean the dataset, I preprocessed the data, extracted insights using additional machine learning techniques (such as zero-shot classification, though sentiment analysis could also be used), and then selected the tweets for encoding.
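The exact preprocessing steps are not listed here, so the following is only a sketch of what a typical tweet-cleaning pass might look like before encoding: strip URLs and @-mentions, collapse whitespace, and drop tweets that are too short to carry a subject. The specific rules and the `min_words` threshold are assumptions, not the steps actually used for this dataset.

```python
import re

URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
WHITESPACE_RE = re.compile(r"\s+")

def clean_tweet(text):
    """Remove URLs and @-mentions, then collapse runs of whitespace."""
    text = URL_RE.sub("", text)
    text = MENTION_RE.sub("", text)
    return WHITESPACE_RE.sub(" ", text).strip()

def select_for_encoding(raw_tweets, min_words=3):
    """Clean each tweet and keep only those with at least `min_words` words."""
    cleaned = (clean_tweet(t) for t in raw_tweets)
    return [t for t in cleaned if len(t.split()) >= min_words]

# Made-up examples for illustration.
raw = [
    "@alice thanks!",
    "Women in tech are shaping AI https://t.co/x",
    "@bob Check this out https://t.co/abc   great AI guide",
]
selected = select_for_encoding(raw)
print(selected)
```

Whether mentions should be removed is a judgment call: stripping them keeps the encoder focused on subject matter, but it also discards the signal that a tweet is a reply.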

Andres Slaughter
March 15, 2022