R Slash
R Slash is a Discord bot for retrieving posts from Reddit. It is written in Rust and deployed on a Kubernetes cluster running on my home server.
As of build time (11/6/2024), the bot is in 58211 Discord servers, and has 3803 Monthly Active Users.
TL;DR
What I did:
- Developed a customer-facing and revenue-producing product
- Evolved a complex system over multiple years as needs changed, learning a lot about tech debt and refactoring old code
- Learned a lot about fault tolerance and dealing with errors in a customer-facing product
- Deployed Kubernetes and many things on top of it
- Used industry tools such as Sentry and Posthog
Things I’m proud of:
- Making something people actually use and pay for
- Learning everything myself by building and doing
What I learned:
- Automated error capture and analytics are great, but making it easy for your customers to talk to you is the best
- Distributed systems, even on a small scale like this, have lots of hidden failure modes
- There’s a good middle ground between getting things built quickly and over-complicating a design so it’s 100% future-proof
Features
Get post
Get a post from the specified subreddit.
Repeat button
Button to repeat the same command again.
Search by title
Search for posts by their title.
Custom subreddits
Premium users can get posts from any subreddit.
Subscriptions
Subscribe to a subreddit to get posts sent as soon as they are posted.
Autoposts
Configure an autopost to get new posts at a specified interval.
Architecture
Shard
Each shard is a separate instance of the bot that connects to Discord and listens for events. It processes these events and sends the appropriate responses back to Discord. As such, most of the bot’s logic is contained in the shard.
Downloader
The downloader routinely scrapes the front page of the required subreddits and performs the following on each post:
- If the post is already in the database, update its score. If the post has since been deleted, remove it from the database.
- If the post is not in the database, more work ensues:
  - The media URL for the post is checked:
    - If it is a direct link to an embeddable file, we simply store that URL for later use.
    - If it is not a direct link, we use the hosting site’s API to get a direct link.
    - If the file is an MP4, we download it, because most sites prevent hot-linking of MP4s.
      - If the MP4 is below a certain size, we convert it to a GIF, because GIFs embed better in Discord.
      - The downloaded file is then served from my server via Cloudflare.
  - The post’s metadata and new media URL are stored in the database.
  - The Post Subscriber is notified that there is a new post.
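The media-handling branch of that list can be sketched as a small decision function. This is an illustration only, not the bot's actual code: the `MediaAction` type, the `classify` and `extension` helpers, the list of embeddable extensions, and the 8 MiB GIF cutoff are all assumptions made for the example.

```rust
/// What to do with a new post's media URL.
/// Hypothetical type for illustration; not taken from the bot's code.
#[derive(Debug, PartialEq)]
enum MediaAction {
    /// Direct link to an embeddable file: store the URL as-is.
    StoreUrl,
    /// Not a direct link: ask the hosting site's API for one.
    ResolveViaHostApi,
    /// Small MP4: download, convert to GIF, serve from our own server.
    DownloadAndConvertToGif,
    /// Larger MP4: download and serve from our own server unchanged.
    DownloadAndServe,
}

/// Assumed size cutoff; the real threshold is not stated in the text.
const GIF_MAX_BYTES: u64 = 8 * 1024 * 1024;

/// Assumed set of file extensions that embed directly.
const EMBEDDABLE: [&str; 4] = ["png", "jpg", "jpeg", "gif"];

/// Lowercased file extension of the last path segment, if any.
fn extension(url: &str) -> Option<String> {
    // Drop any query string or fragment before looking at the path.
    let path = url.split(|c: char| c == '?' || c == '#').next()?;
    let file = path.rsplit('/').next()?;
    let (_, ext) = file.rsplit_once('.')?;
    Some(ext.to_ascii_lowercase())
}

/// Picks an action from the URL and, for MP4s, the file size if known.
fn classify(url: &str, size_bytes: Option<u64>) -> MediaAction {
    match extension(url).as_deref() {
        Some("mp4") => match size_bytes {
            Some(n) if n < GIF_MAX_BYTES => MediaAction::DownloadAndConvertToGif,
            // Unknown or large size: keep the file as an MP4.
            _ => MediaAction::DownloadAndServe,
        },
        Some(ext) if EMBEDDABLE.iter().any(|e| *e == ext) => MediaAction::StoreUrl,
        _ => MediaAction::ResolveViaHostApi,
    }
}
```

Keeping the decision separate from the network and conversion work makes the branching easy to test without touching Reddit or any media host.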
There’s a lot of complicated logic to do with rate limiting, parallelism, and retries that I won’t go into here.
Post Subscriber
The Post Subscriber is a separate service that listens for new posts and sends them to the users who have subscribed to the subreddit. Every time the Downloader encounters a post it hasn’t seen before, it sends a message to the Post Subscriber with the post’s ID. The Post Subscriber checks its database to see if any users have subscribed to the subreddit the post is from (this list is populated by RPC calls from the Shards). If there are any subscribers, it sends the post to them.
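A minimal sketch of the lookup side of this, assuming an in-memory map instead of a real database. The `Subscriptions` type, its method names, and the use of plain `u64` subscriber IDs are hypothetical; the RPC layer and the actual sending are left out.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical subscription index: subreddit name -> subscriber IDs.
#[derive(Default)]
struct Subscriptions {
    by_subreddit: HashMap<String, HashSet<u64>>,
}

impl Subscriptions {
    /// Would be called when a shard reports a new subscription.
    fn subscribe(&mut self, subreddit: &str, subscriber_id: u64) {
        self.by_subreddit
            .entry(subreddit.to_ascii_lowercase())
            .or_default()
            .insert(subscriber_id);
    }

    /// Would be called when a shard reports a removed subscription.
    fn unsubscribe(&mut self, subreddit: &str, subscriber_id: u64) {
        let key = subreddit.to_ascii_lowercase();
        let now_empty = match self.by_subreddit.get_mut(&key) {
            Some(ids) => {
                ids.remove(&subscriber_id);
                ids.is_empty()
            }
            None => false,
        };
        // Drop empty entries so the map does not grow without bound.
        if now_empty {
            self.by_subreddit.remove(&key);
        }
    }

    /// Subscribers to notify about a new post, sorted for stable output.
    fn recipients(&self, subreddit: &str) -> Vec<u64> {
        let mut out = self
            .by_subreddit
            .get(&subreddit.to_ascii_lowercase())
            .map(|ids| ids.iter().copied().collect::<Vec<u64>>())
            .unwrap_or_default();
        out.sort_unstable();
        out
    }
}
```

Subreddit names are lowercased on the way in and out so that "Aww" and "aww" hit the same entry; that normalization is a choice made for this example.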
Auto-poster
Users can set up “Auto-posts” by providing a subreddit, a search, a time interval, and a limit. The bot then sends a post from that subreddit matching the search once per interval until the limit is reached. This is done by a separate service that receives requests to manage auto-posts and adds them to an internal priority queue, which sends the posts at the correct time.
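The priority queue idea can be sketched with the standard library's `BinaryHeap`, wrapped in `Reverse` so the earliest due time comes out first. Everything here is an assumption for illustration: the `Autopost` and `Scheduler` types, the field names, and the use of plain seconds for time. The subreddit and search fields are omitted to keep the example short.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Hypothetical autopost definition; field names are illustrative.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
struct Autopost {
    id: u32,
    interval_secs: u64,
    /// Posts left to send before the autopost ends (the "limit").
    remaining: u32,
}

/// Min-heap of (next due time in seconds, autopost).
#[derive(Default)]
struct Scheduler {
    queue: BinaryHeap<Reverse<(u64, Autopost)>>,
}

impl Scheduler {
    /// Schedules the first send one interval after `now`.
    fn add(&mut self, now: u64, post: Autopost) {
        if post.remaining > 0 {
            self.queue.push(Reverse((now + post.interval_secs, post)));
        }
    }

    /// Pops every autopost due at or before `now`, returns their IDs in
    /// due order, and re-queues those that still have posts remaining.
    fn run_due(&mut self, now: u64) -> Vec<u32> {
        let mut fired = Vec::new();
        while self
            .queue
            .peek()
            .map_or(false, |Reverse((due, _))| *due <= now)
        {
            if let Some(Reverse((due, mut post))) = self.queue.pop() {
                fired.push(post.id);
                post.remaining -= 1;
                if post.remaining > 0 {
                    // Next send is one interval after the previous due time.
                    self.queue.push(Reverse((due + post.interval_secs, post)));
                }
            }
        }
        fired
    }
}
```

A real service would sleep until the head of the queue is due rather than poll, and would need to persist the queue across restarts; neither is shown here.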
Membership Updater
I currently process Premium subscriptions through Ko-Fi to avoid the complexity of handling payments myself. Unfortunately, Ko-Fi doesn’t have an API. Fortunately, however, it has a Discord integration that gives Premium members a special role in the bot’s Discord server. The Membership Updater is a service that listens for these role changes and updates the database accordingly.
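The core of that check is a comparison of a member's roles before and after an update. A minimal sketch, assuming role IDs are plain `u64` values and that a single role marks Premium; the `PremiumChange` type and `premium_change` function are hypothetical names, and the Discord event handling and database write are not shown.

```rust
/// Hypothetical result of comparing a member's roles across an update.
#[derive(Debug, PartialEq)]
enum PremiumChange {
    Granted,
    Revoked,
    Unchanged,
}

/// Decides whether a role update changed the member's Premium status.
fn premium_change(old_roles: &[u64], new_roles: &[u64], premium_role: u64) -> PremiumChange {
    let had = old_roles.contains(&premium_role);
    let has = new_roles.contains(&premium_role);
    match (had, has) {
        (false, true) => PremiumChange::Granted,
        (true, false) => PremiumChange::Revoked,
        // Some other role changed, or nothing changed at all.
        _ => PremiumChange::Unchanged,
    }
}
```

Only `Granted` and `Revoked` would need a database write, so unrelated role edits cost nothing.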
Discord Interface
Discord requires that each bot instance (shard) handles a maximum of 2500 servers, and recommends only about 1000. This means the bot must be able to automatically spin up more shards as the number of servers it is in increases. This is done by a small program that periodically checks the number of servers the bot is in and spins up more shards as needed. It does this by making a request to the Kubernetes API to scale the StatefulSet that runs the shards.
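The sizing decision itself is a ceiling division against the recommended servers-per-shard figure. A minimal sketch, with the assumptions that the target is exactly 1000 and that the scaler never scales down on its own; `desired_shards` is a hypothetical name, and the Kubernetes API request is not shown.

```rust
/// Target servers per shard, taken from the recommended figure of 1000.
const TARGET_GUILDS_PER_SHARD: u64 = 1000;

/// Number of shard replicas to request for the StatefulSet.
/// Assumption: shards are only ever added, never removed automatically.
fn desired_shards(guild_count: u64, current_shards: u32) -> u32 {
    // Ceiling division, with at least one shard even for zero servers.
    let needed = ((guild_count + TARGET_GUILDS_PER_SHARD - 1) / TARGET_GUILDS_PER_SHARD).max(1);
    (needed as u32).max(current_shards)
}
```

With the 58211 servers mentioned at the top of this page, this gives 59 shards.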