If you have 1 server doing everything, there’s really no reason to use Kafka at all.
If you distribute, which you probably have to if you want to scale to millions of users, other aspects come in. The sender may not be connected to the node the recipient is connected to. In this case, you’ll have to route messages between nodes and/or have each node poll a database to look for new messages.
So in this case, Kafka solves 3 problems for you:
- It can route messages well and easily (if you know what you’re doing)
- It will handle nodes crashing / coming online well
- You can poll Kafka directly, which is what it is designed for, instead of polling a database, which may or may not work well.
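To make the routing and crash-handling points a bit more concrete, here is a minimal sketch in plain Python (no Kafka client involved). It assumes messages are keyed by recipient, so a given recipient always lands in the same partition, and that partitions are spread over whichever nodes are currently alive. The names `partition_for`, `assign` and `node_for`, the node names and the partition count are all made up for illustration, and `zlib.crc32` is only a stand-in for whatever hash a real Kafka producer uses.

```python
import zlib
from typing import Dict, List


def partition_for(key: str, num_partitions: int) -> int:
    """Map a message key (here: the recipient id) to a partition.

    Uses a stable hash so the same key always gets the same partition.
    crc32 is a stand-in, not the hash a real Kafka client uses.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def assign(num_partitions: int, live_nodes: List[str]) -> Dict[int, str]:
    """Spread partitions round-robin over the nodes that are currently alive.

    A much simplified picture of what a consumer-group rebalance does.
    """
    nodes = sorted(live_nodes)
    return {p: nodes[p % len(nodes)] for p in range(num_partitions)}


def node_for(recipient_id: str, num_partitions: int, live_nodes: List[str]) -> str:
    """Return the node that would currently receive messages for this recipient."""
    partition = partition_for(recipient_id, num_partitions)
    return assign(num_partitions, live_nodes)[partition]


if __name__ == "__main__":
    nodes = ["node-a", "node-b", "node-c"]  # hypothetical chat nodes
    print("alice is served by", node_for("alice", 6, nodes))

    # node-b crashes: partitions are reassigned among the survivors,
    # while alice's partition number itself does not change.
    print("after crash:", node_for("alice", 6, ["node-a", "node-c"]))
```

With a real cluster you don’t write the assignment part yourself; the point of the sketch is only that the key decides the partition and the set of live nodes decides who reads it.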
Additionally, if you’re really scaling:
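- Kafka doesn’t wait for disk I/O on every message; it appends to a log sequentially and in batches. Conventional databases are usually limited by IOPS (I/O operations per second), because each committed transaction has to reach the disk before it is acknowledged. This can be really slow: roughly hundreds of messages per second (per disk) vs. millions in Kafka.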
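A back-of-the-envelope way to see where a gap between hundreds and millions can come from: if the disk only manages a fixed number of syncs per second, throughput is that number times how many messages ride along with each sync. The figures below (200 syncs per second, 5000 messages per batch) are illustrative assumptions, not measurements of any particular disk, database or Kafka setup.

```python
def messages_per_second(syncs_per_second: int, messages_per_sync: int) -> int:
    """Throughput when every disk sync carries a fixed number of messages."""
    return syncs_per_second * messages_per_sync


# One transaction (one message) per sync, as with a commit per message.
per_message_commit = messages_per_second(200, 1)   # 200

# Many messages appended and flushed together, as with a batched log.
batched_append = messages_per_second(200, 5000)    # 1000000

if __name__ == "__main__":
    print(per_message_commit, batched_append)
```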
That’s just a couple of things; there could be more…