I’ve been studying some systems design materials in preparation for an upcoming interview. My current job doesn’t involve much back-of-the-envelope estimation. I’ve found an example design problem, specifically how to design Instagram. For the most part it makes sense, but I’m stuck on the size estimate for the table that maps followers to the people they follow.
The reference material states:
UserFollow: Each row in the UserFollow table will consist of 8 bytes.
If we have 500 million users and on average each user follows 500
users. We would need 1.82TB of storage for the UserFollow table:
500 million users * 500 followers * 8 bytes ~= 1.82TB
For the life of me, I keep calculating this out to be 20GB.
First off, number of rows of data: 5,000,000 x 500 = 2,500,000,000 rows
Next, each row is 8 bytes, so 2,500,000,000 rows x 8 bytes = 20,000,000,000 bytes => 20GB
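To make my steps easy to check, here is a minimal sketch of the same arithmetic as a script. The inputs are copied exactly as I wrote them in the two steps above. The variable names are just my own shorthand, and I'm assuming decimal gigabytes (1 GB = 10^9 bytes).

```python
# Inputs copied exactly as written in my two steps above.
users = 5_000_000         # user count as I typed it in the first step
follows_per_user = 500    # average number of accounts each user follows
bytes_per_row = 8         # row size given by the reference material

# Step 1: number of rows in the UserFollow table.
rows = users * follows_per_user

# Step 2: total size in bytes, then in decimal gigabytes (assumption: 1 GB = 10**9 bytes).
total_bytes = rows * bytes_per_row
total_gb = total_bytes / 10**9

print(f"rows:  {rows:,}")
print(f"bytes: {total_bytes:,}")
print(f"GB:    {total_gb:g}")
```

Running this gives the same 2,500,000,000 rows and 20 GB that I got by hand.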
I feel kind of like an idiot because this seems to be a simple math problem, but my number is way off. What could I be missing here?