microservices – Where to place an in-memory cache to handle repetitive bursts of database queries from several downstream sources, all within a span of a few milliseconds


I’m working on a Java service that runs on Google Cloud Platform and uses a MySQL database via Cloud SQL. The database stores simple relationships between users, the accounts they belong to, and groupings of accounts. Being an “accounts” service, it naturally has many downstreams. Downstream service A may, for example, hit several other upstream services B, C, D, which in turn might call other services E and F, but because so much is tied to accounts (checking permissions, getting user preferences, sending emails), every service from A to F ends up hitting my service with identical, repetitive calls. In other words, a single call to some endpoint might result in 10 queries to get a user’s accounts, even though that information obviously doesn’t change over a few milliseconds.

So where is it appropriate to place a cache?

  1. Should downstream service owners be responsible for implementing a cache? I don’t think so, because why should they have to know about my service’s data, like what can be cached and for how long?

  2. Should I put an in-memory cache in my service, like Google Guava’s CacheLoader, in front of my DAO? But does this really provide anything over MySQL’s caching? (Admittedly I don’t know anything about how databases cache, but I’m sure that they do.)

  3. Should I put an in-memory cache in the Java client? We use gRPC, so we have generated clients that all those services A, B, C, D, E, F already use. Putting a cache in the client means they can skip making outgoing calls, but only if that service has made the same call before and the data can have a long-enough TTL to be useful, e.g. an account’s group is permanent. So, yeah, that’s not helping at all with the “bursts,” not to mention the caches living in different zone instances. (I haven’t customized a generated gRPC client yet, but I assume there’s a way.)
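To make option 2 concrete, this is roughly the kind of thing I have in mind. It is only a sketch using plain JDK classes so that it stands alone; Guava’s LoadingCache built with expireAfterWrite would fill the same role. The class name ShortTtlCache, the loader function standing in for my DAO call, and the injected clock are all made up for illustration.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

/**
 * Hypothetical short-TTL, read-through cache sitting in front of a DAO lookup.
 * Identical requests arriving within the TTL share a single load.
 */
final class ShortTtlCache<K, V> {

    /** A cached value plus the time it was loaded. */
    private static final class Entry<V> {
        final V value;
        final long loadedAtNanos;

        Entry(V value, long loadedAtNanos) {
            this.value = value;
            this.loadedAtNanos = loadedAtNanos;
        }
    }

    private final ConcurrentHashMap<K, Entry<V>> entries = new ConcurrentHashMap<>();
    private final Function<K, V> loader;   // assumption: wraps the real DAO query
    private final long ttlNanos;
    private final LongSupplier clock;      // e.g. System::nanoTime; injectable for tests

    ShortTtlCache(Function<K, V> loader, long ttlNanos, LongSupplier clock) {
        this.loader = loader;
        this.ttlNanos = ttlNanos;
        this.clock = clock;
    }

    V get(K key) {
        long now = clock.getAsLong();
        // compute() runs atomically per key, so concurrent callers asking for the
        // same key wait for one load instead of each issuing their own query.
        // Caveat: the loader runs while the map holds a lock for that key's bin,
        // so a slow query can briefly delay other keys that hash to the same bin.
        Entry<V> entry = entries.compute(key, (k, old) ->
                (old != null && now - old.loadedAtNanos < ttlNanos)
                        ? old
                        : new Entry<>(loader.apply(k), now));
        return entry.value;
    }
}
```

With a TTL of, say, 50 ms, ten back-to-back calls to get("user-1") would reach the loader once. Expired entries are only replaced on the next read of the same key, never evicted, so a real version would also need a size bound.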

I’m leaning toward #2 but my understanding of databases is weak, and I don’t know how to collect the data I need to justify the effort. I feel like what I need to know is: How often do “bursts” of identical queries occur, how are these bursts processed by MySQL (esp. given caching), and what’s the bottom-line effect on downstream performance as a result, if any at all?
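For the first of those questions, one cheap thing I could try is counting repeats inside the service itself. A rough sketch, assuming I can derive a string key per query (for example, method name plus user ID); BurstCounter and the window length are hypothetical, not something we have today.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

/**
 * Hypothetical counter that flags a query as a "duplicate" when the same key
 * was already seen within the given window.
 */
final class BurstCounter {
    private final ConcurrentHashMap<String, Long> lastSeenNanos = new ConcurrentHashMap<>();
    private final LongAdder total = new LongAdder();
    private final LongAdder duplicates = new LongAdder();
    private final long windowNanos;

    BurstCounter(long windowNanos) {
        this.windowNanos = windowNanos;
    }

    /** Records one query; returns true if the same key was seen within the window. */
    boolean record(String queryKey, long nowNanos) {
        total.increment();
        // put() returns the previous timestamp for this key, or null if unseen.
        Long previous = lastSeenNanos.put(queryKey, nowNanos);
        boolean duplicate = previous != null && nowNanos - previous <= windowNanos;
        if (duplicate) {
            duplicates.increment();
        }
        return duplicate;
    }

    long total() {
        return total.sum();
    }

    long duplicates() {
        return duplicates.sum();
    }

    /** Fraction of recorded queries that repeated a recent one (0.0 if none recorded). */
    double duplicateRatio() {
        long t = total.sum();
        return t == 0 ? 0.0 : (double) duplicates.sum() / t;
    }
}
```

Calling record(key, System.nanoTime()) from the DAO and logging duplicateRatio() periodically would give a first estimate of how much a short-TTL cache could save. The map of keys grows without bound here, so this is for a short experiment, not for leaving on permanently.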

I feel experience may answer this question better than finding those metrics myself.

Asking myself, “Why do I want to do this, given no evidence of any bottleneck?” Well, (1) it just seems wrong that there are so many duplicate queries, (2) it adds a lot of noise to our logs, and (3) I don’t want to wait until we scale to find out that it’s a deep issue.