Understand caching
Understanding caching in layman's terms.
This is going to be an interesting read and, I hope, a very easy one to understand. There will be pictures and examples to help you understand an important part of modern software, but if you're really not someone who can read long pieces of text, then maybe this is not for you.
Caching is a fundamental concept in computer science and plays a critical role in the efficiency of large-scale distributed systems. Virtually every sophisticated system incorporates caching in various forms, often in critical sections, to optimize performance.
"Caching is an advanced memory management technique utilized to expedite data retrieval processes by temporarily storing frequently accessed or computationally intensive data in a specialized high-speed memory layer, known as cache memory. By strategically placing this data closer to the processing units, caching significantly reduces latency and minimizes the computational overhead associated with repeated access to the primary data storage. This method optimizes overall system performance and efficiency by leveraging temporal and spatial locality principles, ensuring that subsequent requests for the cached data can be fulfilled with minimal delay."
Okay, what the hell is this? Let me make it easy for you: Imagine you're running a supermarket, and the most popular product you have is mobile phones. However, you store all your mobile phones in a warehouse that's 800 meters away from your store. When customers come in and want to buy a phone, they first have to wait while someone fetches the phone from the warehouse. If many customers arrive at once, this waiting time becomes even longer, leading to frustration and a poor shopping experience. Essentially, you're wasting your customers' precious time and providing subpar service by making them wait so long.
Now, imagine you decide to keep a few of these mobile phones directly in your store. The next time a customer wants to buy a phone, they get it immediately from the shelf without waiting for someone to run to the warehouse. This speeds up the process, keeps your customers happy, and improves your store's efficiency.
I hope you're getting the idea, right? In a way, we can say that this is the core idea behind caching.
Let's understand the same thing with computers. Suppose you want to watch a very trending video on your device.
The normal procedure would be something like this: you request the video; the request goes to the server and gets processed; the server then makes another request to the database to fetch the video; the video is returned to the server, which finally serves it to you. This round trip takes almost 240 milliseconds.
Notice that I said the video you want to watch is trending, which means a lot of users are going to watch it. Imagine the number of read requests the database and the server have to handle. This puts a lot of load on both, making them slower at serving user requests.
In scenarios like this, caching plays a crucial role. Just as in the supermarket we kept the most frequently bought products closer to us, we will do the same here. Since we know the video is trending and a lot of users are asking for it, we will keep it closer to the user; in this case, in the in-memory cache of the server. Okay, Asmit, but what is an in-memory cache? We will get to that; for now, just think of the video as being kept on the server itself.
Now, when any user requests the video, the server doesn't have to fetch it from the database; instead, it serves the video directly from its in-memory cache. This not only reduces the waiting time for the user (the latency) but also frees the database from handling too many requests.
This is the basic mechanism of caching: we store frequently accessed data closer to the user to serve it faster and take unnecessary load off the database.
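To make this concrete, here's a minimal sketch of that read path in Python. Everything here (the `VIDEO_CACHE` dict, the `fetch_video_from_db` function, the 240 ms sleep) is made up for illustration; a real server would use something like Redis or Memcached rather than a plain dictionary.

```python
import time

# A plain dict stands in for the server's in-memory cache.
# In a real system this would be something like Redis or Memcached.
VIDEO_CACHE = {}

def fetch_video_from_db(video_id):
    """Pretend database lookup: slow, because it involves a network hop."""
    time.sleep(0.24)  # simulate the ~240 ms round trip
    return f"<bytes of video {video_id}>"

def get_video(video_id):
    # 1. Check the cache first (a "cache hit" is served immediately).
    if video_id in VIDEO_CACHE:
        return VIDEO_CACHE[video_id]
    # 2. On a miss, go to the database, then remember the result.
    video = fetch_video_from_db(video_id)
    VIDEO_CACHE[video_id] = video
    return video

start = time.perf_counter()
get_video("trending-123")   # miss: ~240 ms
print(f"first request: {time.perf_counter() - start:.3f}s")

start = time.perf_counter()
get_video("trending-123")   # hit: practically instant
print(f"second request: {time.perf_counter() - start:.6f}s")
```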
This has a lot of advantages, like:
- Reduced latency
- Reduced network traffic
- Enhanced user experience, etc.
Now, you have understood the basic concept of caching.
Let's get back to the store you own. You have successfully reduced your customers' waiting time, but as time passed, a new problem arose: you have no more space in your store to keep the new phones! You know it's time to remove some phones from the store and bring in new ones, but which should you remove? The ones that are older than a month, the ones that are less frequently bought, or do you try to strike a balance between both? 🤔
There are many ways to decide which phones to remove, and each option has its own advantages and disadvantages. Every business has different requirements and, hence, chooses a different option.
The same applies to caching: we have limited storage for our cache because maintaining it is expensive. So we follow a cache eviction policy to remove data that is less valuable and make space for new incoming data.
A few widely used policies include:
Random Replacement: In this method, random items are evicted from the cache. This is simple to implement but offers no guarantees: items that should have stayed in the cache longer might get evicted, while less important items may still remain.
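Here's a minimal sketch of random replacement (the class and names are mine, just for illustration):

```python
import random

class RandomCache:
    """Evicts a random entry when the cache is full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.store = {}

    def get(self, key):
        return self.store.get(key)

    def put(self, key, value):
        if key not in self.store and len(self.store) >= self.capacity:
            victim = random.choice(list(self.store))  # any key, no priority at all
            del self.store[victim]
        self.store[key] = value

cache = RandomCache(capacity=3)
for phone in ["A", "B", "C", "D"]:
    cache.put(phone, f"phone {phone}")
print(sorted(cache.store))  # D is in; one of A/B/C was evicted at random
```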
Least Recently Used (LRU): In this method, the items that were used least recently are evicted first. It's like removing the phones that haven't been bought for the longest time and keeping the ones that were bought recently.
Ugh, but Asmit, there is one more problem with this. Let's say I keep only 3 phones in my store: phone A, phone B, and phone C. On one particular day, phone C was sold 100 times while phone A and phone B weren't sold at all; but just before we invalidate the cache and get new phones, first phone A gets sold and then phone B.

Now, when we invalidate the cache, LRU will remove phone C, even though it is the most-sold product and should be kept. You can watch this play out in the sketch below. So that's not the right solution either.
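Here's a minimal LRU sketch built on Python's `OrderedDict`, followed by a run of the exact phone A/B/C story (the names are illustrative, not a production implementation):

```python
from collections import OrderedDict

class LRUCache:
    """Evicts the least recently used entry when full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.store = OrderedDict()  # order of keys == recency order

    def get(self, key):
        if key not in self.store:
            return None
        self.store.move_to_end(key)  # mark as most recently used
        return self.store[key]

    def put(self, key, value):
        if key in self.store:
            self.store.move_to_end(key)
        elif len(self.store) >= self.capacity:
            self.store.popitem(last=False)  # evict the least recently used
        self.store[key] = value

store = LRUCache(capacity=3)
for phone in ["A", "B", "C"]:
    store.put(phone, f"phone {phone}")

for _ in range(100):
    store.get("C")       # phone C sells 100 times
store.get("A")           # then phone A sells once...
store.get("B")           # ...and phone B sells once

store.put("D", "phone D")   # new phone arrives, someone must go
print(list(store.store))    # ['A', 'B', 'D'] -- poor phone C is gone
```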
But we have more options:
Least Frequently Used (LFU): In this method, we track the number of times an item is accessed, or in our case, the number of times a phone was sold. When invalidating the cache, we keep items with a higher access frequency and remove items with a lower one. This eviction policy solves the problem we just saw: even if phone C has been the least recently used, it won't be removed, because it has the highest frequency.
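Here's a minimal LFU sketch (illustrative names again; real LFU implementations are smarter about ties and eviction cost). The same phone A/B/C story now ends with phone C surviving:

```python
from collections import Counter

class LFUCache:
    """Evicts the least frequently used entry when full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.store = {}
        self.hits = Counter()  # access count per key

    def get(self, key):
        if key not in self.store:
            return None
        self.hits[key] += 1
        return self.store[key]

    def put(self, key, value):
        if key not in self.store and len(self.store) >= self.capacity:
            victim = min(self.store, key=lambda k: self.hits[k])
            del self.store[victim]
            del self.hits[victim]
        self.store[key] = value
        self.hits[key] += 1

store = LFUCache(capacity=3)
for phone in ["A", "B", "C"]:
    store.put(phone, f"phone {phone}")
for _ in range(100):
    store.get("C")   # phone C sells 100 times
store.get("A")       # phone A sells once
store.get("B")       # phone B sells once

store.put("D", "phone D")
print(sorted(store.store))  # ['B', 'C', 'D'] -- phone C survives its low recency
```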
But Asmit, what if a scenario like this arises:
Let's say your store invalidates phones every three days. On day 1, phone X is sold 200 times. On day 2, a new phone launches, phone Y, which becomes a trend and is sold 90 times on day 2 and 100 times on day 3. Now, at the end of day 3, when we invalidate the cache (remove phones), we will remove phone Y, even though it is more trending than phone X; since phone X has the higher frequency (200 vs. 190), it will be kept in the cache and phone Y will be removed. So unfair!
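We can check that arithmetic with a tiny standalone snippet (the names are made up to mirror the story):

```python
from collections import Counter

sales = Counter()
sales["X"] += 200   # day 1: phone X sells 200 times
sales["Y"] += 90    # day 2: phone Y launches and sells 90 times
sales["Y"] += 100   # day 3: phone Y sells another 100 times

# LFU evicts the key with the lowest count: 190 < 200, so Y goes.
victim = min(sales, key=sales.get)
print(victim)       # 'Y' -- the trending phone gets evicted
```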
So frustrating; even this is not the right solution. See, I told you there is no right (or perfect) solution. Every business is different and requires a different set of policies.
There are various advanced policies used in the industry, like ARC (Adaptive Replacement Cache), B-Cache, the Clock algorithm, etc.
Also, we still have many unanswered questions around caching. For example, we read about cache eviction, but what is the right time to invalidate a cache? What are the different types of caches? And so on.
I'll try to write more about this topic and break it into multiple blogs.
Thank you for reading. Have a great day!