The Complete Guide to HTTP Caching: Cutting Network Costs with Cache-Control and ETag

Chris's web service hit the jackpot. But as traffic exploded, the server cost bill exploded right along with it.

Upon analysis, he found that every time a user refreshed the page, they were re-downloading a 5MB background image and heavy JavaScript files.

"It's the exact same file anyway; can't the browser just save it and reuse it?"

The technology that prevents this inefficiency is HTTP caching. By setting the right HTTP headers, which tell the browser, "Save this file on your computer and reuse it," you can dramatically reduce server costs and speed up page loads.

1. Cache Expiration: Cache-Control

The most fundamental header is Cache-Control. It tells the browser whether it may store a resource and for how long it can reuse the stored copy without asking the server again.

// server-response.ts
// `res` is assumed to be the response object of a Node.js HTTP handler.

// 1. "Don't ask me again for 1 year. Use your saved copy." (Immutable static files)
res.setHeader('Cache-Control', 'max-age=31536000');

// 2. "Save it, but check with me before every use." (Files that change often, like HTML)
res.setHeader('Cache-Control', 'no-cache');

// 3. "Never save this. It's sensitive." (Personal info, etc.)
res.setHeader('Cache-Control', 'no-store');

The most common misunderstanding here is no-cache.

Just looking at the name, it seems to imply "Do not cache." However, the actual meaning is "You may cache this, but you must validate with the server before every use to ensure it's still valid." If you strictly need to prevent saving anything, you must use no-store.
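
To make the difference between the three directives concrete, here is a minimal sketch of how a cache might decide what to do with a stored response. The names `parseCacheControl` and `decideReuse` and the returned labels are hypothetical, and a real browser cache handles many more directives (plus heuristic freshness when max-age is missing) than this simplified version.

```typescript
// Hypothetical helper: split a Cache-Control value into directive/value pairs.
function parseCacheControl(header: string): { [name: string]: string } {
  const directives: { [name: string]: string } = {};
  const parts = header.split(",");
  for (let i = 0; i < parts.length; i++) {
    const part = parts[i].trim();
    if (part === "") continue;
    const eq = part.indexOf("=");
    // Directive names are case-insensitive, so normalize to lowercase.
    const name = (eq === -1 ? part : part.slice(0, eq)).trim().toLowerCase();
    const value = eq === -1 ? "" : part.slice(eq + 1).trim();
    directives[name] = value;
  }
  return directives;
}

type CacheDecision = "do-not-store" | "revalidate" | "reuse";

// Decide what to do with a response that has been stored for `ageSeconds`.
function decideReuse(header: string, ageSeconds: number): CacheDecision {
  const d = parseCacheControl(header);
  if ("no-store" in d) return "do-not-store"; // never written to the cache
  if ("no-cache" in d) return "revalidate";   // stored, but checked before every use
  const maxAge = "max-age" in d ? parseInt(d["max-age"], 10) : NaN;
  if (!isNaN(maxAge) && ageSeconds < maxAge) return "reuse"; // still fresh
  // Simplification: anything stale or without max-age is revalidated.
  return "revalidate";
}
```

With this sketch, `decideReuse('no-cache', 0)` yields "revalidate" rather than "do-not-store", which is exactly the distinction that trips people up.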

2. Validation: ETag and 304 Not Modified

Just because max-age has expired doesn't necessarily mean you should re-download the file unconditionally. The file content might still be exactly the same.

This is where ETag comes in.

ETag: The File's Fingerprint

When the server sends a file, it attaches an identifier for that exact version of the content, commonly a hash of the file, in the ETag header.

  • Browser: "This file's max-age is up. But hey Server, I have a file with the fingerprint 'abc-123'. Is this still valid?" (If-None-Match: "abc-123")
  • Server: (Checks the file and sees the fingerprint is still 'abc-123')
  • Server: "Yeah, it hasn't changed. Just use that one." (304 Not Modified)

At this moment, the server sends only the status line and headers, with no response body. For a large file, the transfer shrinks from megabytes to a handful of header lines.
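
The exchange above can be sketched as a pure function. This is an illustration under assumptions, not a production handler: `toyHash`, `makeEtag`, `etagMatches`, and `respond` are hypothetical names, the hash is a simple stand-in for the stronger content hash a real server would use, and splitting If-None-Match on commas is a simplification.

```typescript
// Toy content hash (djb2 variant). A stand-in for a real content hash.
function toyHash(content: string): string {
  let hash = 5381;
  for (let i = 0; i < content.length; i++) {
    hash = ((hash << 5) + hash + content.charCodeAt(i)) | 0;
  }
  return (hash >>> 0).toString(16);
}

// ETag values are quoted strings, e.g. "abc-123".
function makeEtag(content: string): string {
  return '"' + toyHash(content) + '"';
}

// If-None-Match may list several tags, or "*"; weak tags carry a W/ prefix.
function etagMatches(ifNoneMatch: string, current: string): boolean {
  if (ifNoneMatch.trim() === "*") return true;
  const candidates = ifNoneMatch.split(",");
  for (let i = 0; i < candidates.length; i++) {
    const tag = candidates[i].trim().replace(/^W\//, "");
    if (tag === current) return true;
  }
  return false;
}

interface SketchResponse {
  status: 200 | 304;
  headers: { [name: string]: string };
  body: string;
}

// Return 304 with an empty body when the client's fingerprint still matches.
function respond(content: string, ifNoneMatch?: string): SketchResponse {
  const etag = makeEtag(content);
  const headers: { [name: string]: string } = {
    ETag: etag,
    "Cache-Control": "no-cache",
  };
  if (ifNoneMatch !== undefined && etagMatches(ifNoneMatch, etag)) {
    return { status: 304, headers, body: "" };
  }
  return { status: 200, headers, body: content };
}
```

A first call such as `respond(fileContent)` returns 200 with the body and an ETag. Passing that ETag back as the second argument returns 304 with an empty body as long as the content is unchanged.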

3. Practical Caching Strategies (Best Practice)

So, what strategy should be used in a React project? Here is the strategy Chris's team adopted.

Strategy 1: no-cache for HTML

The HTML file is the entry point: it references the JS and CSS files by path. If the browser keeps serving a stale cached copy of the HTML, users keep loading the old bundles and won't see the latest deployment until that copy expires.

// server.ts (the same header can also be set in nginx.conf)
// On index.html request
res.setHeader('Cache-Control', 'no-cache');
// "Always ask the server before using it!"

Strategy 2: Long max-age + File Hashing for JS/CSS/Images

Built JS files (e.g., main.a1b2c.js) carry a content hash in their filenames, so the filename changes whenever the content changes. In other words, if the filename is the same, the content is the same.

Therefore, it is safe to tell the browser, "This will never change, so save it for 1 year."

// static-files.ts
// On requests for hashed .js, .css, .png files
res.setHeader('Cache-Control', 'max-age=31536000, immutable');

What happens when you deploy a new version?

The HTML file (no-cache) is revalidated on the next visit and now points to a new filename (main.d4e5f.js). The browser has never cached that URL, so it downloads the new file. Cache invalidation falls out of the build process, with no manual purging required.
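
Both strategies can be combined into one small policy function. The sketch below is an assumption-laden illustration: `cacheControlFor` and the `HASHED_ASSET` pattern are hypothetical, the pattern assumes hashes of five or more hex characters placed just before the extension (as in main.a1b2c.js), and falling back to no-cache for unhashed files is a conservative choice made here, not something the build tool dictates.

```typescript
const ONE_YEAR_SECONDS = 31536000;

// Assumed naming scheme: "<name>.<hash>.<ext>" or "<name>-<hash>.<ext>",
// where <hash> is at least 5 hex characters. Adjust to your bundler's output.
const HASHED_ASSET = /[.-][0-9a-f]{5,}\.(js|css|png|jpg|jpeg|gif|svg|woff2?)$/i;

// Pick a Cache-Control value from the request path.
function cacheControlFor(path: string): string {
  if (HASHED_ASSET.test(path)) {
    // Content-addressed file: the URL changes whenever the content does.
    return "max-age=" + ONE_YEAR_SECONDS + ", immutable";
  }
  // HTML and anything without a hash: store it, but revalidate before use.
  return "no-cache";
}
```

In a handler this could be wired up as `res.setHeader('Cache-Control', cacheControlFor(req.url))`. Note that a filename pattern is only a heuristic: an unhashed file whose name happens to end in a hex-looking segment (say, app.facade.js) would be misclassified, so a stricter approach is to serve hashed assets from a dedicated directory.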

4. Verifying Cache Behavior

You can check whether caching is working correctly in the Network tab of the browser's Developer Tools.

  • 200 OK (from memory cache): The browser didn't even ask the server; it retrieved it from RAM. (Fastest)
  • 200 OK (from disk cache): Even after restarting the browser, it retrieves the file saved on the disk.
  • 304 Not Modified: Asked the server (no-cache), and since it didn't change, the existing one was used. (Only incurs header communication cost)

Key Takeaways

  • Cache-Control: no-store: Absolute ban on saving.
  • Cache-Control: no-cache: Save, but verify with the server every time (304).
  • Cache-Control: max-age=...: Don't ask the server for this duration; just use it.
  • Strategy: Use no-cache for HTML, and set a long max-age for hashed static files (JS, CSS).

This concludes Part 1: The Essence of Language and the Web.

Now that we have built up our foundational strength, it is time to open up the heart of React.

Next Post Teaser:

How are React components born, how do they change, and how do they disappear?

Part 2 begins with "The 2 Stages of React Rendering: Render Phase vs. Commit Phase."
