Oliver's Blog
  • 🤩Welcome!
  • Projects
    • RISC Game
    • Mini Amazon
    • HTTP Caching Proxy
    • Course Enrollment App
    • Fitness Tracker App
    • Voice Shopping Assistant
    • Graphics Town
  • Algo
    • Binary Search
      • Classical Binary Search
      • First Position of Target
      • Last Position of Target
      • Guess Number Higher or Lower
      • Search in a Big Sorted Array
      • Total Occurrence of Target
      • First Bad Version
      • Find Minimum in Rotated Sorted Array
      • Maximum Number in Mountain Sequence
      • Search a 2D Matrix
      • Search a 2D Matrix II
      • Search for a Range
      • Smallest Rectangle Enclosing Black Pixels
      • Find Peak Element
      • Search in Rotated Sorted Array
      • Wood Cut
      • Find the Duplicate Number
      • Sqrt(x) II
      • Maximum Average Subarray II
      • Copy Books
      • How Many Problem Can I Accept
    • Linked List
      • Insert Node in Sorted Linked List
      • Merge Two Sorted Lists
      • Merge K Sorted Lists
      • LRU Cache
      • Reverse Linked List II
      • Copy List with Random Pointer
      • Reverse Nodes in k-Group
      • Add Two Numbers
      • Swap Nodes in Pairs
      • Rotate List
      • Linked List Cycle
      • Linked List Cycle II
      • Intersection of Two Linked Lists
      • Remove Linked List Elements
      • Reverse Linked List
      • Delete Node in a Linked List
      • Odd Even Linked List
      • Partition List
    • Recursion Basics
      • Fibonacci
      • Double Factorial
      • Reverse Order Storage
      • Linked List Weighted Sum In Reverse Order
    • Binary Tree
      • 1. Traversal
        • Binary Tree Preorder Traversal
        • Binary Tree Inorder Traversal
        • Binary Tree Postorder Traversal
        • Construct Binary Tree from Inorder and Postorder Traversal
        • Minimum Depth of Binary Tree
        • Find Leaves of Binary Tree
        • Reconstruct Itinerary
      • 2. Classical Questions
        • Maximum Depth of Binary Tree
        • Average of Levels in Binary Tree
        • Binary Tree Leaf Sum
        • Invert Binary Tree
        • Binary Tree Path Sum
        • Binary Tree Path Sum II
        • Binary Tree Path Sum III
        • Clone Binary Tree
        • Sum Root to Leaf Numbers
        • Binary Tree Level Sum
        • Binary Tree Paths
      • 3. Binary Search Tree
        • Insert Node in a Binary Search Tree
        • Remove Node in Binary Search Tree
        • Validate Binary Search Tree
        • Trim a Binary Search Tree
        • Search Range in Binary Search Tree
        • Inorder Successor in BST
        • Binary Search Tree Iterator
        • Recover Binary Search Tree
      • 4. Divide and Conquer
        • Balanced Binary Tree
        • Minimum Subtree
        • Subtree with Maximum Average
        • Maximum Subtree
        • Lowest Common Ancestor of a Binary Tree
        • Lowest Common Ancestor II
        • Lowest Common Ancestor III
        • Binary Tree Maximum Path Sum II
        • Binary Tree Maximum Path Sum
        • Path Sum III
      • Convert Sorted Array to Binary Search Tree
      • Path Sum
      • Lowest Common Ancestor of a Binary Search Tree
      • Sum of Left Leaves
      • Minimum Absolute Difference in BST
      • Minimum Distance Between BST Nodes
      • Convert Sorted List to Binary Search Tree
      • Range Sum of BST
      • Kth Smallest Element in a BST
      • Find Largest Value in Each Tree Row
    • Sorting
      • Sort Integers
      • Merge Two Sorted Arrays
      • Reverse Pair
      • Sort List
      • Partition Array
      • Sort Colors
      • Kth Largest Element
      • Multi-keyword Sort
    • Two Pointers
      • Window Sum
      • Two Sum - Difference equals to target
      • Valid Palindrome
      • Remove Duplicates from Sorted Array
      • Recover Rotated Sorted Array
      • Two Sum II - Input array is sorted
      • Two Sum - Unique pairs
      • Two Sum - Closest to target
    • Queue & Stack
      • Implement Queue by Interface
      • Implement Stack
      • Implement Queue by Two Stacks
      • Implement Stack by Two Queues
      • Binary Tree Level Order Traversal
      • Valid Parentheses
      • Min Stack
      • Largest Rectangle in Histogram
      • Evaluate Reverse Polish Notation
      • Implement Queue by Linked List II
      • Basic Calculator II
      • Moving Average from Data Stream
      • Reveal Cards In Increasing Order
      • Longest Valid Parentheses
    • Hash Table
      • Rehashing
      • Valid Anagram
      • Two Sum
      • Contiguous Array
      • Anagrams
      • Remove Duplicate Numbers in Array
      • Friendship Service
    • Heap & Priority Queue
      • Heapify
      • Top k Largest Numbers II
      • K Closest Points
      • Kth Smallest Number in Sorted Matrix
      • Find Median from Data Stream
      • Sliding Window Median
      • Trapping Rain Water II
      • High Five
    • BFS
      • 1. BFS in Binary Tree
        • Check Full Binary Tree
        • Binary Tree Level Order Traversal II
        • Binary Tree Maximum Path Sum II
        • Convert Binary Tree to Linked Lists by Depth
      • 2. Connected Graph & Topologic Sorting
        • Search Graph Nodes
        • Graph Valid Tree
        • Connected Component in Undirected Graph
        • Topological Sorting
        • Course Schedule
        • Course Schedule II
        • Sequence Reconstruction
        • Clone Graph
        • Alien Dictionary
    • Array
      • Remove Element
      • Search Insert Position
      • Maximum Subarray
      • Plus One
      • Merge Sorted Array
      • Pascal's Triangle
      • Pascal's Triangle II
      • Best Time to Buy and Sell Stock
      • Best Time to Buy and Sell Stock II
      • Majority Element
      • Contains Duplicate
      • Contains Duplicate II
      • Summary Ranges
      • Missing Number
      • Move Zeroes
      • Third Maximum Number
      • Binary Search
      • Pairs of Songs With Total Durations Divisible by 60
      • 3Sum
      • Median of Two Sorted Arrays
      • Running Sum of 1d Array
      • Container With Most Water
    • String
      • Longest Substring Without Repeating Characters
      • Roman to Integer
      • Implement strStr()
      • Reverse Words in a String
      • First Unique Character in a String
      • Count Unique Characters of All Substrings of a Given String
    • Math
      • Pow(x, n)
      • Narcissistic Number
      • A + B Problem
    • Dynamic Programming
      • Fibonacci Number
      • N-th Tribonacci Number
      • Climbing Stairs
      • Min Cost Climbing Stairs
    • LeetCode vs. LintCode Table
  • React Notes
    • Optimizing Performance in React
  • Golang Notes
    • Basics
      • Setup
      • Hello World
      • Structure
      • Data Types
      • Variables
      • Operators
      • Constants
      • Decision Making
      • Loops
      • Special Statements
    • Official Tutorial Notes
      • More Types
        • Functions
        • Pointers
        • Structs
        • Arrays
        • Slices
        • Range
        • Maps
        • More Functions
      • Methods and Interfaces
        • Methods
        • Interfaces
        • Stringers
        • Errors
        • Images
        • Readers
      • Concurrency
        • Goroutines
        • Channels
        • Range and Close
        • Select
        • Mutual Exclusion
  • Miscellaneous
    • Traveling to China During a Global Pandemic
Powered by GitBook
On this page
  • Introduction
  • What is an HTTP Caching Proxy?
  • Key Features and Architecture
  • 1. High-Performance Design
  • 2. Efficient Caching with LRU
  • 3. Log Management
  • 4. Error Handling
  • 5. HTTPS and Tunneling Support
  • 6. Dockerized for Easy Deployment
  • Performance Improvements and Benchmarks

Was this helpful?

  1. Projects

HTTP Caching Proxy

Introducing My HTTP Caching Proxy Project: A High-Performance, Scalable Solution As part of my efforts to deepen my understanding of networking, HTTP protocols, and high-performance server development

Introduction

As part of my efforts to deepen my understanding of networking, HTTP protocols, and high-performance server development, I created an HTTP Caching Proxy server as a course project. This proxy server is designed to handle HTTP requests and responses efficiently while caching the results to improve response times. The proxy is fully built in C++, leveraging advanced techniques like multi-threading, I/O multiplexing with epoll, and caching mechanisms based on LRU (Least Recently Used).

What is an HTTP Caching Proxy?

In simple terms, an HTTP proxy server acts as an intermediary between a client (such as a web browser) and an origin server (like a website). It forwards requests from the client to the server and sends back the server's response. By caching frequently requested content, a proxy server can significantly reduce the time it takes to fetch resources from the origin server and minimize bandwidth usage.

In my project, the caching proxy not only forwards requests and responses but also intelligently caches GET requests with a 200-OK status code and checks for cache validity based on expiration times and re-validation strategies. This reduces the number of requests that need to be sent to the origin server and ensures faster browsing for users.

Key Features and Architecture

1. High-Performance Design

I implemented the proxy server with multi-threading to handle multiple concurrent connections efficiently. By employing the epoll API for I/O multiplexing, the server can handle thousands of requests simultaneously without blocking. The server is designed using the master-slave Reactor pattern to separate the management of incoming connections from the actual request handling. This design ensures that the server scales well and remains responsive under high load.

2. Efficient Caching with LRU

A core feature of the proxy is its caching mechanism. The server uses an LRU (Least Recently Used) cache, meaning that the most frequently accessed resources are kept in memory, while less frequently used ones are evicted. This mechanism is crucial in ensuring that the cache stays within memory limits while maximizing performance. Additionally, cached responses are validated based on Cache-Control headers and ETags, which is consistent with the HTTP specification.

3. Log Management

A robust logging system is implemented to track every interaction with the proxy server. Each request is assigned a unique ID, and detailed logs are generated for requests, cache status, origin server interactions, and responses. This ensures that both users and developers can monitor and debug the server's behavior. The logs are stored in /var/log/erss/proxy.log inside the container and are accessible on the host machine via a mounted volume for easy analysis.

4. Error Handling

The proxy is designed to handle various errors gracefully, including:

  • Malformed requests (400 Bad Request)

  • Server errors (502 Bad Gateway)

  • Expired or corrupted cached data

If the origin server responds with an error or corrupted data, the proxy responds with the appropriate HTTP error code. This fault tolerance ensures a reliable experience even in the face of unexpected issues.

5. HTTPS and Tunneling Support

The proxy also supports HTTPS requests, which require the CONNECT method. For secure connections, the proxy opens a tunnel to the origin server and forwards encrypted data back and forth. The server logs when a tunnel is opened and closed, providing transparency into these interactions.

6. Dockerized for Easy Deployment

To facilitate easy deployment and testing, I containerized the proxy using Docker. A docker-compose.yml file and Dockerfile were created to allow the proxy to run as a Docker container. The container is configured to map port 12345 on the host machine to the proxy, and logs are saved in a directory on the host system, ensuring that developers can easily track the proxy's performance and behavior.

Technical Details and Implementation

  • Programming Language: C++

  • Networking: Linux Sockets, TCP

  • Concurrency: Multi-threading, Epoll, Thread Pool, Reactor Pattern

  • Caching: LRU Algorithm, Cache Expiration, Cache Re-validation

  • HTTP Protocols Supported: GET, POST, CONNECT (with optional support for other methods)

  • Error Handling: Graceful handling of server errors and malformed requests

  • Deployment: Docker, Docker Compose

Performance Improvements and Benchmarks

One of the key highlights of this project was improving the response time of the proxy. By using an LRU caching mechanism and ensuring that expired or stale cache entries were properly handled, the proxy was able to reduce response time by 30% compared to a non-caching setup.

The proxy was designed to handle over 10,000 concurrent connections, and thanks to the epoll-based I/O multiplexing and thread pool, it could scale to meet high demand. This makes it well-suited for production environments where performance is critical.

Logs and Debugging

The logging system I implemented provides detailed insights into the behavior of the proxy server. Every request and response is logged with a unique ID and includes information about:

  • The incoming request (method, URL, headers)

  • The cache status (whether the response was served from cache, if it was expired, etc.)

  • The server interactions (requests made to the origin server, responses received)

  • Responses sent back to the client

This logging mechanism is invaluable for debugging issues and understanding the server's internal state at any given time.

Conclusion

This HTTP Caching Proxy project allowed me to gain hands-on experience in network programming, server design, and high-performance systems. By implementing an efficient multi-threaded proxy with caching, I was able to deliver significant performance improvements for web browsing, while ensuring robust error handling and system monitoring.

PreviousMini AmazonNextCourse Enrollment App

Last updated 5 months ago

Was this helpful?