7th EAI International Conference on Industrial Networks and Intelligent Systems, INISCOM 2021, Ha-Noi, Vietnam, 22-23 April 2021, vol. 379, pp. 73-87 (Full-Text Paper)
More than half of all Internet content is carried by content delivery networks (CDNs). CDNs cache popular, frequently requested content at the edges of the network, which helps to increase Quality of Experience (QoE), e.g., by decreasing time to first byte (TTFB). In this paper, we develop a hierarchical caching structure for CDNs to improve their QoE. We focus on unpopular content, since it accounts for a large portion of the content on the Internet. Our novel data-driven method forms caching clusters, or hierarchies, to handle unpopular content. To form the clusters and assign edge servers to them, we consider the pattern in which content has been requested, including the total number of requests, the objects that two edge servers have in common, and the requests for those objects. Using the tf-idf method, which is widely used in information retrieval, we compute the similarities between the requests that landed on each edge server, and we use these similarities to form clusters with the Markov Clustering algorithm. We evaluate our approach using different hierarchical models and real-world requests from a large-scale global CDN. We demonstrate that our hierarchical caching approach improves the cache hit ratio by 9.05%. Additionally, we observe a 7.39% decrease in TTFB.
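The clustering pipeline described in the abstract (tf-idf similarity between edge servers, followed by Markov Clustering) can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the smoothed idf variant, the use of cosine similarity, the inflation value, and the toy request log are all assumptions made for illustration, and the sketch ignores the total-request weighting the abstract mentions.

```python
from collections import Counter

import numpy as np


def tfidf_vectors(requests_per_server):
    """Treat each edge server as a 'document' whose 'terms' are requested object IDs."""
    servers = sorted(requests_per_server)
    objects = sorted({o for reqs in requests_per_server.values() for o in reqs})
    col = {o: j for j, o in enumerate(objects)}
    tf = np.zeros((len(servers), len(objects)))
    for i, server in enumerate(servers):
        counts = Counter(requests_per_server[server])
        total = sum(counts.values())
        for obj, count in counts.items():
            tf[i, col[obj]] = count / total
    df = (tf > 0).sum(axis=0)  # number of servers that saw each object
    idf = np.log((1 + len(servers)) / (1 + df)) + 1  # smoothed idf (assumed variant)
    return servers, tf * idf


def cosine_similarity(matrix):
    """Pairwise cosine similarity between the rows of `matrix`."""
    norms = np.linalg.norm(matrix, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # avoid division by zero for servers with no requests
    unit = matrix / norms
    return unit @ unit.T


def markov_clustering(similarity, inflation=2.0, max_iter=50, tol=1e-6):
    """Basic Markov Clustering: alternate expansion and inflation until stable."""
    m = similarity.astype(float).copy()
    np.fill_diagonal(m, 1.0)  # self-loops
    m /= m.sum(axis=0, keepdims=True)  # make columns stochastic
    for _ in range(max_iter):
        previous = m
        m = m @ m  # expansion: two-step random walk
        m = m ** inflation  # inflation: strengthen strong flows
        m /= m.sum(axis=0, keepdims=True)
        if np.abs(m - previous).max() < tol:
            break
    # Each row with non-zero entries is an attractor; its non-zero columns form a cluster.
    clusters = set()
    for row in m:
        members = frozenset(np.nonzero(row > 1e-6)[0].tolist())
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)


def cluster_edge_servers(requests_per_server):
    """Return clusters of server names based on the similarity of their request patterns."""
    servers, vectors = tfidf_vectors(requests_per_server)
    clusters = markov_clustering(cosine_similarity(vectors))
    return [[servers[i] for i in cluster] for cluster in clusters]


if __name__ == "__main__":
    # Hypothetical request log: object IDs requested at each edge server.
    toy_log = {
        "edge-a": ["x", "x", "y"],
        "edge-b": ["x", "y", "y"],
        "edge-c": ["p", "q"],
        "edge-d": ["p", "p", "q"],
    }
    print(cluster_edge_servers(toy_log))
```

In this toy log, the first two servers share no objects with the last two, so their cross-similarities are zero and Markov Clustering separates them into two groups. A production system would likely use sparse matrices and tune the inflation parameter, which controls how fine-grained the resulting clusters are.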