Graduate students of the Computer Science Department will present their research work. The purpose of this session is to share knowledge and get to know each other's research interests. Presentations will last 10 minutes, followed by 5 minutes of questions. Presenters should submit both their presentation document and poster in PDF format by October 1st, 2019, by sending an e-mail to email@example.com with the following subject: [GSC19] Presentation and Poster submission - < Author's Full Name >. Presenters may request in their submission e-mail that the organizing committee of the conference print their poster. The poster session will take place between 14:00-15:00, and posters should be A0 size (84.1 × 118.9 cm).
The amount of data generated daily by social media and large
organizations is increasing at a high rate. The flood of data
enables JVM-based data analytics frameworks such as Apache Spark
or Apache Flink to perform intensive analysis on data to discover
and predict trends. As computations intensify and data sizes grow, the
memory capacity of a server must scale accordingly, so that big data
analytics engines can move and process large amounts of data close to
the processor.
However, DRAM capacity scaling cannot match the requirements of
in-memory big data analytics frameworks. There are two orthogonal
approaches to overcoming this limit: (i) adding more memory to each
server via persistent memory (PM) technologies, and (ii) adding more
servers.
Our goal is to explore how persistent memory can be used to enlarge the memory of each server running Apache Spark. We are exploring the use of on-heap, off-heap, and storage address spaces in Apache Spark, and we are considering optimizations specific to each approach. In addition, we want to understand which aspects of PM (byte addressability, performance, etc.) have the largest impact on big data analytics applications, and what new challenges (e.g., garbage collection time) JVM-based data analytics engines face when operating with terabyte-sized heaps. We would also like to contrast these approaches with mmap-based systems, where a block-device address space can appear as an extension of DRAM (and is accessed via DRAM).
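To make the off-heap direction concrete: Spark already exposes configuration keys for off-heap memory, and a PM-backed deployment could start from settings like the sketch below. The configuration keys are real Spark options, but the sizes and the helper function are illustrative assumptions, not the authors' actual setup.

```python
# Illustrative sketch: build a Spark configuration that moves execution and
# storage memory off the JVM heap (e.g., onto PM), keeping the on-heap
# allocation small so garbage collection pauses stay short.
def offheap_conf(offheap_gb, onheap_gb=8):
    """Return a dict of Spark options enabling off-heap memory.
    Sizes are hypothetical examples."""
    return {
        "spark.memory.offHeap.enabled": "true",
        "spark.memory.offHeap.size": f"{offheap_gb}g",
        "spark.executor.memory": f"{onheap_gb}g",
    }

conf = offheap_conf(128)
```

Such a dict could be passed to a `SparkConf` when building the session; the point is that off-heap capacity can grow far beyond the JVM heap without increasing GC pressure.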
With the constant evolution of high-performance applications, their memory
requirements are rapidly increasing. As a result, the demand for more memory
on the computer nodes of large clusters running these applications
continuously rises. However, an individual computer node has limited memory
capacity. Typically, running several processes with different computational
and memory requirements on a cluster creates fluctuating workloads among the
computer nodes. Hence, some nodes use most of their memory while others are
left with unused memory that could be exploited by nodes under a heavy
memory workload.
Consequently, remote memory management has become a subject of research for many organizations, which have implemented varying techniques for reading and writing data in remote memory. Although using remote memory effectively increases the total memory available to a computer node, accessing data remotely can severely degrade performance, because the data must travel through the cluster's network interconnect. Furthermore, the software APIs that give processes access to remote memory can be complex, and they often leave remote memory allocation and fair sharing of remote memory to the processes themselves, which becomes complicated when many processes run simultaneously on the same computer node.
In our thesis, we present the Page Migration System (PMS), which monitors the main memory usage of the computer nodes in a cluster and moves infrequently accessed data of a process from the memory of a node under a heavy memory workload to the unused memory of a remote node in the same cluster with a lighter workload. The key features of the PMS are that it transparently moves the least recently used (LRU) pages of processes to remote memory, and that it applies a fairness algorithm when choosing pages among the many processes running on the same node. Moreover, remote memory is mapped on the local node, allowing the OS to cache remote data: a read or write on remote memory occurs only on a cache miss, so cacheability improves performance by reducing network transfers when misses are few. Finally, the system returns memory pages to the local node if the overall memory usage of the node drops, or if the access frequency of those pages increases.
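The two key policies above, LRU victim selection per process and fairness across processes, can be sketched as follows. This is a toy model under our own assumptions (round-robin as one possible fairness policy); the class and function names are hypothetical, not the PMS's actual API.

```python
from collections import OrderedDict

class ProcessPages:
    """Toy model of one process's resident pages, kept in LRU order."""
    def __init__(self, pid):
        self.pid = pid
        self.pages = OrderedDict()  # oldest (LRU) entry first

    def touch(self, page_id):
        count = self.pages.pop(page_id, 0)
        self.pages[page_id] = count + 1  # move to the MRU position

def evict_fairly(processes, n_pages):
    """Pick n_pages LRU victims round-robin across processes, so no single
    process bears the cost of all migrations to remote memory."""
    victims, queue = [], [p for p in processes if p.pages]
    while len(victims) < n_pages and queue:
        p = queue.pop(0)
        page_id, _ = p.pages.popitem(last=False)  # this process's LRU page
        victims.append((p.pid, page_id))
        if p.pages:
            queue.append(p)  # rotate to the back of the queue
    return victims
```

For example, with one process holding pages {1, 2, 3} and another holding {10, 20}, evicting three pages takes one victim from each process in turn rather than draining a single process.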
We evaluate the PMS using several benchmarks that stress memory access. We use benchmarks that perform raw serial accesses over arrays of around a gigabyte in size, causing frequent cache evictions and thus moving more data through the network; this lets us measure the performance drop of a process due to memory access in the worst-case scenario. We also run cache-blocking benchmarks that exploit temporal locality, and show that they achieve better performance by reducing operations on remote memory. Finally, we observe the behaviour and performance of real HPC applications running under the PMS.
Over the past years, there has been an increasing number of
key-value (KV) store designs, each optimizing for a different
set of requirements. Furthermore, with the advancements of
storage technology the design space of KV stores has become
even more complex. More recent KV-store designs target fast
storage devices, such as SSDs and NVM. Most of these designs
aim to reduce amplification during data re-organization by
taking advantage of device characteristics. To date, however,
most analysis of KV-store designs has been experimental and
limited to specific design points. This makes it difficult to
compare tradeoffs across different designs, find optimal
configurations, and guide future KV-store design.
In this presentation, we introduce the Variable Amplification--Throughput analysis (VAT) to calculate insert-path amplification and its impact on multi-level KV-store performance. We use VAT to express the behavior of several existing design points and to explore tradeoffs that are not possible or easy to measure experimentally. VAT indicates that by inserting randomness in the insert-path, KV stores can reduce amplification by more than 10x for fast storage devices. Techniques such as key-value separation and tiering compaction reduce amplification by 10x and 5x, respectively. Additionally, VAT predicts that as device technology advances towards NVM, the benefits of both key-value separation and tiering diminish.
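To make the amplification tradeoff concrete, here is a first-order, textbook-style approximation of insert-path write amplification for leveling versus tiering compaction in a multi-level store. This is our own illustrative model with assumed parameters, not VAT's actual equations.

```python
def levels(dataset_gb, memtable_gb, growth):
    """Levels needed when each level is `growth` times larger than the last."""
    n, capacity = 1, memtable_gb * growth
    while capacity < dataset_gb:
        capacity *= growth
        n += 1
    return n

def write_amp_leveling(dataset_gb, memtable_gb, growth):
    # Leveling: merging into a level rewrites roughly `growth` times the
    # incoming data, so each key is rewritten ~growth times per level.
    return growth * levels(dataset_gb, memtable_gb, growth)

def write_amp_tiering(dataset_gb, memtable_gb, growth):
    # Tiering: runs accumulate within a level, so each key is written
    # roughly once per level before moving down.
    return levels(dataset_gb, memtable_gb, growth)
```

For instance, a 1 TB dataset over a 1 GB memtable with growth factor 10 needs 3 levels, giving roughly 30x amplification under leveling versus about 3x under tiering in this simplified model, which is why techniques that avoid rewriting data pay off so strongly on fast devices.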
Technical presentation on the system I'm working on at CARV for my Master's thesis.
Abstract: Scale-out persistent key-value stores are at the heart of modern data processing systems. However, they exhibit high CPU and I/O overhead because they use TCP/IP for their communication across servers and target HDDs as their storage devices. With the advent of flash storage and fast networks in datacenters, there is a lot of room for improvement in terms of CPU efficiency. In this paper we design a scale-out version of Kreon, an efficient key-value store tailored for flash storage, that uses RDMA for its communication. RDMA's lower protocol overhead and μs-level latency reduce both the overhead imposed by replication and the latency experienced by the client.
Internet mobile traffic grows exponentially, and multimedia streaming is a dominant contributor to this increase. Multimedia delivery services, such as online video (e.g., YouTube, Netflix, Hulu), audio (e.g., Spotify, Deezer), live streaming for gaming (e.g., Twitch.tv), content over social media (e.g., Facebook), etc., use recommendation systems (RSs) to best satisfy users and/or maximize their engagement in the service (retention rate). In this work, we propose that jointly designing communication networks and recommendation systems enables content and network providers to keep up with the increasing data demand, while maintaining acceptable Quality of Experience (QoE). Our goal is to investigate the benefits of this concept from both the content provider's side and the end user's. To this end, we conduct a simulation-driven evaluation and an evaluation with real users to quantify the advantages of this idea. We apply our approach to the YouTube service and conduct measurements on YouTube video recommendations. Our analysis supports the potential of network-aware recommendations to benefit both users (better experience) and content providers (higher retention rates, offloaded core network). We envision that the experimental results presented in this work are a first step towards embracing recommendations into networks, to jointly improve user satisfaction, content provider metrics, and network performance.
CAPrice is an initiative led by FORTH for developing socio-technical solutions that can make an impact on privacy protection in the digital world. CAP-A is an NGI-Trust-funded project that proposes a bottom-up approach to digital privacy: it implements a suite of tools and engages users to participate in crowdsourcing activities that improve citizens' awareness of privacy issues, and leverages this awareness to motivate the market to adopt more privacy-friendly practices.