Here we first explain the challenges of bringing decentralized data collaboration to the real world, then give an overview of the content of this concept doc.

Challenges of decentralized data collaboration

Data collaboration is a common practice of great importance in both academia and industry. It often involves multiple participants with various privacy and security concerns, making it extremely challenging to build real-world solutions.

The challenges in both developing and deploying decentralized data collaboration solutions stem from missing key abstractions and building blocks.

Goal of the concept doc

Throughout the concept doc, we summarize and analyze different types of existing workloads and identify their shared abstractions. Combining these observations, we present Decentralized Programming, a simple, secure, and flexible new programming abstraction for applications that involve multiple participants, and build CoLink, which provides the decentralized programming abstractions.

In the concept doc:

To characterize the performance of CoLink, in a later section of the doc we evaluate the efficiency of the proposed abstractions, including a micro-benchmark on basic operations.

We also conduct a case study on a practical billion-scale private set intersection system built with CoLink. The solution built with CoLink requires less development effort (about 400 LoC) yet performs better, achieving up to a 20x improvement over existing work.


Next, we introduce the background for the problem of decentralized data collaboration.

Background