2026-05-19
cthread wrapper lets programmer specify
latency bounds, remote preference, failure handlers

diff --git a/lit-reviews/hale2021coalescent.md b/lit-reviews/hale2021coalescent.md
index e8066f1..6f8fcdc 100644
--- a/lit-reviews/hale2021coalescent.md
+++ b/lit-reviews/hale2021coalescent.md
@@ -5,3 +5,90 @@ date: "2026-05-19"
 bibliography: bibliography/references.bib
 csl: bibliography/chicago-author-date.csl
 ---
+
+## Source Information
+
+- **Title:** Coalescent Computing
+- **Authors:** Kyle C. Hale
+- **Year:** 2021
+- **Publication:** APSys 2021
+- **PDF:** [Coalescent Computing](/assets/papers/hale2021coalescent)
+
+## General Notes
+
+### Core Idea
+
+- Coalescent computing = cyber foraging for disaggregated edge hardware
+- Coalescence principle
+  - Resources coalesce into a user's device proportionally to physical proximity
+- Users' devices appear to gain/lose compute, memory, GPU, etc. transparently as
+  they move through an environment
+- Distinguished from typical cloud offload b/c automatic, proximity driven, fine
+  grained
+
+### Hardware
+
+- Wireless latency is largest blocker (WiFi at ~20ms), but 5G URLLC can get to
+  ~1ms
+  - Cache coherence isn't possible at these latencies
+- Current NICs not built for RDMA-style disagg
+  - Need hardware extensions
+- [[../concepts/acpi]] needs extensions to reflect dynamically coalesced
+  resources
+
+### Software
+
+- Need a "Coalescent OS" supporting:
+  - Performance, disaggregation, resource discovery, adaptation, heterogeneity,
+    fault tolerance
+  - LegoOS is closest foundation
+    - Designed for datacenter so ExCache and InfiniBand-based RPC don't
+      translate
+- Scheduling is multi-objective optimization problem that is further complicated
+  by user mobility
+  - Centralized won't work, so use decentralized ML approach
+- Heterogeneous ISAs require fat binaries or a JIT-based IR approach
+- Resource discovery needs to balance reactivity with network and battery
+  overhead
+
+### Programming Model
+
+- Default: OS transparently migrates threads to remote vCPUs
+  - Programmer unaware
+- Explicit API: `cthread` wrapper lets programmer specify latency bounds, remote
+  preference, failure handlers
+- FaaS variant: [[../concepts/virtines]] used as primitive for synchronous
+  remote function invocation, in connection to @wanninger2022isolating
+
+### Fault Tolerance
+
+- Replica hierarchy
+  - Edge server (secondary) $\rightarrow$ cloud (tertiary, relaxed consistency)
+- Periodic checkpointing + append-only storage for state recovery
+- Best-effort coalescence
+  - If network can't meet SLO, fall back to local or cloud
+
+## Key Findings
+
+- Coalescent Computing is positioned as feasible in the near term given
+  converging trends: disaggregated hardware, hierarchical edge clouds, and
+  ultra-low-latency wireless (5G URLLC, WiFi 6E)
+- [[../concepts/virtines]] are the proposed primitive for the FaaS/RPC
+  programming model in a coalescent system
+- Deterministic performance is identified as the central unsolved problem
+  - Same albatross that killed early distributed OSes
+
+## Critique/Gaps
+
+- This is a concept paper, so there's a lot lol
+
+## Questions
+
+- Is a trust peer and/or participation peer a good idea?
+
+## Relations
+
+- [[../concepts/unikernels]]
+- [[../concepts/disaggregated-hardware]]
+
+## References (if any)
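
Review note (not part of the patch): the explicit `cthread` API under Programming Model and the best-effort fallback under Fault Tolerance may be easier to reason about with a toy model. The sketch below is only an illustration under loud assumptions: `CThreadPolicy`, `run_cthread`, and `remote_exec` are hypothetical names that do not come from the paper, and the latency bound is checked after the remote call returns rather than enforced by an OS or network layer as a real system would need to do.

```python
import time
from dataclasses import dataclass
from typing import Any, Callable, Optional, Tuple


@dataclass
class CThreadPolicy:
    """Hypothetical policy knobs; names are illustrative, not from the paper."""
    latency_bound_ms: float                      # max tolerated remote round trip
    prefer_remote: bool = True                   # try a coalesced resource first
    on_failure: Optional[Callable[[Exception], None]] = None  # failure handler


def run_cthread(fn: Callable[..., Any],
                policy: CThreadPolicy,
                remote_exec: Callable[..., Any],
                *args: Any) -> Tuple[Any, str]:
    """Run fn remotely if the policy allows it, else fall back to local.

    remote_exec(fn, *args) is a stand-in for whatever transport a
    coalescent OS would provide. Returns (result, "remote" or "local").
    """
    if policy.prefer_remote:
        start = time.perf_counter()
        try:
            result = remote_exec(fn, *args)
            elapsed_ms = (time.perf_counter() - start) * 1000.0
            if elapsed_ms <= policy.latency_bound_ms:
                return result, "remote"
            # Bound missed: treat it as a failure and discard the late result.
            raise TimeoutError(f"remote run took {elapsed_ms:.1f} ms")
        except Exception as exc:
            if policy.on_failure is not None:
                policy.on_failure(exc)
    # Best-effort coalescence: fall back to local execution.
    return fn(*args), "local"
```

A caller would build a policy such as `CThreadPolicy(latency_bound_ms=1.0, on_failure=log.append)` and pass its own transport as `remote_exec`. Falling back to the cloud tier, which the notes also mention, is left out to keep the sketch short.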