Intro

Networking is critical in the cloud, but unlike CPU and memory, network bandwidth is not a resource that is shared among tenants in any well-defined, fair way. Proportional fairness extended to the whole network, where each customer gets bandwidth in proportion to what they pay, turns out not to be achievable alongside the other requirements below.

Ideal requirements for a cloud network

  1. min-guarantee: each tenant can count on a minimum network bandwidth
  2. high utilization: bandwidth is not left idle while some tenant still has unmet demand
  3. network proportionality: each tenant's network bandwidth is proportional to the money it spends

Trade-offs

The authors found a clear trade-off between min-guarantee and network proportionality. For example, suppose tenants A and B each have a VM on the same host, so the two VMs share its access link. A's VM talks to one other VM (2 VMs in total), whereas B's VM talks to 10 other VMs (11 VMs in total). Under network proportionality, A gets only 2/13 of the link's bandwidth. B can keep increasing its VM count until A gets nearly no bandwidth, violating any reasonable min-guarantee.
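The arithmetic behind this example can be sketched as follows. This is only an illustration: it assumes every VM carries equal weight, counts 2 VMs for A and 11 for B (which is what the 2/13 figure implies), and `proportional_share` is a hypothetical helper, not something from the paper.

```python
from fractions import Fraction

def proportional_share(vms_a: int, vms_b: int) -> Fraction:
    """Fraction of the shared link tenant A gets when the link is split
    in proportion to each tenant's total VM count (equal weight per VM)."""
    return Fraction(vms_a, vms_a + vms_b)

# A stays at 2 VMs while B keeps adding VMs.
shares = {vms_b: proportional_share(2, vms_b) for vms_b in (11, 50, 200, 1000)}
for vms_b, share in shares.items():
    print(f"B has {vms_b:>4} VMs -> A gets {share} of the link ({float(share):.2%})")
```

With 11 VMs for B, A's share is 2/13; as B grows, A's share tends toward zero, so no fixed minimum can be promised to A.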

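Moreover, high utilization and network proportionality also trade off against each other. A tenant with a free, uncongested path may be punished for using it to its fullest: that traffic counts toward the tenant's network-wide share, so its allocation on congested links shrinks simply to maintain network proportions. The tenant then has an incentive to leave the free path underused.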

Clearly, no allocation can satisfy all three requirements at once.

One potential solution is to sacrifice network proportionality. This can be done with link-level fairness, where fairness is enforced only on the congested network links themselves and each link acts independently. This is essentially max-min fairness applied per link.
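To make "max-min fairness over a link" concrete, here is a minimal water-filling sketch for a single link. It assumes equal tenant weights and known demands; the function name `max_min_share` and the numbers are hypothetical, and this is not the paper's allocation mechanism.

```python
def max_min_share(capacity, demands):
    """Max-min fair split of one link's capacity (equal weights).

    Repeatedly offer every unsatisfied tenant an equal slice of what is
    left; tenants whose demand fits inside the slice are capped at their
    demand and the leftover is redistributed among the rest.
    """
    alloc = {tenant: 0.0 for tenant in demands}
    active = {tenant for tenant, demand in demands.items() if demand > 0}
    remaining = float(capacity)
    while active:
        fair = remaining / len(active)
        satisfied = {t for t in active if demands[t] <= fair}
        if not satisfied:
            # Everyone left wants more than the equal slice: split evenly.
            for t in active:
                alloc[t] = fair
            break
        for t in satisfied:
            alloc[t] = demands[t]
            remaining -= demands[t]
        active -= satisfied
    return alloc

result = max_min_share(10, {"A": 2, "B": 8, "C": 8})
print(result)
```

On a link of capacity 10, tenant A's small demand of 2 is met in full and B and C split the remaining 8 evenly. Because each link runs this independently, nothing here looks at how many VMs a tenant has elsewhere in the network.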

Network Sharing Properties

Traditional allocation policies do not work for this problem, because there is no clean way to share links; they fail at the network level. The authors highlight five properties of a network allocation:

  1. Work conservation: a link L should not be idle so long as there is work it could do
  2. Strategy proofness: tenants cannot gain an advantage by lying about their demands. No practical algo exists to prevent this.
  3. Utilization incentives: tenants are never incentivized to reduce their demands to gain an advantage.
  4. Communication pattern independence: the allocation of a VM depends only on the VMs it talks to and not on the communication pattern among them. This is obviously not possible.
  5. Symmetry: reversing the direction of all flows should not change the allocation. Also not really possible.

Proposed solutions

  1. PS-L (link-level): fair on each congested link (good utilization incentives, but no strong guarantee). Can be gamed a bit.
  2. PS-N (network-level-ish): tries to make each tenant's total share across the network more proportional.
  3. PS-P (proximate/tree): gives real minimum bandwidth guarantees (especially in tree/fat-tree-ish topologies) and keeps utilization high, but doesn’t try to be proportional network-wide.
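One way to see how a network-level scheme could keep a tenant's total share proportional is to weight each communicating VM pair. The sketch below assumes a pair X-Y gets weight W_X/N_X + W_Y/N_Y, where N_X is the number of VMs that X talks to; that formula is an assumption made for illustration and should be checked against the paper, and `ps_n_pair_weights` and the VM names are hypothetical.

```python
from collections import defaultdict

def ps_n_pair_weights(vm_weight, pairs):
    """Weight per communicating VM pair, assumed to be W_X/N_X + W_Y/N_Y,
    with N_X the number of distinct peers of VM X."""
    peers = defaultdict(set)
    for x, y in pairs:
        peers[x].add(y)
        peers[y].add(x)
    return {
        (x, y): vm_weight[x] / len(peers[x]) + vm_weight[y] / len(peers[y])
        for x, y in pairs
    }

# Tenant A: A1 talks to A2. Tenant B: B1 talks to B2..B11. Every VM has weight 1.
vm_weight = {"A1": 1.0, "A2": 1.0, **{f"B{i}": 1.0 for i in range(1, 12)}}
pairs = [("A1", "A2")] + [("B1", f"B{i}") for i in range(2, 12)]
weights = ps_n_pair_weights(vm_weight, pairs)

# Sum the pair weights per tenant (tenant = first letter of the VM name).
tenant_total = defaultdict(float)
for (x, _), w in weights.items():
    tenant_total[x[0]] += w
print(dict(tenant_total))
```

Under this assumed weighting, each tenant's pair weights add up to its number of VMs (about 2 for A and 11 for B), however its VMs choose to talk to each other. That is the sense in which the total share across the network stays proportional.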

References

  1. [FairCloud: Sharing the Network in Cloud Computing](https://dl.acm.org/doi/pdf/10.1145/2342356.2342396)