TransAtlantic networking
using Cloud links
Igor Sfiligoi
UC San Diego – San Diego Supercomputer Center
The problem
• ESNet TransAtlantic capacity is not expected to keep up with WLCG needs
• 400 Gbps now
• Not expected to exceed 600 Gbps anytime soon
What can WLCG community do?
• Drastically reduce network traffic
• Find alternatives
Can Cloud networking help?
There is plenty of Network capacity in the Clouds
• Presented during CHEP’19
• Measured TransAtlantic network bandwidth of
• Google (GCP): 1 Tbps (1060 Gbps)
• Amazon (AWS): 450 Gbps
• Microsoft (Azure): 190 Gbps
• And this was without any special arrangements
with the Cloud providers
• And without trying to hit the limit
• AWS comparable to ESNet today
• GCP already higher than projected ESNet capacity
And I just scratched the surface
• PR numbers are 100x as large!
https://www.theregister.co.uk/2020/02/18/orange_telxius_google_transatlantic_cable/
Of course, we need to get data to/from on-prem
• Around 100 Gbps is pretty easy to reach
• Demonstrated for all three Cloud providers
• For the purpose of this talk, I will assume
we can get as high as we need with modest effort (on same continent)
Fetching data from California (PRP):
• AWS West: 100 Gbps
• AWS Central: 90 Gbps
• GCP West: 100 Gbps
• GCP Central: 100 Gbps
• Azure West: 120 Gbps
• Azure S. Central: 120 Gbps
Cost is the only real constraint
• Cloud networking is not cheap
• The providers are in the business to make money
• More details on the next few slides
• ESNet still absolutely needed for base-load
• But can we afford the Cloud prices
for occasional bursts when needed?
• Just like we do with Computing?
The Cloud networking cost
Measuring about 1 TB of TransAtlantic traffic
• A fairly simple transfer of about 1 TB of data from the US to the EU
• Using HTTP and a couple of Squid proxies to force routing
• Resulting bill, on-prem to on-prem:
• AWS: $146 for ~1.3 TB
• GCP: $141 for ~1.2 TB
• Azure: $183 for ~1.1 TB
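The bills above can be turned into an effective $/GB egress rate, which makes the three providers directly comparable; a quick arithmetic sketch using only the figures quoted on this slide (1 TB taken as 1000 GB):

```python
# Back out the effective TransAtlantic egress rate from the measured bills.
# Figures are the bill totals and transfer volumes quoted above.
bills = {
    "AWS":   (146, 1300),  # ($ billed, GB transferred)
    "GCP":   (141, 1200),
    "Azure": (183, 1100),
}

for provider, (dollars, gigabytes) in bills.items():
    rate = dollars / gigabytes
    print(f"{provider}: ${rate:.3f}/GB")
# AWS comes out around $0.11/GB, GCP around $0.12/GB, Azure around $0.17/GB
```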
Tiered pricing
• My test was the worst-case scenario – Top tier
• Larger transfers will be charged at a lower rate
(No separate TransAtlantic charge)
• Really big transfers are expected to happen under specially negotiated prices
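To illustrate how tiered egress pricing blends down the average $/GB as volume grows, here is a minimal sketch; the tier brackets and rates below are illustrative assumptions only, not any provider's actual price sheet:

```python
# Illustrative tier table: (bracket size in GB, $/GB rate for that bracket).
# These numbers are made up for the example; real tables vary by provider.
TIERS = [
    (10_000, 0.11),        # first 10 TB at the top-tier rate
    (40_000, 0.09),        # next 40 TB at a lower rate
    (100_000, 0.07),       # next 100 TB lower still
    (float("inf"), 0.05),  # everything beyond (negotiated territory)
]

def egress_cost(gigabytes: float) -> float:
    """Total cost of a month's egress: each bracket billed at its own rate."""
    cost, remaining = 0.0, gigabytes
    for bracket_size, rate in TIERS:
        billed = min(remaining, bracket_size)
        cost += billed * rate
        remaining -= billed
        if remaining <= 0:
            break
    return cost

# The blended $/GB drops as volume grows:
for tb in (1, 50, 500):
    gb = tb * 1000
    print(f"{tb:>4} TB: ${egress_cost(gb):>9,.0f}  (${egress_cost(gb) / gb:.3f}/GB)")
```

With these assumed tiers, 1 TB pays the full top-tier rate while 500 TB blends down well below it, which is the mechanism behind the "larger transfers will be charged at a lower rate" bullet.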
Egress to on-prem price can be managed
• All Cloud providers offer options to peer at a lower price
• I am aware of an Internet2 setup with AWS (AWS Direct Connect)
Estimating large transfer cost: 200 TB/workday
• Let’s assume FNAL to CERN (or the other way around)
• 200 TB in 6h would average approx. 100 Gbps
• Using the list pricing this should cost approximately
• AWS: $19k
• GCP: $16k
• Azure: $22k
• Assuming we can get AWS Direct Connect to CERN (and that I understand the pricing right):
• AWS: $7k
• I don’t yet understand the peering pricing for the other providers
Estimating large transfer cost: 4 PB/day
• Let’s assume FNAL to CERN (or the other way around)
• 4 PB in 24h would average approx. 450 Gbps
• Assuming we do not get a large discount, list price:
• AWS: $280k
• GCP: $320k
• Azure: $400k
• Assuming we can get AWS Direct Connect to CERN (and that I understand the pricing right):
• AWS: $120k
• I don’t yet understand the peering pricing for the other providers
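The average line rates in the two estimates above follow directly from volume and wall-clock time; a sanity check with decimal units (1 TB = 10^12 bytes) gives raw averages of roughly 74 and 370 Gbps, a bit below the rounded slide figures, which presumably budget extra headroom for protocol overhead and non-uniform rates:

```python
def avg_gbps(terabytes: float, hours: float) -> float:
    """Average line rate (Gbps) needed to move `terabytes` in `hours`,
    using decimal units (1 TB = 1e12 bytes, 1 Gbps = 1e9 bits/s)."""
    bits = terabytes * 1e12 * 8
    return bits / (hours * 3600) / 1e9

print(f"200 TB in 6h : {avg_gbps(200, 6):.0f} Gbps")    # ~74 Gbps raw average
print(f"4 PB in 24h  : {avg_gbps(4000, 24):.0f} Gbps")  # ~370 Gbps raw average
```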
Summary
Cloud TransAtlantic networking is like compute
• Cloud providers have plenty of high-speed
TransAtlantic networking capacity
• But it comes at a cost
• Story very similar to what you see in Cloud computing
• Plenty of capacity, but at a cost
• I believe the same approach should be taken
• Use “on-prem” resources when possible, use the Cloud for bursts
• For networking, this means the ESNet TransAtlantic link
Acknowledgments
• This work has been partially sponsored by NSF grants
OAC-1826967, OAC-1541349, OAC-1841530,
OAC-1836650, MPS-1148698, OAC-1941481,
OPP-1600823 and OAC-190444.
• I kindly thank Amazon, Microsoft and Google for providing
Cloud credits that covered most of the incurred Cloud costs
