Caching the Uncacheable:
Leveraging Your CDN to Cache Dynamic Content
Hooman Beheshti, VP Technology
  
Dynamic Content Is Really Interesting!
  
What Is Dynamic Content?
• Stuff that’s not static!
• With web traffic, generally the base HTML
  – Big deal because it’s blocking
  – And sometimes the largest object → longer download
• Could be other things too
  – AJAX calls
  – API calls
• More…
  
Blocking	
  
Classically, with dynamic content…
Caching
  
Dynamic Content Caching Problems
• Serving stale pages
  – Lack of good invalidation framework
  
	
  
Caching vs. Invalidation
  
We tried…
  
Dynamic Content Caching Problems
• Serving stale pages
  – Lack of good invalidation framework
• Real-time visibility
  – Real-time analytics/stats
  – Real-time logging
  
	
  
CDNs and Dynamic Content
• Generally, handling dynamic content has been a matter of transport
  – Optimize from-origin delivery
  – “DSA” (Dynamic Site Acceleration)
  – Middle mile optimizations
  – TCP tweaks
  
Dynamic Content, Traditionally
[Diagram: Client, CDN Node, Origin; label "Some TCP Tweaks"]

Dynamic Content, Traditionally
[Diagram: Client, two CDN Nodes, Origin; label "Lots of TCP Tweaks"]
  
Dynamic Content, Traditionally
• We sometimes do micro caching of HTML
  – Short TTL for HTML content
  – Not foolproof
    • Ex: news stories faux pas!
• ESI (Edge Side Includes)
  – Partial caching
  – Hard and onerous
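
The micro-caching trade-off above can be sketched in a few lines. This is a minimal illustration, not how any particular CDN implements it: the `MicroCache` class, the 5-second TTL, the fake clock, and the `/story` URL are all assumptions made up for the example. It shows why a short TTL is not foolproof: a corrected news story keeps being served in its old form until the TTL runs out.

```python
import time
from typing import Callable, Dict, Tuple


class MicroCache:
    """Hypothetical cache that keeps rendered pages for a short, fixed TTL."""

    def __init__(self, ttl: float, clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl
        self.clock = clock
        self.store: Dict[str, Tuple[float, str]] = {}

    def get(self, url: str, render: Callable[[str], str]) -> str:
        now = self.clock()
        entry = self.store.get(url)
        if entry is not None and now - entry[0] < self.ttl:
            return entry[1]      # still fresh: served without touching the origin
        body = render(url)       # expired or missing: go back to the origin
        self.store[url] = (now, body)
        return body


# Simulated clock and origin so the behaviour is reproducible.
fake_now = [0.0]
headline = ["Original headline"]
cache = MicroCache(ttl=5.0, clock=lambda: fake_now[0])


def render_page(url: str) -> str:
    return headline[0]


first = cache.get("/story", render_page)
headline[0] = "Corrected headline"              # the editor fixes the story
fake_now[0] = 2.0
during_ttl = cache.get("/story", render_page)   # old copy is still served
fake_now[0] = 6.0
after_ttl = cache.get("/story", render_page)    # TTL expired, page is refetched
```

Shortening the TTL narrows the stale window but also sends more requests back to the origin, which is the tension the rest of the talk addresses with purging.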
  
	
  
Actually…
• Dynamic content is more cacheable than we think
• Static for short periods of time
• Unpredictable invalidation
  – Standard HTTP caching rules aren’t good enough
  
A Lot Better!
[Diagram: Client, two CDN Nodes, Origin]
  
Blocking	
  
So Many Benefits!
• Performance
  – Faster time to first byte
  – Faster start render
  – Happy users!
• Offload
  – Less work for our servers
  – Less bandwidth at origin
  
What would make it better?
  
Programmatic Invalidation
• Invalidation API
• Granular
• Instantaneous
  – Big problem with classic CDNs (multi-minute purges)
  
Power of the Purge!
• Instant purging:
  – As a page gets published, a purge command also gets published
  – Instant means: predictable and deterministic behavior
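
A rough sketch of "publish also publishes a purge", assuming a toy in-memory model: the `EdgeCache` and `Cms` classes are hypothetical stand-ins for a CDN node and a content management system, not any vendor's API. Objects stay cached indefinitely and only a publish event removes them, so what the edge serves is determined by publish events rather than by timers.

```python
from typing import Callable, Dict


class EdgeCache:
    """Hypothetical edge cache: objects live until they are purged."""

    def __init__(self, fetch_from_origin: Callable[[str], str]):
        self.fetch_from_origin = fetch_from_origin
        self.objects: Dict[str, str] = {}
        self.origin_hits = 0

    def get(self, url: str) -> str:
        if url not in self.objects:
            self.origin_hits += 1                  # cache miss: ask the origin
            self.objects[url] = self.fetch_from_origin(url)
        return self.objects[url]

    def purge(self, url: str) -> None:
        self.objects.pop(url, None)                # takes effect immediately


class Cms:
    """Hypothetical CMS that purges the edge as part of publishing."""

    def __init__(self) -> None:
        self.pages: Dict[str, str] = {}
        self.edge = EdgeCache(lambda url: self.pages[url])

    def publish(self, url: str, html: str) -> None:
        self.pages[url] = html
        self.edge.purge(url)                       # purge travels with the publish


cms = Cms()
cms.publish("/post", "<h1>v1</h1>")
before = [cms.edge.get("/post") for _ in range(3)]   # one miss, then two hits
cms.publish("/post", "<h1>v2</h1>")
after = cms.edge.get("/post")                        # next request sees v2
```

In this model the origin is contacted exactly once per published version, no matter how many readers arrive in between.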
  
	
  
Power of the Purge!
• Purge dependencies
  – Surrogate Keys
  – Using tags to purge entire chunks of content at once
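
The idea of tagging objects and purging by tag might look like the following. It is a simplified in-memory sketch: the `TaggedCache` class and the key names (`post-42`, `front-page`, `static-pages`) are invented for illustration and say nothing about how a real CDN stores its index.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set


class TaggedCache:
    """Hypothetical cache where each object carries one or more surrogate keys."""

    def __init__(self) -> None:
        self.objects: Dict[str, str] = {}
        self.urls_by_key: Dict[str, Set[str]] = defaultdict(set)

    def put(self, url: str, body: str, keys: Iterable[str]) -> None:
        self.objects[url] = body
        for key in keys:
            self.urls_by_key[key].add(url)

    def purge_key(self, key: str) -> int:
        """Drop every object tagged with key; return how many URLs were tagged."""
        urls = self.urls_by_key.pop(key, set())
        for url in urls:
            # pop() with a default tolerates URLs already removed via another key
            self.objects.pop(url, None)
        return len(urls)


cache = TaggedCache()
cache.put("/", "home page listing post 42", keys=["post-42", "front-page"])
cache.put("/posts/42", "post 42", keys=["post-42"])
cache.put("/about", "about page", keys=["static-pages"])

purged = cache.purge_key("post-42")   # one call removes both dependent pages
remaining = sorted(cache.objects)
```

A single purge for `post-42` clears the post and the home page that embeds it, while `/about` stays cached.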
  
	
  
More than just Invalidation…
  
The Influence of Clouds
• The CDN is an extension of the app
• No longer a black box
• Real-time integration with the app
• Infrastructure as code
  – Your content => You need control
  
Control
• Programmability
  – Configuration API
  – Invalidation API
  – Instantaneous and real time
  – Granular caching
    • Ex: Geo-based caching
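
One way to picture geo-based caching is a cache key that includes the visitor's country only where the content actually differs by location. The sketch below is an assumption-heavy illustration: the `cache_key` function, the `/store` prefix, and the key format are hypothetical, and how the country code is obtained is left out.

```python
from typing import Iterable


def cache_key(host: str, path: str, country_code: str,
              geo_prefixes: Iterable[str] = ("/store",)) -> str:
    """Build a cache key; add the country only for geo-specific paths."""
    key = f"{host}{path}"
    if any(path.startswith(prefix) for prefix in geo_prefixes):
        key += f"|country={country_code.upper()}"
    return key


us_store = cache_key("www.site.com", "/store/deals", "us")
de_store = cache_key("www.site.com", "/store/deals", "de")
us_about = cache_key("www.site.com", "/about", "us")
de_about = cache_key("www.site.com", "/about", "de")
```

Store pages get one cached copy per country, while pages such as `/about` share a single copy worldwide, which keeps the hit ratio high where location does not matter.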
  
Control at the Edge
• Moving app logic to the edge
• VCL
  – Varnish Configuration Language
  – Script-like configuration for functionality at the edge
  
Visibility
• Real-time analytics
  – Network stats
  – HTTP stats (status codes, etc.)
  – Caching stats (hits, misses, etc.)
  – Stats API
• Logging
  – Real-time logs
  – Streaming to various log destinations
  
Example: CMS + Purge
  
WordPress: Before
[Diagram sequence, five slides: CDN Node; the last slide adds the label "Cache"]
  
WordPress: After
[Diagram sequence, eight slides: CDN Node. The response below appears on the second and the last slide; "PURGE" appears on slides five to seven, the last of them with the note "(Has to be instantaneous!)"]
HTTP/1.1 200 OK
Content-Type: text/html
Content-Length: 55,666
Cache-Control: Long Time, totally!
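
The `Cache-Control` value on the slide is a deliberate placeholder; a real response needs an actual directive such as `max-age`. The sketch below shows one plausible shape for the two pieces a CMS plugin would need: a long-lived caching header and a purge request. The function names, the 30-day lifetime, and the example post URL are assumptions, and whether a given CDN accepts an HTTP `PURGE` request (and with what authentication) depends on the provider.

```python
import urllib.request
from typing import Dict


def long_lived_cache_headers(max_age_seconds: int) -> Dict[str, str]:
    """Response headers asking caches to keep the page for max_age_seconds."""
    return {"Cache-Control": f"max-age={max_age_seconds}"}


def build_purge_request(url: str) -> urllib.request.Request:
    """Build (but do not send) an HTTP PURGE request for one cached URL."""
    return urllib.request.Request(url, method="PURGE")


headers = long_lived_cache_headers(30 * 24 * 3600)   # 30 days, an arbitrary choice
purge = build_purge_request("http://www.site.com/2014/06/hello-world/")
# urllib.request.urlopen(purge) would send it; left out so the sketch runs offline.
```

The long lifetime is only safe because the purge is sent whenever the post changes.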

Example: Beacon Termination at the Edge
  
Before
[Diagram sequence, four slides: CDN Node, Origin, Log Analysis]
Request: http://collector.site.com/beacon.img?a=1&b=2&c=3
Response (shown on the last two slides):
HTTP/1.1 200 OK
Pragma: no-cache
Expires: Wed, 19 Apr 2000 11:43:00 GMT
Cache-Control: no-cache, no-store
Last-Modified: Wed, 21 Jan 2004 19:51:30 GMT
Content-Type: image/gif
Date: Fri, 20 Jun 2014 12:22:20 GMT
Server: Apache
Content-Length: 35

After
[Diagram sequence, four slides: CDN Node, Origin; the last slide adds "Syslog / S3 / FTP / etc."]
Request: http://collector.site.com/beacon.img?a=1&b=2&c=3
Response on the second slide:
HTTP/1.1 200 OK
Pragma: no-cache
Expires: Wed, 19 Apr 2000 11:43:00 GMT
Cache-Control: no-cache, no-store
Last-Modified: Wed, 21 Jan 2004 19:51:30 GMT
Content-Type: image/gif
Date: Fri, 20 Jun 2014 12:22:20 GMT
Server: Apache
Content-Length: 35
Response on the third slide:
HTTP/1.1 204 No Content
Date: Sat, 21 Jun 2014 23:21:12 GMT
Server: Awesome Server
Content-Length: 0
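
Beacon termination can be thought of as: read the query string, write it to a log stream, and answer with an empty 204 without contacting the origin. The following is a sketch of that logic in Python rather than edge configuration; the `terminate_beacon` function and the JSON log format are assumptions, and the list stands in for a real log destination.

```python
import json
from typing import List, Tuple
from urllib.parse import parse_qsl, urlsplit


def terminate_beacon(url: str, log_sink: List[str]) -> Tuple[int, bytes]:
    """Log the beacon's parameters and answer 204 without calling the origin."""
    parts = urlsplit(url)
    params = dict(parse_qsl(parts.query))
    log_sink.append(json.dumps({"path": parts.path, "params": params},
                               sort_keys=True))
    return 204, b""          # 204 No Content: status only, empty body


logs: List[str] = []          # stand-in for a syslog, S3, or FTP log stream
status, body = terminate_beacon(
    "http://collector.site.com/beacon.img?a=1&b=2&c=3", logs
)
```

The data the beacon carried ends up in the log stream, so the 35-byte image and the trip to the origin are no longer needed.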

Example: Edge-generated Content
  
JSON Data Center ID
[Diagram sequence, two slides: CDN Node, Origin]
Request: http://www.site.com/which_datacenter.js
Response (second slide): { "datacenter": "SJC" }
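
The edge-generated response can be illustrated as a function that builds the JSON body from a value the edge node already knows, with no origin request involved. This is a Python sketch of the idea only: the `which_datacenter_response` function and the `application/json` content type are assumptions, and the data center code is passed in as a plain string.

```python
import json
from typing import Dict, Tuple


def which_datacenter_response(datacenter_code: str) -> Tuple[int, Dict[str, str], str]:
    """Build a synthetic response identifying the serving data center."""
    body = json.dumps({"datacenter": datacenter_code})
    headers = {
        "Content-Type": "application/json",
        "Content-Length": str(len(body.encode("utf-8"))),
    }
    return 200, headers, body


status, headers, body = which_datacenter_response("SJC")
```

Because the body is produced at the edge, each node reports its own location and the origin never sees the request.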

VCL Snippet
  
More Examples
• Caching with tracking cookies:
  – http://www.fastly.com/blog/how-to-cache-with-tracking-cookies
• API Caching:
  – http://www.fastly.com/blog/api-caching-part-iii (part 3, with links to previous two parts)
• Log Streaming:
  – http://www.fastly.com/blog/tips-for-streaming-logs
  
	
  
Let’s Sum Up!
  
Summary
• Dynamic content can be cached
  – We need instant purging
  – We need real-time logs and stats
• Real-time integration of our CDN with our app is cool!
  – Extensive/granular API to control the CDN
  – Control and visibility at the edge lets us be really creative
• Never use “Long Time, totally!” in a Cache-Control header!
  
	
  
Thank you!
hooman@fastly.com
  
