Commit 04b7b03 (1 parent: 23c1693)

more notes to cd talk


content/posts/171101-continuous-delivery-devops-you.markdown

Lines changed: 22 additions & 6 deletions
@@ -223,32 +223,48 @@ There was a major production issue with the recurring charges in August 2013.
 Our engineers were alerted to the errors and the issue blew up on the top of
 [Hacker News](https://news.ycombinator.com/), drawing widespread attention.
 
-So now there is a major production error... what do we do?
+So now there is a major production error... what do we do?
+
+(Reader note: this section is primarily audience discussion based on their
+own experiences handling these difficult technical situations.)
 
 
 <img src="/img/171101-devops-cd-you/devops-cd-you.023.jpg" width="100%" class="technical-diagram img-rounded" style="border: 1px solid #aaa" alt="Billing incident update blog post.">
 
-...
+One step is to figure out when the problem started and whether it is over.
+If it's not over, triage the specific problems and start communicating with
+customers. Be as accurate and transparent as possible.
 
 
 <img src="/img/171101-devops-cd-you/devops-cd-you.024.jpg" width="100%" class="technical-diagram img-rounded" style="border: 1px solid #aaa" alt="Redis logo.">
 
-...
+The specific technical issue in this case was a misconfiguration of our
+Redis instances.
 
 
 <img src="/img/171101-devops-cd-you/devops-cd-you.025.jpg" width="100%" class="technical-diagram img-rounded" style="border: 1px solid #aaa" alt="Text that reads 'Root cause?'">
 
-...
+We know the particular technical failure was our Redis mishandling, but how
+do we look past that specific detail to a broader understanding of the
+processes that caused the issue?
 
 
 <img src="/img/171101-devops-cd-you/devops-cd-you.026.jpg" width="100%" class="technical-diagram img-rounded" style="border: 1px solid #aaa" alt="Billing incident response from Twilio developer evangelist.">
 
-...
+Let's take a look at the resolution of the situation and then learn about
+the concepts and tools that could prevent future problems.
+
+In this case, we communicated with our customers as much about the problem
+as possible. As a developer-focused company, we were fortunate that by being
+transparent about the specific technical issue, many of our customers gained
+respect for us because they had faced similar misconfigurations in their own
+environments.
 
 
 <img src="/img/171101-devops-cd-you/devops-cd-you.027.jpg" width="100%" class="technical-diagram img-rounded" style="border: 1px solid #aaa" alt="Twilio status page.">
 
-...
+Twilio became more transparent with the status of its services, especially
+by showing partial failures and outages.
 
 
 <img src="/img/171101-devops-cd-you/devops-cd-you.028.jpg" width="100%" class="technical-diagram img-rounded" style="border: 1px solid #aaa" alt="Twilio number of production deployments.">
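The notes attribute the outage to misconfigured Redis instances but do not say which settings were wrong. As a hedged illustration of the kind of process-level safeguard the talk is pointing toward, here is a small, hypothetical linter for the persistence directives in a `redis.conf` file. The directive names (`appendonly`, `save`, `stop-writes-on-bgsave-error`) are real Redis settings; any connection between these particular settings and the actual incident is an assumption for illustration only.

```python
# Hypothetical sketch: flag redis.conf persistence settings whose
# misconfiguration can silently risk data loss after a restart.
# (Illustrative only -- not the settings from the actual incident.)

def audit_redis_conf(conf_text):
    """Return a list of warning strings for risky persistence settings."""
    settings = {}
    for line in conf_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        key, _, value = line.partition(" ")
        # Directives can repeat; the last occurrence wins, so keep them all.
        settings.setdefault(key.lower(), []).append(value.strip())

    warnings = []
    if settings.get("appendonly", ["no"])[-1] == "no":
        warnings.append("AOF disabled: writes since the last snapshot "
                        "are lost on restart")
    if "save" not in settings:
        warnings.append("no 'save' directives: RDB snapshots are never taken")
    if settings.get("stop-writes-on-bgsave-error", ["yes"])[-1] == "no":
        warnings.append("writes continue even when background saves fail")
    return warnings

# A deliberately risky example config produces three warnings:
risky = "appendonly no\nstop-writes-on-bgsave-error no\n"
for w in audit_redis_conf(risky):
    print("WARN:", w)
```

A check like this could run in CI against the deployed configuration, which connects the incident story back to the talk's continuous-delivery theme: catch the misconfiguration in the pipeline rather than in production.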
