Scheduling in Linux and Web Servers


Plan for Today
Scheduling in Linux (2002-today)
Scheduling Web Services
Submitting PS3:
- Schedule demo (sign up soon!)
- Web submission form (11:59pm tomorrow)
- Benchmark submission
- Post-demo assessment (teammate evaluation)
leaderboard.html

Linux Scheduler before V2.6 (2002)
Three types of processes:
#define SCHED_OTHER 0   Normal user processes
#define SCHED_FIFO  1   Non-preemptable
#define SCHED_RR    2   Real-time round-robin
Not (fully) pre-emptive:
only user-level processes could be pre-empted
Select next process according to “goodness” function

/* linux/kernel/sched.c
* This is the function that decides how desirable a process is.
* You can weigh different processes against each other depending
* on what CPU they've run on lately etc to try to handle cache
* and TLB miss penalties.
*
* Return values:
* -1000: never select this
* 0: out of time, recalculate counters (but it might still be selected)
* +ve: "goodness" value (the larger, the better)
* +1000: realtime process, select this.
*/
static inline int goodness(struct task_struct * p, int this_cpu, struct mm_struct *this_mm)
{
    int weight;

    /* Realtime process, select the first one on the runqueue (taking priorities into account). */
    if (p->policy != SCHED_OTHER) {
        weight = 1000 + p->rt_priority;
        goto out;
    }

    /* Give the process a first-approximation goodness value according to the number of
       clock-ticks it has left. Don't do any other calculations if the time slice is over.. */
    weight = p->counter;
    if (!weight)
        goto out;

#ifdef __SMP__
    /* Give a largish advantage to the same processor... (equivalent to penalizing other processors) */
    if (p->processor == this_cpu)
        weight += PROC_CHANGE_PENALTY;
#endif

    /* .. and a slight advantage to the current MM (memory segment) */
    if (p->mm == this_mm)
        weight += 1;
    weight += p->priority;

out:
    return weight;
}

This is the whole goodness function from V2.5 scheduler (only edited formatting to fit on slide).


What is the running time of the Linux 2.2-2.5 Scheduler?
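
One way to reason about this question: the pre-2.6 scheduler evaluated a goodness value for every process on the run queue and kept the best one, so each scheduling decision takes time linear in the number of runnable processes, O(N). The sketch below shows that selection loop. The struct task, the simplified goodness() (without the SMP and MM bonuses), and pick_next() are hypothetical stand-ins for illustration, not the kernel's code.

```c
#include <assert.h>
#include <stddef.h>

#define SCHED_OTHER 0

/* Hypothetical, simplified stand-in for the kernel's task_struct. */
struct task {
    int policy;       /* SCHED_OTHER, or a realtime policy */
    int rt_priority;  /* used only for realtime policies */
    int counter;      /* clock ticks left in the time slice */
    int priority;     /* static priority */
};

/* Simplified goodness(): same shape as the slide's function,
   but without the SMP and MM bonuses. */
static int goodness(const struct task *p)
{
    if (p->policy != SCHED_OTHER)
        return 1000 + p->rt_priority;
    if (p->counter == 0)
        return 0;
    return p->counter + p->priority;
}

/* Linear scan over the run queue: one goodness() call per runnable
   process, so the cost of a decision grows as O(N).
   Returns the index of the best process, or -1 if the queue is empty. */
static int pick_next(const struct task *run_queue, size_t n)
{
    assert(run_queue != NULL || n == 0);

    int best = -1;
    int best_weight = -1000;   /* -1000 means "never select this" */

    for (size_t i = 0; i < n; i++) {
        int w = goodness(&run_queue[i]);
        if (w > best_weight) {
            best_weight = w;
            best = (int)i;
        }
    }
    return best;
}
```

With a realtime process on the queue, pick_next() always returns it, because its weight starts at 1000; among normal processes, the one with the most remaining ticks plus priority wins.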


Linux 2.6 Scheduler (2003-2007)
140 different queues (for each processor)
0-99 for “real time” processes
100-139 for “normal” processes

Bit vector keeps track of which queues have a ready-to-run process
Scheduler picks first process from highest-priority queue with a ready process
Each process is given a time quantum that scales with priority

Linux 2.6 Scheduler (2003-2007)

struct runqueue {
    struct prioarray *active;
    struct prioarray *expired;
    struct prioarray arrays[2];
};

struct prioarray {
    int nr_active; /* # Runnable */
    unsigned long bitmap[5];
    struct list_head queue[140];
};

140 different queues (for each processor)
0-99 for “real time” processes
100-139 for “normal” processes
Bit vector of ready-to-run queues
Scheduler picks first process from highest-priority queue with a ready process
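
The bit vector is what makes the choice constant-time: finding the first set bit among 140 bits takes a bounded number of steps no matter how many processes are runnable. The sketch below illustrates the idea with a portable bit scan. The names prio_bitmap, mark_ready, mark_empty, and first_ready are hypothetical, and it assumes lower numbers mean higher priority, as in the 0-99 / 100-139 split above. (With 32-bit words, 140 bits need five words, which matches bitmap[5] on the slide.)

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define MAX_PRIO 140
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define BITMAP_WORDS ((MAX_PRIO + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* Hypothetical bitmap: bit i is set when priority queue i is non-empty. */
struct prio_bitmap {
    unsigned long bits[BITMAP_WORDS];
};

static void mark_ready(struct prio_bitmap *b, int prio)
{
    assert(prio >= 0 && prio < MAX_PRIO);
    b->bits[(size_t)prio / BITS_PER_WORD] |= 1UL << ((size_t)prio % BITS_PER_WORD);
}

static void mark_empty(struct prio_bitmap *b, int prio)
{
    assert(prio >= 0 && prio < MAX_PRIO);
    b->bits[(size_t)prio / BITS_PER_WORD] &= ~(1UL << ((size_t)prio % BITS_PER_WORD));
}

/* Returns the lowest-numbered (highest-priority) non-empty queue, or -1.
   The loops are bounded by MAX_PRIO, not by the number of processes,
   so the cost is O(1) with respect to system load. */
static int first_ready(const struct prio_bitmap *b)
{
    for (size_t w = 0; w < BITMAP_WORDS; w++) {
        if (b->bits[w] == 0)
            continue;
        for (size_t bit = 0; bit < BITS_PER_WORD; bit++) {
            if (b->bits[w] & (1UL << bit))
                return (int)(w * BITS_PER_WORD + bit);
        }
    }
    return -1;
}
```

A real kernel would use a hardware find-first-set instruction instead of the inner loop; the portable loop is used here only to keep the sketch self-contained.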

Linux 2.6 Scheduler?


(Sadly, O(1) scheduler has no Facebook page.)

Rotating Staircase Deadline Scheduler

This is exactly stride scheduling (but with different terminology)!
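
For reference, a minimal sketch of stride scheduling itself: each client holds tickets, its stride is inversely proportional to its tickets, and the scheduler always runs the client with the smallest pass value, then advances that pass by the stride. The names STRIDE1, struct client, client_init, and stride_pick are illustrative assumptions, and this is not code from the Rotating Staircase Deadline Scheduler.

```c
#include <assert.h>
#include <stddef.h>

#define STRIDE1 (1UL << 20)   /* large constant divided by ticket count */

struct client {
    int tickets;           /* share of the CPU this client should get */
    unsigned long stride;  /* STRIDE1 / tickets: smaller means runs more often */
    unsigned long pass;    /* virtual time of the client's next turn */
    int runs;              /* how many quanta this client has received */
};

static void client_init(struct client *c, int tickets)
{
    assert(tickets > 0);
    c->tickets = tickets;
    c->stride = STRIDE1 / (unsigned long)tickets;
    c->pass = c->stride;
    c->runs = 0;
}

/* Pick the client with the minimum pass (ties go to the lowest index),
   charge it one quantum, and return its index. */
static size_t stride_pick(struct client *clients, size_t n)
{
    assert(n > 0);
    size_t best = 0;
    for (size_t i = 1; i < n; i++) {
        if (clients[i].pass < clients[best].pass)
            best = i;
    }
    clients[best].pass += clients[best].stride;
    clients[best].runs++;
    return best;
}
```

With tickets of 4, 2, and 1, every seven quanta are split 4:2:1, so the allocation is deterministic rather than merely proportional on average.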

Linux 2.6.23+ Scheduler?

Not called the Θ(log N) scheduler (by Linux 2.6.23, marketing matters): “Completely Fair Scheduler”
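
The Θ(log N) cost comes from keeping runnable tasks ordered by how much CPU time they have received and always running the one that has received the least. CFS keeps that order in a red-black tree keyed by virtual runtime; the sketch below swaps in a binary min-heap, which is simpler to write and has the same logarithmic insert and remove cost. All names here (struct task, struct run_heap, heap_push, heap_pop_min, run_one) are hypothetical, and task weights are ignored.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_TASKS 64

struct task {
    int id;
    unsigned long vruntime;   /* CPU time received so far (unweighted here) */
};

/* Binary min-heap ordered by vruntime (a stand-in for the red-black tree). */
struct run_heap {
    struct task items[MAX_TASKS];
    size_t len;
};

static void swap_tasks(struct task *a, struct task *b)
{
    struct task tmp = *a;
    *a = *b;
    *b = tmp;
}

/* O(log N): append, then sift up while smaller than the parent. */
static void heap_push(struct run_heap *h, struct task t)
{
    assert(h->len < MAX_TASKS);
    size_t i = h->len++;
    h->items[i] = t;
    while (i > 0) {
        size_t parent = (i - 1) / 2;
        if (h->items[parent].vruntime <= h->items[i].vruntime)
            break;
        swap_tasks(&h->items[parent], &h->items[i]);
        i = parent;
    }
}

/* O(log N): remove the root, move the last item up, then sift down. */
static struct task heap_pop_min(struct run_heap *h)
{
    assert(h->len > 0);
    struct task min = h->items[0];
    h->items[0] = h->items[--h->len];
    size_t i = 0;
    for (;;) {
        size_t left = 2 * i + 1, right = left + 1, smallest = i;
        if (left < h->len && h->items[left].vruntime < h->items[smallest].vruntime)
            smallest = left;
        if (right < h->len && h->items[right].vruntime < h->items[smallest].vruntime)
            smallest = right;
        if (smallest == i)
            break;
        swap_tasks(&h->items[i], &h->items[smallest]);
        i = smallest;
    }
    return min;
}

/* Run the task with the least vruntime for `ran` time units, requeue it,
   and return its id. */
static int run_one(struct run_heap *h, unsigned long ran)
{
    struct task t = heap_pop_min(h);
    t.vruntime += ran;
    heap_push(h, t);
    return t.id;
}
```

Each call to run_one() favors whichever task is furthest behind, which is the sense in which the policy is "fair".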

(In practice) What is log2 N?

What resources should scheduler
be maximizing utility of?


Key Resource: Energy!

Image from http://arstechnica.com/apple/2013/10/os-x-10-9/12/


Timer Coalescing

Images from http://arstechnica.com/apple/2013/06/how-os-x-mavericks-works-its-power-saving-magic/
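
The idea behind timer coalescing is that a timer which tolerates a small delay can be shifted to fire together with others, so the CPU wakes up less often and idles longer. The sketch below is one simple way to model it, not a description of how OS X implements it: each deadline is rounded up to the next window boundary when that stays within its tolerance. The functions coalesce_deadline and count_wakeups, and the window parameter, are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Round a deadline up to the next multiple of window_ms, but only if the
   added delay does not exceed the timer's tolerance. */
static unsigned long coalesce_deadline(unsigned long deadline_ms,
                                       unsigned long tolerance_ms,
                                       unsigned long window_ms)
{
    assert(window_ms > 0);
    unsigned long aligned =
        ((deadline_ms + window_ms - 1) / window_ms) * window_ms;
    return (aligned - deadline_ms <= tolerance_ms) ? aligned : deadline_ms;
}

/* Count distinct wakeups in a list of fire times.
   Assumes fire_ms is sorted in non-decreasing order. */
static size_t count_wakeups(const unsigned long *fire_ms, size_t n)
{
    size_t wakeups = 0;
    for (size_t i = 0; i < n; i++) {
        if (i == 0 || fire_ms[i] != fire_ms[i - 1])
            wakeups++;
    }
    return wakeups;
}
```

For example, timers due at 1010, 1040, 1090, and 1100 ms with a 100 ms tolerance and a 100 ms window all fire at 1100 ms, turning four wakeups into one.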


OS Schedulers Recap
Use Resources Well
- Limit unnecessary switching, save energy
- Low cost of scheduler itself
Make good decisions
- Locally: pick the most important process
- Globally: provide good system performance

Web Server Overload!

healthcare.gov

Rate of incoming requests > Rate server can process requests

“When the meetings ended at a CMS outpost
in Herndon, Va., at about 7:00 p.m., the rescue
squad already on the scene realized they had
more work to do. One of the things that
shocked Burt and Park’s team most—“among
many jaw-dropping aspects of what we found,”
as one put it—was that the people running
HealthCare.gov had no “dashboard,” no quick
way for engineers to measure what was going
on at the website, such as how many people
were using it, what the response times were
for various click-throughs and where traffic
was getting tied up. So late into the night of
Oct. 18, Burt and the others spent about five
hours coding and putting up a dashboard.”

Developer Benchmarks
• Find bottlenecks: know what to spend time optimizing
• Measure impact of changes
• Predict what resources you will need to scale service
Goal is a benchmark that represents the actual usage

Strategy 1:
Shrink and Simplify Your Content


5 September 2001

11 September 2001

archive.org captures of New York Times (http://www.nytimes.com)


Strategy 2:
Cache to Save Effort


“Looking over the dashboard that Park, Burt and the
others had rigged up the prior Friday night, Abbott and
the group discovered what they thought was the
lowest-hanging fruit--a quick fix to an obvious mistake
that could improve things immediately. HealthCare.gov
had been constructed so that every time a user had to
get information from the website's vast database, the
website had to make what's called a query into that
database. … The team began almost immediately to
cache the data. The result was encouraging: the site's
overall response time--the time it took a page to load--dropped
on the evening of Oct. 22 from eight seconds
to two. That was still terrible, of course, but it
represented such an improvement that it cheered the
engineers. They could see that HealthCare.gov could be
saved instead of scrapped.”
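
A minimal sketch of the caching idea in the quote: remember the result of a query so that repeated requests skip the database. Everything here is an assumption made for illustration (a small fixed-size table, a placeholder query_database function, string keys); it says nothing about how HealthCare.gov actually cached its data, and it never invalidates entries, which a real cache would have to handle.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define CACHE_SLOTS 64
#define KEY_MAX 64
#define VALUE_MAX 128

struct cache_entry {
    int used;
    char key[KEY_MAX];
    char value[VALUE_MAX];
};

static struct cache_entry cache[CACHE_SLOTS];
static int db_queries;   /* counts how often the slow path runs */

/* Simple string hash (djb2) used to choose a cache slot. */
static unsigned hash_key(const char *s)
{
    unsigned h = 5381;
    while (*s)
        h = h * 33 + (unsigned char)*s++;
    return h;
}

/* Placeholder for the expensive database query. */
static void query_database(const char *key, char *out, size_t out_len)
{
    db_queries++;
    snprintf(out, out_len, "result for %s", key);
}

/* Return the cached value for key, querying the database only on a miss.
   A colliding key simply overwrites the slot (direct-mapped cache). */
static const char *cached_lookup(const char *key)
{
    assert(strlen(key) < KEY_MAX);
    struct cache_entry *e = &cache[hash_key(key) % CACHE_SLOTS];
    if (!e->used || strcmp(e->key, key) != 0) {
        query_database(key, e->value, sizeof e->value);
        strcpy(e->key, key);
        e->used = 1;
    }
    return e->value;
}
```

Calling cached_lookup() twice with the same key runs the database query once; the second call is answered from memory.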

Strategy 3:
Buy (or Rent) More Servers


Amazon’s Elastic Compute Cloud (EC2)

“A series of hardware upgrades had
dramatically increased capacity; the
system was now able to handle at least
50,000 simultaneous users and probably
more. There had been more than 400 bug
fixes. Uptimes had gone from an abysmal
43% at the beginning of November to 95%.
And Kim and her team had knocked the
error rate from 6% down to 0.5%. (By the
end of January it would be below 0.5%
and still dropping.)”

Using More Servers
[Diagram: Dispatcher, Server 1, Server 2, Server 3]

Sharing State
[Diagram: Dispatcher, Server 1, Server 2, Server 3, and one Database]

Distributed Database
[Diagram: Dispatcher, Server 1, Server 2, Server 3, and four Database nodes]

Maintaining Consistency
[Diagram: Dispatcher, Server 1, Server 2, Server 3, and four Database nodes]

1. Replication
   Reads are efficient
   Writes are complex and risky
2. Vertical Partitioning
   Split database by columns
3. Horizontal Partitioning (“Sharding”)
   Split database by rows
4. Give up on consistency and functionality
   “NoSQL” (e.g., Cassandra, MongoDB, BigTable)
[Diagram: Dispatcher, Server 1, Server 2, Server 3, and four Database nodes]
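
To make horizontal partitioning concrete, here is a minimal sketch of one common way to split rows across databases: hash the row's key and take the result modulo the number of shards. The shard count, the choice of the FNV-1a hash, and the function names are assumptions for illustration; real systems often use schemes that make adding shards cheaper.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_SHARDS 4u   /* hypothetical number of database nodes */

/* 32-bit FNV-1a hash of a NUL-terminated string. */
static uint32_t fnv1a(const char *s)
{
    uint32_t h = 2166136261u;          /* FNV offset basis */
    while (*s) {
        h ^= (unsigned char)*s++;
        h *= 16777619u;                /* FNV prime */
    }
    return h;
}

/* Every row with the same key always lands on the same shard. */
static unsigned shard_for_key(const char *row_key)
{
    return fnv1a(row_key) % NUM_SHARDS;
}
```

A dispatcher would call shard_for_key() on, say, a user ID to decide which database holds that user's rows, so reads and writes for one user touch only one node.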

Scalable Enough?
[Diagram: Dispatcher, Server 1, Server 2, Server 3, and four Database nodes]

Distributed Denial-of-Service
[Diagram: Botnet (x 2000 machines), Dispatcher, Server 1, Server 2, Server 3, and four Database nodes]

Strategy 4:
Smarter Scheduling


What should the server’s goal be?


What is the bottleneck resource?
[Diagram: Zhtta, Cache, Disk (files)]

Connecting to the Network
[Diagram: ISP, Router, zhtta, Cache, Disk (files)]

Cisco Nexus 7000 (~$100K)
48 Gb/s per slot x 10

10 Gb/s x 4 per switch

Your server
250 Mbits/s
$20/month

Shortest Remaining Processing Time-first
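
A minimal sketch of the policy: among the pending requests, always serve the one with the least work left. The sketch assumes that the bytes still to be sent are a reasonable proxy for remaining processing time, which is a modeling choice, not something the slides state; struct request, srpt_pick, and serve_chunk are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

struct request {
    int id;
    size_t bytes_remaining;   /* proxy for remaining processing time */
};

/* Return the index of the unfinished request with the fewest bytes left,
   or -1 if every request is complete. */
static int srpt_pick(const struct request *reqs, size_t n)
{
    int best = -1;
    for (size_t i = 0; i < n; i++) {
        if (reqs[i].bytes_remaining == 0)
            continue;
        if (best < 0 || reqs[i].bytes_remaining < reqs[best].bytes_remaining)
            best = (int)i;
    }
    return best;
}

/* Send up to chunk_bytes of a response, then return to the scheduler so a
   newly arrived shorter request can take over. */
static void serve_chunk(struct request *r, size_t chunk_bytes)
{
    assert(chunk_bytes > 0);
    r->bytes_remaining = (r->bytes_remaining > chunk_bytes)
                             ? r->bytes_remaining - chunk_bytes
                             : 0;
}
```

Serving in chunks is what makes this "remaining" time rather than plain shortest-job-first: a large response that is nearly finished can outrank a medium one that has just started.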

How close to this can you get for PS3?


Charge
Measurement (“dashboard”) is essential for improving performance
Important to measure the right things!

Scheduling policies:
Avoid wasting resources
Make trade-offs that align with system goals

PS3 Due tomorrow (Wednesday) at 11:59pm
If you haven’t already scheduled your demo, do so now!

More from David Evans
