Monitoring Docker Containers & Dockerized Applications
Anantha Padmanabhan CB (@cbananth)
Rahul Krishna Upadhyaya (@rakrup_)
Satya Sanjivani Routray (@er_sanj007)
Meenakshi Sundaram Lakshmanan (@lxmeenakshi1)
Cloud and Network Solutions
Cisco Systems Inc.
Agenda
• Introduction
• Monitoring Containers - Challenges
• Approach
• Design
• Demo
• Q&A
Containers – Introduction
• Containers virtualize the OS, just as hypervisors virtualize the hardware.
• Containers enable any payload to be encapsulated as a lightweight, portable,
self-sufficient unit that can be manipulated using standard operations and run
consistently on any hardware platform.
• A container wraps a piece of software in a complete filesystem that contains
everything it needs to run: code, runtime, system tools, libraries, etc. Containers
share the OS kernel, and bins/libs where needed; otherwise each operates in a
self-contained environment.
Containers – Introduction
• Docker and LXC are among the most popular container
implementations today.
• Can run on any Linux server: VMs, physical hosts,
OpenStack, etc.
• Can move between machines without any
modification.
• Containers can work together.
Monitoring Containers - Challenges
• Traditionally, monitoring brings to mind monitoring the infrastructure (servers,
networks) and monitoring the apps that run on it.
• In the world of containers, monitoring the infrastructure alone or the application
alone may not provide the full picture.
• Complete Monitoring = (App + software-defined components/devices + Infra)
• Challenges with the monitoring tools:
– A vast set of monitoring tools collects various statistics
– Each tool reports a different set of attributes in a different format
– Data collection tools may overload the container itself, making the
statistics inaccurate
– Differentiating metrics for containers that are related and share resources
– Above all, a lot of computation is required to draw meaningful
inferences from all the collected data
Monitoring Containers - Challenges
• Categorizing container utilization and statistics for multitenant applications is
complex
• Different applications produce logs in different formats
• Identifying failure points of applications
• Analyzing the interconnectivity between applications in different containers, hosts
or regions
• Assessing the response time of a web-based cloud application is complicated,
since many other parameters (region, internet speed) can influence it
• Clustered applications might require monitoring all the instances to identify the
faulty node
Monitoring Containers - Approach
• Apps run inside containers, which in turn run inside a VM or
physical host
• Containerization requires monitoring at each of these levels in order to collect
complete statistics
• Containers can be linked, so the ability to monitor and make sense of statistics
from linked containers becomes critical
• Ability to intelligently correlate collected data in the context of the
App → Container → Host relation
• Abstraction of monitoring methods and data to enable integration with
any monitoring tool of choice
• Ability to do proactive, reactive and adaptive monitoring
Monitoring at different levels
• Host
• Container
• Application
• Cluster
What to Monitor?
• The following are the major sets of parameters that can be monitored
– CPU
• total_usage
• per_cpu_usage
• system_usage
• host_usage
• load_average, etc.
– Memory
• mem_pgfault
• mem_usage
• mem_cache
• mem_kernel, etc.
What to Monitor?
– Disk
• total_bytes
• bytes_read
• bytes_written
• bytes_async
• bytes_sync, etc.
– Network
• rxbytes
• rxpackets
• rxdropped
• rxerrors
• txbytes
• txerrors, etc.
• Intelligently correlate the data collected at the different levels
mentioned earlier.
• Enable queries and filters to draw meaningful inferences from the raw
data
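As a rough illustration of correlating readings across levels, the sketch below computes each container's share of one metric on a given host. The Sample record, its field names and the sample values are assumptions made for this example; only the metric names come from the lists above.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One metric reading; the field names are illustrative, not from the talk."""
    host: str
    container: str
    metric: str      # e.g. "mem_usage" or "rxbytes", as listed above
    value: float


def container_share(samples, host, metric):
    """Return each container's fraction of the per-host total for one metric."""
    totals = {}
    for s in samples:
        # Filter: keep only readings for the requested host and metric.
        if s.host == host and s.metric == metric:
            totals[s.container] = totals.get(s.container, 0.0) + s.value
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {name: value / grand_total for name, value in totals.items()}


# Made-up readings from two hosts.
samples = [
    Sample("host-1", "web", "mem_usage", 300.0),
    Sample("host-1", "db", "mem_usage", 700.0),
    Sample("host-2", "web", "mem_usage", 512.0),
]
shares = container_share(samples, "host-1", "mem_usage")
```

The same filter-then-aggregate pattern extends to the application and cluster levels by adding a field for each level to the record.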
How to Monitor?
Monitoring Strategy
• Proactive:
– Prevent failure situations
• Reactive:
– Raise events and alerts when failures occur
• Adaptive:
– Automatically monitor new components and model statistics
What to use when? How?
Different levels need different types of monitoring strategy
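The three strategies can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the function names, the threshold, the simple linear trend used for the proactive check, and the sample readings. The slides do not describe how each strategy is implemented.

```python
def reactive_alerts(readings, limit):
    """Reactive: report containers whose latest reading already exceeds the limit."""
    return sorted(name for name, values in readings.items() if values[-1] > limit)


def proactive_warnings(readings, limit, lookahead=3):
    """Proactive: warn when the recent trend would cross the limit soon.

    The trend is the average change per step over the series, a deliberately
    simple stand-in for a real forecasting model.
    """
    warnings = []
    for name, values in readings.items():
        if len(values) < 2 or values[-1] > limit:
            continue  # too little data, or already handled by the reactive path
        slope = (values[-1] - values[0]) / (len(values) - 1)
        if values[-1] + slope * lookahead > limit:
            warnings.append(name)
    return sorted(warnings)


def adaptive_targets(monitored, running):
    """Adaptive: find containers that are running but not yet monitored."""
    return sorted(set(running) - set(monitored))


# Made-up CPU usage percentages per container.
readings = {
    "web": [50.0, 60.0, 70.0, 80.0],    # rising towards the limit
    "db": [95.0, 96.0, 97.0, 98.0],     # already above the limit
    "cache": [20.0, 20.0, 21.0, 20.0],  # flat
}
alerts = reactive_alerts(readings, 90.0)
warnings = proactive_warnings(readings, 90.0)
new_targets = adaptive_targets(["web", "db"], ["web", "db", "cache"])
```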
Design Objectives
• Avoid overloading the Docker daemon
• Different monitoring approaches at different
levels
• Modular, driver-based approach for all possible
components
• Run multiple agent drivers simultaneously
• Additional considerations for linked/clustered
containers
High Level Component Design
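[Diagram: the Monitoring Controller groups the Engine, Data Storage, IQ and REST API
components; the API serves the CLI, UI and REST clients; Agents, one per Host with
several containers (C) on each Host, connect to the Engine through a Queue.]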
Functions
[Diagram: for Host, Container and Apps: collect data/logs, model & process data,
store, analyze.]
Agent
[Diagram: drivers in the Agent collect logs & stats from the Host, Container and
Apps and dump them to a Queue, which carries them to the Engine.]
Agent
• One agent per host
• The agent monitors the host, the containers on that host, and the applications
in those containers
• The agent sends to and receives from the engine asynchronously, using queues
• Driver-based log/stats collection can be done for
hosts, applications and containers
• Drivers based on the user's tool of choice for stats/log collection can be used
for any or all of the hosts, applications and containers
• More than one driver can run in parallel to collect more diverse
parameters
• The agent sanity-checks the collected data so that it conforms to the engine's
data model
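A minimal sketch of the driver-based agent described above, under stated assumptions: the Driver interface, the StaticDriver stand-in, the REQUIRED_FIELDS data model and the in-process queue are all hypothetical. A real agent would wrap actual collection tools and publish to a message broker.

```python
import queue
import threading
from abc import ABC, abstractmethod

REQUIRED_FIELDS = ("level", "name", "metric", "value")  # assumed data model


class Driver(ABC):
    """A pluggable collector; concrete drivers would wrap the user's tool of choice."""

    @abstractmethod
    def collect(self):
        """Return a list of raw records (dicts)."""


class StaticDriver(Driver):
    """Stand-in driver that returns canned records instead of calling a real tool."""

    def __init__(self, records):
        self._records = records

    def collect(self):
        return list(self._records)


def conforms(record):
    """Sanity check: keep only records that match the assumed data model."""
    return all(field in record for field in REQUIRED_FIELDS) and isinstance(
        record["value"], (int, float)
    )


class Agent:
    """One agent per host; runs its drivers in parallel and queues clean records."""

    def __init__(self, host, drivers, outbox):
        self.host = host
        self.drivers = drivers
        self.outbox = outbox

    def _run_driver(self, driver):
        for record in driver.collect():
            if conforms(record):
                self.outbox.put({"host": self.host, **record})

    def collect_once(self):
        threads = [
            threading.Thread(target=self._run_driver, args=(d,)) for d in self.drivers
        ]
        for t in threads:
            t.start()
        for t in threads:
            t.join()


outbox = queue.Queue()  # in-process stand-in for the queue to the engine
agent = Agent(
    "host-1",
    [
        StaticDriver([{"level": "container", "name": "web", "metric": "mem_usage", "value": 300}]),
        # This record has no "value" field, so the sanity check drops it.
        StaticDriver([{"level": "host", "name": "host-1", "metric": "load_average"}]),
    ],
    outbox,
)
agent.collect_once()
```

Because the agent only depends on the Driver interface, adding a collection tool means adding a driver class, with no change to the agent or the queue.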
Monitoring controller
• Logical grouping of components
• REST API, reachable from the CLI, UI or any other REST client
• Driver-based storage module that can use any columnar database
• IQ module that provides intelligent predictions
• Engine
– Aggregates stats & logs from different Docker hosts
– Integrates with identity providers (like Keystone) to support multitenant
deployments
– Communicates with agents via asynchronous queues
– Groups & processes data based on use cases
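The engine's aggregation and grouping step might look roughly like the sketch below, which sums each metric per tenant across hosts. The record layout and the tenant_of mapping are assumptions for illustration; in a real deployment the tenant lookup would come from the identity provider rather than a hard-coded dict.

```python
from collections import defaultdict


def aggregate_by_tenant(records, tenant_of):
    """Sum each metric per tenant across all hosts.

    tenant_of maps a container name to its tenant (hypothetical mapping).
    """
    totals = defaultdict(lambda: defaultdict(float))
    for record in records:
        tenant = tenant_of.get(record["name"], "unknown")
        totals[tenant][record["metric"]] += record["value"]
    # Convert nested defaultdicts to plain dicts for callers.
    return {tenant: dict(metrics) for tenant, metrics in totals.items()}


# Made-up records as they might arrive from agents on two hosts.
records = [
    {"host": "host-1", "name": "web-a", "metric": "rxbytes", "value": 100},
    {"host": "host-2", "name": "web-a2", "metric": "rxbytes", "value": 50},
    {"host": "host-2", "name": "web-b", "metric": "rxbytes", "value": 70},
]
tenant_of = {"web-a": "tenant-a", "web-a2": "tenant-a", "web-b": "tenant-b"}
summary = aggregate_by_tenant(records, tenant_of)
```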
IQ Module
• Logs & stats collected and stored make up a lot of unstructured data.
• Meaningful inferences drawn from this data would be of more value to the user.
• Analytic tools like pandas and SciPy are planned to be used to derive inferences.
• Error predictions, usage/load patterns and capacity planning can be direct outputs.
• Suggestions regarding infra would be an output of this module.
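To make the capacity-planning output concrete, here is a small sketch that fits a straight line to usage samples and estimates when a resource fills up. It uses plain Python as a stand-in for the pandas/SciPy tooling the slides mention; the function names, the linear model and the sample data are assumptions, not the module's actual design.

```python
def fit_line(ys):
    """Ordinary least-squares fit of y = slope * x + intercept for x = 0, 1, 2, ..."""
    n = len(ys)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = (n - 1) / 2
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def steps_until_full(ys, capacity):
    """Estimate how many more sampling intervals until usage reaches capacity.

    Returns None when usage is flat or falling.
    """
    slope, intercept = fit_line(ys)
    if slope <= 0:
        return None
    x_full = (capacity - intercept) / slope
    return max(0.0, x_full - (len(ys) - 1))


disk_usage_gb = [10.0, 12.0, 14.0, 16.0, 18.0]  # made-up samples
remaining = steps_until_full(disk_usage_gb, capacity=30.0)
```

A linear fit is the simplest possible model; usage with daily or weekly cycles would need a seasonal model instead.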
Agent driver configuration
Containers monitored
New container spawned
Adaptively monitored
Sample parameters
Sample graphs
Thank You.
