Distributed Logging Architecture in the Container Era

LinuxCon Japan 2016, June 13, 2016
Satoshi "Moris" Tagomori (@tagomoris)
Fluentd, MessagePack-Ruby, Norikra, ...
Treasure Data, Inc.

http://www.linuxfoundation.org/news-media/announcements/2016/06/chaosuan-crunchy-data-qbox-storageos-and-treasure-data-join-cloud

Topics
• Microservices and logging in various industries
• Difficulties of logging with containers
• Distributed logging architecture
• Patterns of distributed logging architecture
• Case Study: Docker and Fluentd

Logging in Various Industries
• Web access logs
• Views/visitors on media
• Views/clicks on Ads
• Commercial transactions (EC, Game, ...)
• Data from devices
• Operation logs on Apps of phones
• Various sensor data

Microservices and Logging
• Monolithic service
• a single service produces all data about a user's behavior
• Microservices
• many services produce data about a user's access
• logs must be collected from many services to know what is happening
[Diagram: users, service (application), and logs, for a monolith and for microservices]

Containers: "a must" for microservices
• Dividing a service into services
• each service requires fewer computing resources (VM -> containers)
• Making services independent from each other
• but it is very difficult :(
• some dependencies must be resolved even in the development environment (containers on the desktop)

Redesign Logging: Why?
• No permanent storage
• No fixed physical/network address
• No fixed mapping between servers and roles
• We should parse/label logs at the source and ship them by pushing to the destination ASAP

Containers: immutable & disposable
• No permanent storage
• Where to write logs?
• files in the container → gone w/ the container instance 😞
• directories shared from hosts → hosts are shared by many containers/services ☹
• TODO: ship logs from the container to anywhere ASAP

Containers: unfixed addresses
• No fixed physical / network address
• Where should we go to fetch logs?
• Service discovery (e.g., consul) → one more component 😞
• rsync? ssh+tail? or ..? Is it installed in containers? → one more tool to depend on ☹
• TODO: push logs to anywhere from containers
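The push model in the TODO above can be sketched in Python. This is only an illustration under stated assumptions, not the protocol of any particular agent: the newline-delimited JSON framing and the `ship` / `receive_all` helpers are hypothetical, and a `socket.socketpair()` stands in for a real TCP connection from a container to a log endpoint.

```python
import json
import socket


def ship(records, sock):
    """Push records to the peer as newline-delimited JSON (assumed framing)."""
    for record in records:
        line = json.dumps(record, sort_keys=True) + "\n"
        sock.sendall(line.encode("utf-8"))


def receive_all(sock):
    """Read until the sender closes, then decode one JSON object per line."""
    chunks = []
    while True:
        chunk = sock.recv(4096)
        if not chunk:
            break
        chunks.append(chunk)
    text = b"".join(chunks).decode("utf-8")
    return [json.loads(line) for line in text.splitlines() if line]


# A socket pair stands in for "container -> log endpoint".
container_side, endpoint_side = socket.socketpair()
ship([{"message": "started"}, {"message": "ready"}], container_side)
container_side.close()  # the container may disappear right after pushing
received = receive_all(endpoint_side)
endpoint_side.close()
```

Because the container initiates the connection, nothing on the receiving side needs to discover the container's address or log in to it.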

Containers: instances per role
• No fixed mapping between servers and roles
• How can we parse / store these logs?
• Central repository of log syntax → very hard to maintain 😞
• Label logs by source address → many containers/roles on a host ☹
• TODO: label & parse logs at the source of the logs
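A minimal Python sketch of "label & parse at the source" might look like the following. The access-log pattern, the `parse_and_label` helper, and the `service` / `container_id` label names are all assumptions made for illustration; real log formats and label schemes vary.

```python
import re

# Hypothetical pattern for a simplified access-log line; real formats vary.
ACCESS_LOG = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) (?P<size>\d+)'
)


def parse_and_label(line, service, container_id):
    """Turn one raw line into a labeled key-value record, or None if unparsable."""
    match = ACCESS_LOG.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["status"] = int(record["status"])
    record["size"] = int(record["size"])
    # Labels are attached where the log is produced, so no central
    # registry has to guess the role from a source address.
    record["service"] = service
    record["container_id"] = container_id
    return record


raw = '192.0.2.10 - - [13/Jun/2016:10:00:00 +0900] "GET /items HTTP/1.1" 200 512'
record = parse_and_label(raw, service="catalog", container_id="c0ffee")
```

The source knows its own log syntax and role, so the record leaves the container already structured and labeled.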

Distributed Logging Architecture

Core Architecture
• Collector nodes (Docker containers + agent)
• Aggregator nodes
• Destinations (Storage, Database, ...)

Collecting and Storing Data
• Parse/Label (collector)
• Raw logs are not good for processing
• Convert logs to structured data (key-value pairs)
• Split/Sort (aggregator)
• Mixed logs are not good for searching
• Split the whole data stream into per-service streams
• Store (destination)
• Format logs (records) as the destination expects
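The split and format stages of this slide can be sketched as two small Python functions. The `service` key, the function names, and the choice of JSON Lines as the destination format are assumptions for illustration only.

```python
import json
from collections import defaultdict


def split_by_service(records):
    """Aggregator step: split one mixed stream into per-service streams."""
    streams = defaultdict(list)
    for record in records:
        streams[record.get("service", "unknown")].append(record)
    return dict(streams)


def format_for_destination(records):
    """Destination step: emit JSON Lines (one assumed destination format)."""
    return "".join(json.dumps(r, sort_keys=True) + "\n" for r in records)


mixed = [
    {"service": "web", "path": "/"},
    {"service": "ads", "click": 1},
    {"service": "web", "path": "/cart"},
]
streams = split_by_service(mixed)
web_payload = format_for_destination(streams["web"])
```

Splitting depends on records that already carry a label, which is why parsing and labeling happen earlier, on the collector.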

Scaling Logging
• Network traffic
• CPU load to parse / format
• Parse logs on each collector (distributed)
• Format logs on aggregator (to be distributed)
• Capability
• Make aggregators redundant
• Controlling delay
• so that we know how soon we can see what is happening in our systems

Aggregation Patterns
[Figure: 2×2 matrix of aggregation patterns. Axes: source aggregation (NO / YES) × destination aggregation (NO / YES)]

Source Side Aggregation Patterns
[Diagram: w/o source aggregation vs. w/ source aggregation. Labels: collector, aggregator / destination, aggregate container]

Without Source Aggregation
• Pros:
• Simple configuration
• Cons:
• fixed aggregator (endpoint) address
• many network connections
• high load on the aggregator
[Diagram: collector, aggregator]

With Source Aggregation
• Pros:
• fewer connections
• lower load on the aggregator
• less configuration in containers (by specifying localhost)
• highly flexible configuration (by deploying only the aggregate containers)
• Cons:
• slightly more resources (+1 container per host)
[Diagram: aggregate container, aggregator]
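The "connections" point can be made concrete with a little arithmetic. The sketch below assumes, purely for illustration, that every sender holds exactly one connection to the aggregator tier; the host and container counts are made-up numbers.

```python
def aggregator_connections(hosts, containers_per_host, source_aggregation):
    """Count inbound connections the aggregator tier must accept."""
    if source_aggregation:
        # One aggregate container per host forwards on behalf of its neighbors.
        return hosts
    # Every container connects to the aggregator on its own.
    return hosts * containers_per_host


without = aggregator_connections(hosts=50, containers_per_host=20, source_aggregation=False)
with_agg = aggregator_connections(hosts=50, containers_per_host=20, source_aggregation=True)
```

Under these assumptions the aggregator accepts 50 connections instead of 1000, at the cost of one extra container on each host.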

Destination Side Aggregation Patterns
[Diagram: w/o destination aggregation vs. w/ destination aggregation. Labels: collector, aggregator, destination]

Without Destination Aggregation
• Pros:
• Fewer nodes
• Simpler configuration
• Cons:
• Storage-side changes affect the collector side
• Worse performance: many small write requests on storage

With Destination Aggregation
• Pros:
• Collector-side configuration is free from storage-side changes
• Better performance with fine tuning on the destination-side aggregator
• Cons:
• More nodes
• Slightly more complex configuration
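One way a destination-side aggregator can improve write performance is by turning many small writes into fewer large ones. The following Python sketch shows that idea only; `BatchingWriter`, its `batch_size` parameter, and the list used as a storage stand-in are hypothetical and not taken from any specific tool.

```python
class BatchingWriter:
    """Buffer records and hand them to storage in larger chunks."""

    def __init__(self, write_batch, batch_size):
        self.write_batch = write_batch  # callable taking a list of records
        self.batch_size = batch_size
        self.buffer = []

    def add(self, record):
        self.buffer.append(record)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        if self.buffer:
            self.write_batch(self.buffer)
            self.buffer = []


storage_writes = []  # stand-in for a storage backend
writer = BatchingWriter(storage_writes.append, batch_size=100)
for i in range(250):
    writer.add({"seq": i})
writer.flush()  # push the final partial batch
```

Here 250 records reach storage as three write requests, and the batch size can be tuned on the aggregator without touching any collector.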

Scaling Patterns
• Scaling Up Endpoints
• HTTP/TCP load balancer
• Huge queue + workers
• Scaling Out Endpoints
• Round-robin clients
[Diagram: collector nodes, load balancer, backend nodes, aggregator nodes]

Scaling Up Endpoints
• Pros:
• Simple configuration in collector nodes
• Cons:
• Limits on scaling up
[Diagram: load balancer, backend nodes]
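The "huge queue + workers" form of scaling up can be sketched with Python's standard library. This is a single-process stand-in for what would normally be a queue service and separate worker nodes; `run_workers`, `store`, and the worker count are assumptions for illustration.

```python
import queue
import threading


def run_workers(records, worker_count, handle):
    """Drain one shared queue with several worker threads."""
    pending = queue.Queue()
    for record in records:
        pending.put(record)

    def worker():
        while True:
            try:
                record = pending.get_nowait()
            except queue.Empty:
                return  # queue drained, worker exits
            handle(record)
            pending.task_done()

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


processed = []
lock = threading.Lock()


def store(record):
    """Stand-in for writing a record to the backend."""
    with lock:
        processed.append(record)


run_workers(range(1000), worker_count=4, handle=store)
```

Collectors only ever see the single queue endpoint; capacity grows by adding workers behind it, up to whatever the queue itself can absorb.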

Scaling Out Endpoints
• Pros:
• Unlimited scaling by adding aggregator nodes
• Cons:
• Complex configuration
• Clients need round-robin features
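The client-side round-robin that scaling out relies on can be sketched as below. `RoundRobinClient`, the endpoint names, and the `fake_send` transport are hypothetical; a real client would open network connections and would likely add retries with backoff.

```python
import itertools


class RoundRobinClient:
    """Rotate over aggregator endpoints, skipping ones that fail."""

    def __init__(self, endpoints, send):
        self.endpoints = list(endpoints)
        self.send = send  # callable(endpoint, record); raises OSError on failure
        self._cycle = itertools.cycle(self.endpoints)

    def emit(self, record):
        # Try each endpoint at most once per record.
        for _ in range(len(self.endpoints)):
            endpoint = next(self._cycle)
            try:
                self.send(endpoint, record)
                return endpoint
            except OSError:
                continue
        raise OSError("no aggregator endpoint accepted the record")


delivered = []


def fake_send(endpoint, record):
    """Stand-in transport: 'agg-2' is down, the others accept records."""
    if endpoint == "agg-2":
        raise OSError("connection refused")
    delivered.append((endpoint, record))


client = RoundRobinClient(["agg-1", "agg-2", "agg-3"], fake_send)
used = [client.emit({"seq": i}) for i in range(4)]
```

The client has to be configured with every endpoint, which is the "complex configuration" cost listed above.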

                      | Without Destination Aggregation | With Destination Aggregation
Scaling Up Endpoints  | Systems in early stages         | Collecting logs over Internet, or using queues
Scaling Out Endpoints | Impossible :(                   | Collecting logs in datacenter
                      | Collector nodes must know all   |
                      | endpoints → uncontrollable      |

Case Study: Docker+Fluentd
• Destination aggregation + scaling up
• Fluent logger + Fluentd
• Source aggregation + scaling up
• Docker json logger + Fluentd + Elasticsearch
• Docker fluentd logger + Fluentd + Kafka
• Source/Destination aggregation + scaling out
• Docker fluentd logger + Fluentd

Why Fluentd?
• Docker Fluentd logging driver
• Docker containers can send logs to Fluentd directly - less overhead
• Pluggable architecture
• Various destination systems
• Small memory footprint
• Source aggregation requires +1 container per host
• Less additional resource usage ( < 100MB )

Destination aggregation + scaling up
• Sending logs directly over TCP with the Fluentd logger library in application code
• Same pattern as New Relic
• Easy to implement - good for startups

Source aggregation + scaling up
• Kubernetes: Json logger + Fluentd + Elasticsearch
• Applications write logs to STDOUT
• Docker writes logs as JSON in files
• Fluentd reads logs from files, parses the JSON objects, and writes logs to Elasticsearch
• EFK stack (like ELK stack)
http://kubernetes.io/docs/getting-started-guides/logging-elasticsearch/
[Diagram: application code, files (JSON), Elasticsearch]
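The "read file, parse JSON objects" step can be sketched in Python, assuming each line is an object with the `log`, `stream`, and `time` keys that Docker's JSON log files use. The sample lines and the output field names (`message` and so on) are made up for illustration, and Fluentd itself is not involved in this sketch.

```python
import json


def parse_docker_json_line(line):
    """Decode one JSON log line into a flat record."""
    entry = json.loads(line)
    return {
        "message": entry["log"].rstrip("\n"),
        "stream": entry["stream"],
        "time": entry["time"],
    }


# Sample content of a container log file (values are made up).
sample_file = (
    '{"log":"GET /items 200\\n","stream":"stdout","time":"2016-06-13T01:00:00.000000000Z"}\n'
    '{"log":"upstream timeout\\n","stream":"stderr","time":"2016-06-13T01:00:01.000000000Z"}\n'
)
records = [parse_docker_json_line(line) for line in sample_file.splitlines()]
errors = [r for r in records if r["stream"] == "stderr"]
```

Applications only write to STDOUT/STDERR; the per-host agent does the reading and parsing, so containers need no logging configuration of their own.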

Source aggregation + scaling up/out
• Docker fluentd logging driver + Fluentd + Kafka
• Docker sends logs to localhost Fluentd
• Fluentd gets logs over TCP and pushes logs into Kafka
• Highly scalable & less overhead - very good for huge deployments
[Diagram: application code, Kafka]

Source/Destination aggregation + scaling out
• Docker fluentd logging driver + Fluentd
• Docker sends logs to localhost Fluentd
• Fluentd gets logs over TCP and sends logs to aggregator Fluentd w/ round-robin load balancing
• Highly flexible - good for complex data processing requirements
[Diagram: application code, any other storages]

What's the Best?
• Writing logs from containers: several ways to do it
• Docker logging driver
• Write logs to files + read/parse them
• Send logs from apps directly
• Make the platform scalable!
• Source aggregation: Fluentd on localhost
• Scalable storage: (Kafka, external services, ...)
• No destination aggregation + Scaling up
• Non-scalable storage: (Filesystems, RDBMSs, ...)
• Destination aggregation + Scaling out

Why Is OSS Important for Logging?

Why OSS?
• The logging layer is an interface
• transparency
• interoperability
• Keep the platform scalable
• number of nodes
• number of types of source/destination

Use OSS, Make Logging Scalable
Thank you!
