Typical architecture before Fluentd
[Diagram: three app servers, each running an application that writes log files; a log server collects the files.]
• High latency
  > must wait for a day
• Hard to analyze
  > complex text parsers
• Burst of traffic
Architecture after Fluentd
[Diagram: three app servers, each running an application with a local Fluentd; the local Fluentd instances forward to two more Fluentd instances.]
• Realtime!
Architecture after Fluentd
[Diagram: Fluentd instances forward to two more Fluentd instances, which write to Hadoop / Hive, MongoDB, and Amazon S3 / EMR.]
• Realtime!
• Ready to Analyze!
Case study
[Diagram: Ruby on Rails servers, each with a local Fluentd, forward to two Fluentd instances that route events to Hadoop / Hive and MongoDB; labels: user behavior logs, PV logs.]
✓ 127 RoR servers
✓ 70,000 msgs/sec
✓ 120Mbps at peak
✓ 650GB/day
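The case-study figures can be cross-checked with some quick arithmetic. The sketch below assumes decimal units (1 GB = 10^9 bytes, 1 Mbps = 10^6 bits/s) and assumes that the 70,000 msgs/sec figure was measured at the same peak as the 120Mbps figure; the slide states neither.

```python
# Back-of-the-envelope check of the case-study figures.
# Assumption: decimal units (1 GB = 10**9 bytes, 1 Mbps = 10**6 bits/s).
SECONDS_PER_DAY = 24 * 60 * 60

daily_volume_bytes = 650 * 10**9   # 650GB/day
peak_rate_bits = 120 * 10**6       # 120Mbps at peak
msgs_per_sec = 70_000              # 70,000 msgs/sec
servers = 127                      # 127 RoR servers

# Average bandwidth implied by the daily volume.
avg_rate_mbps = daily_volume_bytes * 8 / SECONDS_PER_DAY / 10**6

# How much higher the peak is than the daily average.
peak_to_avg = (peak_rate_bits / 10**6) / avg_rate_mbps

# Assumption: the message rate and the bandwidth peak coincide.
bytes_per_msg = peak_rate_bits / 8 / msgs_per_sec

# Average message rate contributed by each Rails server.
msgs_per_server = msgs_per_sec / servers

print(f"average rate:    {avg_rate_mbps:.1f} Mbps")    # 60.2 Mbps
print(f"peak / average:  {peak_to_avg:.1f}x")          # 2.0x
print(f"bytes per msg:   {bytes_per_msg:.0f}")         # 214
print(f"msgs/sec/server: {msgs_per_server:.0f}")       # 551
```

Under these assumptions the peak is roughly twice the daily average, and a typical message is a little over 200 bytes.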
Configuration example

    # read logs from a file
    <source>
      type tail
      path /var/log/httpd.log
      format apache
      tag apache.access
    </source>

    # save access logs to MongoDB
    <match apache.access>
      type mongo
      host 127.0.0.1
    </match>

    # forward other logs to servers
    # (load-balancing + fail-over)
    <match **>
      type forward
      <server>
        host 192.168.0.11
        weight 20
      </server>
      <server>
        host 192.168.0.12
        weight 60
      </server>
    </match>
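The routing and weighting in this configuration can be modeled in a few lines of Python. This is a simplified sketch, not Fluentd's actual implementation: `tag_matches`, `route`, and `pick_server` are hypothetical names, the matcher only handles `*` (one tag part) and `**` (zero or more tag parts), and fail-over is reduced to skipping hosts that the caller marks as down.

```python
import random


def tag_matches(pattern, tag):
    """Simplified tag matcher: '*' is one tag part, '**' is zero or more."""
    def match(p, t):
        if not p:
            return not t
        if p[0] == "**":
            # '**' may absorb any number of leading tag parts, including none.
            return any(match(p[1:], t[i:]) for i in range(len(t) + 1))
        if not t:
            return False
        return p[0] in ("*", t[0]) and match(p[1:], t[1:])
    return match(pattern.split("."), tag.split("."))


# <match> sections in file order; the first matching pattern wins.
ROUTES = [("apache.access", "mongo"), ("**", "forward")]


def route(tag):
    for pattern, output in ROUTES:
        if tag_matches(pattern, tag):
            return output
    return None


# Weights copied from the <server> sections.
SERVERS = {"192.168.0.11": 20, "192.168.0.12": 60}


def pick_server(servers, down=frozenset(), rng=random):
    """Pick a live server with probability proportional to its weight."""
    alive = {host: w for host, w in servers.items() if host not in down}
    if not alive:
        raise RuntimeError("no forward server available")
    hosts = list(alive)
    return rng.choices(hosts, weights=[alive[h] for h in hosts], k=1)[0]


print(route("apache.access"))  # mongo
print(route("app.login"))      # forward
print(pick_server(SERVERS, down={"192.168.0.12"}))  # 192.168.0.11
```

With weights 20 and 60, this model sends about a quarter of forwarded events to 192.168.0.11 and three quarters to 192.168.0.12. Because the first matching pattern wins, the `**` section has to come after the `apache.access` section, or it would capture the access logs as well.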
Scribe’s Pros & Cons
• Pros.
  > Fast (C++)
• Cons.
  > VERY hard to install
  > Deals with unstructured logs: you must parse logs before analyzing them
  > Hard to extend: you must re-compile C++ programs
  > No longer maintained?
Fluentd vs Scribe
• Easy to install
  > “gem install fluentd”
  > stable RPM and DEB packages: http://packages.treasure-data.com/
• Easy to write plugins
  > you can use Ruby
• Easy to distribute plugins
  > “gem search -rd fluent-plugin”
Flume’s Pros & Cons
• Pros.
  > Central master server manages all nodes
• Cons.
  > Difficult to understand: logical topologies, physical servers, and the configuration of the logical/physical mapping
  > Difficult to configure: replicated master servers, log servers, and agents
  > Big footprint: 50,000 lines of Java code
Fluentd vs Flume
• Easy to understand
  > “syslogd that understands JSON”
• Easy to set up
  > “sudo fluentd --setup && fluentd”
• Very small footprint
  > small engine (3,000 lines) + plugins
• Easy to configure
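To make "syslogd that understands JSON" concrete: a Fluentd event consists of a tag, a time, and a record. The sketch below parses a text log line once into a record and then passes it along as JSON, so downstream consumers never need a text parser. The sample line, the regular expression, the time value, and the helper names `parse_access_line` and `make_event` are all made up for illustration; this is not Fluentd's own parser or wire format.

```python
import json
import re

# Made-up Apache-style access log line (illustrative data only).
RAW_LINE = ('192.168.0.1 - - [10/Oct/2011:13:55:36 +0900] '
            '"GET /index.html HTTP/1.1" 200 2326')

# Hypothetical pattern covering only the fields in the sample line.
ACCESS_RE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<code>\d{3}) (?P<size>\d+)'
)


def parse_access_line(line):
    """Turn one text log line into a structured record (a dict)."""
    m = ACCESS_RE.match(line)
    if m is None:
        raise ValueError("unrecognized log line")
    record = m.groupdict()
    record["code"] = int(record["code"])
    record["size"] = int(record["size"])
    return record


def make_event(tag, time, record):
    """Bundle the three parts of an event: tag, time, record."""
    return {"tag": tag, "time": time, "record": record}


record = parse_access_line(RAW_LINE)
# The time value is an arbitrary placeholder integer.
event = make_event("apache.access", 1318222536, record)

# Once structured, the event survives a JSON round trip unchanged,
# and consumers read fields by name instead of re-parsing text.
decoded = json.loads(json.dumps(event))
print(decoded["record"]["path"])  # /index.html
print(decoded["record"]["code"])  # 200
```

The text parsing happens once, at the edge; everything after that works with named fields.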
Fluentd vs Scribe/Flume
                     Fluentd              Scribe               Flume
Installation         gem/rpm/deb          make                 rpm/deb
Footprint            3,000 lines of Ruby  8,000 lines of C++   50,000 lines of Java
Plugin               Ruby                 N/A                  Java
Plugin distribution  RubyGems.org         N/A                  N/A
Master Server        No                   No                   Yes
License              Apache License       Apache License       Apache License