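Logstash is an open-source data collection engine with real-time pipelining capabilities. In the ELK Stack, logs are usually collected by the more lightweight Filebeat, shipped to Logstash for processing, and then written to a chosen destination (Elasticsearch, Kafka, and so on). A Logstash event passes through a pipeline of inputs → filters → outputs, and all three stages support custom plugins. This article covers how to develop a filter plugin, the stage where custom requirements arise most often. Installing Logstash is not covered here; it can be downloaded from https://www.elastic.co/downloads/logstash.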
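To make the inputs → filters → outputs flow concrete, here is a minimal conceptual sketch in plain Ruby. It is not how Logstash is implemented: the stages are ordinary lambdas, events are plain hashes, and the name run_pipeline is made up for illustration.

```ruby
# Conceptual model only: each stage is a plain Ruby lambda,
# not a real Logstash plugin, and events are plain hashes.
input = -> { [{ "message" => "hello" }, { "message" => "world" }] }

filters = [
  # Each filter receives an event and returns a modified copy.
  ->(event) { event.merge("message" => event["message"].upcase) },
  ->(event) { event.merge("length" => event["message"].length) }
]

output = ->(event) { puts event.inspect }

# Hypothetical helper: pass every input event through all filters
# in order, then hand the result to the output stage.
def run_pipeline(input, filters, output)
  input.call.map do |event|
    processed = filters.reduce(event) { |e, f| f.call(e) }
    output.call(processed)
    processed
  end
end

results = run_pipeline(input, filters, output)
```

The point of the sketch is only that filters sit in the middle and transform one event at a time, which is the role the plugin developed in this article plays.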
Change to the Logstash root directory and generate a filter plugin template with bin/logstash-plugin:
    bin/logstash-plugin generate --type filter --name test --path vendor/localgems

vendor/localgems can be replaced with a path of your own. The generated filter plugin has the following directory structure:
    $ tree logstash-filter-test
    ├── Gemfile
    ├── LICENSE
    ├── README.md
    ├── Rakefile
    ├── lib
    │   └── logstash
    │       └── filters
    │           └── test.rb
    ├── logstash-filter-test.gemspec
    └── spec
        ├── filters
        │   └── test_spec.rb
        └── spec_helper.rb

Logstash plugins are written in Ruby. The file lib/logstash/filters/test.rb looks like this:
    # encoding: utf-8
    require "logstash/filters/base"
    require "logstash/namespace"

    # This filter will replace the contents of the default
    # message field with whatever you specify in the configuration.
    #
    # It is only intended to be used as an example.
    class LogStash::Filters::Test < LogStash::Filters::Base

      # Setting the config_name here is required. This is how you
      # configure this filter from your Logstash config.
      #
      # filter {
      #   test {
      #     message => "My message..."
      #   }
      # }
      #
      config_name "test"

      # Replace the message with this value.
      config :message, :validate => :string, :default => "Hello World!"

      public
      def register
        # Add instance variables
      end # def register

      public
      def filter(event)
        if @message
          # Replace the event message with our message as configured in the
          # config file.
          event.set("message", @message)
        end

        # filter_matched should go in the last line of our successful code
        filter_matched(event)
      end # def filter
    end # class LogStash::Filters::Test

Logstash depends on UTF-8 encoding, so the following line is required at the top of the plugin source:
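    # encoding: utf-8

The template requires "logstash/filters/base" and "logstash/namespace" by default. Any other code or gems the plugin depends on are required here as well; the MySQL query code later in this article is an example.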
The plugin name is configured as follows:
    config_name "test"

test is the plugin name, and it is the name used inside the filter block of the Logstash configuration.
Plugin parameters are configured as follows:
    config :message, :validate => :string, :default => "Hello World!"

message is an optional parameter of the test plugin, with a default value of "Hello World!". The general form of a parameter declaration is:
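    config :variable_name,
      :validate => :variable_type,
      :default => "Default value",
      :required => boolean,
      :deprecated => boolean,
      :obsolete => string

:variable_name: the parameter name
:validate: the type the parameter is validated against, such as :string, :password, :boolean, :number, :array, :hash, :path
:required: whether the parameter must be configured
:default: the default value
:deprecated: whether the parameter is deprecated
:obsolete: declares that the setting is no longer in use, usually together with an upgrade path

A Logstash filter plugin must implement two methods: register and filter. The register method looks like this: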
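    public
    def register
      # Add instance variables
    end # def register

register acts as an initializer and does not need to be called manually. Configuration variables such as @message can be used inside it, and it is also the place to initialize your own instance variables. The filter method looks like this: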
    public
    def filter(event)
      if @message
        # Replace the event message with our message as configured in the
        # config file.
        event.set("message", @message)
      end

      # filter_matched should go in the last line of our successful code
      filter_matched(event)
    end # def filter

filter holds the plugin's data-processing logic. The event variable wraps the data flowing through the pipeline, and its contents can be accessed through the event API; see https://www.elastic.co/guide/en/logstash/5.1/event-api.html for details. The final statement calls filter_matched, which ensures that the add_field, remove_field, add_tag, and remove_tag options in the Logstash configuration are applied correctly.
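The get/set style of the event API can be tried without a running Logstash. The sketch below is only an approximation: FakeEvent is a hypothetical stand-in for the real event class, ReplaceMessage is a made-up class that mimics the shape of a filter, and the original_message field is invented for illustration. A real plugin would inherit from LogStash::Filters::Base and finish with filter_matched(event).

```ruby
# Hypothetical stand-in for the Logstash event, so the sketch runs
# without Logstash installed. Only get and set are modelled.
class FakeEvent
  def initialize(data = {})
    @data = data.dup
  end

  def get(field)
    @data[field]
  end

  def set(field, value)
    @data[field] = value
  end

  def to_hash
    @data.dup
  end
end

# Hypothetical filter-shaped class: keeps the original text in
# "original_message" (an invented field name), then overwrites
# "message" with the configured value.
class ReplaceMessage
  def initialize(message = "Hello World!")
    @message = message
  end

  def filter(event)
    return event if @message.nil?
    event.set("original_message", event.get("message"))
    event.set("message", @message)
    event
  end
end

event = FakeEvent.new("message" => "raw log line")
ReplaceMessage.new("My message...").filter(event)
puts event.to_hash.inspect
```

Reading a field with get before overwriting it with set is the usual way to keep the original value around for later stages.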
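The following uses a MySQL lookup inside the plugin as an example. MySQL is accessed over JDBC, which requires the jdbc-mysql gem. First add the Logstash environment variables: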
    export LOGSTASH_HOME=/opt/logstash-5.2.1
    export PATH=$PATH:$LOGSTASH_HOME/vendor/jruby/bin

Then install jdbc-mysql:
    gem install jdbc-mysql

MySQL is accessed through sequel (its code and documentation are under vendor/bundle/jruby/1.9/gems/sequel-4.43.0). First declare the dependency on sequel in the logstash-filter-test.gemspec file:
    # Gem dependencies
    s.add_runtime_dependency "logstash-core-plugin-api", "~> 2.0"
    s.add_runtime_dependency 'sequel'
    s.add_development_dependency 'logstash-devutils'

Then require the relevant code in test.rb:
    require "sequel"
    require "sequel/adapters/jdbc"

Add a :jdbc_driver_library parameter to test.rb to configure the path of the JDBC driver library. On my machine the path is "/usr/local/lib/ruby/gems/2.3.0/gems/jdbc-mysql-5.1.40/lib/mysql-connector-java-5.1.40-bin.jar".
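    config :jdbc_driver_library, :validate => :string, :required => true

The register method does two things: it initializes several instance variables, and it requires the JDBC library the plugin depends on. A brief note on the instance variables: @logger writes log output; @connection_retry_attempts and @connection_retry_attempts_wait_time control database connection retries; @connection_wait_timeout sets the MySQL session timeout so that connections do not pile up on the MySQL side. This is a belt-and-braces measure. Normally MySQL has a global timeout configured, and we disconnect explicitly once the query completes (see the fetch_info method), so @connection_wait_timeout only takes effect when the disconnect fails and the MySQL timeout is too long.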
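The retry behaviour governed by the attempt count and the wait time can be distilled into a small generic helper. This is a sketch under assumptions, not code from the plugin: the name with_retries and its keyword arguments are hypothetical, and it uses Ruby's retry keyword rather than an explicit while loop.

```ruby
# Hypothetical helper: run the block up to `attempts` times, sleeping
# `wait` seconds between tries, and re-raise the last error once the
# attempts are used up. `on` lists the exception classes to retry.
def with_retries(attempts:, wait:, on: [StandardError])
  tries = 0
  begin
    tries += 1
    yield tries
  rescue *on => e
    raise e if tries >= attempts
    sleep(wait)
    retry
  end
end

# Usage: the block fails twice with a simulated transient error,
# then succeeds on the third attempt.
calls = 0
result = with_retries(attempts: 5, wait: 0) do |n|
  calls += 1
  raise IOError, "transient failure" if n < 3
  "connected on attempt #{n}"
end
puts result
```

Keeping the attempt count and the wait time as parameters mirrors the two instance variables, so the same loop can serve both connecting and querying.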
The register method:

    public
    def register
      # Add instance variables
      @logger = self.logger
      @connection_retry_attempts = 5
      @connection_retry_attempts_wait_time = 1
      @connection_wait_timeout = 10

      begin
        require @jdbc_driver_library
      # LoadError is not a StandardError, so a bare rescue would not catch a missing driver file
      rescue LoadError, StandardError => e
        @logger.error("Failed to load #{@jdbc_driver_library}", :exception => e)
      end
    end # def register

Creating the db instance:
    private
    def create_db(conn_str)
      db = nil
      retry_attempts = @connection_retry_attempts
      while retry_attempts > 0 do
        retry_attempts -= 1
        begin
          tmp_db = Sequel.connect(conn_str)
        rescue Sequel::PoolTimeout => e
          if retry_attempts <= 0
            @logger.error("Failed to connect to database. 5 second timeout exceeded. Tried #{@connection_retry_attempts} times.")
            raise e
          else
            @logger.error("Failed to connect to database. 5 second timeout exceeded. Trying again.")
          end
        rescue Sequel::Error => e
          if retry_attempts <= 0
            @logger.error("Unable to connect to database. Tried #{@connection_retry_attempts} times", :error_message => e.message)
            raise e
          else
            @logger.error("Unable to connect to database. Trying again", :error_message => e.message)
          end
        else
          # Connection succeeded, stop retrying
          db = tmp_db
          break
        end
        sleep(@connection_retry_attempts_wait_time)
      end
      db
    end

Querying data:
    private
    def fetch_info(db, sql, key)
      all_info = {}
      retry_attempts = @connection_retry_attempts
      while retry_attempts > 0 do
        retry_attempts -= 1
        begin
          # Index every returned row by the value of its key column
          db.fetch(sql) do |row|
            all_info[row[key]] = row
          end
          db.run "set wait_timeout = #{@connection_wait_timeout}"
        rescue Sequel::DatabaseConnectionError, Sequel::DatabaseError => e
          if retry_attempts <= 0
            @logger.warn("Exception when executing JDBC query", :exception => e)
            raise e
          else
            @logger.error("Failed to execute query. Trying again.", :error_message => e.message)
          end
        else
          break
        end
        sleep(@connection_retry_attempts_wait_time)
      end
      db.disconnect()
      all_info
    end

create_db and fetch_info can now be used in register and filter as needed. Note: querying MySQL is only an example here; when processing Logstash events, consider the impact on performance and throughput.
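One common way to limit the throughput cost mentioned in the note above is to cache the lookup result and refresh it only periodically, instead of querying the database for every event. The sketch below is one possible approach, not part of the original plugin: TtlCache, its ttl and clock parameters, and the sample data are all hypothetical. A mutex guards the refresh because Logstash can run filters on several worker threads.

```ruby
# Hypothetical TTL cache: keeps the last loaded value in memory and
# calls the loader block again only after `ttl` seconds have passed.
class TtlCache
  def initialize(ttl, clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) }, &loader)
    @ttl = ttl
    @clock = clock        # injectable so expiry can be exercised without sleeping
    @loader = loader
    @value = nil
    @loaded_at = nil
    @mutex = Mutex.new
  end

  def fetch
    @mutex.synchronize do
      now = @clock.call
      if @loaded_at.nil? || now - @loaded_at >= @ttl
        @value = @loader.call
        @loaded_at = now
      end
      @value
    end
  end
end

# Usage with a fake loader standing in for the database query.
loads = 0
fake_time = 0.0
cache = TtlCache.new(60, clock: -> { fake_time }) do
  loads += 1
  { "host-1" => { "owner" => "ops" } }
end

cache.fetch          # first call runs the loader
cache.fetch          # served from memory, loader not called again
fake_time = 61.0
cache.fetch          # TTL expired, loader runs a second time
puts loads
```

In a plugin, a cache like this would typically be created in register and read in filter, so the per-event cost is a hash lookup rather than a query.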
Change to the Logstash root directory and add the following line to the Gemfile:
    gem "logstash-filter-test", :path => "vendor/localgems/logstash-filter-test"

Start Logstash with our custom test plugin configured:
    bin/logstash -e 'input { beats { port => "5043" } } filter { test { jdbc_driver_library => "/usr/local/lib/ruby/gems/2.3.0/gems/jdbc-mysql-5.1.40/lib/mysql-connector-java-5.1.40-bin.jar" } } output { stdout { codec => rubydebug }}'

The same configuration can also be put in a file, with the same content as the -e argument above, and Logstash started from that file. For details on starting Logstash, see https://www.elastic.co/guide/en/logstash/5.1/running-logstash-command-line.html.