Collecting Custom Logs
Run machregex
[mach@localhost ~/mach_collector_home/bin]$ ./machregex
=================================================================
Machbase Collector Regex Utility
Release Version 3.0.0.8634.official
Copyright 2015, Machbase Inc. or its subsidiaries.
All Rights Reserved.
=================================================================
Usage> ./machregex Pattern NewlinePattern
Result file : machregex.ok machregex.err
<< APACHE access log >>
=> machregex "^([0-9.:]+)\\s([\\w.-]+)\\s([\\w.-]+)\\s(\\[[^\\[\\]]+\\])\\s\"((?:[^\"]|\")+)\"\\s(\\d{3})\\s(\\d+|-)\\s\"((?:[^\"]|\")*)\"\\s\" ((?:[^\"]|\")*)\"$" "^([0-9.:]+)\s" < DATA.LOG
<< MACH trace log >>
=> machregex "^\\[(\\d+[-]\\d+[-]\\d+\\s\\d+[:]\\d+[:]\\d+)+\\s([P][-]\\d+)+\\s([T][-]\\d+)+\\]\\s((?:[^\\0])*)$" "^\\[" < DATA.LOG
<< syslog >>
=> machregex "^(([a-zA-Z]+)\\s+([0-9]+)\\s+([0-9:]*))\\s(\\S*)\\s+((?:[^\\0])*)$" ".*" < DATA.LOG
This is an example of the machregex run screen.
machregex Test
This test uses machregex to parse syslog data with a regular expression.
[mach@localhost bin]$ machregex "^(([a-zA-Z]+)\\s+([0-9]+)\\s+([0-9:]*))\\s(\\S*)\\s+((?:[^\\0])*)$" ".*" </var/log/syslog
machregex "^(([a-zA-Z]+)\\s+([0-9]+)\\s+([0-9:]*))\\s(\\S*)\\s+((?:[^\\0])*)$" ".*" </var/log/syslog
Pattern => (^(([a-zA-Z]+)\s+([0-9]+)\s+([0-9:]*))\s(\S*)\s+((?:[^\0])*)$)
========================================================================
.............
========================================================================
SUCCESS[107] (rc=7)(Aug 19 18:17:01 localhost CRON[6553]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly) )
ALL (0:110) => [Aug 19 18:17:01 localhost CRON[6553]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly) ]
0 (0:15) => [Aug 19 18:17:01]
1 (0:3) => [Aug]
2 (4:6) => [19]
3 (7:15) => [18:17:01]
4 (16:37) => [localhost]
5 (38:110) => [CRON[6553]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly) ]
=======================================================================
SUCCESS[107] (rc=7)(Aug 19 18:39:01 localhost CRON[6616]: (root) CMD ( [ -x /usr/lib/php5/maxlifetime ] && [ -x /usr/lib/php5/sessionclean ] && [ -d /var/lib/php5 ] && /usr/lib/php5/sessionclean /var/lib/php5 $(/usr/lib/php5/maxlifetime)) )
ALL (0:232) => [Aug 19 18:39:01 localhost CRON[6616]: (root) CMD ( [ -x /usr/lib/php5/maxlifetime ] && [ -x /usr/lib/php5/sessionclean ] && [ -d /var/lib/php5 ] && /usr/lib/php5/sephp5/maxlifetime)) ]
0 (0:15) => [Aug 19 18:39:01]
1 (0:3) => [Aug]
2 (4:6) => [19]
3 (7:15) => [18:39:01]
4 (16:37) => [localhost]
5 (38:232) => [CRON[6616]: (root) CMD ( [ -x /usr/lib/php5/maxlifetime ] && [ -x /usr/lib/php5/sessionclean ] && [ -d /var/lib/php5 ] && /usr/lib/php5/sphp5/maxlifetime)) ]
Summary : Success(107), Failure(0)  <== It shows that all of them were successfully completed.
In the above example, machregex parses the syslog text file with the given regular expression and splits each record into six tokens, numbered 0 to 5. To use tokens 0, 4, and 5 as database input, use the COL_LIST variable in the template file to associate each token with a database column.
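As a rough illustration of the token numbering, the following Python sketch applies the same syslog pattern to the first sample record. It is not how machregex is implemented; Python's `re` module is only a stand-in, and the `tokens` helper is a hypothetical name. Python numbers capture groups from 1, so machregex token n corresponds to `group(n + 1)`; the helper hides that offset by returning a 0-based list.

```python
import re

# Same pattern as passed to machregex above, with single backslashes
# because no shell escaping is needed inside a Python raw string.
SYSLOG_PATTERN = re.compile(
    r"^(([a-zA-Z]+)\s+([0-9]+)\s+([0-9:]*))\s(\S*)\s+((?:[^\0])*)$"
)


def tokens(line):
    """Return the capture groups as a list indexed like machregex tokens (0-based)."""
    match = SYSLOG_PATTERN.match(line)
    return list(match.groups()) if match else None


line = (
    "Aug 19 18:17:01 localhost CRON[6553]: (root) CMD "
    "( cd / && run-parts --report /etc/cron.hourly)"
)
parsed = tokens(line)

# Tokens 0, 4, and 5 are the ones the text suggests binding to columns.
for index in (0, 4, 5):
    print(index, "=>", parsed[index])
```

Token 0 is the full timestamp, token 4 the host name, and token 5 the remaining message, which is why those three are natural candidates for columns.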
Example of Creating Custom Template
In this chapter, we will use a sample text log file to create a collector template that collects data from this file.
test.log
The input sample text file looks like this:
[2014-08-18 13:51:19] spiderman message-1 : This is the best machine data DBMS ever.
[2014-08-18 13:51:19] superman message-2 : This is the best machine data DBMS ever.
[2014-08-18 13:51:33] spiderman message-3 : This is the best machine data DBMS ever.
[2014-08-18 13:51:33] superman message-4 : This is the best machine data DBMS ever.
[2014-08-18 13:51:34] batman message-5 : This is the best machine data DBMS ever.
[2014-08-18 13:52:34] superman message-6 : This is the best machine data DBMS ever.
[2014-08-18 13:53:34] batman message-7 : This is the best machine data DBMS ever.
[2014-08-18 13:54:31] superman message-8 : This is the best machine data DBMS ever.
[2014-08-18 13:55:30] batman message-9 : This is the best machine data DBMS ever.
[2014-08-18 13:56:44] spiderman message-10 : This is the best machine data DBMS ever.
[2014-08-18 13:57:59] superman message-11 : This is the best machine data DBMS ever.
Each record in the above sample file can be split into three columns: tm, user, and msg. Their data types can be specified as datetime, varchar(16), and varchar(512), respectively.
Example of Creating Regular Expression
Creating Regular Expression
\[([0-9-: ]+)\]
: First comes the date, enclosed in square brackets. The parentheses capture only the date and time inside the brackets, excluding the brackets themselves.
(\S+)
: Second comes the user name, a string that contains no whitespace.
([^\0]*)
: Third, the rest of the text up to the end of the record is captured as the message.
\[([0-9-: ]+)\]\s(\S+)\s+([^\0]*)
: Combines the three tokens, with whitespace between them.
"\\[([0-9-: ]+)\\]\\s(\\S+)\\s+([^\\0]*)"
: Doubles each backslash so that the string can be passed through the shell.
"^\\["
: The new-line regular expression. A new record starts with the square bracket that opens the timestamp.
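The combined pattern from the steps above can be tried outside machregex. The following is a minimal Python sketch, assuming Python's `re` module treats this pattern the same way machregex does; `parse_record` is a hypothetical helper, and the timestamp layout is inferred from the sample log lines.

```python
import re
from datetime import datetime

# Combined pattern from the steps above, in its unescaped form.
RECORD_PATTERN = re.compile(r"\[([0-9-: ]+)\]\s(\S+)\s+([^\0]*)")


def parse_record(record):
    """Split one log record into (tm, user, msg), or return None if it does not match."""
    match = RECORD_PATTERN.match(record)
    if match is None:
        return None
    tm_text, user, msg = match.groups()
    # Assumed timestamp layout, taken from the sample log lines.
    tm = datetime.strptime(tm_text, "%Y-%m-%d %H:%M:%S")
    return tm, user, msg


sample = "[2014-08-18 13:51:19] spiderman message-1 : This is the best machine data DBMS ever."
print(parse_record(sample))
```

The three returned values line up with the tm, user, and msg columns described earlier.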
Checking Regular Expression
[mach@localhost ~/mach_collector_home/bin]$ machregex "\\[([0-9-: ]+)\\]\\s(\\S+)\\s+([^\\0]+)" "\\[" <test.log
Pattern => (\[([0-9-: ]+)\]\s(\S+)\s+([^\0]+))
============================================================================
SUCCESS[2] (rc=4)([2014-08-18 13:51:19] spiderman message-1 : This is the best machine data DBMS ever. )
ALL (0:85) => [[2014-08-18 13:51:19] spiderman message-1 : This is the best machine data DBMS ever. ]
0 (1:20) => [2014-08-18 13:51:19]
1 (22:31) => [spiderman]
2 (32:85) => [message-1 : This is the best machine data DBMS ever. ]
============================================================================
SUCCESS[3] (rc=4)([2014-08-18 13:51:19] superman message-2 : This is the best machine data DBMS ever. )
ALL (0:85) => [[2014-08-18 13:51:19] superman message-2 : This is the best machine data DBMS ever. ]
0 (1:20) => [2014-08-18 13:51:19]
1 (22:30) => [superman]
2 (32:85) => [message-2 : This is the best machine data DBMS ever. ]
============================================================================
SUCCESS[4] (rc=4)([2014-08-18 13:51:33] spiderman message-3 : This is the best machine data DBMS ever. )
ALL (0:85) => [[2014-08-18 13:51:33] spiderman message-3 : This is the best machine data DBMS ever. ]
0 (1:20) => [2014-08-18 13:51:33]
1 (22:31) => [spiderman]
2 (32:85) => [message-3 : This is the best machine data DBMS ever. ]
============================================================================
SUCCESS[5] (rc=4)([2014-08-18 13:51:33] superman message-4 : This is the best machine data DBMS ever. )
ALL (0:85) => [[2014-08-18 13:51:33] superman message-4 : This is the best machine data DBMS ever. ]
0 (1:20) => [2014-08-18 13:51:33]
1 (22:30) => [superman]
2 (32:85) => [message-4 : This is the best machine data DBMS ever. ]
============================================================================
SUCCESS[6] (rc=4)([2014-08-18 13:51:34] batman message-5 : This is the best machine data DBMS ever. )
ALL (0:85) => [[2014-08-18 13:51:34] batman message-5 : This is the best machine data DBMS ever. ]
0 (1:20) => [2014-08-18 13:51:34]
1 (22:28) => [batman]
2 (32:85) => [message-5 : This is the best machine data DBMS ever. ]
============================================================================
SUCCESS[7] (rc=4)([2014-08-18 13:52:34] superman message-6 : This is the best machine data DBMS ever. )
ALL (0:85) => [[2014-08-18 13:52:34] superman message-6 : This is the best machine data DBMS ever. ]
0 (1:20) => [2014-08-18 13:52:34]
1 (22:30) => [superman]
2 (32:85) => [message-6 : This is the best machine data DBMS ever. ]
============================================================================
SUCCESS[8] (rc=4)([2014-08-18 13:53:34] batman message-7 : This is the best machine data DBMS ever. )
ALL (0:85) => [[2014-08-18 13:53:34] batman message-7 : This is the best machine data DBMS ever. ]
0 (1:20) => [2014-08-18 13:53:34]
1 (22:28) => [batman]
2 (32:85) => [message-7 : This is the best machine data DBMS ever. ]
============================================================================
SUCCESS[9] (rc=4)([2014-08-18 13:54:31] superman message-8 : This is the best machine data DBMS ever. )
ALL (0:85) => [[2014-08-18 13:54:31] superman message-8 : This is the best machine data DBMS ever. ]
0 (1:20) => [2014-08-18 13:54:31]
1 (22:30) => [superman]
2 (32:85) => [message-8 : This is the best machine data DBMS ever. ]
============================================================================
SUCCESS[10] (rc=4)([2014-08-18 13:55:30] batman message-9 : This is the best machine data DBMS ever. )
ALL (0:85) => [[2014-08-18 13:55:30] batman message-9 : This is the best machine data DBMS ever. ]
0 (1:20) => [2014-08-18 13:55:30]
1 (22:28) => [batman]
2 (32:85) => [message-9 : This is the best machine data DBMS ever. ]
============================================================================
SUCCESS[11] (rc=4)([2014-08-18 13:56:44] spiderman message-10 : This is the best machine data DBMS ever. )
ALL (0:86) => [[2014-08-18 13:56:44] spiderman message-10 : This is the best machine data DBMS ever. ]
0 (1:20) => [2014-08-18 13:56:44]
1 (22:31) => [spiderman]
2 (32:86) => [message-10 : This is the best machine data DBMS ever. ]
============================================================================
SUCCESS[11] (rc=4)([2014-08-18 13:57:59] superman message-11 : This is the best machine data DBMS ever.)
ALL (0:85) => [[2014-08-18 13:57:59] superman message-11 : This is the best machine data DBMS ever.]
0 (1:20) => [2014-08-18 13:57:59]
1 (22:30) => [superman]
2 (32:85) => [message-11 : This is the best machine data DBMS ever.]
Summary : Success(11), Failure(0)
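To make the role of the two arguments more concrete, here is a small Python sketch that imitates the check: one pattern marks where a record starts, the other is matched against each record, and a success/failure summary is printed. How machregex applies its NewlinePattern internally is not described here, so this is only one plausible reading. `split_records` and `summarize` are hypothetical helpers, the record-start pattern is assumed to be anchored to line starts, and the continuation line and broken record in the input are invented for illustration.

```python
import re

RECORD_PATTERN = re.compile(r"\[([0-9-: ]+)\]\s(\S+)\s+([^\0]+)")
# Assumption: a record starts wherever a line begins with "[".
RECORD_START = re.compile(r"^\[", re.MULTILINE)


def split_records(text):
    """Cut the text into records at every position where RECORD_START matches."""
    starts = [m.start() for m in RECORD_START.finditer(text)]
    ends = starts[1:] + [len(text)]
    return [text[s:e] for s, e in zip(starts, ends)]


def summarize(text):
    """Count how many records match RECORD_PATTERN and how many do not."""
    success = failure = 0
    for record in split_records(text):
        if RECORD_PATTERN.match(record):
            success += 1
        else:
            failure += 1
    return success, failure


# Two well-formed records (the second with an invented continuation line)
# followed by an invented record that lacks a valid timestamp.
log_text = (
    "[2014-08-18 13:51:19] spiderman message-1 : This is the best machine data DBMS ever.\n"
    "[2014-08-18 13:51:19] superman message-2 : This is the best machine data DBMS ever.\n"
    "   continuation line that belongs to message-2\n"
    "[broken record without a timestamp\n"
)
print("Summary : Success(%d), Failure(%d)" % summarize(log_text))
```

Because the continuation line does not begin with "[", it stays attached to the preceding record, which is the reason for supplying a record-start pattern in addition to the main one.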
Creating test.rgx
Once the process above confirms that the regular expression parses the log correctly, write an rgx file that holds the regular expression and the column bindings, as follows. Save this file as $MACHBASE_HOME/collector/samples/test.rgx.
###############################################################################
# Copyright of this product 2013-2023,
# Machbase Corporation (Incorporation) or its subsidiaries.
# All Rights reserved
###############################################################################
#
# This file is for Machbase trace collector regex file.
#
LOG_TYPE=custom
COL_LIST= (
    (
        REGEX_NO = 0
        NAME = tm
        TYPE = datetime
        SIZE = 8
        DATE_FORMAT="%Y-%m-%d %H:%M:%S"
    ),
    (
        REGEX_NO = 1
        NAME = user
        TYPE = varchar
        SIZE = 16
        USE_INDEX = 1
    ),
    (
        REGEX_NO = 2
        NAME = msg
        TYPE = varchar
        SIZE = 512
        USE_INDEX = 1
    )
)
REGEX="\[([0-9-: ]+)\]\s(\S+)\s+([^\0]+)"
END_REGEX="\["
Creating test.tpl
Copy $MACHBASE_HOME/collector/custom.tpl to $MACHBASE_HOME/collector/test.tpl and modify the copy as follows:
###############################################################################
# Copyright of this product 2013-2023,
# Machbase Corporation(Incorporation) or its subsidiaries.
# All Rights reserved
###############################################################################
#
# This file is for Machbase collector template file.
#
###################################################################
# Collect setting
###################################################################
COLLECT_TYPE=FILE
LOG_SOURCE=/home/mach/machbase_home/collector/samples/test.log
###################################################################
# Process setting
###################################################################
REGEX_PATH=/home/mach/machbase_home/collector/samples/test.rgx
###################################################################
# Output setting
###################################################################
DB_TABLE_NAME = "custom_table"
DB_ADDR = "127.0.0.1"
DB_PORT = 5656
DB_USER = "SYS"
DB_PASS = "MANAGER"
# 0: Direct insert
# 1: Prepared insert
# 2: Append
APPEND_MODE=2
# 0: None, just append.
# 1: Truncate.
# 2: Try to create table. If table already exists, warn it and proceed.
# 3: Drop and create.
CREATE_TABLE_MODE=2
Create and Execute a Collector
Create/Run Collector
Create a "myclt" collector and run it.
Mach> create collector localhost.myclt from "/home/mach/mach_collector_home/collector/samples/test.tpl";
Created successfully.
Elapsed Time : 0.106
Mach>
Mach> alter collector localhost.myclt start;
Altered successfully.
Debugging Collector
The custom_table table, which should store the input data, was not created.
Mach> select * from custom_table;
[ERR-02025 : Table CUSTOM_TABLE does not exist.]
To find the cause, have the collector write its errors to a trace file. Run the following commands to restart the collector in trace mode and generate the trace file.
Mach> alter collector localhost.myclt stop;
Altered successfully.
Mach> alter collector localhost.myclt start trace;
Altered successfully.
Problem Detection/Resolution Through Trace Log
If an error occurs while the collector is running, check the $MACHBASE_HOME/trc/machbase.trc file for database execution errors. If the error occurs in the collector itself, you must run the collector in TRACE mode.
[2016-03-13 23:44:35 P-29741 T-139982693979904][INFO] PREPARE Error [create table custom_table ( collector_type varchar(32), collector_addr ipv4, collector_origin varchar(512), collector_offset long, tm datetime, user varchar(16), msg varchar(512))] (100007DA:Error in parse (syntax): near token (user varchar(16), msg varchar(512))).)
The above message shows that the table creation query failed: user, which was set as a column name, is a built-in keyword and cannot be used as a column name. Therefore, in the COL_LIST section of the rgx file, change the user column to myuser and run the collector again.
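When the trace file is long, it can help to filter it for this kind of entry. The following Python sketch pulls the failing statement and the token named after "near token" out of the trace line shown above. The line layout is assumed from that single sample, so other trace entries may not match, and `find_prepare_errors` is a hypothetical helper rather than a Machbase tool.

```python
import re

# Layout assumed from the sample: PREPARE Error [statement] (code:message)
PREPARE_ERROR = re.compile(
    r"PREPARE Error \[(?P<sql>.*?)\] \((?P<code>[0-9A-Fa-f]+):(?P<message>.*)\)\s*$"
)
# The first word after "near token (" is the identifier the parser stopped at.
NEAR_TOKEN = re.compile(r"near token \((\w+)")


def find_prepare_errors(trace_text):
    """Return (statement, offending_token) pairs for each PREPARE Error line."""
    results = []
    for line in trace_text.splitlines():
        match = PREPARE_ERROR.search(line)
        if match is None:
            continue
        near = NEAR_TOKEN.search(match.group("message"))
        results.append((match.group("sql"), near.group(1) if near else None))
    return results


trace = (
    "[2016-03-13 23:44:35 P-29741 T-139982693979904][INFO] PREPARE Error "
    "[create table custom_table ( collector_type varchar(32), collector_addr ipv4, "
    "collector_origin varchar(512), collector_offset long, tm datetime, "
    "user varchar(16), msg varchar(512))] "
    "(100007DA:Error in parse (syntax): near token (user varchar(16), msg varchar(512))).)"
)

for sql, token in find_prepare_errors(trace):
    print("offending token:", token)
    print("statement:", sql)
```

For the sample line, the extracted token is the column name that has to be renamed in COL_LIST.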
Partial contents of "test.rgx"
...........
COL_LIST= (
    (
        REGEX_NO = 0
        NAME = tm
        TYPE = datetime
        SIZE = 8
        DATE_FORMAT="%Y-%m-%d %H:%M:%S"
    ),
    (
        REGEX_NO = 1
        NAME = myuser    <== Modified part
        TYPE = varchar
        SIZE = 16
        USE_INDEX = 1
    ),
    (
        REGEX_NO = 2
        NAME = msg
        TYPE = varchar
        SIZE = 512
        USE_INDEX = 1
    )
)
..................
Check Run/Results
Rerun it with the modified rgx file.
Mach> alter collector localhost.myclt stop;     <== Stop the TRACE mode.
Altered successfully.
Mach> alter collector localhost.myclt start;    <== Execute it again in a normal mode after the modification
Altered successfully.
If the collector runs normally, you can query the table in which the collected data is stored.
Mach> select tm, myuser, msg from custom_table;
tm                              myuser
-----------------------------------------------------
msg
------------------------------------------------------------------------------------
2014-08-18 13:57:59 000:000:000 superman
message-11 : This is the best machine data DBMS ever.
2014-08-18 13:56:44 000:000:000 spiderman
message-10 : This is the best machine data DBMS ever.
2014-08-18 13:55:30 000:000:000 batman
message-9 : This is the best machine data DBMS ever.
2014-08-18 13:54:31 000:000:000 superman
message-8 : This is the best machine data DBMS ever.
2014-08-18 13:53:34 000:000:000 batman
message-7 : This is the best machine data DBMS ever.
2014-08-18 13:52:34 000:000:000 superman
message-6 : This is the best machine data DBMS ever.
2014-08-18 13:51:34 000:000:000 batman
message-5 : This is the best machine data DBMS ever.
2014-08-18 13:51:33 000:000:000 superman
message-4 : This is the best machine data DBMS ever.
2014-08-18 13:51:33 000:000:000 spiderman
message-3 : This is the best machine data DBMS ever.
2014-08-18 13:51:19 000:000:000 superman
message-2 : This is the best machine data DBMS ever.
2014-08-18 13:51:19 000:000:000 spiderman
message-1 : This is the best machine data DBMS ever.
[11] row(s) selected.