Creating Collector
Register Collector Manager
To connect the collector manager with the Machbase server, register the collector manager with the Machbase server. Execute the following command using machsql.
CREATE COLLECTORMANAGER manager_name AT "host_addr:host_port";
- manager_name : The name of the collector manager. Duplicate values are not allowed.
- host_addr: The IP address of the server where the collector manager is running.
- host_port: Port number of the server on which collector manager is running.
[mach@localhost ~/mach]$ machsql
=================================================================
     Machbase Client Query Utility
     Release Version x.x.x.official
     Copyright 2015, Machbase Inc. or its subsidiaries.
     All Rights Reserved.
=================================================================
Machbase server address (Default:127.0.0.1):
Machbase user ID (Default:SYS)
Machbase user password:
MACHBASE_CONNECT_MODE=INET, PORT=5656
mach> CREATE COLLECTORMANAGER LOCALHOST AT "127.0.0.1:9999";
Created successfully.
After registering a collector manager on the Machbase server, you can query the status in the m$sys_collectormanagers table.
mach> SELECT * FROM m$sys_collectormanagers;
MANAGER_ID  MANAGER_NAME  MANAGER_HOST  MANAGER_PORT  MANAGER_STATE
---------------------------------------------------------------------
1           LOCALHOST     127.0.0.1     9999          1
[1] row(s) selected.
The table shows the identifier, name, host address, port number, and execution state of each collector manager.
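The registration statement can also be generated from a script. The following is a minimal sketch that only builds the statement text. The helper name `create_collectormanager_sql` and its validation rules (identifier-style name, IP address, port range) are assumptions made for illustration and are not part of Machbase.

```python
import ipaddress


def create_collectormanager_sql(manager_name: str, host_addr: str, host_port: int) -> str:
    """Build a CREATE COLLECTORMANAGER statement (hypothetical helper)."""
    # Assumption: manager names look like plain identifiers.
    if not manager_name.isidentifier():
        raise ValueError(f"invalid manager name: {manager_name!r}")
    # Raises ValueError when host_addr is not a valid IP address.
    ipaddress.ip_address(host_addr)
    if not 1 <= host_port <= 65535:
        raise ValueError(f"port out of range: {host_port}")
    return f'CREATE COLLECTORMANAGER {manager_name} AT "{host_addr}:{host_port}";'


print(create_collectormanager_sql("LOCALHOST", "127.0.0.1", 9999))
```

The printed statement can then be executed in machsql as shown above.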
Create Collector
After registering the collector manager, create the collector object through the collector manager.
Information about the Collector is stored in the Machbase server and can be retrieved. Execute the following command through machsql to create a collector.
CREATE COLLECTOR manager_name.collector_name FROM "path_for_template.tpl";
- manager_name : The name of the collector manager that runs the collector.
- collector_name: The name of the collector object.
- path_for_template.tpl: The path to the configuration file for collector. The various sample configuration files are located in the "$MACHBASE_COLLECTOR_HOME/collector" directory. It is recommended to select the desired sample file, modify it, and save it as another file.
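The recommendation to copy a sample file and save the modified copy under another name can be scripted. The sketch below is one possible approach, not a Machbase tool: the function `derive_template` and the file name `custom.tpl` are hypothetical, and the sample content written to the temporary directory is a shortened stand-in for a real sample file.

```python
import re
import tempfile
from pathlib import Path


def derive_template(sample: Path, target: Path, overrides: dict[str, str]) -> None:
    """Copy a sample template to a new file, replacing selected values."""
    pending = dict(overrides)
    out_lines = []
    for line in sample.read_text().splitlines():
        # Comment lines start with '#', so they never match this pattern.
        match = re.match(r"\s*([A-Z_]+)\s*=", line)
        if match and match.group(1) in pending:
            name = match.group(1)
            line = f"{name}={pending.pop(name)}"
        out_lines.append(line)
    # Variables that were not present in the sample are appended at the end.
    out_lines.extend(f"{name}={value}" for name, value in pending.items())
    target.write_text("\n".join(out_lines) + "\n")


with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "syslog.tpl"
    sample.write_text('COLLECT_TYPE=FILE\nDB_TABLE_NAME = "syslogtable"\n')
    custom = Path(tmp) / "custom.tpl"
    derive_template(sample, custom, {"DB_TABLE_NAME": '"anothertable"'})
    result = custom.read_text()

print(result)
```

The original sample file is left untouched, so it remains available as a reference.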
Prepare Template File
The template file is a text file that describes the Collector's data source, processing method, and storage method. Sample files are provided in the $MACHBASE_COLLECTOR_HOME/collector directory.
Template File Structure
The template file has a structure of "variable name = value" similar to the Machbase property file. Detailed information of each setting variable is shown in the following table.
Configuration files after Machbase version 3.5 are not backward compatible.
Variable Name | Description | Remarks |
---|---|---|
COLLECT_TYPE | Data collection method | Sets the data collection method. The available methods are as follows. FILE: reads a specific file on the machine where the collector is installed (default). SFTP: reads a file from a remote SFTP path. SOCKET: receives data through a socket. ODBC: reads data from a database configured through ODBC. |
LOG_SOURCE | Location of log file to be read | The location of the data file to be read. In SFTP mode, you must specify the absolute path on the remote host. Not used in SOCKET and ODBC modes. You can also specify multiple source files or use regular expressions. |
SFTP_HOST | SFTP host | Host IP address |
SFTP_PORT | SFTP port | Defaults to 22 if not set. |
SFTP_USER | SFTP username | Is set to anonymous by default. |
SFTP_PASS | SFTP password | Is set to anonymous by default. |
SOCKET_PORT | Socket port number on which the Collector enters data | |
SOCKET_PROTOCOL | Collector socket protocol type | Possible values are TCP and UDP. The default value is TCP. |
ODBC_DSN | ODBC mode DSN | ".odbc.ini" value |
ODBC_QUERY | ODBC mode query | Query string executed to obtain input data from an ODBC data source |
ODBC_SEQ_COLUMN | Name of the incrementing column in ODBC mode | Only numeric columns are allowed. |
LIB_NAME | External link library path | Not used yet. |
REGEX_PATH | Regular expression file for analyzing input data | Not used in ODBC mode. |
PREPROCESS_PATH | Location of Python script files for data preprocessing | |
SLEEP_TIME | Wait time after inputting data | In milliseconds, with a default of 1000. |
DB_TABLE_NAME | Table name to be entered | |
DB_ADDR | Database IP address to be entered | |
DB_PORT | Database port number | |
DB_USER | Database username | |
DB_PASS | Database password | |
APPEND_MODE | Data input method configuration | Not used; the value is kept only for compatibility with past versions. |
AUTO_ADD_COLUMN | Whether to automatically generate a table column if it does not exist. If 0, it is not generated. If 1, it is generated automatically. | Default value is 1. |
CREATE_TABLE_MODE | Sets the operation performed on the input table. (0: do nothing. 1: truncate the existing table. 2: create the table; if an error occurs, write the error to trc and continue. 3: drop the table and recreate it.) | Generally recommended to set to 2. |
LANG | Specifies the encoding of the input data file. | Available values are UTF-8 (default), CP949 (MS949), KSC5601, EUCJP, SHIFTJIS, BIG5 and BG231280. |
REGEX_SORT | Determines the order of the input files. | Default value is ASC and DESC is also possible. |
ROTATE_FILE_PATH | Rotation file path configuration | |
ROTATE_FILE_COUNT | Rotation file number configuration | |
ROTATE_REGEX_SORT | Rotation file order configuration | Default value is ASC. DESC is also possible. |
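To illustrate the "variable name = value" structure, here is a minimal sketch of reading such a file into a dictionary. It is not the collector's own parser. The function `parse_template` is hypothetical, the defaults are the ones stated in the table above, and the sketch assumes that `#` starts a comment line and that COLLECT_TYPE is set explicitly.

```python
# Defaults taken from the Remarks column of the table above.
DEFAULTS = {
    "SFTP_PORT": "22",
    "SOCKET_PROTOCOL": "TCP",
    "SLEEP_TIME": "1000",
    "AUTO_ADD_COLUMN": "1",
    "LANG": "UTF-8",
    "REGEX_SORT": "ASC",
}
COLLECT_TYPES = {"FILE", "SFTP", "SOCKET", "ODBC"}


def parse_template(text: str) -> dict[str, str]:
    """Parse 'name = value' lines into a dict (illustrative only)."""
    settings = dict(DEFAULTS)
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"not a 'name = value' line: {raw!r}")
        settings[name.strip()] = value.strip().strip('"')
    if settings.get("COLLECT_TYPE") not in COLLECT_TYPES:
        raise ValueError(f"unknown COLLECT_TYPE: {settings.get('COLLECT_TYPE')!r}")
    return settings


sample = """
# Input setting
COLLECT_TYPE=FILE
LOG_SOURCE=/var/log/syslog
DB_TABLE_NAME = "syslogtable"
DB_PORT = 5656
"""
settings = parse_template(sample)
print(settings["DB_TABLE_NAME"], settings["SLEEP_TIME"])
```

A check like this can catch a misspelled COLLECT_TYPE before the template is handed to CREATE COLLECTOR.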
REGEX_PATH and PREPROCESS_PATH point to files that the collector refers to at run time. Below is a description of the rgx file set in REGEX_PATH.
Variable Name | Description | Remarks |
---|---|---|
LOG_TYPE | Regular expression name | Value that can be modified, but it is better to keep the value because it is stored together in the database. |
COL_LIST | List of columns in the table | Information on the columns belonging to the table |
REGEX | Regular expressions for data analysis | |
END_REGEX | Regular expression that signifies the end of a record | Regular expression to separate each record. If not set, |
COL_LIST describes how the log file is linked to the database columns. For each column, you specify which regular expression result supplies its value, together with the column attributes. In this way, complex log data can be entered into structured table columns.
Variable Name | Description | Remarks |
---|---|---|
NAME | Column name | String that does not contain spaces. |
TYPE | Column data type | Name of the type. |
SIZE | Column size | Refers to the actual specified size of the column. The string specifies a different value depending on the size to be created or created. ((short (6), int (11), long (20), float (17), double (17), datetime -defined), ipv4 (15), ipv6 (45), text (64MB), binary (64MB)) |
DATE_FORMAT | Datetime data format when type is datetime | Internally parses the value using the "strptime" function. For example, 'Aug 19 07:56:16' has the form 'month day hour:minute:second', so the format value is "%b %d %H:%M:%S". |
USE_INDEX | Whether to create index | Creates LSM or KEYWORD LSM index based on type. 0: Do not create. / 1: Create. |
REGEX_NO | Token number within regular expression | Among the REGEX syntax specified in the regular expression file, the "()" parenthesized area is a token. 0 means the entire record data. After that, it becomes the first token from the first parenthesis. |
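The DATE_FORMAT value can be tried out before it is placed in an rgx file. The sketch below uses Python's `datetime.strptime`, which accepts the same directives used here; it is only an approximation of the collector's behavior, since the collector is described as using the C "strptime" function. The helper `parse_syslog_time` and the explicit `year` argument are assumptions, added because a syslog timestamp carries no year.

```python
from datetime import datetime

DATE_FORMAT = "%b %d %H:%M:%S"  # format used for syslog timestamps


def parse_syslog_time(stamp: str, year: int) -> datetime:
    """Parse a syslog timestamp, supplying the missing year explicitly."""
    return datetime.strptime(f"{year} {stamp}", f"%Y {DATE_FORMAT}")


parsed = parse_syslog_time("Aug 19 07:56:16", 2016)
print(parsed)
```

If the format does not match the input, `strptime` raises `ValueError`, which makes mismatches easy to spot early.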
syslog.tpl Example
Below is an example of a syslog.tpl file. The file is provided as a sample in $MACHBASE_COLLECTOR_HOME/collector/syslog.tpl.
###############################################################################
# Copyright of this product 2013-2023,
# Machbase Inc. or its subsidiaries.
# All Rights reserved
###############################################################################
#
# This file is for Machbase collector template file.
#

###################################################################
# Input setting
###################################################################
COLLECT_TYPE=FILE               <== It specifies a method to collect local data.
LOG_SOURCE=/var/log/syslog      <== It specifies a location of source file.

###################################################################
# Process setting
###################################################################
REGEX_PATH=syslog.rgx           <== Regular expression file location. Set $MACHBASE_HOME/collector/regex/ to root

###################################################################
# Output setting
###################################################################
DB_TABLE_NAME = "syslogtable"   <== Table name: Data entered here
DB_ADDR       = "127.0.0.1"     <== Running Machbase server IP/PORT
DB_PORT       = 5656
DB_USER       = "SYS"
DB_PASS       = "MANAGER"

# 0: Direct insert
# 1: Prepared insert
# 2: Append
APPEND_MODE=2                   <== Data insertion in APPEND mode.

# 0: None, just append.
# 1: Truncate.
# 2: Try to create table. If table already exists, warn it and proceed.
# 3: Drop and create.
CREATE_TABLE_MODE=2             <== Create a table if there is none.
The syslog.rgx file is a regular expression file set in the syslog.tpl file. When setting up an rgx file, you can either set it to an absolute path or relative path based on $MACHBASE_COLLECTOR_HOME/collector/regex.
###############################################################################
# Copyright of this product 2013-2023,
# Machbase Corporation (Incorporation) or its subsidiaries.
# All Rights reserved
###############################################################################
#
# This file is for Machbase collector regex file.
#

LOG_TYPE=syslog

COL_LIST= (
    (
        REGEX_NO = 0                      <== Regular expression token number
        NAME = tm
        TYPE = datetime
        SIZE = 8
        DATE_FORMAT="%b %d %H:%M:%S"      <== datetime format used by strptime function
    ),
    (
        REGEX_NO = 4
        NAME = host
        TYPE = varchar
        SIZE = 128
        USE_INDEX = 1                     <== Whether index is in use
    ),
    (
        REGEX_NO = 5
        NAME = msg
        TYPE = varchar
        SIZE = 512
        USE_INDEX = 1
    )
)

# Below is the regular expression to parse syslog data. It may not work properly if it is modified.
REGEX="(([a-zA-Z]+)\s+([0-9]+)\s+([0-9:]*))\s(\S+)\s+([^\n]+)"
END_REGEX="\n"
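To see what the REGEX value captures, the pattern can be run against a single syslog line. The sketch below uses Python's `re` module, which may differ in details from the regular expression engine inside the collector. In this sketch, the positions in Python's `groups()` tuple happen to line up with the REGEX_NO values of the sample (0, 4, 5); treat that as an observation about this sample, not as a statement of how the collector numbers tokens.

```python
import re

# Pattern copied from the sample syslog.rgx file.
REGEX = r"(([a-zA-Z]+)\s+([0-9]+)\s+([0-9:]*))\s(\S+)\s+([^\n]+)"
LINE = "Jun 28 21:26:02 localhost su: pam_unix(su:session): session closed for user root"

match = re.match(REGEX, LINE)
if match is None:
    raise SystemExit("line does not match the pattern")

tokens = match.groups()
for index, token in enumerate(tokens):
    print(index, repr(token))

# Columns chosen to mirror the tm, host, and msg entries of COL_LIST.
row = {"tm": tokens[0], "host": tokens[4], "msg": tokens[5]}
print(row)
```

Testing a pattern this way against a few real log lines is a quick check before editing the rgx file.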
Create Collector
Create the collector "syslog_test" as shown below.
mach> CREATE COLLECTOR localhost.syslog_test FROM "/home/mach/mach_collector_home/collector/syslog.tpl";
Created successfully.
Check Collector
The M$SYS_COLLECTORS table contains information about the registered collectors. A collector whose RUN_FLAG column value is 1 is running; a value of 0 means it is stopped.
mach> SELECT collector_name, run_flag FROM m$sys_collectors; collector_name run_flag --------------------------------------------------------- SYSLOG_TEST 0 [1] row(s) selected. mach> SELECT * FROM m$sys_collectors; COLLECTOR_ID MANAGER_NAME COLLECTOR_NAME ----------------------------------------------------------------------------------------------------- LOG_TYPE TABLE_NAME --------------------------------------------------------------------------------------- TEMPLATE_NAME COLLECT_TYPE ------------------------------------------------------------------------------------------------------------------------------- COLLECTOR_SOURCE ------------------------------------------------------------------------------------ COLLECTOR_LIB COL_COUNT ------------------------------------------------------------------------------------------------- PREPROCESS_PATH ------------------------------------------------------------------------------------ REGEX_PATH ------------------------------------------------------------------------------------ REGEX ------------------------------------------------------------------------------------ END_REGEX ------------------------------------------------------------------------------------ DEFAULT_ADDR LANGUAGE ----------------------------------------------------------------------------------------------------------------------- SLEEP_TIME DB_ADDR DB_PORT DB_USER ----------------------------------------------------------------------------------------------------------------- DB_PASS RUN_FLAG --------------------------------------------------------- 1 LOCALHOST SYSLOG_TEST syslog syslogtable /home/mach/mach_collector_home/collector/syslog.tpl FILE /var/log/syslog NULL 7 NULL syslog.rgx (([a-zA-Z]+)\s+([0-9]+)\s+([0-9:]*))\s(\S+)\s+([^\n]+) \n 192.168.122.1 UTF-8 1000 127.0.0.1 5656 SYS MANAGER 0 [1] row(s) selected.
Run Collector
To start a registered collector, use the ALTER COLLECTOR statement.
ALTER COLLECTOR manager_name.collector_name START [TRACE];
- manager_name : Name of the registered collector manager
- collector_name: The name of the collector to execute.
If an error occurs while running the collector, refer to the $MACHBASE_COLLECTOR_HOME/trc/machcollector.trc file for troubleshooting.
mach> ALTER COLLECTOR localhost.syslog_test START;
Altered successfully.
mach> SELECT collector_name, run_flag FROM m$sys_collectors;
collector_name                            run_flag
---------------------------------------------------------
SYSLOG_TEST                               1
[1] row(s) selected.
When you start the collector with the ALTER COLLECTOR statement, you can see that the value of the RUN_FLAG column has changed to 1.
When you start the collector, a log table in which the collected data is stored is created on the database server. The values of collector_type, collector_addr, collector_origin, and collector_offset are set to default values. The tm, host, and msg columns defined in the syslog.rgx file, which syslog.tpl refers to, are also created.
When you execute a query using machsql, make sure that machsql is connected to a running Machbase server. If the Machbase server and the collector are installed on different machines, the query may not execute normally when machsql is connected to the machine that hosts the collector rather than the server.
When the collector is started, it reads the position of the last data entered and resumes processing from that point.
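The idea of continuing from the last entered position can be sketched with a stored byte offset. This is an illustration of the general technique only and makes no claim about how the collector is implemented; the function `read_new_lines` and the temporary file are hypothetical.

```python
import tempfile
from pathlib import Path


def read_new_lines(log: Path, offset: int) -> tuple[list[str], int]:
    """Return the complete lines added after `offset` and the new offset."""
    with log.open("rb") as handle:
        handle.seek(offset)
        data = handle.read()
    # Consume only up to the last newline so a half-written line is retried later.
    end = data.rfind(b"\n") + 1
    lines = data[:end].decode("utf-8").splitlines()
    return lines, offset + end


with tempfile.TemporaryDirectory() as tmp:
    log = Path(tmp) / "syslog"
    log.write_bytes(b"first\nsecond\n")
    first_batch, offset = read_new_lines(log, 0)
    with log.open("ab") as handle:
        handle.write(b"third\npartial")
    second_batch, offset = read_new_lines(log, offset)

print(first_batch, second_batch, offset)
```

Because the offset only advances past complete lines, restarting the reader neither duplicates nor skips records in this sketch.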
Data Check
Below, the last 10 lines of the syslog file are compared with the data entered into the database.
[mach@localhost ~/mach]$ tail -n 10 /var/log/syslog
Jun 28 21:05:01 localhost CROND[12285]: (root) CMD (LANG=C LC_ALL=C /usr/bin/mrtg /etc/mrtg/mrtg.cfg --lock-file /var/lock/mrtg/mrtg_l --confcache-file /var/lib/mrtg/mrtg.ok)
Jun 28 21:10:01 localhost CROND[12442]: (root) CMD (LANG=C LC_ALL=C /usr/bin/mrtg /etc/mrtg/mrtg.cfg --lock-file /var/lock/mrtg/mrtg_l --confcache-file /var/lib/mrtg/mrtg.ok)
Jun 28 21:10:01 localhost CROND[12443]: (root) CMD (/usr/lib64/sa/sa1 1 1)
Jun 28 21:15:01 localhost CROND[12527]: (root) CMD (LANG=C LC_ALL=C /usr/bin/mrtg /etc/mrtg/mrtg.cfg --lock-file /var/lock/mrtg/mrtg_l --confcache-file /var/lib/mrtg/mrtg.ok)
Jun 28 21:20:01 localhost CROND[12609]: (root) CMD (/usr/lib64/sa/sa1 1 1)
Jun 28 21:20:01 localhost CROND[12608]: (root) CMD (LANG=C LC_ALL=C /usr/bin/mrtg /etc/mrtg/mrtg.cfg --lock-file /var/lock/mrtg/mrtg_l --confcache-file /var/lib/mrtg/mrtg.ok)
Jun 28 21:25:01 localhost CROND[12707]: (root) CMD (LANG=C LC_ALL=C /usr/bin/mrtg /etc/mrtg/mrtg.cfg --lock-file /var/lock/mrtg/mrtg_l --confcache-file /var/lib/mrtg/mrtg.ok)
Jun 28 21:25:01 localhost CROND[12708]: (pcp) CMD ( /usr/libexec/pcp/bin/pmlogger_check -C)
Jun 28 21:25:43 localhost su: pam_unix(su:session): session opened for user root by mach(uid=506)
Jun 28 21:26:02 localhost su: pam_unix(su:session): session closed for user root
The following are the last 10 records entered into the Machbase server.
mach> SELECT tm, msg FROM syslogtable LIMIT 10; tm msg --------------------------------------------------------------------------------------------------------------------- 2016-06-28 21:26:02 000:000:000 su: pam_unix(su:session): session closed for user root 2016-06-28 21:25:43 000:000:000 su: pam_unix(su:session): session opened for user root by mach(uid=506) 2016-06-28 21:25:01 000:000:000 CROND[12708]: (pcp) CMD ( /usr/libexec/pcp/bin/pmlogger_check -C) 2016-06-28 21:25:01 000:000:000 CROND[12707]: (root) CMD (LANG=C LC_ALL=C /usr/bin/mrtg /etc/mrtg/mrtg.cfg --loc k-file /var/lock/mrtg/mrtg_l --confcache-file /var/lib/mrtg/mrtg.ok) 2016-06-28 21:20:01 000:000:000 CROND[12608]: (root) CMD (LANG=C LC_ALL=C /usr/bin/mrtg /etc/mrtg/mrtg.cfg --loc k-file /var/lock/mrtg/mrtg_l --confcache-file /var/lib/mrtg/mrtg.ok) 2016-06-28 21:20:01 000:000:000 CROND[12609]: (root) CMD (/usr/lib64/sa/sa1 1 1) 2016-06-28 21:15:01 000:000:000 CROND[12527]: (root) CMD (LANG=C LC_ALL=C /usr/bin/mrtg /etc/mrtg/mrtg.cfg --loc k-file /var/lock/mrtg/mrtg_l --confcache-file /var/lib/mrtg/mrtg.ok) 2016-06-28 21:10:01 000:000:000 CROND[12443]: (root) CMD (/usr/lib64/sa/sa1 1 1) 2016-06-28 21:10:01 000:000:000 CROND[12442]: (root) CMD (LANG=C LC_ALL=C /usr/bin/mrtg /etc/mrtg/mrtg.cfg --loc k-file /var/lock/mrtg/mrtg_l --confcache-file /var/lib/mrtg/mrtg.ok) 2016-06-28 21:05:01 000:000:000 CROND[12285]: (root) CMD (LANG=C LC_ALL=C /usr/bin/mrtg /etc/mrtg/mrtg.cfg --loc k-file /var/lock/mrtg/mrtg_l --confcache-file /var/lib/mrtg/mrtg.ok) [10] row(s) selected.
You can check whether the collector is running with the following query.
mach> SELECT collector_name, run_flag FROM m$sys_collectors;
collector_name                            run_flag
---------------------------------------------------------
SYSLOG_TEST                               1
[1] row(s) selected.
Stop Collector
You can stop the collector with the following command:
ALTER COLLECTOR manager_name.collector_name STOP;
mach> ALTER COLLECTOR localhost.syslog_test STOP;
Altered successfully.
Drop Collector
DROP COLLECTOR manager_name.collector_name;
mach> DROP COLLECTOR localhost.syslog_test;
Dropped successfully.
Whether the collector is dropped can be confirmed by the following query.
mach> SELECT collector_name, run_flag FROM m$sys_collectors;
collector_name                            run_flag
---------------------------------------------------------
[0] row(s) selected.
Update Collector
ALTER COLLECTOR manager_name.collector_name RELOAD;
Use this statement to apply changes made to the template file after the collector was created. The contents of the template file at the time of execution are applied. In the following example, the table name in the template has been changed from its original value to "anothertable".
mach> ALTER COLLECTOR localhost.custom RELOAD; Altered successfully. mach> SELECT * FROM m$sys_collectors; COLLECTOR_ID MANAGER_NAME COLLECTOR_NAME ----------------------------------------------------------------------------------------------------- LOG_TYPE TABLE_NAME --------------------------------------------------------------------------------------- TEMPLATE_NAME COLLECT_TYPE ------------------------------------------------------------------------------------------------------------------------------- COLLECTOR_SOURCE ------------------------------------------------------------------------------------ COLLECTOR_LIB COL_COUNT ------------------------------------------------------------------------------------------------- PREPROCESS_PATH ------------------------------------------------------------------------------------ REGEX_PATH ------------------------------------------------------------------------------------ REGEX ------------------------------------------------------------------------------------ END_REGEX LANGUAGE ----------------------------------------------------------------------------------------------------------------------- SLEEP_TIME DB_ADDR DB_PORT DB_USER ----------------------------------------------------------------------------------------------------------------- DB_PASS PROCESS_BYTE PROCESS_RECORD RUN_FLAG ----------------------------------------------------------------------------------------------------- 4 LOCALHOST CUSTOM syslog anothertable syslog.tpl FILE /var/log/syslog NULL 7 NULL syslog.rgx (([a-zA-Z]+)\s+([0-9]+)\s+([0-9:]*))\s(\S+)\s+([^\n]+) \n UTF-8 1000 127.0.0.1 5656 SYS MANAGER 0 0 0 [1] row(s) selected.
When you look up the meta table, you can see that the input table has been changed to anothertable.
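The start, stop, reload, and drop statements described in this section can be generated from a script when several collectors are managed together. The following is a minimal sketch that only builds the statement text for machsql. The helper names `alter_collector_sql` and `drop_collector_sql` are hypothetical, and restricting TRACE to START is an assumption based on the syntax shown above.

```python
ALTER_ACTIONS = ("START", "STOP", "RELOAD")


def alter_collector_sql(manager: str, collector: str, action: str, trace: bool = False) -> str:
    """Build an ALTER COLLECTOR statement (hypothetical helper)."""
    action = action.upper()
    if action not in ALTER_ACTIONS:
        raise ValueError(f"unsupported action: {action}")
    if trace and action != "START":
        # The syntax in this section shows TRACE only after START.
        raise ValueError("TRACE is only valid with START")
    suffix = " TRACE" if trace else ""
    return f"ALTER COLLECTOR {manager}.{collector} {action}{suffix};"


def drop_collector_sql(manager: str, collector: str) -> str:
    """Build a DROP COLLECTOR statement (hypothetical helper)."""
    return f"DROP COLLECTOR {manager}.{collector};"


script = [
    alter_collector_sql("localhost", "syslog_test", "start"),
    alter_collector_sql("localhost", "syslog_test", "stop"),
    alter_collector_sql("localhost", "syslog_test", "reload"),
    drop_collector_sql("localhost", "syslog_test"),
]
print("\n".join(script))
```

Each generated line corresponds to one of the statements shown in the Run, Stop, Update, and Drop examples.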