Collector
In the figure above, the Collectors on Node-2 and Node-3 gather data and send it to Node-1, where the database server is installed.
- On Node-2 and Node-3, each Collector runs as a separate process that reads a specific log file and sends its data.
- Each Collector process uses a given tpl file to obtain detailed information about the log data.
- A Collector manager is installed on each node, where it manages and monitors the Collector processes running on that node.
Characteristics
The main features of the Machbase Collector are described below.
Consistent Interface
Machbase requires no programs other than SQL-based commands to run the Collector. Use the following commands to manage and monitor the Collector.
CREATE COLLECTOR MANAGER LOCALHOST AT '127.0.0.1:9999';
CREATE COLLECTOR LOCALHOST.MYADP FROM 'syslog.tpl';
ALTER COLLECTOR LOCALHOST.MYADP START;
Improved Data Collection Performance
Machbase Collector runs a separate Collector process for each type of log data, so each process can handle its own log file at high speed.
Because each log file is handled by its own process, processing of one log file is not affected by processing of the others.
Each Collector runs code optimized for its log type and inputs data over a dedicated protocol that minimizes resource usage, which yields the best performance.
Collection Method
Collectors can collect log data in several ways. The collection method is set by modifying the tpl file. The following collection methods are supported.
Method Name | Description |
---|---|
FILE | Collects files from the local host. |
SFTP | Collects files from a remote host. |
SOCKET | Collects data arriving on a port. |
ODBC | Collects data from other databases. |
Log Data Types
Machbase Collector provides regular expressions for various types of log data.
You can modify an existing regular expression to analyze other log files. The following log types are currently supported.
Regular Expression File Name | Supported Type | Default Data Location (can be modified) |
---|---|---|
machbase.rgx | Machbase trace log | $MACHBASE_HOME/trc/machbase.trc |
apache_access.rgx | Apache web server access file | /var/log/apache2/access.log |
apache_error.rgx | Apache web server error file | /var/log/apache2/error.log |
syslog.rgx | syslog file | /var/log/syslog |
custom.rgx | Custom type | Custom file |
Easily Supports Custom Logs
Machbase Collector can process various kinds of log files that can be represented as regular expressions.
Even if you do not have a log file, you can test sample log messages and regular expressions using machregex.
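As a rough illustration of testing a regular expression against a sample log message, the sketch below uses Python's standard `re` module rather than machregex itself. The pattern, the field names, and the sample line are assumptions made for this example; they are not taken from syslog.rgx.

```python
import re

# Hypothetical pattern for a classic syslog line (assumption, not syslog.rgx).
SYSLOG_PATTERN = re.compile(
    r"^(?P<timestamp>[A-Z][a-z]{2}\s+\d{1,2}\s\d{2}:\d{2}:\d{2})\s"
    r"(?P<host>\S+)\s"
    r"(?P<process>[^\s\[:]+)(?:\[(?P<pid>\d+)\])?:\s"
    r"(?P<message>.*)$"
)


def parse_line(line):
    """Return the named fields as a dict, or None if the line does not match."""
    match = SYSLOG_PATTERN.match(line)
    return match.groupdict() if match else None


# A made-up sample message, so no real log file is needed for the test.
sample = "Mar  3 14:07:01 web01 sshd[2214]: Accepted publickey for deploy"
fields = parse_line(sample)
print(fields)
```

Trying the pattern on a few representative lines, including ones that should not match, is a quick way to catch an expression that is too strict or too loose before it is used for collection.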
Prevents Data Loss in the Event of a Failure
Machbase Collector can correctly retransmit data that failed to be sent during a failure once the failure has been resolved.
When a failure occurs, the Collector records the last position it sent to the server; after the fault is cleared, it resends the data from that position.
As a result, the data reaches the server without loss, and you do not need to write any additional operations or code to recover from the failure.
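The resume-from-last-position idea can be sketched as follows. This is a minimal illustration and not the Collector's actual implementation: the function `send_new_lines`, the offset file, and the simulated `send` callback are all hypothetical names introduced here.

```python
import os
import tempfile


def send_new_lines(log_path, offset_path, send):
    """Send lines added since the last checkpoint; return how many were sent."""
    offset = 0
    if os.path.exists(offset_path):
        with open(offset_path) as f:
            offset = int(f.read().strip() or 0)

    sent = 0
    with open(log_path, "rb") as log:
        log.seek(offset)
        while True:
            line = log.readline()
            if not line:
                break
            send(line.decode("utf-8").rstrip("\r\n"))  # may raise on failure
            # Record the position only after the line was sent successfully.
            with open(offset_path, "w") as f:
                f.write(str(log.tell()))
            sent += 1
    return sent


# Demo with a simulated server that goes down after receiving two lines.
workdir = tempfile.mkdtemp()
log_path = os.path.join(workdir, "app.log")
offset_path = os.path.join(workdir, "app.offset")
with open(log_path, "wb") as f:
    f.write(b"line 1\nline 2\nline 3\nline 4\n")

received = []
server_up = {"value": True}


def send(line):
    if not server_up["value"]:
        raise ConnectionError("server unavailable")
    received.append(line)
    if len(received) == 2:
        server_up["value"] = False  # simulated failure after two lines


try:
    send_new_lines(log_path, offset_path, send)
except ConnectionError:
    pass  # the failure leaves the checkpoint at the end of "line 2"

server_up["value"] = True  # fault resolved
resent = send_new_lines(log_path, offset_path, send)
print(received, resent)
```

Because the position is written only after a successful send, a line that fails in transit is never skipped: the next run starts again from that line.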
Ensures High Availability
To ensure high availability of services, multiple Collectors can operate simultaneously on the same data source, each sending data to a different Machbase server.
In this way, even if one Machbase server fails, the same data continues to be stored on another server, so the service can continue.
After the error is resolved and the server is restarted, the Collector correctly retransmits the log data that was not sent, so the data is replicated automatically and high availability is maintained.
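The following in-memory simulation sketches this arrangement under simplifying assumptions: two collectors read the same source, each forwards to its own server, and each keeps its own send position. `FakeServer` and `Collector` are hypothetical classes written for this example and do not represent Machbase components.

```python
class FakeServer:
    """Stand-in for a database server that can be switched off."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.rows = []

    def append(self, row):
        if not self.up:
            raise ConnectionError(f"{self.name} is down")
        self.rows.append(row)


class Collector:
    """Reads one shared source and forwards it to exactly one server."""

    def __init__(self, source, server):
        self.source = source
        self.server = server
        self.position = 0  # index of the next record to send

    def run_once(self):
        while self.position < len(self.source):
            try:
                self.server.append(self.source[self.position])
            except ConnectionError:
                return  # keep the position and retry on the next run
            self.position += 1


source = ["log 1", "log 2"]
primary, standby = FakeServer("primary"), FakeServer("standby")
collectors = [Collector(source, primary), Collector(source, standby)]

for collector in collectors:
    collector.run_once()

# One server fails while new log records keep arriving.
primary.up = False
source.extend(["log 3", "log 4"])
for collector in collectors:
    collector.run_once()
during_outage = (len(primary.rows), len(standby.rows))

# After the server restarts, its collector catches up from its own position.
primary.up = True
for collector in collectors:
    collector.run_once()

print(during_outage, primary.rows == standby.rows)
```

Keeping a separate position per collector is what lets the restarted server receive exactly the records it missed, without affecting the server that stayed up.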
Integrated Monitoring Through MWA
The Machbase Collector manager synchronizes the Collector's execution information to the Machbase server.
This makes integrated monitoring possible through MWA (Machbase Web Admin).
With MWA, you can monitor in real time the status of each running Collector and the status of the server on which the Collector runs.
Log Pre-Processing Using Python Script
You can write a Python script that pre-processes log data before the Collector inputs it to the server.
The script can filter out unnecessary data so that it is not input, or modify the parsed data.
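A pre-processing step of this kind might look like the sketch below. The function name `preprocess`, the dictionary layout of a parsed record, and the convention of returning `None` to skip a record are assumptions for illustration only; they are not the interface the Collector actually expects from a script.

```python
def preprocess(record):
    """Return the record to store, or None to skip it (hypothetical contract)."""
    level = record["level"].upper()
    if level == "DEBUG":
        return None  # unnecessary data: do not input it

    # Change the parsed data: normalize the level and trim the message.
    cleaned = dict(record)
    cleaned["level"] = level
    cleaned["message"] = cleaned["message"].strip()
    return cleaned


# Made-up parsed records standing in for the Collector's parsing result.
parsed = [
    {"level": "debug", "message": "cache probe "},
    {"level": "error", "message": "  disk full  "},
    {"level": "INFO", "message": "started"},
]

stored = [r for r in map(preprocess, parsed) if r is not None]
print(stored)
```

Returning a modified copy instead of editing the record in place keeps the original parsed data available if a later step needs it.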