Support variable time unit in Tag Table Rollup

In the previous version, the default rollup operation for the tag table was performed in units of 1 second, 1 minute, and 1 hour.

However, there was a problem of using unnecessary resources when data came in at an interval of more than a second as the rollup time unit could not be changed.

In Machbase 7.0, the rollup's time unit setting function allows users to create the rollup as they want and to set the execution cycle as desired.

The syntax for creating Rollup is as follows.

syntax

-- Create
CREATE ROLLUP rollup_name FROM tag_table_name INTERVAL timesepc;
   
timespec : integer time_unit
time_unit : SEC, MIN, HOUR 
 
-- Start
EXEC ROLLUP_START(rollup_name);
 
-- Stop
EXEC ROLLUP_STOP(rollup_name);
 
-- Delete
DROP ROLLUP rollup_name;

example

CREATE TAG TABLE tag (name VARCHAR(20) PRIMARY KEY, time DATETIME BASETIME, value DOUBLE SUMMARIZED);
 
CREATE ROLLUP rollup_30_sec FROM tag INTERVAL 30 SEC; 
CREATE ROLLUP rollup_10_min FROM rollup_30_sec INTERVAL 10 MIN; 
CREATE ROLLUP rollup_1_hour FROM rollup_10_min INTERVAL 1 HOUR; 
 
EXEC ROLLUP_START(rollup_30_sec);
EXEC ROLLUP_START(rollup_10_min);
EXEC ROLLUP_START(rollup_1_hour);
 
..
..
 
EXEC ROLLUP_STOP(rollup_1_hour);
EXEC ROLLUP_STOP(rollup_10_min);
EXEC ROLLUP_STOP(rollup_1_sec);
 
DROP ROLLUP rollup_1_hour;
DROP ROLLUP rollup_10_min;
DROP ROLLUP rollup_30_sec;

Allow Multiple Tag Tables

The problem that only one tag table could be created in the previous version was solved.

By supporting multiple tag tables, it became possible to effectively store incoming PLC data in various schema types.

Table names can also be freely designated and can be created as a general user other than 'SYS'.

example

CREATE TAG TABLE tag (tagid VARCHAR(50) PRIMARY KEY, time DATETIME BASETIME, value DOUBLE SUMMARIZED);
CREATE TAG TABLE imotbl (tagid VARCHAR(20) PRIMARY KEY, time DATETIME BASETIME, value DOUBLE SUMMARIZED, imo INTEGER, machine VARCHAR(10)) TAG_PARTITION_COUNT = 1, TAG_DATA_PART_SIZE = 20000000;
CREATE TAG TABLE shi985 (tagid VARCHAR(20) PRIMARY KEY, time DATETIME BASETIME, value DOUBLE SUMMARIZED, level INTEGER, itime DATETIME, strval VARCHAR(256)) TAG_PARTITION_COUNT = 2;
 
 
Mach> show tables;
USER_NAME             DB_NAME                                             TABLE_NAME                                          TABLE_TYPE  
-----------------------------------------------------------------------------------------------------------------------------------------------
SYS                   MACHBASEDB                                          IMOTBL                                              TAGDATA     
SYS                   MACHBASEDB                                          SHI985                                              TAGDATA     
SYS                   MACHBASEDB                                          TAG                                                 TAGDATA     
SYS                   MACHBASEDB                                          _IMOTBL_DATA_0                                      KEYVALUE    
SYS                   MACHBASEDB                                          _IMOTBL_META                                        LOOKUP      
SYS                   MACHBASEDB                                          _SHI985_DATA_0                                      KEYVALUE    
SYS                   MACHBASEDB                                          _SHI985_DATA_1                                      KEYVALUE    
SYS                   MACHBASEDB                                          _SHI985_META                                        LOOKUP      
SYS                   MACHBASEDB                                          _TAG_DATA_0                                         KEYVALUE    
SYS                   MACHBASEDB                                          _TAG_DATA_1                                         KEYVALUE    
SYS                   MACHBASEDB                                          _TAG_DATA_2                                         KEYVALUE    
SYS                   MACHBASEDB                                          _TAG_DATA_3                                         KEYVALUE    
SYS                   MACHBASEDB                                          _TAG_META                                           LOOKUP      
[13] row(s) selected.

Change The TAG Data Storage Structure

To reduce disk usage, a data compression method (delta compress) is used and the data structure was changed to store in column units.

As a result of the data usage comparison test compared to the previous version, the disk usage was reduced by up to 37%.

Test Environment

Disk Usage Change

Decrease Rate

Tag 10 thousands

RowCount: 1 billion

Rollup(O)

11,105 MB -> 6,907 MB

37.80% reduced

Tag 10 thousands

RowCount: 1 billion

Rollup(O)

11,007 MB -> 8,987 MB

18.35% reduced

Tag 10 thousands

RowCount: 1 billion

Rollup(O)

14,836 MB -> 12,752 MB

14.05% reduced

Reduce Tag Table Index Memory Usage

When creating the TAG table, there is a problem that the memory cannot be used efficiently because the memory for the index build is fixed and excessively allocated.

It is possible to create and operate multiple TAG tables, by changing the index memory to a method of dynamically allocating only the required amount of memory at the required time.

In 7.0, the maximum memory usage was reduced by about 40% compared to the previous version in the following test environment.

- Test environment: 1 tag table (4 partitions), 10,000 tags / 1 billion data performance test, PROCESS_MAX_SIZE 4GB

As a result, when PROCESS_MAX_SIZE is set to 16 GB, only two TAG tables (based on 4 PARTITIONs) could be created in previous version.

But more than 10 tag tables can be created and used at version 7 in same conditions.

Memory setting guide according to the operating environment

Maximum number of input per second: Sum of data input per second of all append applications when the append application operates as many as the number of TAG tables

(Operates one append application per table)

Operating Environment				Minimum Setting for System				Test Environment
Number of Tag Tables	Tag number per Table	Maximum append per second	Session Number (include append)	TAG_DATA_ PART_SIZE	TAG_PARTITION _COUNT	PROCESS_MAX_ SIZE	TAG_CACHE_MAX_ MEMORY_SIZE	CPU	MEM	DISK
1	100,000	1,200,000 / sec	4	16 M	1	2 GB	128 MB	16 cores	64 GB	SSD 1 TB
1	100,000	1,000,000 / sec	4	4 M	1	1 GB	32 MB	16 cores	64 GB	SSD 1 TB
1	100,000	800,000 / sec	4	1 M	1	512 MB	8 MB	16 cores	64 GB	SSD 1 TB
8	100,000	4,000,000 / sec	11	16 M	1	8 GB	256 MB	16 cores	64 GB	SSD 1 TB
32	40,000	6,000,000 / sec	35	16 M	1	16 GB	1024 MB	16 cores	64 GB	SSD 1 TB
32	40,000	4,000,000 / sec	35	4 M	1	8 GB	256 MB	16 cores	64 GB	SSD 1 TB
64	20,000	4,000,000 / sec	67	4 M	1	8 GB	512 MB	16 cores	64 GB	SSD 1 TB
64	20,000	1,000,000 / sec	67	1 M	1	4 GB	128 MB	16 cores	64 GB	SSD 1 TB
128	10,000	4,000,000 / sec	131	4 M	1	8 GB	1024 MB	16 cores	64 GB	SSD 1 TB
128	10,000	1,000,000 / sec	131	1 M	1	4 GB	256 MB	16 cores	64 GB	SSD 1 TB

The TAG_CACHE_MAX_MEMORY_SIZE value should be increased in proportion to the number of TAG tables, and if TAG_CACHE_MAX_MEMORY_SIZE is not sufficient when creating the TAG table, table creation may fail.

Calculating Proper TAG_CACHE_MAX_MEMORY_SIZE and Condition of success creating Table

If there is 4 TAG Partition and size of Partition Size is 16MB,

TAG_CACHE_MAX_MEMORY_SIZE should be at least 128MB.

128 MB = 4 * 16MB * 2

Create / Select of Statistic Data by TAG ID

There is a problem that the response time is too long when searching for statistical values by tag.

To improve this, statistics for each tag are configured in advance, and a search function is provided in the form of a view for convenience.

In the previous version, the response time was long due to data scan when selecting about TAG statistics.

But in 7.0, the response time was greatly reduced by configuring the necessary statistical information in advance.

Information of Tag Statistic

column	description
USER_ID	tag table user ID
TABLE_NAME	tag table name
TAG_ID	Tag ID of Statistic Information (From the user's point of view, Name is returned)
MIN_TIME	Min time value inserted until now (regardless to input order)
MAX_TIME	Max time value inserted until now (regardless to input order)
MIN_VALUE	Min Value for the matched tag's Summarized column
MIN_VALUE_TIME	Time value that is inserted with min_value
MAX_VALUE	Max Value for the matched tag's Summarized column
MAX_VALUE_TIME	Time value that is inserted with max_value
ROW_COUNT	Numbers of row maches to the Tag ID
RECENT_ROW_TIME	Column value of the Last inserted row

query

Whenever a tag table is created, a view that stores statistical information is created together. When you want to inquire statistical information, it is searched from view.

view table name : "V$name_STAT" . (name is tag table name)

-- ex) 
SELECT min_time, max_time FROM v$tag_stat WHERE tagid = 'tag-01';
 
-- ex) multi tag
SELECT min_time, max_time FROM v$tag_stat WHERE tagid IN ('tag-01', 'tag-02', 'tag-03');

If user wants to know value inserted in min_time (or max_time), user can inquire as follow.

SELECT value FROM tag WHERE tagid = 'tag-01' AND time = (SELECT min_time FROM v$tag_stat WHERE tagid = 'tag-01')

Support JSON Type

support background

Recently, equipment for transmitting TEXT type data is also being used.

If the JSON type is supported, not only the TEXT type but also any data type can be transmitted.

Therefore, since the number or structure of columns can be irregular, the effect of very high scalability can be expected.

Advantages

Since there is no need to add a separate extension column, the schema constraint disappears.
Users can input arbitrary data types as they want.
Due to the characteristics of JSON data, usability and convenience are high, and there is no need to separate data.

Data range

JSON data length : 1 ~ 32768 (32K)
JSON path length : 1 ~ 512

Creating data

User can specify a JSON-type column using the JSON keyword.

CREATE TABLE jsontbl (name VARCHAR(20), jval JSON);

Inserting data

You can insert data by entering TEXT that matches the JSON format.
If it does not match the JSON format, ERROR is printed.

-- Single
INSERT INTO jsontbl VALUES("name1", '{"name":"test1"}');
-- Multi
INSERT INTO jsontbl VALUES("name2", '{"name":"test2", "value":123}');
-- Nested
INSERT INTO jsontbl VALUES("name3", '{"name":{"class1": "test3"}}');
-- Array
INSERT INTO jsontbl VALUES("name4", '{"myarray": [1, 2, 3, 4]}');

Selecting data

You can partially extract JSON data by using JSON-related functions or by using Operator (->).

-- Using JSON-related function
SELECT name, JSON_EXTRACT(jval, '$.name') FROM jsontbl;
SELECT name, JSON_EXTRACT_INTEGER(jval, '$.myarray') FROM jsontbl;
SELECT name, JSON_TYPEOF(jval, '$.name.class1') FROM jsontbl;

-- Using Operator(->)
SELECT name, jval->'$.name' FROM jsontbl;
SELECT name, jval->'$.myarray' FROM jsontbl;
SELECT name, jval->'$.name.class1' FROM jsontbl;

JSON-related function

Function name	Description	Note
JSON_EXTRACT(JSON column name, 'json path')	Returns the value in string type. (If the object does not exist, ERROR is returned.)	JSON Object or Array Type : Tranform every object to String and Return. String : Return as it is. Numerial Type : Transform to string and return boolean Type : Return "True" or "False"
JSON_EXTRACT_STRING(JSON column name, 'json path')	Returns the value in string type. (If the object does not exist, NULL is returned.) Same results as operator(->).	JSON Object or Array Type : Tranform every object to String and Return. String : Return as it is. Numerial Type : Transform to string and return boolean Type : Return "True" or "False"
JSON_EXTRACT_INTEGER(JSON column name, 'json path')	Returns the value in 64 bits integer type. (If the object does not exist, NULL is returned.)	JSON Object or Array Type : Return NULL String : Transform and return, if fails, return NULL Numerial Type : Return 64-bits integer number boolean Type : Return 1 for "True" , 0 for "False"
JSON_EXTRACT_DOUBLE(JSON column name, 'json path')	Returns the value in floating-point 64 bits double type. (If the object does not exist, NULL is returned.)	JSON Object or Array Type : Return NULL String : Transform and return, if fails, return NULL Numerial Type : Return 64-bits real number boolean Type : Return 1.0 for "True" , 0.0 for "False"
JSON_TYPEOF(JSON column name, 'json path')	Return the type of value.	None : Key doesn't exists. Object : Objeck type Integer : Integer Type Real : Real number type String : String Type True/False : Boolean Array : Array type Null : NULL
JSON_IS_VALID('json string')	Check if json string is valid in json format.	0 : False 1 : True

Machbase 7 Manual (English)

7.0 New Features

Analytics

Support variable time unit in Tag Table Rollup

목차

Allow Multiple Tag Tables

Change The TAG Data Storage Structure

Reduce Tag Table Index Memory Usage

Create / Select of Statistic Data by TAG ID

Support JSON Type

Related content