Multilingual Text Search
ESEARCH
REGEXP
LIKE
The SEARCH keyword performs text search using a keyword index. Because it looks up a specific string pattern at a specific time through a "reverse index" (inverted index), the search costs far less than the LIKE syntax of a general database. A keyword index can be used on variable-length string columns of the varchar and text types. The search terms and the indexed terms must match exactly: Machbase does not perform morphological analysis and splits text into keywords at special characters.
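To make the exact-match behavior concrete, the following Python sketch models a keyword index as an inverted index. It illustrates the general technique and is not Machbase's implementation: the helper names (`tokenize`, `build_index`, `search`), the sample records, and the rule that every non-alphanumeric character acts as a delimiter are assumptions made for this example.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into keywords at every non-alphanumeric character.

    Assumption: the real delimiter set used by Machbase may differ.
    """
    return [t for t in re.split(r"[^0-9A-Za-z]+", text) if t]


def build_index(records):
    """Map each keyword to the set of record ids that contain it."""
    index = defaultdict(set)
    for rec_id, text in records.items():
        for token in tokenize(text):
            index[token].add(rec_id)
    return index


def search(index, pattern):
    """Return ids of records that contain every keyword of the pattern (AND)."""
    terms = tokenize(pattern)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical log records used only for illustration.
records = {
    1: "disk error: /dev/sda1 is full",
    2: "network errors detected",
    3: "disk check completed",
}
index = build_index(records)

print(search(index, "error"))       # only the record holding the exact keyword
print(search(index, "disk error"))  # both keywords must be present
print(search(index, "err"))         # partial keywords are not in the index
```

In this sketch, searching for `err` returns nothing and `error` does not match `errors`, because only whole keywords are stored. Each lookup touches the dictionary rather than every record, which is where the cost advantage over a LIKE scan comes from.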
Syntax:
SELECT column_name(s) FROM table_name WHERE column_name SEARCH pattern;
Example:
Multilingual Text Search
Machbase can search variable-length strings stored in UTF-8, covering both ASCII and multilingual (multi-byte) characters, that is, characters whose encoded bytes have the most significant bit set to 1. Because Machbase does not recognize the morphemes of multilingual words, it supports searching such text with a 2-gram technique.
Syntax:
SELECT column_name(s) FROM table_name WHERE column_name SEARCH pattern;
Example:
If the input data is "tax calculation", three words are saved: "tax", "calculation", and "-ion" are stored in the dictionary. Users therefore get adequate results when they search for "tax" or "tax calculation". Because Machbase combines search terms with an AND operation, the results remain relatively accurate even when the search term is longer than three characters.
For example, consider the input record "computer utilization guide" and the search term "computer". The dictionary contains "compu-", "-puter", "utilization", "gui-", and "-ide", and the search term takes the form "compu-" AND "-puter", so the corresponding record is retrieved.
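The 2-gram idea can be sketched in Python as follows. This is the generic n-gram technique under stated assumptions, not Machbase's exact tokenization: ASCII words are kept whole, each run of non-ASCII characters is cut into overlapping two-character pieces, and a search term is cut the same way and combined with AND. The function names and the Korean sample strings are hypothetical.

```python
import re
from collections import defaultdict


def bigrams(run):
    """Overlapping 2-character pieces of a run; a 1-character run is kept whole."""
    if len(run) < 2:
        return [run]
    return [run[i:i + 2] for i in range(len(run) - 1)]


def tokenize(text):
    """Keep ASCII words whole and split runs of non-ASCII characters into 2-grams."""
    tokens = []
    for run in re.findall(r"[0-9A-Za-z]+|[^\x00-\x7f]+", text):
        if run.isascii():
            tokens.append(run)
        else:
            tokens.extend(bigrams(run))
    return tokens


def build_index(records):
    """Map each token to the set of record ids that contain it."""
    index = defaultdict(set)
    for rec_id, text in records.items():
        for token in tokenize(text):
            index[token].add(rec_id)
    return index


def search(index, pattern):
    """AND together the tokens of the search term."""
    terms = tokenize(pattern)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical sample records.
records = {
    1: "세금계산 guide",
    2: "세금 신고",
    3: "계산기",
}
index = build_index(records)

print(tokenize("세금계산"))        # three overlapping 2-grams
print(search(index, "세금계산"))   # every 2-gram must be present
print(search(index, "계산"))       # a single 2-gram matches more records
```

Because a longer search term produces more 2-grams that must all be present, the AND combination narrows the result, which is why longer terms still give relatively accurate matches. Since only the presence of each 2-gram is checked, a record containing the same pieces in a different arrangement could also match.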
ESEARCH
The ESEARCH keyword enables extended searches on ASCII text. With this extension, you search for the desired pattern using the % character. In a LIKE operation, a pattern with a leading % forces every record to be checked, whereas ESEARCH can still find the word quickly. This feature is very useful when looking for part of an English string, such as an error string or code.
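One way to see why a wildcard search over a keyword index can beat a LIKE scan is the Python sketch below. It assumes, purely for illustration, that the % pattern is matched against the distinct keywords in the dictionary rather than against every record; how Machbase evaluates ESEARCH internally is not described here, and all names and sample records are hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into keywords at every non-alphanumeric character (assumed rule)."""
    return [t for t in re.split(r"[^0-9A-Za-z]+", text) if t]


def build_index(records):
    """Map each keyword to the set of record ids that contain it."""
    index = defaultdict(set)
    for rec_id, text in records.items():
        for token in tokenize(text):
            index[token].add(rec_id)
    return index


def wildcard_to_regex(pattern):
    """Translate a pattern with % wildcards into an anchored regular expression."""
    parts = (re.escape(p) for p in pattern.split("%"))
    return re.compile("^" + ".*".join(parts) + "$")


def esearch_sketch(index, pattern):
    """Match the pattern against distinct keywords, then collect their records."""
    matcher = wildcard_to_regex(pattern)
    hits = set()
    for keyword, rec_ids in index.items():  # iterates keywords, not records
        if matcher.match(keyword):
            hits |= rec_ids
    return hits


# Hypothetical log records.
records = {
    1: "ERR-1042 connection refused",
    2: "WARN-2001 slow query",
    3: "ERR-2001 timeout",
}
index = build_index(records)

print(esearch_sketch(index, "%fused"))  # leading wildcard
print(esearch_sketch(index, "time%"))   # trailing wildcard
print(esearch_sketch(index, "%2001"))   # part of an error code
```

The dictionary of distinct keywords is usually much smaller than the number of records, so even a pattern with a leading % avoids reading every row in this model.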
Syntax:
SELECT column_name(s) FROM table_name WHERE column_name ESEARCH pattern;
Example:
REGEXP
The REGEXP statement searches data using regular expressions. It is typically used to filter the values of a particular column by pattern. Keep in mind that the REGEXP clause cannot use an index, so you must reduce the search scope, and with it the overall search cost, by placing indexed conditions on other columns. To check for a specific pattern, first narrow the data through the index with SEARCH or ESEARCH, then apply REGEXP once the number of remaining records is small; this improves overall system efficiency.
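The narrowing strategy described above can be sketched in Python: an indexed keyword lookup selects the candidate records first, and the regular expression is evaluated only on those candidates. This is a conceptual model with hypothetical names and data, and it uses Python's `re` syntax, which may differ from the regular expression dialect that Machbase accepts.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into keywords at every non-alphanumeric character (assumed rule)."""
    return [t for t in re.split(r"[^0-9A-Za-z]+", text) if t]


def build_index(records):
    """Map each keyword to the set of record ids that contain it."""
    index = defaultdict(set)
    for rec_id, text in records.items():
        for token in tokenize(text):
            index[token].add(rec_id)
    return index


def search_then_regexp(records, index, keyword, regex):
    """Narrow by an indexed keyword first, then run the regex on the candidates."""
    candidates = index.get(keyword, set())  # cheap indexed lookup
    compiled = re.compile(regex)
    matches = sorted(rid for rid in candidates if compiled.search(records[rid]))
    return matches, len(candidates)


# Hypothetical log records.
records = {
    1: "login failed for user=alice code=401",
    2: "login ok for user=bob code=200",
    3: "upload failed code=500",
    4: "login failed for user=carol code=403",
}
index = build_index(records)

matches, evaluated = search_then_regexp(records, index, "failed", r"code=40[0-9]")
print(matches)    # records that pass both the keyword and the regex filter
print(evaluated)  # number of records the regex was actually run against
```

Here the regular expression is run against three candidate records instead of all four; on a large table the gap between the candidate set and the full table is what lowers the overall search cost.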
Example:
LIKE
You can use the LIKE statement in the same way as the standard SQL LIKE operator. LIKE also supports Korean, Chinese, and Japanese text.
Syntax:
SELECT column_name(s) FROM table_name WHERE column_name LIKE pattern;
Example: