Name

dt_mer_strsearch - initiate a string search

Generic Prototype

dt_mer_strsearch(dt_Handle, dt_Handle, dt_Integer, dt_Integer, dt_Integer,
dt_Integer, dt_String, dt_String) => dt_Integer

C Prototype

#include "dt_merlin.h"

dt_Integer dt_mer_strsearch(dt_Handle hitlist, dt_Handle column, dt_Integer searchtype, dt_Integer action, dt_Integer find_next, dt_Integer * status, dt_Integer s1len, dt_String s1, dt_Integer s2len, dt_String s2)

FORTRAN Prototype

include 'dt_f_merlin.inc'

integer*4 dt_f_mer_strsearch(hitlist, column, searchtype, action, find_next, status, s1, s2)

integer*4 hitlist
integer*4 column
integer*4 searchtype
integer*4 action
integer*4 find_next
integer*4 status
character*(*) s1
character*(*) s2

Description

Begins a string search task on the server.

The hitlist 'hitlist' specifies where the resulting hits will be placed and, depending on the value of 'action', may specify the subset of the database to be searched.

The 'column' specifies the data to be searched.

The 'searchtype' specifies the type of string search to be performed. Valid values are:

DX_STRING_ASCII
Compare strings using straight ASCII match. All characters must be the same for a match.
DX_STRING_NC
Compare strings, ignoring case differences. Equivalent to converting all characters to upper case and then performing an ASCII comparison.
DX_STRING_NP
Compare strings, ignoring punctuation. Equivalent to removing all characters not in the set of [A-Za-z0-9] and then performing an ASCII comparison.
DX_STRING_NW
Compare strings, ignoring whitespace. Equivalent to removing all space, tab, newline, and carriage-return (ASCII 32, 9, 10, and 13, respectively) characters and then performing an ASCII comparison.
DX_STRING_NCW
Combination of ignore case and ignore whitespace comparison.
DX_STRING_NCP
Combination of ignore case and ignore punctuation comparison.
DX_STRING_NPW
Combination of ignore punctuation and ignore whitespace comparison.
DX_STRING_NCPW
Combination of ignore case, punctuation, and whitespace comparison.
DX_STRING_REGEXP
Uses s1 as a regular expression (see regexp(1)). Range searches are not allowed for regular expressions.
DX_STRING_APPROX
Uses s1 as a regular expression, and uses s2 for control parameters for the approximate searching. The string s2 is scanned and the first instance of the following characters sets the attributes of the approximate search as follows:

[Mm] - Magic (regular expression meta-characters) used / not used.
[Cc] - Case is matched / ignored.
[Ss] - Substrings / whole words are matched.
[01234B] - Number of errors or Best match found.

The first case of each set listed above is the default if none of the characters in that set is found.
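
The control-string rules above can be sketched in C as follows. This is an illustration of one reading of those rules, not the server's implementation: the approx_opts structure, the APPROX_BEST marker, and both function names are hypothetical, and the sketch assumes that each attribute is set by the first character in 's2' that belongs to its set.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define APPROX_BEST (-1)   /* hypothetical marker for 'B' (best match) */

/* Hypothetical container for the four attributes; not a toolkit type. */
typedef struct {
    bool magic;        /* true: regular expression meta-characters used */
    bool match_case;   /* true: case is matched, false: case is ignored */
    bool substrings;   /* true: substrings match, false: whole words    */
    int  max_errors;   /* 0..4, or APPROX_BEST                          */
} approx_opts;

/* Return the first character of s that is in set, or '\0' if none is. */
static char first_of(const char *s, const char *set)
{
    return s[strcspn(s, set)];
}

/* Scan the control string and fill in the attributes.  Any set with no
 * character present keeps its default (the first case: M, C, S, 0). */
approx_opts parse_approx_controls(const char *s2)
{
    approx_opts o = { true, true, true, 0 };
    char c;

    if (s2 == NULL)   /* assumption: treat a missing string as empty */
        return o;
    if ((c = first_of(s2, "Mm")) != '\0') o.magic = (c == 'M');
    if ((c = first_of(s2, "Cc")) != '\0') o.match_case = (c == 'C');
    if ((c = first_of(s2, "Ss")) != '\0') o.substrings = (c == 'S');
    if ((c = first_of(s2, "01234B")) != '\0')
        o.max_errors = (c == 'B') ? APPROX_BEST : c - '0';
    return o;
}
```

Under this reading, a control string of "cs2" would request a case-insensitive, whole-word search allowing up to two errors, with meta-characters still in effect.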

The 'action' specifies how the results of the search are to be combined with the original hitlist, as follows:

DX_ACTION_NEW_LIST
The original hitlist is discarded. The entire pool is searched. All rows which meet the criteria are included in the resulting hitlist.
DX_ACTION_ADD_HITS
All rows not on the original list are searched and hits are added to the current list.
DX_ACTION_ADD_NONHITS
All rows not on the original list are searched and non-hits are added to the current list.
DX_ACTION_DEL_HITS
The original hitlist is searched and hits are removed from the hitlist.
DX_ACTION_DEL_NONHITS
The original hitlist is searched and non-hits are removed from the hitlist.
DX_ACTION_NEXT_HIT
The original hitlist is searched and, as soon as a hit is found, its hitlist index is returned. The hitlist is unchanged. Data in derived columns is modified, even though the hitlist is unchanged. The parameter 'find_next' indicates the position in the hitlist where the search is to begin; the first row examined is 'find_next' + 1.
DX_ACTION_NEXT_NONHIT
Like DX_ACTION_NEXT_HIT, except finds the next row which does not match the search criteria.

The parameter 'find_next' specifies where in the hitlist the search is to begin when the action is either DX_ACTION_NEXT_HIT or DX_ACTION_NEXT_NONHIT. A value of -1 indicates that the search should begin at the beginning of the hitlist. To continue a search from a previously-found hit, specify that hit's index (the value returned by the previous call to the search).
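
The effect of each action can be modelled with simple boolean arrays. The sketch below is a local simulation and makes no toolkit calls. The ACT_* names are hypothetical stand-ins for the DX_ACTION_* constants, both function names are invented for illustration, and the -1 "nothing found" result of next_match() is an assumption of the sketch, not documented behaviour. The first function models hitlist membership only, not row ordering.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the list-modifying DX_ACTION_* values. */
typedef enum {
    ACT_NEW_LIST, ACT_ADD_HITS, ACT_ADD_NONHITS,
    ACT_DEL_HITS, ACT_DEL_NONHITS
} action_t;

/* on_list[i]: pool row i is on the hitlist.  is_hit[i]: row i meets the
 * search criteria.  on_list is updated in place. */
void apply_action(action_t action, bool *on_list,
                  const bool *is_hit, size_t nrows)
{
    for (size_t i = 0; i < nrows; i++) {
        switch (action) {
        case ACT_NEW_LIST:    on_list[i] = is_hit[i];                break;
        case ACT_ADD_HITS:    on_list[i] = on_list[i] || is_hit[i];  break;
        case ACT_ADD_NONHITS: on_list[i] = on_list[i] || !is_hit[i]; break;
        case ACT_DEL_HITS:    on_list[i] = on_list[i] && !is_hit[i]; break;
        case ACT_DEL_NONHITS: on_list[i] = on_list[i] && is_hit[i];  break;
        }
    }
}

/* Model of NEXT_HIT (want_hit true) and NEXT_NONHIT (want_hit false).
 * Here is_hit[] is indexed by hitlist position.  The first position
 * examined is find_next + 1, so -1 starts at the beginning.  Returns the
 * position found, or -1 if none (a sentinel chosen for this sketch). */
long next_match(const bool *is_hit, long nhits, long find_next,
                bool want_hit)
{
    for (long i = find_next + 1; i < nhits; i++)
        if (is_hit[i] == want_hit)
            return i;
    return -1;
}
```

Passing each returned position back as 'find_next' walks through the matching rows one at a time, which mirrors the continuation rule described above.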

The values of 's1' and 's2' specify the type of search:

  s1 == query   s2 == NULL    row substring match.
  s1 == NULL    s2 == query   illegal parameters (error).
  s1 == query1  s2 == query1  exact match of entire row.
  s1 == query1  s2 == query2  range search between query1 
                              and query2 matching from  
                              the beginning of the row.
  s1 == ""      s2 == query   range search below query 
                              (all hits <= s2 )
                              matching from the beginning 
                              of the row.
  s1 == query   s2 == ""      range search above query 
                              (all hits >= s1 )
                              matching from the beginning 
                              of the row.
(where "" represents a valid string of length 0, NULL is the language-dependent invalid string, and query[12]? represents the string queries).
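
The table above amounts to a small decision procedure, sketched below in C. The search_mode enumeration and classify_query() are hypothetical names used only for illustration. The sketch uses NUL-terminated strings with a NULL pointer as the invalid string, whereas the C prototype passes explicit lengths. Two cases the table does not list are resolved by assumption: both strings NULL is treated as an error, and both strings empty falls through to the equal-strings case.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical labels for the rows of the s1/s2 table. */
typedef enum {
    MODE_ERROR,        /* illegal parameters                 */
    MODE_SUBSTRING,    /* row substring match                */
    MODE_EXACT,        /* exact match of entire row          */
    MODE_RANGE,        /* range search between s1 and s2     */
    MODE_RANGE_BELOW,  /* range search, all hits <= s2       */
    MODE_RANGE_ABOVE   /* range search, all hits >= s1       */
} search_mode;

search_mode classify_query(const char *s1, const char *s2)
{
    if (s1 == NULL)                       /* also covers both NULL (assumed) */
        return MODE_ERROR;
    if (s2 == NULL)
        return MODE_SUBSTRING;
    if (s1[0] == '\0' && s2[0] != '\0')   /* s1 == "", s2 == query */
        return MODE_RANGE_BELOW;
    if (s1[0] != '\0' && s2[0] == '\0')   /* s1 == query, s2 == "" */
        return MODE_RANGE_ABOVE;
    if (strcmp(s1, s2) == 0)              /* identical queries */
        return MODE_EXACT;
    return MODE_RANGE;                    /* two different queries */
}
```

For example, classify_query("abc", NULL) selects the substring row of the table, while classify_query("abc", "abc") selects the exact-match row.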

Return Value

The status of the search task is returned in 'status' (see dt_continue(3) for descriptions). If the hitlist is short enough that time-slicing is not required, the value of 'status' will be DX_STATUS_DONE. Otherwise, the status will be DX_STATUS_IN_PROGRESS and dt_continue(3) will be required to finish the task.

The function's return value is either the progress on the task (see dt_done_when(3)) or -2 if an error is encountered.

Related Topics

dt_abort(3) dt_continue(3) dt_done_when(3) dt_mer_nsearches(3)