Homer Library
RegexString


RegexString is the class that represents the input to a regular expression search. It provides the methods needed to search the input for a regular expression and to retrieve or skip the matched string, among other operations.


RegexString::RegexString(const char* input)

Initializes the object to iterate over input. Note that the data is not copied, so the input string must remain valid (allocated) for as long as any RegexString object uses it. Think of RegexString as an iterator over the input string you supply. RegexString also does not keep track of where the input began; as a result, match indices are relative to the current position, not to the actual beginning of the input.
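The borrowing and relative-index behavior described above can be illustrated with a small stand-in. This is only a sketch under stated assumptions: InputCursor and its Find method are hypothetical names invented for the illustration and are not part of the Homer Library.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for RegexString: a non-owning cursor over a C string.
// It is NOT the Homer Library implementation.
class InputCursor {
public:
    explicit InputCursor(const char* input) : current_(input) {
        assert(input != nullptr);  // the caller keeps the buffer alive
    }

    // True once the cursor sits on the terminating NUL.
    bool IsEmpty() const { return *current_ == '\0'; }

    // Pointer to the current position; the original start is not remembered.
    const char* GetString() const { return current_; }

    // Index of the first occurrence of c, counted from the CURRENT position,
    // or -1 if c does not occur in the remaining input.
    int Find(char c) const {
        const char* hit = std::strchr(current_, c);
        return hit ? static_cast<int>(hit - current_) : -1;
    }

    // Advances by at most count characters; never moves past the end.
    void Skip(int count) {
        while (count-- > 0 && *current_ != '\0') ++current_;
    }

private:
    const char* current_;  // borrowed pointer, never freed here
};
```

With the input "Hmmmmm, donuts...", Find('d') reports 8 from the start, but 0 once the cursor has skipped 8 characters, because the index is always measured from the current position.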


RegexString::~RegexString()

Does nothing.


bool IsEmpty() const;

Returns true if the end of input has been reached.


char* GetString() const;

Returns a pointer to the current input position.


bool Skip(const RegexMatch& match, char** matched_string = NULL);

bool Skip(int count = 1, char** matched_string = NULL);

These methods skip either the text described by match or count characters. If matched_string is not NULL, it receives the skipped string (or character). The caller must delete the matched_string data when done.
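The ownership rule for matched_string can be sketched with a free function of the same shape as Skip(int, char**). SkipChars is a hypothetical helper, not a library function. Two details are assumptions of this sketch: the copy is allocated with new[] (hence released with delete[]), and the function returns false rather than truncating when fewer than count characters remain.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper mirroring the shape of Skip(int, char**).
// Advances *cursor by count characters. If matched_string is non-null, it
// receives a freshly allocated, NUL-terminated copy of the skipped text,
// which the caller owns and must release with delete[].
bool SkipChars(const char** cursor, int count = 1, char** matched_string = nullptr) {
    assert(cursor != nullptr && *cursor != nullptr);
    if (count < 0) return false;

    const std::size_t wanted = static_cast<std::size_t>(count);
    if (wanted > std::strlen(*cursor)) return false;  // assumption: fail, do not truncate

    if (matched_string != nullptr) {
        char* copy = new char[wanted + 1];
        std::memcpy(copy, *cursor, wanted);
        copy[wanted] = '\0';
        *matched_string = copy;  // ownership passes to the caller
    }
    *cursor += wanted;
    return true;
}
```

Passing a null matched_string skips without allocating anything, so a caller who only wants to advance has nothing to free.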


RegexMatch Search(RegularExpression* regex, bool case_sensitive = true);
RegexMatch Search(const char* string, bool case_sensitive = true);
RegexMatch Search(const char character, bool case_sensitive = true);
RegexMatch SearchLast(RegularExpression* regex, bool case_sensitive = true);
RegexMatch SearchLast(const char* string, bool case_sensitive = true);
RegexMatch SearchLast(const char character, bool case_sensitive = true);

These methods search the input string for a regular expression, a string or a character. case_sensitive specifies whether the search is case sensitive. The last three methods search for the last occurrence of the regular expression, string or character.
For instance:

const char*       input_string = "Hmmmmm, donuts...";
RegexString       input(input_string);
RegularExpression regex("d[a-z]+");
RegexMatch        match = input.Search(&regex);   // Search takes a pointer
if (match != NoMatch)
{
    char* matched_string;
    input.Skip(match, &matched_string);           // move past the match and get a copy of it
    printf("I found '%s' in \"%s\"\n", matched_string, input_string);
    printf("What follows is \"%s\"\n", input.GetString());
    delete matched_string;                        // the caller owns the returned copy
}

generates:

I found 'donuts' in "Hmmmmm, donuts..."
What follows is "..."

Had the example used SearchLast with the regular expression "m" instead, the text following the match would have been: ", donuts..."
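The difference between a first-occurrence and a last-occurrence search can be reproduced for the single-character case with the standard C string functions. This is a rough sketch only: RestAfterFirst and RestAfterLast are hypothetical helpers, and std::strchr and std::strrchr stand in for Search and SearchLast.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Text remaining after the FIRST occurrence of c, or "" if c is absent.
std::string RestAfterFirst(const char* input, char c) {
    assert(input != nullptr && c != '\0');
    const char* hit = std::strchr(input, c);
    return hit ? std::string(hit + 1) : std::string();
}

// Text remaining after the LAST occurrence of c, or "" if c is absent.
std::string RestAfterLast(const char* input, char c) {
    assert(input != nullptr && c != '\0');
    const char* hit = std::strrchr(input, c);
    return hit ? std::string(hit + 1) : std::string();
}
```

For "Hmmmmm, donuts..." and the character 'm', the first-occurrence variant leaves "mmmm, donuts..." while the last-occurrence variant leaves ", donuts...".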


RegexMatch Match(RegularExpression* , bool = true);
RegexMatch Match(const char* , bool = true);
RegexMatch Match(const char , bool = true);
RegexMatch MatchLast(RegularExpression* , bool = true);
RegexMatch MatchLast(const char* , bool = true);
RegexMatch MatchLast(const char , bool = true);

These methods are similar to their Search and SearchLast counterparts, except that the regex, string or character has to be at the very beginning or at the very end of the input string.
For instance, in the sample code for the Search methods, replacing Search with Match would have produced no output, because the match does not start at index 0. Replacing the regular expression with "[A-Z][a-z]*" would have matched "Hmmmmm", which is at the beginning of the input. This is the only difference between Search and Match: the Match methods simply perform a search and then check that the match starts at index 0.
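The "search, then check that the match starts at 0" rule can be sketched with the standard std::regex facilities. SearchIndex and MatchesAtStart are hypothetical names, and std::regex is used here in place of RegularExpression, so this illustrates the idea rather than the library's own code.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Index of the first (leftmost) match of re in input, or -1 if there is none.
int SearchIndex(const std::string& input, const std::regex& re) {
    std::smatch m;
    return std::regex_search(input, m, re) ? static_cast<int>(m.position(0)) : -1;
}

// "Match" as described above: run the search, then accept only index 0.
bool MatchesAtStart(const std::string& input, const std::regex& re) {
    return SearchIndex(input, re) == 0;
}
```

On the sample input, the pattern "d[a-z]+" is found at index 8 and therefore fails the start check, whereas "[A-Z][a-z]*" is found at index 0 and passes it.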


This page was last updated on 12/12/99.