Homer Library
RegexString
RegexString is the class that represents the input to a regular expression search. It provides what you need to search for a regular expression and to retrieve or skip the matched string, among other operations.
RegexString::RegexString(const char* input)
This initializes the object with input. Note that the data is not copied, so the input string has to remain valid (allocated) for as long as a RegexString object uses it. Think of RegexString as an iterator over the input string you give it. Moreover, RegexString is blind in that it does not keep track of the beginning of the input: match indices are relative to the current position, not to the actual beginning of the input.
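The effect of not tracking the beginning of the input can be illustrated with a small stand-in. This is only a sketch and not Homer's implementation: the class name Cursor and the method OffsetOf are hypothetical, and the class merely mimics the behaviour described above (a borrowed pointer, with offsets measured from the current position).

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for RegexString (not the Homer class itself):
// it borrows the caller's buffer and remembers only the current position.
class Cursor {
public:
    explicit Cursor(const char* input) : current_(input) {}

    bool IsEmpty() const { return *current_ == '\0'; }
    const char* GetString() const { return current_; }

    // Advance by count characters, stopping at the terminating NUL.
    void Skip(int count = 1) {
        while (count-- > 0 && *current_ != '\0') ++current_;
    }

    // Offset of the first occurrence of c, measured from the current
    // position; -1 when c does not occur in the remaining input.
    int OffsetOf(char c) const {
        const char* hit = std::strchr(current_, c);
        return hit ? static_cast<int>(hit - current_) : -1;
    }

private:
    const char* current_;  // not owned: the caller keeps the buffer alive
};
```

With the input "Hmmmmm, donuts...", OffsetOf('d') is 8 on a fresh cursor but 0 after Skip(8), which is the sense in which indices are relative to the current position.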
RegexString::~RegexString()
Does nothing.
bool IsEmpty() const;
Returns true if the end of input has been reached.
char* GetString() const;
Returns a pointer to the current input position.
bool Skip(const RegexMatch& match, char** matched_string = NULL);
bool Skip(int count = 1, char** matched_string = NULL);
These methods skip either the text described by match or the next count characters. If matched_string is not NULL, it receives the skipped string (or character). The caller must delete the matched_string data when done.
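The ownership rule for matched_string can be sketched with a free function. Everything here is an assumption made for illustration: SkipAndCopy is a hypothetical name, the copy is assumed to be allocated with new[] (so it is released with delete[]), and returning false when fewer than count characters remain is a guess, since the meaning of the bool result is not documented above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper mimicking Skip(count, &matched_string).
// Advances *position by count characters and, if requested, hands back
// a heap-allocated copy of the skipped text.
// Assumption: the copy is made with new[], so the caller uses delete[].
bool SkipAndCopy(const char** position, int count, char** matched_string = nullptr) {
    if (count < 0) return false;
    const std::size_t wanted = static_cast<std::size_t>(count);
    if (wanted > std::strlen(*position)) return false;  // not enough input left

    if (matched_string != nullptr) {
        char* copy = new char[wanted + 1];
        std::memcpy(copy, *position, wanted);
        copy[wanted] = '\0';
        *matched_string = copy;  // ownership passes to the caller
    }
    *position += wanted;
    return true;
}
```

Passing no third argument skips without allocating anything, which matches the default of NULL in the declarations above.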
RegexMatch Search(RegularExpression* regex, bool case_sensitive = true);
RegexMatch Search(const char* string, bool case_sensitive = true);
RegexMatch Search(const char character, bool case_sensitive = true);
RegexMatch SearchLast(RegularExpression* regex, bool case_sensitive = true);
RegexMatch SearchLast(const char* string, bool case_sensitive = true);
RegexMatch SearchLast(const char character, bool case_sensitive = true);
These methods search the input string for a regular expression, a string, or a character. case_sensitive specifies whether the search is case sensitive. The last three methods search for the last occurrence of the regular expression, string, or character.
For instance:
const char* input_string = "Hmmmmm, donuts...";
RegexString input(input_string);
RegularExpression regex("d[a-z]+");
// Search takes a pointer to the regular expression.
RegexMatch match = input.Search(&regex);
if (match != NoMatch)
{
    char* matched_string;
    // Skip past the match and retrieve the matched text.
    input.Skip(match, &matched_string);
    printf("I found '%s' in \"%s\"\n", matched_string, input_string);
    printf("What follows is \"%s\"\n", input.GetString());
    // The caller owns the string; delete[] assumes it was allocated with new[].
    delete[] matched_string;
}
generates:
I found 'donuts' in "Hmmmmm, donuts..."
What follows is "..."
Had SearchLast been used with the regular expression "m" instead, what follows would have been: ", donuts..."
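The difference between first and last occurrence, and the effect of case_sensitive, can be reproduced with the standard library alone. The sketch below uses std::regex rather than Homer's RegularExpression, so its regular expression dialect may differ; Span, MakePattern, FindFirst, FindLast and Remainder are hypothetical names. It also assumes that "last occurrence" means the last match found when scanning from left to right, which the text above does not spell out.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical result type: where a match starts and how long it is.
struct Span {
    bool found;
    std::size_t offset;
    std::size_t length;
};

// Build a pattern, optionally ignoring case (stand-in for case_sensitive).
std::regex MakePattern(const std::string& text, bool case_sensitive = true) {
    std::regex_constants::syntax_option_type flags = std::regex_constants::ECMAScript;
    if (!case_sensitive) flags |= std::regex_constants::icase;
    return std::regex(text, flags);
}

// First occurrence, in the spirit of Search.
Span FindFirst(const std::string& input, const std::regex& pattern) {
    std::smatch m;
    if (!std::regex_search(input, m, pattern)) return Span{false, 0, 0};
    return Span{true, static_cast<std::size_t>(m.position(0)),
                static_cast<std::size_t>(m.length(0))};
}

// Last occurrence, in the spirit of SearchLast: keep the final match
// produced by a left-to-right scan.
Span FindLast(const std::string& input, const std::regex& pattern) {
    Span last{false, 0, 0};
    std::sregex_iterator it(input.begin(), input.end(), pattern);
    const std::sregex_iterator end;
    for (; it != end; ++it) {
        last.found = true;
        last.offset = static_cast<std::size_t>(it->position(0));
        last.length = static_cast<std::size_t>(it->length(0));
    }
    return last;
}

// What remains of the input once the match has been skipped.
std::string Remainder(const std::string& input, const Span& span) {
    return span.found ? input.substr(span.offset + span.length) : input;
}
```

Applied to "Hmmmmm, donuts...", FindFirst with "d[a-z]+" leaves "..." as the remainder, while FindLast with "m" leaves ", donuts...", mirroring the two results described above.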
RegexMatch Match(RegularExpression*, bool = true);
RegexMatch Match(const char*, bool = true);
RegexMatch Match(const char, bool = true);
RegexMatch MatchLast(RegularExpression*, bool = true);
RegexMatch MatchLast(const char*, bool = true);
RegexMatch MatchLast(const char, bool = true);
These methods are similar to their Search and SearchLast counterparts, except that the regex, string or character has to be at the very beginning or at the very end of the input string.
For instance, in the sample code for the Search methods, replacing Search with Match would have produced no output, because the match does not start at index 0. Replacing the regular expression with "[A-Z][a-z]*" would have matched "Hmmmmm", which is at the beginning of the input. This is the only difference between Search and Match: the Match methods perform a search and then check that the match starts at index 0.
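The "search, then check that the match starts at 0" rule can be written down in a few lines. As a sketch only, it uses std::regex in place of Homer's RegularExpression, and MatchAtStart is a hypothetical name; it returns the length of the match, or -1 when the pattern does not match at the very beginning.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper in the spirit of Match: run an ordinary search,
// then accept the result only if it starts at offset 0.
long MatchAtStart(const std::string& input, const std::regex& pattern) {
    std::smatch m;
    if (!std::regex_search(input, m, pattern)) return -1;  // no match at all
    if (m.position(0) != 0) return -1;                     // match exists, but not at the start
    return static_cast<long>(m.length(0));
}
```

For "Hmmmmm, donuts...", the pattern "d[a-z]+" is rejected because its match starts at offset 8, while "[A-Z][a-z]*" is accepted with a length of 6.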
This page was last updated on 12/12/99.