We’ve looked at using channels as iterators, and found they’re hardly ideal. Let’s look at the next obvious answer: custom iterators.
Before range-over-func, which we’ll get to next, custom iterators were really the only practical alternative. They remain a viable choice today because of their flexibility.
Let’s start with a simple implementation of a custom iterator version of our grep function (see it in the playground):
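import (
	"bufio"
	"io"
	"regexp"
)

// Result iterates over the lines of the input that match the pattern.
type Result struct {
	re      *regexp.Regexp
	scanner *bufio.Scanner
}

// Next returns the next matching line. The bool is false once the input
// is exhausted (or the scanner has stopped on an error).
func (r *Result) Next() (string, bool) {
	for r.scanner.Scan() {
		line := r.scanner.Text()
		if r.re.MatchString(line) {
			return line, true
		}
	}
	return "", false
}

// Err reports any error the scanner encountered while reading.
func (r *Result) Err() error {
	return r.scanner.Err()
}

// grep compiles the pattern and returns an iterator over matching lines.
func grep(r io.Reader, pattern string) (*Result, error) {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return nil, err
	}
	return &Result{
		re:      re,
		scanner: bufio.NewScanner(r),
	}, nil
}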
This is quite similar to our original, non-iterating implementation. The main difference is that it’s broken into three parts:
- Setup (compile regular expression, create scanner)
- Iteration (read each result, one at a time)
- Cleanup (error checking)
The new Result type, which is instantiated during the setup stage, ties the three parts together. Let’s look closely at the iteration part, which lives in the Next() method, since that’s where the real work happens.
func (r *Result) Next() (string, bool) {
	for r.scanner.Scan() {
		line := r.scanner.Text()
		if r.re.MatchString(line) {
			return line, true
		}
	}
	return "", false
}
This function does three key things:
- It advances the underlying state to the next value.
- It returns that value.
- It indicates when there are no more values.
The first of these is a bit subtle, because we aren’t doing any explicit state tracking, as is sometimes necessary. Here the state is managed by the scanner.Scan() method, which we call in a loop. This means a single call to Result.Next() may result in many calls to r.scanner.Scan(): we keep scanning until we find a line that matches the regular expression, then return that line along with a bool value of true to indicate a successful match.
Once we find a match, we return immediately, which also exits the loop. The scanner keeps its position, so it is ready for the next call to Next().
Once the scanner has read all lines of input, the for loop ends, and Next() returns an empty string and a bool value of false to indicate that there are no more results. The loop also ends if the scanner hits a read error, which is why the caller should check Err() once iteration is done.
Some readers may wonder whether the bool value is necessary. In this case it is, because a blank line could match a regular expression, so an empty string alone can’t signal the end of the results. In other cases it isn’t necessary: when returning a pointer value, for example, a nil value may be sufficient to indicate “no more results”.