For statements with range clause…
- For a string value, the “range” clause iterates over the Unicode code points in the string starting at byte index 0. On successive iterations, the index value will be the index of the first byte of successive UTF-8-encoded code points in the string, and the second value, of type rune, will be the value of the corresponding code point. If the iteration encounters an invalid UTF-8 sequence, the second value will be 0xFFFD, the Unicode replacement character, and the next iteration will advance a single byte in the string.
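The invalid-UTF-8 rule in the quote is easy to check by hand. The following is a minimal sketch, not part of the spec; `step` and `rangeSteps` are hypothetical names introduced only for illustration. It ranges over a string containing a lone 0xFF byte, which can never appear in valid UTF-8.

```go
package main

import "fmt"

// step is a hypothetical type holding one (index, rune) pair
// yielded by a range loop over a string.
type step struct {
	Index int
	Rune  rune
}

// rangeSteps is a hypothetical helper that records every pair
// produced by ranging over s.
func rangeSteps(s string) []step {
	var steps []step
	for i, r := range s {
		steps = append(steps, step{Index: i, Rune: r})
	}
	return steps
}

func main() {
	// "a", then a lone 0xFF byte (invalid UTF-8), then "b".
	for _, st := range rangeSteps("a\xffb") {
		fmt.Printf("%d %U\n", st.Index, st.Rune)
	}
	// Output:
	// 0 U+0061
	// 1 U+FFFD
	// 2 U+0062
}
```

The invalid byte yields U+FFFD, and the next iteration resumes exactly one byte later, at index 2.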
The rule quoted above might seem pretty straightforward, but there are some interesting subtleties that arise from the fact that ranging over a string steps by Unicode code point, not by byte. Consider the difference between these two snippets:
for a, b := range "Hello, 世界" {
	fmt.Println(a, b)
}
which produces the following output:
0 72
1 101
2 108
3 108
4 111
5 44
6 32
7 19990
10 30028
and this variant:
for a, b := range []byte("Hello, 世界") {
	fmt.Println(a, b)
}
which produces this different output:
0 72
1 101
2 108
3 108
4 111
5 44
6 32
7 228
8 184
9 150
10 231
11 149
12 140
In the latter case, we step over the string byte by byte. In the former, we step code point by code point, and because 世 and 界 each occupy three bytes in UTF-8, the index jumps from 7 straight to 10.
This is a very important distinction to keep in mind. Sometimes you’ll want to range over a string byte-by-byte, in which case you first want to convert it to a byte slice, as we did in the second code example. Other times, particularly when processing text, you’ll want the default behavior of ranging over a string.
This may have you wondering about a related question: How do you access the nth Unicode code point, or rune, of a string?
There’s no shorthand for doing this in Go, as there is for accessing the nth byte. But you can accomplish it with a loop, as we’ve just seen. Let’s wrap such a loop in a convenience function for demonstration purposes:
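// runeAt returns the nth rune (counting from 0) in str, or 0 if str
// contains n or fewer runes.
func runeAt(str string, n int) rune {
	// The index that range yields is a byte offset, so count runes separately.
	count := 0
	for _, r := range str {
		if count == n {
			return r
		}
		count++
	}
	return 0
}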
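Another way to reach the nth code point, sketched here under the assumption that an extra allocation is acceptable, is to convert the string to a `[]rune` and index that. The name `runeAtConv` is hypothetical and chosen only for this example.

```go
package main

import "fmt"

// runeAtConv is a hypothetical alternative that returns the nth rune
// (counting from 0) in str, or 0 if n is out of range.
func runeAtConv(str string, n int) rune {
	// The conversion decodes the whole string and allocates a new slice.
	runes := []rune(str)
	if n < 0 || n >= len(runes) {
		return 0
	}
	return runes[n]
}

func main() {
	s := "Hello, 世界"
	fmt.Println(string(runeAtConv(s, 7))) // 世
	fmt.Println(string(runeAtConv(s, 8))) // 界
	fmt.Println(runeAtConv(s, 9))         // 0 (out of range)
}
```

The conversion is shorter to write, but it decodes the entire string even when n is small, whereas a loop can stop as soon as it reaches the requested rune.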
Quotes from The Go Programming Language Specification, language version go1.22 (Feb 6, 2024)