Before I dive into today’s spec discussion… are you enjoying this series? If so, would you do me a favor and help spread the word? Can you share a link to this message with a fellow Gopher, or on your work chat?
Conversion to and from string types is a minefield of special cases in Go. So this will take at least a couple of days. Let’s dive in.
Conversions to and from a string type
- Converting a slice of bytes to a string type yields a string whose successive bytes are the elements of the slice.
string([]byte{'h', 'e', 'l', 'l', '\xc3', '\xb8'})   // "hellø"
string([]byte{})                                     // ""
string([]byte(nil))                                  // ""

type bytes []byte
string(bytes{'h', 'e', 'l', 'l', '\xc3', '\xb8'})    // "hellø"

type myByte byte
string([]myByte{'w', 'o', 'r', 'l', 'd', '!'})       // "world!"
myString([]myByte{'\xf0', '\x9f', '\x8c', '\x8d'})   // "🌍"
So right off the bat, we have a case that isn’t exactly a special case, but is definitely non-obvious to many.
The string hellø appears to have a length of 5 characters. But does Go agree?
fmt.Println(len("hellø")) // 6
No, it does not. And the clue as to why is staring at us: the last character, ø, is made up of two bytes. And len() returns the number of bytes, not “characters”, in the string.
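If you want to see those two bytes for yourself, here is a minimal sketch that builds the string from the spec's byte slice and prints each byte. The helper name encodedBytes is my own, purely for illustration.

```go
package main

import "fmt"

// encodedBytes is a hypothetical helper that returns a copy of the
// bytes that make up s.
func encodedBytes(s string) []byte {
	return []byte(s)
}

func main() {
	// The same byte slice the spec uses: the last two bytes together
	// form the UTF-8 encoding of ø.
	s := string([]byte{'h', 'e', 'l', 'l', '\xc3', '\xb8'})
	fmt.Println(s, len(s)) // hellø 6

	// Print every byte with its index; indexes 4 and 5 hold 0xc3 and 0xb8.
	for i, b := range encodedBytes(s) {
		fmt.Printf("%d: %#x\n", i, b)
	}
}
```

Five characters on screen, six bytes in memory.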
But what if you want to know the number of characters? Well, that is possible, using the unicode/utf8 package:
fmt.Println(utf8.RuneCountInString("hellø")) // 5
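Here is a complete, runnable version that puts the two counts side by side. Treat it as a sketch: the describe helper is a name I made up for the example, and note that utf8.RuneCountInString (from the unicode/utf8 package) counts Unicode code points, which may still differ from what a reader perceives as a single character when combining marks are involved.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// describe is a hypothetical helper that reports the length of s in
// bytes and its number of runes (Unicode code points).
func describe(s string) (byteLen, runeCount int) {
	return len(s), utf8.RuneCountInString(s)
}

func main() {
	for _, s := range []string{"hello", "hellø", "🌍"} {
		b, r := describe(s)
		fmt.Printf("%q: %d bytes, %d runes\n", s, b, r)
	}
}
```

For "hellø" this reports 6 bytes and 5 runes, and for "🌍" it reports 4 bytes and 1 rune, matching the four-byte slice in the spec's last example.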
Quotes from The Go Programming Language Specification, language version go1.22 (Feb 6, 2024)