Lexical elements: Rune literals pt 2

Let’s continue our exploration of rune literals, which began on Monday. To recap: a rune in Go represents a Unicode code point. Continuing from there… Rune literals: A rune literal is expressed as one or more characters enclosed in single quotes, as in 'x' or '\n'. Within the quotes, any character may appear except newline and unescaped single quote. A single quoted character represents the Unicode value of the character itself, while multi-character sequences beginning with a backslash encode values in various formats.
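As a rough illustration of those backslash formats, the sketch below writes the same code point (U+0061, the letter a) in several ways. The helper name `runeForms` is made up for this example and is not from the spec.

```go
package main

import "fmt"

// runeForms (a hypothetical helper) returns the code point U+0061
// written with five different rune-literal syntaxes.
func runeForms() []rune {
	return []rune{
		'a',          // the character itself
		'\x61',       // \x followed by exactly two hex digits
		'\141',       // \ followed by exactly three octal digits
		'\u0061',     // \u followed by exactly four hex digits
		'\U00000061', // \U followed by exactly eight hex digits
	}
}

func main() {
	for _, r := range runeForms() {
		// %c prints the character, %d the integer value, %U the code point.
		fmt.Printf("%c %d %U\n", r, r, r)
	}
}
```

Every element holds the same integer value, which is the point: the escape form only changes how the literal is spelled in source code.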


Introducing Cup o' Go, the new Go News podcast

Today I’m taking a short break from the discussion of the Go Spec to introduce you to a brand new Go-related podcast that I’m part of: Cup o’ Go. Shay Nehmad and I released our first episode yesterday, and intend to make this a weekly, lighthearted Go news program. Our promise is to help listeners keep up with the latest happenings in the Go community in just 15 minutes per week.


Lexical elements: Rune literals pt 1, Intro to Unicode

Runes… Oh boy! This is one of the bits of Go that shines for its elegant simplicity, but constantly trips up everyone (myself included). As such, I think this may be a 2-, or maybe even a 3-parter. Let’s get started. Rune literals: A rune literal represents a rune constant, an integer value identifying a Unicode code point. If you’re already familiar with Unicode, and have a strong understanding of what a “code point” is, you can probably skip this one.
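To make "an integer value identifying a Unicode code point" concrete, here is a minimal sketch. The `describe` helper is invented for illustration; it relies on the fact that `rune` is an alias for `int32` and uses `utf8.RuneLen` from the standard library to show that a code point and its encoded byte length are separate things.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// describe (a hypothetical helper) returns the integer value of a rune
// and the number of bytes needed to encode it as UTF-8.
func describe(r rune) (codePoint int32, utf8Bytes int) {
	// rune is an alias for int32, so no conversion is needed.
	return r, utf8.RuneLen(r)
}

func main() {
	for _, r := range []rune{'a', 'é', '世'} {
		cp, n := describe(r)
		fmt.Printf("%c: code point %d (%U), %d byte(s) in UTF-8\n", r, cp, r, n)
	}
}
```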


Lexical elements: Imaginary literals

I understand in principle what imaginary numbers are. I used them during high school and university mathematics classes to get good grades. I’ve never had a practical use for them. I’ve certainly never used them in programming, in Go or otherwise. All that said, I guess it’s cool that Go has first-class support for imaginary numbers, rather than it being an add-on library of some kind. If you’re like me, and have never had the need to write software that understands complex numbers, you might skip today’s email, or better yet, jump over to 3Blue1Brown’s video that explains what complex and imaginary numbers are, and why they’re useful.



Detour: Unary operators & signed numeric literals

I noticed something while writing the last two sections on Integer literals and Floating-point literals, and I’m curious if anyone else noticed. There’s no way to express a negative integer or floating-point literal! If we look specifically at the definition of a decimal literal in EBNF format, we see: decimal_lit = "0" | ( "1" … "9" ) [ [ "_" ] decimal_digits ] . Notably, we do not see this:


Lexical elements: Floating-point literals

Ah, who doesn’t love floating point numbers? Go gives us two ways to define floating point number literals: in decimal, or in hexadecimal. I have never been tempted to write a floating point number literal in hexadecimal. I imagine this is mostly used by those interested in precision control of the IEEE 754 values as understood by the floating point implementation. (If this is something you’ve ever used, I’d love to hear from you: What was your context?
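As a quick sketch of the two notations, the functions below (names invented for this example) return the same value written once in decimal and once in hexadecimal. In the hexadecimal form, the mantissa is in base 16 and the `p` exponent is a power of two.

```go
package main

import "fmt"

// decFloat (hypothetical) returns a value written as a decimal literal.
func decFloat() float64 {
	return 0.15625
}

// hexFloat (hypothetical) returns the same value as a hexadecimal literal:
// mantissa 0x1.4 is 1 + 4/16 = 1.25, and p-3 scales it by 2^-3.
func hexFloat() float64 {
	return 0x1.4p-3
}

func main() {
	fmt.Println(decFloat() == hexFloat()) // both literals denote the same constant
	fmt.Printf("%x\n", decFloat())        // %x formats a float in hexadecimal notation
}
```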


Lexical elements: Integer literals

Integer literals: Ah, integer literals! Now we’re starting to get into the meat of the language. Go lets us express integers in four bases. Base 10 (decimal) is by far the most common, followed by base 16 (hexadecimal). Base 8 (octal) and base 2 (binary) are also supported, but seen fairly rarely. Here’s how the spec describes it: An integer literal is a sequence of digits representing an integer constant. An optional prefix sets a non-decimal base: 0b or 0B for binary, 0, 0o, or 0O for octal, and 0x or 0X for hexadecimal.
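The prefixes from that description can be seen side by side in this sketch, where the invented helper `sameValue` spells the number 255 five ways. It assumes Go 1.13 or later, which is when the `0b` and `0o` prefixes and the `_` digit separator were added.

```go
package main

import "fmt"

// sameValue (a hypothetical helper) returns the number 255 written
// with each of the prefixes the spec lists.
func sameValue() [5]int {
	return [5]int{
		255,         // decimal, no prefix
		0xFF,        // hexadecimal
		0o377,       // octal, explicit 0o prefix
		0377,        // octal, legacy leading 0
		0b1111_1111, // binary, with an underscore separator for readability
	}
}

func main() {
	for _, v := range sameValue() {
		fmt.Println(v)
	}
}
```

The legacy leading-zero octal form is the one most likely to surprise: `0377` is not three hundred seventy-seven.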


Lexical elements: Keywords, Operators and punctuation

Keywords: The following keywords are reserved and may not be used as identifiers.

break        default      func         interface    select
case         defer        go           map          struct
chan         else         goto         package      switch
const        fallthrough  if           range        type
continue     for          import       return       var

Operators and punctuation: The following character sequences represent operators (including assignment operators) and punctuation:

+    &     +=    &=     &&    ==    !=    (    )
-    |     -=    |=     ||    <     <=    [    ]
*    ^     *=    ^=     <-    >     >=    {    }
/    <<    /=    <<=    ++    =     :=    ,    ;
%    >>    %=    >>=    --    !
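One way to check whether a name is on the reserved keyword list is to ask the standard library's `go/token` package, as in this sketch. The wrapper `isKeyword` is a made-up name; `token.Lookup` and `Token.IsKeyword` are the real calls it relies on.

```go
package main

import (
	"fmt"
	"go/token"
)

// isKeyword (a hypothetical wrapper) reports whether name is a reserved
// Go keyword. token.Lookup maps a keyword to its token and anything else
// to token.IDENT.
func isKeyword(name string) bool {
	return token.Lookup(name).IsKeyword()
}

func main() {
	for _, name := range []string{"range", "fallthrough", "string", "nil"} {
		fmt.Printf("%-12s keyword: %v\n", name, isKeyword(name))
	}
}
```

Names such as `string` and `nil` are predeclared identifiers rather than keywords, which is why they do not appear in the list.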


Lexical elements: Identifiers

Identifiers: Identifiers name program entities such as variables and types. An identifier is a sequence of one or more letters and digits. The first character in an identifier must be a letter. identifier = letter { letter | unicode_digit } . So in other words, every identifier must begin with a letter, followed by zero or more letters and/or digits. Pretty simple. The spec offers a few examples: a, _x9, ThisVariableIsExported, αβ. That second one looks a bit suspicious.
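All four of the spec's example identifiers compile, as this sketch shows. The values and the `sum` function are invented for illustration. The second example works because the spec counts the underscore as a letter.

```go
package main

import "fmt"

// ThisVariableIsExported is one of the spec's example identifiers.
// The value assigned here is arbitrary.
var ThisVariableIsExported = 4

// sum (a hypothetical helper) declares the other three example
// identifiers and adds all four values together.
func sum() int {
	a := 1
	_x9 := 2 // legal: "_" is treated as a letter
	αβ := 3  // legal: Unicode letters are letters too
	return a + _x9 + αβ + ThisVariableIsExported
}

func main() {
	fmt.Println(sum())
}
```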


Lexical elements: Semicolons

Go’s use of semicolons is one area of confusion for newcomers to the language. It certainly was for me. But it’s more intuitive than it seems. Here’s the formal explanation: Semicolons The formal syntax uses semicolons ";" as terminators in a number of productions. Go programs may omit most of these semicolons using the following two rules: When the input is broken into tokens, a semicolon is automatically inserted into the token stream immediately after a line’s final token if that token is


Lexical elements: Tokens

Tokens Tokens form the vocabulary of the Go language. There are four classes: identifiers, keywords, operators and punctuation, and literals. White space, formed from spaces (U+0020), horizontal tabs (U+0009), carriage returns (U+000D), and newlines (U+000A), is ignored except as it separates tokens that would otherwise combine into a single token. Also, a newline or end of file may trigger the insertion of a semicolon. While breaking the input into tokens, the next token is the longest sequence of characters that form a valid token.
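The longest-match rule can be observed with the standard library's `go/scanner` package. In this sketch, the invented helper `tokenize` returns the tokens of a source fragment as strings. For `a+++b`, the scanner takes `++` first because it is the longest valid token at that position, then `+`.

```go
package main

import (
	"fmt"
	"go/scanner"
	"go/token"
)

// tokenize (a hypothetical helper) splits src into Go tokens and returns
// each one as a string. Tokens without literal text, such as operators,
// are rendered with their token name.
func tokenize(src string) []string {
	fset := token.NewFileSet()
	file := fset.AddFile("", fset.Base(), len(src))

	var s scanner.Scanner
	s.Init(file, []byte(src), nil, 0)

	var out []string
	for {
		_, tok, lit := s.Scan()
		if tok == token.EOF {
			return out
		}
		if lit == "" {
			lit = tok.String()
		}
		out = append(out, lit)
	}
}

func main() {
	fmt.Printf("%q\n", tokenize("a+++b"))
	fmt.Printf("%q\n", tokenize("x <<= 1"))
}
```

The output may also include an automatically inserted semicolon after the final token, which is the behavior the last sentence of the quoted rule alludes to.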