Skip to main content

Lexical Structure

This page defines the fundamental lexical elements of R65: the tokens that make up the source language, including comments, keywords, literals, and identifiers.

Comments

R65 supports two comment styles, matching C and Rust conventions.

Line Comments

A line comment begins with // and extends to the end of the line.

// This is a line comment
let x: u8 = 10; // inline comment

Block Comments

A block comment begins with /* and ends with */. Block comments do not nest.

/* This is a block comment */

/*
* Multi-line block comments
* are also valid.
*/

/* Nesting /* is NOT supported */ -- this ends the comment here */

No Doc Comments

R65 does not support Rust-style doc comments (/// or //!). The sequences /// and //! are treated as ordinary line comments.

Keywords

R65 reserves 57 keywords organized into five categories. All keywords are case-sensitive and must be lowercase, with the sole exception of Self.

Implemented Keywords (22)

These keywords are actively used in the grammar:

KeywordPurpose
fnFunction definition
letVariable binding
mutMutable modifier
constCompile-time constant
staticStatic variable
ifConditional branch
elseConditional alternative
loopInfinite loop
whileConditional loop
forRange-based for loop
inFor loop range keyword
matchPattern matching expression
breakExit loop
continueSkip to next iteration
returnReturn from function
structStructure definition
enumEnumeration definition
typeType alias
includeFile inclusion (include!())
asmInline assembly (asm!())
asType casting
macro_rulesMacro definition

Built-in Hardware Instructions (4)

These identifiers are recognized as built-in functions because they map to special 65816 hardware instructions:

KeywordPurpose
mvnBlock move next (forward copy)
mvpBlock move previous (backward copy)
waiWait for interrupt
stpStop processor

Reserved Rust Keywords (17)

These keywords are reserved for Rust compatibility and future use. Using them as identifiers is a compile error:

impl, trait, where, use, pub, mod, crate, self, Self, super, async, await, move, ref, dyn, extern, unsafe

Strict Reserved Keywords (13)

Reserved by Rust for future language expansion. R65 reserves them for compatibility:

abstract, become, box, do, final, macro, override, priv, typeof, unsized, virtual, yield, try

Special Modifier (1)

KeywordPurpose
farFar function (far fn) or far pointer (far *T) modifier

Case Sensitivity

All keywords must be lowercase, except Self:

fn main() { }   // OK: 'fn' is a keyword
Fn main() { } // ERROR: 'Fn' is not a keyword (treated as identifier)
FN main() { } // ERROR: 'FN' is not a keyword

Keywords use word boundaries and do not match inside longer identifiers. implementation is a valid identifier even though it starts with impl.

Numeric Literals

Decimal Literals

Decimal integer literals consist of one or more ASCII digits (0-9):

0
42
255
65535

Hexadecimal Literals

Hexadecimal literals begin with 0x or 0X, followed by hex digits (0-9, a-f, A-F):

0xFF
0x2100
0x7E2000
0x00

Underscores can be used as visual separators in addresses:

0x7E_2000   // Bank $7E, offset $2000
0x01_8000 // Bank 1, offset $8000

Binary Literals

Binary literals begin with 0b or 0B, followed by binary digits (0 and 1):

0b10110100
0b0000_0001
0b1111_1111

Literal Type Inference

Numeric literals have no intrinsic type. Their type is inferred from context:

let x: u8 = 255;      // 255 is u8
let y: u16 = 1000; // 1000 is u16
let z: u16 = 0xFF; // 0xFF is u16

let w = 10; // ERROR: cannot infer type without annotation

A compile-time error is produced if the literal value does not fit in the target type (e.g., let x: u8 = 256;).

String Literals

String literals are enclosed in double quotes and may only appear in static array initializers. R65 does not have a string type.

#[ram]
static mut MESSAGE: [u8; 16] = "Hello, World!";

Character Set

String literals support extended ASCII (bytes 0x00 through 0xFF). UTF-8 multi-byte sequences are rejected at compile time.

Escape Sequences

SequenceMeaning
\0Null byte (0x00)
\nNewline (0x0A)
\rCarriage return (0x0D)
\tTab (0x09)
\\Backslash
\"Double quote

Padding and Null Termination

When a string literal is shorter than the declared array size, the remaining bytes are zero-padded. Null termination is not automatic; add an explicit \0 if needed:

#[ram]
static mut NAME: [u8; 8] = "Hi\0"; // [0x48, 0x69, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00]

Restriction

String literals cannot appear in let bindings, function parameters, or expressions. They are exclusively for initializing static byte arrays.

Boolean Literals

The two boolean literals are true and false. They have type bool, which is stored as a single byte (u8):

let flag: bool = true;
let done: bool = false;

true converts to 1 and false converts to 0 when cast with as u8.

Hardware Register Names

The 65816 processor registers are exposed as special global identifiers. Register names must be uppercase; lowercase versions are valid user-defined variable names.

RegisterTypeDescription
Au8 or u16Accumulator (size depends on function mode)
Bu8Accumulator high byte (m8 mode only)
Xu16X index register (always 16-bit)
Yu16Y index register (always 16-bit)
STATUSu8Processor status flags (N V M X D I Z C)
Du16Direct Page register
DBRu8Data Bank Register
PBRu8Program Bank Register (read-only)
Su16Stack Pointer

Case Rules

A = 10;           // OK: hardware register A
let a: u8 = 5; // OK: variable 'a' (lowercase)

X = 100; // OK: hardware register X
let x: u8 = 10; // OK: variable 'x' (lowercase)

STATUS.Carry = true; // OK: STATUS register property
let status: u8 = 0; // OK: variable 'status' (lowercase)

Multi-character register names in the wrong case (dbr, pbr, status) produce a helpful error suggesting the uppercase form.

Identifiers

Identifiers begin with a letter or underscore, followed by letters, digits, or underscores:

[a-zA-Z_][a-zA-Z0-9_]*

Identifiers are case-sensitive. Reserved keywords and hardware register names (when uppercase) cannot be used as identifiers.

let my_var: u8 = 10;       // OK
let _temp: u16 = 0; // OK: leading underscore
let player_health: u8 = 0; // OK
let BUFFER_SIZE: u16 = 256; // OK: uppercase identifiers are allowed (these are not registers)

Operators and Punctuation

R65 uses the following operator and punctuation tokens:

TokenMeaning
+ - * / %Arithmetic
& | ^ ~Bitwise
<< >>Shift
== != < > <= >=Comparison
&& || !Logical
=Assignment
+= -= *= /= &= |= ^= <<= >>=Compound assignment
++ --Postfix increment/decrement
@Register/variable binding
->Return type arrow
..Range (in for loops and match)
..=Inclusive range (in match)
( ) { } [ ]Grouping/delimiters
; : , . # 'Punctuation

Whitespace

Spaces, tabs, and newlines serve as token separators and are otherwise insignificant. R65 is not whitespace-sensitive; indentation is a stylistic choice.