Testing
Testing is built into Sailfin — there is no external framework to install, no import testing at the top of your file, and no runner to configure. The test keyword is part of the language itself.
This design reflects a core belief: the closer tests live to the code they verify, the more likely they are to stay up-to-date. Tests are first-class citizens, not afterthoughts.
Your First Test
fn add(a: number, b: number) -> number {
    return a + b;
}

test "add returns the sum of two integers" {
    assert add(2, 3) == 5;
}

Run it:

sfn test

Output when it passes:

PASS add returns the sum of two integers

Output when it fails:

FAIL add returns the sum of two integers
  assertion failed: add(2, 3) == 5
  left: 4
  right: 5
  --> src/math.sfn:8:5

The test Block
A test block has three parts: the keyword test, a string name, and a body enclosed in braces.

test "descriptive name here" {
    // assertions and logic
}

Test names are documentation. A reader scanning a test file should be able to understand the intended behavior of the system just by reading the test names. Write them as statements of fact about what the code does:

// Good: describes observable behavior
test "parse_int returns error variant on empty string" { ... }
test "array push increases length by one" { ... }
test "Config.load reads from XDG_CONFIG_HOME when set" { ... }

// Avoid: too vague
test "parse_int works" { ... }
test "test1" { ... }
test "edge case" { ... }

The assert Statement
assert takes a boolean expression. If the expression evaluates to false, the test fails immediately with the failing expression printed as-is.

test "string length" {
    let s = "hello";
    assert s.length == 5;
}

assert is a statement, not a function call: there are no parentheses around the expression. When the assertion fails, Sailfin prints both sides of the comparison so you can see the mismatch without adding debug output yourself.

You can include multiple assertions in one test:

test "normalise_email lowercases and trims" {
    let result = normalise_email(" [email protected] ");
    assert result == "[email protected]";
    assert result.length == 17;
}

When there are multiple assertions, execution stops at the first failure. If you want to verify several independent properties, consider separate tests: each will give a clear, isolated failure message.
Coming in 1.0: Richer assertion helpers such as assert_eq, assert_ne, assert_contains, and assert_throws are on the roadmap. For now, express these as boolean expressions following assert.
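As a rough sketch of what that can look like, the comparisons those helpers would perform can be written directly after assert. The snippet relies on constructs used elsewhere on this page (==, string .length, and .contains). The != operator and the literal values are assumptions made for illustration, and the code has not been checked against a particular compiler release.

```sailfin
test "assertion helpers written as boolean expressions" {
    let s = "hello, world";

    // In place of assert_eq(a, b): compare with ==
    assert s.length == 12;

    // In place of assert_ne(a, b): compare with != (assumed to be available)
    assert s != "goodbye";

    // In place of assert_contains(haystack, needle): call .contains on the string
    assert s.contains("world");
}
```

For the assert_throws case, the try/catch pattern under Testing Error Handling serves the same purpose.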
Pure Computation Tests
Tests for pure functions need no effects. These are the easiest to write, the fastest to run, and the most valuable to have.

fn clamp(value: number, min: number, max: number) -> number {
    if value < min { return min; }
    if value > max { return max; }
    return value;
}

test "clamp returns value when within range" {
    assert clamp(5, 0, 10) == 5;
}

test "clamp returns min when value is below range" {
    assert clamp(-3, 0, 10) == 0;
}

test "clamp returns max when value is above range" {
    assert clamp(99, 0, 10) == 10;
}

test "clamp handles equal min and max" {
    assert clamp(7, 5, 5) == 5;
}

Effect Declarations in Tests
Tests declare effects exactly like functions do. This is intentional: the compiler enforces capability discipline everywhere, including test code. A test that calls fs.read without declaring ![io] is a compile error, not a runtime surprise.

test "config file loads successfully" ![io] {
    let content = fs.read("tests/fixtures/sample.toml");
    assert content.length > 0;
}

test "HTTP endpoint returns 200" ![io, net] {
    let response = http.get("https://httpbin.org/get");
    assert response.status == 200;
}

The effects listed in the test signature are exactly the capabilities that test is granted. This makes it easy to see, at a glance, which tests touch the filesystem or network.
Why this matters
In a large test suite, tests that declare ![io] or ![net] are the ones that may be slow, may require fixtures, and may fail due to environment issues. Tests with no effects are pure: they always run fast and always produce the same result. Effect annotations let tools (and people) reason about this without reading the test body.
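One way to act on this, sketched below, is to keep the logic under test in a pure function and leave only a thin test that carries effects. is_blank is a hypothetical helper invented for this illustration; the fixture path is the one from the config file example earlier in this section. Treat it as a sketch built from syntax shown on this page, not as verified output of the compiler.

```sailfin
// Hypothetical helper: it declares no effects, so its tests need none either.
fn is_blank(content: string) -> boolean {
    return content.length == 0;
}

// Pure test: no effect annotation and no fixtures.
test "is_blank is true for an empty string" {
    assert is_blank("");
}

// Effectful test: the ![io] annotation flags the filesystem dependency.
test "sample fixture is not blank" ![io] {
    let content = fs.read("tests/fixtures/sample.toml");
    assert !is_blank(content);
}
```

Scanning the signatures alone shows that only the second test depends on the environment.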
Organizing Test Files
Co-located tests
For small modules, put tests in the same file as the code they test. This is the simplest option and keeps tests close to their subject:

fn factorial(n: number) -> number {
    if n <= 1 { return 1; }
    return n * factorial(n - 1);
}

test "factorial of zero is one" {
    assert factorial(0) == 1;
}

test "factorial of five is 120" {
    assert factorial(5) == 120;
}

Separate test files
For larger modules, or when the test file would dwarf the implementation, use a separate *_test.sfn file. The naming convention is <module>_test.sfn:

src/
├── parser.sfn
├── parser_test.sfn
├── typecheck.sfn
└── typecheck_test.sfn

Dedicated test directories
For integration and end-to-end tests that don’t belong to a single module, use a tests/ directory:

project/
├── src/
│   ├── parser.sfn
│   └── parser_test.sfn    # unit tests, co-located
└── tests/
    ├── unit/
    │   └── math_test.sfn
    ├── integration/
    │   └── pipeline_test.sfn
    └── e2e/
        └── full_run_test.sfn

This mirrors the structure used in the Sailfin compiler itself:

compiler/
├── src/
│   ├── parser.sfn
│   └── lexer.sfn
└── tests/
    ├── unit/
    │   ├── parser_test.sfn
    │   └── lexer_test.sfn
    ├── integration/
    │   └── effect_checker_test.sfn
    └── e2e/
        └── full_pipeline_test.sfn

Running Tests
With the CLI
# Run all tests discovered from the current directory
sfn test

# Run tests in a specific file
sfn test src/parser_test.sfn

# Run tests in a directory (recursive)
sfn test tests/unit/

# Note: --filter is not yet supported; run a specific file to narrow scope
sfn test src/math_test.sfn

With Make targets
Projects built with the standard Sailfin Makefile have these targets:

make test              # Full suite: unit + integration + e2e
make test-unit         # Unit tests only
make test-integration  # Integration tests only
make test-e2e          # End-to-end tests only

Reading the output
PASS factorial of zero is one
PASS factorial of five is 120
FAIL factorial of negative number returns one
  assertion failed: factorial(-1) == 1
  left: -1
  right: 1
  --> src/math.sfn:24:5

3 tests, 2 passed, 1 failed

Tests run in declaration order within a file. When running multiple files, the order between files is alphabetical.
Unit Testing Patterns
Table-driven tests
When you have many similar cases to verify, use an array of test-case structs and loop over them. This avoids repetitive test bodies and makes it easy to add new cases.

struct ParseCase {
    input: string;
    expected: number;
    should_fail: boolean;
}

struct ParseError {
    message: string;
}

test "parse_int handles all cases" {
    let cases: ParseCase[] = [
        ParseCase { input: "0", expected: 0, should_fail: false },
        ParseCase { input: "42", expected: 42, should_fail: false },
        ParseCase { input: "-7", expected: -7, should_fail: false },
        ParseCase { input: "", expected: 0, should_fail: true },
        ParseCase { input: "abc", expected: 0, should_fail: true },
        ParseCase { input: "2147483648", expected: 0, should_fail: true },
    ];

    for c in cases {
        let result = parse_int(c.input); // returns number | ParseError
        match result {
            ParseError { message } => assert c.should_fail,
            _ => {
                assert !c.should_fail;
                assert result == c.expected;
            },
        }
    }
}

Boundary and edge cases
Always test the boundaries of your domain. For a function that works on collections, test the empty case, the one-element case, and a representative multi-element case.

fn median(values: number[]) -> number { ... }

test "median of empty array returns 0.0" {
    assert median([]) == 0.0;
}

test "median of single element returns that element" {
    assert median([7.0]) == 7.0;
}

test "median of even-length array averages middle two" {
    assert median([1.0, 2.0, 3.0, 4.0]) == 2.5;
}

test "median of odd-length array returns middle element" {
    assert median([1.0, 3.0, 5.0]) == 3.0;
}

Testing with enums
When a function returns a tagged enum, match against it directly. This gives you a clearer failure message than unwrapping.

enum Direction {
    North,
    South,
    East,
    West,
}

test "parse_direction recognises north" {
    let result = parse_direction("north"); // returns Direction | ParseError
    match result {
        Direction.North => { /* pass */ },
        ParseError { message } => assert false, // unexpected error
        _ => assert false, // wrong variant
    }
}

Integration Testing
Integration tests verify that multiple components work correctly together, often using real effects like the filesystem or network. They are slower than unit tests and may require setup.

test "round-trip: write then read returns original content" ![io] {
    let path = "tests/fixtures/temp_roundtrip.txt";
    let original = "hello, world\n";

    try {
        fs.write(path, original);
        let recovered = fs.read(path);
        assert recovered == original;
    } finally {
        fs.delete(path);
    }
}

The finally block runs even if an assertion fails, ensuring the temporary file is cleaned up regardless of the test outcome.
Fixtures
Put static test files in a tests/fixtures/ directory and read them in tests that need them:

test "config parser handles multi-section TOML" ![io] {
    let source = fs.read("tests/fixtures/multi_section.toml");
    let config = Config.parse(source);
    assert config.sections.length == 3;
}

Keep fixtures small and purpose-built. A fixture that exists to test one behavior should not be reused for a different test if the two tests might need to diverge.
Setup and teardown
Sailfin does not have beforeEach/afterEach hooks. Instead, extract setup into a helper function and call it at the top of each test that needs it. Use try/finally for teardown:

fn create_temp_dir() -> string ![io, rand] {
    let path = "tests/temp/{{rand.uuid()}}";
    fs.mkdir(path);
    return path;
}

test "compiler emits expected IR" ![io, rand] {
    let dir = create_temp_dir();
    try {
        let source = "fn main() ![io] { print(\"hi\"); }";
        let out_path = "{{dir}}/out.sfn-asm";
        compile_to_file(source, out_path);
        let ir = fs.read(out_path);
        assert ir.contains("define void @main");
    } finally {
        fs.remove_all(dir);
    }
}

Testing Error Handling
To verify that a function throws under expected conditions, wrap the call in a try/catch block inside the test. The test fails if the exception is not thrown.

test "divide throws on zero divisor" {
    let threw = false;
    try {
        let _ = divide(10, 0);
    } catch (err) {
        threw = true;
    }
    assert threw;
}

To verify that the right kind of error is thrown, dispatch on the error’s shape inside the catch block using match:

test "parse_config throws ParseError on malformed input" {
    let got_parse_error = false;
    try {
        let _ = parse_config("{{ not valid toml }}}");
    } catch (err) {
        match err {
            ParseError { message } => { got_parse_error = true; },
            _ => assert false, // unexpected error shape
        }
    }
    assert got_parse_error;
}

Coming in 1.0: Typed catch clauses, such as catch (err: ParseError) { ... }, are on the roadmap.
Testing Union Return Types
Functions that return T | ErrorType don’t throw; they return an error value in the union. Test these by matching against the result:

test "parse_int returns success for valid input" {
    let result = parse_int("42");
    match result {
        ParseError { message } => assert false,
        _ => assert result == 42,
    }
}

test "parse_int returns error for non-numeric input" {
    let result = parse_int("not a number");
    match result {
        ParseError { message } => assert message.contains("not a number"),
        _ => assert false,
    }
}

Testing AI Code
Current status: model and prompt blocks parse correctly today but do not execute. Model invocation is planned for after the 1.0 release. This section describes the intended testing story.

When model execution lands, prompt-based functions will require ![model] effects and will be testable using a seed parameter for reproducibility:

// Planned execution: parses today; model invocation lands post-1.0
test "summarize returns a short string" ![model] {
    let result = summarize("A very long article about the history of programming languages...");
    assert result.length < 200;
}

Each model call produces a generation card containing the seed used. By fixing the seed, you get deterministic output across runs: the same prompt with the same seed against the same model version will always return the same response.
The workflow will look like:
- Run the test once with a known seed to capture the expected output.
- Pin the expected output and seed in the test.
- Future runs compare against the pinned output — any change is a signal to review.
This is deliberately different from mocking: you’re testing the real model behavior, pinned to a specific configuration.
Test Coverage
Coverage tooling is planned. The goal is line-level coverage that understands effect boundaries, so you can ask “do I have coverage for all my ![net] paths?” not just “which lines were executed?”.
For now, use the discipline of writing tests first to drive coverage naturally.
Best Practices
Test names are documentation. Someone reading your test file shouldn’t need to open the implementation to understand what the code is supposed to do.
One behavior per test. When a test verifies a single behavior, the failure message is precise. When a test verifies five behaviors, a failure tells you something broke but not what.
// Prefer: one behavior per test
test "trim removes leading whitespace" {
    assert trim(" hello") == "hello";
}

test "trim removes trailing whitespace" {
    assert trim("hello ") == "hello";
}

// Avoid: too many behaviors in one test
test "trim works" {
    assert trim(" hello") == "hello";
    assert trim("hello ") == "hello";
    assert trim(" hello ") == "hello";
    assert trim("") == "";
    assert trim("no spaces") == "no spaces";
}

Test the interface, not the implementation. If your test breaks when you refactor internals without changing behavior, the test is testing the wrong thing.
Keep unit tests fast. A test suite that takes two minutes to run will be skipped. Unit tests should take milliseconds. If a test is slow, look for ![io] or ![net] effects — those are your slow paths. Separate them into an integration test suite that you run less frequently.
Write a test for every bug fix. Before fixing a bug, write a test that reproduces it. Then fix the bug. Then confirm the test passes. This prevents regressions and documents the exact scenario that was broken.
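A minimal sketch of that workflow, reusing the factorial function from the co-located tests example. The scenario is hypothetical: suppose an earlier version returned the wrong result for a negative argument. The test below would be written first, fail against the broken version, and stay in the suite once the fix is in.

```sailfin
fn factorial(n: number) -> number {
    // The n <= 1 guard also covers negative input.
    if n <= 1 { return 1; }
    return n * factorial(n - 1);
}

// Regression test for the hypothetical negative-input bug.
test "factorial of negative number returns one" {
    assert factorial(-1) == 1;
}
```

The test name records the exact scenario that was broken, so a later failure points straight at the regression.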
Use try/finally for cleanup. Any test that creates temporary files, starts servers, or modifies shared state should clean up in a finally block. This ensures cleanup happens even when assertions fail.
Next Steps
- Effective Sailfin: Idiomatic patterns and best practices
- CLI Reference: Full sfn test options
- The Effect System: Understanding capability declarations