Will LLMs Help or Hurt New Programming Languages?
A perspective on how LLMs might impact the adoption of new programming languages.
Recently, I’ve seen a number of discussions circle around the same question:
Will large language models (LLMs) help or hurt the adoption of new programming languages?
It is a fair and timely question. As an academic researcher in programming language design and theory, and as the lead developer of the Flix programming language, will I soon be out of a job? Well, maybe not!
In this post, I want to explore whether a modern LLM — here, Claude Code (Opus 4.5) — is able to program in Flix. This is an interesting question because the amount of training data, e.g. Flix source code, is tiny compared to what exists for languages like Python and JavaScript.
Given that LLMs are reasonably good at translating between natural languages, e.g. from English to German, I think there is good reason to believe that they can also support new programming languages, if those languages are syntactically and semantically similar to existing ones.
But what happens when a new programming language introduces entirely new syntax and semantics? For example, the Flix programming language is built around effect-oriented programming. Will Claude be able to use that?
Aside: How much information about Flix is actually in Claude’s training set? Surely, Anthropic has scooped up our research papers, vacuumed our documentation, and gobbled our source code. Yet Flix evolves rapidly, and some information, e.g. from early research papers, may already be slightly outdated.
Experiment: Vibe Coding Tic-Tac-Toe
Today, we will "Vibe Code" (I dislike that term) a Tic-Tac-Toe game in Flix using Claude Code with the Opus 4.5 model. I chose Tic-Tac-Toe because it is a well-defined game that requires a moderate use of effects.
My intuition is that writing a new Flix program from scratch is harder for an LLM because there is no codebase it can gradually evolve, where the existing syntax and semantics can serve as examples. There is only the LLM itself and any information we put into its context.
We begin by creating an empty Flix project:
$ mkdir tictactoe
$ cd tictactoe
$ flix init
Next, we are going to help Claude by downloading the entire Flix API reference and the Flix Book documentation:
$ mkdir -p docs
$ cd docs/
$ wget -r -np -k https://api.flix.dev/
$ wget -r -np -k https://doc.flix.dev/
The two downloads take a few minutes. If you’re reading this blog post, you have my permission to crawl our documentation just like we did above.
Next, we create a CLAUDE.md file with the following text:
# Overview
This is project is written in the Flix programming language.
## Documentation
- **API Reference**: `docs/api.flix.dev/`
- **Documentation**: `docs/doc.flix.dev/`
## Flix Compiler Commands
The `flix` compiler is available on PATH and supports the following commands:
- `flix check` - Check code for errors
- `flix run` - Run the project
## Effect System
Flix has an effect system. Documentation is in `docs/doc.flix.dev/`:
- `effect-system.html` - Core effect system concepts
- `effect-polymorphism.html` - Effect polymorphism
- `effect-oriented-programming.html` - Effect-oriented programming
- `effects-and-handlers.html` - Effects and handlers
We explicitly instruct Claude on how to run the compiler and what documentation to consult about the effect system.
We can now start. We run claude and enter plan mode.
We give it the following prompt:
Let us write a tictactoe game in Flix. The game should ask the user for
a grid size (e.g. 3x3 or 4x4) and then prompt the user for a move. Check
that the move is valid according to the rules of tictactoe. After each
player move, the AI makes a valid move. A player wins when they have
three consecutive x’s or o’s.
Use the Console effect for user input and output. Use a functional data
structure to store the game state.
Use a recursive function for the game loop.
Use the Logger effect to log all moves in the game. Write your own custom
handler for Logger that writes to a logfile using the FileWrite effect.
Consult the Flix documentation as necessary during planning.
The prompt is reasonable, but not very polished. We let Claude chew on the instructions and it comes back with:
I’ll help you create a Tic-Tac-Toe game in Flix.
Let me first explore the Flix documentation to understand the effect
system, Console, and FileWrite APIs.
After a few minutes, we are asked a few questions, including:
How sophisticated should the AI opponent be?
❯ 1. Random valid moves
AI picks any random empty cell - simple to implement
2. Basic strategy
AI blocks wins and takes winning moves when available
3. Minimax algorithm
Optimal AI that never loses - more complex implementation
Initially, I wanted to pick (3). However, choosing (1) is actually more
challenging, because it requires Claude to use the Random effect, whose
handler introduces the primitive NonDet effect. Claude will have to figure
this out on its own.
We make our choice:
Now I have clear requirements. Let me design the implementation plan.
After about seven minutes of Clauding..., which involves reading both the Flix
API reference and documentation, including Random.html (which looks
promising), Claude reports back with the full plan (which I have abbreviated):
1. Data Type Definitions
...
type alias GameState = { board = Board, size = Int32, currentPlayer = Player, moveCount = Int32 }
type alias Move = { player = Player, row = Int32, col = Int32 }
...
2. Custom Logger Effect
eff Logger {
    def logMove(m: Move): Unit
}
3. Core Functions ...
4. I/O Functions ...
5. Logger Handler with FileWrite
def handleLoggerWithFile(logFile: String, f: Unit -> a \ ef): a \ (ef - Logger) + FileWrite =
    run {
        f()
    } with handler Logger {
        def logMove(m, resume) = {
            let line = "${m.player} played at (${m.row + 1}, ${m.col + 1})\n";
            FileWrite.append({str = line}, logFile);
            resume()
        }
    }
6. Recursive Game Loop
def gameLoop(state: GameState): GameResult \ {Console, Logger, Random} = ...
7. Main Entry Point
def main(): Unit \ IO =
    let logFile = "game_log.txt";
    FileWrite.runWithIO(() ->
        handleLoggerWithFile(logFile, () ->
            Random.runWithIO(() ->
                Console.runWithIO(() ->
                    startGame()
                )
            )
        )
    ) |> ignore
Effect handling order (inner to outer): Console -> Random -> Logger -> FileWrite -> IO
8. Implementation Steps ...
9. Win Detection Strategy ...
10. User Input Format ...
Overall, the plan looks reasonable, but there are a few issues:
1. Claude wants to define its own Logger effect instead of using the Logger effect from the Standard Library with a custom effect handler.
2. The main function, while reasonable, does not use the more idiomatic run-with construct, nor does it use default handlers.
3. Claude wants to use records for GameState and Move. This typically trips up Claude because Flix uses rec#field for field selection, not rec.field, which is used for Java interoperability.
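The record syntax in question can be sketched with a small example. This is a minimal illustration, not code from the game: the Point type and the sumCoords function are hypothetical names introduced here.

```flix
// Hypothetical record type, used only for illustration.
type alias Point = { x = Int32, y = Int32 }

// Field selection on a Flix record uses `#`, not `.`.
def sumCoords(p: Point): Int32 = p#x + p#y

def main(): Unit \ IO =
    let p = { x = 1, y = 2 };
    println(sumCoords(p))
```

Writing p.x here instead of p#x would be read as Java interoperability syntax rather than record field selection.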
We could give Claude instructions to resolve (1) and (2), and warn about (3). But instead, let us see how well Claude does without further help. Later, we can return and ask Claude to refactor (1) and (2). We proceed:
The plan is approved. Let me set up a task list and start implementing.
Claude generates about 350 lines of code and asks for permission to use the Flix compiler, which I grant. The project contains 20+ errors (including parse errors, name resolution errors, and type errors). Claude says:
I see several issues. The main ones are:
1. Logger is already defined in the standard library -
I need to rename it to GameLogger
2. The backtick syntax doesn’t work with qualified names -
need to use Int32.mod(index, size)
Claude has discovered that Logger is already defined, so it (incorrectly)
decides to rename its own Logger to GameLogger. Claude also incorrectly
assumed that Flix has infix syntax for modulo (i.e., x % y), but discovered
that it should instead use Int32.mod. While encountering these issues, Claude
is using both the Flix API reference and documentation to fix them.
Aside: The Flix compiler is highly resilient, so each compilation phase can proceed to the next, even if the program contains errors. In fact, about 95% of compilation errors are recoverable. As a result, for a program generated by an LLM with multiple errors, Flix can report all of them at once, allowing the LLM to fix multiple issues in one pass.
Claude makes several changes and reruns the compiler. Now it gets:
>> Unable to unify the types: 'Player' and 'JvmToType(j0)'.
286 | match state.currentPlayer {
^^^^^^^^^^^^^^^^^^^
The error message is not great (we are working on it). The problem is that the
syntax state.currentPlayer means reading the currentPlayer field of a Java
object, but state is a Flix extensible record. Undaunted, Claude continues:
I see several issues.
Let me check the Flix documentation for the correct syntax:
I see! In Flix, record field access uses # not .. So state.size
should be state#size.
Let me also search for the correct modulo syntax:
Now I understand. Let me rewrite the file with the correct syntax:
- Record access: state#size instead of state.size
- Modulo: Int32.modulo(index, size)
So far, all problems have been purely syntactic.
But now, Claude runs into its first effect-related issues:
...
I see! Key insights:
1. Random.runWithIO introduces NonDet effect (which is uninterpretable)
2. Random.runWithSeed(seed, f) handles Random without NonDet
3. No do keyword - effect operations are called directly
There are two issues. First, the Random.runWithIO handler has both the
IO and NonDet effects, which must be part of the effect signature of main.
Second, Flix no longer uses the do keyword to invoke an effect operation.
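To illustrate the second point, here is a minimal sketch of an effect operation being invoked directly, without do. The Dice effect and the rollTwice function are hypothetical; they are not part of the game or of the Flix Standard Library.

```flix
// Hypothetical effect, used only for illustration.
eff Dice {
    def roll(sides: Int32): Int32
}

// The operation is called like an ordinary function; there is no `do` keyword.
// The effect shows up in the signature instead.
def rollTwice(): Int32 \ Dice = Dice.roll(6) + Dice.roll(6)

def main(): Unit \ IO =
    // A deterministic handler that always resumes with the highest face.
    let total = run {
        rollTwice()
    } with handler Dice {
        def roll(sides, resume) = resume(sides)
    };
    println(total)
```

The handler decides what each operation means, so the same rollTwice function could be run with a handler backed by real randomness without changing its body.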
Surprisingly, Claude quickly overcomes both issues. A single error remains:
Good progress! Just one error left -
String.split already returns a List, not a Vector. Let me fix that:
And done! The program compiles. Let’s try to run it:
$ flix run
Welcome to Tic-Tac-Toe!
======================
Enter grid size (3-9):
3
Starting 3x3 game. You are X, AI is O.
Get 3 in a row to win!
1 2 3
1 . . .
2 . . .
3 . . .
Your turn (X):
Enter your move (row col), e.g., '1 2':
1 2
1 2 3
1 . X .
2 . . .
3 . . .
AI is thinking...
AI plays at (3, 3)
1 2 3
1 . X .
2 . . .
3 . . O
...
1 2 3
1 X X O
2 X X .
3 O . O
AI is thinking...
AI plays at (3, 2)
1 2 3
1 X X O
2 X X .
3 O O O
Player O wins!
Recall that we asked Claude to log all moves. There is a game_log.txt file:
=== Tic-Tac-Toe Game Log ===
X played at (1, 2)
O played at (3, 3)
X played at (1, 1)
O played at (1, 3)
X played at (2, 2)
O played at (3, 1)
X played at (2, 1)
O played at (3, 2)
Looks alright!
Refactoring the Game
Let’s return to our TODO list. We wanted Claude to address two issues:
1. Claude was using its own GameLogger effect, but we wanted it to use the built-in Logger effect with a custom handler.
2. The main entry point is correct, but written in a non-idiomatic style.
Let’s start with (1). We prompt Claude:
Refactor the project to use the built-in Logger effect.
Claude consults the Flix API reference and documentation.
I now understand the built-in Logger API. The refactoring involves:
1. Removing the custom GameLogger effect
2. Using Logger.info(message) instead of GameLogger.logMove(move)
3. Writing a custom handler for the built-in Logger that writes to
file using FileWrite
Notably, Claude has figured out that it has to write its own handler for Logger,
which itself uses the FileWrite effect.
I ask Claude to continue. The refactoring is successful: the GameLogger effect
is removed and Logger is used. The new custom handler is:
// Named "handleLoggerWithFile" but renamed to h to fit on the screen.
def h(logFile: String, f: Unit -> a \ ef): a \ (ef - Logger) + FileWrite =
    run {
        f()
    } with handler Logger {
        def log(severity, msg, resume) = {
            let line = "[${severity}] ${msg}\n";
            FileWrite.append({str = line}, logFile);
            resume()
        }
    }
Claude has figured out the correct syntax for an effect polymorphic function
that handles Logger while causing the FileWrite effect, i.e. the signature:
def h(logFile: String, f: Unit -> a \ ef): a \ (ef - Logger) + FileWrite =
Impressive!
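The same signature shape applies to any handler that removes one effect and introduces another. As a minimal sketch under stated assumptions (the Greeter effect and the withConsoleGreeter function are hypothetical names, not taken from the game), a handler that turns a custom effect into Console output might look like this:

```flix
// Hypothetical effect, used only for illustration.
eff Greeter {
    def greet(name: String): Unit
}

// Runs `f`, handling Greeter by printing to the console.
// The result effect is whatever `f` had, minus Greeter, plus Console.
def withConsoleGreeter(f: Unit -> a \ ef): a \ (ef - Greeter) + Console =
    run {
        f()
    } with handler Greeter {
        def greet(name, resume) = {
            Console.println("Hello ${name}!");
            resume()
        }
    }

// Console is left unhandled here and relies on its default handler.
def main(): Unit \ Console =
    withConsoleGreeter(() -> Greeter.greet("Flix"))
```

Reading the signature left to right: ef is the effect of the function being run, (ef - Greeter) says the handler eliminates Greeter, and + Console says the handler itself performs console output.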
Next, I want to fix (2), i.e. improve the style of main. I prompt:
Refactor main to use default handlers for the built-in effects.
Use run-with for the Logger effect and custom handler.
See @docs/doc.flix.dev/default-handlers.html for documentation.
At this point the dreaded "compacting conversation" happens. Let’s see how it goes.
After a bit of Clauding...:
The error shows that `Random` doesn’t have a default handler.
Let me check the documentation more carefully:
Read(docs/doc.flix.dev/default-handlers.html)
⎿ Read 284 lines
I see - the documentation shows Clock, Env, and Logger have
default handlers, but Random may not. Let me check the Random API:
Search(pattern: "@DefaultHandler", path: "docs/api.flix.dev")
⎿ Found 10 lines (ctrl+o to expand)
Console and Logger have default handlers, but Random does not.
I need to keep Random.runWithIO() explicit:
The refactoring completes successfully. The new main is:
def main(): Unit \ {Console, IO, NonDet} =
    let logFile = "game_log.txt";
    // Initialize log file
    discard FileWrite.runWithIO(() -> FileWrite.write({str = "=== Tic-Tac-Toe Game Log ===\n"}, logFile));
    // Run the game - Console uses default handler, Random/Logger/FileWrite need explicit handlers
    discard FileWrite.runWithIO(() ->
        handleLoggerWithFile(logFile, () ->
            Random.runWithIO(() ->
                startGame()
            )
        )
    )
Ugh. This is not good. While the code is technically correct, its structure is terrible. A few things have gone wrong, partly due to my own instructions:
- I asked Claude to use default handlers for all effects. However, the FileWrite effect has no default handler because it can fail. In particular, FileWrite.runWithIO returns a Result[IoError, Unit]. Claude, in its infinite wisdom, decided that we should simply discard the Result 🤡
- I asked Claude to use the run-with construct, but it did not.
Let us try again, but this time we will be more specific:
I want you to refactor main. I want it to use a single run-with construct.
Let bind the result to a local variable called r. Inside the run, you
should call FileWrite.write and startGame. You should then use "with" to
install the handlers handleLoggerWithFile and Random.runWithIO. Lastly,
pattern match on the result r, and print an error message using
Console.println if it is Result.Err.
Now Claude consults the Flix documentation to learn about run-with and produces:
def main(): Unit \ {Console, IO, NonDet} =
    let logFile = "game_log.txt";
    let r = run {
        FileWrite.write({str = "=== Tic-Tac-Toe Game Log ===\n"}, logFile);
        startGame()
    } with handleLoggerWithFile(logFile)
      with Random.runWithIO
      with FileWrite.runWithIO;
    match r {
        case Result.Err(e) => Console.println("Error: ${e}")
        case Result.Ok(_) => ()
    }
Much better. This example shows that Claude is able to program effectively (no
pun intended!) with a rich collection of effects. The program uses the
Console, FileWrite, Logger, and Random effects. Embracing the richness
of the domain, we handle the Logger effect using handleLoggerWithFile, which
in turn causes the FileWrite effect. We then handle the remaining effects
using their associated runWithIO handlers, introducing the primitive effects
IO and NonDet. Lastly, Console is handled by its default effect handler.
Aside: This program illustrates how, in effect-oriented programming languages, effects are pushed outwards toward main, where they must be handled. In other words, all dependency injection and plumbing typically happens near main, while most of the program remains pure modulo effects.
We might be disappointed that main is still fairly complex. Why not use
default handlers for FileWrite and Random? For Random, Flix has a
limitation that prevents it. For FileWrite, errors are pushed to the handler
rather than the call site, so we must handle them ourselves (here, by printing
to the terminal). We could have used FileWriteWithResult, forcing the game to
handle errors and allowing a default handler in main, but that would add
complexity to the game logic.
Game API and Excerpts
For reference, here are some of the function signatures of the game:
def aiMove(state: GameState): (Int32, Int32) \ Random =
def cellToChar(cell: Cell): String =
def displayBoard(state: GameState): Unit \ Console =
def parseInt(s: String): Option[Int32] =
def promptGridSize(): Int32 \ Console =
def promptMove(state: GameState): (Int32, Int32) \ Console =
def gameLoop(state: GameState): GameResult \ {Console, Logger, Random} =
def startGame(): Unit \ {Console, Logger, Random} =
def main(): Unit \ {Console, IO, NonDet} =
Claude has done a solid job breaking the game into smaller functions, each using
only the effects it needs. Notably, the IO effect is only used by main. This
is the essence of effect-oriented programming, and Claude has achieved it.
We can look at a few examples:
def startGame(): Unit \ {Console, Logger, Random} =
    Console.println("Welcome to Tic-Tac-Toe!");
    Console.println("======================");
    let size = promptGridSize();
    let state = initGameState(size);
    Console.println("");
    Console.println("Starting ${size}x${size} game. You are X, AI is O.");
    Console.println("Get 3 in a row to win!");
    Console.println("");
    discard gameLoop(state)
Here startGame has the Console, Logger, and Random effects. Console is
used directly, whereas Logger and Random are used by gameLoop.
And here is the promptGridSize function:
def promptGridSize(): Int32 \ Console =
    Console.println("Enter grid size (3-9):");
    let input = Console.readln();
    match parseInt(String.trim(input)) {
        case Some(n) if n >= 3 and n <= 9 => n
        case _ =>
            Console.println("Invalid size. Please enter a number between 3 and 9.");
            promptGridSize()
    }
Lessons Learned
I think the key lessons are:
- Giving Claude access to the newest Flix API reference and documentation is invaluable. Claude is surprisingly good at consulting both whenever it needs to use a feature or runs into a problem.
- Giving Claude access to the compiler is a must. Furthermore, the Flix compiler’s resilience allows Claude to discover and correct multiple issues in one pass.
Claude’s code style isn’t great. I’d call it unopinionated, which feels a bit odd—but perhaps makes sense, given that there isn’t yet a defined Flix style. In the future, I’d like to explore whether giving Claude a style guide would help.
Conclusions
Returning to the central question of this blog post:
Will large language models (LLMs) help or hurt the adoption of new programming languages?
Based on my experience over the past six months, I’m convinced that LLMs will only help the adoption of new languages. We’ve seen that by giving Claude access to the Flix API reference, documentation, and compiler, it can become a capable Flix programmer.
While Claude makes silly syntax mistakes, which would not happen for Python or
JavaScript, it can usually overcome them. More impressively, Claude is already
capable of effect-oriented programming. It can write functions that use
effects, functions that handle effects, and even intricate effect polymorphic
functions such as handleLoggerWithFile.
Thus, I think, the future for new programming languages is as bright as ever.
Until next time, happy Clauding.