2026 is the year ai gets good at excel

1/6/2026

opus 4.5 and gpt5.2 are currently better at writing code than most software engineers on the planet

but they still can't fully replace nancy from finance

or to not even go that far: nancy from finance hasn't had her claude code moment yet

what will it take for nancy from finance to have her claude code moment?

code is just text and its structure:

- is human readable
- is heavily represented in the training set for AIs
- has millions of tools built around it after decades of the existence of the software dev industry
  - programmers love building tools to speed up their internal loops or selling to other programmers
    - also makes sense because this is what they inherently have deep domain expertise in
      - this is a tangential meta comment now but this is why the software dev vertical was affected first by ai augmentation

I write these posts in mdx files that just live alongside the codebase for this website, which is why I can now confidently just prompt claude code something like:

(video, sped up 2x)

something like a powerpoint file is:

actually a zip file and a bunch of xml in a trench coat with shit all over the place

its underlying data structure was always meant to be read and written by the powerpoint application - not by something going in and manually meddling with individual components
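to make the "zip file in a trench coat" point concrete, here's a minimal sketch using only python's standard library. the `list_parts` and `make_toy_pptx` helpers are names I made up for illustration, and the toy archive only mimics pptx-style paths - it is not a deck powerpoint would open.

```python
import io
import zipfile


def list_parts(pptx_bytes: bytes) -> list[str]:
    """Return the sorted names of the files packed inside a .pptx archive."""
    with zipfile.ZipFile(io.BytesIO(pptx_bytes)) as zf:
        return sorted(zf.namelist())


def make_toy_pptx() -> bytes:
    """Build a tiny stand-in archive with pptx-style paths (hypothetical, not a valid deck)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("[Content_Types].xml", "<Types/>")
        zf.writestr("ppt/presentation.xml", "<presentation/>")
        zf.writestr("ppt/slides/slide1.xml", "<sld/>")
    return buf.getvalue()


if __name__ == "__main__":
    # Each printed line is one file living inside the "single" pptx file.
    for name in list_parts(make_toy_pptx()):
        print(name)
```

pointing `list_parts` at the bytes of a real `.pptx` should give a much longer list, which is roughly the mess an agent has to find its way around.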

Anthropic has started building a bridge for its models to cross to microsoft office land by giving agents skills

here's an excerpt from Anthropic's powerpoint skill so agents can better handle unpacking powerpoint files:

### Raw XML access
You need raw XML access for: comments, speaker notes, slide layouts, animations, design elements, and complex formatting. For any of these features, you'll need to unpack a presentation and read its raw XML contents.

#### Unpacking a file
`python ooxml/scripts/unpack.py <office_file> <output_dir>`

**Note**: The unpack.py script is located at `skills/pptx/ooxml/scripts/unpack.py` relative to the project root. If the script doesn't exist at this path, use `find . -name "unpack.py"` to locate it.

#### Key file structures
* `ppt/presentation.xml` - Main presentation metadata and slide references
* `ppt/slides/slide{N}.xml` - Individual slide contents (slide1.xml, slide2.xml, etc.)
* `ppt/notesSlides/notesSlide{N}.xml` - Speaker notes for each slide
* `ppt/comments/modernComment_*.xml` - Comments for specific slides
* `ppt/slideLayouts/` - Layout templates for slides
* `ppt/slideMasters/` - Master slide templates
* `ppt/theme/` - Theme and styling information
* `ppt/media/` - Images and other media files

#### Typography and color extraction
**When given an example design to emulate**: Always analyze the presentation's typography and colors first using the methods below:
1. **Read theme file**: Check `ppt/theme/theme1.xml` for colors (`<a:clrScheme>`) and fonts (`<a:fontScheme>`)
2. **Sample slide content**: Examine `ppt/slides/slide1.xml` for actual font usage (`<a:rPr>`) and colors
3. **Search for patterns**: Use grep to find color (`<a:solidFill>`, `<a:srgbClr>`) and font references across all XML files
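
(not part of the skill excerpt) step 1 above could be sketched roughly like this with python's built-in xml parser. `theme_colors` and the `TOY_THEME` string are my own hypothetical stand-ins - a real `theme1.xml` is much bigger, and this only picks up colors stored as `<a:srgbClr>`.

```python
import xml.etree.ElementTree as ET

# DrawingML namespace that the "a:" prefix maps to in Office XML files.
A_NS = "http://schemas.openxmlformats.org/drawingml/2006/main"

# Hypothetical, heavily trimmed stand-in for ppt/theme/theme1.xml.
TOY_THEME = f"""
<a:theme xmlns:a="{A_NS}" name="Toy">
  <a:themeElements>
    <a:clrScheme name="Toy">
      <a:dk1><a:sysClr val="windowText" lastClr="000000"/></a:dk1>
      <a:dk2><a:srgbClr val="1F497D"/></a:dk2>
      <a:accent1><a:srgbClr val="4F81BD"/></a:accent1>
    </a:clrScheme>
  </a:themeElements>
</a:theme>
"""


def theme_colors(theme_xml: str) -> dict[str, str]:
    """Map each color slot (dk2, accent1, ...) to its hex value, if it has an srgbClr."""
    root = ET.fromstring(theme_xml)
    scheme = root.find(f".//{{{A_NS}}}clrScheme")
    colors: dict[str, str] = {}
    if scheme is None:
        return colors
    for slot in scheme:
        rgb = slot.find(f"{{{A_NS}}}srgbClr")
        if rgb is not None:
            # slot.tag looks like "{namespace}accent1"; keep only the local name.
            colors[slot.tag.split("}")[1]] = rgb.get("val")
    return colors


if __name__ == "__main__":
    print(theme_colors(TOY_THEME))
```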

The next part of that same skill tells claude to create powerpoints from scratch by first building the presentation in html and then using a tool to convert it to a pptx

These layers of abstraction should be torn down over time - be that by the model literally navigating the human GUI or just intrinsically being able to navigate the underlying structures

Reinforcement Learning with Verifiable Rewards

I'm admittedly not entirely in the technical weeds for this next bit and haven't yet fully grasped it but:

the post-training stage for llms in 2025 got a huge boost from giving models reinforcement learning environments to learn different tasks in - google or ask an ai about RLVR for more info on this bc im too lazy to get into it and it's not my place

I'm sure that one of the places the labs are all investing most heavily is RL environments for manipulating docx, pptx, and xlsx files
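
to give a rough feel for what "verifiable" could mean for an office file, here's a toy reward function. this is purely my own guess at the shape of the thing, not how any lab actually does it: `reward` and `pack` are made-up names, and the check (is it a zip, does it contain `xl/workbook.xml`, does every xml part parse) is far weaker than anything a real environment would need.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def reward(output_bytes: bytes, required_part: str = "xl/workbook.xml") -> float:
    """Hypothetical binary reward: 1.0 if the output passes basic structural checks, else 0.0."""
    try:
        with zipfile.ZipFile(io.BytesIO(output_bytes)) as zf:
            names = zf.namelist()
            if required_part not in names:
                return 0.0
            for name in names:
                if name.endswith(".xml"):
                    # Raises ParseError if the part is not well-formed XML.
                    ET.fromstring(zf.read(name))
    except (zipfile.BadZipFile, ET.ParseError):
        return 0.0
    return 1.0


def pack(files: dict[str, str]) -> bytes:
    """Zip a {path: content} mapping into bytes, standing in for a model's output file."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for path, content in files.items():
            zf.writestr(path, content)
    return buf.getvalue()


if __name__ == "__main__":
    print(reward(pack({"xl/workbook.xml": "<workbook/>"})))  # well-formed
    print(reward(pack({"xl/workbook.xml": "<workbook>"})))   # broken xml
    print(reward(b"not a zip at all"))                       # not even an archive
```

the point is just that a program, not a human, decides whether the model did the task, which is what makes it possible to run this at scale.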

these seeds have begun to be planted but im sure we'll start seeing their fruit with the new generation of models of 2026

Like it happened with programming, the models will continue to get better intrinsically alongside the evolution of tools and harnesses that wrap them

A company like shortcut.ai may actually have its Cursor for Excel moment

Or Microsoft Copilot for Excel could actually be a thing in the near future if microsoft doesn't continue fumbling everything