I Let an AI Write My GIS Workflow. Here’s What Broke.

There’s a growing narrative that AI can write code for you.

That’s true.

What’s more interesting is what happens after the code is written.

I’ve been experimenting with integrating AI directly into ArcGIS Pro workflows. The idea is simple:

Describe what you want → get working geoprocessing code.

In practice, it looks like this:

  • I type: “Select all points within 1 mile of schools and summarize by district”
  • The system generates Python (ArcPy)
  • The code runs inside a real project
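
A hypothetical sketch of the kind of ArcPy the system generates for that prompt. The geoprocessing tools are real; the layer names, field names, and output table are illustrative, and the function only runs inside an ArcGIS Pro Python environment.

```python
# Illustrative sketch of generated code for:
# "Select all points within 1 mile of schools and summarize by district"
def summarize_points_near_schools(points_fc, schools_fc, out_table):
    import arcpy  # deferred: arcpy is only available where ArcGIS is installed
    # Select points within 1 mile of any school...
    arcpy.management.SelectLayerByLocation(
        points_fc, "WITHIN_A_DISTANCE", schools_fc, "1 Miles", "NEW_SELECTION"
    )
    # ...then count the selected points per district.
    arcpy.analysis.Statistics(
        points_fc, out_table, [["OBJECTID", "COUNT"]], "DISTRICT"
    )
```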

And sometimes… it works perfectly.

Other times, it breaks in ways that are surprisingly consistent.

Where things actually break

1. The “almost right” problem

AI is very good at generating code that looks correct.

It’s much worse at generating code that:

  • uses the correct coordinate system
  • handles edge cases in real datasets
  • respects schema constraints

Example:

  • It buffers in degrees instead of meters, because the layer is in a geographic coordinate system
  • Or it assumes a field exists that doesn’t

This is dangerous because:

The output looks valid, but the result is wrong.
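
Both failure modes can be caught before anything runs. Here is a minimal pre-flight check; `layer_info` is a stand-in dict, and inside ArcGIS Pro the same facts would come from `arcpy.Describe` (spatial reference) and `arcpy.ListFields`. All names are illustrative.

```python
# A minimal pre-flight check for the two "almost right" failure modes above.
def preflight(layer_info, required_fields, linear_op=True):
    """Return a list of problems that would make generated code 'almost right'."""
    problems = []
    # Geographic coordinate systems measure in degrees, so a "1 mile" buffer
    # silently becomes roughly one degree instead.
    if linear_op and layer_info.get("crs_unit") == "Degree":
        problems.append("geographic CRS: linear distances will be in degrees")
    missing = set(required_fields) - set(layer_info.get("fields", []))
    if missing:
        problems.append("missing fields: " + ", ".join(sorted(missing)))
    return problems

issues = preflight(
    {"crs_unit": "Degree", "fields": ["OBJECTID", "NAME"]},
    required_fields={"DISTRICT"},
)
# issues flags both the unit problem and the missing DISTRICT field
```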

2. Context is everything (and AI doesn’t have enough of it)

In a real GIS project:

  • layers have naming conventions
  • fields have meaning
  • projections matter

Without that context, AI guesses.

Sometimes correctly. Often not.

This is where most “AI coding demos” fall apart. They work in isolation, not inside messy systems.
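
One way to close that gap is to stop letting the model guess: serialize the project’s actual layer names, fields, and coordinate systems into the prompt. The sketch below assumes a `project` dict standing in for what `arcpy.Describe` would report; all names are illustrative.

```python
# Build a plain-text context manifest of the project's layers for the prompt.
def build_context(project):
    lines = ["Layers available:"]
    for layer in project["layers"]:
        fields = ", ".join(layer["fields"])
        lines.append(f"- {layer['name']} (CRS: {layer['crs']}; fields: {fields})")
    return "\n".join(lines)

ctx = build_context({
    "layers": [
        {"name": "schools_pt", "crs": "EPSG:26915", "fields": ["OBJECTID", "DISTRICT"]},
        {"name": "parcels_poly", "crs": "EPSG:26915", "fields": ["OBJECTID", "LAND_USE"]},
    ]
})
```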

3. Execution is the real problem

Generating code is easy.

Running it safely is not.

In a production environment, you need:

  • dry-run modes
  • logging
  • validation checks
  • rollback strategies

Without that, you’re basically letting an AI modify your data blindly.
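
The list above can be sketched as a guarded executor: every generated operation is logged, validated, and skipped entirely in dry-run mode. The operations here are plain callables standing in for geoprocessing tools; everything about this sketch is illustrative, not a real ArcGIS API.

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("guarded-exec")

def run_guarded(op, *args, dry_run=True, validate=None):
    """Run a generated operation with logging, validation, and a dry-run gate."""
    log.info("about to run %s%r", op.__name__, args)
    if validate is not None and not validate(*args):
        raise ValueError(f"validation failed for {op.__name__}")
    if dry_run:
        log.info("dry run: skipping %s", op.__name__)
        return None
    result = op(*args)
    log.info("finished %s", op.__name__)
    return result

def delete_rows(table):
    # Stand-in for a destructive geoprocessing call.
    return f"deleted rows in {table}"
```

Note the default: `dry_run=True`. Destructive execution has to be opted into, not out of.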

What actually works

After a lot of trial and error, I’ve landed on a pattern:

AI should:

  • generate code
  • suggest approaches

Humans should:

  • validate intent
  • review execution
  • own the result

The key shift

The real value of AI isn’t:

“write code for me”

It’s:

“reduce the distance between intent and execution”

But there’s a gap between:

  • generated code
  • trustworthy systems

Most of my work lately has been about closing that gap.

What I’m exploring next

  • scoring AI-generated code quality
  • comparing outputs against known-good datasets
  • building guardrails into execution environments
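
A tiny sketch of the second item, comparing outputs against a known-good dataset: summarize each result as a row count plus a set of key values, then diff candidate against reference. Real comparisons would also check geometry and attribute values; rows and keys here are illustrative.

```python
def summarize(rows, key):
    return {"count": len(rows), "keys": {r[key] for r in rows}}

def diff_outputs(candidate, reference, key="DISTRICT"):
    """Diff an AI-generated output against a known-good reference."""
    a, b = summarize(candidate, key), summarize(reference, key)
    return {
        "count_delta": a["count"] - b["count"],
        "missing_keys": sorted(b["keys"] - a["keys"]),
        "extra_keys": sorted(a["keys"] - b["keys"]),
    }

report = diff_outputs(
    [{"DISTRICT": "A"}, {"DISTRICT": "B"}],
    [{"DISTRICT": "A"}, {"DISTRICT": "B"}, {"DISTRICT": "C"}],
)
# report shows one row short and district "C" missing from the candidate
```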

Basically:

Not “can AI write code?”
But “can we trust what it produces?”

If you’re using AI in real workflows, I’d love to hear:

  • where it breaks for you
  • what guardrails you’ve built

Because that’s where the interesting work is happening.