Building Spotsjekk: How to Use AI Coding Tools Effectively
Introduction
I recently built Spotsjekk, a Norwegian electricity price comparison app. It helps you figure out if you’re better off on a flat-rate plan (Norgespris) or dynamic spot pricing with the government subsidy. But the more interesting story here isn’t the app itself. It’s how I built it: using Google’s “vibe coding” IDE called Antigravity, combined with Claude deep research for gathering requirements.
So this post is really about how to use agentic coding tools without losing control of what they’re building. If you’re working with AI coding assistants, some of this might save you a few headaches.
The Real Challenge: Keeping Context Intact
Here’s the thing I learned pretty quickly. The hard part of working with AI coding tools isn’t getting them to generate code. They’re great at that. The hard part is keeping them grounded in what you actually need.
Without managing context properly, you end up with:
- Solutions that sound right but don’t match your actual requirements
- Code that breaks domain-specific rules the AI didn’t know about
- Patterns the AI invented because it lost track of your specs
- Gradual drift from your original design as the project grows
Why does this happen? Because once the AI loses track of your documentation and specs, it fills the gaps with things that sound plausible. The longer a conversation goes without anchoring back to your actual requirements, the further the implementation drifts.
So the solution is simple in theory: write solid documentation first, then actively keep the AI referencing it.
Step 1: Research Before Code
Before writing any code, I used Claude deep research to understand the Norwegian electricity market and saved everything as markdown files:
1. norwegian_eletricity_market.md - Domain knowledge:
- How Norgespris pricing works (flat 50 øre/kWh vs spot pricing)
- The Strømstøtte subsidy formula: (spot - 75) × 0.90 × 1.25
- Price zones (NO1-NO5) and MVA (tax) handling
- Edge cases like negative spot prices
2. norgespris_architecture.md - Technical spec:
- API constraints (HvaKosterStrommen.no returns one day at a time)
- Database schema (hourly prices, daily aggregates, running totals)
- Calculation formulas with exact break-even analysis
- Docker architecture (ended up going with SQLite for simplicity)
3. nettleie_api_analysis.md - Integration research:
- Available APIs for network fees
- Grid company data sources
These files became the source of truth that kept everything honest throughout development. When Antigravity generated code, I could verify it against these docs and catch mistakes early.
The key insight here: Claude deep research is really good at synthesizing domain knowledge into structured docs. I got a thorough understanding of Norwegian power market regulations, subsidy formulas, and API ecosystems, all saved as markdown I could version control.
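To make the domain rules above concrete, here is a minimal Python sketch of the Strømstøtte formula and the flat-rate comparison. This is my illustration, not the app's actual code: function names, the threshold guard, and the MVA handling shown are assumptions based on the formula as documented.

```python
# Sketch of the Strømstøtte subsidy from docs/norwegian_eletricity_market.md.
# All prices in øre/kWh. Names and MVA handling are illustrative assumptions.

NORGESPRIS_ORE = 50.0   # flat Norgespris rate from the spec
THRESHOLD_ORE = 75.0    # subsidy applies above this spot price

def stromstotte(spot_ore: float) -> float:
    """Subsidy per kWh: (spot - 75) × 0.90 × 1.25.
    Zero below the threshold, including for negative spot prices."""
    if spot_ore <= THRESHOLD_ORE:
        return 0.0
    return (spot_ore - THRESHOLD_ORE) * 0.90 * 1.25

def cheaper_plan(spot_ore: float) -> str:
    # Effective spot cost: price incl. 25% MVA, minus the subsidy.
    effective = spot_ore * 1.25 - stromstotte(spot_ore)
    return "norgespris" if effective > NORGESPRIS_ORE else "spot"
```

Having the formula written down this explicitly, with the negative-price edge case handled, is exactly what the market doc made possible.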
Step 2: How Antigravity’s Artifacts Work
This is where things get interesting. Antigravity has a system called artifacts that gives you transparency into what the AI is doing. These aren’t files in your repo. They’re deliverables the AI creates as it works:
- Task lists: Steps the agent has taken
- Implementation plans: Architectural changes before they’re made
- Walkthroughs: Descriptions of what was built and how to test it
- Screenshots: Captured automatically during browser testing
- Browser recordings: Video of the agent testing your app
The powerful part is you can interact with these using Google Docs-style comments:
Artifact: Implementation Plan
────────────────────────────
Add Statnett grid data integration
- Create new database table for regional data
- Implement API service in services/statnett_service.py
- Add endpoint to routers/statnett.py
[Your comment]: "This should follow the same pattern
as price_fetcher.py. Check docs/architecture.md for
the service pattern."
[Agent response]: Updated plan to match existing
price_fetcher pattern from architecture doc.

The agent takes your feedback into account for future iterations. That feedback loop is what keeps things consistent.
The Workflow That Worked
Your Repository          Antigravity Artifacts             Generated Code
────────────────         ─────────────────────             ──────────────
docs/                 →  [Task List]                    →
  architecture.md     →  [Implementation Plan]          →  backend/app/
  market.md           →    - References docs/market.md  →    calculations.py
  api_analysis.md     →  [Walkthrough]                  →  services/
                         [Screenshots of app running]
                         [Browser recording]

The browser integration is also worth mentioning. The Chrome extension captures screenshots and video recordings of the agent actually testing the app, so you have proof it ran and verified the code, not just generated it.
Keeping Artifacts Honest
This is where my involvement was critical. Artifacts are useful, but they need supervision:
- Review before approving: Check implementation plans against your docs. Watch the browser recordings. Make sure task lists match what you actually asked for.
- Correct drift with comments: When an artifact suggests something that doesn't match the architecture, call it out:
[Me commenting on Implementation Plan]:
"This calculation doesn't match the formula in
docs/norwegian_eletricity_market.md. The Strømstøtte
formula is (spot - 75) × 0.90 × 1.25, not what's shown here."
[Agent updates artifact]:
Updated calculation to match docs/norwegian_eletricity_market.md
Formula: support = (spot - 75) × 0.90 × 1.25

- Use artifacts for working memory, docs for truth: Artifacts capture current state. Docs capture what should be. When those diverge, the docs win.
Step 3: Building With Antigravity
With documentation and the artifact system in place, here’s how development actually went.
What Worked Well
Domain logic came out correct. The Norwegian electricity subsidy calculation isn’t trivial. The formula has thresholds, multipliers for tax, edge cases with negative prices. Because my market research doc had the exact formula with examples, Antigravity implemented it correctly. I checked the generated code against the spec and it matched.
Architecture stayed consistent. The architecture doc specified FastAPI with SQLAlchemy, three database tables, APScheduler for daily price fetching, Docker Compose with SQLite. Antigravity followed this throughout. New features fit existing patterns because the artifacts kept referencing the architecture doc.
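One concrete example of a spec constraint flowing into code: the architecture doc notes that HvaKosterStrommen.no returns one day of prices at a time, so the daily fetcher has to walk a date range URL by URL. A sketch of that loop follows; the URL pattern is my reading of the public API and may differ, and the helper names are hypothetical.

```python
from datetime import date, timedelta

# Assumed base URL for HvaKosterStrommen.no's public API:
# one JSON file per day and price zone.
API_BASE = "https://www.hvakosterstrommen.no/api/v1/prices"

def price_url(day: date, zone: str = "NO1") -> str:
    # e.g. .../2024/01-15_NO1.json
    return f"{API_BASE}/{day.year}/{day.month:02d}-{day.day:02d}_{zone}.json"

def urls_for_range(start: date, end: date, zone: str = "NO1"):
    """Yield one URL per day, since the API returns only a single day at a time."""
    d = start
    while d <= end:
        yield price_url(d, zone)
        d += timedelta(days=1)
```

In the real app this loop would run inside the APScheduler job, fetching yesterday's and today's files on a daily schedule.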
Library choices made sense. The architecture doc mentioned requirements like “smart caching” and “daily background jobs.” Antigravity picked TanStack Query for React data fetching, APScheduler for the price fetcher, SlowAPI for rate limiting, Recharts for charts. These weren’t random. They aligned with what was documented.
What Needed Guidance
I still had to verify everything. Even with good docs, I’d check: does this calculation in calculations.py match the formula in the market doc? Are we handling negative spot prices correctly? The AI got it right most of the time, but having docs as reference let me catch the subtle bugs early.
Performance decisions emerged during development. The architecture doc laid out the schema but didn’t specify when to calculate aggregates. I guided Antigravity to pre-calculate daily aggregates on price arrival (not per request), store running totals in the database, and follow the aggregator service pattern from the architecture doc.
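The aggregate-on-arrival idea can be sketched with SQLite directly. This is a simplified illustration, assuming a table and column names of my own invention rather than the app's actual schema:

```python
import sqlite3

# Illustrative schema; the real table names live in the architecture doc.
SCHEMA = """
CREATE TABLE IF NOT EXISTS daily_aggregates (
    day     TEXT PRIMARY KEY,
    avg_ore REAL,
    min_ore REAL,
    max_ore REAL
);
"""

def upsert_daily_aggregate(conn: sqlite3.Connection, day: str, hourly_ore: list) -> None:
    """Compute the day's aggregate once, when prices arrive, so read
    requests never have to aggregate hourly rows on the fly."""
    conn.execute(
        "INSERT OR REPLACE INTO daily_aggregates VALUES (?, ?, ?, ?)",
        (day, sum(hourly_ore) / len(hourly_ore), min(hourly_ore), max(hourly_ore)),
    )
    conn.commit()
```

The design choice is the usual read/write trade-off: prices arrive once a day, but comparisons are viewed constantly, so paying the aggregation cost at write time is cheap.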
Deployment details needed manual work. “Docker Compose with SQLite” was in the spec, but production specifics like multi-stage Docker builds, separate dev and prod compose files, and Caddy reverse proxy for HTTPS evolved as the project matured.
What I Learned About Managing Context
Documentation as Source of Truth
The workflow that kept things on track:
docs/                            Antigravity Artifacts      Code
─────────────────────            ────────────────────       ────────────
architecture.md               →  [Working Spec]          →  backend/
norwegian_eletricity_market.md → [Domain Rules]          →  calculations.py
api_analysis.md               →  [Current State]         →  services/
                                 [Screenshots]
                                 [Browser recordings]

Whenever artifacts drifted, I'd say things like:
- “Check docs/norwegian_eletricity_market.md for the correct Strømstøtte formula”
- “Reference docs/architecture/norgespris_architecture.md section on database schema”
That kept the artifacts honest and prevented hallucination.
Active Artifact Management
Artifacts need supervision. What I did:
- Reviewed them regularly against the docs
- Corrected them when they suggested patterns inconsistent with the architecture
- Treated artifacts as working memory (current state) and docs as truth (what should be)
Me: "The artifact says to use PostgreSQL, but docs/architecture.md
specifies SQLite for simplicity. Update the artifact."
Antigravity: [Updates artifact to reference SQLite decision from docs]

The Verification Loop
My actual development cycle looked like this:
1. Ask Antigravity to implement feature X
2. Antigravity references its artifacts and creates code
3. I verify the code against the original docs
4. If something's off: correct the artifact and regenerate
5. If it matches: move on
That verification step is everything. Without it, artifacts gradually drift from your original specs.
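The verification can even be mechanical: transcribe the spec formula into a tiny oracle and assert the generated code agrees with it across the edge cases. A sketch, where `impl_support` is a stand-in for the generated function in calculations.py, not the actual code:

```python
# A spec oracle transcribed from docs/norwegian_eletricity_market.md:
def spec_support(spot_ore: float) -> float:
    return max(spot_ore - 75.0, 0.0) * 0.90 * 1.25

# Hypothetical stand-in for the generated code in calculations.py:
def impl_support(spot_ore: float) -> float:
    return max(spot_ore - 75.0, 0.0) * 0.90 * 1.25

# Verify the implementation against the doc, including the
# negative-price edge case, before moving on.
for spot in (-20.0, 0.0, 74.9, 75.0, 120.0, 500.0):
    assert abs(impl_support(spot) - spec_support(spot)) < 1e-9
```

Checks like this turn "verify against the docs" from a reading exercise into a repeatable test.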
Good Prompts vs Bad Prompts
What worked:
- “Implement the price fetcher service using the architecture from docs/architecture.md”
- “Add Statnett integration following the API patterns documented in docs/”
- “Does this calculation match what’s specified in docs/norwegian_eletricity_market.md?”
What didn’t:
- Assuming artifacts were always correct
- Not checking generated code against documentation
- Letting the conversation drift without tying back to docs
The Result
Spotsjekk went from research to production:
- Full-stack: FastAPI backend, React frontend, Docker deployment
- Production quality: Error handling, caching, rate limiting, security headers
- Complex domain logic: Norwegian electricity subsidy calculations with edge cases
- Real traffic: Live at spotsjekk.no serving actual users
Would hand-coding have been faster? Probably not. But the important thing is that the AI implementation stayed aligned with the spec because of the documentation grounding.
Key Takeaways
If you’re using AI coding tools with context systems like Antigravity:
- Write detailed specs before coding. Exact formulas, architecture decisions, implementation requirements. AI tools are only as good as the specs they work from.
- Document your architecture decisions. Database schemas, API patterns, library choices, deployment strategies. Without clear architectural guidance, the AI makes inconsistent choices.
- Be specific to prevent hallucination. Vague requirements lead to plausible-sounding but wrong implementations. The Norwegian subsidy formula couldn't be guessed. It had to be specified exactly.
- Learn your tool's context system. Antigravity uses artifacts for task lists, implementation plans, and verification. Know how they work so you can manage them.
- Ground artifacts in documentation. Let the AI create artifacts, but verify they reference your specs. Comment on them to correct drift.
- Build a verification loop. Always check generated code against your source docs. Artifacts drift. Spec docs don't.
- Reference docs explicitly in prompts. "Check docs/X.md before implementing" forces the AI to ground its work in your specs rather than guessing.
- You're the architect, the AI is the builder. These tools are powerful code generators, but they need supervision and clear specs to stay on track.
Final Thoughts
Building Spotsjekk taught me that AI coding tools are force multipliers when properly managed. They’re great at implementing complex features from clear specs, maintaining architectural consistency, and generating production-quality code quickly.
But they need you for understanding the problem domain, creating source-of-truth documentation, verifying output against specs, and making the judgment calls that come up during development.
The biggest lesson: the quality of AI-generated code ties directly to the quality of your upfront documentation and how well you maintain context. The pipeline runs from Claude deep research into markdown docs, from docs into Antigravity artifacts, and from artifacts into verified code. Skip a step and things start drifting.
Context management isn’t just a nice-to-have. Losing context means losing your reality anchor. Every time the conversation drifts from your specs, the AI starts inventing solutions that sound right but might not be. Keep it tethered to your documentation and you’ll be fine.
If you want to check it out, Spotsjekk is live at spotsjekk.no.