December 15, 2025 • 9 min read
Engineering
Why We Use Smoke Tests (And Don't Run Them in CI)
Smoke tests catch catastrophic failures, not edge cases. They verify production is alive and behaving safely—without polluting CI or leaking secrets. Here's how we use them at Best ROI.
This article reflects lessons learned while refactoring our production Next.js application to remove secrets from CI, implement lazy initialization of third-party services, and verify runtime safety without breaking builds.
By Best ROI Media
The build passed. That's what made it dangerous.
We'd just finished a major refactor. Stripe initialization moved to lazy loading. OpenAI client creation deferred until actual use. DeepSeek API calls wrapped in runtime checks. The TypeScript compiled. The linter was happy. The build succeeded.
But we had no idea if production would actually work.
This is the gap between build-time correctness and runtime safety. Your code compiles. Your tests pass. Your CI is green. But when a real request hits an API route, does it respond? Or does it crash on import? Does it leak secrets in error messages? Does it fail silently?
Smoke tests answer that question. Not with unit tests. Not with integration tests. With a simple script that asks: "Is the system alive and behaving safely?"
What Smoke Tests Really Are
Smoke tests are not unit tests.
Unit tests verify that functions return the right values. They test logic in isolation. They run fast. They're part of your development workflow.
Smoke tests verify that the system responds to real requests. They test behavior at runtime. They're intentionally simple. They're part of your deployment workflow.
Smoke tests are not integration tests.
Integration tests verify that components work together. They test workflows. They require setup. They validate business logic.
Smoke tests verify that endpoints don't crash. They test availability. They require no setup. They validate structural integrity.
Smoke tests answer one question:
"Is the system alive and behaving safely?"
That's it. Not "does it work correctly?" Not "are all edge cases handled?" Just: "does it respond without crashing, without leaking secrets, without exposing internal errors?"
Smoke tests exist to catch catastrophic failures, not edge cases.
A smoke test doesn't care if your billing calculation is off by a penny. It cares if your billing endpoint returns a 500 error with "Missing STRIPE_SECRET_KEY" in the response body.
The first is a bug. The second is a structural failure.
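To make the distinction concrete, here are two illustrative failure responses (invented for illustration, not captured from any real system). The first is what a smoke test exists to catch; the second is an acceptable failure:

```
# Structural failure: the secret's name leaks to the client
HTTP/1.1 500 Internal Server Error
{"error": "Missing STRIPE_SECRET_KEY"}

# Acceptable failure: the request is refused, safely
HTTP/1.1 401 Unauthorized
{"error": "Unauthorized"}
```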
The Problem They Solve
Most teams misunderstand what smoke tests are for. They think smoke tests are just "lightweight integration tests." They run them in CI. They expect them to catch bugs.
That's not what smoke tests do.
Smoke tests catch structural failures. The kind that happen when:
- Builds pass but production fails
- Secrets leak in error messages
- API routes crash on import
- CI requires production secrets
These aren't bugs. They're architectural problems.
Here's what happens: You refactor your code to lazy-load Stripe. The build succeeds. TypeScript is happy. But when a request hits /api/billing/bootstrap, the route handler tries to initialize Stripe, fails, and returns a 500 error with "Missing STRIPE_SECRET_KEY" in the JSON response.
That's not a bug in your code. That's a structural failure in how secrets are handled.
Or: You try to keep OpenAI client creation at module load. The build succeeds locally. But in CI, the imports fail because the environment variables aren't available there. Your CI breaks. You add the secrets to CI. Now your secrets are in CI.
That's not a bug in your code. That's a structural failure in how environments are separated.
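The lazy-initialization pattern behind both scenarios looks roughly like this. A minimal sketch, assuming a stand-in `createClient` in place of a real SDK constructor (e.g. `new Stripe(key)`); the error message and names are placeholders, not our actual billing code:

```typescript
// Sketch of lazy initialization. `Client` and `createClient` stand in
// for a real SDK (e.g. `new Stripe(key)`).
type Client = { key: string };

let client: Client | null = null;

function createClient(key: string): Client {
  return { key }; // placeholder for the real SDK constructor
}

function getClient(): Client {
  // Created on first use, never at module load, so importing
  // this module in CI requires no secrets.
  if (client) return client;
  const key = process.env.STRIPE_SECRET_KEY;
  if (!key) {
    // Fails at request time, with a message that names no secrets.
    throw new Error('Billing is not configured');
  }
  client = createClient(key);
  return client;
}
```

Importing this module succeeds even without `STRIPE_SECRET_KEY`; only a request that actually needs the client hits the error path.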
Smoke tests catch these failures before they become emergencies.
Why Smoke Tests Should Usually NOT Run in CI
This is important.
CI validates build-time correctness. It checks that your code compiles. It runs linters. It executes unit tests. It verifies that the code you wrote is syntactically and logically correct.
Production requires runtime validation. It checks that your code actually runs. It verifies that dependencies are available. It confirms that secrets are present. It ensures that the system behaves safely when real requests arrive.
These are different concerns. Mixing them creates fragile pipelines.
Here's what happens when you run smoke tests in CI:
Your smoke test hits /api/billing/bootstrap. The endpoint requires Stripe. Stripe requires STRIPE_SECRET_KEY. CI doesn't have production secrets. The test fails. CI breaks.
You have two options:
- Add production secrets to CI (bad idea)
- Make the smoke test skip endpoints that require secrets (defeats the purpose)
Neither option is good.
The first option leaks secrets into CI. The second option means your smoke test doesn't actually verify production safety.
The solution: Don't run smoke tests in CI. Run them manually against production after deployment. Or run them in a separate pipeline that has access to production secrets. Or run them locally against production URLs.
CI should validate build-time correctness. Smoke tests should validate runtime safety. Keep them separate.
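One way to keep them separate, sketched here as a hypothetical GitHub Actions workflow (the trigger and wiring are assumptions; adapt to your CI provider), is a manually triggered job that runs the smoke script against production:

```yaml
# Hypothetical post-deploy workflow: triggered manually, never on push.
name: smoke-prod
on:
  workflow_dispatch: {}
jobs:
  smoke:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: BASE_URL=https://bestroi.media node scripts/smoke-api.mjs
```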
The Smoke Test Philosophy at Best ROI
When we built our smoke test system, we established clear principles:
No destructive actions. Smoke tests never create charges. They never write data. They never trigger side effects that can't be undone.
No auth bypass. Smoke tests hit endpoints as unauthenticated requests. They verify that endpoints fail gracefully when auth is missing, not that they crash or leak secrets.
No secrets required for CI. The smoke test script can run in CI mode, where it allows "missing key" errors. This lets CI validate that the script itself works, without requiring production secrets.
Explicit runtime verification. Smoke tests run manually, intentionally, after deployment. They're not automatic. They're not part of every build. They're a lever you pull when you need confidence.
Smoke tests are manual by default. This is intentional.
When you deploy, you run the smoke test. You verify that production is alive. You confirm that endpoints respond safely. You check that secrets aren't leaking.
Then you move on.
Smoke tests aren't a monitoring system. They're a verification tool. You use them when you need to know that a deployment worked, not to continuously monitor uptime.
The Health Check Endpoint
Every system needs a health check endpoint. It's the simplest possible smoke test.
Here's ours:
```typescript
// app/api/health/route.ts
import { NextResponse } from 'next/server';

/**
 * Simple health check endpoint.
 * Returns 200 with { ok: true, timestamp }.
 * No secrets required.
 */
export async function GET() {
  return NextResponse.json({
    ok: true,
    timestamp: new Date().toISOString(),
  });
}
```
That's it. No dependencies. No secrets. No logic. Just a simple endpoint that returns 200 with a timestamp.
Why does this exist?
It proves that the server is running. It proves that routing works. It proves that JSON serialization works. It proves that the basic request/response cycle functions.
If the health check fails, nothing else matters. The system is down.
If the health check passes, you can move on to testing endpoints that actually do something.
The health check is the foundation. Everything else builds on it.
The Smoke Test Script
Our smoke test is a simple Node.js script. Not a test framework. Not a testing library. Just a script that makes HTTP requests and checks responses.
Here's the core of it:
```javascript
// scripts/smoke-api.mjs
const BASE_URL = process.env.BASE_URL || 'http://localhost:3000';
const CI_MODE = process.env.SMOKE_CI_MODE === '1';

async function check(path) {
  const res = await fetch(`${BASE_URL}${path}`);
  const text = await res.text();

  if (!CI_MODE) {
    // A 5xx means the route crashed instead of failing gracefully.
    if (res.status >= 500) {
      throw new Error(`Server error for ${path} (${res.status})`);
    }
    // Secret names must never appear in response bodies.
    if (text.includes('Missing') && text.includes('KEY')) {
      throw new Error(`Secret leaked in response for ${path}`);
    }
  }

  console.log(`✓ ${path} (${res.status})`);
}

await check('/api/health');
await check('/api/billing/bootstrap');
await check('/api/ai/assistant');
```
Why is it intentionally simple?
Because complexity hides failures. If your smoke test has complex logic, you might miss simple problems. If your smoke test requires setup, you might not run it when you should.
A simple script that makes requests and checks responses is easy to understand, easy to run, and easy to trust.
Why does it check behavior, not payloads?
Because smoke tests verify structural integrity, not business logic. We don't care if the billing endpoint returns the correct price. We care if it returns a 401 (unauthorized) instead of a 500 (internal error) when auth is missing.
We care if it leaks "Missing STRIPE_SECRET_KEY" in the response body. We don't care if the response format is exactly what we expect.
Why does it allow missing-key errors in CI mode?
Because CI shouldn't have production secrets. When running in CI, we want to verify that the script works, not that production is configured. The SMOKE_CI_MODE=1 flag tells the script to skip secret-leak detection, allowing CI to validate the script itself without requiring secrets.
Commands We Use
We have three ways to run smoke tests:
```shell
# Local development
npm run smoke:local
```

This runs against http://localhost:3000. Use it when developing locally to verify that your changes don't break basic functionality.

```shell
# Production verification
npm run smoke:prod
```

This runs against https://bestroi.media. Use it after deployment to verify that production is alive and behaving safely.

```shell
# CI-safe mode (no secrets)
SMOKE_CI_MODE=1 npm run smoke:local
```

This runs in CI mode, allowing missing-key errors. Use it in CI to validate that the smoke test script itself works, without requiring production secrets.
Each command has a purpose. Local development. Production verification. CI validation. Keep them separate.
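For reference, these npm scripts can be wired up like this. The script names match the ones above; the exact wiring is a plausible assumption, not our literal package.json:

```json
{
  "scripts": {
    "smoke:local": "node scripts/smoke-api.mjs",
    "smoke:prod": "BASE_URL=https://bestroi.media node scripts/smoke-api.mjs"
  }
}
```

The local script needs no BASE_URL because the smoke script defaults to http://localhost:3000.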
How Smoke Tests Saved Us at Best ROI
Here's the story.
We were refactoring our API routes to remove secrets from CI. The goal: Make CI runnable without production secrets, while ensuring production still worked correctly.
The approach: Lazy initialization. Instead of initializing Stripe, OpenAI, and DeepSeek clients at module load time, we'd initialize them on first use. This meant CI could import the modules without errors, even without secrets.
The build passed. TypeScript compiled. The linter was happy. CI was green.
But we had no idea if production would actually work.
So we wrote a smoke test. A simple script that hit our API routes and checked responses.
The first run revealed the problem: One endpoint was returning 500 errors with "Missing OPENAI_API_KEY" in the response body. Not in the error logs. In the JSON response sent to clients.
That's a secret leak. A structural failure. Not a bug, but an architectural problem.
We fixed it. We wrapped the OpenAI initialization in a try-catch. We returned generic error messages to clients. We logged detailed errors server-side.
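The shape of that fix, sketched in isolation (`initOpenAI` and the route wrapper are stand-ins, not our actual handler): detailed errors go to server logs, clients get a generic message and status.

```typescript
// Sketch: fail safely when a client can't be initialized.
// `initOpenAI` stands in for real SDK setup.
function initOpenAI(): { ready: true } {
  const key = process.env.OPENAI_API_KEY;
  if (!key) throw new Error('Missing OPENAI_API_KEY'); // detailed, internal only
  return { ready: true };
}

function handleAssistantRequest(): { status: number; body: { error?: string; ok?: boolean } } {
  try {
    initOpenAI();
    return { status: 200, body: { ok: true } };
  } catch (err) {
    console.error('AI init failed:', err); // full detail stays in server logs
    return { status: 503, body: { error: 'Service unavailable' } }; // generic to clients
  }
}
```

The smoke test's leak check would now pass: the response body carries no secret names, while the real cause is still visible to operators.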
The smoke test passed.
Then we deployed. We ran the smoke test against production. It passed. Production was alive. Endpoints were responding safely. Secrets weren't leaking.
We had confidence.
Not because the build passed. Not because CI was green. Because we verified runtime safety explicitly.
What Smoke Tests Are Not
It's important to be clear about what smoke tests are not.
They are not a replacement for tests. Unit tests verify logic. Integration tests verify workflows. Smoke tests verify structural integrity. You need all three.
They are not an uptime monitor. Smoke tests verify that a deployment worked. They don't continuously monitor production. Use a proper monitoring service for that.
They are not a security audit. Smoke tests check for obvious secret leaks. They don't verify authentication. They don't check authorization. They don't validate input sanitization.
They are not something you run constantly. Smoke tests are manual. You run them when you need confidence. After a deployment. After a refactor. When something feels off.
Smoke tests are a lever, not a crutch.
You use them when you need to verify that production is safe. You don't rely on them to catch every problem. You don't automate them into every build. You use them intentionally, when the situation calls for it.
Closing: Why This Matters Long-Term
Reliability is a competitive advantage.
When your system behaves predictably, you ship faster. You refactor with confidence. You sleep better at night.
When your system is fragile, you move slowly. You're afraid to change things. You're always one deployment away from disaster.
Smoke tests are a small investment in reliability. They take minutes to write. They take seconds to run. But they give you confidence that production is safe.
Calm deployments come from knowing that your system works. Not hoping. Not assuming. Knowing.
Confidence to refactor comes from having a way to verify that changes didn't break production. Not unit tests that pass. Not CI that's green. Actual runtime verification.
Fewer late-night emergencies come from catching structural failures before they become user-facing problems. Not from perfect code. Not from comprehensive tests. From simple checks that verify the system is alive.
This is how we build systems that don't fight us.
Not with perfect architecture. Not with comprehensive test coverage. Not with complex monitoring.
With simple tools that answer simple questions. With verification that happens when it matters. With confidence that comes from knowing, not hoping.
Smoke tests are boring. They're simple. They're manual.
And they save production.
Why We Write About This
We build software for people who rely on it to do real work. Sharing how we think about stability, judgment, and systems is part of building that trust.