09 - i18n & Testing

Two operational topics that touch every component. Codified here so they don’t drift.

Internationalization

Stack: react-i18next. Source-of-truth language: English. Translated into locales as product needs them (first tier: Vietnamese).

When to translate

Every user-facing string goes through t(). Not just “main content” strings — errors, tooltips, placeholders, aria-labels, button labels, everything.

Exceptions (stay hardcoded):

Product names (“Nexus Agentic”, “Nexus Admin”).
Proper nouns (agent names, tool names — these are user data, not UI strings).
Debug logs (not user-facing).
Developer-only surfaces (internal dashboards, admin-only tools).

Where translations live

Admin app:

apps/admin/public/locales/
  en/common.json
  vi/common.json

Keys are structured by feature area, not by screen:

{
  "agent": {
    "create": "Create agent",
    "delete": "Delete agent",
    "deleteConfirm": "Delete this agent?",
    "deleteDescription": "This permanently deletes the agent and all its data. This cannot be undone."
  },
  "common": {
    "save": "Save",
    "cancel": "Cancel",
    "saving": "Saving…",
    "saved": "Saved"
  }
}
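
As a rough illustration of how a dotted key such as agent.create maps onto the nested JSON above, here is a minimal sketch. It approximates the lookup i18next performs with its default "." key separator; it is not i18next's implementation, and the names Messages, enMessages, and resolveKey are hypothetical.

```typescript
// Hypothetical stand-in for a locale bundle such as en/common.json.
type Messages = { [key: string]: string | Messages };

const enMessages: Messages = {
  agent: { create: 'Create agent', delete: 'Delete agent' },
  common: { save: 'Save', cancel: 'Cancel' },
};

// Walks the bundle one path segment at a time.
// Returns undefined when the path is missing or stops on a branch instead of a string.
function resolveKey(bundle: Messages, key: string): string | undefined {
  let node: string | Messages | undefined = bundle;
  for (const part of key.split('.')) {
    if (node === undefined || typeof node === 'string') return undefined;
    node = node[part];
  }
  return typeof node === 'string' ? node : undefined;
}

const label = resolveKey(enMessages, 'agent.create'); // "Create agent"
const branch = resolveKey(enMessages, 'agent');       // undefined: a feature area, not a string
```

Grouping by feature area means the first segment names the domain (agent, common), so a string can be reused on any screen that shows that feature.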

Using t()

import { useTranslation } from 'react-i18next';

function AgentActions({ id }) {
  const { t } = useTranslation();
  return (
    <Button onClick={...}>{t('agent.create')}</Button>
  );
}

What NOT to do

Don’t concatenate strings.

// BAD — breaks in many languages
<p>{t('agent.found')} {count} {t('agent.results')}</p>

// GOOD — ICU plural
<p>{t('agent.foundResults', { count })}</p>

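With the JSON (ICU message syntax uses single braces and needs the i18next-icu plugin; i18next's own {{count}} interpolation does not parse plural blocks):

{
  "agent": {
    "foundResults": "{count, plural, one {Found # agent} other {Found # agents}}"
  }
}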

Don’t translate pieces and compose them.

// BAD — word order varies by language
const word = status === 'active' ? t('status.active') : t('status.inactive');
return (
  <p>
    {t('agent.is')} {word}
  </p>
);

// GOOD — full translatable sentences
return <p>{t(`agent.is.${status}`)}</p>;

Don’t hand-roll singular/plural in code.

// BAD
<p>{count} {count === 1 ? 'item' : 'items'}</p>

// GOOD — ICU plural in the message
<p>{t('list.itemCount', { count })}</p>
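
To see why a count === 1 ternary cannot work across locales, the sketch below asks the standard Intl.PluralRules API which plural category each count falls into. The helper name pluralCategories is hypothetical, and Russian is included only as an illustration of a language with more than two categories; it is not one of our shipped locales.

```typescript
// Returns the CLDR plural category ("one", "few", "many", "other", ...) for each count.
function pluralCategories(locale: string, counts: number[]): string[] {
  const rules = new Intl.PluralRules(locale);
  return counts.map((n) => rules.select(n));
}

const sampleCounts = [0, 1, 2, 5];

const english = pluralCategories('en', sampleCounts);
// ["other", "one", "other", "other"]

const vietnamese = pluralCategories('vi', sampleCounts);
// ["other", "other", "other", "other"]: Vietnamese has no separate singular form

const russian = pluralCategories('ru', sampleCounts);
// ["many", "one", "few", "many"]
```

A plural-aware message lets each locale file supply exactly the categories its language needs, which a ternary in component code cannot do.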

Dates and numbers

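Use the Intl APIs or a locale-aware formatter. Never build date or number strings by hand:

const { i18n } = useTranslation(); // inside a component; i18n.language is the active locale
const n = new Intl.NumberFormat(i18n.language).format(1234); // e.g. "1,234" in en; separators vary by locale
const d = new Intl.DateTimeFormat(i18n.language, { dateStyle: 'medium' }).format(date);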
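
A small self-contained sketch of the same idea outside a component. The helper names formatCount and formatDay are hypothetical, German is used purely as a contrast locale, and the time zone is pinned to UTC here only so the output does not depend on the machine running it.

```typescript
// Formats a number with the grouping and decimal separators of the given locale.
function formatCount(locale: string, value: number): string {
  return new Intl.NumberFormat(locale).format(value);
}

// Formats a calendar date in the locale's medium style, pinned to UTC.
function formatDay(locale: string, date: Date): string {
  return new Intl.DateTimeFormat(locale, { dateStyle: 'medium', timeZone: 'UTC' }).format(date);
}

const sampleDate = new Date(Date.UTC(2024, 0, 15));

const countEn = formatCount('en-US', 1234567); // "1,234,567"
const countDe = formatCount('de-DE', 1234567); // "1.234.567"
const countVi = formatCount('vi-VN', 1234567); // grouping follows Vietnamese conventions
const dayEn = formatDay('en-US', sampleDate);  // "Jan 15, 2024"
```

In components, pass i18n.language as the locale so formatting follows the language switcher instead of the browser default.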

Keys vs strings

Use descriptive keys, not literal-English-text keys:

// BAD — translators see useless key
t('Create agent');

// GOOD
t('agent.create');

When the meaning of a message changes, rename the key so existing translations don’t silently carry the old meaning. When only the English wording changes, update the English translation file and keep the key.

RTL readiness

We don’t ship RTL yet. To stay ready:

Use start / end logical properties when possible (ms-2 / me-2 in Tailwind) — Tailwind v4 supports this.
Avoid left-* / right-* for anything language-dependent (use start-* / end-*).
Test with dir="rtl" on <html> occasionally to catch bugs early.

Missing keys

By default, i18next renders the key itself when a translation is missing. In development, log a warning:

i18n.init({
  saveMissing: true, // required: missingKeyHandler only fires when saveMissing is on
  missingKeyHandler: (lngs, ns, key) => console.warn(`Missing i18n key: ${key}`),
});

Don’t ship code with missing keys.
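
One way to catch missing keys before they ship is a parity check between locale bundles. The following is a sketch under assumptions, not an existing script in the repo: the names Bundle, flattenKeys, and missingKeys are hypothetical, and the inline bundles (including the Vietnamese strings) are placeholders standing in for the real en/common.json and vi/common.json.

```typescript
type Bundle = { [key: string]: string | Bundle };

// Flattens nested bundles into dotted keys, e.g. { agent: { create } } -> ["agent.create"].
function flattenKeys(bundle: Bundle, prefix = ''): string[] {
  return Object.entries(bundle).flatMap(([key, value]) => {
    const path = prefix ? `${prefix}.${key}` : key;
    return typeof value === 'string' ? [path] : flattenKeys(value, path);
  });
}

// Lists keys that exist in the source bundle but not in the target bundle.
function missingKeys(source: Bundle, target: Bundle): string[] {
  const present = new Set(flattenKeys(target));
  return flattenKeys(source).filter((key) => !present.has(key));
}

// Placeholder data for illustration only.
const enBundle: Bundle = {
  agent: { create: 'Create agent', delete: 'Delete agent' },
  common: { save: 'Save' },
};
const viBundle: Bundle = {
  agent: { create: 'Tạo agent' },
  common: { save: 'Lưu' },
};

const untranslated = missingKeys(enBundle, viBundle); // ["agent.delete"]
```

Run in both directions, the same check also surfaces stale keys that remain in a translated file after the English key was renamed or removed.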

Chat app (Next.js)

Same stack. Keys live under apps/chat/public/locales/. Additional consideration: the chat app’s voice is warmer (see 07-voice.md) — don’t share error-message strings between admin and chat apps without re-reviewing the voice.

Testing

Stack: Vitest + @testing-library/react + jsdom. Mutation stubs via vi.hoisted. User interactions via @testing-library/user-event.

Philosophy

Test the user-visible behavior, not the implementation. If you’re asserting on class names, state hooks, or internal DOM structure — you’re testing the wrong thing.

Testid convention

data-testid is the escape hatch when role-based queries don’t work. Prefer, in order:

getByRole (buttons, headings, inputs, etc.) — closest to what users and screen readers use.
getByLabelText for form fields.
getByText for visible copy.
getByTestId — last resort.

Our components already expose data-testid where useful (e.g., action-button-error, unsaved-badge, saved-indicator).

Test file layout

Tests live next to the component, either as a sibling file (packages/ui) or in a nearby __tests__ directory (apps/admin):

packages/ui/src/components/ActionButton.tsx
packages/ui/src/components/ActionButton.test.tsx

apps/admin/src/__tests__/WorkflowToolbar.test.tsx
apps/admin/src/pages/agents/detail/workflow/WorkflowToolbar.tsx

One test file per component. Group tests by feature area within the file (basics, icon slot, label swap, error handling).

Wizard test pattern

A wizard built per 06-flows.md § Multi-step wizard ships with a fixed test layout: one page-level file plus one file per step. For the four-step agent-create wizard that is five test files:

apps/admin/src/pages/agents/create/AgentCreatePage.tsx
apps/admin/src/pages/agents/create/__tests__/AgentCreatePage.test.tsx        ~150 LOC
apps/admin/src/pages/agents/create/__tests__/Step1Template.test.tsx          ~80 LOC
apps/admin/src/pages/agents/create/__tests__/Step2Configure.test.tsx         ~80 LOC
apps/admin/src/pages/agents/create/__tests__/Step3Tools.test.tsx             ~60 LOC
apps/admin/src/pages/agents/create/__tests__/Step4Review.test.tsx            ~50 LOC

Coverage assignments:

Page-level (AgentCreatePage.test.tsx): the reducer (state transitions per action), the submit handler (success / 409 conflict / 403 forbidden / network error / partial-success after multi-API submit), the close-confirm flows (Esc / ✕ / browser-back when dirty vs not dirty).
Per-step (Step{N}.test.tsx): rendering each step’s loading / empty / error states, validation rules, keyboard nav within the step (↑/↓ on a card grid, Enter advances).
Happy-path E2E: one test that walks the full flow end to end (template → fill → tools → create → land on detail). If Playwright isn’t configured in the app, defer E2E to a follow-up TODO and ship the unit tests; do NOT fake it with a “page-level integration test that’s actually unit tests in disguise.”

The 5-file split keeps each file at roughly 150 LOC or less and lets a failure point at the exact step that regressed. A monolithic AgentCreatePage.test.tsx covering all 4 steps grows to around 600 LOC and becomes hard to navigate.
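
To make "state transitions per action" concrete, here is a minimal sketch of the kind of reducer the page-level file would cover. Everything in it is hypothetical: the state shape, the action names, and the step bounds are invented for illustration and do not describe the real AgentCreatePage reducer.

```typescript
// Hypothetical wizard state and actions, for illustration only.
type WizardState = { step: number; dirty: boolean; name: string };
type WizardAction =
  | { type: 'next' }
  | { type: 'back' }
  | { type: 'setName'; name: string };

const LAST_STEP = 4;
const initialWizardState: WizardState = { step: 1, dirty: false, name: '' };

function wizardReducer(state: WizardState, action: WizardAction): WizardState {
  switch (action.type) {
    case 'next':
      // Clamp so "next" on the last step is a no-op.
      return { ...state, step: Math.min(state.step + 1, LAST_STEP) };
    case 'back':
      // Clamp so "back" on the first step is a no-op.
      return { ...state, step: Math.max(state.step - 1, 1) };
    case 'setName':
      // Any edit marks the wizard dirty, which drives the close-confirm flow.
      return { ...state, name: action.name, dirty: true };
  }
}

const afterEdit = wizardReducer(initialWizardState, { type: 'setName', name: 'Support bot' });
const afterNext = wizardReducer(afterEdit, { type: 'next' });
```

Because the reducer is a pure function, each transition can be asserted in one line without rendering anything, which leaves the rendered tests free to cover submit and close-confirm behavior.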

Example: testing an ActionButton

import { describe, it, expect, vi, afterEach } from 'vitest';
import { render, screen, cleanup } from '@testing-library/react';
import userEvent from '@testing-library/user-event';
import { ActionButton } from './ActionButton';

const idle = { status: 'idle' as const, error: null };
const pending = { status: 'pending' as const, error: null };
const makeError = (msg: string) => ({ status: 'error' as const, error: new Error(msg) });

describe('ActionButton', () => {
  afterEach(cleanup);

  it('disables button while pending', () => {
    render(<ActionButton mutation={pending}>Save</ActionButton>);
    expect(screen.getByRole('button')).toBeDisabled();
  });

  it('calls onClick when clicked', async () => {
    const user = userEvent.setup();
    const onClick = vi.fn();
    render(
      <ActionButton
        mutation={idle}
        onClick={onClick}
      >
        Save
      </ActionButton>
    );
    await user.click(screen.getByRole('button'));
    expect(onClick).toHaveBeenCalledOnce();
  });

  it('renders error below in "below" placement', () => {
    render(
      <ActionButton
        mutation={makeError('Name must be unique')}
        errorPlacement="below"
      >
        Save
      </ActionButton>
    );
    expect(screen.getByRole('alert')).toHaveTextContent('Name must be unique');
  });
});

Mocking mutations — the hoist pattern

For TanStack Query mutations, we use hoisted state so vi.mock factories can see it:

const { updateStub, setUpdateState } = vi.hoisted(() => {
  let state: Record<string, unknown> = {
    status: 'idle',
    error: null,
    isPending: false,
    isError: false,
    isSuccess: false,
    mutate: vi.fn(),
    mutateAsync: vi.fn(),
  };
  return {
    updateStub: () => state,
    setUpdateState: (s: Record<string, unknown>) => {
      state = { ...state, ...s };
    },
  };
});

vi.mock('@/api/agents', () => ({
  useUpdateAgentConfig: () => updateStub(),
}));

// In tests:
setUpdateState({ status: 'pending', isPending: true });

Why: vi.mock is hoisted above imports, so factory functions cannot reference outer-scope variables — except those created with vi.hoisted.
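
One thing to watch with this pattern: the holder is module-level, so state set in one test is still there when the next test in the file runs unless it is reset. The sketch below shows a holder with an explicit reset. It is written without Vitest so it stands alone; in a real test file the factory body would sit inside vi.hoisted and reset would be called from beforeEach. The names MutationState and createMutationStub are hypothetical.

```typescript
type MutationState = {
  status: 'idle' | 'pending' | 'success' | 'error';
  error: Error | null;
  isPending: boolean;
};

// Holds mutable stub state behind read / set / reset functions.
function createMutationStub() {
  const defaults: MutationState = { status: 'idle', error: null, isPending: false };
  let state: MutationState = { ...defaults };
  return {
    read: () => state,
    set: (patch: Partial<MutationState>) => {
      state = { ...state, ...patch };
    },
    // Restores the defaults so one test cannot leak state into the next.
    reset: () => {
      state = { ...defaults };
    },
  };
}

const stub = createMutationStub();
stub.set({ status: 'pending', isPending: true });
const statusDuringTest = stub.read().status; // "pending"
stub.reset();
const statusAfterReset = stub.read().status; // "idle"
```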

i18n in tests — mock the translator

Stub useTranslation so that t() returns the key unchanged. The key then appears as the rendered text, and tests match on it.

vi.mock('react-i18next', () => ({
  // t() echoes the key, so rendered text is the key itself
  useTranslation: () => ({ t: (key: string) => key }),
}));

// Then assert (escape the dot so it matches literally):
expect(screen.getByRole('button', { name: /agent\.create/ })).toBeInTheDocument();

Alternative: match visible English from the real translation bundle. Either approach is fine; pick one per test file.

Common matchers

Matcher	Use
`toBeInTheDocument()`	Element is present
`toBeDisabled()`	Interactive element is disabled
`toHaveTextContent(str/regex)`	Element contains the text
`toHaveAttribute(name, value)`	Attribute check
`toBeChecked()`	Checkbox/switch/radio is on
`toHaveFocus()`	Element is focused

What NOT to test

Don’t test Radix internals (we didn’t write them).
Don’t test Tailwind class names (they’re implementation details).
Don’t test that react-hook-form works (it does).
Don’t test that a style is applied — test that the element has the right role, label, or state.

Minimum test coverage per new component

For any new @na/ui component:

Renders with default props.
Renders with each variant / size prop.
Responds to primary interaction (click, change, focus).
Handles error / loading / empty states if applicable.
Accessibility: role, aria attributes where relevant.

For any new admin page:

Renders with mock data.
Fires the expected mutation on primary action.
Shows error state when mutation fails.
Shows a skeleton while loading.

Polyfills

jsdom is missing a few APIs that our components use. Our test-setup.ts:

import '@testing-library/jest-dom/vitest';

class ResizeObserverStub {
  observe() {}
  unobserve() {}
  disconnect() {}
}
if (typeof globalThis.ResizeObserver === 'undefined') {
  (globalThis as any).ResizeObserver = ResizeObserverStub;
}

// DOMMatrixReadOnly, scrollIntoView stubs as needed

Playwright / E2E

We don’t have E2E coverage yet — it’s a prioritized backlog item (see TODOS.md). When we add it, tests live under apps/admin/e2e/ and cover:

Auth sign-in and sign-out.
CRUD on the main entity (Agent).
Bulk action on a list.
Keyboard shortcut coverage (Cmd+K, /, ?, Esc).

Running tests

pnpm test                       # all workspaces
pnpm --filter @na/ui test       # one workspace
pnpm --filter @na/ui test ActionButton  # one file
pnpm --filter @na/app-admin test --watch

CI expectations

Per commit:

pnpm test must pass.
pnpm typecheck must pass for admin (chat has known pre-existing failures, tracked separately).
pnpm lint must pass.
pnpm format should be clean (hooked into lint-staged).

Pre-existing test failures (15 in admin at time of writing, in the NodeConfigPanel / WorkflowProgressPanel / nodes tests that involve i18n integration) are tracked separately. New changes must not add to that count.

Testing checklist

Before merging:

Every new interactive component has a test file.
Every state transition is tested (idle, pending, success, error).
User events tested via userEvent, not synthetic fireEvent.
Tests assert via roles/labels/text, not class names or internal state.
No regressions in existing test count.
No new i18n keys committed without English translation present.

Next: none — you’ve finished the spec. Now go build something people want.