lrec2026-llm-as-annotator-tutorial

Participant cheatsheet

LLM-as-annotator in one sentence

Use the LLM to produce candidate annotations, then constrain, validate, evaluate, and selectively review them. The goal is not to remove expert judgement, but to make scarce expertise more effective.

Minimal workflow

  1. Define the task and tagset.
  2. Prepare a small, fixed input batch.
  3. Build a zero-shot prompt.
  4. Add validated few-shot examples.
  5. Request structured JSON.
  6. Validate format and token alignment.
  7. Evaluate against a small gold standard.
  8. Analyse errors.
  9. Select the next examples for expert review.
  10. Update guidelines, prompts, and examples.
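
Steps 3 to 5 can be sketched as a single prompt builder. This is a minimal illustration, not the tutorial's own code: the tagset, the few-shot example, the instruction wording, and the function name `build_prompt` are all assumptions chosen for the example.

```python
import json

# Hypothetical closed tagset and few-shot example; replace with validated data.
TAGSET = ["NOUN", "VERB", "DET"]
FEW_SHOT = [
    {"tokens": ["the", "dog"], "tags": ["DET", "NOUN"]},
]

def build_prompt(tokens, tagset=TAGSET, examples=()):
    """Build a zero-shot prompt; pass examples to make it few-shot."""
    lines = [
        "Assign exactly one tag to each token.",
        "Allowed tags: " + ", ".join(tagset),
        "Do not add, remove, translate, or reorder tokens.",
        'Answer with JSON only: {"tokens": [...], "tags": [...]}',
    ]
    # Each validated example is shown as an input/output pair.
    for ex in examples:
        lines.append("Example input: " + json.dumps(ex["tokens"], ensure_ascii=False))
        lines.append("Example output: " + json.dumps(ex, ensure_ascii=False))
    lines.append("Input: " + json.dumps(tokens, ensure_ascii=False))
    return "\n".join(lines)

zero_shot = build_prompt(["dogs", "bark"])
few_shot = build_prompt(["dogs", "bark"], examples=FEW_SHOT)
```

Keeping the zero-shot and few-shot prompts in one function makes it easy to compare the two conditions on the same fixed input batch.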

Prompt checklist

A good prompt should specify:

Non-negotiable instructions

The model must not:

Validation checklist

Before evaluating linguistic quality, check:

Invalid output is a result to log, not something to hide.
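
A minimal sketch of such checks, assuming the model was asked for a JSON object with `tokens` and `tags` lists. The field names, the tagset, and the function name `validate` are assumptions for illustration. Every output, valid or not, ends up in the log.

```python
import json

TAGSET = {"NOUN", "VERB", "DET"}  # hypothetical closed inventory

def validate(raw_output, input_tokens, tagset=TAGSET):
    """Return a list of problems; an empty list means every check passed."""
    try:
        data = json.loads(raw_output)
    except json.JSONDecodeError:
        return ["format: malformed JSON"]
    if not isinstance(data, dict):
        return ["format: top-level value is not an object"]
    problems = []
    for field in ("tokens", "tags"):
        if not isinstance(data.get(field), list):
            problems.append(f"format: missing or non-list field '{field}'")
    if problems:
        return problems
    # Alignment: tokens must come back unchanged, with one tag per token.
    if data["tokens"] != list(input_tokens):
        problems.append("alignment: tokens differ from input")
    if len(data["tags"]) != len(data["tokens"]):
        problems.append("alignment: tag count differs from token count")
    # Tagset: only labels from the closed inventory are allowed.
    bad = [t for t in data["tags"] if not isinstance(t, str) or t not in tagset]
    if bad:
        problems.append(f"tagset: unauthorised labels {bad}")
    return problems

outputs = [
    '{"tokens": ["dogs", "bark"], "tags": ["NOUN", "VERB"]}',
    '{"tokens": ["dogs"], "tags": ["NOUN", "ADJ"]}',
    "not json",
]
# Log every output together with its problems instead of dropping failures.
log = [{"raw": raw, "problems": validate(raw, ["dogs", "bark"])} for raw in outputs]
```

The problem prefixes (`format`, `alignment`, `tagset`) follow the first three rows of the error typology, so invalid outputs can be counted by type later.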

Evaluation checklist

Report separately:

Do not rely only on a single global accuracy score.
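
As one illustration of why a single global score can mislead, the sketch below reports accuracy per gold label next to the global figure. The function name `accuracy_report` and the toy data are assumptions; a real evaluation would use the gold standard from step 7.

```python
from collections import defaultdict

def accuracy_report(gold, pred):
    """Return global accuracy and accuracy per gold label for aligned tags."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must be aligned")
    correct = defaultdict(int)
    total = defaultdict(int)
    for g, p in zip(gold, pred):
        total[g] += 1
        correct[g] += int(g == p)
    overall = sum(correct.values()) / len(gold) if gold else 0.0
    per_label = {label: correct[label] / total[label] for label in total}
    return overall, per_label

# Toy data: the model labels everything NOUN.
gold = ["NOUN"] * 8 + ["VERB"] * 2
pred = ["NOUN"] * 10
overall, per_label = accuracy_report(gold, pred)
# overall is 0.8, while per_label shows that VERB is never recognised.
```

A frequent label can carry the global score on its own, which is why the per-label breakdown is worth reporting alongside it.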

Error typology

Type        Meaning                                      Typical action
----------  -------------------------------------------  ----------------------------
Format      malformed JSON, missing fields               schema / retry
Alignment   token added, removed, translated, reordered  stricter prompt / validation
Tagset      unauthorised label                           closed inventory
Lemma       wrong or inconsistent lemma                  examples / guideline
Morphology  wrong feature-value pair                     feature-specific analysis
Script      transliteration or character confusion       preprocessing / prompt
Domain      genre-specific or rare form                  in-domain examples
Noise       OCR/HTR or damaged text                      flag / expert review
Guideline   instructions underspecified                  revise documentation
Ambiguous   multiple analyses plausible                  expert adjudication

Sampling for expert review

Prioritise examples that combine:

Keep a random slice to avoid blind spots.
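
One way to combine prioritised selection with a random slice is sketched below. It assumes each candidate already carries a numeric `priority` score, however that score is computed. The field name, the 20% random share, and the function name `select_for_review` are illustrative choices, not recommendations from the tutorial.

```python
import random

def select_for_review(items, budget, random_share=0.2, seed=0):
    """Pick the highest-priority items, reserving part of the budget
    for a random slice drawn from everything that was not picked."""
    budget = min(budget, len(items))
    n_random = int(budget * random_share)
    n_top = budget - n_random
    ranked = sorted(items, key=lambda it: it["priority"], reverse=True)
    top = ranked[:n_top]
    rest = ranked[n_top:]
    rng = random.Random(seed)  # fixed seed so the selection is repeatable
    random_slice = rng.sample(rest, min(n_random, len(rest)))
    return top, random_slice

# Hypothetical candidates with a precomputed priority score.
items = [{"id": i, "priority": i / 10} for i in range(10)]
top, random_slice = select_for_review(items, budget=5)
```

Drawing the random slice only from the items that were not prioritised means the two groups never overlap, and the fixed seed lets the same selection be reproduced later.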

Governance checklist

Before using an external API or releasing outputs, ask:

Minimal reproducibility record

For each run, save:
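
A minimal sketch of how such a record could be written as one JSON line per run. The fields shown here (timestamp, model name, parameters, prompt hash, input identifiers) are illustrative assumptions and not the tutorial's own list; adapt them to the project's requirements.

```python
import hashlib
import json
from datetime import datetime, timezone

def run_record(model_name, prompt, input_ids, parameters):
    """Assemble a JSON-serialisable record of one run (illustrative fields)."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "model": model_name,
        "parameters": parameters,
        # Hash of the exact prompt text, so any prompt change is detectable.
        "prompt_sha256": hashlib.sha256(prompt.encode("utf-8")).hexdigest(),
        "input_ids": list(input_ids),
    }

record = run_record(
    "example-model",  # hypothetical model name
    "Assign exactly one tag to each token.",
    ["s1", "s2"],
    {"temperature": 0},
)
line = json.dumps(record, ensure_ascii=False)  # e.g. append to a JSONL file
```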