# Instructions for AI Agents
This file gives specific instructions for AI agents that perform
housekeeping tasks for Git l10n. Use of AI is optional; many successful
l10n teams work well without it.
The section "Housekeeping tasks for localization workflows" documents the
most commonly used housekeeping tasks:
1. Generating or updating po/git.pot
2. Updating po/XX.po
3. Translating po/XX.po
4. Reviewing translation quality
## Background knowledge for localization workflows
Essential background for the workflows below; understand these concepts before
performing any housekeeping tasks in this document.
### Language code and notation (XX, ll, ll\_CC)
**XX** is a placeholder for the language code: either `ll` (ISO 639) or
`ll_CC` (e.g. `de`, `zh_CN`). It appears in the PO file header metadata
(e.g. `"Language: zh_CN\n"`) and is typically used to name the PO file:
`po/XX.po`.
### Header Entry
The **header entry** is the first entry in every `po/XX.po`. It has an empty
`msgid`; translation metadata (project, language, plural rules, encoding, etc.)
is stored in `msgstr`, as in this example:
```po
msgid ""
msgstr ""
"Project-Id-Version: Git\n"
"Language: zh_CN\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Plural-Forms: nplurals=2; plural=(n != 1);\n"
```
**CRITICAL**: Do not edit the header's `msgstr` while translating. It holds
metadata only and must be left unchanged.
### Glossary Section
PO files may have a glossary in comments before the header entry (first
`msgid ""`) that gives terminology guidelines, for example:
```po
# Git glossary for Chinese translators
#
# English | Chinese
# ---------------------------------+--------------------------------------
# 3-way merge | 三路合并
# ...
```
**IMPORTANT**: Read and use the glossary when translating or reviewing. It is
in `#` comments only. Leave that comment block unchanged.
### PO entry structure (single-line and multi-line)
PO entries are `msgid` / `msgstr` pairs. Plural messages add `msgid_plural` and
`msgstr[n]`. The `msgid` is the immutable source; `msgstr` is the target
translation. Each side may be a single quoted string or a multi-line block.
In the multi-line form the opening line is often `msgid ""` / `msgstr ""`, with
the real text split across the following quoted lines (concatenated by Gettext).
**Single-line entries**:
```po
msgid "commit message"
msgstr "提交说明"
```
**Multi-line entries**:
```po
msgid ""
"Line 1\n"
"Line 2"
msgstr ""
"行 1\n"
"行 2"
```
**CRITICAL**: Do **not** use `grep '^msgstr ""'` to find untranslated entries;
multi-line `msgstr` blocks use the same opening line, so grep gives false
positives. Use `msgattrib` (next section).
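The false positive is easy to reproduce. The sketch below is illustrative only: the temporary sample file and its contents are made up for the demonstration. It writes one translated multi-line entry and one untranslated entry, then counts the lines that open with `msgstr ""`.

```shell
# Illustrative only: a throwaway PO fragment with one translated
# multi-line entry and one genuinely untranslated entry.
sample=$(mktemp)
cat >"$sample" <<'EOF'
msgid ""
"Line 1\n"
"Line 2"
msgstr ""
"行 1\n"
"行 2"

msgid "commit message"
msgstr ""
EOF

# Naive count: this also matches the opening line of the translated
# multi-line entry, not just the untranslated one.
naive_count=$(grep -c '^msgstr ""' "$sample")
echo "grep matched $naive_count entries; only 1 is untranslated"
rm -f "$sample"
```

The count is off by one here, and in a real `po/XX.po` it is off by every multi-line translation in the file.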
### Locating untranslated, fuzzy, and obsolete entries
Use `msgattrib` to list untranslated, fuzzy, and obsolete entries. Task 3
(translating `po/XX.po`) uses these commands.
- **Untranslated**: `msgattrib --untranslated --no-obsolete po/XX.po`
- **Fuzzy**: `msgattrib --only-fuzzy --no-obsolete po/XX.po`
- **Obsolete** (`#~`): `msgattrib --obsolete --no-wrap po/XX.po`
### Translating fuzzy entries
Fuzzy entries need re-translation because the source text changed. The format
differs by file type:
- **PO file**: A `#, fuzzy` tag in the entry comments marks the entry as fuzzy.
- **JSON file**: The entry has `"fuzzy": true`.
**Translation principles**: Re-translate the `msgstr` (and, for plural entries,
`msgstr[n]`) into the target language. Do **not** modify `msgid` or
`msgid_plural`. After translation, **clear the fuzzy mark**: in PO, remove the
`#, fuzzy` tag from comments; in JSON, omit `fuzzy` or set it to `false`.
### Preserving Special Characters
Preserve escape sequences (`\n`, `\"`, `\\`, `\t`), placeholders (`%s`, `%d`,
etc.), and quotes exactly as in `msgid`. Only reorder placeholders with
positional syntax when needed (see Placeholder Reordering below).
### Placeholder Reordering
When reordering placeholders relative to `msgid`, use positional syntax (`%n$`)
where *n* is the 1-based argument index, so each argument still binds to the
right value. Preserve width and precision modifiers, and place `%n$` before
them (see examples below).
**Example 1** (placeholder reordering with precision):
```po
msgid "missing environment variable '%s' for configuration '%.*s'"
msgstr "配置 '%3$.*2$s' 缺少环境变量 '%1$s'"
```
`%s` → argument 1 → `%1$s`. `%.*s` needs precision (arg 2) and string (arg 3) →
`%3$.*2$s`.
**Example 2** (multi-line, five `%s`, two of them reordered):
```po
msgid ""
"Path updated: %s renamed to %s in %s, inside a directory that was renamed in "
"%s; moving it to %s."
msgstr ""
"路径已更新:%1$s 在 %3$s 中被重命名为 %2$s,而其所在目录又在 %4$s 中被重命"
"名,因此将其移动到 %5$s。"
```
Original order 1,2,3,4,5; in translation 1,3,2,4,5. Each line must be a
complete quoted string.
**Example 3** (no placeholder reordering):
```po
msgid "MIDX %s must be an ancestor of %s"
msgstr "MIDX %s 必须是 %s 的祖先"
```
Argument order is still 1,2 in translation, so `%n$` is not needed.
If no placeholder reordering occurs, you **must not** introduce `%n$`
syntax; keep the original non-positional placeholders (`%s`, `%d`, etc.).
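A quick way to spot translations that may have introduced `%n$` unnecessarily is to compare the two strings for positional syntax. This is a minimal sketch: `has_positional` and `check_positional` are hypothetical helpers, not part of gettext or `git-po-helper`, and they only flag candidates for a human or agent to inspect; they do not work out whether the argument order really changed.

```shell
# Hypothetical helper: succeeds when the string contains a positional
# conversion such as %1$s or %3$.*2$s.
has_positional () {
    printf '%s\n' "$1" | grep -Eq '%[0-9]+\$'
}

# Hypothetical helper: flag a msgstr that uses %n$ although its msgid
# does not. Arguments: <msgid> <msgstr>.
check_positional () {
    if ! has_positional "$1" && has_positional "$2"
    then
        echo "review: msgstr uses %n\$ syntax; confirm the order really changed"
    else
        echo "ok"
    fi
}

check_positional 'MIDX %s must be an ancestor of %s' 'MIDX %s 必须是 %s 的祖先'
```

A flagged entry is not necessarily wrong: Examples 1 and 2 above would be flagged too, and both are legitimate reorderings.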
### Validating PO File Format
Check the PO file using the command below:
```shell
msgfmt --check -o /dev/null po/XX.po
```
Common validation errors include:
- Unclosed quotes
- Missing escape sequences
- Invalid placeholder syntax
- Malformed multi-line entries
- Incorrect line breaks in multi-line strings
On failure, `msgfmt` prints the line number; fix the PO at that line.
### Using git-po-helper
[git-po-helper](https://github.com/git-l10n/git-po-helper) supports Git l10n with
**quality checking** (git-l10n PR conventions) and **AI-assisted translation**
(subcommands for automated workflows). Housekeeping tasks in this document use
it when available; otherwise rely on gettext tools.
#### Splitting large PO files
When a PO file is too large for translation or review, use `git-po-helper
msg-select` to split it by entry index.
- **Entry 0** is the header (included by default; use `--no-header` to omit).
- **Entries 1, 2, 3, …** are content entries.
- **Range format**: `--range "1-50"` (entries 1 through 50), `--range "-50"`
(first 50 entries), `--range "51-"` (from entry 51 to end). Shortcuts:
`--head N` (first N), `--tail N` (last N), `--since N` (from N to end).
- **Output format**: PO by default; use `--json` for GETTEXT JSON. See the
"GETTEXT JSON format" section (under git-po-helper) for details.
- **State filter**: Use `--translated`, `--untranslated`, `--fuzzy` to filter
by state (OR relationship). Use `--no-obsolete` to exclude obsolete entries;
`--with-obsolete` to include (default). Use `--only-same` or `--only-obsolete`
for a single state. Range applies to the filtered list.
```shell
# First 50 entries (header + entries 1–50)
git-po-helper msg-select --range "-50" po/in.po -o po/out.po
# Entries 51–100
git-po-helper msg-select --range "51-100" po/in.po -o po/out.po
# Entries 101 to end
git-po-helper msg-select --range "101-" po/in.po -o po/out.po
# Entries 1–50 without header (content only)
git-po-helper msg-select --range "1-50" --no-header po/in.po -o po/frag.po
# Output as JSON; select untranslated and fuzzy entries, exclude obsolete
git-po-helper msg-select --json --untranslated --fuzzy --no-obsolete po/in.po >po/filtered.json
```
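The three range forms can be read as a pair of 1-based, inclusive bounds. The sketch below only illustrates that reading; `range_bounds` is a hypothetical helper, not `git-po-helper` code, and the total entry count is passed in by hand.

```shell
# Hypothetical helper: print "START END" for a range specification,
# given the total number of content entries.
range_bounds () {
    spec=$1 total=$2
    start=${spec%%-*}    # text before the first "-"
    end=${spec#*-}       # text after the first "-"
    test -n "$start" || start=1
    test -n "$end" || end=$total
    echo "$start $end"
}

range_bounds "1-50" 120    # 1 50
range_bounds "-50" 120     # 1 50
range_bounds "51-" 120     # 51 120
```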
#### Comparing PO files for translation and review
`git-po-helper compare` shows PO changes with full entry context (unlike
`git diff`). Redirect output to a file: it is empty when there are no new or
changed entries; otherwise it contains a valid PO header.
```shell
# Get full context of local changes (HEAD vs working tree)
git-po-helper compare po/XX.po -o po/out.po
# Get full context of changes in a specific commit (parent vs commit)
git-po-helper compare --commit <commit> po/XX.po -o po/out.po
# Get full context of changes since a commit (commit vs working tree)
git-po-helper compare --since <commit> po/XX.po -o po/out.po
# Get full context between two commits
git-po-helper compare -r <commit1>..<commit2> po/XX.po -o po/out.po
# Get full context of two worktree files
git-po-helper compare po/old.po po/new.po -o po/out.po
# Check msgid consistency (detect tampering); no output means target matches source
git-po-helper compare --msgid po/old.po po/new.po >po/out.po
```
**Options summary**
| Option | Meaning |
|---------------------|------------------------------------------------|
| (none) | Compare HEAD with working tree (local changes) |
| `--commit <commit>` | Compare parent of commit with the commit |
| `--since <commit>` | Compare commit with working tree |
| `-r x..y` | Compare revision x with revision y |
| `-r x..` | Compare revision x with working tree |
| `-r x` | Compare parent of x with x |
#### Concatenating multiple PO/JSON files
`git-po-helper msg-cat` merges PO, POT, or gettext JSON inputs into one stream.
Duplicate `msgid` values keep the first occurrence in file order. Write with
`-o <file>` or stdout (`-o -` or omit); `--json` selects JSON output, else PO.
```shell
# Convert JSON to PO (e.g. after translation)
git-po-helper msg-cat --unset-fuzzy -o po/out.po po/in.json
# Merge multiple PO/JSON files
git-po-helper msg-cat -o po/out.po po/in-1.po po/in-2.json
```
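The "first occurrence wins" rule for duplicates can be pictured with a simplified stream in which every line is `msgid<TAB>msgstr`. This is an illustration of the rule only, under that made-up line format; `keep_first` is a hypothetical name and does not parse real PO or JSON input.

```shell
# Illustration only: keep the first line seen for each msgid (field 1)
# and drop later duplicates.
keep_first () {
    awk -F '\t' '!seen[$1]++'
}

# The second "commit" line is discarded.
printf 'commit\t提交\nbranch\t分支\ncommit\t委托\n' | keep_first
```

Input order therefore matters: list the file whose translations should take precedence first.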
#### GETTEXT JSON format
The **GETTEXT JSON** format is an internal format defined by `git-po-helper`
for convenient batch processing of translation and related tasks by AI models.
`git-po-helper msg-select`, `git-po-helper msg-cat`, and `git-po-helper compare`
read and write this format.
**Top-level structure**:
```json
{
"header_comment": "string",
"header_meta": "string",
"entries": [ /* array of entry objects */ ]
}
```
| Field | Description |
|------------------|--------------------------------------------------------------------------------|
| `header_comment` | Lines above the first `msgid ""` (comments, glossary), directly concatenated. |
| `header_meta` | Encoded `msgstr` of the header entry (Project-Id-Version, Plural-Forms, etc.). |
| `entries` | List of PO entries. Order matches source. |
**Entry object** (each element of `entries`):
| Field | Type | Description |
|-----------------|----------|--------------------------------------------------------------|
| `msgid` | string | Singular message ID. PO escapes encoded (e.g. `\n` → `\\n`). |
| `msgstr` | []string | Translation forms as a **JSON array only**. Details below. |
| `msgid_plural` | string | Plural form of msgid. Omit for non-plural. |
| `comments` | []string | Comment lines (`#`, `#.`, `#:`, `#,`, etc.). |
| `fuzzy` | bool | True if entry has fuzzy flag. |
| `obsolete` | bool | True for `#~` obsolete entries. Omit if false. |
**`msgstr` array (required shape)**:
- **Always** a JSON array of strings, never a single string. One element = singular
(PO `msgstr` / `msgstr[0]`); multiple elements = plural forms in order
(`msgstr[0]`, `msgstr[1]`, …).
- Omit the key or use an empty array when the entry is untranslated.
**Example (single-line entry)**:
```json
{
"header_comment": "# Glossary:\\n# term1\\tTranslation 1\\n#\\n",
"header_meta": "Project-Id-Version: git\\nContent-Type: text/plain; charset=UTF-8\\n",
"entries": [
{
"msgid": "Hello",
"msgstr": ["你好"],
"comments": ["#. Comment for translator\\n", "#: src/file.c:10\\n"],
"fuzzy": false
}
]
}
```
**Example (plural entry)**:
```json
{
"msgid": "One file",
"msgid_plural": "%d files",
"msgstr": ["一个文件", "%d 个文件"],
"comments": ["#, c-format\\n"]
}
```
**Example (fuzzy entry before translation)**:
```json
{
"msgid": "Old message",
"msgstr": ["旧翻译。"],
"comments": ["#, fuzzy\\n"],
"fuzzy": true
}
```
**Translation notes for GETTEXT JSON files**:
- **Preserve structure**: Keep `header_comment`, `header_meta`, `msgid`,
`msgid_plural` unchanged.
- **Fuzzy entries**: Entries extracted from fuzzy PO entries have `"fuzzy": true`.
After translating, **remove the `fuzzy` field** or set it to `false` in the
output JSON. The merge step uses `--unset-fuzzy`, which can also remove the
`fuzzy` field.
- **Placeholders**: Preserve `%s`, `%d`, etc. exactly; use `%n$` when
reordering (see "Placeholder Reordering" above).
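The escaping rule in the entry table (`\n` becoming `\\n`) reads as ordinary JSON string escaping applied to the raw PO text. The sketch below assumes that reading; `po_text_to_json` is a hypothetical helper and says nothing about how `git-po-helper` is implemented internally.

```shell
# Hypothetical helper: convert the raw text of a PO string (what sits
# between the quotes in the PO file) into the body of a JSON string by
# escaping backslashes first, then double quotes.
po_text_to_json () {
    printf '%s\n' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

po_text_to_json 'Line 1\n'    # prints: Line 1\\n
```

When writing `po/l10n-done.json`, the same rule applies to every translated string: a PO `\n` must appear as `\\n` in the JSON text, not as a real line break.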
### Quality checklist
- **Accuracy**: Faithful to original meaning; no omissions or distortions.
- **Fuzzy entries**: Re-translate fully and clear the fuzzy flag (see
"Translating fuzzy entries" above).
- **Terminology**: Consistent with glossary (see "Glossary Section" above) or
domain standards.
- **Grammar and fluency**: Correct and natural in the target language.
- **Placeholders**: Preserve variables (`%s`, `{name}`, `$1`) exactly; use
positional parameters when reordering (see "Placeholder Reordering" above).
- **Special characters**: Preserve escape sequences (`\n`, `\"`, `\\`, `\t`)
and placeholders exactly as in `msgid`. See "Preserving Special Characters" above.
- **Plurals and gender**: Correct forms and agreement.
- **Context fit**: Suitable for UI space, tone, and use (e.g. error vs. tooltip).
- **Cultural appropriateness**: No offensive or ambiguous content.
- **Consistency**: Match prior translations of the same source.
- **Technical integrity**: Do not translate code, paths, commands, brands, or
proper nouns.
- **Readability**: Clear, concise, and user-friendly.
## Housekeeping tasks for localization workflows
For common housekeeping tasks, follow the steps in the matching subsection
below.
### Task 1: Generating or updating po/git.pot
When asked to generate or update `po/git.pot` (or the like):
1. **Directly execute** the command `make po/git.pot` without checking
if the file exists beforehand.
2. **Do not verify** the generated file after execution. Simply run the
command and consider the task complete.
### Task 2: Updating po/XX.po
When asked to update `po/XX.po` (or the like):
1. **Directly execute** the command `make po-update PO_FILE=po/XX.po`
without reading or checking the file content beforehand.
2. **Do not verify, translate, or review** the updated file after execution.
Simply run the command and consider the task complete.
### Task 3: Translating po/XX.po
To translate `po/XX.po`, use the steps below. The scripts use gettext or
`git-po-helper` depending on what is installed; JSON export (when available)
supports batch translation rather than per-entry work.
**Workflow loop**: Steps 1→2→3→4→5→6→7 form a loop. After step 6 succeeds,
**always** go to step 7, which returns to step 1. The **only** exit to step 8
is when step 2 finds `po/l10n-pending.po` empty. Do not skip step 7 or jump to
step 8 after step 6.
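The loop described above can be summarized as a control-flow sketch. It is an overview, not a replacement for the per-step instructions: `l10n_extract_pending`, `l10n_one_batch`, `l10n_validate_done`, and `l10n_merge_batch` are the functions defined in the steps below, while `translate_batch` and `fix_batch` are hypothetical stand-ins for the agent's own work in steps 4a/4b and 5.

```shell
# Control-flow sketch only; see the lead-in for which names are
# hypothetical. Argument: the PO file being translated.
l10n_translate_loop () {
    po_file=$1
    # Steps 1-2: a non-zero status means nothing is pending.
    while l10n_extract_pending "$po_file"
    do
        l10n_one_batch "$po_file" || return 1      # step 3
        translate_batch || return 1                # step 4a or 4b
        # Step 5: fix the output and retry until validation passes.
        until l10n_validate_done
        do
            fix_batch || return 1
        done
        l10n_merge_batch "$po_file" || return 1    # step 6
    done                                           # step 7: back to step 1
    # Step 8: final check, reached only when nothing is pending.
    msgfmt --check --stat -o /dev/null "$po_file"
}
```

The `while` condition is the single exit from the loop, which mirrors the rule that step 8 is reached only from step 2.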
1. **Extract entries to translate**: **Directly execute** the script below—it is
authoritative; do not reimplement. It generates `po/l10n-pending.po` with
messages that need translation.
```shell
l10n_extract_pending () {
test $# -ge 1 || { echo "Usage: l10n_extract_pending <po-file>" >&2; return 1; }
PO_FILE="$1"
PENDING="po/l10n-pending.po"
PENDING_FUZZY="${PENDING}.fuzzy"
PENDING_REFER="${PENDING}.fuzzy.reference"
PENDING_UNTRANS="${PENDING}.untranslated"
rm -f "$PENDING"
if command -v git-po-helper >/dev/null 2>&1
then
git-po-helper msg-select --untranslated --fuzzy --no-obsolete -o "$PENDING" "$PO_FILE"
else
msgattrib --untranslated --no-obsolete "$PO_FILE" >"${PENDING_UNTRANS}"
msgattrib --only-fuzzy --no-obsolete --clear-fuzzy --empty "$PO_FILE" >"${PENDING_FUZZY}"
msgattrib --only-fuzzy --no-obsolete "$PO_FILE" >"${PENDING_REFER}"
msgcat --use-first "${PENDING_UNTRANS}" "${PENDING_FUZZY}" >"$PENDING"
rm -f "${PENDING_UNTRANS}" "${PENDING_FUZZY}"
fi
if test -s "$PENDING"
then
msgfmt --stat -o /dev/null "$PENDING" || true
echo "Pending file is not empty; there are still entries to translate."
else
echo "No entries need translation."
return 1
fi
}
# Run the extraction. Example: l10n_extract_pending po/zh_CN.po
l10n_extract_pending po/XX.po
```
2. **Check generated file**: If `po/l10n-pending.po` is empty or does not exist,
translation is complete; go to step 8. Otherwise proceed to step 3.
3. **Prepare one batch for translation**: Batching keeps each run small so the
model can complete translation within limited context. **BEFORE translating**,
**directly execute** the script below—it is authoritative; do not reimplement.
Based on which file the script produces: if `po/l10n-todo.json` exists, go to
step 4a; if `po/l10n-todo.po` exists, go to step 4b.
```shell
l10n_one_batch () {
test $# -ge 1 || { echo "Usage: l10n_one_batch <po-file> [min_batch_size]" >&2; return 1; }
PO_FILE="$1"
min_batch_size=${2:-100}
PENDING="po/l10n-pending.po"
TODO_JSON="po/l10n-todo.json"
TODO_PO="po/l10n-todo.po"
DONE_JSON="po/l10n-done.json"
DONE_PO="po/l10n-done.po"
rm -f "$TODO_JSON" "$TODO_PO" "$DONE_JSON" "$DONE_PO"
# grep -c prints 0 itself when nothing matches; fall back only on failure
ENTRY_COUNT=$(grep -c '^msgid ' "$PENDING" 2>/dev/null) || ENTRY_COUNT=0
ENTRY_COUNT=$((ENTRY_COUNT > 0 ? ENTRY_COUNT - 1 : 0))
if test "$ENTRY_COUNT" -gt $min_batch_size
then
if test "$ENTRY_COUNT" -gt $((min_batch_size * 8))
then
NUM=$((min_batch_size * 2))
elif test "$ENTRY_COUNT" -gt $((min_batch_size * 4))
then
NUM=$((min_batch_size + min_batch_size / 2))
else
NUM=$min_batch_size
fi
BATCHING=1
else
NUM=$ENTRY_COUNT
BATCHING=
fi
if command -v git-po-helper >/dev/null 2>&1
then
if test -n "$BATCHING"
then
git-po-helper msg-select --json --head "$NUM" -o "$TODO_JSON" "$PENDING"
echo "Processing batch of $NUM entries (out of $ENTRY_COUNT remaining)"
else
git-po-helper msg-select --json -o "$TODO_JSON" "$PENDING"
echo "Processing all $ENTRY_COUNT entries at once"
fi
else
if test -n "$BATCHING"
then
awk -v num="$NUM" '/^msgid / && count++ > num {exit} 1' "$PENDING" |
tac | awk '/^$/ {found=1} found' | tac >"$TODO_PO"
echo "Processing batch of $NUM entries (out of $ENTRY_COUNT remaining)"
else
cp "$PENDING" "$TODO_PO"
echo "Processing all $ENTRY_COUNT entries at once"
fi
fi
}
# Prepare one batch; shrink 2nd arg when batches exceed agent capacity.
l10n_one_batch po/XX.po 100
```
4a. **Translate JSON batch** (`po/l10n-todo.json` → `po/l10n-done.json`):
- **Task**: Translate `po/l10n-todo.json` (input, GETTEXT JSON) into
`po/l10n-done.json` (output, GETTEXT JSON). See the "GETTEXT JSON format"
section above for format details and translation rules.
- **Reference glossary**: Read the glossary from the batch file's
`header_comment` (see "Glossary Section" above) and use it for
consistent terminology.
- **When translating**: Follow the "Quality checklist" above for correctness
and quality. Handle escape sequences (`\n`, `\"`, `\\`, `\t`), placeholders,
and quotes correctly as in `msgid`. For JSON, correctly escape and unescape
these sequences when reading and writing. Modify `msgstr` and `msgstr[n]`
(for plural entries); clear the fuzzy flag (omit or set `fuzzy` to `false`).
Do **not** modify `msgid` or `msgid_plural`.
4b. **Translate PO batch** (`po/l10n-todo.po` → `po/l10n-done.po`):
- **Task**: Translate `po/l10n-todo.po` (input, GETTEXT PO) into
`po/l10n-done.po` (output, GETTEXT PO).
- **Reference glossary**: Read the glossary from the pending file header
(see "Glossary Section" above) and use it for consistent terminology.
- **When translating**: Follow the "Quality checklist" above for correctness
and quality. Preserve escape sequences (`\n`, `\"`, `\\`, `\t`), placeholders,
and quotes as in `msgid`. Modify `msgstr` and `msgstr[n]` (for plural
entries); remove the `#, fuzzy` tag from comments when done. Do **not**
modify `msgid` or `msgid_plural`.
5. **Validate `po/l10n-done.po`**:
Run the validation script below. If it fails, fix per the errors and notes,
re-run until it succeeds.
```shell
l10n_validate_done () {
DONE_PO="po/l10n-done.po"
DONE_JSON="po/l10n-done.json"
PENDING="po/l10n-pending.po"
if test -f "$DONE_JSON" && { ! test -f "$DONE_PO" || test "$DONE_JSON" -nt "$DONE_PO"; }
then
git-po-helper msg-cat --unset-fuzzy -o "$DONE_PO" "$DONE_JSON" || {
echo "ERROR [JSON to PO conversion]: Fix $DONE_JSON and re-run." >&2
return 1
}
fi
# Check 1: msgid should not be modified
MSGID_OUT=$(git-po-helper compare -q --msgid --assert-no-changes \
"$PENDING" "$DONE_PO" 2>&1)
MSGID_RC=$?
if test $MSGID_RC -ne 0 || test -n "$MSGID_OUT"
then
echo "ERROR [msgid modified]: The following entries appeared after" >&2
echo "translation because msgid was altered. Fix in $DONE_PO." >&2
echo "$MSGID_OUT" >&2
return 1
fi
# Check 2: PO format (see "Validating PO File Format" for error handling)
MSGFMT_OUT=$(msgfmt --check -o /dev/null "$DONE_PO" 2>&1)
MSGFMT_RC=$?
if test $MSGFMT_RC -ne 0
then
echo "ERROR [PO format]: Fix errors in $DONE_PO." >&2
echo "$MSGFMT_OUT" >&2
return 1
fi
echo "Validation passed."
}
l10n_validate_done
```
If the script fails, fix **directly in `po/l10n-done.po`**. Re-run
`l10n_validate_done` until it succeeds. Editing `po/l10n-done.json` is not
recommended because it adds an extra JSON-to-PO conversion step. Use the
error message to decide:
- **`[msgid modified]`**: The listed entries have altered `msgid`; restore
them to match `po/l10n-pending.po`.
- **`[PO format]`**: `msgfmt` reports line numbers; fix the errors in place.
See "Validating PO File Format" for common issues.
6. **Merge translation results into `po/XX.po`**: Run the script below. If it
fails, fix the file the error names: **`[JSON to PO conversion]`** →
`po/l10n-done.json`; **`[msgcat merge]`** → `po/l10n-done.po`. Re-run until
it succeeds.
```shell
l10n_merge_batch () {
test $# -ge 1 || { echo "Usage: l10n_merge_batch <po-file>" >&2; return 1; }
PO_FILE="$1"
DONE_PO="po/l10n-done.po"
DONE_JSON="po/l10n-done.json"
MERGED="po/l10n-done.merged"
PENDING="po/l10n-pending.po"
PENDING_REFER="${PENDING}.fuzzy.reference"
TODO_JSON="po/l10n-todo.json"
TODO_PO="po/l10n-todo.po"
if test -f "$DONE_JSON" && { ! test -f "$DONE_PO" || test "$DONE_JSON" -nt "$DONE_PO"; }
then
git-po-helper msg-cat --unset-fuzzy -o "$DONE_PO" "$DONE_JSON" || {
echo "ERROR [JSON to PO conversion]: Fix $DONE_JSON and re-run." >&2
return 1
}
fi
msgcat --use-first "$DONE_PO" "$PO_FILE" >"$MERGED" || {
echo "ERROR [msgcat merge]: Fix errors in $DONE_PO and re-run." >&2
return 1
}
mv "$MERGED" "$PO_FILE"
rm -f "$TODO_JSON" "$TODO_PO" "$DONE_JSON" "$DONE_PO" "$PENDING_REFER"
}
# Run the merge. Example: l10n_merge_batch po/zh_CN.po
l10n_merge_batch po/XX.po
```
7. **Loop**: **MUST** return to step 1 (Extract entries) and repeat the cycle.
Do **not** skip this step or go to step 8. Step 8 (below) runs **only**
when step 2 finds no more entries and redirects there.
8. **Only after loop exits**: Run the command below to validate the PO file and
display the report. The process ends here.
```shell
msgfmt --check --stat -o /dev/null po/XX.po
```
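The batch sizing used by `l10n_one_batch` in step 3 (and again by `review_one_batch` in Task 4) grows the batch as the backlog grows. The sketch below isolates that rule so the thresholds are easier to see; `batch_size` is a hypothetical name introduced here, not a function the workflow calls.

```shell
# Sketch of the sizing rule only. Arguments: number of remaining
# entries, and the minimum batch size (default 100).
batch_size () {
    remaining=$1 min=${2:-100}
    if test "$remaining" -gt $((min * 8))
    then
        echo $((min * 2))
    elif test "$remaining" -gt $((min * 4))
    then
        echo $((min + min / 2))
    elif test "$remaining" -gt "$min"
    then
        echo "$min"
    else
        echo "$remaining"
    fi
}

batch_size 900 100    # 200
batch_size 500 100    # 150
batch_size 300 100    # 100
batch_size 80 100     # 80
```

Lowering the minimum batch size scales every threshold down with it, which is why shrinking that one argument is enough when batches exceed the agent's capacity.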
### Task 4: Reviewing translation quality
Review may target the full `po/XX.po`, a specific commit, or changes since a
commit. When asked to review, follow the steps below.
**Workflow**: Follow steps in order. Do **NOT** use `git show`, `git diff`,
`git format-patch`, or similar to get changes—they break PO context; use **only**
`git-po-helper compare` for extraction. Without `git-po-helper`, refuse the task.
Steps 3→4→5→6→7 loop: after step 6, **always** go to step 7 (back to step 3).
The **only** ways to step 8 are when step 4 finds `po/review-todo.json` missing
or empty (no batch left to review), or when step 1 finds `po/review-result.json`
already present.
1. **Check for existing review (resume support)**: Evaluate the following in order:
- If `po/review-input.po` does **not** exist, proceed to step 2 (Extract
entries) for a fresh start.
- Else if `po/review-result.json` exists, go to step 8 (only after loop exits).
- Else if `po/review-done.json` exists, go to step 6 (Rename result).
- Else if `po/review-todo.json` exists, go to step 5 (Review the current
batch).
- Else go to step 3 (Prepare one batch).
2. **Extract entries**: Run `git-po-helper compare` with the desired range and
redirect the output to `po/review-input.po`. See "Comparing PO files for
translation and review" under git-po-helper for options.
3. **Prepare one batch**: Batching keeps each run small so the model can
complete review within limited context. **Directly execute** the script
below—it is authoritative; do not reimplement.
```shell
review_one_batch () {
min_batch_size=${1:-100}
INPUT_PO="po/review-input.po"
PENDING="po/review-pending.po"
TODO="po/review-todo.json"
DONE="po/review-done.json"
BATCH_FILE="po/review-batch.txt"
if test ! -f "$INPUT_PO"
then
rm -f "$TODO"
echo >&2 "cannot find $INPUT_PO, nothing for review"
return 1
fi
if test ! -f "$PENDING" || test "$INPUT_PO" -nt "$PENDING"
then
rm -f "$BATCH_FILE" "$TODO" "$DONE"
rm -f po/review-result*.json
cp "$INPUT_PO" "$PENDING"
fi
# grep -c prints 0 itself when nothing matches; fall back only on failure
ENTRY_COUNT=$(grep -c '^msgid ' "$PENDING" 2>/dev/null) || ENTRY_COUNT=0
ENTRY_COUNT=$((ENTRY_COUNT > 0 ? ENTRY_COUNT - 1 : 0))
if test "$ENTRY_COUNT" -eq 0
then
rm -f "$TODO"
echo >&2 "No entries left for review"
return 1
fi
if test "$ENTRY_COUNT" -gt $min_batch_size
then
if test "$ENTRY_COUNT" -gt $((min_batch_size * 8))
then
NUM=$((min_batch_size * 2))
elif test "$ENTRY_COUNT" -gt $((min_batch_size * 4))
then
NUM=$((min_batch_size + min_batch_size / 2))
else
NUM=$min_batch_size
fi
else
NUM=$ENTRY_COUNT
fi
BATCH=$(cat "$BATCH_FILE" 2>/dev/null || echo 0)
BATCH=$((BATCH + 1))
echo "$BATCH" >"$BATCH_FILE"
git-po-helper msg-select --json --head "$NUM" -o "$TODO" "$PENDING"
git-po-helper msg-select --since "$((NUM + 1))" -o "${PENDING}.tmp" "$PENDING"
mv "${PENDING}.tmp" "$PENDING"
echo "Processing batch $BATCH ($NUM entries out of $ENTRY_COUNT)"
}
# The parameter controls batch size; reduce if the batch file is too large.
review_one_batch 100
```
4. **Check todo file**: If `po/review-todo.json` does not exist or is empty,
review is complete; go to step 8 (only after loop exits). Otherwise proceed to
step 5.
5. **Review the current batch**: Review translations in `po/review-todo.json`
and write findings to `po/review-done.json` as follows:
- Use "Background knowledge for localization workflows" for PO/JSON structure,
placeholders, and terminology.
- If `header_comment` includes a glossary, follow it for consistency.
- Do **not** review the header (`header_comment`, `header_meta`).
- For every other entry, check the entry's `msgstr` **array** (translation
forms) against `msgid` / `msgid_plural` using the "Quality checklist" above.
- Write JSON per "Review result JSON format" below; use `{"issues": []}` when
there are no issues. **Always** write `po/review-done.json`—it marks the
batch complete.
6. **Rename result**: Rename `po/review-done.json` to `po/review-result-<N>.json`,
where N is the value in `po/review-batch.txt` (the batch just completed).
Run the script below:
```shell
review_rename_result () {
TODO="po/review-todo.json"
DONE="po/review-done.json"
BATCH_FILE="po/review-batch.txt"
if test -f "$DONE"
then
N=$(cat "$BATCH_FILE" 2>/dev/null) || { echo "ERROR: $BATCH_FILE not found." >&2; return 1; }
mv "$DONE" "po/review-result-$N.json"
echo "Renamed to po/review-result-$N.json"
fi
rm -f "$TODO"
}
review_rename_result
```
7. **Loop**: **MUST** return to step 3 (Prepare one batch) and repeat the cycle.
Do **not** skip this step or go to step 8. From inside the loop, step 8 is
reached **only** when step 4 finds `po/review-todo.json` missing or empty.
8. **Only after loop exits**: **Directly execute** the command below. It merges
results, applies suggestions, and displays the report. The process ends here.
```shell
git-po-helper agent-run review --report po
```
**Do not** run cleanup or delete intermediate files. Keep them for inspection
or resumption.
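The resume checks of step 1 form a simple decision chain over the intermediate files. A minimal sketch, assuming the file names given in step 1; `review_next_step` is a hypothetical helper introduced for illustration and is not part of `git-po-helper`.

```shell
# Hypothetical helper: print the number of the step to run next,
# evaluating the checks of step 1 in order. The directory argument
# defaults to "po".
review_next_step () {
    dir=${1:-po}
    if test ! -f "$dir/review-input.po"
    then
        echo 2    # fresh start: extract entries
    elif test -f "$dir/review-result.json"
    then
        echo 8    # review already finished: report
    elif test -f "$dir/review-done.json"
    then
        echo 6    # batch reviewed but not yet renamed
    elif test -f "$dir/review-todo.json"
    then
        echo 5    # batch prepared but not yet reviewed
    else
        echo 3    # prepare the next batch
    fi
}
```

Because the checks run in a fixed order, leaving the intermediate files in place is what makes an interrupted review resumable.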
**Review result JSON format**:
The **Review result JSON** format defines the structure for translation
review reports. For each entry with translation issues, create an issue
object as follows:
- Copy the original entry's `msgid`, optional `msgid_plural`, and optional
`msgstr` array (original translation forms) into the issue object. Use the
same shape as GETTEXT JSON: `msgstr` is **always a JSON array** when present
(one element singular, multiple for plural).
- Write a summary of all issues found for this entry in `description`.
- Set `score` according to the severity of issues found for this entry,
from 0 to 3 (0 = critical; 1 = major; 2 = minor; 3 = perfect, no issues).
**Lower score means more severe issues.**
- Place the suggested translation in **`suggest_msgstr`** as a **JSON array**:
one string for singular, multiple strings for plural forms in order. This is
required for `git-po-helper` to apply suggestions.
- Include only entries with issues (score less than 3). When no issues are
found in the batch, write `{"issues": []}`.
Example review result (with issues):
```json
{
"issues": [
{
"msgid": "commit",
"msgstr": ["委托"],
"score": 0,
"description": "Terminology error: 'commit' should be translated as '提交'",
"suggest_msgstr": ["提交"]
},
{
"msgid": "repository",
"msgid_plural": "repositories",
"msgstr": ["版本库", "版本库"],
"score": 2,
"description": "Consistency issue: suggest using '仓库' consistently",
"suggest_msgstr": ["仓库", "仓库"]
}
]
}
```
Field descriptions for each issue object (element of the `issues` array):
- `msgid` (and optional `msgid_plural` for plural entries): Original source text.
- `msgstr` (optional): JSON array of original translation forms (same meaning as
in GETTEXT JSON entries).
- `suggest_msgstr`: JSON array of suggested translation forms; **must be an
array** (e.g. `["提交"]` for singular). Plural entries use multiple elements
in order.
- `score`: 0–3 (0 = critical; 1 = major; 2 = minor; 3 = perfect, no issues).
- `description`: Brief summary of the issue.
## Human translators remain in control
Git translation is human-driven; language team leaders and contributors are
responsible for maintaining translation quality and consistency.
AI-generated output should always be treated as drafts that must be reviewed
and approved by someone who understands both the technical context and the
target language. The best results come from combining AI efficiency with human
judgment, cultural insight, and community engagement.