harden: reject USER_CHANGES inserts without an author attribute by JohnMcLear · Pull Request #7773 · ether/etherpad

JohnMcLear · 2026-05-16T06:06:47Z

Summary

Adds defensive validation in handleUserChanges that rejects USER_CHANGES whose + (insert) ops lack the author attribute. Insert ops without an author attribute can produce an AText where text and attribs disagree on length, which downstream clients then fail to reconcile in ace2_inner.ts:setDocAText with:

mismatch error setting raw text in setDocAText

Once a pad reaches that state, every subsequent browser load throws on init and the pad becomes effectively unloadable.

The check is a one-liner inside the existing per-op validation loop and reuses the wireApool that's already constructed for the author-match validation a few lines above — no extra parsing or new traversal.

Why now

The standard JS web client always tags inserts with the author attribute (via Changeset.builder(...).insert(text, [['author', authorId]], pool)), so legitimate clients are unaffected. Less-careful clients (older code paths, third-party integrations, naive ports) can today send unattributed inserts that the server silently accepts — producing the broken AText state described above. This commit closes that gap server-side.

Changes

src/node/handler/PadMessageHandler.ts — single new check at the end of the existing per-op validation loop. Reuses opAuthorId already computed there. ~6 lines.
src/tests/backend/specs/messages.ts — existing USER_CHANGES test fixtures updated to send proper author-attributed inserts (*0+5 instead of +5, plus the matching apool entry). Mirrors what the JS client does in production. A new regression test 'insert without author attribute is rejected' locks in the new server behaviour.

Test plan

Existing USER_CHANGES tests still pass with the updated fixtures
New regression test asserts disconnect: badChangeset on an insert with empty attribs
Other test files that use unattributed inserts (ImportEtherpad.ts, importexportGetPost.ts) flow through different code paths (import API, not USER_CHANGES via socket.io) and are unaffected

🤖 Generated with Claude Code

Insert ops MUST carry the author attribute reference so that pad.atext.text and pad.atext.attribs stay in lock-step. An accepted insert with empty attribs would grow text without contributing matching attribute markers, leaving the stored AText in a state where the two iterables disagree on length when reconstructed. Downstream clients then fail reconciliation in ace2_inner.ts:setDocAText with 'mismatch error setting raw text in setDocAText' on every subsequent pad load — making the affected pad effectively unloadable until manually repaired. This commit adds a single defensive check inside the existing per-op validation loop in handleUserChanges: when an op is a '+' (insert) and its attribs string doesn't yield an 'author' entry via AttributeMap.fromString, reject with badChangeset. The check piggybacks on the wireApool that was already constructed for the prior author-match validation, so no extra parsing. Test fixtures in messages.ts were updated to send proper author-attributed inserts plus the matching apool (mirroring what the JS web client always does). A new regression test 'insert without author attribute is rejected' locks in the new behaviour.

qodo-code-review · 2026-05-16T06:06:51Z

Qodo reviews are paused for this user.

Troubleshooting steps vary by plan Learn more →

On a Teams plan?
Reviews resume once this user has a paid seat and their Git account is linked in Qodo.
Link Git account →

Using GitHub Enterprise Server, GitLab Self-Managed, or Bitbucket Data Center?
These require an Enterprise plan - Contact us
Contact us →

qodo-free-for-open-source-projects · 2026-05-16T06:07:02Z

Review Summary by Qodo

(Agentic_describe updated until commit `1751900`)

Harden pad changeset validation to prevent AText desynchronization

🐞 Bug fix ✨ Enhancement

Walkthroughs

Description

• Reject USER_CHANGES inserts lacking author attribute to prevent malformed AText
• Substitute stable system author for unattributed inserts via HTTP API/plugins
• Validate trailing newline preservation in all USER_CHANGES changesets
• Block wire-borne operations claiming reserved system author identity
• Update test fixtures to include proper author attribution in changesets

Diagram

flowchart LR
  A["USER_CHANGES socket.io"] -->|validate author attrib| B["Reject if missing"]
  C["HTTP API/plugins"] -->|spliceText/setText| D["Substitute system author"]
  E["Rebased changeset"] -->|validate trailing newline| F["Reject if missing"]
  G["Wire operation"] -->|check reserved author| H["Reject if system author"]
  B --> I["Well-formed AText"]
  D --> I
  F --> I
  H --> I

File Changes

1. src/node/db/Pad.ts ✨ Enhancement +23/-2

Add system author substitution for unattributed inserts

• Added SYSTEM_AUTHOR_ID static constant for server-internal author attribution
• Modified spliceText to substitute system author when authorId is empty and insert is non-empty
• Updated appendRevision call to use effective author instead of original parameter
• Preserves backward compatibility for existing callers without requiring explicit author allocation

src/node/db/Pad.ts

2. src/node/handler/PadMessageHandler.ts 🐞 Bug fix +56/-9

Harden USER_CHANGES validation with multiple defensive checks

• Added import of applyToText from Changeset module
• Removed makeSplice from imports (no longer used in this file)
• Added validation to reject + ops without author attribute in per-op loop
• Added defense-in-depth check to reject wire operations claiming reserved system author
• Added pre-application validation to reject changesets that would strand trailing newline
• Captured canonical changeset form after pool mapping for accurate retransmission detection
• Removed automatic newline correction revision logic (now preventive instead of corrective)

src/node/handler/PadMessageHandler.ts

3. src/tests/backend/specs/Pad.ts 🧪 Tests +23/-2

Add system author substitution test coverage

• Updated spliceText and setText calls to pass explicit author IDs in existing tests
• Added new test verifying system author substitution for empty authorId with non-empty insert
• Test validates that system author binding appears in pad's attribute pool

src/tests/backend/specs/Pad.ts

View more (3)

4. src/tests/backend/specs/messages.ts 🧪 Tests +92/-12

Update USER_CHANGES tests with proper author attribution

• Captured authorId and roAuthorId from handshake responses in beforeEach
• Created authorPool helper to generate proper apool with author attribute for session
• Updated all USER_CHANGES test fixtures to include *0 author attribute reference and matching
 apool
• Added regression test for insert without author attribute rejection
• Added test for rejection of operations claiming reserved system author
• Added test for rejection of changesets that would strand trailing newline
• Updated read-only socket tests to use matching author in apool

src/tests/backend/specs/messages.ts

5. pnpm-lock.yaml Dependencies +14/-13

Update dependency versions

• Updated etherpad-cli-client from 4.0.2 to 4.0.3
• Updated eslint dependency resolution for eslint-import-resolver-typescript and
 eslint-plugin-import

pnpm-lock.yaml

6. src/package.json Dependencies +1/-1

Bump etherpad-cli-client version
• Updated etherpad-cli-client dependency from ^4.0.2 to ^4.0.3
src/package.json

qodo-free-for-open-source-projects · 2026-05-16T06:07:04Z

Code Review by Qodo

🐞 Bugs (2) 📘 Rule violations (3)

1. Trailing-newline auto-fix removed 📘 Rule violation ☼ Reliability ⭐ New

Description

handleUserChanges now rejects any USER_CHANGES changeset that would leave the pad text not
ending with \n, instead of tolerating and correcting it. This removes a previously supported
behavior without a WARN-level deprecation period, which can break older/third-party clients that
rely on server-side correction.

Code

src/node/handler/PadMessageHandler.ts[R949-964]

+    // Defensive: reject any rebased changeset whose application would leave
+    // the pad text not ending with '\n'. Previously the server silently
+    // appended a separate `nlChangeset` correction revision; that worked
+    // for the stored pad but the FIRST broadcast (the malformed user
+    // revision) reached browsers BEFORE the correction did, and the
+    // browser's line assembler asserts "line assembler not finished" on
+    // a doc that doesn't end with '\n', taking the session out. Refuse to
+    // accept such changesets — clients must always preserve the
+    // trailing-newline invariant.
+    const projectedText = applyToText(rebasedChangeset, prevText);
+    if (!projectedText.endsWith('\n')) {
+      throw new Error(
+          `Rejected USER_CHANGES whose application would leave the pad ` +
+          `without a trailing '\\n' (length ${projectedText.length}). ` +
+          `Every USER_CHANGES must preserve the "doc ends with \\n" invariant.`);
+    }

Evidence

Compliance requires a WARN deprecation path before removing previously supported behavior. The new
code explicitly throws (rejects) when the applied text would not end with \n, immediately
enforcing the new rule without any WARN-based deprecation stage.

src/node/handler/PadMessageHandler.ts[949-964]
Best Practice: Repository guidelines

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The server now hard-rejects USER_CHANGES that would violate the trailing newline invariant, which is a breaking behavior change and effectively removes the prior tolerance/correction behavior without a deprecation path.

## Issue Context
Per compliance, feature/behavior removals must be deprecated first with a WARN log, then removed/enforced in a later version. Current code throws an error immediately for these changesets.

## Fix Focus Areas
- src/node/handler/PadMessageHandler.ts[949-964]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

2. System-author undo rejected 🐞 Bug ≡ Correctness ⭐ New

Description

handleUserChanges() rejects any op whose resolved authorId is 'a.etherpad-system', including '=' ops
that restore existing authorship. Pads can now legitimately contain system-attributed text via
server-side spliceText()/setText() without an authorId, so undoing “clear authorship colors” can
disconnect clients with badChangeset when the undo attempts to restore the system author.

Code

src/node/handler/PadMessageHandler.ts[R905-908]

+      if (opAuthorId === 'a.etherpad-system') {
+        throw new Error(`Author ${thisSession.author} attempted to submit changes as the ` +
+                        `reserved system author ${opAuthorId} in changeset ${changeset}`);
+      }

Evidence
The server now creates system-authored inserts when internal callers omit authorId, making system
author a legitimate existing attribution in pads. The editor’s clear-authorship command sets author
to '' and undo restores prior author attributes via '=' ops; rejecting all occurrences of the system
author therefore blocks that undo/restore path.
src/node/db/Pad.ts[95-106]
src/node/db/Pad.ts[416-442]
src/node/handler/PadMessageHandler.ts[848-908]
src/static/js/ace2_inner.ts[610-623]
src/tests/backend/specs/undo_clear_authorship.ts[168-195]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`handleUserChanges()` rejects any op whose `author` attribute resolves to the reserved system author (`a.etherpad-system`). This breaks legitimate client flows (notably undo of “clear authorship colors”) on pads that contain system-authored text, because undo restores author attributes via `'='` ops.

### Issue Context
* Server-side internal callers now attribute inserts to `Pad.SYSTEM_AUTHOR_ID` when no `authorId` is provided, so pads can contain system-authored segments.
* The editor’s “clearauthorship” command sets author to `''` across the doc; undo restores prior author attributes (including non-self authors) via `'='` ops.
* The new unconditional rejection of `a.etherpad-system` blocks that restoration.

### Fix Focus Areas
- src/node/handler/PadMessageHandler.ts[878-908]
- src/node/db/Pad.ts[416-442]
- src/static/js/ace2_inner.ts[610-623]
- src/tests/backend/specs/undo_clear_authorship.ts[168-195]

### Proposed fix approach
1. Narrow the reserved-system-author rejection so it only blocks actual attempts to *write as* the system author (for example: reject if `thisSession.author === 'a.etherpad-system'`, and/or reject only `'+'` ops whose `opAuthorId` is the system author).
2. Do **not** reject `'='` ops solely because they restore the system author; those are needed to undo/redo attribution changes on existing system-authored content.
3. Add/adjust a regression test that creates system-authored content (via `pad.setText(..., '')` or `spliceText(..., '')`), then performs clear-authorship + undo via USER_CHANGES and asserts it is accepted.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

3. ~~Wrong author in roSocket~~ ✓ Resolved 🐞 Bug ≡ Correctness

Description

The USER_CHANGES test helper always builds the apool using authorId from the first socket’s
CLIENT_VARS, but roSocket can have a different author identity because common.connect() only
forwards cookies from the *current* HTTP response and /p/:pad does not re-issue the author-token
cookie if it already exists. With the new server validation requiring an author attribute on '+'
ops, sendUserChanges(roSocket, ...) can now be rejected as “changes as another author”, breaking
the permitOnce test.

Code

src/tests/backend/specs/messages.ts[R172-179]

+    // Insert ops MUST carry the author attribute (server-side validation
+    // added in this PR — see PadMessageHandler.ts). Helper assembles the
+    // wire form `*0+N` plus the matching apool entry for the session author.
+    const authorPool = () =>
+        ({numToAttrib: {0: ['author', authorId]}, nextNum: 1});
  const sendUserChanges =
-        async (socket:any, cs:any) => await common.sendUserChanges(socket, {baseRev: rev, changeset: cs});
+        async (socket:any, cs:any, apool: any = authorPool()) =>
+            await common.sendUserChanges(socket, {baseRev: rev, changeset: cs, apool});

Evidence
The test code captures only one authorId but uses it to build the default apool for all
USER_CHANGES. roSocket’s handshake response is not used to set its authorId. Because
common.connect() forwards only Set-Cookie-derived cookies, and /p/:pad won’t re-issue the author
token cookie if it already exists, roSocket can connect without the author-token cookie and will
fall back to a new random CLIENT_READY token, giving it a different authorId than authorId used by
authorPool()—triggering server-side author validation failures for roSocket inserts.
src/tests/backend/specs/messages.ts[26-46]
src/tests/backend/specs/messages.ts[171-179]
src/tests/backend/specs/messages.ts[255-270]
src/tests/backend/common.ts[189-206]
src/tests/backend/common.ts[228-236]
src/node/utils/ensureAuthorTokenCookie.ts[17-35]
src/node/hooks/express/specialpages.ts[195-218]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`src/tests/backend/specs/messages.ts` defaults the USER_CHANGES `apool` to the main socket’s `authorId` for *all* sockets. But `roSocket` can end up with a different authorId (no token cookie forwarded on its socket.io connection), so `sendUserChanges(roSocket, '...*0+...')` may be rejected by the new server-side validation.
## Issue Context
- `authorId` is captured only from the first socket’s CLIENT_VARS; the roSocket CLIENT_VARS is ignored.
- `common.connect(res)` forwards only cookies present in the provided HTTP response’s `set-cookie` headers.
- `/p/:pad` uses `ensureAuthorTokenCookie()`, which does not set a Set-Cookie header if the request already has a valid token cookie—so the second GET (to the read-only URL) often won’t provide the token cookie to `common.connect()`.
## Fix Focus Areas
- src/tests/backend/specs/messages.ts[26-46]
- src/tests/backend/specs/messages.ts[171-179]
- src/tests/backend/specs/messages.ts[255-270]
## Suggested fix
1. Capture `roAuthorId` from `common.handshake(roSocket, roPadId)` (read CLIENT_VARS).
2. Change `authorPool()` to accept an explicit authorId (or create two pools: one for `socket`, one for `roSocket`).
3. Ensure `sendUserChanges(roSocket, ...)` uses `roAuthorId`’s pool (or explicitly pass an `apool` argument in the roSocket test).

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

4. HTTP API author default undocumented 📘 Rule violation ⚙ Maintainability ⭐ New

Description

Pad.spliceText() now substitutes a stable system author (a.etherpad-system) when an internal
caller performs an insert with an empty authorId, changing how setText()/appendText() behave
when authorId is omitted. The HTTP API documentation describes authorId as optional but does not
document the new default attribution behavior.

Code

src/node/db/Pad.ts[R429-441]

+    // An unattributed insert (empty authorId + non-empty ins) would produce
+    // an AText where `text` and `attribs` disagree on length — clients fail
+    // setDocAText reconciliation on load. Backward-compat fix: if the caller
+    // didn't provide an authorId, attribute the insert to a stable system
+    // author. ep_post_data and other plugins that want named attribution
+    // should still pass an explicit authorId.
+    const effectiveAuthorId =
+        (ins.length > 0 && !authorId) ? Pad.SYSTEM_AUTHOR_ID : authorId;
+    const attribs = effectiveAuthorId
+        ? [['author', effectiveAuthorId] as [string, string]]
+        : undefined;
    const changeset = makeSplice(orig, start, ndel, ins, attribs, this.pool);
-    await this.appendRevision(changeset, authorId);
+    await this.appendRevision(changeset, effectiveAuthorId);

Evidence

The code change introduces a new default author attribution (Pad.SYSTEM_AUTHOR_ID) for inserts
when authorId is empty, which affects documented HTTP API methods that accept optional authorId.
The HTTP API docs show authorId as optional but do not describe the default attribution behavior,
so the documentation is no longer aligned with the API behavior.

src/node/db/Pad.ts[429-441]
doc/api/http_api.md[373-399]
Best Practice: Repository guidelines

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
Behavior of HTTP API calls that omit `authorId` has changed: inserts are now attributed to a stable system author rather than being unattributed. This is an API behavior change that should be documented.

## Issue Context
The docs currently list `setText(padID, text, [authorId])` and `appendText(padID, text, [authorId])` but do not explain what happens when `authorId` is omitted.

## Fix Focus Areas
- src/node/db/Pad.ts[429-441]
- doc/api/http_api.md[373-399]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

5. Extra full-text projection 🐞 Bug ➹ Performance ⭐ New

Description

The new trailing-newline invariant check applies the rebased changeset to the entire pad text via
applyToText(), then appendRevision() applies the same changeset again to build the new AText. This
adds an extra full-document string build on every USER_CHANGES, increasing CPU and allocations on a
hot path.

Code

src/node/handler/PadMessageHandler.ts[R958-964]

+    const projectedText = applyToText(rebasedChangeset, prevText);
+    if (!projectedText.endsWith('\n')) {
+      throw new Error(
+          `Rejected USER_CHANGES whose application would leave the pad ` +
+          `without a trailing '\\n' (length ${projectedText.length}). ` +
+          `Every USER_CHANGES must preserve the "doc ends with \\n" invariant.`);
+    }

Evidence

The new handler calls applyToText(), which constructs a full new string by iterating ops and
appending remaining text; appendRevision() then calls applyToAText() to build the new AText, which
includes the new text string. These are two separate full applications/allocation paths for each
edit.

src/node/handler/PadMessageHandler.ts[949-967]
src/static/js/Changeset.ts[397-440]
src/node/db/Pad.ts[228-235]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`handleUserChanges()` computes `projectedText` by calling `applyToText(rebasedChangeset, prevText)` to check the trailing `\n` invariant, and then calls `pad.appendRevision(rebasedChangeset, ...)`, which applies the changeset again (to AText) to produce/stash the new document state.

### Issue Context
This happens for every accepted USER_CHANGES and adds an O(document length) extra pass and allocation, independent of whether the change touches the end of the document.

### Fix Focus Areas
- src/node/handler/PadMessageHandler.ts[949-967]
- src/node/db/Pad.ts[228-276]
- src/static/js/Changeset.ts[397-440]

### Proposed fix approach
1. Avoid building `projectedText` just to check the last character:
  * Option A: Apply once in `appendRevision()` (it already computes `newAText.text`) and validate `newAText.text.endsWith('\n')` **before** committing/copying `newAText` into the pad.
  * Option B: Add a helper that returns the would-be new text/AText so validation and commit can reuse the same computed result (no second apply).
2. Keep the current behavior (reject) but ensure the validation does not require constructing a full new string twice per edit.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

6. Unwarned rejection of inserts 📘 Rule violation ⚙ Maintainability

Description

The PR introduces an incompatible behavior change by rejecting previously accepted USER_CHANGES +
ops that lack an author attribute, but it does not emit a WARN-level deprecation message before
enforcing the new behavior. Older/third-party clients will be disconnected without advance warning
or a deprecation period.

Code

src/node/handler/PadMessageHandler.ts[R886-889]

+      if (op.opcode === '+' && !opAuthorId) {
+        throw new Error(`Author ${thisSession.author} submitted an insert without an ` +
+                        `author attribute in changeset ${changeset}`);
+      }

Evidence

PR Compliance ID 3 requires a WARN-level deprecation warning before incompatible behavior changes.
The new code immediately throws an Error for unattributed + ops, with no WARN log indicating
deprecation or future enforcement timeline.

src/node/handler/PadMessageHandler.ts[886-889]
Best Practice: Repository guidelines

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
Server behavior for USER_CHANGES insert ops has changed incompatibly (unattributed inserts now throw), but there is no WARN-level deprecation message to give existing clients time to adapt.
## Issue Context
A new validation rejects `op.opcode === '+'` when `opAuthorId` is falsy by throwing an Error. This can break older/third-party clients that previously worked, and the compliance checklist requires a WARN-level deprecation notice before incompatible removals/behavior changes.
## Fix Focus Areas
- src/node/handler/PadMessageHandler.ts[886-889]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

Previous review results

Review updated until commit 1751900

Results up to commit N/A

🐞 Bugs (0) 📘 Rule violations (1) 📎 Requirement gaps (0)

1. ~~Wrong author in roSocket~~ ✓ Resolved 🐞 Bug ≡ Correctness

Description

The USER_CHANGES test helper always builds the apool using authorId from the first socket’s
CLIENT_VARS, but roSocket can have a different author identity because common.connect() only
forwards cookies from the *current* HTTP response and /p/:pad does not re-issue the author-token
cookie if it already exists. With the new server validation requiring an author attribute on '+'
ops, sendUserChanges(roSocket, ...) can now be rejected as “changes as another author”, breaking
the permitOnce test.

Code

src/tests/backend/specs/messages.ts[R172-179]

+    // Insert ops MUST carry the author attribute (server-side validation
+    // added in this PR — see PadMessageHandler.ts). Helper assembles the
+    // wire form `*0+N` plus the matching apool entry for the session author.
+    const authorPool = () =>
+        ({numToAttrib: {0: ['author', authorId]}, nextNum: 1});
   const sendUserChanges =
-        async (socket:any, cs:any) => await common.sendUserChanges(socket, {baseRev: rev, changeset: cs});
+        async (socket:any, cs:any, apool: any = authorPool()) =>
+            await common.sendUserChanges(socket, {baseRev: rev, changeset: cs, apool});

Evidence
The test code captures only one authorId but uses it to build the default apool for all
USER_CHANGES. roSocket’s handshake response is not used to set its authorId. Because
common.connect() forwards only Set-Cookie-derived cookies, and /p/:pad won’t re-issue the author
token cookie if it already exists, roSocket can connect without the author-token cookie and will
fall back to a new random CLIENT_READY token, giving it a different authorId than authorId used by
authorPool()—triggering server-side author validation failures for roSocket inserts.
src/tests/backend/specs/messages.ts[26-46]
src/tests/backend/specs/messages.ts[171-179]
src/tests/backend/specs/messages.ts[255-270]
src/tests/backend/common.ts[189-206]
src/tests/backend/common.ts[228-236]
src/node/utils/ensureAuthorTokenCookie.ts[17-35]
src/node/hooks/express/specialpages.ts[195-218]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`src/tests/backend/specs/messages.ts` defaults the USER_CHANGES `apool` to the main socket’s `authorId` for *all* sockets. But `roSocket` can end up with a different authorId (no token cookie forwarded on its socket.io connection), so `sendUserChanges(roSocket, '...*0+...')` may be rejected by the new server-side validation.
## Issue Context
- `authorId` is captured only from the first socket’s CLIENT_VARS; the roSocket CLIENT_VARS is ignored.
- `common.connect(res)` forwards only cookies present in the provided HTTP response’s `set-cookie` headers.
- `/p/:pad` uses `ensureAuthorTokenCookie()`, which does not set a Set-Cookie header if the request already has a valid token cookie—so the second GET (to the read-only URL) often won’t provide the token cookie to `common.connect()`.
## Fix Focus Areas
- src/tests/backend/specs/messages.ts[26-46]
- src/tests/backend/specs/messages.ts[171-179]
- src/tests/backend/specs/messages.ts[255-270]
## Suggested fix
1. Capture `roAuthorId` from `common.handshake(roSocket, roPadId)` (read CLIENT_VARS).
2. Change `authorPool()` to accept an explicit authorId (or create two pools: one for `socket`, one for `roSocket`).
3. Ensure `sendUserChanges(roSocket, ...)` uses `roAuthorId`’s pool (or explicitly pass an `apool` argument in the roSocket test).

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

2. Unwarned rejection of inserts 📘 Rule violation ⚙ Maintainability

Description

The PR introduces an incompatible behavior change by rejecting previously accepted USER_CHANGES +
ops that lack an author attribute, but it does not emit a WARN-level deprecation message before
enforcing the new behavior. Older/third-party clients will be disconnected without advance warning
or a deprecation period.

Code

src/node/handler/PadMessageHandler.ts[R886-889]

+      if (op.opcode === '+' && !opAuthorId) {
+        throw new Error(`Author ${thisSession.author} submitted an insert without an ` +
+                        `author attribute in changeset ${changeset}`);
+      }

Evidence

PR Compliance ID 3 requires a WARN-level deprecation warning before incompatible behavior changes.
The new code immediately throws an Error for unattributed + ops, with no WARN log indicating
deprecation or future enforcement timeline.

src/node/handler/PadMessageHandler.ts[886-889]
Best Practice: Repository guidelines

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
Server behavior for USER_CHANGES insert ops has changed incompatibly (unattributed inserts now throw), but there is no WARN-level deprecation message to give existing clients time to adapt.
## Issue Context
A new validation rejects `op.opcode === '+'` when `opAuthorId` is falsy by throwing an Error. This can break older/third-party clients that previously worked, and the compliance checklist requires a WARN-level deprecation notice before incompatible removals/behavior changes.
## Fix Focus Areas
- src/node/handler/PadMessageHandler.ts[886-889]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

The first commit closed the socket.io USER_CHANGES hole. This commit closes the parallel path through Pad.spliceText (used by API.setText, API.appendText, the import flow, and plugins like ep_post_data) where an unattributed insert would otherwise produce a malformed AText. Approach: instead of REJECTING (which would break ep_post_data and many existing tests that call setText/appendText without an authorId), substitute a stable system author when none is provided. The resulting changeset is properly attributed, the AText stays well-formed, and existing callers continue to work unchanged. Plugins that want named author attribution should still pass an explicit authorId (e.g., one allocated via authorManager.createAuthor). Pad.SYSTEM_AUTHOR_ID = 'a.etherpad-system' — a stable identifier that appears in the pad's attribute pool when internal callers (HTTP API, plugins, server-side imports) write text without naming an author. The existing 'attribute changes by another author' protections still apply to socket.io USER_CHANGES paths — a remote client can't impersonate the system author for inserts (their session author check fires first). Test: - Pad.ts spec adds 'spliceText with empty authorId attributes to the system author' — verifies pad text lands AND the pool contains the system-author binding. Existing tests that pass an authorId are unaffected.

JohnMcLear · 2026-05-16T06:28:57Z

Pushed a second commit (631d8c5) that closes the parallel hole through Pad.spliceText — which is the path used by API.setText / API.appendText, the import flow, and plugins like ep_post_data.

The first commit only caught socket.io USER_CHANGES. But callers reaching Pad.spliceText with an empty authorId produce the same malformed AText: attribs field on + ops is undefined, so the resulting AText's text and attribs drift out of sync.

I went with auto-substitute a stable system author rather than throw, because rejecting would break:

~10 existing tests that call pad.setText(...) / pad.appendText(...) without an authorId
ep_post_data and any other plugins that call API.appendText(padId, content) (the second arg defaults to '')

Pad.SYSTEM_AUTHOR_ID = 'a.etherpad-system' is now used when internal callers don't name an author. The AText stays well-formed. Plugins/callers that want named attribution should still pass an explicit authorId (allocate via authorManager.createAuthor).

New test 'spliceText with empty authorId attributes to the system author' verifies the pool gets the binding.

Happy to split this into a separate PR if you'd prefer to land the socket.io check first.

JohnMcLear · 2026-05-16T07:21:27Z

I need to also check for

Uncaught (in promise) Error: doRepApplyChangeset length mismatch: 53/65

Etherpad's pad text always ends with '\n'. _handleUserChanges previously appended a separate `nlChangeset` correction revision whenever the applied USER_CHANGES left the pad without a trailing '\n'. The stored pad ended up well-formed, but the FIRST NEW_CHANGES broadcast (the malformed user revision itself) reached browsers BEFORE the correction did, and applyToAttribution's MergingOpAssembler aborts with "line assembler not finished" on a non-'\n'-terminated doc — the watching browser session then dropped the changeset and any subsequent edits silently no-op'd until the user reloaded. Replace the silent auto-correction with an explicit reject. Compute `applyToText(rebasedChangeset, prevText)` before appendRevision; if the result doesn't end with '\n', throw -> badChangeset disconnect. Clients must emit USER_CHANGES whose application preserves the invariant — this matches what the JS web client already does and forces non-JS clients (etherpad-pad, third-party integrations) to surface their bugs in their own logs instead of stranding the trailing newline in pad revision history. Also fixes a latent retransmission-detection bug surfaced by this PR's author-attrib changes: moveOpsToNewPool renumbers `*N` references to whatever slot the pad pool assigns, which can differ from the wire form's slot. Comparing the raw client wire against the stored revision form (`changeset === c`) then misses legitimate retransmissions and the same edit gets duplicated. Snapshot the post-pool-mapping form (`canonicalCs`) and compare that against `c` instead. Backend test additions: - 'changeset that would strand the trailing \\n is rejected' covers the new rejection path with wire `Z:6>1|1=6*0+1$X` against `hello\n`. - handleMessageSecurity test now captures roSocket's own authorId and uses it in the apool sent through roSocket, because the prior PR commit made `*0` referencing the wrong author a hard reject. All 1130 backend tests pass. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

JohnMcLear · 2026-05-16T08:28:50Z

Pushed 805766c extending the PR with a second hardening pass, motivated by a related browser-disconnect vector surfaced by a non-JS client (etherpad-pad terminal editor):

Problem (related but distinct from the author-attrib vector this PR opens with):
Etherpad's pad text always ends with \n. When a client sends USER_CHANGES whose application strands the trailing \n mid-document, _handleUserChanges used to silently appended a second nlChangeset correction revision to fix the stored pad. But the FIRST NEW_CHANGES broadcast (the malformed user revision itself) reaches browsers BEFORE the correction does, and applyToAttribution's MergingOpAssembler asserts line assembler not finished on a non-\n-terminated doc — the watching browser session drops the changeset and any later edits silently no-op until reload.

Repro from a non-JS client (this is what surfaced it):

Wire Z:1h>6|g=1h*0+6$hello from a 53-char doc ending in \n — Keep covers all 53 chars, Insert lands at position 53, past the existing \n.
Server applies, sees no trailing \n, appends correction.
Browser receives the bad rev first and crashes.

Fix in 805766c:

Replace the silent auto-correction with an explicit reject. Compute applyToText(rebasedChangeset, prevText) before appendRevision; if the result doesn't end with \n, throw → badChangeset disconnect. Forces buggy clients to surface their issues in their own logs instead of stranding \n in pad history.
Side-fix for a latent retransmission-detection bug uncovered by this PR's author-attrib changes: moveOpsToNewPool renumbers *N references to whatever slot the pad pool assigns, which can differ from the wire form's slot. Comparing raw client changeset === c then misses retransmissions and duplicates the edit. Snapshot the post-pool-mapping form and compare against c instead.
Backend test 'changeset that would strand the trailing \\n is rejected' locks in the new behaviour with wire Z:6>1|1=6*0+1$X against hello\n.
Updated handleMessageSecurity test to capture roSocket's own authorId for the apool — the PR's prior commit made *0 referencing the wrong author a hard reject, which the test was tripping over.

All 1130 backend tests pass locally on Node 25.

Happy to split this into a separate PR if reviewers prefer that — but it's the same shape of bug (malformed-but-passes-checkRep USER_CHANGES → server accepts → browser chokes mid-stream) and the same defensive posture, so leaving it stacked here unless asked to separate.

`applyEdit` previously left `attribs` undefined when no `edit.authorId` was supplied, producing a changeset whose `+` ops carried no author attribute. Etherpad then stored a pad.atext where `text` and `attribs` disagreed on length, and every later client load hit `mismatch error setting raw text in setDocAText` in ace2_inner — the pad became effectively unloadable. Mirror the `SYSTEM_AUTHOR_ID` pattern that ether/etherpad#7773 adopted in core's `Pad.spliceText`: when the caller doesn't supply an authorId, attribute the insert to `a.ep-ai-chat-system` so the changeset is always well-formed. The system author is *not* announced in the user list — `announceAiAuthor` stays gated on the caller-supplied id. Adds a regression test that exercises the missing-authorId path and asserts (a) every run in pad.atext.attribs carries an author attribute and (b) `attribs` total chars equals `text.length`. Refs ether/etherpad#7773

4.0.3 sends author-attributed inserts and preserves the trailing newline, complying with this PR's tightened USER_CHANGES validation. The rate-limit CI workflow drives the test pad via this client, so without the bump the new server-side rejects fire on the very first \`pad.append()\` and the rate-limit disconnect never gets a chance to arrive — testlimits.sh exits 0 instead of 1 and the rate-limit job fails with "ratelimit was not triggered when sending every 99 ms". Refs ether/etherpad-cli-client#131

SamTV12345 · 2026-05-16T15:56:33Z

+   * a fixed system author keeps the AText well-formed without requiring
+   * every plugin to allocate its own author up-front.
+   */
+  static readonly SYSTEM_AUTHOR_ID = 'a.etherpad-system';


Can that be abused? If I name myself a.etherpad-system as author. Can this have consequences?

patch incoming

The session-author equality check already rejects wire `*N` that names a different real user, but `a.etherpad-system` is server- internal — it's only used when spliceText / setText is called with an empty authorId from HTTP API or plugin paths. A wire op that names it is either a confused client or an attempt to launder edits through a reserved attribution slot. Refuse. Backend test 'insert claiming the reserved system author is rejected' locks in the new behavior with wire `Z:1>5*0+5$hello` plus an apool that maps slot 0 to `a.etherpad-system`. All 1131 backend tests pass. Inline literal `'a.etherpad-system'` rather than importing the constant from `Pad.SYSTEM_AUTHOR_ID` — `require('../db/Pad')` at PadMessageHandler module scope returned a partially-initialized class via the padManager circular path, leaving the static-field access undefined at runtime. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

qodo-code-review · 2026-05-16T16:23:40Z

Qodo reviews are paused for this user.

Troubleshooting steps vary by plan Learn more →

On a Teams plan?
Reviews resume once this user has a paid seat and their Git account is linked in Qodo.
Link Git account →

Using GitHub Enterprise Server, GitLab Self-Managed, or Bitbucket Data Center?
These require an Enterprise plan - Contact us
Contact us →

qodo-free-for-open-source-projects · 2026-05-16T16:23:44Z

Persistent review updated to latest commit 1751900

qodo-free-for-open-source-projects · 2026-05-16T16:29:11Z

+    // Defensive: reject any rebased changeset whose application would leave
+    // the pad text not ending with '\n'. Previously the server silently
+    // appended a separate `nlChangeset` correction revision; that worked
+    // for the stored pad but the FIRST broadcast (the malformed user
+    // revision) reached browsers BEFORE the correction did, and the
+    // browser's line assembler asserts "line assembler not finished" on
+    // a doc that doesn't end with '\n', taking the session out. Refuse to
+    // accept such changesets — clients must always preserve the
+    // trailing-newline invariant.
+    const projectedText = applyToText(rebasedChangeset, prevText);
+    if (!projectedText.endsWith('\n')) {
+      throw new Error(
+          `Rejected USER_CHANGES whose application would leave the pad ` +
+          `without a trailing '\\n' (length ${projectedText.length}). ` +
+          `Every USER_CHANGES must preserve the "doc ends with \\n" invariant.`);
+    }


1. Trailing-newline auto-fix removed 📘 Rule violation ☼ Reliability

handleUserChanges now rejects any USER_CHANGES changeset that would leave the pad text not ending with \n, instead of tolerating and correcting it. This removes a previously supported behavior without a WARN-level deprecation period, which can break older/third-party clients that rely on server-side correction.

Agent Prompt

## Issue description The server now hard-rejects USER_CHANGES that would violate the trailing newline invariant, which is a breaking behavior change and effectively removes the prior tolerance/correction behavior without a deprecation path. ## Issue Context Per compliance, feature/behavior removals must be deprecated first with a WARN log, then removed/enforced in a later version. Current code throws an error immediately for these changesets. ## Fix Focus Areas - src/node/handler/PadMessageHandler.ts[949-964]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

qodo-free-for-open-source-projects · 2026-05-16T16:29:11Z

+      if (opAuthorId === 'a.etherpad-system') {
+        throw new Error(`Author ${thisSession.author} attempted to submit changes as the ` +
+                        `reserved system author ${opAuthorId} in changeset ${changeset}`);
+      }


2. System-author undo rejected 🐞 Bug ≡ Correctness

handleUserChanges() rejects any op whose resolved authorId is 'a.etherpad-system', including '=' ops that restore existing authorship. Pads can now legitimately contain system-attributed text via server-side spliceText()/setText() without an authorId, so undoing “clear authorship colors” can disconnect clients with badChangeset when the undo attempts to restore the system author.

Agent Prompt

### Issue description `handleUserChanges()` rejects any op whose `author` attribute resolves to the reserved system author (`a.etherpad-system`). This breaks legitimate client flows (notably undo of “clear authorship colors”) on pads that contain system-authored text, because undo restores author attributes via `'='` ops. ### Issue Context * Server-side internal callers now attribute inserts to `Pad.SYSTEM_AUTHOR_ID` when no `authorId` is provided, so pads can contain system-authored segments. * The editor’s “clearauthorship” command sets author to `''` across the doc; undo restores prior author attributes (including non-self authors) via `'='` ops. * The new unconditional rejection of `a.etherpad-system` blocks that restoration. ### Fix Focus Areas - src/node/handler/PadMessageHandler.ts[878-908] - src/node/db/Pad.ts[416-442] - src/static/js/ace2_inner.ts[610-623] - src/tests/backend/specs/undo_clear_authorship.ts[168-195] ### Proposed fix approach 1. Narrow the reserved-system-author rejection so it only blocks actual attempts to *write as* the system author (for example: reject if `thisSession.author === 'a.etherpad-system'`, and/or reject only `'+'` ops whose `opAuthorId` is the system author). 2. Do **not** reject `'='` ops solely because they restore the system author; those are needed to undo/redo attribution changes on existing system-authored content. 3. Add/adjust a regression test that creates system-authored content (via `pad.setText(..., '')` or `spliceText(..., '')`), then performs clear-authorship + undo via USER_CHANGES and asserts it is accepted.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

Etherpad's USER_CHANGES handler now (post-2.7.x, see ether/etherpad#7773) rejects two kinds of malformed inserts: 1. `+` ops with no `author` attribute — they grow pad.atext.text without contributing matching markers to pad.atext.attribs, leaving the two iterables out of sync and breaking setDocAText reconciliation on every later client load. 2. USER_CHANGES whose application would leave the pad text not ending with '\n' — the browser's line assembler asserts on docs that don't end with '\n' and the session dies. Both were silently produced by `ee.append()`: it called `makeSplice(text, text.length, 0, ins)` (insert at very end, no attribs, no pool). When the host pad starts as `'\n'`, appending `'1'` produced `'\n1'` (no trailing newline) and the changeset's insert carried no author attribute. Fix: * Capture `userId` from CLIENT_VARS and use it as the author attribute on subsequent inserts. * Splice at `text.length - 1` so the insert lands before the trailing '\n' and the invariant holds. * Build the changeset against `padState.apool`, then `moveOpsToNewPool` into a fresh wire apool so the server can resolve the `*N` slot numbers we send. Backward-compatible: pre-hardening Etherpad servers accept the new shape too (it's the same shape the standard JS web client sends). Refs ether/etherpad#7773

qodo-free-for-open-source-projects Bot added the Compliance violation label May 16, 2026

qodo-free-for-open-source-projects Bot reviewed May 16, 2026

View reviewed changes

Comment thread src/tests/backend/specs/messages.ts

JohnMcLear self-assigned this May 16, 2026

JohnMcLear marked this pull request as draft May 16, 2026 06:47

This was referenced May 16, 2026

fix: send author-attributed inserts and preserve trailing newline ether/etherpad-cli-client#131

Merged

fix: fall back to system author when edit.authorId is missing ether/ep_ai_chat#29

Merged

SamTV12345 reviewed May 16, 2026

View reviewed changes

SamTV12345 approved these changes May 16, 2026

View reviewed changes

JohnMcLear marked this pull request as ready for review May 16, 2026 16:23

JohnMcLear merged commit 294158e into ether:develop May 16, 2026
19 checks passed

qodo-free-for-open-source-projects Bot reviewed May 16, 2026

View reviewed changes

Uh oh!

Conversation

JohnMcLear commented May 16, 2026

Summary

Why now

Changes

Test plan

Uh oh!

qodo-code-review Bot commented May 16, 2026

Qodo reviews are paused for this user.

Uh oh!

qodo-free-for-open-source-projects Bot commented May 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review Summary by Qodo

(Agentic_describe updated until commit 1751900)

Walkthroughs

File Changes

Uh oh!

qodo-free-for-open-source-projects Bot commented May 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Code Review by Qodo

Previous review results

Uh oh!

Uh oh!

JohnMcLear commented May 16, 2026

Uh oh!

JohnMcLear commented May 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

JohnMcLear commented May 16, 2026

Uh oh!

SamTV12345 May 16, 2026

Choose a reason for hiding this comment

Uh oh!

JohnMcLear May 16, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

qodo-code-review Bot commented May 16, 2026

Qodo reviews are paused for this user.

Uh oh!

qodo-free-for-open-source-projects Bot commented May 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

qodo-free-for-open-source-projects Bot May 16, 2026

Choose a reason for hiding this comment

Uh oh!

qodo-free-for-open-source-projects Bot May 16, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

qodo-free-for-open-source-projects Bot commented May 16, 2026 •

edited

Loading

(Agentic_describe updated until commit `1751900`)

qodo-free-for-open-source-projects Bot commented May 16, 2026 •

edited

Loading

JohnMcLear commented May 16, 2026 •

edited

Loading

qodo-free-for-open-source-projects Bot commented May 16, 2026 •

edited

Loading