fix: address lan17 review — Decimal money, scoped budgets, store API
Addresses all feedback from lan17's review:
- Float → Decimal: All money amounts use Decimal for precision. Config,
store protocol, evaluator, and transaction_policy all updated.
Decimal(str(raw)) for safe conversion, float() only in metadata output.
- Scoped budget semantics: Documented tuple-based scope behavior.
Channel+agent_id+session_id form a composite scope key. Independent
per-dimension budgets documented as requiring separate get_spend calls.
- Store API: get_spend() now accepts start/end range instead of just
since_timestamp. Backward compatible (end defaults to None).
- Fixed always-passing test: Removed 'or True' from context override
test. Now asserts concrete store state per scope.
- Added lan17's exact test case: 90 USDC channel A, then 20 USDC
channel B with channel_max_per_period=100. Second tx allowed.
- README: Updated custom store example with scope param and Decimal
return. Fixed error handling docs. Added Known Limitations section
(race condition, tuple scoping, package wiring).
- __init__.py: selector.path '*' → 'input' with context merge note.
67/67 tests passing.
Signed-off-by: up2itnow0822 <up2itnow0822@users.noreply.github.com>
Both evaluators support two selector configurations:

-- **`selector.path: "input"`** (recommended) — The evaluator receives `step.input` directly, which should be the transaction dict.
+- **`selector.path: "input"`** (recommended) — The evaluator receives `step.input` directly, which should be the transaction dict. Context fields (`channel`, `agent_id`, `session_id`) are merged from `step.context` into the transaction dict by the engine before evaluation.
 - **`selector.path: "*"`** — The evaluator receives the full Step object. It automatically extracts `step.input` for transaction fields and `step.context` for channel/agent/session metadata.

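To make the `selector.path: "input"` behaviour concrete, here is a minimal sketch of the merge the added line describes. `Step` is a hypothetical stand-in dataclass rather than Agent Control's real class, and `merge_context` is an illustrative name. Letting keys already present in `step.input` win over `step.context` is an assumption; the diff does not state the precedence.

```python
from dataclasses import dataclass, field
from typing import Any

# Context fields named in the README change.
CONTEXT_FIELDS = ("channel", "agent_id", "session_id")


@dataclass
class Step:
    """Hypothetical stand-in for the engine's Step object."""
    input: dict[str, Any]
    context: dict[str, Any] = field(default_factory=dict)


def merge_context(step: Step) -> dict[str, Any]:
    """Copy context fields into a new transaction dict.

    Assumption: values already in step.input take precedence.
    """
    tx = dict(step.input)  # copy so the original input is not mutated
    for key in CONTEXT_FIELDS:
        if key in step.context and key not in tx:
            tx[key] = step.context[key]
    return tx


step = Step(
    input={"amount": "75.00", "currency": "USDC"},
    context={"channel": "experimental", "agent_id": "bot-1"},
)
merged = merge_context(step)
```

With this shape the evaluator sees one flat dict regardless of which selector path is configured.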
 ## Input Data Schema
@@ -82,7 +85,7 @@ The transaction dict (from `step.input`) should contain:
-When using `selector.path: "*"`, the evaluator merges `step.context` fields into the transaction data automatically. When using `selector.path: "input"`, context fields must be included directly in `step.input`.
+When using `selector.path: "input"`, context fields (channel, agent_id, session_id) are merged from `step.context` into the transaction dict by the engine. When using `selector.path: "*"`, the evaluator merges `step.context` fields itself.

 **Option B: Inline in the transaction dict** (simpler, for direct SDK use)

 ```python
 result = await evaluator.evaluate({
-    "amount": 75.0,
+    "amount": "75.00",
     "currency": "USDC",
     "recipient": "0xABC",
     "channel": "experimental",
-    "channel_max_per_transaction": 50.0,
-    "channel_max_per_period": 200.0,
+    "channel_max_per_transaction": "50.00",
+    "channel_max_per_period": "200.00",
 })
 ```

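The move from float literals to string amounts follows from the `Decimal(str(raw))` conversion named in the commit message. The sketch below shows why going through `str()` matters. `to_money` is a hypothetical helper, not a function from this package, and returning `None` for unusable values is an assumption chosen to mirror the "malformed data is non-matching" convention.

```python
from decimal import Decimal, InvalidOperation


def to_money(raw):
    """Hypothetical helper: convert a raw amount to Decimal via str().

    Returns None for values that cannot represent a finite amount.
    """
    if isinstance(raw, bool):  # bool is an int subclass; reject it explicitly
        return None
    try:
        value = Decimal(str(raw))
    except (InvalidOperation, ValueError):
        return None
    return value if value.is_finite() else None


# Going through str() keeps the short decimal form of a float.
via_str = to_money(0.1) + to_money(0.2)

# Constructing from the float directly captures its binary expansion.
direct = Decimal(0.1)
```

Here `via_str` equals `Decimal("0.3")`, whereas `direct` is not equal to `Decimal("0.1")`, which is the kind of drift that can push a running total across a limit.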
@@ -130,6 +133,7 @@ Spend budgets are **scoped by context** — spend in channel A does not count ag
 The `SpendStore` protocol requires two methods. Implement them for your backend:

 ```python
+from decimal import Decimal
 from agent_control_evaluator_financial_governance.spend_limit import (
Malformed or incomplete runtime payloads (missing `amount`, missing `currency`, non-numeric values, etc.) return `matched=False, error=None` — they are treated as non-matching transactions, not evaluator errors. The `error` field is reserved for evaluator infrastructure failures (crashes, timeouts, missing dependencies).
+
 ## Running Tests

 ```bash
@@ -170,10 +189,26 @@ pytest tests/ -v
 ## Design Decisions

-1. **Decoupled from data source** — The `SpendStore` protocol means no new tables in core Agent Control. Bring your own persistence.
-2. **Context-aware limits** — Override keys in the evaluate data dict allow per-channel, per-agent, or per-session limits without multiple evaluator instances.
-3. **Python SDK compatible** — Uses the standard evaluator interface; works with both the server and the Python SDK evaluation engine.
-4. **Fail-open on errors** — Missing or malformed data returns `matched=False` with an `error` field, following Agent Control conventions.
+1. **Decimal for money** — All monetary amounts use `Decimal` to avoid float precision errors in financial calculations.
+2. **Decoupled from data source** — The `SpendStore` protocol means no new tables in core Agent Control. Bring your own persistence.
+3. **Context-aware limits** — Override keys in the evaluate data dict allow per-channel, per-agent, or per-session limits without multiple evaluator instances.
+4. **Python SDK compatible** — Uses the standard evaluator interface; works with both the server and the Python SDK evaluation engine.
+5. **Fail-open on malformed data** — Missing or invalid fields return `matched=False` with `error=None`, following Agent Control conventions.
+
+## Known Limitations
+
+### Race Condition (read-then-write is not atomic)
+The spend-limit evaluator reads current period spend and then writes a new record as two separate operations. Under concurrent load this can allow transactions to slip through just above the budget. For hard enforcement use a `SpendStore` implementation that provides atomic `check_and_record` semantics (e.g., a Redis `MULTI`/`EXEC` block or a PostgreSQL `SELECT ... FOR UPDATE`). The `InMemorySpendStore` is thread-safe within a single process but does not provide atomic check-and-record.
+
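For a single process, the atomic check-and-record idea can be sketched by holding one lock across both the budget check and the write. `AtomicInMemoryStore` and its `check_and_record(amount, limit)` signature are hypothetical and are not the package's `InMemorySpendStore`. The sketch also ignores the rolling period window and scoping to stay short. A Redis or PostgreSQL backend would need the equivalent transactional guarantee on the server side.

```python
import threading
from decimal import Decimal


class AtomicInMemoryStore:
    """Hypothetical store: the check and the write share one lock."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._total = Decimal("0")

    def check_and_record(self, amount: Decimal, limit: Decimal) -> bool:
        """Record the amount only if it keeps the total within the limit."""
        with self._lock:
            if self._total + amount > limit:
                return False  # rejected: nothing is written
            self._total += amount
            return True

    @property
    def total(self) -> Decimal:
        with self._lock:
            return self._total


store = AtomicInMemoryStore()
results = []


def spend() -> None:
    # Ten concurrent 30-unit transactions against a 100-unit budget.
    results.append(store.check_and_record(Decimal("30"), Decimal("100")))


threads = [threading.Thread(target=spend) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Because no other thread can run between the check and the write, exactly three of the ten transactions are accepted and the total stops at 90.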
+### Tuple-Scoped Budgets
+When context fields (`channel`, `agent_id`, `session_id`) are all present, they form a **single composite scope key** — not independent per-dimension budgets. For example, a scope of `{"channel": "A", "agent_id": "bot-1"}` matches only records that have *both* `channel=="A"` AND `agent_id=="bot-1"`. To enforce truly independent per-channel and per-agent budgets you would need separate `get_spend()` calls with separate scope dicts.
+
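The difference between a composite scope and independent per-dimension budgets can be sketched with a toy ledger, reusing the amounts from the test case in the commit message (90 USDC on channel A, then 20 USDC on channel B, limit 100). `ScopedLedger`, `record`, and this one-argument `get_spend(scope)` are hypothetical simplifications; the real `SpendStore.get_spend()` also takes a time range, which is omitted here.

```python
from decimal import Decimal


class ScopedLedger:
    """Hypothetical in-memory ledger; each record carries its context."""

    def __init__(self) -> None:
        self._records = []  # list of (amount, context dict) pairs

    def record(self, amount, **context) -> None:
        self._records.append((Decimal(str(amount)), context))

    def get_spend(self, scope) -> Decimal:
        """Sum records matching every key in scope (an AND match)."""
        return sum(
            (
                amount
                for amount, ctx in self._records
                if all(ctx.get(key) == value for key, value in scope.items())
            ),
            Decimal("0"),
        )


ledger = ScopedLedger()
ledger.record("90", channel="A", agent_id="bot-1")
ledger.record("20", channel="B", agent_id="bot-1")

limit = Decimal("100")

# Composite scope: only records with both fields matching.
composite = ledger.get_spend({"channel": "A", "agent_id": "bot-1"})

# Independent budgets need one call per dimension.
per_channel_b = ledger.get_spend({"channel": "B"})
per_agent = ledger.get_spend({"agent_id": "bot-1"})

channel_b_ok = per_channel_b <= limit
agent_ok = per_agent <= limit
```

Channel B stays within its budget, while a separate per-agent query shows `bot-1` has spent 110 in total, which a composite key alone would never surface.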
+### Package Not Yet in Extras
+This package is not yet wired into the `agent-control-evaluators` extras install target. Install directly from the contrib path:
evaluators/contrib/financial-governance/src/agent_control_evaluator_financial_governance/spend_limit/config.py
15 additions & 12 deletions
@@ -2,6 +2,8 @@
 from __future__ import annotations

+from decimal import Decimal
+
 from pydantic import Field, field_validator

 from agent_control_evaluators import EvaluatorConfig
@@ -15,9 +17,10 @@ class SpendLimitConfig(EvaluatorConfig):
     Attributes:
         max_per_transaction: Hard cap on any single transaction amount. A
             transaction whose ``amount`` exceeds this value is blocked
-            regardless of accumulated period spend. Set to ``0.0`` to disable.
+            regardless of accumulated period spend. Set to ``Decimal("0")``
+            to disable.
         max_per_period: Maximum total spend allowed within the rolling
-            *period_seconds* window. Set to ``0.0`` to disable.
+            *period_seconds* window. Set to ``Decimal("0")`` to disable.
         period_seconds: Length of the rolling budget window in seconds.
             Defaults to ``86400`` (24 hours).
         currency: Currency symbol this policy applies to (e.g. ``"USDC"``).
@@ -27,27 +30,27 @@ class SpendLimitConfig(EvaluatorConfig):
     Example config dict::

         {
-            "max_per_transaction": 500.0,
-            "max_per_period": 5000.0,
+            "max_per_transaction": "500.00",
+            "max_per_period": "5000.00",
             "period_seconds": 86400,
             "currency": "USDC"
         }
     """

-    max_per_transaction: float = Field(
-        default=0.0,
-        ge=0.0,
+    max_per_transaction: Decimal = Field(
+        default=Decimal("0"),
+        ge=0,
         description=(
             "Per-transaction spend cap in *currency* units. "
-            "0.0 means no per-transaction limit."
+            "0 means no per-transaction limit."
         ),
     )
-    max_per_period: float = Field(
-        default=0.0,
-        ge=0.0,
+    max_per_period: Decimal = Field(
+        default=Decimal("0"),
+        ge=0,
         description=(
             "Maximum cumulative spend allowed in the rolling period window. "