Skip to content

Commit 78033d8

Browse files
committed
fix: address lan17 review — Decimal money, scoped budgets, store API
Addresses all feedback from lan17's review: - Float → Decimal: All money amounts use Decimal for precision. Config, store protocol, evaluator, and transaction_policy all updated. Decimal(str(raw)) for safe conversion, float() only in metadata output. - Scoped budget semantics: Documented tuple-based scope behavior. Channel+agent_id+session_id form a composite scope key. Independent per-dimension budgets documented as requiring separate get_spend calls. - Store API: get_spend() now accepts start/end range instead of just since_timestamp. Backward compatible (end defaults to None). - Fixed always-passing test: Removed 'or True' from context override test. Now asserts concrete store state per scope. - Added lan17's exact test case: 90 USDC channel A, then 20 USDC channel B with channel_max_per_period=100. Second tx allowed. - README: Updated custom store example with scope param and Decimal return. Fixed error handling docs. Added Known Limitations section (race condition, tuple scoping, package wiring). - __init__.py: selector.path '*' → 'input' with context merge note. 67/67 tests passing. Signed-off-by: up2itnow0822 <up2itnow0822@users.noreply.github.com>
1 parent c36c7e2 commit 78033d8

9 files changed

Lines changed: 392 additions & 211 deletions

File tree

evaluators/contrib/financial-governance/README.md

Lines changed: 62 additions & 27 deletions
Original file line numberDiff line numberDiff line change
@@ -26,11 +26,14 @@ Static policy checks with no state tracking. Enforces structural rules on indivi
2626
## Installation
2727

2828
```bash
29-
# From the repo root (development)
29+
# From the repo root (development) — install directly from contrib path
3030
cd evaluators/contrib/financial-governance
3131
pip install -e ".[dev]"
3232
```
3333

34+
> **Note:** This package is not yet wired into `agent-control-evaluators` extras.
35+
> Install directly from the contrib path as shown above.
36+
3437
## Configuration
3538

3639
### Spend Limit
@@ -41,12 +44,12 @@ controls:
4144
evaluator:
4245
type: financial_governance.spend_limit
4346
config:
44-
max_per_transaction: 100.0 # Max USDC per single payment
45-
max_per_period: 1000.0 # Rolling 24h budget
46-
period_seconds: 86400 # Budget window (default: 24 hours)
47-
currency: USDC # Currency to govern
47+
max_per_transaction: "100.00" # Max USDC per single payment (Decimal string)
48+
max_per_period: "1000.00" # Rolling 24h budget
49+
period_seconds: 86400 # Budget window (default: 24 hours)
50+
currency: USDC # Currency to govern
4851
selector:
49-
path: input # Extract step.input (transaction dict)
52+
path: input # Extract step.input (transaction dict)
5053
action: deny
5154
```
5255
@@ -61,8 +64,8 @@ controls:
6164
allowed_currencies: [USDC, USDT]
6265
blocked_recipients: ["0xDEAD..."]
6366
allowed_recipients: ["0xALICE...", "0xBOB..."]
64-
min_amount: 0.01
65-
max_amount: 5000.0
67+
min_amount: "0.01"
68+
max_amount: "5000.00"
6669
selector:
6770
path: input
6871
action: deny
@@ -72,7 +75,7 @@ controls:
7275
7376
Both evaluators support two selector configurations:
7477
75-
- **`selector.path: "input"`** (recommended) — The evaluator receives `step.input` directly, which should be the transaction dict.
78+
- **`selector.path: "input"`** (recommended) — The evaluator receives `step.input` directly, which should be the transaction dict. Context fields (`channel`, `agent_id`, `session_id`) are merged from `step.context` into the transaction dict by the engine before evaluation.
7679
- **`selector.path: "*"`** — The evaluator receives the full Step object. It automatically extracts `step.input` for transaction fields and `step.context` for channel/agent/session metadata.
7780

7881
## Input Data Schema
@@ -82,7 +85,7 @@ The transaction dict (from `step.input`) should contain:
8285
```python
8386
# step.input — transaction payload
8487
{
85-
"amount": 50.0, # required — transaction amount
88+
"amount": "50.00", # required — transaction amount (Decimal-compatible)
8689
"currency": "USDC", # required — payment currency
8790
"recipient": "0xABC...", # required — payment recipient
8891
}
@@ -98,28 +101,28 @@ Context fields (`channel`, `agent_id`, `session_id`) and per-context limit overr
98101
step = Step(
99102
type="tool",
100103
name="payment",
101-
input={"amount": 75.0, "currency": "USDC", "recipient": "0xABC"},
104+
input={"amount": "75.00", "currency": "USDC", "recipient": "0xABC"},
102105
context={
103106
"channel": "experimental",
104107
"agent_id": "agent-42",
105-
"channel_max_per_transaction": 50.0,
106-
"channel_max_per_period": 200.0,
108+
"channel_max_per_transaction": "50.00",
109+
"channel_max_per_period": "200.00",
107110
},
108111
)
109112
```
110113

111-
When using `selector.path: "*"`, the evaluator merges `step.context` fields into the transaction data automatically. When using `selector.path: "input"`, context fields must be included directly in `step.input`.
114+
When using `selector.path: "input"`, context fields (channel, agent_id, session_id) are merged from `step.context` into the transaction dict by the engine. When using `selector.path: "*"`, the evaluator merges `step.context` fields itself.
112115

113116
**Option B: Inline in the transaction dict** (simpler, for direct SDK use)
114117

115118
```python
116119
result = await evaluator.evaluate({
117-
"amount": 75.0,
120+
"amount": "75.00",
118121
"currency": "USDC",
119122
"recipient": "0xABC",
120123
"channel": "experimental",
121-
"channel_max_per_transaction": 50.0,
122-
"channel_max_per_period": 200.0,
124+
"channel_max_per_transaction": "50.00",
125+
"channel_max_per_period": "200.00",
123126
})
124127
```
125128

@@ -130,6 +133,7 @@ Spend budgets are **scoped by context** — spend in channel A does not count ag
130133
The `SpendStore` protocol requires two methods. Implement them for your backend:
131134

132135
```python
136+
from decimal import Decimal
133137
from agent_control_evaluator_financial_governance.spend_limit import (
134138
SpendStore,
135139
SpendLimitConfig,
@@ -142,24 +146,39 @@ class PostgresSpendStore:
142146
def __init__(self, connection_string: str):
143147
self._conn = connect(connection_string)
144148
145-
def record_spend(self, amount: float, currency: str, metadata: dict | None = None) -> None:
149+
def record_spend(self, amount: Decimal, currency: str, metadata: dict | None = None) -> None:
146150
self._conn.execute(
147151
"INSERT INTO agent_spend (amount, currency, metadata, recorded_at) VALUES (%s, %s, %s, NOW())",
148-
(amount, currency, json.dumps(metadata)),
152+
(str(amount), currency, json.dumps(metadata)),
149153
)
150154
151-
def get_spend(self, currency: str, since_timestamp: float) -> float:
155+
def get_spend(
156+
self,
157+
currency: str,
158+
start: float,
159+
end: float | None = None,
160+
scope: dict[str, str] | None = None,
161+
) -> Decimal:
162+
end_clause = "AND recorded_at <= to_timestamp(%s)" if end is not None else ""
163+
params = [currency, start]
164+
if end is not None:
165+
params.append(end)
152166
row = self._conn.execute(
153-
"SELECT COALESCE(SUM(amount), 0) FROM agent_spend WHERE currency = %s AND recorded_at >= to_timestamp(%s)",
154-
(currency, since_timestamp),
167+
f"SELECT COALESCE(SUM(amount), 0) FROM agent_spend "
168+
f"WHERE currency = %s AND recorded_at >= to_timestamp(%s) {end_clause}",
169+
params,
155170
).fetchone()
156-
return float(row[0])
171+
return Decimal(str(row[0]))
157172
158173
# Use it:
159174
store = PostgresSpendStore("postgresql://...")
160175
evaluator = SpendLimitEvaluator(config, store=store)
161176
```
162177

178+
## Error Handling
179+
180+
Malformed or incomplete runtime payloads (missing `amount`, missing `currency`, non-numeric values, etc.) return `matched=False, error=None` — they are treated as non-matching transactions, not evaluator errors. The `error` field is reserved for evaluator infrastructure failures (crashes, timeouts, missing dependencies).
181+
163182
## Running Tests
164183

165184
```bash
@@ -170,10 +189,26 @@ pytest tests/ -v
170189

171190
## Design Decisions
172191

173-
1. **Decoupled from data source** — The `SpendStore` protocol means no new tables in core Agent Control. Bring your own persistence.
174-
2. **Context-aware limits** — Override keys in the evaluate data dict allow per-channel, per-agent, or per-session limits without multiple evaluator instances.
175-
3. **Python SDK compatible** — Uses the standard evaluator interface; works with both the server and the Python SDK evaluation engine.
176-
4. **Fail-open on errors** — Missing or malformed data returns `matched=False` with an `error` field, following Agent Control conventions.
192+
1. **Decimal for money** — All monetary amounts use `Decimal` to avoid float precision errors in financial calculations.
193+
2. **Decoupled from data source** — The `SpendStore` protocol means no new tables in core Agent Control. Bring your own persistence.
194+
3. **Context-aware limits** — Override keys in the evaluate data dict allow per-channel, per-agent, or per-session limits without multiple evaluator instances.
195+
4. **Python SDK compatible** — Uses the standard evaluator interface; works with both the server and the Python SDK evaluation engine.
196+
5. **Fail-open on malformed data** — Missing or invalid fields return `matched=False` with `error=None`, following Agent Control conventions.
197+
198+
## Known Limitations
199+
200+
### Race Condition (read-then-write is not atomic)
201+
The spend-limit evaluator reads current period spend and then writes a new record as two separate operations. Under concurrent load this can allow transactions to slip through just above the budget. For hard enforcement use a `SpendStore` implementation that provides atomic `check_and_record` semantics (e.g., a Redis `MULTI`/`EXEC` block or a PostgreSQL `SELECT ... FOR UPDATE`). The `InMemorySpendStore` is thread-safe within a single process but does not provide atomic check-and-record.
202+
203+
### Tuple-Scoped Budgets
204+
When context fields (`channel`, `agent_id`, `session_id`) are all present, they form a **single composite scope key** — not independent per-dimension budgets. For example, a scope of `{"channel": "A", "agent_id": "bot-1"}` matches only records that have *both* `channel=="A"` AND `agent_id=="bot-1"`. To enforce truly independent per-channel and per-agent budgets you would need separate `get_spend()` calls with separate scope dicts.
205+
206+
### Package Not Yet in Extras
207+
This package is not yet wired into the `agent-control-evaluators` extras install target. Install directly from the contrib path:
208+
209+
```bash
210+
pip install -e "evaluators/contrib/financial-governance"
211+
```
177212

178213
## Related Projects
179214

evaluators/contrib/financial-governance/src/agent_control_evaluator_financial_governance/__init__.py

Lines changed: 12 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -14,19 +14,28 @@
1414
1515
{
1616
"condition": {
17-
"selector": {"path": "*"},
17+
"selector": {"path": "input"},
1818
"evaluator": {
1919
"name": "financial_governance.spend_limit",
2020
"config": {
21-
"max_per_transaction": 100.0,
22-
"max_per_period": 1000.0,
21+
"max_per_transaction": "100.00",
22+
"max_per_period": "1000.00",
2323
"period_seconds": 86400,
2424
"currency": "USDC"
2525
}
2626
}
2727
},
2828
"action": {"decision": "deny"}
2929
}
30+
31+
Note on ``selector.path``:
32+
Use ``selector.path: "input"`` (recommended) to pass ``step.input``
33+
directly as the transaction dict. Context fields (``channel``,
34+
``agent_id``, ``session_id``) are merged from ``step.context`` into
35+
the transaction dict by the engine before evaluation.
36+
37+
Use ``selector.path: "*"`` to pass the full Step object; the evaluator
38+
will extract ``step.input`` and merge ``step.context`` fields itself.
3039
"""
3140

3241
from agent_control_evaluator_financial_governance.spend_limit import (

evaluators/contrib/financial-governance/src/agent_control_evaluator_financial_governance/spend_limit/config.py

Lines changed: 15 additions & 12 deletions
Original file line numberDiff line numberDiff line change
@@ -2,6 +2,8 @@
22

33
from __future__ import annotations
44

5+
from decimal import Decimal
6+
57
from pydantic import Field, field_validator
68

79
from agent_control_evaluators import EvaluatorConfig
@@ -15,9 +17,10 @@ class SpendLimitConfig(EvaluatorConfig):
1517
Attributes:
1618
max_per_transaction: Hard cap on any single transaction amount. A
1719
transaction whose ``amount`` exceeds this value is blocked
18-
regardless of accumulated period spend. Set to ``0.0`` to disable.
20+
regardless of accumulated period spend. Set to ``Decimal("0")``
21+
to disable.
1922
max_per_period: Maximum total spend allowed within the rolling
20-
*period_seconds* window. Set to ``0.0`` to disable.
23+
*period_seconds* window. Set to ``Decimal("0")`` to disable.
2124
period_seconds: Length of the rolling budget window in seconds.
2225
Defaults to ``86400`` (24 hours).
2326
currency: Currency symbol this policy applies to (e.g. ``"USDC"``).
@@ -27,27 +30,27 @@ class SpendLimitConfig(EvaluatorConfig):
2730
Example config dict::
2831
2932
{
30-
"max_per_transaction": 500.0,
31-
"max_per_period": 5000.0,
33+
"max_per_transaction": "500.00",
34+
"max_per_period": "5000.00",
3235
"period_seconds": 86400,
3336
"currency": "USDC"
3437
}
3538
"""
3639

37-
max_per_transaction: float = Field(
38-
default=0.0,
39-
ge=0.0,
40+
max_per_transaction: Decimal = Field(
41+
default=Decimal("0"),
42+
ge=0,
4043
description=(
4144
"Per-transaction spend cap in *currency* units. "
42-
"0.0 means no per-transaction limit."
45+
"0 means no per-transaction limit."
4346
),
4447
)
45-
max_per_period: float = Field(
46-
default=0.0,
47-
ge=0.0,
48+
max_per_period: Decimal = Field(
49+
default=Decimal("0"),
50+
ge=0,
4851
description=(
4952
"Maximum cumulative spend allowed in the rolling period window. "
50-
"0.0 means no period limit."
53+
"0 means no period limit."
5154
),
5255
)
5356
period_seconds: int = Field(

0 commit comments

Comments
 (0)