Skip to content

Commit b530b1a

Browse files
committed
fix: semantic bugs found in the LLM audit backlog
Closes the four findings tracked in #120 that ruff/bandit/vulture could not catch because they are about intent, not pattern. - cache.py: flip the expiry comparison from `<=` to `<`. Before, a cache entry set with `expire=now+5` was marked expired at exactly second 5. Now it is valid up to and including second 5 and first becomes expired at second 6, matching HTTP Cache-Control max-age and Redis EXPIRE semantics. - human.py: `_to_human()` now compares `abs(n)` against the thresholds and preserves the sign in the scaled output. `bytes2human(-1048576)` used to fall through to the fallback unit and return `-1048576.0B`; it now returns `-1.0MiB`. Affects bits2human, bytes2human, bps2human. Matters for counter deltas that can legitimately go negative (counter resets, reclaimed memory, bandwidth drops). - shell.py: in the non-shell pipeline path, close the previous stage's `stdout` after the new Popen has inherited it. Without this, the upstream child never saw EOF/SIGPIPE when the downstream stage exited early, and each stage leaked a file descriptor in the parent. - url.py: drop the `timeout=timeout if digest_auth_user else timeout` dead ternary in `fetch()` (both branches evaluated to `timeout`). No behavior change. Version bumps: cache.py, human.py, shell.py. url.py is unchanged behaviorally so its version stays.
1 parent d803b57 commit b530b1a

5 files changed

Lines changed: 31 additions & 10 deletions

File tree

CHANGELOG.md

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -11,6 +11,10 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
1111
### Fixed
1212

1313
* Fix `--require-hashes` pip installs in CI workflows by using pinned versions instead
14+
* cache.py: treat a cache entry as valid up to and including its `expire` timestamp instead of expiring it one second early (`<` instead of `<=`). A key set with `expire=now+5` is now still served at `now+5` and first becomes unavailable at `now+6`, matching HTTP Cache-Control max-age and Redis EXPIRE semantics. Callers that relied on the old one-second-early expiry see their cached value live one second longer ([#120](https://github.com/Linuxfabrik/lib/issues/120))
15+
* human.py: `bits2human()`, `bytes2human()` and `bps2human()` now scale negative values to a unit that matches their magnitude. Before, `bytes2human(-1048576)` returned `-1048576.0B`, now it returns `-1.0MiB`. This matters for counter deltas that can legitimately be negative (counter resets, reclaimed memory, bandwidth drops) ([#120](https://github.com/Linuxfabrik/lib/issues/120))
16+
* shell.py: close the upstream process's `stdout` after connecting it to the next pipeline stage. Without this, the upstream process never received EOF/SIGPIPE when the downstream stage exited early, and each pipeline stage leaked a file descriptor until garbage collection caught up ([#120](https://github.com/Linuxfabrik/lib/issues/120))
17+
* url.py: drop the dead `timeout=timeout if digest_auth_user else timeout` ternary in `fetch()`; both branches evaluated to `timeout`, so behavior is unchanged ([#120](https://github.com/Linuxfabrik/lib/issues/120))
1418
* db_sqlite.py: pass `usedforsecurity=False` to `hashlib.sha1()` so bandit no longer flags a non-security SHA1 use as a weak hash (the hash is only used to derive sanitized SQL identifiers)
1519
* db_sqlite.py: rename unused loop variable in `rm_db()` to silence ruff B007
1620
* grassfish.py: remove unused `match()` helper that referenced undefined names (`re` and `compiled_custom_id_regex`); the function was never called and would have raised `NameError` at runtime

cache.py

Lines changed: 8 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -25,7 +25,7 @@
2525
"""
2626

2727
__author__ = 'Linuxfabrik GmbH, Zurich/Switzerland'
28-
__version__ = '2025042001'
28+
__version__ = '2026041201'
2929

3030
from . import db_sqlite, time
3131

@@ -86,13 +86,17 @@ def get(
8686
if not success or not result:
8787
return False
8888

89-
# Check if the key has expired
89+
# Check if the key has expired. `timestamp` is the "expires at" Unix
90+
# epoch set via `set(expire=...)`. We treat the entry as valid up to
91+
# and including that timestamp (strict-less-than), so a key with
92+
# expire=now+5 is still served at now+5 and first becomes expired at
93+
# now+6. This matches HTTP Cache-Control max-age and Redis EXPIRE.
9094
now = time.now()
91-
if result['timestamp'] != 0 and result['timestamp'] <= now:
95+
if result['timestamp'] != 0 and result['timestamp'] < now:
9296
# Clean up all expired entries
9397
db_sqlite.delete(
9498
conn,
95-
sql='DELETE FROM cache WHERE timestamp <= :now;',
99+
sql='DELETE FROM cache WHERE timestamp < :now;',
96100
data={'now': now},
97101
)
98102
db_sqlite.commit(conn)

human.py

Lines changed: 6 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -13,7 +13,7 @@
1313
"""
1414

1515
__author__ = 'Linuxfabrik GmbH, Zurich/Switzerland'
16-
__version__ = '2026030301'
16+
__version__ = '2026041201'
1717

1818
import math
1919

@@ -155,9 +155,13 @@
155155

156156

157157
def _to_human(n, thresholds, fallback_unit, decimals=1, space=False):
158+
# compare against the absolute value so negative deltas (counter resets,
159+
# memory reclaimed, bandwidth swings) still get scaled to a unit that
160+
# matches their magnitude. e.g. -1048576 bytes reads as '-1.0MiB'.
158161
sep = ' ' if space else ''
162+
abs_n = abs(n)
159163
for symbol, threshold in thresholds:
160-
if n >= threshold:
164+
if abs_n >= threshold:
161165
return f'{float(n) / threshold:.{decimals}f}{sep}{symbol}'
162166
return f'{n:.{decimals}f}{sep}{fallback_unit}'
163167

shell.py

Lines changed: 10 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -11,7 +11,7 @@
1111
"""Communicates with the Shell on Linux and Windows."""
1212

1313
__author__ = 'Linuxfabrik GmbH, Zurich/Switzerland'
14-
__version__ = '2026032101'
14+
__version__ = '2026041201'
1515

1616

1717
import os
@@ -178,21 +178,28 @@ def shell_exec(
178178
return True, (stdout, stderr, retc)
179179

180180
# non-shell pipeline: cmd string is split on '|' and each segment runs with
181-
# shell=False; args come from the caller's cmd string via shlex.split
181+
# shell=False; args come from the caller's cmd string via shlex.split.
182+
# after connecting a stage's stdout to the next stage's stdin, we close it
183+
# in the parent so only the downstream child holds the read end. Without
184+
# this, the upstream child never sees EOF / SIGPIPE when the downstream
185+
# exits early, and we leak a file descriptor per pipeline stage.
182186
cmds = cmd.split('|')
183187
p = None
184188
for part in cmds:
185189
try:
186190
args = shlex.split(part.strip())
191+
prev = p
187192
p = subprocess.Popen( # nosec B603
188193
args,
189-
stdin=p.stdout if p else subprocess.PIPE,
194+
stdin=prev.stdout if prev else subprocess.PIPE,
190195
stdout=subprocess.PIPE,
191196
stderr=subprocess.PIPE,
192197
env=env,
193198
shell=False,
194199
cwd=cwd,
195200
)
201+
if prev is not None:
202+
prev.stdout.close()
196203
except (OSError, ValueError, Exception) as e:
197204
return False, f'Error "{e}" while calling command "{part}"'
198205

url.py

Lines changed: 3 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -159,9 +159,11 @@ def fetch(
159159
)
160160
response = opener.open(request) # nosec B310
161161
else:
162+
# when digest auth is active the opener installed further up
163+
# already carries its own context, so we must not pass one here
162164
response = urllib.request.urlopen( # nosec B310
163165
request,
164-
timeout=timeout if digest_auth_user else timeout,
166+
timeout=timeout,
165167
context=None if digest_auth_user else ctx,
166168
)
167169

0 commit comments

Comments
 (0)