@@ -4,19 +4,19 @@ A simple and lightweight tool for extracting text from a screenshot/image (on th
## Optimization
-
- **Screen capture uses the fastest available hardware path per platform**: DXGI Desktop Duplication on Windows acquires frames directly from the GPU's front buffer via a staging texture mapped for CPU read, avoiding any GDI software rasterization; XGetImage on X11 takes a direct 32bpp packed-pixel fast path (a single `memcpy`-equivalent row scan), falling back to the `XGetPixel` generic path only when the pixel format does not match the expected mask layout. The screen is then kept as a single RGBA buffer in memory for the entire session; all cropping, annotation rendering, and encoding operate on that buffer without re-capturing.
+
- **Screen capture uses the fastest available hardware path (X11, Windows)**: DXGI Desktop Duplication on Windows acquires frames directly from the GPU's front buffer via a staging texture mapped for CPU read, avoiding any GDI software rasterization; XGetImage on X11 takes a direct 32bpp packed-pixel fast path (a single `memcpy`-equivalent row scan), falling back to the `XGetPixel` generic path only when the pixel format does not match the expected mask layout. The screen is then kept as a single RGBA buffer in memory for the entire session; all cropping, annotation rendering, and encoding operate on that buffer without re-capturing.
- The fullscreen overlay is a **borderless windowed surface** rather than exclusive fullscreen, avoiding implicit GPU mode switches and the display state corruption they can leave behind on abnormal exit. This behavior can also be changed via the configuration file.
- **OCR, barcode scanning, and font loading are all on-demand**: none are initialized at startup; Tesseract and ZBar are only configured when the user triggers an extraction, and the Tesseract engine instance is reused across extractions within a session, re-initializing only when the model or data path changes. Tesseract page segmentation mode is additionally dispatched in **O(1)** via area and aspect ratio heuristics before OCR runs, avoiding full-page layout analysis on small single-word or single-line regions.
-
- **Annotation geometry is rendered entirely through ImGui draw lists on the GPU**, with CPU-side pixel rasterization only used when baking annotations into the saved image. The rasterizer uses **Bresenham's line algorithm**, O(max(Δx, Δy)), and a **midpoint circle algorithm**, O(radius), rather than naive scanline fills.
+
- **Annotation geometry is rendered entirely through ImGui draw lists on the GPU**, with CPU-side pixel rasterization only used when baking annotations into the saved image. The rasterizer uses **Bresenham's line algorithm** `O(max(Δx, Δy))` and a **midpoint circle algorithm** `O(radius)` rather than naive scanline fills.
- **Pencil stroke point reduction uses a squared-distance threshold**, comparing `dx²+dy² > 4.0` rather than computing `sqrt`, keeping the per-mouse-move check O(1) with no transcendental function call and keeping the point array small regardless of how long the user draws.
- **Grayscale conversion for barcode scanning uses integer-only ITU-R BT.601 weights** `(77r + 150g + 29b) >> 8` rather than floating-point luminance coefficients, keeping the O(w×h) pixel walk entirely in the integer pipeline.
-
- **Monitor detection under XRandR is O(monitors)**, querying only the list of attached outputs and comparing cursor coordinates against their bounding rectangles, never touching pixel data.
+
- **Monitor detection is O(monitors)**, querying only the list of attached outputs and comparing cursor coordinates against their bounding rectangles, never touching pixel data.
- **The font cache is an O(log n) lookup** keyed on `(path, size)`, ensuring repeated renders of the same annotated text at the same size never trigger atlas rebuilds or filesystem access.
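
The pencil stroke point reduction described above can be sketched roughly as follows. This is an illustrative sketch, not the project's actual code: the `Point` struct, the `kMinDistSq` constant, and the `appendIfFarEnough` function are hypothetical names, and only the `dx²+dy² > 4.0` comparison is taken from the README text.

```cpp
#include <vector>

// Hypothetical point type; the project's real stroke representation may differ.
struct Point {
    float x;
    float y;
};

// Squared minimum spacing between stored points.
// 4.0 is the threshold quoted in the README (a 2-pixel distance, squared).
constexpr float kMinDistSq = 4.0f;

// Appends `p` to `stroke` only if it lies farther than the threshold from
// the last stored point. Returns true when the point was kept.
bool appendIfFarEnough(std::vector<Point>& stroke, Point p) {
    if (!stroke.empty()) {
        const float dx = p.x - stroke.back().x;
        const float dy = p.y - stroke.back().y;
        // Compare squared distances directly, so sqrt is never called.
        if (dx * dx + dy * dy <= kMinDistSq) {
            return false;
        }
    }
    stroke.push_back(p);
    return true;
}
```

Because the check only looks at the most recently stored point, the cost per mouse-move event stays constant no matter how long the stroke already is.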
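
The integer-only grayscale conversion can be illustrated with a short sketch. The `(77r + 150g + 29b) >> 8` formula comes from the README; the function names `lumaBt601` and `rgbaToGray`, and the assumption of a tightly packed 4-bytes-per-pixel RGBA buffer, are hypothetical and chosen only for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Integer approximation of BT.601 luma. The weights 77, 150 and 29 are
// roughly 0.299, 0.587 and 0.114 scaled by 256; they sum to exactly 256,
// so pure white maps to 255 after the shift.
inline std::uint8_t lumaBt601(std::uint8_t r, std::uint8_t g, std::uint8_t b) {
    return static_cast<std::uint8_t>((77u * r + 150u * g + 29u * b) >> 8);
}

// Converts a tightly packed RGBA buffer (assumed layout: 4 bytes per pixel,
// no row padding) into an 8-bit grayscale buffer with one pass over the pixels.
std::vector<std::uint8_t> rgbaToGray(const std::vector<std::uint8_t>& rgba,
                                     std::size_t width, std::size_t height) {
    std::vector<std::uint8_t> gray(width * height);
    for (std::size_t i = 0; i < width * height; ++i) {
        const std::uint8_t* px = &rgba[i * 4];
        gray[i] = lumaBt601(px[0], px[1], px[2]);  // alpha (px[3]) is ignored
    }
    return gray;
}
```

The largest intermediate value is 256 × 255 = 65280, so the sum fits comfortably in a 32-bit integer and no floating-point conversion is needed at any step.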