← Back to documentation

Stroke Format Specification

Version: 0.1 (Draft)
Date: March 2026
Status: Proposal
Depends on: None (foundational)
Used by: Edge, Bridge, Vision, Cloud (storage)


1. Overview

This spec defines how handwritten strokes are represented, captured, transmitted, and stored in the FreakingGenius system.

A stroke is a single pen-down to pen-up sequence — one continuous line the student draws without lifting the stylus.

1.1 Design Goals

Goal Implication
Low latency Batched transmission (200ms), not per-point
Edge-agnostic Same format works for BOOX, reMarkable, iPad
Compact Minimize bandwidth for bad WiFi
Parseable Vision needs to reconstruct writing order, timing
Debuggable Human-readable format available for development

1.2 Non-Goals


2. Data Model

2.1 Point

A single sampled position of the stylus.

Field Type Required Description
x float X coordinate in normalized space (0.0 - 1.0)
y float Y coordinate in normalized space (0.0 - 1.0)
p float Pressure (0.0 - 1.0). Omit if device doesn't support.
t uint Timestamp in ms since stroke start

Coordinate space: Normalized 0.0 to 1.0, origin at top-left. This decouples the format from device resolution. Edge converts native pixels → normalized on capture. Vision/renderer converts normalized → pixels as needed.

Timestamp: Milliseconds since the stroke began (first point is t=0). Allows reconstructing writing speed without needing absolute timestamps.

2.2 Stroke

A sequence of points forming one continuous line.

Field Type Required Description
id string Unique identifier (UUID or incrementing per session)
pts Point[] Array of points, in temporal order
ts uint64 Absolute timestamp (ms since Unix epoch) of stroke start
tool enum Tool type if known (see 2.3)

2.3 Tool Types

enum Tool {
  PEN = 0       // Default, standard writing
  ERASER = 1    // Eraser stroke (for deletion)
  HIGHLIGHTER = 2  // Highlighter (if supported)
}

Most strokes will be PEN. Eraser strokes are captured separately so Vision knows what was deleted (useful for understanding student's revision process).

2.4 Stroke Batch

Multiple strokes transmitted together (every 200ms).

Field Type Required Description
sessionId string Current session identifier
seq uint Sequence number (for ordering, gap detection)
strokes Stroke[] Array of strokes completed since last batch
partial Stroke Currently in-progress stroke (pen still down)

The partial field allows transmitting incomplete strokes for real-time observation. When the stroke completes, it appears in strokes in the next batch.


3. Wire Format

3.1 JSON (Development / Debug)

Human-readable, used during development and for debugging.

{
  "sessionId": "sess_abc123",
  "seq": 42,
  "strokes": [
    {
      "id": "str_001",
      "ts": 1711670400000,
      "tool": 0,
      "pts": [
        {"x": 0.234, "y": 0.156, "p": 0.45, "t": 0},
        {"x": 0.236, "y": 0.158, "p": 0.52, "t": 8},
        {"x": 0.240, "y": 0.162, "p": 0.61, "t": 16},
        {"x": 0.245, "y": 0.168, "p": 0.58, "t": 24}
      ]
    }
  ],
  "partial": {
    "id": "str_002",
    "ts": 1711670400150,
    "tool": 0,
    "pts": [
      {"x": 0.500, "y": 0.300, "p": 0.40, "t": 0},
      {"x": 0.502, "y": 0.305, "p": 0.45, "t": 8}
    ]
  }
}

3.2 Binary (Production)

Compact format for production use. Significantly smaller than JSON.

Batch Header (16 bytes):
  [0-3]   Magic number: 0x46475354 ("FGST")
  [4-7]   Version: uint32
  [8-11]  Sequence number: uint32
  [12-15] Stroke count: uint32

Per Stroke:
  [0-15]  Stroke ID: 16 bytes (UUID)
  [16-23] Timestamp: uint64 (ms since epoch)
  [24]    Tool: uint8
  [25-27] Point count: uint24
  [28+]   Points: (point_count * 10 bytes each)

Per Point (10 bytes):
  [0-3]   X: float32
  [4-7]   Y: float32
  [8-9]   P + T packed: uint16 (4 bits pressure, 12 bits time delta)

Packed P+T: Pressure is quantized to 4 bits (16 levels, sufficient for writing). Time delta from previous point in 12 bits (max 4095ms between samples — more than enough at 120Hz).

3.3 Format Negotiation

Edge and Tutor negotiate format at session start:

// In capability exchange
{
  "strokeFormat": "binary",  // or "json"
  "strokeVersion": 1
}

Default to binary in production, JSON when DEBUG flag is set.


4. Transmission Protocol

4.1 Batching

Strokes are batched and sent every 200ms (configurable).

Parameter Default Range Notes
batchIntervalMs 200 100-500 Lower = more responsive, higher bandwidth
maxBatchSize 100 50-500 Max points per batch before forced send

4.2 Reliability

Stroke data is critical — we can't lose what the student wrote.

Approach: Lightweight acknowledgment with local buffer

  1. Edge sends batch with sequence number
  2. Tutor acknowledges received sequences periodically (every 1s)
  3. Edge buffers unacknowledged batches (last 30s)
  4. On connection drop, Edge replays unacknowledged batches on reconnect
  5. Tutor deduplicates by sequence number
Edge                           Tutor
  │                               │
  │──── Batch seq=1 ────────────►│
  │──── Batch seq=2 ────────────►│
  │──── Batch seq=3 ────────────►│
  │                               │
  │◄──── ACK seq=3 ──────────────│  (acknowledges 1,2,3)
  │                               │
  │  (Edge clears buffer 1-3)     │

4.3 Offline Handling

If connection is lost:

  1. Edge continues capturing strokes locally
  2. Strokes buffer on-device (up to configurable limit, e.g., 10MB)
  3. On reconnect, Edge sends buffered batches in order
  4. Tutor processes the backlog, catches up on what student wrote

This means the student can keep working during brief network drops.


5. Sampling

5.1 Sample Rate

Target: 120Hz (8.3ms between samples)

Most modern styluses report at 120-240Hz. We downsample to 120Hz to balance accuracy vs. data volume.

5.2 Point Filtering

Edge may apply minimal filtering before transmission:

Filter Purpose Parameters
Distance threshold Skip points too close together min 0.002 normalized units
Smoothing Optional noise reduction 3-point moving average

Vision may apply additional processing. Raw-ish data is preferred to preserve writing characteristics.


6. Coordinate Spaces

6.1 Normalized Space (Wire Format)

6.2 Device Space (Edge Internal)

Each Edge converts between device pixels and normalized:

normalized_x = (device_x - left_margin) / content_width
normalized_y = (device_y - top_margin) / content_height

Content area excludes any UI chrome. Only the "paper" area is mapped.

6.3 Exercise Space (Vision Internal)

Vision may further transform into exercise-relative coordinates — e.g., mapping to specific input zones within an exercise. That's defined in the Vision spec.


7. Metadata

7.1 Per-Session Metadata

Captured at session start:

{
  "sessionId": "sess_abc123",
  "deviceId": "dev_xyz789",
  "deviceModel": "BOOX Tab Ultra C",
  "screenResolution": {"w": 2480, "h": 1860},
  "contentArea": {"x": 40, "y": 120, "w": 2400, "h": 1680},
  "stylusModel": "BOOX Pen Plus",
  "samplingRate": 120
}

7.2 Per-Stroke Metadata (Optional)

For analytics/research, Edge may capture:

{
  "strokeId": "str_001",
  "inputZone": "zone_answer_1",  // Which part of exercise
  "exerciseId": "ex_4521",       // Current exercise
  "duration": 342,               // ms from first to last point
  "length": 0.234,               // Total path length (normalized)
  "boundingBox": {"x": 0.23, "y": 0.15, "w": 0.12, "h": 0.08}
}

This metadata is computed locally on Edge and included in batches when present.


8. Examples

8.1 Simple Digit "3"

Student writes the number 3:

{
  "id": "str_001",
  "ts": 1711670400000,
  "tool": 0,
  "pts": [
    {"x": 0.320, "y": 0.100, "p": 0.50, "t": 0},
    {"x": 0.345, "y": 0.095, "p": 0.55, "t": 16},
    {"x": 0.370, "y": 0.100, "p": 0.60, "t": 33},
    {"x": 0.375, "y": 0.120, "p": 0.58, "t": 50},
    {"x": 0.360, "y": 0.140, "p": 0.55, "t": 66},
    {"x": 0.340, "y": 0.150, "p": 0.52, "t": 83},
    {"x": 0.365, "y": 0.165, "p": 0.55, "t": 100},
    {"x": 0.380, "y": 0.185, "p": 0.58, "t": 116},
    {"x": 0.375, "y": 0.210, "p": 0.55, "t": 133},
    {"x": 0.350, "y": 0.225, "p": 0.50, "t": 150},
    {"x": 0.320, "y": 0.220, "p": 0.45, "t": 166}
  ]
}

8.2 Eraser Stroke

Student erases something:

{
  "id": "str_015",
  "ts": 1711670450000,
  "tool": 1,
  "pts": [
    {"x": 0.400, "y": 0.300, "t": 0},
    {"x": 0.450, "y": 0.310, "t": 50},
    {"x": 0.500, "y": 0.305, "t": 100}
  ]
}

Note: Eraser strokes typically don't have pressure data.

8.3 Batch with Partial Stroke

Mid-writing batch:

{
  "sessionId": "sess_abc123",
  "seq": 42,
  "strokes": [
    {"id": "str_040", "ts": 1711670500000, "tool": 0, "pts": [...]},
    {"id": "str_041", "ts": 1711670500200, "tool": 0, "pts": [...]}
  ],
  "partial": {
    "id": "str_042",
    "ts": 1711670500350,
    "tool": 0,
    "pts": [
      {"x": 0.600, "y": 0.400, "p": 0.45, "t": 0},
      {"x": 0.605, "y": 0.408, "p": 0.50, "t": 8},
      {"x": 0.612, "y": 0.420, "p": 0.52, "t": 16}
    ]
  }
}

9. Storage

9.1 Session Storage (Cloud)

Complete stroke data is stored for:

Storage format: Binary batches, gzipped, in object storage (S3/GCS).

Retention: TBD (likely 1 year for active students, anonymized thereafter).

9.2 Local Cache (Edge)

Edge caches recent strokes for:

Cache size: 10MB or 30 minutes, whichever is smaller.


10. Compatibility

10.1 Version Evolution

Format includes version number. Receivers must:

  1. Accept current version
  2. Accept previous version (backward compatible)
  3. Reject unknown future versions gracefully

10.2 Device Variations

Device Pressure Tilt Sample Rate
BOOX Tab Ultra ✓ (4096 levels) 120Hz
reMarkable 2 ✓ (4096 levels) 120Hz
iPad + Pencil 240Hz (downsample)
Finger touch 60Hz

Missing capabilities are simply omitted from the data (e.g., no p field if no pressure).


Appendix A: Size Estimates

Typical writing session:

Format Size Notes
JSON (uncompressed) ~2.5 MB ~100 bytes/point
JSON (gzipped) ~400 KB Compresses well
Binary (uncompressed) ~250 KB 10 bytes/point
Binary (gzipped) ~100 KB Final storage size

Transmission (200ms batches over 45 min):

This is very manageable, even on poor connections.


Appendix B: Open Questions

Question Options Notes
Stroke ID format UUID vs. sequential UUID is safer, sequential is smaller
Pressure quantization 4-bit vs 8-bit 4-bit (16 levels) probably sufficient
Include tilt data? Yes / No / Optional Useful for rendering, adds 4 bytes/point
Compression algorithm gzip / lz4 / zstd lz4 is fastest, gzip most compatible

Next spec: Visual Primitives (how Tutor tells Edge what to draw)