Recursive JSON Parsing

Purpose

Within the broader scope of JSON Parsing and Representation, Recursive JSON Parsing addresses the challenge of efficiently converting JSON text into a structured, in-memory tree representation by leveraging recursion. JSON’s nested nature—with arrays containing objects, objects containing arrays, and so forth—requires a parsing approach capable of handling arbitrarily deep structures while maintaining correctness and performance.

This subtopic focuses specifically on implementing recursive descent parsing techniques that systematically process JSON strings, numbers, arrays, and objects. It ensures proper state management during parsing, particularly for complex constructs like escaped characters in strings and nested containers. The approach enables the core parser to build accurate data structures while detecting syntax errors and handling whitespace effectively.

Functionality

The recursive parsing workflow is initiated by the public API function json_parse(), which prepares the input and calls the core recursive parser function parse_value_build(). This function dispatches parsing to specialized routines depending on the JSON token encountered:

Each parsing function maintains its own state and position pointer within the input string. They invoke each other recursively as nested structures are encountered. Whitespace skipping is performed at appropriate points to ensure robustness against formatted JSON input.

Key aspects of the implementation include:

This recursive approach contrasts with iterative or lexer-driven parsers by directly mirroring the JSON grammar in function calls, which simplifies the logic and fits naturally with JSON’s nested structure.

Code Snippet Illustrating Recursive Dispatch

static json_value *parse_value_build(const char **s, int id) {
  skip_ws(s);
  if (**s == '"')
    return parse_string_value(s);
  if (**s == '{')
    return parse_object_value(s, ++id);
  if (**s == '[')
    return parse_array_value(s, ++id);
  if (**s == 'n') {
    if (match_literal_build(s, "null"))
      return json_new_null();
    return NULL;
  }
  if (**s == 't') {
    const char *ptr = *s;
    if (match_literal_build(s, "true"))
      return json_new_boolean(ptr, 4);
    return NULL;
  }
  if (**s == 'f') {
    const char *ptr = *s;
    if (match_literal_build(s, "false"))
      return json_new_boolean(ptr, 5);
    return NULL;
  }
  if (**s == '-' || isdigit((unsigned char)**s))
    return parse_number_value(s);
  return NULL;
}

Integration

Recursive JSON Parsing is a foundational subtopic within JSON Parsing and Representation, providing the low-level mechanics behind the construction of the in-memory JSON value trees. It complements the In-Memory JSON Structure subtopic by producing the structured data that those data structures represent.

The parsed json_value tree generated by this recursive parsing process is subsequently used by other modules:

Thus, recursive parsing acts as the entry point converting raw JSON text into the internal format that all other functionalities operate on safely and efficiently.

Diagram

flowchart TD
Start[Start Parsing] --> SkipWS[Skip Whitespace]
SkipWS --> CheckToken{Next Char}
CheckToken -->|'s'| ParseString[Parse String]
CheckToken -->|'{'| ParseObject[Parse Object]
CheckToken -->|'\['| ParseArray[Parse Array]
CheckToken -->|'n'| MatchNull[Match "null"]
CheckToken -->|'t'| MatchTrue[Match "true"]
CheckToken -->|'f'| MatchFalse[Match "false"]
CheckToken -->|'-' or digit| ParseNumber[Parse Number]
CheckToken -->|Other| Fail[Parse Fail]
ParseString --> ReturnValue[Return json_value]
ParseObject --> ParseObjectLoop[Parse Key-Value Pairs Recursively]
ParseObjectLoop --> ReturnValue
ParseArray --> ParseArrayLoop[Parse Elements Recursively]
ParseArrayLoop --> ReturnValue
MatchNull --> ReturnNull[Return Null Value]
MatchTrue --> ReturnTrue[Return Boolean True]
MatchFalse --> ReturnFalse[Return Boolean False]
ParseNumber --> ReturnNumber[Return Number Value]
ReturnNull --> End[End Parsing]
ReturnTrue --> End
ReturnFalse --> End
ReturnNumber --> End
ReturnValue --> End
Fail --> End

This flowchart illustrates the recursive descent decision process and recursive calls that build the JSON value tree from the input string.