CS302 Lecture Notes - The Nlohmann JSON library


Introduction

JSON is a standard for storing information in strings. There are JSON readers and writers for every language, so if you want your input and output to be interoperable with other people's programs, JSON is a very good choice.

The Nlohmann JSON library is an open source include-only library for C++, which you can get from https://github.com/nlohmann/json. Clone it, and then if you include "nlohmann/json.hpp" in your programs, you'll be able to read, write and manipulate JSON effortlessly. You do have to compile with C++11 or later. I know that doesn't bother you like it does me, but it should show you how much I like this library that I'll eat C++11 to use it.


A little about JSON

JSON is pretty flexible -- a number, boolean or a quoted string is a valid piece of JSON. The two more complex pieces of JSON are the array and the key/value store (referred to as an "object"). The following strings are all valid JSON:
55                                       # Number
-23.45                                   # Number
true                                     # Boolean
"Fred"                                   # String
[ 1, 2, "Fred" ]                         # Array with three elements
{ "Name": "Jim", "Age": 100 }            # Key/Value "object" with two keys/values.
You may nest arrays and objects in arrays and objects, giving you a flexible hierarchy. For example, the file txt/i2.txt contains a key/value store with two keys: "Jim" and "Brad". Each of them has an object as its value. I'm representing Brad's children with an empty JSON array.

{ "Jim" : { "Last": "Plank", 
            "Age": "Infinity",
            "Children": [ "Katie", "Emily", "Ellen", "Jeff" ] },
  "Brad" : { "Last": "Vander Zanden",
             "Age": 55,
            "Children": []}}

I won't talk much more about JSON -- my advice is to keep it simple, and don't even think about arcane things like how to represent a double-quote or general ASCII characters.


Example 1 - Creating and printing some JSON

My first program (src/ex0_create_n_print.cpp) reads lines of text, and does one of two things:
  1. If the line has two words, it will treat it as key/value and store it into a JSON object. It stores numbers as doubles and strings as strings.
  2. If the line has more than two words, then it treats the first word as a key, and stores the remaining words as its value in a JSON array.
It then prints the JSON. So:
UNIX> echo Jim Plank | bin/ex0_create_n_print 
{"Jim":"Plank"}
UNIX> echo Five 5 | bin/ex0_create_n_print 
{"Five":5.0}
UNIX> cat txt/i1.txt
Jim Plank
Fred -55.55
Luther 1 Two Buckle My Shoe
UNIX> cat txt/i1.txt | bin/ex0_create_n_print 
{"Fred":-55.55,"Jim":"Plank","Luther":[1.0,"Two","Buckle","My","Shoe"]}
UNIX> 
You'll note -- when it prints the JSON, it sorts the keys. However, if someone gives you JSON, you shouldn't assume that the keys will be sorted.

Here's the code -- reading the input should be straightforward to you. Note how I can treat the JSON like an associative array (e.g. a map) when it's a key/value store, and like a vector when it's an array:

/* Read lines of text and add them as key/value objects in JSON.  Then print the JSON. */

#include "nlohmann/json.hpp"
#include <cstdio>
#include <iostream>
#include <vector>
#include <sstream>
using namespace std;
using nlohmann::json;

int main()
{
  string key, val;
  double v;
  vector <string> sv;
  vector <bool> bv;
  vector <double> dv;
  istringstream ss;
  string line;
  string s;
  size_t i;
  json js;

  /* Read in lines of text.  

     If there are two words, then store a key/value pair in the json.  If the value
        can be interpreted as a double, then store it as a double.  Otherwise, store
        it as a string.

     If there are more than two words, then have the value be an array.
   */

  js = json::object();

  /* Read the line and turn it into a vector of words, bools and doubles. */

  while (getline(cin, line)) {
    sv.clear();
    bv.clear();
    dv.clear();
    ss.clear();
    ss.str(line);
    while (ss >> s) {
      v = 0;
      sv.push_back(s);
      bv.push_back(sscanf(s.c_str(), "%lf", &v) == 1);
      dv.push_back(v);
    }

    /* Set js[key] to val, using sscanf to see if val is a double or string */

    if (sv.size() == 2) {   
      if (bv[1]) {                                   // It's a double.
        js[sv[0]] = dv[1];
      } else {                                       // It's a string.
        js[sv[0]] = sv[1];
      }

    /* Otherwise set js[key] to be an array of values. */

    } else if (sv.size() > 2) {  
      key = sv[0];
      js[key] = json::array();
      for (i = 1; i < sv.size(); i++) {
        if (bv[i]) {                                   // It's a double.
          js[key].push_back(dv[i]);
        } else {                                       // It's a string.
          js[key].push_back(sv[i]);
        }
      }
    }
  }
   
  /* Print it out. */
  
  cout << js << endl;
  return 0;
}


Reading and "Dumping" JSON

Reading JSON with nlohmann::json is super easy -- you simply read it from a stream like cin. You can write using a stream too, as I did above. There is also the dump() method, which will return a string that has the JSON in it. If you give an integer parameter to dump(), then it will indent the JSON.

Here's a simple program that reads JSON and prints it formatted with two spaces of indentation. It's in src/ex1_read_and_dump.cpp:

/* Read JSON from standard input, and then use dump(2) to write it indented by two spaces. */

#include "nlohmann/json.hpp"
#include <iostream>
using namespace std;
using nlohmann::json;

int main()
{
  json js;

  cin >> js;
  
  cout << js.dump(2) << endl;
  return 0;
}

Here are some examples:

UNIX> echo 55 | bin/ex1_read_and_dump 
55
UNIX> echo Jim Plank | bin/ex0_create_n_print | bin/ex1_read_and_dump 
{
  "Jim": "Plank"
}
UNIX> cat txt/i1.txt | bin/ex0_create_n_print | bin/ex1_read_and_dump
{
  "Fred": -55.55,
  "Jim": "Plank",
  "Luther": [
    1.0,
    "Two",
    "Buckle",
    "My",
    "Shoe"
  ]
}
UNIX> 
If you mess up your JSON, it will throw an exception -- we won't catch it, so the program will exit. As you can see, the error messages are not fun to read.
UNIX> echo Jim | bin/ex1_read_and_dump      # Strings need quotes
libc++abi.dylib: terminating with uncaught exception of type nlohmann::detail::parse_error: [json.exception.parse_error.101] parse error at line 1, column 1: syntax error while parsing value - invalid literal; last read: 'J'
Abort trap: 6
UNIX> echo '{"Jim":}'     # Keys need values
{"Jim":}
UNIX> echo '{"Jim":}' | bin/ex1_read_and_dump
libc++abi.dylib: terminating with uncaught exception of type nlohmann::detail::parse_error: [json.exception.parse_error.101] parse error at line 1, column 8: syntax error while parsing value - unexpected '}'; expected '[', '{', or a literal
Abort trap: 6
UNIX> 
You can catch the exception. It is of type json::exception, and like standard C++ exceptions, it has a what() method which returns a C-style string describing the error.

Enumerating keys

You can use an iterator to enumerate the keys in a JSON object. The program in src/ex2_print_keys.cpp illustrates this -- read the header comment for what it does:

/* This program uses an iterator to enumerate the keys inside a JSON object.
   For each key, it prints:
     - The key and val, if the val is a number or string.
     - The key and val size if the val is an object or array.
 */
   
#include "nlohmann/json.hpp"
#include <cstdio>
#include <iostream>
using namespace std;
using nlohmann::json;

int main()
{
  json js;
  json::const_iterator jit;

  /* Read the json from standard input. */

  cin >> js;
  
  /* If it's not an object, print an error message and exit. */

  if (!js.is_object()) {
    cout << "Not a JSON object: " << js << endl;
    return 0;
  } 

  /* Otherwise, run through the keys with an iterator and print them. */

  for (jit = js.begin(); jit != js.end(); jit++) {
    printf("Key: %-10s ", jit.key().c_str());
    if (jit.value().is_string() || jit.value().is_number()) {
      cout << "Value: " << jit.value() << endl;
    } else if (jit.value().is_object() || jit.value().is_array()) {
      cout << "Size: " << jit.value().size() << endl;
    } else {
      cout << "Unknown value type.\n";
    }
  }
  return 0;
}

I use a const_iterator because the loop only reads the JSON. You can see above that the iterator has a key() method to get the key, which will be a string, and a value() method to get the val, which is itself JSON, which can be a string, number, array or object. Here are some examples:

UNIX> echo 1 | bin/ex2_print_keys                              # Although 1 is valid JSON, it is not a key/value object.
Not a JSON object: 1
UNIX> echo '{"Jim":"Plank", "Number":5}'
{"Jim":"Plank", "Number":5}
UNIX> echo '{"Jim":"Plank", "Number":5}' | bin/ex2_print_keys  # Interestingly, the key is a string, but the val is JSON, 
Key: Jim        Value: "Plank"                                 # hence why you get the quotes when it prints "Plank".
Key: Number     Value: 5
UNIX> cat txt/i2.txt
{ "Jim" : { "Last": "Plank", 
            "Age": "Infinity",
            "Children": [ "Katie", "Emily", "Ellen", "Jeff" ] },
  "Brad" : { "Last": "Vander Zanden",
             "Age": 55,
            "Children": []}}
UNIX> cat txt/i2.txt | bin/ex2_print_keys                      # The keys are sorted when you enumerate them.
Key: Brad       Size: 3
Key: Jim        Size: 3
UNIX> bin/ex0_create_n_print < txt/i1.txt
{"Fred":-55.55,"Jim":"Plank","Luther":[1.0,"Two","Buckle","My","Shoe"]}
UNIX> bin/ex0_create_n_print < txt/i1.txt | bin/ex2_print_keys
Key: Fred       Value: -55.55
Key: Jim        Value: "Plank"
Key: Luther     Size: 5
UNIX> 

Setting JSON constants in your program

There are times when you'll simply want to set JSON in your program. The program src/ex3_static_assignment.cpp shows you how to do this -- it's a good example to have in your pocket.

/* This program shows how you can assign a json object (or array) statically in your program. */

#include "nlohmann/json.hpp"
#include <iostream>
using namespace std;
using nlohmann::json;

static json j = {
         { "first", "Tiger" },
         { "last", "Woods" },
         { "wins", 82 },
         { "majors", 15 },
         { "masters-wins", { 1997, 2001, 2002, 2005, 2019 } },   // These are arrays.
         { "pga-wins", { 1999, 2000, 2006, 2007 } },
         { "us-open-wins", { 2000, 2002, 2008 } },
         { "british-open-wins", { 2000, 2005, 2006} } };


int main()
{
  cout << j.dump(2) << endl;
}

Of course, when you print it, the keys will be sorted:

UNIX> bin/ex3_static_assignment 
{
  "british-open-wins": [
    2000,
    2005,
    2006
  ],
  "first": "Tiger",
  "last": "Woods",
  "majors": 15,
  "masters-wins": [
    1997,
    2001,
    2002,
    2005,
    2019
  ],
  "pga-wins": [
    1999,
    2000,
    2006,
    2007
  ],
  "us-open-wins": [
    2000,
    2002,
    2008
  ],
  "wins": 82
}
UNIX> 


Other useful things

There are examples in https://github.com/nlohmann/json (scroll down or search for "Examples").

A Practice Program to Write

Here's an assignment to give you practice. Every week, my family competes on DraftKings in a fantasy golf game. If there is a PGA Tour tournament that week, then there is a game on DraftKings, where you draft a 6-golfer team within a $50,000 salary. At the end of the tournament, each golfer has a score, and whoever's team scores the highest wins.

Before the tournament, DraftKings lets you download a CSV file containing the salaries, and after the tournament starts, you can download a CSV file containing each player's score. I've converted these to JSON and put the files into the salaries and standings directories. Take a look at the files for the Heritage tournament (played on Hilton Head Island) in 2020:

UNIX> ls standings/*2020-*Heritage*
standings/2020-06-21-Heritage.txt
UNIX> ls salaries/*2020-*Heritage*
salaries/2020-06-21-Heritage.txt
UNIX> bin/ex2_print_keys < standings/2020-06-21-Heritage.txt
Key: golfers    Size: 154
Key: tournament Value: "2020-06-21-Heritage"
Key: value      Value: "score"
UNIX> bin/ex1_read_and_dump < standings/2020-06-21-Heritage.txt | head
{
  "golfers": {
    "Aaron Baddeley": 33,
    "Aaron Wise": 32.5,
    "Abraham Ancer": 125.5,
    "Adam Hadwin": 88,
    "Adam Long": 29,
    "Adam Schenk": 27.5,
    "Alex Noren": 104,
    "Andrew Landry": 84,
UNIX> bin/ex1_read_and_dump < salaries/2020-06-21-Heritage.txt | head
{
  "golfers": {
    "Aaron Baddeley": 6400,
    "Aaron Wise": 6400,
    "Abraham Ancer": 8000,
    "Adam Hadwin": 7300,
    "Adam Long": 6300,
    "Adam Schenk": 6500,
    "Alex Noren": 7000,
    "Andrew Landry": 6800,
UNIX> 
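Each file contains JSON with the following keys:
  1. "golfers": an object whose keys are golfer names, and whose vals are the golfers' salaries or scores.
  2. "tournament": the name of the tournament.
  3. "value": either "salary" or "score", which tells you which of the two the file holds.

Unfortunately, it's not the case that every golfer plays every tournament, and worse yet, the golfers in the salary file for a tournament don't always match the golfers in the score file. Such is life.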

Now, take a look at src/read_tournaments.cpp:

#include "nlohmann/json.hpp"
#include <iostream>
#include <vector>
#include <sstream>
using namespace std;
using nlohmann::json;

int main()
{
  json js;
  json overall;
  string v, t;
  json g;

  overall["salary"] = json::object();
  overall["score"] = json::object();

  try {
    while (true) {
      cin >> js;
      v = js["value"];
      t = js["tournament"];
      g = js["golfers"];
      overall[v][t] = g;
    }
  } catch (const json::exception &e) {
  }

  cout << overall << endl;
  return 0;
}

This lets you cat files into standard input, and it will create a new JSON which has the keys "salary" and "score". The vals for these are JSON objects whose keys are tournament names, and whose vals are the same objects for the golfers that are in the file. At the end, it prints the JSON. Let's run this on the Heritage files:

UNIX> cat s*/2020*Heritage* | bin/read_tournaments | bin/ex1_read_and_dump 
{
  "salary": {
    "2020-06-21-Heritage": {
      "Aaron Baddeley": 6400,
      "Aaron Wise": 6400,
      "Abraham Ancer": 8000,
      "Adam Hadwin": 7300,
      "Adam Long": 6300,
      "Adam Schenk": 6500,
                                   # Skip lines...
      "Xander Schauffele": 10200,
      "Zach Johnson": 7000
    }
  },
  "score": {
    "2020-06-21-Heritage": {
      "Aaron Baddeley": 33,
      "Aaron Wise": 32.5,
      "Abraham Ancer": 125.5,
      "Adam Hadwin": 88,
      "Adam Long": 29,
      "Adam Schenk": 27.5,
                                   # ...
      "Xander Schauffele": 79,
      "Zach Johnson": 27.5
    }
  }
}
UNIX>
Or, let's run it on all of the tournaments played in September, 2019:
UNIX> cat s*/2019-09* | bin/read_tournaments | bin/ex1_read_and_dump
{
  "salary": {
    "2019-09-15-Greenbrier": {
      "Adam Long": 6300,
                              # Skip lines...
      "Zack Sucher": 6100
    },
    "2019-09-22-Sanderson": {
      "Aaron Wise": 8500,
                              # ...
      "Zack Sucher": 6600
    },
    "2019-09-29-Safeway": {
      "Aaron Baddeley": 6700,
                              # ...
      "Zac Blair": 6900
    }
  },
  "score": {
    "2019-09-15-Greenbrier": {
      "Adam Long": 94.5,
                              # ...
      "Zack Sucher": 80
    },
    "2019-09-22-Sanderson": {
      "Aaron Wise": 78.5,
                              # ...
      "Zack Sucher": 84.5
    },
    "2019-09-29-Safeway": {
      "Aaron Baddeley": 67.5,
                              # ...
      "Zac Blair": 101
    }
  }
}
UNIX>
Here's what I want you to try to do -- I've copied src/read_tournaments.cpp to src/values.cpp. Augment it to do the following:

My code to do this is only 21 lines -- the nlohmann JSON library is really powerful.

You can test yourself by seeing if your code matches mine in the following use cases:

# The 10 best value golfers in the Heritage (Webb Simpson won it, but he was expensive):

UNIX> cat s*/2020*Heritage* | bin/values | sort -nr | head -n 10
16.2698  1 Carlos Ortiz
16.1290  1 Michael Thompson
16.1111  1 Webb Simpson
16.0135  1 Joaquin Niemann
15.6875  1 Abraham Ancer
15.4110  1 Sergio Garcia
15.1493  1 Dylan Frittelli
14.9167  1 Jim Herman
14.8571  1 Alex Noren
14.7656  1 Doc Redman

# The 10 best value golfers in September, 2019:

UNIX> cat s*/*2019-09* | bin/values | sort -nr | head -n 10
14.0000  1 Garrett Osborn
13.8052  3 Adam Long
13.4171  2 Richy Werenski
13.0688  3 Harris English
13.0213  3 Sebastian Munoz
12.9589  2 Zack Sucher
12.6667  1 Tommy Gainey
12.4118  1 Charles Howell III
12.2222  1 Stewart Cink
12.2204  3 Cameron Percy

# The 10 best value golfers who played at least 10 tournaments in 2020:

UNIX> cat standings/2020* salaries/2020* | bin/values | sort -nr | grep ' [0-9][0-9] ' | head
 9.8953 17 Cameron Davis
 9.5444 10 Martin Laird
 9.3741 16 Daniel Berger
 9.1618 22 Joel Dahmen
 9.1290 19 Ryan Palmer
 9.1120 24 Sepp Straka
 9.0836 10 Anirban Lahiri
 8.9926 17 Richy Werenski
 8.9419 20 Andrew Landry
 8.9333 20 Sam Burns
UNIX> 
# Where's Tiger Woods?

UNIX> cat standings/* salaries/* | bin/values | sort -nr | grep Tiger
 6.8938  9 Tiger Woods
UNIX>