Paul Gardner said, “A painting is never finished. It simply stops in interesting places.”
This is 100% true of code. The finish is not the most important thing – when we seek to learn by looking at code, the steps along the way say much more than the end-product alone.
So, as I created a self-contained piece of Apex for a project, I thought that I would git-commit every step. I don’t claim that this code is perfect, but I hope that, by showing my working, it might stimulate thought/discussion. It certainly made me think about the reasons for each step, from draft to finish, more than I otherwise would have.
The Problem
We query data from an external service (YouTube), and receive a large JSON response. We only want one value from it, so can we write a JsonReader class to simplify getting that value?
The usage should look like this:
First Draft
I started by just throwing something together.
We’re going to need to split the path based on dots. I’m not yet sure how I’m going to deal with square brackets for arrays, but let’s get something down and some tests passing – we can worry about arrays later.
Once I’ve split the path up by the dots, I plan to use recursion to traverse the path expression and the JSON data at the same time. If we get to the end of the path expression, then we know we’re done and can return the current value from the JSON data.
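As a rough sketch of that first-draft idea, here it is in Java rather than Apex (the two are close enough for the shape to carry over). It assumes the JSON has already been parsed into nested maps; the class name `FirstDraftReader` and the sample keys are made up for illustration and are not taken from the repository.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Hypothetical first draft: dot-separated keys only, no array support yet.
final class FirstDraftReader {
    private final Map<String, Object> data;

    FirstDraftReader(Map<String, Object> data) {
        this.data = data;
    }

    // Split the path on dots, then walk the path and the data together.
    Object read(String path) {
        return read(Arrays.asList(path.split("\\.")), 0, data);
    }

    private Object read(List<String> keys, int position, Object current) {
        if (position == keys.size()) {
            return current; // end of the path expression: this is the value
        }
        if (!(current instanceof Map)) {
            return null; // the path goes deeper than the data does
        }
        Object next = ((Map<?, ?>) current).get(keys.get(position));
        return read(keys, position + 1, next);
    }
}
```

With a map such as `{statistics={viewCount=42}}`, calling `read("statistics.viewCount")` walks one level per key and returns the innermost value.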
The test case is simply the value from one object.
My Test Doesn’t Work
Oh dear, trying to be clever and use String.format() to put values into the JSON expression in the test doesn’t work because it gets confused by the other curly braces.
Changed it to old-school string concatenation, and the test passes!
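To illustrate the clash, here is a Java analogue. It assumes that Apex’s String.format() uses {0}-style placeholders in the same way as Java’s `MessageFormat`, which is my understanding but is not shown in the post. The class and method names are hypothetical.

```java
import java.text.MessageFormat;

// Hypothetical helper showing two ways to build JSON test data.
final class JsonTestData {
    // Brace-based templating: the JSON object's own braces are read as
    // the start of a placeholder, so this pattern is rejected.
    static String viaFormat(String value) {
        return MessageFormat.format("{\"viewCount\": \"{0}\"}", value);
    }

    // Plain concatenation has no special characters to trip over.
    static String viaConcatenation(String value) {
        return "{\"viewCount\": \"" + value + "\"}";
    }
}
```

In Java, `viaFormat` throws an `IllegalArgumentException` because everything between the outer braces is treated as an argument index, while `viaConcatenation` simply builds the string.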
Enter the Regex
Then, I went and fiddled around on RegexPlanet until I had a good regex. The expression picks out either a key in the JSON path, or an array reference each time I do find(). Using the test string ‘a.a2[1].b[0][0]’, I came up with the regex (\[\d+\])|([\w]+) i.e. either some digits in square brackets or some continuous word characters.
I’m going to use Apex regex classes to call find() repeatedly on the given path expression. Matcher.find() is acting a bit like an iterator, but is not an iterator. Notably, find() both tests for more matches, and advances through the string. We would not expect an iterator’s hasNext() to mutate the state of the iterator.
So, I took a side-track to wrap this Matcher functionality into a new class, RegexFindIterator. I made it a private inner class until I could decide what to do with it.
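A minimal sketch of such a wrapper, written in Java (whose `Pattern` and `Matcher` classes behave much like the Apex ones). This is not the code from the repository: the field names and the look-ahead approach are my own assumptions about one reasonable way to do it.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Wraps Matcher.find() so that hasNext() can be called repeatedly
// without skipping matches.
final class RegexFindIterator implements Iterator<String> {
    private final Matcher matcher;
    private String pending;    // a match found by hasNext() but not yet returned
    private boolean exhausted; // true once find() has returned false

    RegexFindIterator(Pattern pattern, String input) {
        this.matcher = pattern.matcher(input);
    }

    @Override
    public boolean hasNext() {
        // find() advances the matcher, so call it at most once per element
        if (pending == null && !exhausted) {
            if (matcher.find()) {
                pending = matcher.group();
            } else {
                exhausted = true;
            }
        }
        return pending != null;
    }

    @Override
    public String next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        String result = pending;
        pending = null;
        return result;
    }
}
```

Given a pattern matching either digits in square brackets or a run of word characters, iterating over ‘a.a2[1].b[0][0]’ yields a, a2, [1], b, [0], [0] in turn.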
Work the Regex
I joined up the new RegexFindIterator class to the read() method.
At this point, I realised that the search path pattern is a constant, so declared it and renamed it as such.
Finally, I added a new test for nested keys to make use of the regex approach.
Test Failure
Literally, a failure in my test. It was using invalid JSON – an unterminated string literal. I fixed that, and both tests are passing.
No code sample here as the change was so simple. But this does show that I really did commit every change, even the mis-steps.
RegexFindIterator has earned a promotion
RegexFindIterator seems like it can stand on its own, so I made it into a top-level class.
And then onto some other considerations…
We will need some code to deal with array indices, so I added a test to find them, but I left the implementation blank for now.
We will also need to deal with missing keys. The reader should be able to return something other than null if a key is absent, so I put a placeholder in for that, and we’ll add a way for users of the class to set it later.
Naming Matters
I renamed missingKeyValue to missingKeyResult. Constantly renaming things so that you are clear about what everything is for is really important. Sometimes, the correct name only becomes clear long after you’ve first had to give some name to a variable. With a good IDE, refactoring a name is simple, so do it.
No code sample, but you can see the commit in GitHub.
[List]
I took on handling lists:
If the current key has square brackets in it, trim the first and last characters, and parse the rest as an integer. Then get either the corresponding item from the list, or missingKeyResult if it’s out of bounds.
In doing this, I realised that I can pull nextData out and set it in either branch of testing what sort of key we have.
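The branching described above might look something like this in Java. It is a sketch under assumptions: `KeyLookup` and `lookup` are hypothetical names, and passing missingKeyResult as a parameter is just a way to keep the example self-contained.

```java
import java.util.List;
import java.util.Map;

// Hypothetical helper: resolve one path segment against the current data.
final class KeyLookup {
    static Object lookup(Object data, String key, Object missingKeyResult) {
        Object nextData; // declared once, set in whichever branch applies
        if (key.startsWith("[") && key.endsWith("]")) {
            // Array reference: strip the brackets and parse the index.
            int index = Integer.parseInt(key.substring(1, key.length() - 1));
            List<?> list = (List<?>) data;
            nextData = (index >= 0 && index < list.size())
                    ? list.get(index)
                    : missingKeyResult;
        } else {
            // Plain key: look it up in the map.
            Map<?, ?> map = (Map<?, ?>) data;
            nextData = map.containsKey(key) ? map.get(key) : missingKeyResult;
        }
        return nextData;
    }
}
```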
And here’s a test for lists:
Wrong again!
Again, the JSON in the tests is malformed. So, I finally changed the way I wrote JSON to make it less error prone. Test code is just as important as the main code, so it deserves attention, too.
Invalid Integer
Now the test fails with System.TypeException: Invalid integer
I bet it’s an off-by-one error where I was trying to remove the square brackets. So, I logged out the key I’m trying to parse, and the result after trying to remove the square brackets:
From this, it’s clear that it was an off-by-one error, so the fix is simple:
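The original log output and fix are not reproduced here. As a purely hypothetical illustration of this class of bug (not necessarily the one in the commit), consider an end index that is one too small when trimming the brackets, shown in Java, where substring’s end index is exclusive:

```java
// Hypothetical illustration of an off-by-one when trimming "[1]" to "1".
final class BracketTrim {
    // Buggy: the end index is one too small, so "[1]" becomes ""
    // and parsing the result as an integer fails.
    static String buggyTrim(String key) {
        return key.substring(1, key.length() - 2);
    }

    // Fixed: the end index is exclusive, so length - 1 drops only the "]".
    static String fixedTrim(String key) {
        return key.substring(1, key.length() - 1);
    }
}
```

Note that a bug like this only fails loudly for single-digit indices. With ‘[12]’ the buggy version would quietly return ‘1’, which is a good argument for logging the intermediate value.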
More Tests
It looks like the code is largely working, so I added some tests to stretch it: Two dimensional arrays, and a good mix of arrays and keys.
All of the tests passed first time!
Deeply Nested Brackets Make Me (and PMD) Sad
I don’t much like the nested if statements in the recursive read() function, so I split them out into separate functions for List type and Map types.
And I added an extra constructor in case someone has already parsed the JSON into a Map/List.
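Here is roughly what that shape could look like, sketched in Java. Java has no built-in JSON parser, so this version only has the constructor that takes already-parsed data; in Apex, the string constructor would presumably parse first and then delegate. The class name `JsonReaderSketch` and the method bodies are assumptions, not the repository code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class JsonReaderSketch {
    private static final Pattern SEARCH_PATH_PATTERN =
            Pattern.compile("(\\[\\d+\\])|([\\w]+)");

    private final Object data;                     // already-parsed Map/List structure
    private final Object missingKeyResult = null;  // fixed here; configurable later

    JsonReaderSketch(Object parsedData) {
        this.data = parsedData;
    }

    Object read(String path) {
        List<String> keys = new ArrayList<>();
        Matcher matcher = SEARCH_PATH_PATTERN.matcher(path);
        while (matcher.find()) {
            keys.add(matcher.group());
        }
        return read(keys, 0, data);
    }

    // Traversal only: pick the helper for this segment, then recurse.
    private Object read(List<String> keys, int position, Object current) {
        if (position == keys.size()) {
            return current;
        }
        String key = keys.get(position);
        Object next = key.startsWith("[")
                ? readFromList(current, key)
                : readFromMap(current, key);
        return read(keys, position + 1, next);
    }

    private Object readFromMap(Object current, String key) {
        if (current instanceof Map && ((Map<?, ?>) current).containsKey(key)) {
            return ((Map<?, ?>) current).get(key);
        }
        return missingKeyResult;
    }

    private Object readFromList(Object current, String key) {
        int index = Integer.parseInt(key.substring(1, key.length() - 1));
        if (current instanceof List && index < ((List<?>) current).size()) {
            return ((List<?>) current).get(index);
        }
        return missingKeyResult;
    }
}
```

Each method now has a single if at most, and a path such as ‘a[1][1].b’ works across two-dimensional arrays and nested keys alike.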
Tests still passing!
Comments
In general, I don’t like to comment what my code is doing. I prefer to just comment circumstances where the code isn’t doing the obvious thing e.g. working around a platform bug, or dealing with an unintuitive requirement.
However, I do like to have a brief summary at the top of classes that deserve it. Now we know what JsonReader is for, I can write a decent description:
Lost your keys?
I was going to give an option of returning a specified value instead of null when a key is not found. The reason for this is that it can help us to avoid null checks later on. When you use this class, you might know a reasonable value to use instead of null. Maybe a default string, or an empty list?
I added this in a fluent style, with a test, so that you can set it as you construct the JsonReader:
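The fluent idea, sketched in Java: the setter returns the reader itself, so it can be chained straight off the constructor. The names `FluentReader` and `setMissingKeyResult` are hypothetical, and the lookup is cut down to a single key to keep the focus on the setter.

```java
import java.util.Map;

final class FluentReader {
    private final Map<String, Object> data;
    private Object missingKeyResult = null; // default: same behaviour as before

    FluentReader(Map<String, Object> data) {
        this.data = data;
    }

    // Returning this lets the call chain straight off the constructor.
    FluentReader setMissingKeyResult(Object missingKeyResult) {
        this.missingKeyResult = missingKeyResult;
        return this;
    }

    // Single-key lookup only, to keep the example short.
    Object read(String key) {
        return data.containsKey(key) ? data.get(key) : missingKeyResult;
    }
}
```

A caller can then write `new FluentReader(data).setMissingKeyResult("n/a").read("title")` and skip the null check afterwards.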
Conclusions and Design Principles
That’s it! A functioning little utility, with a nice interface. Despite the extra thought required to create it, this code is much easier to understand than parsing the JSON untyped and then using a series of get calls. Even if it’s never used anywhere else, it makes this corner of the project a nicer place to be.
Although this blog post is long, the actual coding took only a couple of hours, so the cost is small.
A few of the guiding principles behind some of this decision making are:
- “When faced with a hard change, first make it easy (warning, this may be hard)” (Kent Beck) – For example, the regex libraries in Apex are very long-winded and multi-purpose so I wrapped them in RegexFindIterator to simplify them for my use-case. Then, splitting the path expression was easy.
- Draft Zero – Your first draft of a solution doesn’t have to solve the whole problem. It has to advance your understanding of the problem, and be clear/clean enough that you can build/refactor it towards the complete solution.
- Test code is important – We can relax performance constraints in tests (e.g. I sometimes do SOQL in loops in test code), but we should still strive for cleanliness and clarity.
- Keep the same level of abstraction in each method – Moving readFromMap() and readFromList() out of read() did this. The read() method is concerned with traversing the data and returning results. The readFrom*() methods are concerned with the lower-level task of looking inside each piece of data. Similarly, RegexFindIterator is concerned with the details of how to move through the JSON path string, so that read() can work purely at the level of traversal.
As you may have gathered, the GitHub repository is here: Apex Json Reader. It has an MIT licence, so you are free to grab it and use it in your own projects. And if you have any questions, remember to get in touch.