Nock is an HTTP mocking library for Node. It allowed us to add HTTP mocking to our data collection unit tests well after the fact, significantly improving both the speed of the tests (crucial) and their reliability (you want to test your code, not your network connection). Since our tests are intense in both the number of HTTP requests they make and in the data that is sent up or arrives down the wire, writing the mocks manually was out of the question. That is why Nock has a recorder module that leverages HTTP interception to record what you should be checking in the first place. Originally the recording would generate code to paste into unit tests, but I added JSON output and the loading of Nock objects persisted that way.
With Memoirs of a Future Simulation gearing up for launch and some new services coming on board, I pushed a number of features and fixes that we needed (for example, Nock now reliably mocks binary HTTP requests, which was needed for Evernote's API). You can check the two latest rounds of pull requests here and here.
Among all this code, one fix stands out in particular: today I finally managed to fix an issue in the recorder module that was causing some unit tests to fail randomly, but only on the Travis-CI and CodeShip continuous integration services. You can see the whole sorry history of failed builds on Travis-CI here and my attempts at fixing it in the PR itself. I finally had to abandon the PR due to lack of time, which is where the issue stayed.
Then a couple of days ago a unit test run failed locally with the same issue. Bingo! Or not: I misdiagnosed it and thought that I had fixed it... until today, when a Travis-CI build failed again with the same issue. But after all this work on Nock I'm much more familiar with its internals, and with some debug traces and the much faster turnaround of CodeShip (it starts building immediately after the push, whereas Travis-CI takes about 10 minutes) I managed to trace the issue to another unit test and the johnny-come-lately responses to its HTTP requests. This unit test was finishing before its request was responded to, so the success or failure of other tests depended on the timing of the response:
- It could arrive before other tests ran, or just between tests, or during a test that wasn't written to check, say, the number of recorded HTTP requests. In those cases nothing visibly failed.
- It could arrive during a test that was particularly sensitive to extra recorded requests, in which case the test would fail.
- It could arrive after the process had finished, which must have been happening often as these were among the last tests to be run.
I solved the issue by adding an internal recording identifier with the sole purpose of identifying requests that were made during previous recordings. But once such a request was identified I couldn't just throw an exception, because the tests would still fail randomly. And I didn't want to "fix" the corrupting unit test, as it was showing what could really happen in the wild and was purposefully built that way. So in the end I decided to simply log an error message and skip recording the out-of-order response. This keeps the tests as they were, keeps all the existing user code as it is (though recording code is usually of a throwaway nature) and doesn't crash unit tests randomly.
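To make the idea concrete, here is a minimal sketch of a recording identifier guarding against late responses. It is not Nock's actual code: the `Recorder` class, its `track` method and the request strings are all hypothetical, and real HTTP traffic is replaced with callbacks invoked by hand so the ordering is easy to follow.

```javascript
// Hypothetical miniature recorder; names are illustrative, not Nock's internals.
class Recorder {
  constructor() {
    this.currentId = 0;  // bumped every time a new recording starts
    this.active = false;
    this.recorded = [];
    this.skipped = [];   // requests whose responses arrived too late
  }

  start() {
    this.currentId += 1;
    this.active = true;
    this.recorded = [];
  }

  stop() {
    this.active = false;
    return this.recorded;
  }

  // Called when a request is sent; returns the callback to run on response.
  track(request) {
    const idAtRequestTime = this.currentId; // remembered by the closure
    return (response) => {
      if (!this.active || idAtRequestTime !== this.currentId) {
        // The response belongs to an earlier recording: log and skip it
        // instead of throwing, so unrelated tests are not broken.
        this.skipped.push(request);
        console.error(`skipping out-of-order response for ${request}`);
        return;
      }
      this.recorded.push({ request, response });
    };
  }
}

const recorder = new Recorder();

recorder.start();
const lateResponse = recorder.track('GET /slow');
recorder.stop(); // the first "test" ends before its response arrives

recorder.start(); // the second "test" starts a fresh recording
const timelyResponse = recorder.track('GET /fast');
lateResponse('200 slow');   // the straggler shows up mid-recording
timelyResponse('200 fast');
const result = recorder.stop();

console.log(result.map((entry) => entry.request));
```

In this sketch the second recording contains only its own request, while the straggler is logged and set aside, which mirrors the log-and-skip behaviour described above.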
Another solution, which I might attempt at a later time, is to throw an exception not when an out-of-order response is detected but when a recording with outstanding responses is stopped. Or I could make stopping the recording wait for all the outstanding responses to be received, and time out if they aren't. But any way you look at it I would have to change the unit tests, which implies changing the user code as well.
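As a rough illustration of the first alternative, the sketch below throws when a recording is stopped while responses are still outstanding. Again, nothing here is Nock's API: `StrictRecorder`, `track` and the error message are assumptions made up for the example, and responses are delivered by calling the returned callback by hand.

```javascript
// Hypothetical sketch of "fail on stop"; not part of Nock.
class StrictRecorder {
  constructor() {
    this.outstanding = new Set(); // requests still waiting for a response
    this.recorded = [];
  }

  // Called when a request is sent; returns the callback to run on response.
  track(request) {
    this.outstanding.add(request);
    return (response) => {
      this.outstanding.delete(request);
      this.recorded.push({ request, response });
    };
  }

  // Refuses to stop while any tracked request has not been answered.
  stop() {
    if (this.outstanding.size > 0) {
      const pending = [...this.outstanding].join(', ');
      throw new Error(`recording stopped with outstanding responses: ${pending}`);
    }
    return this.recorded;
  }
}

const strict = new StrictRecorder();
const respond = strict.track('GET /slow');

let stopError = null;
try {
  strict.stop(); // too early: the response has not arrived yet
} catch (err) {
  stopError = err;
}
console.log(stopError.message);

respond('200 OK');
const recordedAfterResponse = strict.stop(); // now succeeds
```

The failure now surfaces deterministically in the test that leaked the request rather than in whichever test happens to be running when the response lands, but it also shows why existing tests and user code would have to change: every caller of `stop()` has to make sure its requests have completed first.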