Week 10

delvja01
Aug 5, 2022

After getting a parser for BSON documents working, it was time to start setting up the function calls that build these documents and pipelines. There are a lot of functions that I want the plugin to support. The most important are insert, find, update, delete, and aggregate. Because insert, find, update, and delete all take similar parameters, their implementations will be very similar. With the buildDocument and buildPipeline functions working, I spent some time refactoring my code to make it a little cleaner before moving on to adding a bunch of new functions.

I started working in the execute function and added the insert, find, and aggregate methods first. I wasn't too worried about the update and delete methods because I knew they were very similar to find. Getting the aggregate function working and testing buildPipeline was a little tricky, but after some tweaks I was able to run some aggregations. Now that find and aggregate were returning real results, it was time to focus on the output of the functions defined in the ECL code. I spent a lot of time writing example ECL code to test queries against the MongoDB databases and found a lot of bugs, most of them minor parsing errors that were quick to fix.

I had some trouble getting subarrays from MongoDB to be returned, and it turned out to be my inexperience with ECL. To get a subarray returned, all you have to do is declare it as a SET OF. Once that worked, it felt good to have a pretty wide range of supported datatypes. After that I started creating more examples to test the plugin and to regression test every feature that I had added previously. While creating test cases for the datatypes and functions, I noticed that some datatypes came back blank every time. This was concerning because it wasn't just a type conversion error caused by me casting to the wrong type. After some deeper digging and stepping through the result rows in the debugger to locate the problem, I noticed that MongoDB was doing some additional encoding of its datatypes.

After some research I found out this is called Extended JSON (EJSON), and it is how MongoDB outputs its results in order to preserve type information that normal JSON does not always keep. Since the programmer defines the return types in ECL, the plugin does not need type information from both MongoDB and ECL, so I needed a way to remove the extra encoding. I tried really hard to find examples of someone doing something similar and there were none. I looked for a long time and could not find anybody trying to deserialize EJSON in C++. That meant I had to create my own method for deserializing it. I wanted something quick and fairly lightweight, since there will be thousands of documents running through this. I settled on scanning for '$' characters and adding everything before each one to a string, then grabbing the data it was encapsulating and adding that to the string as well. Then I find the index immediately after the encoding and set a cursor there for the next addition. Overall this seems to be fairly fast and I can't really see a much quicker way of doing it. During testing there were many bugs that caused it to stop working. Eventually I got everything ironed out, and now it seems to be running smoothly. I can get double and decimal datatypes to return now, which is great.
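
The scan described above could be sketched roughly as follows. This is not the plugin's actual code. The names stripExtendedJson and matchWrapper are made up for illustration, and the sketch assumes canonical-style wrappers with a single key and a quoted payload, such as {"$numberDouble": "3.14"}, with no escaped quotes inside the payload. Keys starting with "number" are emitted as bare numbers, and any other wrapper keeps its payload as a quoted string.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical description of one wrapper like {"$numberDouble": "3.14"}.
struct Wrapper
{
    std::size_t open = 0;   // index of the wrapper's '{'
    std::size_t close = 0;  // index of the wrapper's '}'
    std::string key;        // e.g. "numberDouble" (without the '$')
    std::string value;      // e.g. "3.14" (without the quotes)
};

static std::size_t skipSpaces(const std::string &s, std::size_t pos)
{
    while (pos < s.size() && std::isspace(static_cast<unsigned char>(s[pos])))
        ++pos;
    return pos;
}

// Returns true if the '$' at index `dollar` starts a single-key wrapper
// whose value is a quoted string. Assumes no escaped quotes in the payload.
static bool matchWrapper(const std::string &in, std::size_t dollar, Wrapper &w)
{
    if (dollar == 0 || in[dollar - 1] != '"')
        return false;
    std::size_t open = dollar - 1;  // the key's opening quote
    while (open > 0 && std::isspace(static_cast<unsigned char>(in[open - 1])))
        --open;
    if (open == 0 || in[open - 1] != '{')
        return false;               // '$' is not the first key of an object
    w.open = open - 1;

    std::size_t keyEnd = in.find('"', dollar);
    if (keyEnd == std::string::npos)
        return false;
    w.key = in.substr(dollar + 1, keyEnd - dollar - 1);

    std::size_t pos = skipSpaces(in, keyEnd + 1);
    if (pos >= in.size() || in[pos] != ':')
        return false;
    pos = skipSpaces(in, pos + 1);
    if (pos >= in.size() || in[pos] != '"')
        return false;               // payload is not a quoted string
    std::size_t valueEnd = in.find('"', pos + 1);
    if (valueEnd == std::string::npos)
        return false;
    w.value = in.substr(pos + 1, valueEnd - pos - 1);

    w.close = skipSpaces(in, valueEnd + 1);
    return w.close < in.size() && in[w.close] == '}';
}

// Copies `in` to the result, replacing each matched wrapper with its payload.
std::string stripExtendedJson(const std::string &in)
{
    std::string out;
    out.reserve(in.size());
    std::size_t cursor = 0;  // first character not yet copied to `out`
    std::size_t search = 0;  // where to look for the next '$'
    std::size_t dollar;
    while ((dollar = in.find('$', search)) != std::string::npos)
    {
        Wrapper w;
        if (!matchWrapper(in, dollar, w))
        {
            search = dollar + 1;  // an ordinary '$'; it is copied later as-is
            continue;
        }
        out.append(in, cursor, w.open - cursor);  // everything before the wrapper
        if (w.key.compare(0, 6, "number") == 0)
            out += w.value;                       // bare numeric literal
        else
            out += '"' + w.value + '"';           // keep other payloads quoted
        cursor = search = w.close + 1;            // resume right after the wrapper
    }
    out.append(in, cursor, std::string::npos);    // remaining tail
    return out;
}
```

With this sketch, an input of {"price": {"$numberDouble": "3.14"}, "_id": {"$oid": "abc123"}} becomes {"price": 3.14, "_id": "abc123"}. Wrappers whose payload is itself an object, such as a canonical $date, are only unwrapped at the innermost level here, so a real implementation would need extra handling for those.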
