Day 1 - 3
For the first few days of my project I focused on getting my accounts set up and creating a development environment. I got my email login information and started working with Microsoft Azure to get a virtual machine up and running. After looking through the available services, it became clear that the free trial I had signed up for, which came with a $200 credit, was not going to last for my entire internship. My mentor Dan stepped in and obtained a much more suitable VM that I could use for the duration of the project.
Once the VM was running and I could connect to it remotely, I forked the HPCC-Platform repo on GitHub and cloned it onto the VM. I then started researching MongoDB types and the C++ driver. It is very important that I handle the type conversions correctly, because the types that are native to ECL are not necessarily the same types that MongoDB works with.
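To make the type conversion concern concrete, here is a minimal sketch of the kind of lookup I expect to need. The `BsonKind` enum and the `eclTypeFor` function are hypothetical names made up for illustration, and the pairings are my working assumption rather than the plugin's final table. The real driver has its own `bsoncxx::type` enumeration, which this sketch does not use.

```cpp
#include <cassert>
#include <string>

// Hypothetical subset of BSON value kinds, invented for this sketch.
enum class BsonKind { Double, Utf8String, Int32, Int64, Boolean, Binary };

// Returns the name of an ECL type that could hold a value of the given kind.
// The pairing is an assumption for illustration only.
inline std::string eclTypeFor(BsonKind kind)
{
    switch (kind)
    {
    case BsonKind::Double:     return "REAL8";    // 64-bit floating point
    case BsonKind::Utf8String: return "UTF8";     // BSON strings are UTF-8
    case BsonKind::Int32:      return "INTEGER4"; // 4-byte signed integer
    case BsonKind::Int64:      return "INTEGER8"; // 8-byte signed integer
    case BsonKind::Boolean:    return "BOOLEAN";
    case BsonKind::Binary:     return "DATA";     // raw bytes
    }
    // Reached only if an out-of-range value was cast to BsonKind.
    assert(false && "unhandled BsonKind");
    return "UNKNOWN";
}
```

Calling `eclTypeFor(BsonKind::Int64)` gives back `"INTEGER8"`. Keeping the mapping in one function means there is a single place to revisit when a pairing turns out to be wrong.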
To check that the development environment was set up correctly, I downloaded and installed HPCC Systems on the VM. I used the start command from the wiki page on building HPCC to bring the system up, and once I had verified that I could see it in ECL Watch, I stopped it again.
Day 4 - 5
My next goal was to get the MongoDB C++ driver installed on the VM. My research had shown that I would need it before I could start developing. I followed a tutorial for installing it on a Linux machine. The C++ driver is built on top of the C driver for MongoDB, so the first step was to install that. I used the apt package manager to install it, and the instructions were detailed and fairly straightforward. The one thing I had to account for was that the C++ driver uses C++17 features and HPCC Systems is currently on C++11. To build it without the C++17 features, I had to pass this additional flag, which selects a polyfill library in their place.
-DBSONCXX_POLY_USE_MNMLSTC=1
To configure the driver with that flag I used this command with the paths pointing to the drivers that were downloaded.
Then, once the C++ driver was installed, I started looking into the ECL code from similar plugins to see how to start my own.
My plugin is for MongoDB, and the existing plugin that works most similarly to how mine will work is the Couchbase plugin. I have been studying its code and trying to figure out how the classes fit together and which functions I will need to override for my plugin. The HPCC platform will use the Thor, Roxie, or hThor clusters to call my code. I will need to create subclasses for MongoDB that give the clusters access to my functions, which will communicate directly with MongoDB servers. My goal before I really get into the coding is to understand the underlying infrastructure that the Couchbase plugin was built on. This is difficult because the calls to the functions I will be writing are made from the HPCC Systems engine itself and not from within the code I am writing. I did make noticeable progress in my understanding, which is encouraging because this task will be very important to the rest of the plugin's development.
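The part that took me longest to absorb is that the engine only ever sees an interface and my subclass gets reached through it. Here is a small sketch of that idea. Every name in it (`IExampleContext`, `MongoExampleContext`, `engineExecute`) is hypothetical and made up for illustration. The real interfaces live in the HPCC-Platform headers and are much larger.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for an engine-side interface.
class IExampleContext
{
public:
    virtual ~IExampleContext() {}
    virtual std::string runQuery(const std::string &query) = 0;
};

// Hypothetical plugin-side subclass. A real implementation would forward the
// query to a MongoDB server; this one only reports that it was reached.
class MongoExampleContext : public IExampleContext
{
public:
    std::string runQuery(const std::string &query) override
    {
        return "mongodb handled: " + query;
    }
};

// Stand-in for the engine: it knows only the interface, never the subclass.
std::string engineExecute(IExampleContext *ctx, const std::string &query)
{
    assert(ctx != nullptr);
    return ctx->runQuery(query); // virtual dispatch lands in the plugin code
}
```

Passing a `MongoExampleContext` to `engineExecute` runs the overridden `runQuery` even though `engineExecute` never mentions MongoDB. That is why none of the calls into my functions will show up in the code I write.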
On the final day of the week I was finally able to start pulling code from other plugins and adapting it for my own use. This was difficult because the large majority of the code that my plugin needs in order to work correctly is still a bit out of my reach. I started by refactoring the code whose purpose I could understand, and there was still a lot of that to work through. My goal was simply to pick a spot and start, so that I could keep my progress steady. I started with the EmbedFunctionContext class because that is the main class used by the ECL engine to make calls to plugins. It uses RowBuilder to return results and I am not that far yet, so I ignored that and just focused on EmbedFunctionContext. My goal, as my mentor suggested, is to get that class working and hardcode the database connection into it at first, to make sure that the ECL engine is calling my plugin correctly. This is a daunting task because I have very little prior experience with how the ECL engine functions, but Dan has a wide breadth of knowledge on the matter that I am slowly learning from. It has been difficult for me to understand how the engine uses these classes to make calls, but after looking over the Couchbase plugin many times I am able to trace the code more and more.
Today I feel as though I made some decent progress. To be able to build the plugin I have to learn some CMake, and I spent a large portion of my day going over tutorials and reading through the CMake files in the Couchbase and Kafka plugins. I wanted to see some simple examples to understand the basics before I dove into the full CMakeLists.txt files used by the HPCC platform. I found a lot of helpful tutorials that made reading through the real files much simpler, and I was able to understand a lot of what was going on. So, not only did I finally get a small grasp of the underlying code that the engine uses to make calls to plugins, but I also learned a lot about CMake and how it is used, so that I can start compiling and testing my plugin. Once I can build my plugin correctly with CMake, the rest of this process will be much more attainable and I will feel better about my progress.
Of the code that I was able to reuse, most went into the header and the main C++ file, mongodbembed.cpp, and the rest went into starting the CMakeLists.txt and ecllib files. Most of the code that I have added so far will need to change down the road, but I wanted a decent starting point to build from. I found the HPCC-Platform eclhelper and eclrtl files helpful because they contain all of the signatures for the interfaces that I will need to implement in my code. Seeing them in one place instead of spread around the Couchbase plugin was clarifying and helped my understanding move forward. I made good progress on setting up the code that will connect to the database and on registering a user on the MongoDB database so that it can receive calls from my code. My problems will come from configuring EmbedFunctionContext to be called correctly. Once I have this class set up I will be able to implement the other interfaces that the plugins use, which will make executing queries much smoother. I am really focusing on understanding how the Couchbase plugin uses the interfaces that are set up for the HPCC engine to make calls, and I will work on that much more next week.