Day 1
Today my goal was to try and fix some of the last problems I was having with the install and get the build setup to start running ECL code in order to test the plugin functionality and finally start building it. I was successful in reaching that goal and that is partly because it was very attainable and not too difficult as I already had an idea of what the problem was. I think that a big part of what is keeping me from getting too stressed and from feeling defeated is setting attainable goals for myself. As long as I am able to be further than I was at the beginning of the day I count that as a win. So today I was able to get the ecllib file for the MongoDB plugin to show up in the install directory. I don't know if this was necessarily what fixed the issue, but it was definitely something that I needed to do eventually to avoid other issues. In the plugins directory there was a CMakeLists.txt file and it contained some commands for adding subdirectories for each of the plugins, so I added MongoDB to that list. Then what I did was rebuild the HPCC-Platform because Dan thought the issue might just be a timing problem and he noticed that the last time I built the platform was way before any of my changes to the MongoDB plugin. Once I rebuilt that it all seemed to be fixed and I was able to move on to the next part of my project.
There are a few different ways of executing ECL code on a cluster, and one of them is the playground in ECL Watch. This was the first thing that I tried, just to make sure that the cluster with my plugin installed wasn't broken in the simplest of ways. Once I ran the sample query that HPCC Systems provides and confirmed the output was correct, I started setting up VS Code to connect to the VM and run code through that. The playground is nice because it is easy to use, but unfortunately it will not save your code. It took me a while to get everything set up so that I could query from VS Code. The installation itself was really easy: all you need to do is install a VS Code extension made by HPCC and you can connect to a cluster. The problem that I ran into was that I could not get a connection to the cluster, even after trying multiple things and reading through the fairly sparse instructions. In the end I realized that I was technically working from the VM and trying to connect to the VM. I thought that VS Code would be connecting from my machine, but because I chose to save the .ecl file with the ECL code on the VM, that is where the query was coming from. I kept trying to put in the VM's IP address and didn't realize that it needed to be set to localhost for it to work. Originally, I thought that it was an issue with the configuration of the VM because I could connect to play.hpccsystems.com, which is the publicly available ECL playground. Once I figured out that I just needed to change the server address to localhost it worked fine, and I was able to run a few test queries to ensure the connection was good and that the cluster wasn't broken by the plugin.
Since I was able to get a lot done today, I spent the rest of the day trying to get MongoDB to embed correctly and reading through some sample Couchbase queries to get an idea of how to write the ECL code. My goal for the rest of the week is to embed MongoDB into some code and get it to connect to the database. Once I have that done I will try to pass some simple scalar values between the cluster and MongoDB.
Day 2
Today I started off by doing some IT Security training courtesy of Lexi the Cybertooth tiger. There was a lot of great information in there and I found it engaging, so it went by quickly. After I got that out of the way I went back to the problem that I was having at the end of the day yesterday. Dan was able to point me in the direction of some folders that had some handy log files that I could use in my debugging. Slowly I am starting to get a grasp on the system that I am using and learning how to work with it. The problem that I was having was that the compiler could not find the mongodb plugin and reported it as an "Unknown Identifier." When I asked Dan about this he showed me the log files in /var/log/HPCCSystems/myesp. The ESP is where a lot of the logic gets run through, so it makes sense that when I try to compile my code to send to the cluster the error would go through there.
When I looked at the log files, the problem they showed was that the shared library file libmongodbembed.so could not be loaded because of an undefined symbol that had something to do with the bindBooleanParam function. When I saw this it honestly confused me a lot. I am not sure what this seemingly random function would have to do with me compiling some ECL code, but I went into my cpp file for the mongodb plugin and tried to find any instance of it being called, and it was nowhere to be seen. The only place it shows up in my code is in the header file, where I declared it so that I wouldn't forget to implement it later. The thing that is confusing me is that it is a virtual method of the EmbedFunctionContext class, which I believe means that it does not necessarily need to be implemented if it is unused. If that is not the case and I do need to override it in my cpp file, then that is even more puzzling because there are about 20 other methods that I did a similar thing with. My only explanation for this is that the loader tried to find that function first and exited when it failed. Why then would it say that it couldn't find the mongodb plugin if it was looking in its header file for functions?
While that was confusing, it was also frustrating that I couldn't figure it out or make any sense of it. Fortunately, I was able to make some other progress with some unrelated code. I thought that the problem might have been with my ECL code, so I wanted to understand the code that embeds the Couchbase plugin a little better. I was able to find the block of code that handles the initial call to the plugin here:
EMBED(couchbase : server(server), user(user), password(password), bucket(thebucket), detailed_errcodes(1), operation_timeout(operationTimeoutMs), config_total_timeout(configTotalTimeoutMs))
When EMBED is called it takes the plugin name as the first argument and then there is a bunch of optional information that you can pass to it including the server, userid, and password to connect to the server. I had recognized this when I was looking through the couchbaseembed.cpp file and was able to find the code that breaks that down and turns it into a connection string. Once I found that I felt pretty good because I was able to understand how it actually worked. I was able to take some of that code and refactor it for my needs and I think I got it pretty much right. Unfortunately I couldn't test it because I still hadn't figured out the error that was preventing the engine from finding the MongoDB plugin.
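The step of turning the EMBED options into a connection string can be sketched roughly as follows. This is only a minimal illustration and not the Couchbase plugin's actual code: the `EmbedOptions` struct and the `buildMongoUri` function are hypothetical names, and the sketch assumes the plain `mongodb://user:password@host:port/database` URI layout with no percent-encoding of special characters.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical holder for the name(value) options given in the EMBED statement.
struct EmbedOptions {
    std::string user;
    std::string password;
    std::string server;
    std::string port;      // optional; left out of the URI when empty
    std::string database;  // optional
};

// Assemble a MongoDB-style connection URI from the options.
// Assumption: the credentials contain no characters that need percent-encoding.
std::string buildMongoUri(const EmbedOptions& opts) {
    if (opts.server.empty())
        throw std::invalid_argument("server option is required");

    std::string uri = "mongodb://";
    if (!opts.user.empty()) {
        uri += opts.user;
        if (!opts.password.empty())
            uri += ":" + opts.password;
        uri += "@";
    }
    uri += opts.server;
    if (!opts.port.empty())
        uri += ":" + opts.port;
    uri += "/" + opts.database;
    return uri;
}
```

Calling `buildMongoUri` with a user, password, server, port, and database joins them in that order, and the user and port parts are simply skipped when those options are empty.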
Eventually, I went back to trying to figure out the issue with the bindBooleanParam function, but still couldn't understand what the problem was. I had tried just commenting out the functions that were causing issues, but when I did that it would no longer build, which was even more perplexing. I think I just removed too many functions by accident, so I will have to go back in more carefully and remove them one at a time to see if it is a single function causing the issue or all of them like I suspect. Tomorrow I will be looking to investigate a little further and hopefully have a fix for it before my morning catchup with Dan, but if not he'll probably have the answers to all my questions.
Day 3
Today I went back to trying to fix the issue with the libmongodb.so file not being loaded and throwing some errors. Eventually I was able to figure out that the errors shown in the log file are thrown when I start up the HPCC-Platform clusters. That means the error is most likely not in my .ecl file but somewhere in the MongoDB plugin. I couldn't figure out the issue, so I brought it up again at the meeting with Dan and he said the plugin probably just needed at least the method stubs to be able to load everything. I started adding all the different bind and get functions as method stubs to the cpp file and started getting different errors prompting me to add more functions as stubs. Once I figured that out it went pretty smoothly, and I kept adding method stubs until the shared library file was able to load. I went to run the ECL code and it was still showing me the same error that the mongodb plugin name is undefined. This was even more confusing because I had thought that the problem wasn't with my code, so I shot Dan a quick message and asked him about it, and all I was missing was an import statement at the top of the ECL file. When you are going to embed an external language into ECL you have to include this line:
import MongoDB;
The rest of the code was fairly short. I just defined some variables to make reading easier and then created a function that would get the engine to call the embed context function.
BOOLEAN makeConnection() := EMBED(mongodb : user(user), password(pwd), server(server), port(port), database(databaseName), collection(collectionName))
ENDEMBED;
makeConnection();
Once I included the import statement I was able to run my code on the hthor cluster, but nothing happened from MongoDB's point of view. I checked the MongoDB Atlas cluster that I was using and it didn't show that any connections were made. I again asked Dan if he could point me to any log files that might have the errors from this, and he told me that it was time to start debugging to see if the particular function I was using was actually called. He told me how I should go about doing that: he recommended using gdb from the command line and gave me an article on setting it up with VS Code for remote debugging.
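The method stubs that let the shared library load, described earlier in this entry, follow a pattern roughly like the one below. It is a sketch under assumptions rather than the real plugin code: `EmbedContextSketch` is a hypothetical, cut-down stand-in for the platform's embed context interface, and the signatures are simplified for illustration. The point it shows is that every virtual method declared in the plugin class needs a definition, even an empty one, because the class's vtable refers to each of them whether or not anything calls them.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical, cut-down stand-in for the platform's embed context interface.
struct EmbedContextSketch {
    virtual ~EmbedContextSketch() = default;
    virtual void bindBooleanParam(const char* name, bool value) = 0;
    virtual void bindStringParam(const char* name, const std::string& value) = 0;
    virtual bool getBooleanResult() = 0;
};

// Plugin-side class: every override declared here must also be defined,
// otherwise the vtable points at a symbol that is missing from the .so file.
class MongoEmbedContext : public EmbedContextSketch {
public:
    void bindBooleanParam(const char* name, bool value) override;
    void bindStringParam(const char* name, const std::string& value) override;
    bool getBooleanResult() override;
};

// Stubs: enough for the library to link and load; real logic can come later.
void MongoEmbedContext::bindBooleanParam(const char* name, bool) {
    throw std::logic_error(std::string("bindBooleanParam not implemented: ") + name);
}

void MongoEmbedContext::bindStringParam(const char* name, const std::string&) {
    throw std::logic_error(std::string("bindStringParam not implemented: ") + name);
}

bool MongoEmbedContext::getBooleanResult() {
    return true;  // placeholder result until the real query logic exists
}
```

Throwing from the unfinished stubs is one possible design choice: the library loads, but any accidental call to an unimplemented method fails loudly instead of silently doing nothing.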
Day 4
After a day of absence because I had to go down to Oxford to move some furniture, I was back at my desk and ready to get the connection to the MongoDB servers working. I wanted to get gdb set up with VS Code, so I started by searching for some tutorials from Windows users. They seemed pretty sparse and I wasn't able to find any good tutorials for how to get it working. I found a lot of guides where people had done what I was doing, but they all seemed like a lot of work and weren't very descriptive, so I wasn't sure that they would work for my specific case, where VS Code is running on Windows but I am debugging on a remote Linux server. I kept trying to find something to help me set it up, but I was unable to find anything and just decided to use the command line. Before I started reading through the documentation on gdb and looking at tutorials, I remembered something that Dan had said on Wednesday. He thought that the reason the ECL code ran but didn't connect to MongoDB was that the C++ driver didn't think the connection was necessary, so it might have never even tried to connect. After remembering this I went back and read through the MongoDB documentation that showed how to create a document and insert it from an application.
MongoDB allows for two ways to create documents: a stream builder and a more typical builder where you call functions on the document to insert values. Since the example only showed the stream builder in action I decided to try that. The choice was not really important, since I just wanted to get something inserted into the MongoDB cluster. When you choose a cluster to connect to you also have to select a database and a collection. If there are no databases or collections already created, MongoDB will create them when you try to connect. Once I added the database and collection that I wanted to connect to and added some basic code to create a document, I was able to get it inserted into the cluster. This means that when I executed my ECL code a document was inserted into MongoDB and the connection was made. This was a huge moment because it was the first time I was able to communicate with MongoDB, and it only took three weeks. It felt like forever and like I was going slow. At times it felt like I wasn't making any progress, but when I saw the test document show up in the MongoDB Atlas interface it felt really good. From here my next steps are deciphering the rest of the Couchbase plugin to see how it parses the information inside of the embed statement, and inserting some ECL data types into the MongoDB cluster.