Week 7

Day 1

Today the main goal was to get the dataset insert working. The problem I was having was that the field info for the ECL row wasn't being filled in. After talking with Dan, he told me that I needed to pass in the ECL record so that the engine knows what kind of dataset to expect.

function(DATASET employeeData) // ECL knows it's a dataset, but has
                               // no information about the field types

layout1 := {STRING1 id, STRING25 first, STRING25 last};
function(DATASET(layout1) employeeData) // ECL knows the names and
                                        // datatypes of the fields

After making that change in my ECL code I was finally able to get something inserted into MongoDB from the dataset. What happens is the engine calls EmbedFunctionContext and creates an object of that class. This object instantiates everything for the connection and pulls the arguments for the URI from the embed parameters. Then bindDatasetParam gets called, where a MongoDBDatasetBinder object is created. The dataset binder processes the rows of the dataset and converts them into MongoDB documents. These documents then get inserted and the next row gets processed.
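
To make the conversion concrete, here is a minimal sketch of the idea using the mongocxx driver directly. The helper and field names are mine for illustration; the real binder walks the row through the engine's field metadata rather than taking named arguments.

#include <string>

#include <bsoncxx/builder/stream/document.hpp>
#include <bsoncxx/document/value.hpp>

// Hypothetical helper: turn one row of the employee dataset into an
// owned BSON document. Each ECL field becomes a key/value pair.
bsoncxx::document::value rowToDocument(const std::string &id,
                                       const std::string &first,
                                       const std::string &last) {
    bsoncxx::builder::stream::document doc{};
    doc << "id" << id        // STRING1 id
        << "first" << first  // STRING25 first
        << "last" << last;   // STRING25 last
    return doc.extract();    // take ownership of the finished document
}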

After making the change to the ECL code I understood a lot more about how the engine makes its calls, but it still wasn't inserting every row of the dataset. This was because each time the append method got called the key was the same, so the current row would simply replace the previous row in the MongoDB document.


Day 2

For today my goal was to solve the problem preventing the full dataset from being inserted into the database. The issue was that since the parameter names are the same for each row, the rows would simply replace each other as the document built up, so by the time every row had been processed only the last row would actually get saved. To fix this we first had to answer a question: when a user inserts a dataset, should each row become its own document, or should all the rows go into a single document as subdocuments? I think the choice is clear that the user would want each row inserted separately, since ECL rows are essentially JSON objects, just like MongoDB documents.

Once that was decided it was time to make the change in the code. When the engine notices a dataset has been passed as a parameter it calls executeAll, which essentially calls execute for each row of the dataset. The only thing I had to change was to insert the document every time a single row is bound, instead of waiting until the end to insert one document. This fixed the issue and allowed all the rows of the dataset to be properly added to the database (sketched below). The next step was to begin working on searching the database. For this to work we need to implement the RowStream class from ECL, as this is how the plugin will build the result rows. Returning the result of a search is the most important thing to focus on right now, so I jumped in and started building the RowStream class. Again the Couchbase plugin was a great reference for which methods to override and how to deal with the data coming into them.
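
Here is a sketch of the fix in isolation, again using mongocxx directly rather than the plugin's executeAll/execute machinery; the row struct and the collection name are illustrative.

#include <string>
#include <vector>

#include <bsoncxx/builder/stream/document.hpp>
#include <mongocxx/client.hpp>
#include <mongocxx/instance.hpp>
#include <mongocxx/uri.hpp>

struct EmployeeRow { std::string id, first, last; };  // stand-in for an ECL row

int main() {
    mongocxx::instance inst{};  // one driver instance per process
    mongocxx::client conn{mongocxx::uri{"mongodb://localhost:27017"}};
    auto coll = conn["test"]["employees"];

    std::vector<EmployeeRow> rows = {{"1", "Jane", "Doe"}, {"2", "John", "Roe"}};

    // The fix in spirit: bind one row, insert it immediately, move on.
    // Nothing accumulates, so later rows can no longer replace earlier ones.
    for (const auto &row : rows) {
        bsoncxx::builder::stream::document doc{};
        doc << "id" << row.id << "first" << row.first << "last" << row.last;
        coll.insert_one(doc.view());
    }
}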

I also needed to look into how MongoDB handles queries. It takes a document as an argument to search against, but it has some extra features that let users add search parameters. For a field you want searched you can simply pass a value, and it will find documents where the field equals that value; or, instead of a plain value, you can pass an operator like less than or greater than as a subdocument, and it will search for documents based on that condition.
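
For example, with the mongocxx driver the two query shapes look roughly like this (the collection and field names are made up):

#include <iostream>

#include <bsoncxx/builder/stream/document.hpp>
#include <bsoncxx/json.hpp>
#include <mongocxx/client.hpp>
#include <mongocxx/instance.hpp>
#include <mongocxx/uri.hpp>

using bsoncxx::builder::stream::document;
using bsoncxx::builder::stream::open_document;
using bsoncxx::builder::stream::close_document;
using bsoncxx::builder::stream::finalize;

int main() {
    mongocxx::instance inst{};
    mongocxx::client conn{mongocxx::uri{"mongodb://localhost:27017"}};
    auto coll = conn["test"]["employees"];

    // Equality match: documents where "last" equals "Doe".
    auto byValue = coll.find(document{} << "last" << "Doe" << finalize);
    for (auto &&d : byValue)
        std::cout << bsoncxx::to_json(d) << "\n";

    // Operator as a subdocument: documents where "salary" is greater than 50000.
    auto byOperator = coll.find(
        document{} << "salary" << open_document << "$gt" << 50000 << close_document
                   << finalize);
    for (auto &&d : byOperator)
        std::cout << bsoncxx::to_json(d) << "\n";
}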


Day 3

Today my focus was on getting the result of a MongoDB query output as the result of the embedded function. To do this I had to implement the getDatasetResult method in the MongoDBEmbedFunctionContext class. This class is used to return the data from MongoDB to the engine and calls the RowStream to build the ECL rows to be returned. In order to save the results of the query for use later in the execution calls, I created variables for the two different kinds of return documents that come from the find methods.

If find_one is called then an optional document is returned, and if find is called then a mongocxx::cursor is returned, which is basically an iterable array of documents that can be empty. I created pointers of these types so that I have access to the results for the entire execution time. These pointers are then used to turn the documents into JSON objects, which get turned into strings that can be passed to the RowStream for building the dataset. The problem I have run into is that I cannot seem to keep the result information alive long enough for it to be seen by the RowStream class; it goes out of scope when getDatasetResult finishes execution. I want to create a new copy in memory and keep a pointer to that location. It seems fairly simple, but for some reason I cannot get it to work.
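
For reference, one way this lifetime problem can be handled with the mongocxx types is to copy each result out of the cursor before the cursor dies; the following is a sketch of that idea, not the plugin's current code.

#include <iostream>
#include <string>
#include <vector>

#include <bsoncxx/json.hpp>
#include <mongocxx/client.hpp>
#include <mongocxx/instance.hpp>
#include <mongocxx/uri.hpp>

int main() {
    mongocxx::instance inst{};
    mongocxx::client conn{mongocxx::uri{"mongodb://localhost:27017"}};
    auto coll = conn["test"]["employees"];

    std::vector<std::string> results;  // owned copies, safe after the cursor dies
    {
        mongocxx::cursor cursor = coll.find({});
        for (const bsoncxx::document::view &doc : cursor) {
            // 'doc' is a view into driver-owned memory that is invalidated as
            // the cursor advances; to_json (or bsoncxx::document::value{doc})
            // copies the data out into storage we own.
            results.push_back(bsoncxx::to_json(doc));
        }
    }  // cursor destroyed here; 'results' is still valid

    for (const auto &json : results)  // e.g. hand these to the RowStream
        std::cout << json << "\n";
}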
