MongoDB is a leading NoSQL, document-oriented database. Knowi enables visualization, analysis, and reporting automation from MongoDB. If you have not started your Knowi trial, visit our Instant MongoDB Analytics & Reporting page to get started.
Overview
- Connect, extract, and transform data from your MongoDB, using one of the following options:
  a. Through our UI to connect directly, if your MongoDB servers are accessible from the cloud.
  b. Using our Cloud9Agent for datasources inside your network.
- Visualize and automate your reporting instantly.
UI Based Approach
Connecting
- Log in to Knowi and select Queries from the left sidebar.
- Click the New Datasource + button and select MongoDB. Either follow the prompts to set up connectivity to your own MongoDB database, or use the pre-configured settings for Knowi's demo MongoDB database.
  - When connecting from the UI directly to your MongoDB database, follow the connectivity instructions to allow Knowi to access your database.
  - Alternatively, if you are connecting through an agent, check Internal Datasource to assign it to your agent. The agent (running inside your network) will synchronize with it automatically. You can also configure the datasource and queries directly through the agent.
- Save the connection. Click the Configure Queries link on the success bar, or click the Start Querying button.
Query
Set up your query using the visual builder or the query editor.
Visual Builder
After connecting to the MongoDB datasource, Knowi pulls in a list of collections along with field samples.
Step 1: Generate queries through our visual builder in a no-code environment by either dragging and dropping fields or making your selections through the drop-down.
Step 2: Define the data execution strategy using one of the following two options:
- Direct Execution: Directly execute the query on the original MongoDB datasource, without any storage in between. In this case, when a widget is displayed, it fetches the data in real time from the underlying datasource.
- Non-Direct Execution: For non-direct queries, results are stored in Knowi's Elastic Store. Benefits include support for long-running queries, reduced load on your database, and more. Non-direct execution applies when you choose to run the query once or at scheduled intervals.
For more information, see this documentation: Defining Data Execution Strategy
Step 3: Click on Preview to review the results and fine-tune the desired output, if required.
The result of your query is called a Dataset.
After reviewing the results, name your dataset and then hit the Create & Run button.
Query Editor
A versatile text editor designed for editing code. It comes with a number of language modes, including MongoDB Query Language (MQL), add-ons like Cloud9QL, and an AI Assistant that provides powerful transformation and analysis capabilities, such as prediction modeling and cohort analysis.
AI Assistant
The AI Assistant query generator automatically creates queries from plain-English statements for searching connected databases and retrieving information. The goal is to simplify and speed up the process by automatically generating relevant, specific queries, reducing the need for manual input and improving the odds of finding the right information.
Step 1: Select Generate Query from the AI Assistant dropdown and describe the query you'd like to generate in plain English. Details can include table or collection names, fields, filters, etc.
Example: “Show me all the restaurants with name and address”
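For the prompt above, the assistant would typically produce a query along these lines (illustrative only; the exact query returned by the model may vary):

```
db.restaurants.find({}, { name: 1, address: 1 })
```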
Note: The AI Assistant uses OpenAI to generate queries; only the question is sent to the OpenAI APIs, never your data.
Step 2: Define the data execution strategy using one of the following two options:
- Direct Execution: Directly execute the query on the original MongoDB datasource, without any storage in between. In this case, when a widget is displayed, it fetches the data in real time from the underlying datasource.
- Non-Direct Execution: For non-direct queries, results are stored in Knowi's Elastic Store. Benefits include support for long-running queries, reduced load on your database, and more. Non-direct execution applies when you choose to run the query once or at scheduled intervals.
For more information, see this documentation: Defining Data Execution Strategy
Step 3: Click on the Preview button to analyze the results of your Query and fine-tune the desired output, if required.
Note: OpenAI integration must be enabled by an admin before using the AI Query Generator.
{Account Settings → Customer Settings → OpenAI Integration}
Furthermore, the AI Assistant offers additional features that can be applied on top of the generated query:
- Explain Query
- Find Issues
- Syntax Help
Explain Query
Provides an explanation of your existing query. For example, requesting an explanation for the generated query returns this description:
“This MongoDB query is used to find all the restaurants in the database and return the name and address of each restaurant. The query uses the find() method to search the restaurants' collections, and the empty object {} indicates that all documents in the collection should be returned. The second argument, {name: 1, address: 1}, specifies that only the name and address fields should be returned in the results.”
Find Issues
Helps in debugging and troubleshooting the query. For example, running Find Issues on a query with a misspelled collection name returns this error: “The collection name is misspelled (should be "restaurants")”
Syntax Help
Ask questions about query syntax for this datasource. For example, asking for the syntax of the requested query returned: “db.restaurants.find({}, {name: 1, _id: 0})”
Map-Reduce
Knowi supports map-reduce, useful for pushing processing of large datasets down into MongoDB (beyond MongoDB's aggregate function). Map-reduce support includes Map, Reduce, and Finalize functions, along with the "limit" and "scope" parameters. Note that the output results must be returned inline.
For example, suppose you have a collection of events for each customer, with fields "customer" and "sent":
[
  {
    "customer": "Wells Fargo",
    "sent": 119992
  },
  {
    "customer": "Wells Fargo",
    "sent": 130000
  },
  {
    "customer": "Linked In",
    "sent": 23000
  }
]
To calculate the sum of "sent" for each "customer":
db.sendingActivity.mapReduce(
  function() {
    emit(this.customer, this.sent);
  },
  function(key, values) {
    return Array.sum(values);
  },
  {
    out: { inline: 1 }
  }
)
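To see what the map and reduce functions compute, here is a plain-JavaScript sketch of the same flow (an illustration only, not how MongoDB executes it; it assumes the "sent" values are numeric so that summing, rather than string concatenation, takes place):

```javascript
// Plain-JS sketch of the map-reduce above (illustration only).
const docs = [
  { customer: "Wells Fargo", sent: 119992 },
  { customer: "Wells Fargo", sent: 130000 },
  { customer: "Linked In", sent: 23000 }
];

// map phase: emit(this.customer, this.sent) groups values by key
const groups = {};
function emit(key, value) {
  (groups[key] = groups[key] || []).push(value);
}
for (const doc of docs) {
  emit(doc.customer, doc.sent);
}

// reduce phase: Array.sum(values) in the mongo shell, i.e. a plain sum per key
const results = Object.entries(groups).map(([key, values]) => ({
  _id: key,
  value: values.reduce((a, b) => a + b, 0)
}));

console.log(results);
// [ { _id: 'Wells Fargo', value: 249992 }, { _id: 'Linked In', value: 23000 } ]
```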
Another example uses the "scope" and "finalize" features to compute a running total of events per day:
db.sendingActivity.mapReduce(
  function() {
    var date = new Date(this.date.valueOf() - (this.date.valueOf() % (1000 * 60 * 60 * 24)));
    var value = 1;
    emit(date, value);
  },
  function(key, values) {
    return Array.sum(values);
  },
  {
    "scope": { "total": 0 },
    "finalize": function(key, reducedValue) {
      total += reducedValue;
      return total;
    },
    "out": { "inline": 1 }
  }
)
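The scope/finalize behavior can be sketched the same way: because total lives in the shared scope, each finalize call adds its key's reduced value to a running total across keys. The per-day reduced counts used here (19, 9, and 10) are assumed for illustration and reproduce the sample output that follows:

```javascript
// Plain-JS sketch of scope + finalize (illustration only).
// Per-day reduced values after the reduce phase (assumed for illustration).
const reducedByDay = [
  { _id: "2014-12-01", reduced: 19 },
  { _id: "2014-12-02", reduced: 9 },
  { _id: "2014-12-03", reduced: 10 }
];

let total = 0; // "scope": { "total": 0 } is shared across finalize calls
const output = reducedByDay.map(({ _id, reduced }) => {
  // finalize(key, reducedValue): accumulate into the shared total
  total += reduced;
  return { _id, value: total };
});

console.log(output.map(o => o.value)); // [ 19, 28, 38 ]
```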
Example output:
[
  {
    "_id" : ISODate("2014-12-01T00:00:00.000Z"),
    "value" : 19.0
  },
  {
    "_id" : ISODate("2014-12-02T00:00:00.000Z"),
    "value" : 28.0
  },
  {
    "_id" : ISODate("2014-12-03T00:00:00.000Z"),
    "value" : 38.0
  }
]
Cloud9Agent
As an alternative to the UI-based connectivity above, you can run Cloud9Agent in standalone mode inside your network to pull from MongoDB. See Cloud9Agent to download the agent along with instructions to run it.
Highlights:
- Pull data using MongoDB query syntax.
- Complement MongoDB syntax with Cloud9QL to cleanse/transform data further.
- Execute queries on a schedule or one time.
The agent contains a datasource_example_mongo.json and query_example_mongo.json under the examples folder of the agent installation to get you started.
- Edit those to point to your database and modify the queries to pull your data.
- Move them into the config directory (datasource_XXX.json files first, if the agent is running).
Datasource Configuration:
Parameter | Comments |
---|---|
name | Unique Datasource Name. |
datasource | Set value to mongo |
url | DB connect URL, with host, port and database. Example: dharma.mongohq.com:10071/cloud9demo |
userId | DB User id to connect |
password | DB password |
mongoReadPref | Optional read preference strategy for MongoDB, to route read operations to members of the replica set. See the Read Preference documentation at MongoDB. Valid values: primary, primaryPreferred, secondary, secondaryPreferred, nearest |
mongoReadPrefTags | Optional read preference user-defined replica tag sets. See more details on replica tag sets at MongoDB. Example: {"region":"US_West","datacenter":"Los Angeles"} |
mongoCheckIndex | Optional flag to check for indexes in a query, to ensure that executed queries use a valid index (and shard keys where applicable). Valid values: true, false. Defaults to false. |
kerberosRealm | Optional, for Kerberos environments only (alternative to user/password authentication). Specify the Kerberos Realm here. Example: cloud9.com |
kerberosKDC | Optional, for Kerberos based authentication schemes only. Specify the Key Distribution Center server. Example: a.hostname.com |
kerberosKeytab | Optional, for Kerberos based authentication schemes only. Specify the location of the keytab file. Example: /users/cloud9/dev/cloud9.keytab |
Query Configuration:
Query Config Params | Comments |
---|---|
entityName | Dataset Name Identifier |
identifier | A unique identifier for the dataset. Either identifier or entityName must be specified. |
dsName | Name of the datasource configured in the datasource_XXX.json file to execute the query against. Required. |
queryStr | MongoDB query syntax. Required. Example: db.pagehits.find({hits: { $gte: 1}}) |
c9QLFilter | Optional cleansing/transformation of the results from the Mongo query using Cloud9QL. See Cloud9QL docs |
frequencyType | One of minutes, hours, days, weeks, months. If this is not specified, the query is treated as a one-time query, executed upon Cloud9Agent startup (or when the query is first saved) |
frequency | Indicates the frequency, if frequencyType is defined. For example, if this value is 10 and the frequencyType is minutes, the query will be executed every 10 minutes |
startTime | Optional; specifies when the query should run for the first time. If set, the frequency is determined from that time onwards. For example, if a weekly run is scheduled to start at 07/01/2014 13:30, the first run happens on 07/01 at 13:30, with the next run at the same time on 07/08/2014. The time is based on the local time of the machine running the agent. Supported date formats: MM/dd/yyyy HH:mm, MM/dd/yy HH:mm, MM/dd/yyyy, MM/dd/yy, HH:mm:ss, HH:mm, mm |
overrideVals | Enables data storage strategies to be specified. If this is not defined, the results of the query are added to the existing dataset. To replace all data for this dataset within Knowi, specify {"replaceAll":true}. To upsert data, specify "replaceValuesForKey":["fieldA","fieldB"]. This replaces all existing records in Knowi that have the same fieldA and fieldB with the current data, and inserts records where they are not present. |
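Putting the scheduling and storage parameters together, a hypothetical query config that runs every 6 hours from a given start time and upserts on a key field might look like this (all names and values are illustrative):

```json
[
  {
    "entityName": "Page Hits Snapshot",
    "dsName": "demoMongo",
    "queryStr": "db.pagehits.find({hits: { $gte: 1}})",
    "frequencyType": "hours",
    "frequency": 6,
    "startTime": "07/01/2014 13:30",
    "overrideVals": { "replaceValuesForKey": ["page"] }
  }
]
```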
Examples
Datasource Example:
[
  {
    "name":"demoMongo",
    "url":"dharma.mongohq.com:10071/cloud9demo",
    "datasource":"mongo",
    "userId":"someUserId",
    "password":"somePass"
  }
]
Query Example:
[
  {
    "entityName":"Page Hits Over Time",
    "dsName":"demoMongo",
    "queryStr":"db.pageviews.find({lastAccessTime: { $exists: true}})",
    "c9QLFilter":"select date(lastAccessTime) as Date, count(*) as Page Hits group by date(lastAccessTime) order by Date asc",
    "overrideVals":{
      "replaceAll":true
    },
    "postURL":"http://localhost:9090/connect/6xaFYvBLSA8ie"
  }
]
Advanced Examples:
Kerberos-based authentication, with custom read preferences and index checking enabled on the datasource.
Datasource:
[
  {
    "name":"kerbWithReadPrefs",
    "url":"ec2-54-164-132-188.compute-1.amazonaws.com/records",
    "datasource":"mongo",
    "userId":"mongo/mongo@CLOUD9.COM",
    "kerberosRealm":"CLOUD9.COM",
    "kerberosKDC":"ec2-54-164-132-188.compute-1.amazonaws.com",
    "kerberosKeytab":"/Users/c9/Dev/cloud9.keytab",
    "mongoReadPref":"nearest",
    "mongoReadPrefTags":{"region":"US_West","datacenter":"Los Angeles"},
    "mongoCheckIndex":true
  }
]
Multiple databases with wildcard database name matching:
The following example:
- Connects to a set of MongoDB databases using wildcard database name matching.
- Executes queries against all databases from that match group and combines the data.
- Executes Cloud9QL to further aggregate the data.
- Connects to and pulls data from a set of MySQL databases using a wildcard name match, then combines and aggregates the data.
- Stores the resulting data from both the MongoDB and MySQL databases in the same dataset.
Datasource:
[
  /* Wildcard token to connect to multiple databases with the same schema */
  {
    "name":"demoMySQLGroup",
    "url":"localhost:3306/app_${c9_wildcard}_somepostfix",
    "datasource":"mysql",
    "userId":"a",
    "password":"b"
  },
  {
    "name":"demoMongoGroup",
    "url":"dharma.mongohq.com:10071/cloud9${c9_wildcard}_acc",
    "datasource":"mongo",
    "userId":"x",
    "password":"y"
  }
]
Query:
[
  {
    "entityName":"Multiple Databases",
    "dsName":"demoMySQLGroup",
    /* Executes against multiple databases and combines the result */
    "queryStr":"select * from sometable",
    /* Optional Cloud9QL that runs on the combined query results. */
    "c9QLFilter":"select count(*) as Total count, \"MySQL Counts\" as Type",
    "overrideVals":{
      "replaceValuesForKey":["Type"]
    }
  },
  {
    "entityName":"Multiple Databases",
    "dsName":"demoMongoGroup",
    /* Runs a Mongo query on all matched databases and combines them */
    "queryStr":"db.pageviews.find({lastAccessTime: { $exists: true}})",
    /* Optional Cloud9QL to aggregate the data further */
    "c9QLFilter":"select count(*) as Total counts, \"Mongo Counts\" as Type",
    "overrideVals":{
      "replaceValuesForKey":["Type"]
    }
  }
]