Knowi enables data discovery, visualization, data manipulation, warehousing, and reporting automation from Google Analytics V4, along with the ability to merge that data with other data stores.
Overview
- Connect, extract, and transform data from your Google Analytics V4, using one of the following options:
- Through our UI to connect directly.
- Using our Cloud9Agent. This can securely pull data inside your network. See agent configuration for more details.
- Visualize and Automate your Reporting instantly.
UI-Based Approach
Connecting
- Log in to Knowi and select Queries from the left sidebar.
- Click the New Datasource + button and select Google Analytics 4 from the list of datasources.
- Authorize your Google Analytics V4 with the Gmail account connected to Google Analytics.
- Enter the following details:
- Datasource name: Enter a name for your datasource (Identifier)
- Google Analytics Profile ID: Select the Property ID associated with your Google account. Refer to the Property ID documentation to find your Property ID.
- Refresh Token: Refresh Authentication Token returned by Google. This is used to connect and pull your GA reports. You can revoke access anytime at https://www.google.com/settings/u/1/security.
- Click on the Save button and start Querying.
Query
After connecting to the Google Analytics V4 datasource, Knowi will pull out a list of metrics along with field samples. Using these metrics, you can automatically generate queries through our visual builder in a no-code environment by either dragging and dropping fields or making your selections through the drop-down.
- Metrics: Select a list of metrics you want to track from the drop-down (or type in). See API Schema for a list of all metrics.
- Dimensions: Dimensions group the data for the selected metrics. Each dimension can be filtered using name/value pairs. For example, for the dimension "browser", you can filter on the value "Chrome". For more info, please see Dimension Filters.
- Start Date: Specify a start date in yyyy-MM-dd format, or as a relative date (e.g., today, yesterday, or NdaysAgo, where N is a positive integer). Note: Use Start and End Dates or, alternatively, use the Date Range field to specify the last n time units.
- End Date: Specify an end date in yyyy-MM-dd format, or as a relative date (e.g., today, yesterday, or NdaysAgo, where N is a positive integer). Note: Use Start and End Dates or, alternatively, use the Date Range field to specify the last n time units.
- Date Range: Specify a date range to pull data from. Leave this empty if you have already specified explicit start/end dates. Use a number followed by y for years, m for months, w for weeks, d for days, h for hours, or min for minutes. For example, 3m means 3 months.
Date Range Op | Comments |
min | Date range of n minutes back from now. Example (10 minutes): 10min |
d | Date range of n days back from today. Example (up to 120 days): 120d |
w | Date range of n weeks back from today. Example (up to 10 weeks): 10w |
m | Date range of n months back from today. Example (up to 3 months): 3m |
y | Date range of n years back from today. Example (up to the last 1 year): 1y |
today | Midnight of today till now |
yesterday | Midnight of yesterday till now |
this hour | Current hour till now |
this week | Midnight of Monday of the current week till now |
this month | Midnight of the 1st of the current month till now |
last hour | Last hour, adjusted to 0 mins and 0 secs, till now |
last week | Monday of last week, adjusted to midnight, till now |
last month | The first of last month till now |
- Max Results: Maximum number of records to pull. Note: Google Analytics allows up to 10k rows per data pull.
- Sort By Field: Field to sort the returned data by. Prefix the field with a minus sign for descending order. Example: ga:visits (ascending), -ga:visits (descending).
Note: You can also perform Cloud9QL transformations.
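The numeric date range tokens described above (a number followed by min, h, d, w, m, or y) can be sketched in Python as follows. This is only an illustration of how such a token might be read, not Knowi's implementation: the function names `parse_date_range` and `range_start` are hypothetical, the named ranges (today, this week, and so on) are not handled, and whether Knowi aligns day-based ranges to midnight is not assumed here.

```python
import calendar
import re
from datetime import datetime, timedelta

# "min" is listed before "m" so that 10min is not read as 10 months.
_TOKEN = re.compile(r"^(\d+)(min|h|d|w|m|y)$")


def parse_date_range(token):
    """Split a token such as '120d' into (120, 'd'). Hypothetical helper."""
    match = _TOKEN.match(token.strip())
    if not match:
        raise ValueError(f"unsupported date range token: {token!r}")
    return int(match.group(1)), match.group(2)


def range_start(token, now):
    """Return the moment n units before `now` (illustrative semantics only)."""
    n, unit = parse_date_range(token)
    if unit == "min":
        return now - timedelta(minutes=n)
    if unit == "h":
        return now - timedelta(hours=n)
    if unit == "d":
        return now - timedelta(days=n)
    if unit == "w":
        return now - timedelta(weeks=n)
    # Months and years: step back whole calendar months and clamp the day
    # so that, for example, May 31 minus 3 months lands on the last day of February.
    months_back = n if unit == "m" else 12 * n
    total = now.year * 12 + (now.month - 1) - months_back
    year, month_index = divmod(total, 12)
    month = month_index + 1
    last_day = calendar.monthrange(year, month)[1]
    return now.replace(year=year, month=month, day=min(now.day, last_day))


if __name__ == "__main__":
    now = datetime(2024, 5, 31, 12, 0)
    for token in ("10min", "120d", "10w", "3m", "1y"):
        print(token, "->", range_start(token, now).isoformat())
```

A fixed `now` is passed in rather than read from the clock so that the result is reproducible when experimenting.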
Define the data execution strategy by using one of the following two options:
- Direct Execution: Directly execute the Query on the original Datasource, without any storage in between. In this case, when a widget is displayed, it will fetch the data in real-time from the underlying Datasource.
- Non-Direct Execution: For non-direct queries, results are stored in Knowi’s Elastic Store. Benefits include support for long-running queries, reduced load on your database, and more.
Non-direct execution applies when you choose to run the Query once or at scheduled intervals. For more information, see Defining Data Execution Strategy.
Click on the Preview button to analyze the results of your Query and fine-tune the desired output, if required.
The result of your Query is called a dataset. After reviewing the results, name your dataset and then click the Create & Run button.
Cloud9Agent
Use Cloud9Agent as an alternative to the UI-based connectivity outlined above. The agent runs inside your network to extract data from Google Analytics 4 and sends the extracted/manipulated data into your Knowi warehouse. Check out Cloud9Agent to download your agent.
For a sample Google Analytics configuration, see datasource_example_googleanalytics.json and query_example_googleanalytics.json in the examples folder under the Cloud9Agent install directory.
Highlights:
- Connects to Google Analytics using OAuth tokens
- Pulls data using GA API, with optional manipulations using Cloud9QL

Obtain the Refresh token using the Connecting step in the UI section above.
Datasource Configuration:
Parameter | Comments |
name | Unique datasource name. |
datasource | Set value to ga. |
authRefreshToken | OAuth offline token generated by Google Analytics. |
gaProfileID | The Profile ID is a unique Google identifier for the account. To determine the Profile ID, log in to your Google Analytics account. The URL after you log in will have a structure similar to: https://www.google.com/analytics/web/#report/visitors-overview/a5559982w55599512p12345678. At the end of the URL, the 8 digits that follow 'p' are your Profile ID (12345678 in the example URL). |
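As a small illustration of the URL rule described for gaProfileID, the Python sketch below pulls the digits after 'p' out of a URL shaped like the example. The function name `extract_profile_id` is hypothetical, and the pattern assumes the a...w...p... structure shown above; URLs with a different layout will not match.

```python
import re

# Matches the trailing account/web-property/profile segment, e.g. a5559982w55599512p12345678.
_PROFILE_SEGMENT = re.compile(r"a\d+w\d+p(\d+)")


def extract_profile_id(url):
    """Return the digits that follow 'p' in a Google Analytics report URL."""
    match = _PROFILE_SEGMENT.search(url)
    if not match:
        raise ValueError("no a...w...p... segment found in URL")
    return match.group(1)


if __name__ == "__main__":
    example = (
        "https://www.google.com/analytics/web/"
        "#report/visitors-overview/a5559982w55599512p12345678"
    )
    print(extract_profile_id(example))
```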
Query Configuration:
Query Config Params | Comments |
entityName | Dataset name identifier. |
identifier | A unique identifier for the dataset. Either identifier or entityName must be specified. |
dsName | Name of the datasource, as configured in the datasource_XXX.json file, to execute the query against. Required. |
gaStartDate | Begin date to pull data for, in yyyy-MM-dd format. Either gaStartDate and gaEndDate, or gaDateRange, must be specified. |
gaEndDate | End date to pull data for, in yyyy-MM-dd format. Either gaStartDate and gaEndDate, or gaDateRange, must be specified. |
gaDateRange | Date range. Takes precedence over gaStartDate and gaEndDate, if specified. Example: 5d, this week, etc. See the supported date range values in the Date Range table above. |
gaMetrics | Required. Metrics to track. Multiple metrics can be tracked using a comma delimiter. Example: ga:visitors,ga:newVisits. See GA API Explorer, metrics section for supported metrics. |
dimensions | Dimensions/grouping on the data. Select the dimensions of the metric you are querying from the dropdown list. Dimensions are name/value pairs that carry additional data to describe the metric value. Note: Using the asterisk for one or more dimension values, you can keep track of a dynamic set of metrics. See GA API Explorer, dimensions section for supported dimensions. |
gaMaxResults | Required. Maximum number of records to pull. |
gaSort | Optional. Sorting order: a minus sign (-) indicates descending, a plus sign (+) ascending. Example: -ga:date |
c9QLFilter | Optional. Can be used to manipulate the results retrieved from the GA API. See Cloud9QL docs. |
frequencyType | Scheduling frequency type. One of minutes, hours, days, weeks, months. If this is not specified, the query is treated as a one-time query, executed upon Cloud9Agent startup (or when the query is first saved). |
frequency | Indicates the frequency, if frequencyType is defined. For example, if this value is 10 and the frequencyType is minutes, the query will be executed every 10 minutes. |
startTime | Optional. Can be used to specify when the query should be run for the first time. If set, the frequency will be determined from that time onwards. For example, if a weekly run is scheduled to start at 07/01/2014 13:30, the first run will be on 07/01/2014 at 13:30, with the next run at the same time on 07/08/2014. The time is based on the local time of the machine running the Agent. Supported date formats: MM/dd/yyyy HH:mm, MM/dd/yy HH:mm, MM/dd/yyyy, MM/dd/yy, HH:mm:ss, HH:mm, mm |
overrideVals | Enables data storage strategies to be specified. If this is not defined, the results of the query are added to the existing dataset. To replace all data for this dataset within Knowi, specify {"replaceAll":true}. To upsert data, specify "replaceValuesForKey":["fieldA","fieldB"]. This will replace all existing records in Knowi that have the same fieldA and fieldB with the current data, and insert records where they are not present. |
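The upsert behavior described for replaceValuesForKey can be pictured with a short Python sketch. This only models the stated rule (existing records with the same key fields are replaced, new ones are inserted) on in-memory lists; it is not how Knowi stores data, and the `upsert` function and the sample field names are hypothetical.

```python
def upsert(existing, incoming, key_fields):
    """Replace rows in `existing` that share key values with `incoming`; add the rest."""

    def key_of(row):
        return tuple(row.get(field) for field in key_fields)

    incoming_keys = {key_of(row) for row in incoming}
    # Drop every stored row whose key appears in the new data, then append the new data.
    kept = [row for row in existing if key_of(row) not in incoming_keys]
    return kept + list(incoming)


if __name__ == "__main__":
    stored = [
        {"date": "2024-01-01", "browser": "Chrome", "users": 10},
        {"date": "2024-01-01", "browser": "Safari", "users": 4},
    ]
    fresh = [
        {"date": "2024-01-01", "browser": "Chrome", "users": 12},  # replaces a stored row
        {"date": "2024-01-02", "browser": "Chrome", "users": 7},   # new row, inserted
    ]
    for row in upsert(stored, fresh, ["date", "browser"]):
        print(row)
```

With "replaceAll" the stored rows would instead be discarded entirely, and with no overrideVals the new rows would simply be appended.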
Datasource Example:
[
  {
    "name":"demoGA",
    "datasource":"ga",
    "authToken":"1/JL-_Wkuan6c_Hs5FdEkgGMJ4wCgHm4STz2J7EgjN4OI",
    "gaProfileID":"ga:66465881"
  }
]
Query Example:
[
  {
    "entityName":"Website Visits",
    "dsName":"demoGA",
    "gaMetrics":"active7DayUsers",
    "dimensions" : [ {
      "name" : "browser",
      "value" : [ "Chrome" ],
      "operator" : "="
    }, {
      "name" : "date",
      "value" : [ "*" ],
      "operator" : "="
    } ],
    "gaDateRange":"10d",
    "gaMaxResults":"1000",
    "gaSort":"active7DayUsers",
    "overrideVals":{
      "replaceAll":true
    },
    "frequencyType":"hours",
    "frequency":2
  }
]
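Before handing a query file to the agent, it can help to check it against the rules listed in the Query Configuration table. The Python sketch below is one possible pre-flight check, written under the assumption that those rules are complete as documented; `validate_query_config` is a hypothetical helper, not part of Cloud9Agent, and the agent itself may accept or reject configurations differently.

```python
import json

# Values listed for frequencyType in the Query Configuration table.
ALLOWED_FREQUENCY_TYPES = {"minutes", "hours", "days", "weeks", "months"}


def validate_query_config(cfg):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    if not (cfg.get("entityName") or cfg.get("identifier")):
        problems.append("either identifier or entityName must be specified")
    for required in ("dsName", "gaMetrics", "gaMaxResults"):
        if not cfg.get(required):
            problems.append(f"{required} is required")
    has_explicit_dates = bool(cfg.get("gaStartDate")) and bool(cfg.get("gaEndDate"))
    if not (has_explicit_dates or cfg.get("gaDateRange")):
        problems.append("specify gaStartDate and gaEndDate, or gaDateRange")
    frequency_type = cfg.get("frequencyType")
    if frequency_type is not None and frequency_type not in ALLOWED_FREQUENCY_TYPES:
        problems.append(f"unknown frequencyType: {frequency_type!r}")
    return problems


if __name__ == "__main__":
    # A query file holds a JSON array of query objects.
    text = """
    [
      {
        "entityName": "Website Visits",
        "dsName": "demoGA",
        "gaMetrics": "active7DayUsers",
        "gaDateRange": "10d",
        "gaMaxResults": "1000",
        "frequencyType": "hours",
        "frequency": 2
      }
    ]
    """
    for query in json.loads(text):
        issues = validate_query_config(query)
        print(query.get("entityName"), "->", issues or "ok")
```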