On-premise deployment with Agent

I am thinking about deploying Knowi on-premise and want to understand the communication between the Agent and Tomcat.

Official comment

Manny Ezeagwula November 14, 2019 23:09

To understand the communication between Tomcat (TC) and the Agent, start with the two main requirements for the Agent:

1. It must be able to run behind a firewall. This means that the Agent can NOT accept any incoming connections; only outgoing connections are allowed.
2. It should be able to adapt to a cluster of Tomcat (TC) web app machines (i.e., behind a load balancer, LB). When one TC node goes down, the Agent should be able to connect to another.
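
The second requirement can be sketched as a dial-out loop that keeps retrying until some TC node behind the LB accepts the connection. This is only a minimal illustration, not Knowi's actual Agent code: `connect_with_retry`, `open_connection`, and the exponential backoff policy are all assumptions introduced here.

```python
import time
from typing import Callable, Iterator


def backoff_delays(base: float = 1.0, cap: float = 30.0) -> Iterator[float]:
    """Yield delays that double each time, capped at `cap` seconds."""
    delay = base
    while True:
        yield delay
        delay = min(cap, delay * 2)


def connect_with_retry(
    open_connection: Callable[[], object],
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> object:
    """Dial out (never listen) until a TC node accepts the connection.

    `open_connection` is a hypothetical callable that opens an outgoing
    connection through the LB and raises ConnectionError on failure.
    """
    delays = backoff_delays()
    last_error = None
    for _ in range(max_attempts):
        try:
            return open_connection()
        except ConnectionError as exc:
            # The node we were routed to is down; wait, then let the LB
            # route the next attempt to another node.
            last_error = exc
            sleep(next(delays))
    raise ConnectionError(
        f"no TC node reachable after {max_attempts} attempts"
    ) from last_error
```

Because the Agent only ever initiates connections, a loop like this satisfies the firewall requirement as well: nothing in it listens on a port.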

For this example, we will consider 1 Agent and a cluster of 3 TCs behind an LB. Clustered Agent mode is even more complicated, and I am not going to talk about it here.

The numbers here match the numbers on the diagram above:

1. The Agent starts and opens a WebSocket connection to the TC cluster. The LB forwards that connection to one of the TCs, in this case TC1. At this time, the Agent also sends its CONNECTOR_ID_XYZ to TC1.
2. TC1 then registers with RabbitMQ as a consumer of the topic CONNECTOR_ID_XYZ.
3. A user comes to Knowi through a web interface. The LB assigns the user's session (sticky session) to TC3.
4. This user creates a non-direct query and clicks "save and run now". Note that this whole flow applies only to non-direct queries. For a direct query, we don't execute anything ahead of time, so you don't need an Agent (except in a few cases). TC3 saves the query and puts a message for topic CONNECTOR_ID_XYZ on RabbitMQ to tell the Agent that there is an update.
5. TC1 receives this message and forwards it to the Agent through the existing WebSocket connection.
6. The Agent opens a new regular HTTP connection to the TC cluster to download the updates. The update might contain new or updated datasources and queries. Note that in the picture this step goes to TC2, but in reality the request can go to any of the TCs. The synchronization between TCs happens through MySQL, but I will not go into details here.
7. After downloading the new query created by the user, the Agent pulls data from the needed datasources (there can be many, since we allow cross-datasource joins) and executes the query.
8. The Agent opens yet another HTTP connection to the TCs to post the query result back. Whichever TC accepts this request (in this case TC3) persists the result into Mongo.
9. The user comes to the dashboard to view the chart for the newly created query.
10. TC3 (sticky session, so the user stays on TC3 until logging out) queries Mongo to get the result and displays it in the user's web browser.
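
To make the ten steps easier to follow, here is a small in-memory simulation of the flow. It is a sketch under loud assumptions: every class and method name (`Broker`, `Cluster`, `TomcatNode`, `Agent`, `save_and_run`, and so on) is hypothetical, dictionaries stand in for RabbitMQ, MySQL, and Mongo, and a plain Python callable stands in for a query. Nothing here reflects Knowi's real APIs.

```python
from collections import defaultdict


class Broker:
    """Stand-in for RabbitMQ: topic name -> list of consumer callbacks."""

    def __init__(self):
        self.consumers = defaultdict(list)

    def subscribe(self, topic, callback):
        self.consumers[topic].append(callback)

    def publish(self, topic, message):
        for callback in self.consumers[topic]:
            callback(message)


class Cluster:
    """State shared by all TC nodes."""

    def __init__(self):
        self.broker = Broker()
        self.queries = {}  # stand-in for MySQL: query name -> definition
        self.results = {}  # stand-in for Mongo: query name -> result


class TomcatNode:
    def __init__(self, name, cluster):
        self.name = name
        self.cluster = cluster
        self.sockets = {}  # connector id -> Agent on an open WebSocket

    def accept_websocket(self, agent):
        # Steps 1-2: keep the socket, then consume the Agent's topic.
        connector_id = agent.connector_id
        self.sockets[connector_id] = agent
        self.cluster.broker.subscribe(
            connector_id, lambda msg: self.sockets[connector_id].on_notify(msg)
        )  # the callback is step 5: forward over the existing socket

    def save_and_run(self, connector_id, name, definition):
        # Step 4: save the query, then announce the update on the topic.
        self.cluster.queries[name] = definition
        self.cluster.broker.publish(connector_id, {"updated": name})

    def download(self, name):
        return self.cluster.queries[name]  # step 6

    def post_result(self, name, result):
        self.cluster.results[name] = result  # step 8

    def view(self, name):
        return self.cluster.results[name]  # step 10


class Agent:
    def __init__(self, connector_id, nodes):
        self.connector_id = connector_id
        self.nodes = nodes  # any node can serve the plain HTTP requests

    def on_notify(self, message):
        name = message["updated"]
        definition = self.nodes["TC2"].download(name)  # step 6
        result = definition()                          # step 7: run the query
        self.nodes["TC3"].post_result(name, result)    # step 8


cluster = Cluster()
nodes = {n: TomcatNode(n, cluster) for n in ("TC1", "TC2", "TC3")}
agent = Agent("CONNECTOR_ID_XYZ", nodes)

nodes["TC1"].accept_websocket(agent)                   # steps 1-2
nodes["TC3"].save_and_run(                             # steps 3-4
    "CONNECTOR_ID_XYZ", "example_query", lambda: ["row1", "row2"]
)
chart_data = nodes["TC3"].view("example_query")        # steps 9-10
print(chart_data)
```

The point of the sketch is that TC3 never talks to the Agent directly. It only publishes to the topic, and whichever node holds the Agent's WebSocket (TC1 here) relays the notification, which is why the flow works no matter which node the LB picks for the user.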
