How to Access TIBCO ComputeDB Store from an existing Spark Installation using Smart Connector

TIBCO ComputeDB comes with a Smart Connector that enables Spark applications to work with the TIBCO ComputeDB cluster from any compatible Spark cluster. You can use any distribution that is compatible with Apache Spark 2.1.1. To connect from a Spark 2.4 based cluster, you may use the JDBC connector as described here. The Spark cluster executes in its own independent JVM processes and connects to TIBCO ComputeDB as a Spark data source. This is similar to how Spark applications work with stores like Cassandra, Redis, etc.
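For the Spark 2.4 JDBC path mentioned above, a connection can be sketched with Spark's standard JDBC data source. The following is a minimal sketch, assuming a locator at `localhost:1527` (as in the examples below) and the SnappyData client driver on the application classpath; the table name `SNAPPY_COL_TABLE` matches the table created later in this guide.

```scala
import org.apache.spark.sql.SparkSession

object JdbcConnectorSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder
      .appName("JdbcConnectorSketch")
      .master("local[*]")
      .getOrCreate()

    // Read a ComputeDB table through the generic JDBC data source.
    // URL format and driver class are assumptions based on the
    // SnappyData client driver; verify against your installed version.
    val df = spark.read
      .format("jdbc")
      .option("url", "jdbc:snappydata://localhost:1527/")
      .option("driver", "io.snappydata.jdbc.ClientDriver")
      .option("dbtable", "SNAPPY_COL_TABLE")
      .load()

    df.show()
    spark.stop()
  }
}
```

Note that the JDBC route reads and writes through the client driver rather than registering the ComputeDB catalog locally, so it does not provide the catalog integration that the Smart Connector offers.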

For more information on the various modes, refer to the TIBCO ComputeDB Smart Connector section of the documentation.

Code Example

The code example for this mode is in SmartConnectorExample.scala

Configure a SnappySession:
The code below shows how to initialize a SparkSession and then create a SnappySession from its SparkContext. Here the property snappydata.connection instructs the connector to acquire cluster connectivity and catalog metadata and register it locally in the Spark cluster. The value consists of the locator host and the JDBC client port on which the locator listens for connections (default 1527).

val spark: SparkSession = SparkSession
    .builder
    .appName("SmartConnectorExample")
    // It can be any master URL
    .master("local[4]")
    // snappydata.connection property enables the application to interact with TIBCO ComputeDB store
    .config("snappydata.connection", "localhost:1527")
    .getOrCreate

val snSession = new SnappySession(spark.sparkContext)

Create Table and Run Queries: You can now create tables and run queries in the TIBCO ComputeDB store using your Apache Spark program.

// reading an already created SnappyStore table SNAPPY_COL_TABLE
val colTable = snSession.table("SNAPPY_COL_TABLE")
colTable.show(10)

snSession.dropTable("TestColumnTable", ifExists = true)

// Creating a table from a DataFrame
val dataFrame = snSession.range(1000).selectExpr("id", "floor(rand() * 10000) as k")

snSession.sql("create table TestColumnTable (id bigint not null, k bigint not null) using column")

// insert data in TestColumnTable
dataFrame.write.insertInto("TestColumnTable")
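As a quick sanity check after the insert above, the new table can be queried back through the same SnappySession. This is a small illustrative addition, not part of the original example; it assumes the session and table from the preceding snippet.

```scala
// Verify the write: the count should match the 1000 rows
// generated by snSession.range(1000) above.
val result = snSession.sql("select count(*) from TestColumnTable")
result.show()

// The table is also a regular DataFrame, so the usual API applies:
snSession.table("TestColumnTable").filter("k > 5000").show(5)
```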

Running a Smart Connector Application

Start a TIBCO ComputeDB cluster and create a table.

$ ./sbin/snappy-start-all.sh

$ ./bin/snappy
TIBCO ComputeDB version 1.2.0
snappy>  connect client 'localhost:1527';
Using CONNECTION0
snappy> CREATE TABLE SNAPPY_COL_TABLE(r1 Integer, r2 Integer) USING COLUMN;
snappy> insert into SNAPPY_COL_TABLE VALUES(1,1);
1 row inserted/updated/deleted
snappy> insert into SNAPPY_COL_TABLE VALUES(2,2);
1 row inserted/updated/deleted
snappy> exit;

The Smart Connector Application can now connect to this TIBCO ComputeDB cluster.

The following command executes an example that queries SNAPPY_COL_TABLE and creates a new table inside the TIBCO ComputeDB cluster.
The TIBCO ComputeDB package must be specified along with the application jar when running the Smart Connector application.

$ ./bin/spark-submit --master local[*] --conf spark.snappydata.connection=localhost:1527 --class org.apache.spark.examples.snappydata.SmartConnectorExample --packages SnappyDataInc:snappydata:1.2.0-s_2.11 $SNAPPY_HOME/examples/jars/quickstart.jar
