Getting Started with your Spark Distribution

If you are a Spark developer already using Spark 2.1.1, the fastest way to work with TIBCO ComputeDB is to add it as a dependency, for instance by using the --packages option of the Spark shell.

Open a command terminal, go to the location of the Spark installation directory, and enter the following:

$ cd <Spark_Install_dir>
# Create a directory for TIBCO ComputeDB artifacts
$ mkdir quickstartdatadir
$ ./bin/spark-shell --conf spark.snappydata.store.sys-disk-dir=quickstartdatadir --conf spark.snappydata.store.log-file=quickstartdatadir/quickstart.log --packages "SnappyDataInc:snappydata:1.2.0-s_2.11"

This opens the Spark shell and downloads the relevant TIBCO ComputeDB files to your local machine. Depending on your network connection speed, it may take some time to download the files.
All TIBCO ComputeDB metadata, as well as persistent data, is stored in the directory quickstartdatadir. The spark-shell can now be used to work with TIBCO ComputeDB using Scala APIs and SQL.
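
The launch command above repeats the data directory in two settings, so it can help to set it in one place. The following is a minimal sketch, assuming a POSIX shell. The function name build_spark_shell_cmd and the variable SNAPPY_PACKAGE are hypothetical helpers introduced here for illustration and are not part of Spark or TIBCO ComputeDB. The function only prints the command and does not start the Spark shell.

```shell
#!/bin/sh
# Sketch only: build_spark_shell_cmd is a hypothetical helper, not part of
# Spark or TIBCO ComputeDB. It prints the launch command instead of running it.

# Package coordinates copied from the launch command in this guide.
SNAPPY_PACKAGE="SnappyDataInc:snappydata:1.2.0-s_2.11"

build_spark_shell_cmd() {
  # $1: directory for metadata and persistent data
  #     (defaults to quickstartdatadir, as used in this guide)
  data_dir="${1:-quickstartdatadir}"

  # Both store settings point into the same directory.
  printf './bin/spark-shell --conf %s --conf %s --packages %s\n' \
    "spark.snappydata.store.sys-disk-dir=${data_dir}" \
    "spark.snappydata.store.log-file=${data_dir}/quickstart.log" \
    "${SNAPPY_PACKAGE}"
}

# Print the command for the default directory.
build_spark_shell_cmd
```

To use a different directory, pass it as the first argument, for example build_spark_shell_cmd mydata. Create the directory first, then run the printed command from the Spark installation directory.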

For this exercise, it is assumed that you are familiar with either Spark or SQL (not necessarily both). The exercise showcases basic database capabilities, such as working with columnar and row-oriented tables and querying and updating these tables.

Tables in TIBCO ComputeDB offer many operational capabilities, such as disk persistence, redundancy for high availability (HA), and eviction. For more information, refer to the detailed documentation.

Next, you can try using the Scala APIs or SQL. We will add Java/Python examples in the future.