.. meta::
   :description: 💫 Cantrip semantic layer
   :keywords: semantic layer, metric, dimension

💫 Cantrip
==========

Cantrip is a lightweight, hand-crafted, semantic layer as a library, written in modern Python. Semantics are stored as database views, so you can **use native SQL to define metrics, dimensions, relationships** and more, without the need for any external services. Cantrip generates the SQL queries needed to compute arbitrarily complex requests for multiple metrics and dimensions, and can even run the query for you.

----

.. container:: db-selector

   Database selector loading...

Imagine we have these 2 tables in our database:

.. code-block:: sql
   :class: db-example db-sqlite

   PRAGMA foreign_keys = ON;

   CREATE TABLE dim_users (
     id INTEGER PRIMARY KEY,
     name TEXT
   );

   CREATE TABLE fact_orders (
     id INTEGER PRIMARY KEY,
     user_id INTEGER,
     amount INTEGER,
     status STRING,
     FOREIGN KEY (user_id) REFERENCES dim_users(id)
   );

.. code-block:: sql
   :class: db-example db-duckdb

   CREATE TABLE dim_users (
      id INTEGER PRIMARY KEY,
      name VARCHAR NOT NULL
   );

   CREATE TABLE fact_orders (
      id INTEGER PRIMARY KEY,
      user_id INTEGER,
      amount INTEGER,
      status VARCHAR,
      FOREIGN KEY (user_id) REFERENCES dim_users(id)
   );

.. code-block:: sql
   :class: db-example db-postgres

   CREATE TABLE dim_users (
     id SERIAL PRIMARY KEY,
     name TEXT NOT NULL
   );

   CREATE TABLE fact_orders (
     id SERIAL PRIMARY KEY,
     user_id INTEGER REFERENCES dim_users(id),
     amount INTEGER,
     status TEXT
   );

Let's create a simple metric. We do that by defining a view. The metric should be an aggregation without any grain:

.. code-block:: sql
   :class: db-example db-sqlite db-duckdb db-postgres

   CREATE VIEW total_orders AS
   SELECT SUM(amount) AS total_orders
   FROM fact_orders
   WHERE status <> 'canceled';

We can now query the metric:

.. code-block:: sql
   :class: db-example db-sqlite db-duckdb db-postgres

   SELECT * FROM total_orders;
   450.0

Not super useful. But let's connect Cantrip to the database. In the following examples we'll be using a simple service that comes with Cantrip, but the service is not a requirement -- Cantrip can be used as a pure Python library.

.. code-block:: bash

   % uvicorn cantrip.service.app:app --port 1974 --reload

When Cantrip connects to a database it builds a **semantic graph** of entities -- metrics, dimensions, and more -- and how they are connected. Let's see which metrics are available:

.. code-block:: bash

   % curl 'http://localhost:1974/db/metrics' | jq

.. code-block:: json

   {
     "database": "db",
     "metrics": [
       {
         "name": "total_orders",
         "type": "NUMBER",
         "description": null,
         "joins": null
       }
     ]
   }

Let's see which dimensions are compatible with the metric ``total_orders``.


.. code-block:: bash

   % curl 'http://localhost:1974/db/dimensions?q=metrics:!(total_orders)' | jq

.. code-block:: json

   {
     "database": "db",
     "dimensions": [
       {
         "name": "dim_users.name",
         "type": "STRING",
         "grain": null,
         "incremental": false,
         "description": null
       },
       {
         "name": "fact_orders.amount",
         "type": "NUMBER",
         "grain": null,
         "incremental": false,
         "description": null
       }
       {
         "name": "fact_orders.status",
         "type": "STRING",
         "grain": null,
         "incremental": false,
         "description": null
       }
     ]
   }

What happened here? First, about the request we did. We're using a data serialization format called `Rison <https://github.com/betodealmeida/python-rison>`_ (more specifically, O-Rison) to encode this object:

.. code-block:: json

   {"metrics": ["total_orders"]}

In a more compact representation that is URL safe:

.. code-block:: o-rison

   metrics:!(total_orders)

This returns the list of dimensions compatible with the all the metrics in the request. For our metric ``total_orders``, Cantrip was able to infer 3 compatible dimensions by analyzing primary and foreign keys defined in the database.

.. container:: only-light

   .. graphviz:: _graphs/index/metric_dimension_path.light.dot

.. container:: only-dark

   .. graphviz:: _graphs/index/metric_dimension_path.dark.dot

We can now use Cantrip to generate the SQL for the ``total_orders`` metric alongside the ``dim_users.name`` dimensions, for example:

.. code-block:: bash

   % curl 'http://localhost:1974/db/sql?q=metrics:!(total_orders),dimensions:!(dim_users.name)' \
   > | jq -r '.query.sql'

.. code-block:: sql
   :class: db-example db-sqlite

   SELECT
     "__annotated"."dim_users.name" AS "dim_users.name [TEXT]",
     "__annotated"."total_orders" AS "total_orders [INTEGER]"
   FROM (
     SELECT
       "__final"."dim_users.name" AS "dim_users.name",
       "__final"."total_orders" AS "total_orders"
     FROM (
       SELECT
         SUM("fact_orders"."amount") AS "total_orders",
         "dim_users"."name" AS "dim_users.name"
       FROM "fact_orders" AS "fact_orders"
        JOIN "dim_users"
          ON "fact_orders"."user_id" = "dim_users"."id"
        WHERE
          "fact_orders"."status" <> 'canceled'
        GROUP BY
          "dim_users"."name"
     ) AS __final
   ) AS __annotated

.. code-block:: sql
   :class: db-example db-duckdb

   SELECT
     "__final"."dim_users.name" AS "dim_users.name",
     "__final"."total_orders" AS "total_orders"
   FROM (
     SELECT
       SUM("fact_orders"."amount") AS "total_orders",
       "dim_users"."name" AS "dim_users.name"
     FROM "fact_orders" AS "fact_orders"
      JOIN "dim_users"
        ON "fact_orders"."user_id" = "dim_users"."id"
      WHERE
        "fact_orders"."status" <> 'canceled'
      GROUP BY
        "dim_users"."name"
   ) AS __final

.. code-block:: sql
   :class: db-example db-postgres

   SELECT
     "__final"."dim_users.name" AS "dim_users.name",
     "__final"."total_orders" AS "total_orders"
   FROM (
     SELECT
       SUM("fact_orders"."amount") AS "total_orders",
       "dim_users"."name" AS "dim_users.name"
     FROM "fact_orders" AS "fact_orders"
      JOIN "dim_users"
        ON "fact_orders"."user_id" = "dim_users"."id"
      WHERE
        "fact_orders"."status" <> 'canceled'
      GROUP BY
        "dim_users"."name"
   ) AS __final


.. container:: only-light

   .. graphviz:: _graphs/index/query.light.dot

.. container:: only-dark

   .. graphviz:: _graphs/index/query.dark.dot

We can also get the results directly. Let's look at ``total_orders`` by ``fact_orders.status``:

.. code-block:: bash

   % curl 'http://localhost:1974/db/run?q=metrics:!(total_orders),dimensions:!(fact_orders.status)' | jq

.. code-block:: json

   {
     "database": "db",
     "metrics": [
       {
         "name": "total_orders",
         "type": "NUMBER",
         "description": null,
         "joins": null
       }
     ],
     "dimensions": [
       {
         "name": "fact_orders.status",
         "type": "STRING",
         "grain": null,
         "incremental": false,
         "description": null
       }
     ],
     "rows": [
       {
         "fact_orders.status": "complete",
         "total_orders": 350.0
       },
       {
         "fact_orders.status": "pending",
         "total_orders": 100.0
       }
     ]
   }


Even though this is a simple example, we can already see why Cantrip is special:

- **Metrics are defined using native SQL**. They can be as complex as you want, as long as they produce a single aggregated column and read from a single relation -- which can itself be a view with complex joins. This gives you the full power of SQL when defining metrics.
- **Semantics are stored in the database itself**. Metrics are defined as views, and relationships between metric and dimensions are either inferred from foreign keys or discovered from annotations (as we'll see in the next section).
- **There is no vendor lock-in**. You can use Cantrip with any database supported by `SQLAlchemy <https://www.sqlalchemy.org/>`_ and `sqlglot <https://sqlglot.com/sqlglot.html>`_, and moving from one database to another is just a matter of redefining the views.

In the :ref:`next section <annotations>` we'll look at a slightly more complex example.

.. toctree::
   :maxdepth: 2
   :caption: Documentation

   Home <self>
   annotations
   enrichment
   derived_metrics
   segments
   populations
   hierarchies
   installation
   usage