Setting Up Semantic Models

Athenic’s semantic models are built from five column types you assign to your dataset output columns: Primary Key, Foreign Key, Time, Category, and Simple Metric.

A semantic model is created from its corresponding dataset and cannot be assigned to other datasets. After assigning your dataset output columns to one of the semantic types and filling out the required information, Athenic will be able to automatically detect the joins and SQL logic required to understand and use your data.

The image contains the interface for setting up the semantic model for the given dataset "avocados."

Semantic Types

Primary Key

A primary key is a column that has unique values for each row in the dataset. It is the main identifier for the table and will be used for joins. It is required to have one primary key for each dataset. If the data does not already have a valid primary key column, we recommend adding a column that uses concatenation in the dataset to create a unique identifier for the row.

  • You must ensure that each row has a unique value for a primary key.

  • Primary key labels must be unique across all semantic models for the entire project. Athenic will build the joins automatically using the key labels you assign. For example, your project cannot contain the label "ID" in both the "Customer" and the "Order" semantic models. The best way to prevent this is to provide context to what the ID is e.g. "Customer ID" and "Order ID".

Foreign Key

A foreign key is a column whose values reference a Primary Key in another dataset. Foreign keys are used by Athenic to discover relationships between datasets and automatically construct joins.

  • When you mark a column as a Foreign Key, you must ensure the label matches the Primary Key label (the target it references). You will see a warning if there is no matching label. For example, if your "Customer" semantic model has a primary key labeled "Customer ID", then the foreign key to "Customer" in the "Order" semantic model must also be labeled "Customer ID".

  • Foreign key columns do not need to be unique.

  • You will get a warning if there are foreign key values detected that do not have a matching Primary Key value (invoice #5 is in your order table as foreign key but was deleted for some reason in the primary key dataset invoices). If possible, clean your data to avoid dropping rows in the join.

Time

A time dimension column represents a timestamp or date used for time-based filtering and grouping (e.g., “last 30 days”, “by week”, “by month”). They enable time-series queries and are commonly used as the default axis for trends.

  • A time dimension column is required for every semantic model that contains Simple metrics.

  • Every time dimension column must also have a time grain provided. This tells Athenic whether the level of detail for the time dimension is by day, week, month, etc.

  • If you have multiple time dimension columns in a dataset, be very explicit in the label and description on what they each time dimension represents.

Category

A Category column is a descriptive attribute (a “dimension”) used to slice, filter, and group results. Categories are typically strings or low/medium-cardinality identifiers that users ask about in plain language. Some typical examples include attributes such as customer's name, location, type of customer, or product categories.

  • Make sure your labels are descriptive.

  • If users will ask for these columns using other synonyms, include them in the description. For example, if you have a category labeled "channel", but your users may also refer to "channel" as either "order source" or "platform", then include this information in the description.

Simple Metric

A Simple metric column is a numeric column that users want to aggregate and analyze (sum, count, average, etc.) You can view detailed information on setting up your Simple Metrics in the next section.

Last updated