Test Data & Users
To demonstrate Athena federation capabilities, this workshop uses a sample dataset along with sample tables and sample data sources.
Let’s walk through the following test datasets & data sources:
TPCH Database & Tables
The TPC-H dataset, which is public, will be used for this workshop. TPC-H is a decision support benchmark. It consists of a suite of business-oriented ad hoc queries and concurrent data modifications. The queries and the data populating the database have been chosen to have broad industry-wide relevance. This benchmark illustrates decision support systems that examine large volumes of data, execute queries with a high degree of complexity, and give answers to critical business questions.
The components of TPC-H consist of eight separate and individual tables (the Base Tables). The relationships between columns in these tables are illustrated in the following diagram:
For this workshop, we will focus on the following tables from the TPC-H database:
The entire TPC-H data dictionary is available here.
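As a text companion to the relationships between the eight base tables described above, the following is a minimal sketch of the schema written as plain Python data. Table and column names follow the TPC-H specification in lowercase, which is an assumption: the names used in the workshop's catalogs may differ. The `join_condition` helper is hypothetical and exists only to illustrate how the key columns line up.

```python
# Primary-key columns of the eight TPC-H base tables.
# Assumption: names follow the TPC-H specification, lowercased; the
# workshop's own databases may name tables or columns differently.
BASE_TABLES = {
    "region": ("r_regionkey",),
    "nation": ("n_nationkey",),
    "supplier": ("s_suppkey",),
    "customer": ("c_custkey",),
    "part": ("p_partkey",),
    "partsupp": ("ps_partkey", "ps_suppkey"),
    "orders": ("o_orderkey",),
    "lineitem": ("l_orderkey", "l_linenumber"),
}

# Main foreign-key relationships:
# (child table, child columns, parent table, parent columns)
FOREIGN_KEYS = [
    ("nation", ("n_regionkey",), "region", ("r_regionkey",)),
    ("supplier", ("s_nationkey",), "nation", ("n_nationkey",)),
    ("customer", ("c_nationkey",), "nation", ("n_nationkey",)),
    ("partsupp", ("ps_partkey",), "part", ("p_partkey",)),
    ("partsupp", ("ps_suppkey",), "supplier", ("s_suppkey",)),
    ("orders", ("o_custkey",), "customer", ("c_custkey",)),
    ("lineitem", ("l_orderkey",), "orders", ("o_orderkey",)),
    ("lineitem", ("l_partkey", "l_suppkey"),
     "partsupp", ("ps_partkey", "ps_suppkey")),
]


def join_condition(child, parent):
    """Return a SQL ON clause joining child to parent (hypothetical helper)."""
    for c_table, c_cols, p_table, p_cols in FOREIGN_KEYS:
        if (c_table, p_table) == (child, parent):
            # Pair each child column with the parent column it references.
            return " AND ".join(
                f"{child}.{c} = {parent}.{p}" for c, p in zip(c_cols, p_cols)
            )
    raise KeyError(f"no foreign key from {child} to {parent}")


if __name__ == "__main__":
    print(join_condition("orders", "customer"))
```

Keeping the relationships as data rather than prose makes it easy to check which key columns to join on when the tables are spread across different data sources.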