ETL using Athena CTAS
Most of the time, the raw data arriving in the data lake or in tables is in CSV or plain-text format. These row-oriented text formats are not optimal for querying with Athena and other engines, because every query has to scan whole rows even when it needs only a few columns. It is advisable to convert the data into a columnar format such as Parquet. In this lab we will use Athena’s Create Table As Select (CTAS) query to transform a table from TSV format to Parquet and store it with compression and partitioning.
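To make the shape of a CTAS statement concrete before opening the saved query, here is a minimal Python sketch that only assembles the SQL text. It is not the saved Athena_ctas_reviews query: the table names (reviews_tsv, reviews_parquet), the column names, the bucket name my-workshop-bucket and the SNAPPY compression choice are all hypothetical placeholders chosen for illustration. The WITH properties shown (format, parquet_compression, external_location, partitioned_by) are standard Athena CTAS properties.

```python
def build_ctas_query(source_table, target_table, bucket, data_columns,
                     partition_columns, compression="SNAPPY"):
    """Return an Athena CTAS statement that rewrites source_table as Parquet."""
    # Athena requires partition columns to come last in the SELECT list.
    select_list = ", ".join(list(data_columns) + list(partition_columns))
    partition_array = ", ".join(f"'{col}'" for col in partition_columns)
    return (
        f"CREATE TABLE {target_table}\n"
        "WITH (\n"
        "  format = 'PARQUET',\n"
        f"  parquet_compression = '{compression}',\n"
        f"  external_location = 's3://{bucket}/{target_table}/',\n"
        f"  partitioned_by = ARRAY[{partition_array}]\n"
        ") AS\n"
        f"SELECT {select_list}\n"
        f"FROM {source_table}"
    )


# All names below are hypothetical; substitute the ones from your own lab.
query = build_ctas_query(
    source_table="reviews_tsv",
    target_table="reviews_parquet",
    bucket="my-workshop-bucket",
    data_columns=["review_id", "star_rating", "review_body"],
    partition_columns=["product_category"],
)
print(query)
```

The printed statement could be pasted into the Athena query editor after adjusting the names. Keeping the partition column last in the SELECT list matters, since Athena maps the trailing columns to the partitions named in partitioned_by.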
Open Athena → Saved Queries and click on Athena_ctas_reviews
In the query, replace the placeholder <<Athena-WorkShop-Bucket>> with the bucket name from the CloudFormation outputs.
It will take a few minutes for this query to run. Once the query has completed, run the next query to see the results.