Run Jupyter Python Code
Run the Jupyter notebook's Python code in your AWS console, starting from the first cell.
Make sure to update the S3 bucket defined in the second cell of the notebook by replacing the bucket name with your “athena-federation-workshop-********” bucket, which was already created for you when the environment for this lab was prepared. The bucket name in your account is globally unique, and we will use this S3 bucket to store the training data and model for this exercise.
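If you are unsure of the exact bucket name, the following is a minimal sketch of one way to look it up. It assumes boto3 is installed and AWS credentials are configured; the helper names `find_workshop_buckets` and `list_account_buckets` are hypothetical and are not part of the lab notebook.

```python
from typing import Iterable, List

# Prefix taken from the lab text; the suffix after it is unique to your account.
WORKSHOP_PREFIX = "athena-federation-workshop-"


def find_workshop_buckets(bucket_names: Iterable[str],
                          prefix: str = WORKSHOP_PREFIX) -> List[str]:
    """Return, in sorted order, the bucket names that start with the workshop prefix."""
    return sorted(name for name in bucket_names if name.startswith(prefix))


def list_account_buckets() -> List[str]:
    """List bucket names in the current account (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so find_workshop_buckets works without boto3

    response = boto3.client("s3").list_buckets()
    return [b["Name"] for b in response["Buckets"]]
```

In the notebook you could then set the bucket variable with something like `bucket = find_workshop_buckets(list_account_buckets())[0]`, after checking that exactly one match was returned.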
As you can see in the 3rd notebook cell, we run a federated query against the orders table in the Aurora MySQL database using the “lambda:mysql” connector that we defined and used in the previous exercises. This query generates a training dataset of the number of orders per day. After running the 4th cell and waiting a few seconds, you should see this training dataset.
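For orientation, a federated query of this kind might look roughly like the sketch below. Only the “lambda:mysql” connector and the orders table come from this lab text; the schema name `sales`, the date column `o_orderdate`, and the helper names are assumptions made for illustration, so check the notebook cell for the actual query.

```python
def build_orders_per_day_query(catalog: str = "lambda:mysql",
                               schema: str = "sales",           # assumed schema name
                               table: str = "orders",
                               date_column: str = "o_orderdate"  # assumed column name
                               ) -> str:
    """Build an Athena SQL string that counts orders per day via a federated catalog."""
    return (
        f"SELECT {date_column} AS order_date, COUNT(*) AS order_count\n"
        f'FROM "{catalog}".{schema}.{table}\n'
        f"GROUP BY {date_column}\n"
        f"ORDER BY {date_column}"
    )


def start_query(sql: str, output_location: str) -> str:
    """Submit the query to Athena and return its execution id (needs boto3 and credentials)."""
    import boto3  # imported lazily so the query builder works without boto3

    athena = boto3.client("athena")
    response = athena.start_query_execution(
        QueryString=sql,
        ResultConfiguration={"OutputLocation": output_location},
    )
    return response["QueryExecutionId"]
```

The catalog name is double-quoted in the FROM clause because it contains a colon; the rest of the query is ordinary aggregate SQL.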
By running the last cell, we train a Random Cut Forest model to detect anomalies and deploy it to a SageMaker endpoint that our application (or query) can call. It can take a few minutes for the training job to complete and for the SageMaker endpoint to be created. Make a note of the endpoint name, as we will need it in our Athena federated query.
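The training and deployment step can be sketched as follows. This is not the notebook's code: it assumes version 2 of the SageMaker Python SDK, and the S3 prefix `rcf-orders`, the instance type, the hyperparameter values, and the helper names are all illustrative assumptions.

```python
def s3_locations(bucket: str, prefix: str = "rcf-orders") -> dict:
    """Derive S3 paths for training data and model output (prefix is an assumption)."""
    return {
        "data_location": f"s3://{bucket}/{prefix}/",
        "output_path": f"s3://{bucket}/{prefix}/output",
    }


def train_and_deploy(order_counts, bucket: str, role_arn: str,
                     instance_type: str = "ml.m4.xlarge") -> str:
    """Train a Random Cut Forest model on daily order counts and deploy it.

    Returns the name of the created SageMaker endpoint. Requires numpy,
    the sagemaker SDK (v2 assumed), and AWS credentials.
    """
    import numpy as np
    from sagemaker import RandomCutForest

    paths = s3_locations(bucket)
    rcf = RandomCutForest(
        role=role_arn,
        instance_count=1,
        instance_type=instance_type,
        data_location=paths["data_location"],
        output_path=paths["output_path"],
        num_samples_per_tree=512,  # illustrative values, not tuned
        num_trees=50,
    )
    # The algorithm expects a 2-D array: one row per day, one feature column.
    train = np.asarray(order_counts, dtype="float32").reshape(-1, 1)
    rcf.fit(rcf.record_set(train))
    predictor = rcf.deploy(initial_instance_count=1, instance_type=instance_type)
    return predictor.endpoint_name
```

The returned string is the endpoint name to note down for the federated query that follows.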