posted 6 years ago
In the example shown there are a couple of "pragmatic" approaches that could be tried first, i.e. in the spirit of reaching for the highest-level tools first. I would approach the problem like this:
Step 1: Spin up an AWS SageMaker instance
Step 2: Create a Jupyter Notebook
Step 3: Clean up the data, run some clustering to see whether any clusters correspond to distinct types of transactions, and plot the results (with one axis being transaction size)
Step 4: Add the cluster labels to the original data set, train a model to predict which cluster a new transaction would be assigned to, and deploy it via SageMaker (to production or to other team members, with about one line of code)
Off the top of my head, a workflow like this could be a rapid way to approach the problem.
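To make steps 3 and 4 a bit more concrete, here is a minimal sketch of what the notebook cells might look like. It assumes scikit-learn and NumPy are available, and it uses made-up synthetic data with two hypothetical features (transaction size and hour of day) purely as a stand-in for the real data set. The choice of KMeans and a random forest is my own assumption; any clustering method and classifier could be swapped in.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in for the cleaned data. Columns are
# [transaction_size, hour_of_day]; replace with the real data set.
rng = np.random.default_rng(0)
small = np.column_stack([rng.normal(20, 5, 200), rng.normal(12, 2, 200)])
large = np.column_stack([rng.normal(500, 50, 200), rng.normal(3, 1, 200)])
X = np.vstack([small, large])

# Step 3: scale the features so transaction size does not dominate,
# then cluster. The number of clusters (2) is an assumption here.
X_scaled = StandardScaler().fit_transform(X)
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(X_scaled)

# Step 4: treat the cluster ids as labels on the original data and
# train a classifier that assigns new transactions to a cluster.
X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.25, random_state=0, stratify=labels
)
clf = RandomForestClassifier(n_estimators=50, random_state=0)
clf.fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)

print(f"clusters found: {sorted(set(labels.tolist()))}")
print(f"held-out accuracy: {accuracy:.2f}")
```

The plotting and the SageMaker deployment are left out because they depend on the real data and an AWS account. For the plot, a scatter of transaction size against another feature, coloured by `labels`, would be the obvious first thing to try.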