Discover sensitive data in an S3 bucket using Amazon Macie

Lab Details

  1. This lab walks you through the steps to create and configure an Amazon Macie job to discover sensitive data.
  2. You will practice using a custom data identifier by writing a regular expression that matches the pattern of the data in the S3 bucket.
  3. Duration: 60 minutes
  4. AWS Region: US East (N. Virginia) us-east-1


What is Amazon Macie

  • Amazon Macie uses pattern matching and machine learning to protect the sensitive data stored in S3 buckets.
  • It detects many data types, including PII (personally identifiable information) such as names, addresses, and credit card numbers.
  • Along with detecting data, it gives you visibility into your S3 buckets, such as which buckets are publicly accessible, unencrypted, or shared with other accounts.
  • To get started with Amazon Macie, you can use its 30-day free trial for bucket evaluation.
  • The free trial does not include the discovery of sensitive data in S3 buckets.

Architecture Diagram

Task Details

  1. Launching Lab Environment
  2. Enable Macie for the account
  3. Create a Macie job
  4. Macie job run and findings
  5. Validation of the lab

Lab Steps

Task 1: Launching Lab Environment

  1. Launch the lab environment and wait until it is provisioned. Provisioning takes less than 2 minutes.
  2. Once the lab is started, you will be provided with an IAM user name, Password, Access Key, and Secret Access Key.
  3. Open the AWS Management Console from the lab page. It will open in a new tab.
  4. On the AWS sign-in page, the Account ID will be present by default.
    • Leave the Account ID as default. Do not remove or change the Account ID, otherwise you cannot proceed with the lab.
  5. Copy and paste the IAM user name and Password into the AWS Console. Click on Sign in to log into the AWS Console.

Note: If you face any issues, please go through FAQs and Troubleshooting for Labs.

Task 2: Enable Macie for the account

  1. Make sure you are in the US East (N. Virginia) us-east-1 Region.
  2. Navigate to Amazon Macie by clicking on the Services menu at the top, then click on Amazon Macie in the Security, Identity, & Compliance section.
  3. On the home page, click on the Get started button to configure Amazon Macie.

  4. On the Get started page, click on the Enable Macie button.
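
For readers who prefer scripting, the console steps above can be approximated with the AWS SDK. The following is a minimal sketch, assuming a boto3 `macie2` client created for us-east-1 with credentials that are allowed to manage Macie. The helper name `ensure_macie_enabled` is hypothetical, and the assumption that `get_macie_session` raises an error when Macie has never been enabled should be verified in your own account.

```python
def ensure_macie_enabled(client):
    """Enable Macie for the current account and Region if needed.

    `client` is expected to behave like
    boto3.client("macie2", region_name="us-east-1").
    Returns a short string describing what was done.
    """
    try:
        session = client.get_macie_session()
    except Exception:
        # Assumption: the call fails when Macie was never enabled here.
        session = None

    if session is None:
        client.enable_macie(status="ENABLED")
        return "enabled"
    if session.get("status") == "ENABLED":
        return "already enabled"
    # Macie exists but is paused, so resume it instead of enabling again.
    client.update_macie_session(status="ENABLED")
    return "resumed"
```

A call might look like `ensure_macie_enabled(boto3.client("macie2", region_name="us-east-1"))`. Passing the client in keeps the helper easy to try with a stub before running it against a real account.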

Task 3: Create a Macie job

  1. Macie will try to find out all the details of the account, which may take some time. No need to wait, simply click on the Create job button.
  2. For Step-1, Select S3 Bucket,
    • Select the bucket whose name starts with whizlabs, and click on the Next button.
  3. For Step-2, Review S3 buckets
    • Keep everything as default and click on the Next button.
  4. For Step-3, Scope,
    • In Sensitive data discovery options, select One-time job.
    • Click on the arrow to expand Additional settings.
    • Leave the Object criteria at its default, File name extensions.
    • Type csv in the text box and click on the Include button.
    • Once done, click on the Next button to proceed.
  5. For Step-4, Custom data identifiers,
    • Click on Manage custom identifiers to create one.
      Note: This opens in a new tab. If it does not open, allow pop-ups in your browser and try again.
  6. Click on the Create option at the top right.
    • Fill in the details, as follows:
    • Name: Whiz
    • Description: This identifier finds data in the format ab-01, i.e. two lowercase letters, a dash, and two digits.
    • Regular expression: [a-z]{2}-[0-9]{2}
    • Keep all other options as default.
    • Click on the Submit button to create the Custom identifier.
    • Go back to the previous tab and click on the refresh icon to see the newly created Custom identifier.

    • Once refreshed, you will be able to see the Whiz identifier listed here. Click on the Next button. 
  7. For Step-5, Name and Description,
    • Name: WhizJob
    • Description: This job scans the bucket whose name starts with whizlabs and gathers findings based on the regular expression pattern.
    • Click on the Next button.
  8. For Step-6, Review and create,
    • Review everything and click on the Submit button present below.
  9. The job is now created successfully.
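
The choices made in the wizard above can be sketched in code. The example below first checks the lab's regular expression locally with Python's `re` module, then builds request payloads shaped for the boto3 `macie2` calls `create_custom_data_identifier` and `create_classification_job`. The function names, the sample text, the account ID, and the bucket name are illustrative assumptions. Python's regex engine is used only as an approximation and may not behave exactly like Macie's.

```python
import re

# Regular expression entered for the custom data identifier in the lab.
IDENTIFIER_REGEX = r"[a-z]{2}-[0-9]{2}"


def local_matches(text):
    """Return every substring of `text` that Python's `re` matches."""
    return re.findall(IDENTIFIER_REGEX, text)


def build_identifier_request():
    """Payload mirroring the custom data identifier form."""
    return {
        "name": "Whiz",
        "description": "Finds two letters, a dash, and two digits.",
        "regex": IDENTIFIER_REGEX,
    }


def build_job_request(account_id, bucket_name, identifier_id):
    """Payload mirroring the job wizard: one-time job, csv objects only."""
    return {
        "jobType": "ONE_TIME",
        "name": "WhizJob",
        "description": "Scans the whizlabs bucket with the Whiz identifier.",
        "customDataIdentifierIds": [identifier_id],
        "s3JobDefinition": {
            "bucketDefinitions": [
                {"accountId": account_id, "buckets": [bucket_name]}
            ],
            # Equivalent of including the csv file name extension.
            "scoping": {
                "includes": {
                    "and": [
                        {
                            "simpleScopeTerm": {
                                "comparator": "EQ",
                                "key": "OBJECT_EXTENSION",
                                "values": ["csv"],
                            }
                        }
                    ]
                }
            },
        },
    }
```

With Python's `re`, `local_matches("1,ab-01\n2,AB-02")` finds only `ab-01`, because the pattern accepts lowercase letters only. The pattern is also unanchored, so it matches inside longer strings. If the real data uses uppercase letters, the regular expression would probably need adjusting, which the Macie API operation `test_custom_data_identifier` can help confirm. The payloads would be passed as keyword arguments, for example `client.create_classification_job(**build_job_request(...))`.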

Task 4: Macie job run and findings

  1. Once the job is created, it starts running immediately.
  2. The job runs for approximately 10 minutes and gathers the findings.
  3. After about 10 minutes, the status changes to Complete.
  4. To view the findings for the job, perform the following:
    • Click on the job.
    • Select Show results.
    • Choose Show findings.
  5. To check the exact results, open the finding.
  6. To export the finding, perform the following:
    • Select the finding.
    • Click on the Actions button.
    • Choose Export (JSON).
  7. The JSON shown here is read-only. You may choose to download the complete report.
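
The wait-and-export flow above can also be scripted. This is a minimal sketch, assuming a boto3 `macie2` client and the job ID returned when the job was created. The helper names `wait_for_job` and `export_findings` are hypothetical, and only the first page of findings is fetched.

```python
import json
import time


def wait_for_job(client, job_id, poll_seconds=30, max_polls=40, sleep=time.sleep):
    """Poll the job until it is COMPLETE or CANCELLED, or polls run out.

    Returns the last job status that was seen (None if never polled).
    """
    status = None
    for _ in range(max_polls):
        response = client.describe_classification_job(jobId=job_id)
        status = response["jobStatus"]
        if status in ("COMPLETE", "CANCELLED"):
            break
        sleep(poll_seconds)
    return status


def export_findings(client, job_id):
    """Return the findings produced by `job_id` as a JSON string."""
    criteria = {
        "criterion": {"classificationDetails.jobId": {"eq": [job_id]}}
    }
    # Only the first page of IDs is read; larger result sets need nextToken.
    finding_ids = client.list_findings(findingCriteria=criteria).get("findingIds", [])
    if not finding_ids:
        return "[]"
    findings = client.get_findings(findingIds=finding_ids)["findings"]
    # default=str turns timestamps into strings so they can be serialized.
    return json.dumps(findings, indent=2, default=str)
```

The string returned by `export_findings` is roughly what the Export (JSON) action shows, and it can be written to a file for later review.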

Task 5: Validation of the lab

  1. Once the lab steps are completed, please click on the validation button on the right side panel.
  2. This validates the resources in the AWS account and displays whether you have completed the lab successfully.
  3. Sample output : 

Completion and Conclusion

  1. You have successfully enabled Amazon Macie for the account.
  2. You have successfully created a custom data identifier and a Macie job to scan the S3 bucket.
  3. You have successfully viewed and exported the findings of the Macie job.