HIGH SageMaker

SageMaker notebook direct internet access

Check ID: aws-sagemaker-002

AWS-SAGEMAKER-002 is an AWS security check performed by cloud-audit, an open-source AWS security scanner. It checks whether SageMaker notebook instances have direct internet access disabled. When direct internet access is enabled, notebook traffic bypasses VPC security controls and network monitoring.

Why it matters

SageMaker notebooks with direct internet access can communicate with any external endpoint, bypassing VPC security groups, NACLs, and network monitoring tools. This creates an unmonitored data exfiltration path: a compromised notebook can upload training datasets, model weights, and AWS credentials to attacker-controlled servers without the traffic appearing in your VPC flow logs. Direct internet access also widens exposure to supply chain attacks, because pip install can pull packages from arbitrary PyPI mirrors. Placing notebooks in a VPC with a NAT gateway and outbound filtering makes all traffic visible to network security tools and lets you restrict it to approved destinations such as PyPI, GitHub, and specific S3 endpoints.

Common causes

Direct internet access is enabled by default on SageMaker notebook instances. Teams often keep the default to simplify setup, especially when data scientists need to download packages and datasets from the internet. Placing notebooks in a VPC requires additional networking infrastructure (private subnets, NAT gateways, and VPC endpoints for S3 and the SageMaker API), and teams often defer that extra cost and complexity.

Detection

Run cloud-audit to detect this issue:

pip install cloud-audit
cloud-audit scan -R

The -R flag includes remediation details for every finding, including this one.
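
To give a rough idea of what a check like this involves, here is a minimal sketch written against a boto3-style SageMaker client. It is not cloud-audit's actual implementation. It assumes the client's list_notebook_instances paginator and describe_notebook_instance call; the function name find_exposed_notebooks is hypothetical.

```python
"""Sketch: find SageMaker notebook instances with direct internet access enabled.

Illustrative only; this is not how cloud-audit is implemented.
find_exposed_notebooks is a hypothetical helper name.
"""


def find_exposed_notebooks(sagemaker_client):
    """Return the names of notebook instances whose DirectInternetAccess is 'Enabled'."""
    exposed = []
    paginator = sagemaker_client.get_paginator("list_notebook_instances")
    for page in paginator.paginate():
        for summary in page.get("NotebookInstances", []):
            name = summary["NotebookInstanceName"]
            # The list call returns summaries only; the internet access
            # setting is read from the describe response.
            detail = sagemaker_client.describe_notebook_instance(
                NotebookInstanceName=name
            )
            if detail.get("DirectInternetAccess") == "Enabled":
                exposed.append(name)
    return exposed


# Example usage (requires boto3, AWS credentials, and a configured region):
#   import boto3
#   print(find_exposed_notebooks(boto3.client("sagemaker")))
```

Passing the client in as an argument keeps the function easy to exercise with a stub, and lets the caller decide which region and credentials to scan.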

Remediation: AWS CLI

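Direct internet access is set when a notebook instance is created; update-notebook-instance does not accept --direct-internet-access, --subnet-id, or --security-group-ids. Back up any data stored on the instance, then create a replacement in a private subnet:

aws sagemaker create-notebook-instance --notebook-instance-name NOTEBOOK_NAME --instance-type INSTANCE_TYPE --role-arn ROLE_ARN --direct-internet-access Disabled --subnet-id SUBNET_ID --security-group-ids SG_ID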
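
When an existing instance has to be replaced (the internet access setting is chosen at creation time), the settings for the new instance can be derived from the old one's description. The following is a minimal sketch under that assumption: it takes a dict shaped like the response of boto3's describe_notebook_instance and builds keyword arguments for create_notebook_instance. The helper name build_recreate_request is hypothetical, and only a few optional settings are carried over.

```python
"""Sketch: build a VPC-only CreateNotebookInstance request from an existing instance.

build_recreate_request is a hypothetical helper; it does not call AWS itself.
"""

# Optional settings that use the same key in the describe response
# and in the create request.
_SAME_NAME_KEYS = ("KmsKeyId", "VolumeSizeInGB", "RootAccess", "PlatformIdentifier")


def build_recreate_request(description, subnet_id, security_group_ids):
    """Return kwargs for create_notebook_instance with direct internet access disabled."""
    request = {
        "NotebookInstanceName": description["NotebookInstanceName"],
        "InstanceType": description["InstanceType"],
        "RoleArn": description["RoleArn"],
        "DirectInternetAccess": "Disabled",
        # A subnet is required when direct internet access is disabled.
        "SubnetId": subnet_id,
        "SecurityGroupIds": list(security_group_ids),
    }
    for key in _SAME_NAME_KEYS:
        if key in description:
            request[key] = description[key]
    # The lifecycle configuration is named differently in the two calls.
    lifecycle = description.get("NotebookInstanceLifecycleConfigName")
    if lifecycle:
        request["LifecycleConfigName"] = lifecycle
    return request


# Example usage (requires boto3 and AWS credentials; the old instance must be
# deleted first, or a different name chosen, because names must be unique):
#   import boto3
#   sm = boto3.client("sagemaker")
#   old = sm.describe_notebook_instance(NotebookInstanceName="NOTEBOOK_NAME")
#   sm.create_notebook_instance(**build_recreate_request(old, "SUBNET_ID", ["SG_ID"]))
```

Data on the old instance's storage volume is not carried over by this approach, so anything that needs to survive should be copied elsewhere (for example to S3 or a Git repository) before the old instance is deleted.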

Remediation: Terraform

resource "aws_sagemaker_notebook_instance" "main" {
  name                   = "notebook"
  instance_type          = "ml.t3.medium"
  role_arn               = aws_iam_role.sagemaker.arn
  direct_internet_access = "Disabled" # requires subnet_id to be set
  subnet_id              = var.private_subnet_id
  security_groups        = [var.security_group_id]
}

This check is part of cloud-audit - install with pip install cloud-audit