Event 3

Report: “WORKSHOP: DATA SCIENCE ON AWS”

Event Objectives

  • Introduce the AWS AI services
  • Demonstrate deploying AI models using Amazon SageMaker
  • Show how to deploy models and access them via APIs

Speakers

  • Van Hoang Kha - Cloud Solutions Architect, AWS User Group Leader
  • Bach Doan Vuong - Cloud Developer Engineer, AWS Community Builder

Highlights

1. The Importance of Cloud Computing in Data Science

  • The session opened by emphasizing cloud computing’s central role in modern data science, especially for storing and processing large datasets.
  • Cloud vs. On-premise:
    • Cloud: Better scalability, faster deployment, and a pay-as-you-go cost model (OPEX).
    • On-premise: High upfront infrastructure cost (CAPEX), resources that cannot be scaled instantly, and an ongoing maintenance burden.
  • AWS provides an “End-to-End” platform for the Data Science pipeline: from collection, storage, and processing to training and model operation.

2. The 3-Layer Stack of AWS AI

AWS designs its AI ecosystem in 3 distinct layers to serve diverse needs:

Layer 1: AI Services (Pre-trained, Managed AI Services)

Suitable for: App developers who do not specialize in ML.

  • Intelligent APIs pre-trained by AWS.
  • Quickly integrate into applications to add AI features immediately.
  • Key Services:
    • Amazon Comprehend: Text and sentiment analysis.
    • Amazon Translate: Automated translation.
    • Amazon Textract: Extract text and structured data from scanned documents and images.
    • Amazon Rekognition: Image and video analysis.
    • Amazon Bedrock: Managed API access to foundation models from multiple providers.
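
As a rough illustration of how little code a Layer 1 service needs, the sketch below calls Amazon Comprehend's `detect_sentiment` operation through boto3. It was not shown in the workshop: the helper names `review_sentiment` and `make_comprehend_client` and the default region are assumptions made for this example, and a real call requires boto3 plus valid AWS credentials.

```python
from typing import Any


def make_comprehend_client(region: str = "us-east-1") -> Any:
    """Create a Comprehend client. The default region is an assumption."""
    # Imported lazily so that loading this file does not require boto3.
    import boto3

    return boto3.client("comprehend", region_name=region)


def review_sentiment(client: Any, text: str, language: str = "en") -> str:
    """Return the overall sentiment label Comprehend assigns to `text`.

    `client` is passed in so the function can be exercised with a stub
    when no AWS account is available.
    """
    response = client.detect_sentiment(Text=text, LanguageCode=language)
    # The response also carries per-class confidence scores under "SentimentScore".
    return response["Sentiment"]
```

With credentials configured, `review_sentiment(make_comprehend_client(), "The workshop was clear and well organized.")` returns one of the labels `POSITIVE`, `NEGATIVE`, `NEUTRAL`, or `MIXED`.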

Layer 2: ML Services (Amazon SageMaker)

Suitable for: Data Scientists & ML Engineers.

  • A fully managed platform for machine learning, with SageMaker Studio as its integrated development environment (IDE).
  • Provides tools for every step:
    • Data Wrangler: Data preparation.
    • Feature Store: Central repository for storing and sharing ML features.
    • SageMaker Autopilot: Automated training (AutoML).
    • Model Registry: Model lifecycle management.

Layer 3: AI Infrastructure (ML Infrastructure)

Suitable for: Expert Practitioners needing deep optimization.

  • Provides the most powerful computing resources:
    • EC2 Instances (P5, G6, Trn1…): NVIDIA GPU and AWS Trainium accelerators for training/inference.
    • EKS/ECS: Run AI workloads on container platforms.

3. Powerful Support Tools for Students

  1. Amazon SageMaker: A good starting point for learning and practicing industry-standard ML workflows.
  2. Amazon Comprehend: Handles NLP tasks such as review sentiment analysis and text classification.
  3. Amazon Translate: Supports building multilingual applications at low cost.
  4. Amazon Textract: Effective solution for digitizing data from paper documents.

4. Demo & Practice

  • Demo 1: No-code ML with SageMaker Canvas

    • The speaker built a prediction model in a visual drag-and-drop interface without writing any code.
    • Lesson: Non-experts such as business analysts can also apply ML.
  • Demo 2: Deploy Model as an API Service

    • The speaker walked through deploying a trained model to a SageMaker endpoint and exposing it through API Gateway.
    • Lesson: Understand the process of “Productionizing” an AI model for end-users.
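
One common way to wire this up is to place a small AWS Lambda function between API Gateway and the endpoint. The sketch below is not the code from the demo. It assumes a Lambda proxy integration, a hypothetical endpoint name `my-model-endpoint`, and a model that accepts CSV input. The optional `runtime` parameter is added here only so the handler can be tried without AWS access.

```python
import json
from typing import Any, Optional

ENDPOINT_NAME = "my-model-endpoint"  # hypothetical endpoint name


def build_response(status: int, payload: Any) -> dict:
    """Shape a result the way API Gateway's Lambda proxy integration expects."""
    return {
        "statusCode": status,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(payload),
    }


def lambda_handler(event: dict, context: Any, runtime: Optional[Any] = None) -> dict:
    """Forward the request body to the SageMaker endpoint and return its output."""
    if runtime is None:
        # Imported lazily so the handler can be tested with a stub client.
        import boto3

        runtime = boto3.client("sagemaker-runtime")

    body = event.get("body") or ""
    if not body:
        return build_response(400, {"error": "request body is empty"})

    result = runtime.invoke_endpoint(
        EndpointName=ENDPOINT_NAME,
        ContentType="text/csv",  # assumption: the model accepts CSV rows
        Body=body,
    )
    # "Body" is a streaming object; read() returns the raw bytes the model produced.
    prediction = result["Body"].read().decode("utf-8")
    return build_response(200, {"prediction": prediction})
```

Lambda calls the handler with `event` and `context` only, so `runtime` is left as `None` in production and the real `sagemaker-runtime` client is created on demand.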

Event Experience

The workshop was a valuable experience that helped me organize my knowledge of AI/ML in the cloud.

  • Systems Thinking: Understanding the 3 distinct layers helps me choose the right tool for the right problem, avoiding “using a sledgehammer to crack a nut”.
  • Real-world Perspective: Visual demos clearly illustrated the path from a notebook file to a functioning real-world API service.
  • Motivation: It was encouraging to see the strong support AWS offers the community and students through Free Tier programs and learning materials.

Some photos from the event

[Photo: event-3]