Event 3
Report: “WORKSHOP: DATA SCIENCE ON AWS”
Event Objectives
- Introduce the AWS AI services
- Demonstrate deploying AI models using Amazon SageMaker
- Show how to deploy models and access them via APIs
Speakers
- Van Hoang Kha - Cloud Solutions Architect, AWS User Group Leader
- Bach Doan Vuong - Cloud Developer Engineer, AWS Community Builder
Highlights
1. The Importance of Cloud Computing in Data Science
- The speakers opened by stressing cloud computing’s central role in modern data science, especially for storing and processing large datasets.
- Cloud vs. On-premise:
- Cloud: Superior scalability, deployment agility, and flexible cost model (OPEX).
- On-premise: High upfront infrastructure cost (CAPEX), difficulty scaling resources on demand, and an ongoing maintenance burden.
- AWS provides an “End-to-End” platform for the Data Science pipeline: from collection, storage, and processing to training and model operation.
2. The 3-Layer Stack of AWS AI
AWS organizes its AI ecosystem into three layers to serve different needs:
Layer 1: AI Services (Pre-managed AI Services)
Suitable for: Application developers who do not specialize in ML.
- Intelligent APIs pre-trained by AWS.
- Integrate quickly into applications to add AI features without training a model.
- Key Services:
- Amazon Comprehend: Text and sentiment analysis.
- Amazon Translate: Automated translation.
- Amazon Textract: Extract text and structured data from scanned documents and images.
- Amazon Rekognition: Image and video analysis.
- Amazon Bedrock: Gateway to powerful Foundation Models.
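
To make the "intelligent API" idea concrete, here is a minimal sketch (not code shown at the workshop) that chains two of the services listed above: Amazon Translate turns a review into English, then Amazon Comprehend scores its sentiment. The helper names `analyze_review` and `make_clients`, the default region, and the sample text are my own assumptions for illustration. `translate_text` and `detect_sentiment` are the boto3 client calls for the two services.

```python
from typing import Any, Dict, Tuple


def analyze_review(comprehend: Any, translate: Any, text: str, source_lang: str) -> Dict[str, Any]:
    """Translate a review to English, then return its sentiment (hypothetical helper)."""
    # Amazon Translate returns the translated string under "TranslatedText".
    translated = translate.translate_text(
        Text=text,
        SourceLanguageCode=source_lang,
        TargetLanguageCode="en",
    )["TranslatedText"]

    # Amazon Comprehend returns a label ("Sentiment") plus per-class scores.
    result = comprehend.detect_sentiment(Text=translated, LanguageCode="en")
    return {
        "english_text": translated,
        "sentiment": result["Sentiment"],
        "scores": result["SentimentScore"],
    }


def make_clients(region: str = "ap-southeast-1") -> Tuple[Any, Any]:
    """Create real clients; the region is an assumption, use your own."""
    # Imported lazily so analyze_review can be exercised with stub clients.
    import boto3

    return (
        boto3.client("comprehend", region_name=region),
        boto3.client("translate", region_name=region),
    )
```

With AWS credentials configured, a call would look like `comprehend, translate = make_clients()` followed by `analyze_review(comprehend, translate, "Sản phẩm rất tốt", "vi")`. Passing the clients in as arguments keeps the helper easy to test without touching AWS.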
Layer 2: ML Services (Amazon SageMaker)
Suitable for: Data Scientists & ML Engineers.
- Comprehensive managed platform for Machine Learning, with SageMaker Studio as its Integrated Development Environment (IDE).
- Provides tools for every step:
- Data Wrangler: Data preparation.
- Feature Store: Feature repository.
- SageMaker Autopilot: Automated training (AutoML).
- Model Registry: Model lifecycle management.
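
One of the workshop's stated objectives was deploying a model and accessing it via an API. As a rough sketch of the "access" half, assuming a model is already deployed to a SageMaker real-time endpoint, a client can call it through the `sagemaker-runtime` API. The helper names, the region, and the JSON payload shape (`{"instances": [...]}`) are assumptions on my part. The real request format depends on the model container behind the endpoint.

```python
import json
from typing import Any, List


def predict(runtime: Any, endpoint_name: str, features: List[float]) -> Any:
    """Send one JSON request to a deployed endpoint and decode the JSON reply."""
    response = runtime.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        # Assumption: this payload shape is illustrative only.
        Body=json.dumps({"instances": [features]}),
    )
    # The response body is a stream; read it fully before decoding.
    return json.loads(response["Body"].read())


def make_runtime_client(region: str = "ap-southeast-1") -> Any:
    """Create the real runtime client; the region is an assumption."""
    # Imported lazily so predict can be exercised with a stub client.
    import boto3

    return boto3.client("sagemaker-runtime", region_name=region)
```

Usage would be `predict(make_runtime_client(), "my-endpoint", [5.1, 3.5, 1.4, 0.2])`, where `my-endpoint` is a placeholder for whatever endpoint name was chosen at deployment.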
Layer 3: AI Infrastructure (ML Infrastructure)
Suitable for: Expert Practitioners needing deep optimization.
- Provides the most powerful computing resources:
- EC2 Instances (P5, G6, Trn1…): Specialized chipsets for training/inference.
- EKS/ECS: Run AI workloads on container platforms.
- Amazon SageMaker: The best place to start learning and practicing industry-standard ML workflows.
- Amazon Comprehend: A powerful tool for NLP tasks such as review analysis and text classification.
- Amazon Translate: Supports building multilingual applications at low cost.
- Amazon Textract: Effective solution for digitizing data from paper documents.
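
As an illustration of the document-digitization point above, the sketch below sends a scanned page to Amazon Textract and keeps only the detected lines of text. It is my own minimal example, not material from the workshop. The helper name `extract_lines` and the file name in the usage note are hypothetical. `detect_document_text` is the boto3 Textract call, and its response contains a list of `Blocks` tagged with a `BlockType`.

```python
from typing import Any, List


def extract_lines(textract: Any, image_bytes: bytes) -> List[str]:
    """Return the text of each LINE block Textract detects in one image."""
    response = textract.detect_document_text(Document={"Bytes": image_bytes})
    # Blocks also include PAGE and WORD entries; keep whole lines only.
    return [
        block["Text"]
        for block in response["Blocks"]
        if block["BlockType"] == "LINE"
    ]
```

With credentials configured, this could be called as `extract_lines(boto3.client("textract"), open("invoice.png", "rb").read())`, where `invoice.png` stands in for any scanned document image.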
4. Demo & Practice
Event Experience
The workshop was a truly valuable experience that helped me systematize my knowledge of AI/ML in the cloud.
- Systems Thinking: Understanding the 3 distinct layers helps me choose the right tool for the right problem, avoiding “using a sledgehammer to crack a nut”.
- Real-world Perspective: Visual demos clearly illustrated the path from a notebook file to a functioning real-world API service.
- Motivation: It was encouraging to see the strong support AWS offers to the community and students through Free Tier programs and learning materials.
