Real-World ML Problem Solving
NLP & Computer Vision Applications
NLP Applications
Sentiment Analysis:
# Interview approach
def sentiment_pipeline(text, vectorizer, model):
    """Predict sentiment for one text (assumes a fitted vectorizer and classifier)."""
    # 1. Preprocessing
    text = text.lower().strip()
    # 2. Features
    # Option A: TF-IDF + Logistic Regression (baseline, used here)
    # Option B: Word embeddings (Word2Vec, GloVe)
    # Option C: Transformers (BERT, RoBERTa)
    features = vectorizer.transform([text])  # the vectorizer tokenizes internally
    # 3. Model
    # Start simple: Logistic Regression on TF-IDF
    # Production: Fine-tuned BERT
    return model.predict(features)[0]
Interview Q: "Build a sentiment classifier for product reviews." A:
- Data: Labeled reviews (1-5 stars)
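- Preprocessing (for the TF-IDF baseline): Lowercase, remove punctuation, preserve negations ("not good") via bigrams or negation marking; transformer models need little beyond their own tokenizer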
- Baseline: TF-IDF + Logistic Regression
- Advanced: Fine-tune DistilBERT (faster than BERT)
- Hard cases: Sarcasm (largely unsolved), emojis, misspellings
- Evaluation: F1 per class, confusion matrix
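The negation-handling step in the answer above can be sketched with a simple scope-marking heuristic: every token after a negation cue gets a `NOT_` prefix until the next punctuation mark, so a bag-of-words model sees "not good" as its own feature. The function name `mark_negation`, the cue list, and the prefix are illustrative assumptions, not part of any library.

```python
import re

# Hypothetical cue list for illustration; extend it for real data.
NEGATIONS = {"not", "no", "never", "cannot"}
PUNCTUATION = {".", ",", "!", "?", ";"}

def mark_negation(text):
    """Prefix tokens that follow a negation cue with NOT_ until punctuation ends the scope."""
    # Split into lowercase word tokens and clause-ending punctuation marks.
    tokens = re.findall(r"[a-z']+|[.,!?;]", text.lower())
    marked = []
    negating = False
    for tok in tokens:
        if tok in PUNCTUATION:
            negating = False   # punctuation closes the negation scope
            continue           # punctuation itself is dropped from the output
        is_cue = tok in NEGATIONS or tok.endswith("n't")
        marked.append("NOT_" + tok if negating and not is_cue else tok)
        if is_cue:
            negating = True
    return marked

print(mark_negation("This is not good, but the price is fine."))
```

Feeding these marked tokens into TF-IDF lets the linear baseline learn a separate weight for `NOT_good`. The heuristic is crude: it ignores double negation and long-range scope, which is one reason fine-tuned transformers tend to do better here.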
Text Classification:
- Spam detection: Naive Bayes or Logistic Regression (fast, interpretable)
- Intent classification: BERT embeddings with a classification head, or a fine-tuned BERT
- Named Entity Recognition (NER): BiLSTM-CRF or a transformer token classifier
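To illustrate why Naive Bayes is a fast, interpretable spam baseline, here is a minimal multinomial Naive Bayes sketch in plain Python with add-one (Laplace) smoothing. The class name `TinyNaiveBayes` and the toy messages are invented for illustration; in practice a library implementation would be the usual choice.

```python
import math
from collections import Counter, defaultdict

class TinyNaiveBayes:
    """Multinomial Naive Bayes over whitespace tokens (illustrative sketch)."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def log_scores(self, text):
        """Return log P(label) + sum of log P(word | label) for each label."""
        total_docs = sum(self.class_counts.values())
        scores = {}
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)  # add-one smoothing
            score = math.log(n_docs / total_docs)
            for word in text.lower().split():
                if word in self.vocab:  # words never seen in training are ignored
                    score += math.log((counts[word] + 1) / denom)
            scores[label] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)

# Toy training data, invented for this example.
train_texts = ["win free money now", "free prize claim now",
               "meeting at noon tomorrow", "lunch tomorrow with the team"]
train_labels = ["spam", "spam", "ham", "ham"]
clf = TinyNaiveBayes().fit(train_texts, train_labels)
print(clf.predict("free money"), clf.predict("team meeting tomorrow"))
```

Training is a single counting pass, and the per-word log-probabilities in `log_scores` can be inspected directly to explain why a message was flagged.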
Computer Vision Applications
Image Classification:
# Interview approach
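import torch.nn as nn
from torchvision import models

def image_classifier(num_classes):
    # 1. Architecture
    # Transfer learning: Pre-trained ResNet50 (EfficientNet is an alternative)
    model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
    # Fine-tune top layers on custom data: freeze the backbone, replace the head
    for param in model.parameters():
        param.requires_grad = False
    model.fc = nn.Linear(model.fc.in_features, num_classes)
    # 2. Data augmentation (applied in the training data pipeline)
    # Random crop, flip, rotation, color jitter
    # 3. Training (done by the caller)
    # Adam optimizer, learning rate schedule
    # Early stopping on validation loss
    return model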
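The "early stopping on validation loss" step can be sketched independently of any framework. The `EarlyStopping` class, its `patience` and `min_delta` parameters, and the loss values below are illustrative assumptions; deep learning frameworks provide their own equivalents.

```python
class EarlyStopping:
    """Signal a stop once validation loss has not improved for `patience` epochs."""

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best_loss = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch's validation loss; return True if training should stop."""
        if val_loss < self.best_loss - self.min_delta:
            self.best_loss = val_loss  # improvement: remember it and reset the counter
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience

# Made-up validation losses: improvement stalls after the third epoch.
val_losses = [0.90, 0.70, 0.65, 0.66, 0.67, 0.65, 0.64]
stopper = EarlyStopping(patience=3)
stopped_at = None
for epoch, loss in enumerate(val_losses, start=1):
    if stopper.step(loss):
        stopped_at = epoch
        break
print(stopped_at, stopper.best_loss)
```

In a real training loop, the model weights from the best epoch would typically be checkpointed and restored when the stop fires.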
Interview Q: "Classify X-ray images for disease detection" A:
- Transfer learning: ImageNet pre-trained ResNet50
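- Data augmentation: Rotations and flips, kept small and anatomically plausible (X-rays are roughly centered and consistently oriented, and a horizontal flip swaps left and right)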
- Class imbalance: Weighted loss (rare diseases)
- Evaluation: AUC-ROC per disease, sensitivity (recall)
- Interpretability: Grad-CAM heatmaps (show the radiologist which regions drive the prediction)
- Validation: Stratified k-fold, plus an external test set from different hospitals
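The class-imbalance point in the answer above can be made concrete with inverse-frequency class weights feeding a weighted binary cross-entropy. This is a plain-Python sketch: the function names and toy labels are assumptions for illustration, and the weighting formula `n_samples / (n_classes * class_count)` is one common heuristic among several.

```python
import math
from collections import Counter

def inverse_frequency_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}

def weighted_bce(y_true, y_prob, weights, eps=1e-12):
    """Weighted binary cross-entropy, normalized by the sum of the weights used."""
    total, weight_sum = 0.0, 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        loss = -math.log(p) if y == 1 else -math.log(1 - p)
        total += weights[y] * loss
        weight_sum += weights[y]
    return total / weight_sum

# Toy labels: 1 = disease present (rare), 0 = healthy.
labels = [0] * 9 + [1]
weights = inverse_frequency_weights(labels)

# A model that predicts 10% disease probability for everyone misses the one positive.
probs = [0.1] * 10
weighted = weighted_bce(labels, probs, weights)
unweighted = weighted_bce(labels, probs, {0: 1.0, 1: 1.0})
print(weights, weighted, unweighted)
```

With these weights the single missed positive dominates the loss, so the optimizer is pushed toward higher sensitivity on the rare class rather than toward always predicting "healthy".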
Object Detection:
- YOLO: Single-stage detector, real-time (autonomous driving, video)
- Faster R-CNN: Two-stage detector (region proposals → classification), often higher accuracy at lower speed (medical, satellite imagery)
Interview Q: "Detect pedestrians for self-driving car" A:
- Model: YOLOv8 (real-time; 30+ FPS is achievable on a suitable GPU)
- Data: Augment with weather conditions, lighting
- Evaluation: mAP (mean Average Precision) at a fixed IoU threshold, e.g., mAP@0.5
- Edge cases: Occlusion, night time, small objects
- Deployment: TensorRT optimization for GPU inference
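The IoU threshold used in detection metrics such as mAP rests on a simple overlap ratio between a predicted box and a ground-truth box. A minimal sketch follows, assuming boxes are given as `(x1, y1, x2, y2)` corner coordinates; the function names and the 0.5 default threshold are illustrative choices.

```python
def iou(box_a, box_b):
    """Intersection over Union for two boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle (empty if the boxes do not intersect).
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def is_true_positive(pred_box, gt_box, threshold=0.5):
    """A prediction counts as a match when its IoU with the ground truth meets the threshold."""
    return iou(pred_box, gt_box) >= threshold

print(iou((0, 0, 10, 10), (5, 0, 15, 10)))
```

Two boxes of equal size that overlap by half score an IoU of only 1/3, so at a 0.5 threshold such a prediction would count as a miss. This matters for small or partially occluded pedestrians, where a few pixels of offset can change the verdict.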