[DRAFT] Personal Learning Journey

February 17, 2025 · 4 min read

SRE Engineer - Fullstack Enthusiast - Go, Python, React, TypeScript

Introduction

Continuous learning is essential for anyone working in Site Reliability Engineering (SRE), DevOps, or Machine Learning/Data Analytics (ML/DA). This post outlines my structured learning journey: the key knowledge areas, practical skills, and resources that have helped me grow as an engineer. Whether you're just starting out or looking to deepen your expertise, I hope it serves as a useful roadmap.

SRE & DevOps Learning Roadmap

1. Kernel Fundamentals

A strong grasp of the kernel is crucial for SREs, especially when debugging performance issues or understanding system behavior. Focus on:

Process Management: Learn how processes are created, scheduled, and managed. Tools: ps, top, htop, strace.
Memory Management: Understand virtual memory, paging, and memory allocation. Tools: free, vmstat, smem.
File Systems: Explore how the kernel interacts with file systems. Practice mounting, permissions, and troubleshooting disk issues.
System Calls: Study the interface between user space and kernel space. Use strace to see which system calls a program makes.

Actionable Tip: Try debugging a real performance issue using these tools to solidify your understanding.
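To make process creation concrete, here is a minimal sketch that spawns a child process, waits for it, and collects its exit status and output. The helper name `run_child` is my own, not part of any library. On Linux, running a script like this under `strace -f` should show the process-related system calls involved, such as `execve`.

```python
import subprocess
import sys


def run_child(code: str) -> tuple[int, str]:
    """Spawn a child Python interpreter, wait for it, return (exit code, stdout).

    `run_child` is a hypothetical helper used only for illustration.
    """
    result = subprocess.run(
        [sys.executable, "-c", code],  # child runs the same interpreter
        capture_output=True,           # capture stdout/stderr through pipes
        text=True,                     # decode bytes to str
        check=False,                   # inspect non-zero exit codes ourselves
    )
    return result.returncode, result.stdout.strip()


if __name__ == "__main__":
    exit_code, output = run_child("import os; print(os.getpid())")
    print(f"child exited with {exit_code}, reported pid {output}")
```

The parent blocks until the child exits, which is the same create, wait, and reap cycle that tools like ps and top let you observe from the outside.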

2. Operating System & Networking Essentials

SREs must be comfortable with both OS and networking concepts:

OS Concepts: Processes, threads, memory, and file systems. Practice with Linux or BSD systems.
System Administration: User and package management, troubleshooting, and automation. Tools: useradd, apt, yum, systemctl.
Security: Learn about authentication, authorization, and common vulnerabilities. Resources: OWASP Top 10.
Networking:
- TCP/IP: Understand the basics of network communication.
- HTTP/HTTPS: Learn how web protocols work.
- DNS: Practice resolving and troubleshooting DNS issues.
- Load Balancing & Firewalls: Study how to distribute traffic and secure networks.
- VPN: Set up a VPN for secure remote access.

Actionable Tip: Set up a small home lab to practice networking and OS administration.
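Before building a full home lab, a small loopback experiment can help with the TCP basics. The sketch below is an assumption-laden toy, not a production server: it starts a one-shot echo server on 127.0.0.1, connects a client, and returns what came back. The function name `echo_once` is made up for this example.

```python
import socket
import threading


def echo_once(payload: bytes) -> bytes:
    """Send a small payload through a one-shot TCP echo server on loopback."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]

    def serve() -> None:
        conn, _addr = server.accept()  # blocks until the client connects
        with conn:
            # A single recv is enough for this small demo payload; real
            # servers loop, because TCP is a byte stream without framing.
            data = conn.recv(1024)
            conn.sendall(data)

    thread = threading.Thread(target=serve, daemon=True)
    thread.start()

    with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
        client.sendall(payload)
        reply = client.recv(1024)

    thread.join(timeout=5)
    server.close()
    return reply


if __name__ == "__main__":
    print(echo_once(b"ping"))
```

Watching this exchange with a packet capture tool is a good way to see the connection setup and teardown around the payload.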

3. Computer Science Fundamentals

A solid CS foundation is invaluable:

Data Structures & Algorithms: Arrays, linked lists, trees, graphs, sorting/searching, and Big O analysis. Practice on platforms like LeetCode or HackerRank.
Operating Systems & Networking: Deepen your understanding beyond the basics covered in section 2.
Databases:
- SQL: Data modeling, schema design, queries, transactions, indexing.
- NoSQL: Key-value, document, column-family, and graph databases.
- Administration: Backup, restore, and performance tuning.
Distributed Systems:
- Core Concepts: Consistency, fault tolerance, latency.
- Scalability Patterns: Sharding, replication, caching.
- Messaging: Explore Kafka, RabbitMQ, etc.

Resource: CS50 by Harvard is a great starting point.
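As a small illustration of the sharding pattern mentioned above, here is a sketch of hash-based shard assignment. The function name `shard_for` and the choice of SHA-256 are my own assumptions; real systems pick their own hash and often use consistent hashing instead.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index in [0, num_shards) using a stable hash."""
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    # Python's built-in hash() is randomized per process for strings,
    # so a cryptographic digest is used to get a stable value across runs.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


if __name__ == "__main__":
    for user in ("alice", "bob", "carol"):
        print(user, "-> shard", shard_for(user, 4))
```

One trade-off worth noticing: with plain modulo assignment, changing `num_shards` remaps most keys, which is the problem consistent hashing is designed to reduce.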

4. System Administration & Automation

Linux Admin: User, package, and file system management; process monitoring.
Scripting: Automate tasks with Bash or Python.
Configuration Management: Learn Ansible, Chef, or Puppet for Infrastructure as Code (IaC).
Automation: Use cron or systemd timers for scheduled tasks.
Cloud Platforms: Get hands-on with AWS, Azure, or GCP.

Actionable Tip: Automate a repetitive task in your environment using a script or configuration management tool.
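As one possible starting point, the sketch below automates a common chore: finding (and optionally deleting) files older than a given age. The function name `remove_old_files` and its parameters are hypothetical, and it defaults to a dry run so nothing is deleted by accident.

```python
import time
from pathlib import Path


def remove_old_files(directory: Path, max_age_days: float, dry_run: bool = True) -> list[Path]:
    """Return files in `directory` older than `max_age_days`.

    Files are only deleted when dry_run is False. Subdirectories are skipped.
    """
    cutoff = time.time() - max_age_days * 86400  # 86400 seconds per day
    matched = []
    for path in sorted(directory.iterdir()):
        if path.is_file() and path.stat().st_mtime < cutoff:
            matched.append(path)
            if not dry_run:
                path.unlink()
    return matched


if __name__ == "__main__":
    # Dry run against the current directory: prints candidates, deletes nothing.
    for candidate in remove_old_files(Path("."), max_age_days=30):
        print("would remove:", candidate)
```

A script like this can then be scheduled with cron or a systemd timer once the dry-run output looks right.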

5. System Performance & Observability

Monitoring: Set up and use Prometheus, Grafana, Datadog, or New Relic.
Logging: Centralize logs with the ELK stack (Elasticsearch, Logstash, Kibana) or Fluentd.
Tracing: Implement distributed tracing with Jaeger, Zipkin, or OpenTelemetry.
Profiling: Use tools like perf or pprof for CPU/memory profiling.
Load Testing: Try JMeter, k6, or Locust to simulate real-world traffic.

Actionable Tip: Instrument a simple application with monitoring and logging, then run a load test and analyze the results.
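Before reaching for a full monitoring stack, it can help to instrument a function by hand. The sketch below uses only the standard library: a decorator records call latencies and logs them, and a helper summarizes the samples the way a load test report might. The names `timed`, `summarize`, and `LATENCIES_MS` are made up for this example.

```python
import functools
import logging
import statistics
import time

logger = logging.getLogger("demo")

# In-memory store for latency samples; a real setup would export metrics instead.
LATENCIES_MS: list[float] = []


def timed(func):
    """Record the wall-clock duration of each call in LATENCIES_MS and log it."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            elapsed_ms = (time.perf_counter() - start) * 1000
            LATENCIES_MS.append(elapsed_ms)
            logger.info("%s took %.3f ms", func.__name__, elapsed_ms)
    return wrapper


def summarize(samples: list[float]) -> dict[str, float]:
    """Return p50, p95, and max for a list of at least two samples."""
    cuts = statistics.quantiles(samples, n=100, method="inclusive")
    return {"p50": cuts[49], "p95": cuts[94], "max": max(samples)}


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)

    @timed
    def handle_request() -> None:
        time.sleep(0.01)  # stand-in for real work

    for _ in range(20):
        handle_request()
    print(summarize(LATENCIES_MS))
```

Looking at p95 and max alongside the median is the main habit to build here, since averages tend to hide tail latency.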

ML/DA Learning Roadmap

Recommended Resource: DeepLearning.AI

1. Data Mining & Preprocessing

Data cleaning, preprocessing, and visualization.
Feature extraction and selection.
Association rule mining and clustering algorithms.

Actionable Tip: Use Python libraries like Pandas and Scikit-learn to preprocess a real dataset.
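To show what two common preprocessing steps actually compute, here is a standard-library sketch of mean imputation and min-max scaling. In practice Pandas and Scikit-learn handle this for you; the function names `impute_mean` and `min_max_scale` are my own and exist only for illustration.

```python
from statistics import mean
from typing import Optional


def impute_mean(values: list[Optional[float]]) -> list[float]:
    """Replace missing entries (None) with the mean of the observed values."""
    present = [v for v in values if v is not None]
    if not present:
        raise ValueError("cannot impute a column with no observed values")
    fill = mean(present)
    return [float(fill) if v is None else float(v) for v in values]


def min_max_scale(values: list[float]) -> list[float]:
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no spread; map everything to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


if __name__ == "__main__":
    raw = [10.0, None, 30.0, 20.0]
    print(min_max_scale(impute_mean(raw)))
```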

2. Machine Learning Foundations

Supervised Learning: Regression, classification.
Unsupervised Learning: Clustering, dimensionality reduction.
Model Evaluation: Cross-validation, metrics.
Regularization: Techniques such as L1/L2 penalties that help prevent overfitting.
Algorithms: Linear/logistic regression, decision trees, SVM, k-NN.

Resource: Coursera ML by Andrew Ng
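Cross-validation is easier to reason about once you have seen the index bookkeeping. The sketch below splits sample indices into k contiguous folds without shuffling; the function name `k_fold_indices` is hypothetical, and for real work Scikit-learn's `KFold` is the more sensible choice.

```python
def k_fold_indices(n_samples: int, k: int) -> list[tuple[list[int], list[int]]]:
    """Return k (train_indices, test_indices) pairs over range(n_samples).

    Folds are contiguous and unshuffled; earlier folds absorb the remainder.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds = []
    start = 0
    for size in sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train_idx, test_idx))
        start += size
    return folds


if __name__ == "__main__":
    for train, test in k_fold_indices(6, 3):
        print("train:", train, "test:", test)
```

Every sample lands in exactly one test fold, so each model in the loop is evaluated on data it did not train on.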

3. Natural Language Processing (NLP)

Text preprocessing: tokenization, stemming, lemmatization.
Feature extraction: TF-IDF, word embeddings.
Text classification, sentiment analysis, language modeling.
Sequence-to-sequence models.
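TF-IDF is a good first feature to implement by hand. The sketch below assumes one common variant: term frequency as count divided by document length, and inverse document frequency as log(N / df) with no smoothing. Libraries such as Scikit-learn use slightly different formulas, and the function name `tf_idf` is my own.

```python
import math
from collections import Counter


def tf_idf(docs: list[list[str]]) -> list[dict[str, float]]:
    """Compute per-document TF-IDF scores for pre-tokenized documents.

    tf = count / document length, idf = ln(N / document frequency).
    """
    n_docs = len(docs)
    doc_freq: Counter = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))  # count each term once per document

    scores = []
    for tokens in docs:
        counts = Counter(tokens)
        total = len(tokens)
        scores.append(
            {term: (count / total) * math.log(n_docs / doc_freq[term])
             for term, count in counts.items()}
        )
    return scores


if __name__ == "__main__":
    corpus = [["the", "cat", "sat"], ["the", "dog", "sat"], ["the", "cat", "ran"]]
    for doc_scores in tf_idf(corpus):
        print(doc_scores)
```

Terms that appear in every document score zero under this variant, which is the intended effect: they do not help tell documents apart.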

4. Neural Networks & Deep Learning

Neural network architectures, activation and loss functions.
Optimization algorithms.
CNNs and RNNs.
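The pieces listed above (activation, loss, optimizer) fit together even in a one-neuron model. Here is a minimal sketch, assuming a single sigmoid neuron with scalar input, binary cross-entropy loss, and plain gradient descent. The function names are made up for illustration and the code is not numerically hardened for very large inputs.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic activation; fine for moderate z, may overflow for very negative z."""
    return 1.0 / (1.0 + math.exp(-z))


def bce_loss(p: float, y: float) -> float:
    """Binary cross-entropy for a predicted probability p and a label y in {0, 1}."""
    eps = 1e-12
    p = min(max(p, eps), 1 - eps)  # clamp to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


def sgd_step(w: float, b: float, x: float, y: float, lr: float) -> tuple[float, float]:
    """One gradient descent update for a single sigmoid neuron on one example."""
    p = sigmoid(w * x + b)
    grad_z = p - y  # derivative of BCE with respect to the pre-activation
    return w - lr * grad_z * x, b - lr * grad_z


if __name__ == "__main__":
    w, b = 0.0, 0.0
    for step in range(5):
        print(step, round(bce_loss(sigmoid(w * 1.0 + b), 1.0), 4))
        w, b = sgd_step(w, b, x=1.0, y=1.0, lr=0.5)
```

The printed loss should shrink from step to step, which is the whole idea of optimization in miniature.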

5. Large Language Models (LLMs)

Transformer architecture, pre-training, and fine-tuning.
Prompt engineering.
Applications: text generation, Q&A, summarization.

6. Transformers & Advanced Deep Learning

Self-attention, encoder-decoder, multi-head attention.
Applications in NLP and beyond.
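Self-attention is compact enough to write out in plain Python. The sketch below implements single-head scaled dot-product attention, softmax(QK^T / sqrt(d_k)) V, on lists of lists. It leaves out the learned projections, masking, and multiple heads that a real transformer layer has, and the function names are my own.

```python
import math


def softmax(xs: list[float]) -> list[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention(q, k, v):
    """Scaled dot-product attention.

    q: list of query vectors, k: list of key vectors, v: list of value vectors.
    Returns (outputs, weights), one row per query.
    """
    d_k = len(k[0])
    outputs, weights = [], []
    for q_i in q:
        scores = [
            sum(a * b for a, b in zip(q_i, k_j)) / math.sqrt(d_k) for k_j in k
        ]
        w = softmax(scores)  # how much this query attends to each key
        weights.append(w)
        outputs.append(
            [sum(w_j * v_j[c] for w_j, v_j in zip(w, v)) for c in range(len(v[0]))]
        )
    return outputs, weights


if __name__ == "__main__":
    x = [[1.0, 0.0], [0.0, 1.0]]  # two toy token embeddings
    out, w = attention(x, x, x)   # self-attention: q, k, v all come from x
    print(w)
    print(out)
```

Each output row is a weighted average of the value vectors, with the weights determined by how well the query matches each key.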

Conclusion

Continuous learning is the key to success in SRE, DevOps, and ML/DA. By following a structured roadmap, practicing hands-on, and leveraging the right resources, you can accelerate your growth and make a real impact. Remember: the best way to learn is by doing, so start experimenting, building, and sharing your journey!
