Syllabus porting DONE.

This commit is contained in:
Avleen Vig
2012-10-18 14:33:44 -05:00
parent 0220a2ef37
commit 3591c7dcf2
40 changed files with 1652 additions and 0 deletions


@@ -0,0 +1,30 @@
Application Components 201
**************************
Message Brokers
===============
RabbitMQ
--------
Apache ActiveMQ
---------------
Memory Caches
=============
Memcached
---------
Redis
-----
Specialized Caches
==================
Varnish
-------
nginx+memcached
---------------

16
architecture_101.rst Normal file

@@ -0,0 +1,16 @@
Architecture 101
****************
How to make good architecture decisions
=======================================
Patterns and anti-patterns
==========================
Introduction to availability
============================
Introduction to scalability
===========================

57
architecture_201.rst Normal file

@@ -0,0 +1,57 @@
Architecture 201
****************
Service Oriented Architectures
==============================
Fault tolerance, fault protection, masking, dependability fundamentals
======================================================================
Fail open, fail closed
----------------------
Perspective: node, network, cluster, application
------------------------------------------------
Caching Concerns
================
Static assets
-------------
Data
----
Eviction and replacement policies and evaluation
------------------------------------------------
Approaches
----------
(TTL, purge-on-write, no-purge versioning, constantly churning cache versus
contained, working set sizing)
Crash only
==========
Synchronous vs. Asynchronous
============================
Business continuity vs. Disaster Recovery
=========================================
Designing for Scalability: Horizontal, Vertical
===============================================
Simplicity
==========
Performance
===========
Tiered architectures
====================
MTTR > MTBF
===========
http://www.kitchensoap.com/2010/11/07/mttr-mtbf-for-most-types-of-f/

40
backups.rst Normal file

@@ -0,0 +1,40 @@
Backups
*******
Online/Offline backups
======================
Designing a backup policy
=========================
Deciding what to backup
-----------------------
Backup type and frequency
-------------------------
Retention periods and budgeting
-------------------------------
Offsite backups
---------------
Ensuring and monitoring backup integrity and completeness
=========================================================
Your RAID is not a backup (and neither is your cluster duplication)
===================================================================
Legal implications of having backups
====================================
Security implications of having backups
=======================================
Recovery basics
===============
Secure data destruction
=======================

56
capacity_planning.rst Normal file

@@ -0,0 +1,56 @@
Capacity Planning
*****************
(Seems reasonable to have the Statistics for Engineers course be a pre-req for
this course)
Fundamentals of capacity planning
=================================
Resource usage investigation and exploration
---------------------------------------------
* Examples: CPU:req/sec ratio, memory footprint:req/sec ratio, disk consumption
per user/per sale/per widget, etc.
* Application:Infrastructure metric relationships
* 2nd order capacity (logging,
metrics+monitoring systems, ancillary systems)
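As a rough illustration of the ratios listed above, the sketch below derives a per-request resource cost from a few observation windows and estimates the request rate at which each resource would hit a chosen limit. The sample figures and the names ``Sample``, ``cost_per_request`` and ``ceiling_rps`` are hypothetical, and the linear-cost assumption is a simplification.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Sample:
    """One observation window (all values are made up for illustration)."""
    req_per_sec: float
    cpu_percent: float   # host CPU utilisation, 0-100
    memory_mb: float     # application memory footprint


def cost_per_request(samples, attribute):
    """Average resource units consumed per request/sec of load."""
    return mean(getattr(s, attribute) / s.req_per_sec for s in samples)


def ceiling_rps(cost, limit):
    """Request rate at which a resource reaches `limit`.

    Assumes cost stays linear with load, which real systems often
    violate near saturation.
    """
    return limit / cost


samples = [
    Sample(req_per_sec=100, cpu_percent=20, memory_mb=400),
    Sample(req_per_sec=200, cpu_percent=40, memory_mb=800),
    Sample(req_per_sec=300, cpu_percent=60, memory_mb=1200),
]

cpu_cost = cost_per_request(samples, "cpu_percent")   # % CPU per req/sec
mem_cost = cost_per_request(samples, "memory_mb")     # MB per req/sec

print(f"CPU ceiling:    {ceiling_rps(cpu_cost, 80):.0f} req/sec at 80% CPU")
print(f"Memory ceiling: {ceiling_rps(mem_cost, 2048):.0f} req/sec at 2048 MB")
```

Whichever resource gives the lowest ceiling is the first bottleneck to plan around.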
Finding ceilings
----------------
* Discovering resource limits
* Comparing different hardware/instance profiles - production load versus
synthetic
* Benchmarking: pitfalls, limitations, pros/cons
* http://www.contextneeded.com/system-benchmarks
* Multivariate infra limits (multiple resource peak-driven usage) Ex: web+image
uploads, caching storage+processing, etc.
* Architecture analysis (anticipating the next bottleneck)
Forecasting
============
Linear and nonlinear trending and forecasting (“steering by your wake”)
-----------------------------------------------------------------------
Details of automatic forecasting and scaling
--------------------------------------------
Seasonality and future events
-----------------------------
* Organic growth approaches (bottom-up infra driven, top-down app driven)
* Inorganic growth events (new feature launch, holiday effects, “going viral”,
major public announcement)
* Provisioning effects on timelines, financial tradeoffs
Diagonal scaling
================
(vertically scaling your already horizontal architecture)
Reprovisioning and legacy system usage tradeoffs
------------------------------------------------

15
common_services.rst Normal file

@@ -0,0 +1,15 @@
Common services
***************
.. todo::
Introduction to RFCs
.. toctree::
system_daemons_101
dns_101
dns_201
dhcp
http_101
http_201
smtp_101
smtp_201

31
config_management.rst Normal file

@@ -0,0 +1,31 @@
Configuration Management 101
****************************
Idempotency
===========
Convergent and Congruent systems
================================
Direct and Indirect systems: ansible, capistrano
================================================
Chef
====
Chef (adam: I'm biased here, but I would do Chef in 101, Puppet and Cfengine in
201, but it's because I want junior admins to get better at scripting, not just
because I'm a dick.)
(Magnus: this goes back to why Ruby will be so much more for new guys coming in
today, like Perl was for a lot of us in the 90s)
Configuration Management 201
****************************
Puppet
======
Cfengine 3
==========

23
databases_101.rst Normal file

@@ -0,0 +1,23 @@
Databases 101 (Relational databases)
************************************
SQL shell
=========
Creating databases
==================
Creating users
==============
Granting privileges
===================
Basic normalized schema design
==============================
Select, Insert, Update and Delete
=================================

52
databases_201.rst Normal file

@@ -0,0 +1,52 @@
Databases 201
*************
Database Theory
===============
ACID
----
CAP theorem
-----------
High Availability
-----------------
Document Databases
==================
MongoDB
-------
CouchDB
-------
Hadoop
------
Key-value Stores
================
Riak
----
Cassandra
---------
Dynamo
------
BigTable
--------
Graph Databases
===============
FlockDB
-------
Neo4j
-----

20
datacenters_101.rst Normal file

@@ -0,0 +1,20 @@
Datacenters 101
***************
Power budgets
=============
A/B power and cascading outages
-------------------------------
Cooling budgets
===============
You will be judged by the tidiness of your rack
===============================================
Machine and cable labeling
==========================
Traditional naming conventions
==============================

29
datacenters_201.rst Normal file

@@ -0,0 +1,29 @@
Datacenters 201
***************
Networking many racks
=====================
Power
=====
N+1, N+2 power
--------------
Fused vs usable
---------------
calculations
------------
Cooling
=======
N+1, N+2
---------
Physical security and common security standards compliance requirements
=======================================================================
Suggested practices
===================

33
datacenters_301.rst Normal file

@@ -0,0 +1,33 @@
Datacenters 301
***************
Power
=====
PUE (Power Usage Effectiveness)
-------------------------------
3 phase power
-------------
Increasing cooling efficiency
=============================
CFM
---
Hot aisle / cold aisle design fundamentals
------------------------------------------
Hot aisle / cold aisle containment
----------------------------------
Chimneys
---------
Design Options
==============
Raised floor (18in & 36in)
--------------------------

33
deployment_101.rst Normal file

@@ -0,0 +1,33 @@
Software Deployment 101
***********************
Software deployment vs configuration management
===============================================
Definition of a host's role or state
Running services
================
inetd
-----
Shared containers vs self-contained binaries
--------------------------------------------
Package management
==================
Configuration files
-------------------
Source based / binary packages
------------------------------
Building packages
-----------------
Packaging new tools
-------------------

22
deployment_201.rst Normal file

@@ -0,0 +1,22 @@
Software Deployment 201
***********************
Running services
================
daemontools
-----------
god
---
angel
-----
runit
-----
monit
-----

27
dhcp.rst Normal file

@@ -0,0 +1,27 @@
DHCP
****
Tools: ISC dhcpd
================
Protocol
========
dhcp helper
===========
Defining classes, leases
========================
Options
=======
Default gateway
---------------
DNS server(s)
-------------
(Tie in previous chapters re: TFTP, PXE with related options?)

25
dns_101.rst Normal file

@@ -0,0 +1,25 @@
DNS 101
*******
What is DNS? Provide a brief description
* How DNS works
* The DNS tree
Forward vs reverse
==================
Resource types
==============
UDP vs TCP
==========
``/etc/hosts``, ``/etc/resolv.conf``, ``/etc/nsswitch.conf``
============================================================
Dynamic DNS
===========

23
dns_201.rst Normal file

@@ -0,0 +1,23 @@
DNS 201
*******
Architecture/design choices
===========================
Master/slave
------------
Shadow master/slave
-------------------
Shadow master/multi-slave
-------------------------
Split horizon
-------------
DNSSEC
======

91
hardware_101.rst Normal file

@@ -0,0 +1,91 @@
Hardware 101
************
Hardware Types
==============
Rackmount
---------
Blades
------
SANs
----
Basic server architecture
=========================
CPU
---
RAM
---
Motherboard
-----------
Bus
---
Disk controllers
----------------
Network devices
---------------
BIOS/UEFI
---------
PCI-E Cards
-----------
Disk management
===============
Arrays
------
RAID/ZFS
--------
Logical Volume Management
-------------------------
Performance/Redundancy
======================
RAID/ZFS
--------
Power
-----
Electrical overview
^^^^^^^^^^^^^^^^^^^
110v/220v, amperage, 3 phase power
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
UPS
^^^
common plug types
^^^^^^^^^^^^^^^^^
common circuit types
^^^^^^^^^^^^^^^^^^^^
Troubleshooting
===============
Remote management (IPMI, iLO, DRAC, etc)
----------------------------------------
Common MCE log errors
---------------------
System burn-in
--------------

20
http_101.rst Normal file

@@ -0,0 +1,20 @@
HTTP 101 (Core protocol)
************************
Tools: Speaking http with telnet/netcat/curl
============================================
Understanding how a browser/tool parses a url and server receives it for a
request
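To make the URL-to-request step concrete, here is a minimal sketch of how a client might split a URL and build the plain-text HTTP/1.1 request the server receives. The helper name ``build_request`` and the example URL are illustrative assumptions; real clients send more headers and handle many more cases.

```python
from urllib.parse import urlsplit


def build_request(url):
    """Return (host, port, raw_request) for a plain HTTP/1.1 GET."""
    parts = urlsplit(url)
    port = parts.port or (443 if parts.scheme == "https" else 80)

    # The server only sees the path and query string. The fragment
    # (after '#') never leaves the client, and the host name travels
    # in the Host header rather than in the request line.
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query

    host_header = parts.hostname
    if parts.port is not None:
        host_header += f":{parts.port}"

    request = (
        f"GET {target} HTTP/1.1\r\n"
        f"Host: {host_header}\r\n"
        "Connection: close\r\n"
        "\r\n"
    )
    return parts.hostname, port, request


host, port, request = build_request("http://example.com/search?q=ops#results")
print(host, port)
print(request)
```

The printed request is the same text one would type by hand into telnet or netcat after connecting to the host and port.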
Apache, nginx
=============
HTML
====
Virtual hosting
===============
CGI
===

41
http_201.rst Normal file

@@ -0,0 +1,41 @@
HTTP 201 (Application Servers & Frameworks)
*******************************************
Java
====
Tomcat
------
JBoss
-----
Jetty
-----
node.js / Server-side Javascript
================================
Perl
====
PHP
===
Ruby
====
Rails
-----
Sinatra
-------
Python
======
Django
------
Flask
-----

13
identity_management.rst Normal file

@@ -0,0 +1,13 @@
Identity Management 101
***********************
LDAP
====
Active Directory
----------------
OpenLDAP
--------


@@ -23,6 +23,34 @@ Contents:
troubleshooting_101
networking_101
networking_201
common_services
identity_management
remote_filesystems_101
remote_filesystems_201
programming_101
programming_201
hardware_101
datacenters_101
datacenters_201
datacenters_301
virtualization_101
virtualization_201
logs_101
logs_201
databases_101
databases_201
application_components_201
monitoring_101
monitoring_201
backups
configuration_management
capacity_planning
statistics
deployment_101
deployment_201
soft_skills
labs
seealso
TODO List

244
labs.rst Normal file

@@ -0,0 +1,244 @@
Lab exercises
**************
Bare-Metal Provisioning 101
===========================
Install CentOS by hand
----------------------
Install the same node using kickstart
-------------------------------------
Bare-Metal Provisioning 201
===========================
Set up a basic cobbler server
-----------------------------
Build a profile
---------------
Kickstart a node
----------------
Change the profile, re-kickstart the node
-----------------------------------------
Cloud Provisioning 101
======================
Set up a free-tier account with AWS
-----------------------------------
Spawn a Linux node from the AWS Console
---------------------------------------
Cloud Provisioning 201
======================
Spawn a Linux node using the AWS API
------------------------------------
Attach an Elastic IP and EBS to it
----------------------------------
Database 101
============
Install and start up MySQL
--------------------------
Create basic relational database / tables, using a variety of field types
-------------------------------------------------------------------------
Grant and revoke privileges
---------------------------
Install Riak
------------
Write (or provide, probably depends on where this fits in relation
to scripting tutorial?) basic tool to insert and retrieve some data.
Database 201
============
Spin up a second VM/MySQL install
---------------------------------
Set up Master->Slave replication
--------------------------------
Deliberately break and then fix replication
-------------------------------------------
Set up Master<->Master replication
----------------------------------
Introduction to percona toolkit
-------------------------------
Set up Riak Cluster, modify the tool from 101 to demonstrate replication.
Database 301
============
Galera cluster
--------------
* Introduction to variables and their meaning. Tuning MySQL configuration (use
mysqltuner.pl as a launch point?), pros and cons of various options.
* Introducing EXPLAIN and how to analyse and improve queries and schema.
* Backup options, mysqldump, LVM Snapshotting, Xtrabackup.
Automation 101
==============
Do you need it? How much and why?
---------------------------------
http://www.kitchensoap.com/2012/09/21/a-mature-role-for-automation-part-i/
* Discuss basic theory and approach in terms of idempotency and convergence.
* Write a bash script to install something as idempotently as possible.
* Discuss Chef and Puppet while reflecting on the bash script.
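The exercise itself calls for bash; the sketch below only illustrates the property being discussed, using Python. An operation is written so that running it once or many times leaves the system in the same state. The function name ``ensure_line`` and the config file and setting are hypothetical.

```python
import os
import tempfile


def ensure_line(path, line):
    """Make sure `line` is present in the file at `path`.

    Returns True if the file was changed and False if it was already
    in the desired state, so repeated runs converge on one result.
    """
    content = ""
    if os.path.exists(path):
        with open(path) as handle:
            content = handle.read()

    if line in content.splitlines():
        return False  # already converged, do nothing

    with open(path, "a") as handle:
        # Avoid gluing the new line onto an unterminated last line.
        if content and not content.endswith("\n"):
            handle.write("\n")
        handle.write(line + "\n")
    return True


# Hypothetical config file in a scratch directory.
workdir = tempfile.mkdtemp()
config = os.path.join(workdir, "example.conf")

first = ensure_line(config, "max_connections = 100")
second = ensure_line(config, "max_connections = 100")
print("changed on first run:", first)
print("changed on second run:", second)
```

Checking the current state before acting, rather than blindly appending, is the same pattern configuration management tools apply to packages, files and services.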
Automation - Chef 201
=====================
Set up an Opscode account
-------------------------
Set up your workstation as a client to your Opscode account
-----------------------------------------------------------
Download the build-essential cookbook, and apply it to your workstation
-----------------------------------------------------------------------
Automation - Chef 301
=====================
Set up a chef-repo
------------------
Write your own cookbook
-----------------------
Automation - Chef 302
=====================
Set up your own Chef Server
---------------------------
Write your own resources/providers
----------------------------------
Write sanity tests for your code
--------------------------------
Automation - Puppet 201
=======================
Install Puppet
--------------
Install a Forge module using the module tool
--------------------------------------------
Apply it to your local machine
------------------------------
Automation - Puppet 301
=======================
Install Puppet
--------------
Create your own module
----------------------
Apply it to your local machine
------------------------------
Package Management 101
======================
Set up a basic YUM or APT repo and put some packages in it
----------------------------------------------------------
Set up a local mirror of CentOS (or what have you)
--------------------------------------------------
Set up a client to install from it
----------------------------------
Package Management 201
======================
Build a simple RPM or deb
-------------------------
FPM
---
Build automation fleets
=======================
koji
----
D
-
Version Control with Git 101
============================
Open a github account
---------------------
Create a new repo called 'scripts'
----------------------------------
Place a useful shell script in it
---------------------------------
Commit and push
---------------
Make a change, commit and push
------------------------------
Create a branch, make a change, commit, and push
------------------------------------------------
Create a pull request and merge the branch into the master branch
-----------------------------------------------------------------
Read Chapters 1-3 of the Pro Git online book - http://git-scm.com/book
DNS 101
=======
Install Bind
------------
Configure one zone
------------------
Show DNS resolution for an A and a CNAME record in the configured zone
----------------------------------------------------------------------
HTTP 101
========
Install Apache
--------------
Configure a virtual host
------------------------
Display a simple web page
--------------------------

75
loadbalancing_101.rst Normal file

@@ -0,0 +1,75 @@
Load Balancing
**************
Why do we use load balancers?
=============================
* A fast, HTTP/TCP based load balancer can distribute between slower application
servers, and can keep each application server running optimally
* They keep application servers busy by buffering responses and serving them to
slow clients (or keepalives). We want app servers to do real work, not waste
time waiting on the network.
* Can be tuned by itself for max conns?
* HTTP path based routing, 1 domain to multiple heterogeneous server pools.
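The last point, one domain routed by path to several server pools, can be sketched as follows. This is a toy model rather than how any particular load balancer is implemented; the class name ``PathRouter`` and the backend addresses are made up, and round-robin is just one of several possible balancing algorithms.

```python
from itertools import cycle


class PathRouter:
    """Route by URL path prefix, then round-robin within the pool."""

    def __init__(self, pools):
        # pools maps a path prefix to a list of backends.
        self._pools = {
            prefix: cycle(backends) for prefix, backends in pools.items()
        }
        # Check longer prefixes first so the most specific pool wins.
        self._prefixes = sorted(pools, key=len, reverse=True)

    def pick(self, path):
        for prefix in self._prefixes:
            if path.startswith(prefix):
                return next(self._pools[prefix])
        raise LookupError(f"no pool for {path}")


router = PathRouter({
    "/": ["app1:8000", "app2:8000"],   # general application servers
    "/images/": ["img1:9000"],         # dedicated image pool
})

for path in ["/", "/images/logo.png", "/login", "/cart"]:
    print(path, "->", router.pick(path))
```

A real load balancer would also skip backends that fail their health checks before picking one.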
Software
========
Apache
------
Nginx
-----
HAProxy
-------
Hardware
========
BIG-IP
------
Netscaler
---------
Multi-dc
========
Anycast
-------
DNS GSLB
--------
CDNs
-----
(cparedes: I'd argue that it's valid in some contexts, depending on what
you're load balancing)
Application implications
========================
Balancing algorithms
--------------------
Session Affinity, and why it is evil.
-------------------------------------
Local & ISP caching
-------------------
SSL Termination
---------------
Balancing vs. Failover (Active / Passive)
-----------------------------------------
Health Checks.
---------------
Non-HTTP use cases
==================
Native SMTP, or general TCP for things like RabbitMQ, MongoDB, etc

14
logs_101.rst Normal file

@@ -0,0 +1,14 @@
Logs 101
********
Common system logs & formats
============================
Syslog basics
=============
Log rotation, append, truncate
==============================
Retention and archival
======================

19
logs_201.rst Normal file

@@ -0,0 +1,19 @@
Logs 201
********
Centralized logging
===================
Syslogd, Rsyslog
----------------
Benefits for developers
-----------------------
Log parsing
===========
Search & Correlation
====================
(Splunk, Logstash, Loggly, etc)

43
monitoring_101.rst Normal file

@@ -0,0 +1,43 @@
Monitoring, Notifications, and Metrics 101
******************************************
History: How we used to monitor, and how we got better (monitors as tests)
==========================================================================
Perspective (end-to-end) vs Introspective monitoring
====================================================
Metrics: what to collect, what to do with them
==============================================
Common tools
============
Syslog (basics!)
----------------
Syslog-ng
---------
Nagios
------
Graphite
--------
Ganglia
-------
Munin
-----
RRDTool / cacti
---------------
Icinga
------
SNMP
----

28
monitoring_201.rst Normal file

@@ -0,0 +1,28 @@
Monitoring, Notifications, and Metrics 201
******************************************
Dataviz & Graphing
==================
Graphite, StatsD
================
Dashboard: Info for ops and info for the business
=================================================
Third-party tools
=================
Datadog
-------
Boundary
--------
NewRelic
--------
Circonus
--------

64
programming_101.rst Normal file

@@ -0,0 +1,64 @@
Programming 101
***************
Shell scripting basics
======================
“#!/usr/bin/env bash” vs. “#!/bin/bash” vs. “#!/bin/sh” (portability
considerations)
Variables
---------
user-defined
built-in
Control Statements
------------------
tests / conditionals
loops
functions
---------
arrays
------
style
-----
Redirection
-----------
I/O
---
Pipes
-----
stderr vs. stdout
------------------
/dev/null and /dev/zero
-----------------------
Regular Expressions
===================
Sed & awk
=========
GIGO
====
Validating input
----------------
Validating output
-----------------
Trapping & handling exceptions with grace
-----------------------------------------

121
programming_201.rst Normal file

@@ -0,0 +1,121 @@
Programming 201
***************
Common elements in scripting, and what they do
==============================================
Syntax
------
Variables
---------
Common data structures
----------------------
Rosetta stone? One man's hash is another's associative array is another man's
dict(ionary)?
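As a small illustration of the "Rosetta stone" idea, the sketch below shows the basic operations on Python's version of this structure. The comments note the names other languages use for the same concept; the port table is only sample data.

```python
# Python calls it a dict; Ruby and Perl call it a hash; bash and awk
# call it an associative array. The core operations are the same idea.
ports = {"ssh": 22, "smtp": 25, "http": 80}   # literal

ports["https"] = 443          # insert or overwrite by key
del ports["smtp"]             # remove a key
http_port = ports["http"]     # lookup; raises KeyError if missing
ftp_port = ports.get("ftp")   # lookup that returns None if missing

for name, port in sorted(ports.items()):
    print(name, port)
```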
Functions
---------
Objects
-------
C (A very basic overview)
=========================
The main loop
-------------
Libraries & Headers
-------------------
#include
--------
The Compiler
------------
The Linker
----------
Make
----
Lab: Open a file, write to it, close it, stat it & print the file info, unlink
it. Handle errors.
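The lab is meant to be done in C. As a reference for the sequence of calls only, here is a sketch in Python, whose ``os`` module functions are thin wrappers around the system calls of the same names. The function name ``file_lifecycle`` and the scratch file are assumptions made for the example.

```python
import os
import tempfile


def file_lifecycle(path):
    """Create, write, close, stat and unlink `path`; return its size."""
    try:
        # O_EXCL makes the open fail if the file already exists.
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    except OSError as err:
        print(f"open failed: {err}")
        raise

    try:
        os.write(fd, b"hello, ops\n")
    finally:
        os.close(fd)  # release the descriptor even if the write fails

    info = os.stat(path)
    print(f"size={info.st_size} mode={oct(info.st_mode)} mtime={info.st_mtime}")

    os.unlink(path)
    return info.st_size


scratch = os.path.join(tempfile.mkdtemp(), "lab.txt")
size = file_lifecycle(scratch)
```

In C the same steps map onto ``open``, ``write``, ``close``, ``stat`` and ``unlink``, with each return value checked against ``-1`` and ``errno`` inspected on failure.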
Ruby
====
Syntax
------
Variables
---------
Common data structures
----------------------
Functions
---------
Objects
-------
Rubygems
--------
Databases
---------
Python
======
Syntax
------
Variables
---------
Common data structures
----------------------
Functions
---------
Objects
-------
Version Control
===============
Git
---
SVN
---
CVS
---
API design fundamentals
=======================
RESTful APIs
------------
JSON / XML and other data serialization
---------------------------------------
Authentication / Authorization / Encryption and other security afterthoughts.
------------------------------------------------------------------------------
:)
https://github.com/ziliko/code-guidelines/blob/master/Design%20an%20hypermedia(REST)%20api.md
Continuous Integration
======================


@@ -0,0 +1,11 @@
Remote Filesystems 101
**********************
NFSv3
=====
iSCSI
=====
SAMBA/CIFS
==========


@@ -0,0 +1,14 @@
Remote Filesystems 201
**********************
GlusterFS
=========
NFSv4
=====
Netatalk / AFP
==============
S3
==

21
seealso.rst Normal file

@@ -0,0 +1,21 @@
See also
********
A list of System Administration or Ops related classes or degree granting
programs. It would be well worth our time to compare their syllabi, course
outcomes, exercises etc.
http://www.hioa.no/Studier/TKD/Master/Network-and-System-Administration/
http://www.hioa.no/Programmes/Summer/Network-and-systems-administration
http://www.cs.stevens.edu/~jschauma/615/
http://goo.gl/4ygBn
http://www.cs.fsu.edu/current/grad/cnsa_ms.php
http://www.rit.edu/programs/applied-networking-and-system-administration-bs
http://www.mtu.edu/technology/undergraduate/cnsa/
http://www.wit.edu/computer-science/programs/BSCN.html
In addition, Usenix SAGE (now LISA) used to have a sage-edu@usenix.org mailing
list, but I don't know if that is still active. LOPSA has
https://lists.lopsa.org/cgi-bin/mailman/listinfo/educators
http://www.verticalsysadmin.com/Report_on_programs_in_System_Administration__25-June-2012.pdf

37
smtp_101.rst Normal file

@@ -0,0 +1,37 @@
SMTP 101
********
Tools: Speaking smtp with telnet/netcat
=======================================
Risks / Dangers / Open relays
=============================
Tools to test/confirm operation as an open relay
Protocol
========
MX routing
----------
Envelope
--------
Message Headers
---------------
Message Body
------------
MTAs (forwarding, smart hosts, etc.)
====================================
Postfix
-------
Sendmail
--------

43
smtp_201.rst Normal file

@@ -0,0 +1,43 @@
SMTP 201
********
Anti-Spam
=========
Blacklisting
------------
Whitelisting
------------
Greylisting
-----------
SpamAssassin?
-------------
Reliable Mail Delivery
======================
Basic Etiquette
===============
Authentication (SPF, DKIM)
==========================
Bounce Management
=================
Feedback Loops
==============
Reputation Management
=====================
Advanced routing
================
Hosted services - why they are good/bad/ugly
============================================

115
soft_skills.rst Normal file

@@ -0,0 +1,115 @@
Soft Skills
***********
What is the role of ops?
========================
Maintain and grow infrastructure as needed.
-------------------------------------------
Protect data from internal and external threats, hardware failure.
------------------------------------------------------------------
Be enablers and problem solvers, not a hindrance. The antithesis of the BOFH.
------------------------------------------------------------------------------
Business Savvy
==============
Why we do what we do. Supporting business needs.
------------------------------------------------
Selling system changes and new proposals
----------------------------------------
Negotiating budgetary constraints vs. need/want requirements
------------------------------------------------------------
Communicating with internal and external customers
--------------------------------------------------
* Managing maintenance windows
Evaluating a product offering
-----------------------------
The importance of Documentation
===============================
What to document
----------------
* Runbooks? SOP? (cparedes: might be worthwhile even though we want to automate
SOPs away as much as possible - what should we check at 2 AM? What do folks
typically do in this situation if automation fails?)
* Architecture and design (cparedes: also maybe talk about *why* we choose that
design - what problems did we try to solve? Why is this a good solution?) How
to manage documentation
Documentation through Diagrams
------------------------------
Project Management (Does time management go here too?)
======================================================
Working with other teams
========================
Where do you go from here?
==========================
How to get help, keep sharp, learn new skills, and network within the systems
community.
Mailing lists
-------------
Local user groups
-----------------
LOPSA
-----
Twitter
-------
ServerFault
-----------
Sign up and participate. Ask your own questions, but also answer questions that
look interesting to you. This will not only help the community, but can keep you
sharp, even on technologies you don't work with on a daily basis.
Books (and concepts worth “Googling”)
-------------------------------------
* Web Operations, John Allspaw and Jesse Robbins
* The Art of Capacity Planning, John Allspaw
* Blueprints for High Availability, Evan Marcus and Hal Stern
* Resilience Engineering, Erik Hollnagel
* Human Error, James Reason
* To Engineer is Human, Henry Petroski
* To Forgive Design, Henry Petroski
Agile
=====
Kanban
------
Scrum
-----
The Tao of DevOps
=================
What is DevOps
--------------
What isn't DevOps
-----------------
Why DevOps is important
-----------------------

10
statistics.rst Normal file

@@ -0,0 +1,10 @@
Statistics For Engineers
************************
Normal distributions
====================
Percentiles, histograms, averages, mean, medians
================================================

25
system_daemons_101.rst Normal file

@@ -0,0 +1,25 @@
System daemons 101
******************
NTP
===
Kernel daemons
==============
``klogd``, etc
Filesystem daemons
==================
``xfslogd``, etc
udevd
=====
crond
=====
atd
===
sshd
====

26
virtualization_101.rst Normal file

@@ -0,0 +1,26 @@
Virtualization 101
******************
Intro to virtualization technologies
====================================
What problems is virtualization good at solving, what isn't it good at?
------------------------------------------------------------------------
Hardware assisted vs Paravirtualization
---------------------------------------
Hypervisors - overview, some comparison
---------------------------------------
Container-based virtualization (Jails, Containers/Zones, OpenVZ)
----------------------------------------------------------------
The Cloud
=========
Cloud providers comparison. Pros & Cons
---------------------------------------
Cloud management tools and APIs
-------------------------------

17
virtualization_201.rst Normal file

@@ -0,0 +1,17 @@
Virtualization 201
******************
Managing virtualized infrastructures (Private clouds)
=====================================================
Balancing CPU, Memory, and IO for guests
----------------------------------------
Leveraging virtualization for development
=========================================
Leveraging virtualization for production
========================================
Security implications of virtualization
=======================================