11. System Resources and Performance Tuning

ref: Essential System Administration, Second edition, Chapter 7

Reminders:
	a) Always read the appropriate chapter before class.
	b) Always work through the script(s) you write in class
		after class to make sure that you can get them to work.
Note:
	Programming Perl, 2nd ed - 23 copies came in and sold out. The
		next shipment of ~25 is expected mid-Feb. Make sure that
		you reserve a copy at the bookstore.

Topics:

What Resources are we trying to manage?

	- main memory
		- virtual memory includes main memory + swap space
	- cpu utilization
	- disk space 
	- i/o throughput
		- network
		- disk access

What are the goals for System Performance?

	1. system should work reliably.
	2. sufficient resources should exist to allow all processes
		to run at reasonable performance level.
	3. there should not be annoying slowdowns, or erratic behaviour
		of the system.
	4. performance should be obtained economically.

   notes: 
     a) there will always be some bottleneck in the system - and
	the bottleneck may change as the system load and mix varies.
     b) there are tradeoffs in scheduling which mean that one category
	of users may have good performance at the expense of other types
	of users: real-time, versus interactive, versus batch users.

Identifying problems:
	1. the system is slow.
		- in invoking new tasks
		- in echoing characters
		- in producing output
	2. the system crashed - again.


Resource Control Mechanisms

	CPU		- speed of processor [SPECint92/95, SPECrate_int92, etc]
			- number of processors
			- process priorities 
				- default 
				- system adjustment
				- user adjustment [nice, renice]
			- scheduling and other O/S parameters 
				- time quantum
				- working set size [pages in memory]
			- monitor & control with:
				uptime, monitor, ps, top, nice, renice, pstree,
					w, kill, killall
				- timeshift with: batch, at, cron
				- restrict by user in scripts, e.g.
					chomp($me = `whoami`); if ($me eq "root") ...

	memory		- size of physical memory 1GB = $6,000 to $100,000+
			- size and speed of swap devices and/or files
			- process resource limits [ulimit] for memory
			- memory management tuning parameters
			- monitor & control with:
				vmstat, swapon [various], ulimit, monitor

	Disk I/O	- disk drive performance
				- rotation speed [3600 -  7200 - 10,000 rpm]
				- average seek time [8 msec - 14 msec]
				- cache size, cache method, controller
				- media transfer speed
				- interface type Wide Ultra SCSI, etc
			- disk controller type
			- file system organization [what is where]
			- file placement [striping, mirroring]
			- file system type ufs, ffs, advfs, etc.
			- monitor & control with:
				iostat, monitor, perfmeter, rm, [touch]
				- can be difficult due to naming of drives
				- usually need to know functions of drives
				- df, du, quota, edquota

	Network I/O	- network interface type: FDDI, Fast Ethernet, ATM, ...
			- overhead of network interface [smart interface]
			- network loading
			- routing load, and speed mismatch buffering
			- network protocols
			- monitor & control with:
				- netstat, ifconfig, route, monitor, nfsstat

	Often you can buy your way out of performance issues. But you
	remain susceptible to user & system errors, and to the increasing
	demands of users.
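
The disk space tools listed under Disk I/O above (df, du, quota) lend themselves to scripted checks. The following is a minimal sketch, assuming the Solaris "df -k" column layout (device, kbytes, used, avail, capacity, mount point). The function name full_filesystems, the 90% limit, and the sample lines are made up for illustration and are not from a real system.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return "mountpoint (NN%)" for every filesystem whose capacity column
# is at or above $limit percent, given lines of df -k style output.
sub full_filesystems {
    my ($limit, @lines) = @_;
    my @full;
    foreach my $line (@lines) {
        # Expected layout: device kbytes used avail NN% mountpoint
        # The header line does not match and is skipped.
        next unless $line =~ /^\S+\s+\d+\s+\d+\s+-?\d+\s+(\d+)%\s+(\S+)/;
        my ($pct, $mount) = ($1, $2);
        push @full, "$mount ($pct%)" if $pct >= $limit;
    }
    return @full;
}

# Made-up sample in the assumed layout, for illustration only.
# On a live system one would use: my @df = `df -k`;
my @df = (
    "Filesystem            kbytes    used   avail capacity  Mounted on\n",
    "/dev/dsk/c0t3d0s0      96455   41210   45605    48%    /\n",
    "/dev/dsk/c0t3d0s6     480919  441002   39917    92%    /usr\n",
    "/dev/dsk/c0t1d0s7    1952573 1893996   58577    97%    /home\n",
);

print "nearly full: $_\n" foreach full_filesystems(90, @df);
```

With the sample above, /usr and /home are reported. Note that df may wrap an entry onto two lines when the device name is long; this sketch simply skips such entries.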


Solving Performance Problems:

0. Monitor the system to see how it is performing, and ask opinions of
	users.
1. Define the problem in detail [may need exploring to determine].
2. Determine the likely cause[s] of the problem. Be aware of the
	performance tuning guide for the specific OS implementation,
	with stated weaknesses. Know what to expect from different
	hardware systems.
3. Write down explicit performance goals [such as 30 day uptimes].
4. Make modifications to achieve the goals.
5. Monitor the system to determine results.
6. Iterate the process.
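
Steps 3 and 5 become easier to repeat when the goal is a number that a script can check. The sketch below assumes the goal is stated in terms of the 5-minute load average reported by uptime. The function name parse_loadavg, the goal value of 4, and the sample line are hypothetical, and uptime output differs slightly between systems.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Extract the 1-, 5- and 15-minute load averages from one line of
# uptime output. Returns an empty list if the line does not match.
sub parse_loadavg {
    my ($line) = @_;
    # Accept "load average: 0.10, 0.20, 0.30" as well as the
    # "load averages: 0.10 0.20 0.30" form used on some systems.
    if ($line =~ /load averages?:\s*([\d.]+),?\s+([\d.]+),?\s+([\d.]+)/) {
        return ($1, $2, $3);
    }
    return ();
}

# Hypothetical written goal: 5-minute load average stays below 4.
my $goal = 4;

# Made-up sample line; on a live system one would use: my $line = `uptime`;
my $line = "  6:15pm  up 14 day(s),  2:31,  12 users,  load average: 2.35, 1.10, 0.75";

my ($one, $five, $fifteen) = parse_loadavg($line);
if (!defined $five) {
    print "could not parse uptime output\n";
} elsif ($five >= $goal) {
    print "goal missed: 5-minute load $five >= $goal\n";
} else {
    print "goal met: 5-minute load $five < $goal\n";
}
```

Run from cron and appended to a log, a check like this gives a record to compare before and after each modification.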


General Monitoring:
1. monitor [on Digital Unix - restricted use] is excellent for seeing what 
	is happening with cpu and i/o instantaneously, but is weak for 
	processes.
2. top is excellent for watching processes instantaneously, but not as 
	good with cpu & i/o.
3. perfmeter [an expanded xload on sun systems] gives a short historical 
	record of performance.
4. iostat can help identify what each disk drive is doing.
5. netstat and nfsstat show what is happening on the network.
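
As an illustration of item 5, the per-interface counters from "netstat -i" can be turned into error and collision percentages. This is a sketch only: it assumes a header line containing the columns Name, Ipkts, Ierrs, Opkts, Oerrs and Collis (as on Solaris), the function name interface_rates is hypothetical, and the sample lines are made up. No threshold is implied; what counts as too high depends on the network.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Parse netstat -i style output, using the header line to locate the
# columns. Returns a hash: interface name => { ierr_pct, oerr_pct, coll_pct }.
sub interface_rates {
    my @lines = @_;
    return () unless @lines;
    my @header = split ' ', shift @lines;
    my %rates;
    foreach my $line (@lines) {
        my @f = split ' ', $line;
        next unless @f == @header;          # skip short or odd lines
        my %col;
        @col{@header} = @f;                 # column name => value
        my ($in, $out) = ($col{Ipkts}, $col{Opkts});
        $rates{ $col{Name} } = {
            ierr_pct => $in  ? 100 * $col{Ierrs}  / $in  : 0,
            oerr_pct => $out ? 100 * $col{Oerrs}  / $out : 0,
            coll_pct => $out ? 100 * $col{Collis} / $out : 0,
        };
    }
    return %rates;
}

# Made-up sample for illustration; on a live system: my @ns = `netstat -i`;
my @ns = (
    "Name  Mtu  Net/Dest  Address   Ipkts   Ierrs Opkts  Oerrs Collis Queue\n",
    "lo0   8232 loopback  localhost 50000   0     50000  0     0      0\n",
    "le0   1500 cs-net    myhost    1200000 12    900000 3     45000  0\n",
);

my %r = interface_rates(@ns);
foreach my $name (sort keys %r) {
    printf "%-5s in-err %.3f%%  out-err %.3f%%  collisions %.2f%%\n",
        $name, @{ $r{$name} }{qw(ierr_pct oerr_pct coll_pct)};
}
```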


Process Priorities
	- execution is round-robin within priority levels
	- highest priority queue runs as long as there are elements in queue
	- different systems use different numbers for priorities
	- nice lowers the priority of a process [or raises it; raising
		normally requires root]
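
The first two points can be seen in a toy simulation. This is a simplified sketch, not how any particular kernel is implemented: it assumes that a lower number means a higher priority, that every job runs for exactly one quantum at a time, and that priorities never change. The function name run_schedule and the job names are made up.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Simulate strict-priority scheduling with round-robin inside each level.
# $queues maps a priority number to a list of [name, quanta_needed] jobs.
# Assumption for this sketch: a LOWER number means a HIGHER priority.
# Returns the order in which jobs got the CPU, one entry per quantum.
sub run_schedule {
    my ($queues) = @_;
    my @trace;
    while (1) {
        # Find the highest-priority level that still has runnable jobs.
        my ($level) = sort { $a <=> $b }
                      grep { @{ $queues->{$_} } > 0 } keys %$queues;
        last unless defined $level;
        my $job = shift @{ $queues->{$level} };    # head of that queue
        push @trace, $job->[0];                    # runs for one quantum
        $job->[1]--;
        # Unfinished jobs go to the back of their own queue (round-robin).
        push @{ $queues->{$level} }, $job if $job->[1] > 0;
    }
    return @trace;
}

my @order = run_schedule({
    0 => [ ["editor", 2], ["shell", 1] ],
    4 => [ ["simulation", 2] ],
});
print join(" ", @order), "\n";
# prints: editor shell editor simulation simulation
```

The low-priority simulation job gets no CPU at all until the higher-priority queue is empty, which is why a niced process can appear to stall on a busy system.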

Perl Script for the Day

Write a perl script called killgreedy which will warn users with
processes that use too much cpu time, renice their processes, and kill
the process if not fixed up.

Sample Answer

note: this isn't tested, other than to run [I didn't see any hefty
processes, and didn't want to send warnings out!]

#! /usr/local/bin/perl5
#
# killgreedy - warn users, renice and kill their greedy processes
##########################################################
# Logic:
#   if any process uses >75% of a cpu, and
#      has been running for > 2 minutes
#      => send a warning to the user (please don't warn root!)
#
#   if any process uses >50% of a cpu, and
#      has been running for >10 minutes, and
#      has been sent a warning
#      => renice the process
#
#   if any process uses >20% of a cpu, and
#      has been running for >15 minutes, and
#      has already been reniced
#      => kill the process, and inform the user
###########################################################
#1234567890123456789012345678901234567890123456789012345678901234567890
#         1         2         3         4         5         6         7
# sample output from Solaris 2.5.1 /usr/ucb/ps aux:
#USER       PID %CPU %MEM   SZ  RSS TT       S    START  TIME COMMAND
#root       151  3.7  0.6 2152 1384 ?        S   Jan 22 755:21 /usr/lib/netsvc/y
#siewtk    4852  0.8  1.3 3480 3280 pts/6    S 18:07:15  0:06 rtin
#root      5374  0.0  0.5 1488 1256 console  S 18:12:25  0:00 /usr/lib/saf/ttymo
#wongwf   26919  0.0  0.5 1216 1144 pts/25   S 14:22:34  0:00 bash
#sungkk   28772  0.0  0.3  752  688 ?        S   Jan 23  0:00 ./chplan 8
#sanjay   29999  0.0  0.3  904  704 ?        S   Feb 04  0:00 /bin/sh /usr/X11/l
############################################################
#
chomp($date = `date`);
print "Starting killgreedy at $date\n";

chomp($os = `uname`);
if ($os ne "SunOS") {                   # string compare needs ne, not !=
    die "This program only works on SunOS due to ps command format\n";
}

while (1) {                             # loop forever - until killed
    @ps = `/usr/ucb/ps aux`;            # pick up all processes
    shift(@ps);                         # throw away title line
    chomp($date = `date`);
    $iteration++;
    print "killgreedy running, iteration $iteration\n";

    foreach $process (@ps) {
        ($user,$pid,$cpu,$mem,$vmsize,$memsize,$tty,$state,$start,
            $time,$command) = split(" ", $process);
        $cpu = int($cpu);
        if ( $start =~ /^[A-Za-z]/ ) {  # fix start gap: START is "Jan 22",
            $time = $command;           # so TIME landed in $command
        }
        $time = int($time);             # "755:21" -> 755 minutes of cpu time
        #print "user=$user, pid=$pid, cpu=$cpu, time=$time.\n"; # debug print

        # logic sections, most severe first, so that a process moves
        # at most one step (warn -> renice -> kill) per pass
        if (($cpu>20)&&($time>15)&&($warn2{$pid})) {
            `echo "Your process $pid is being killed" | write $user`;
            `kill -9 $pid`;
            delete $warn{$pid};         # the pid may be reused later
            delete $warn2{$pid};
            print "$user pid $pid killed at $date\n";
        }
        elsif (($cpu>50)&&($time>10)&&($warn{$pid})&&(!$warn2{$pid})) {
            `echo "Your process $pid is being lowered in priority" | write $user`;
            `renice 5 $pid`;
            $warn2{$pid} = "$user 2nd warning at $date for $pid";
            print "$warn2{$pid}\n";
        }
        elsif (($cpu>75)&&($time>2)&&($user ne "root")&&(!$warn{$pid})) {
            `echo "Your process $pid is taking too many resources" | write $user`;
            $warn{$pid} = "$user warned at $date for $pid";
            print "$warn{$pid}\n";
        }
    }
    sleep 60;
}
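
The fragile part of the script above is the START column, which is a clock time for recent processes but a two-word date ("Jan 22") for older ones, so a plain split shifts the later fields. One alternative, shown here as a sketch, is to split off the first eight fields and match the rest with a pattern that knows both forms. The function name parse_ps_line and the returned field names are made up; the test lines are taken from the sample ps output in the script's comments.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Split one line of "/usr/ucb/ps aux" output into named fields.
# START is either a clock time ("18:07:15") or a date ("Jan 22");
# the date form contains a space, so it is matched explicitly.
# Returns an empty list if the line does not look like ps output.
sub parse_ps_line {
    my ($line) = @_;
    my ($user, $pid, $cpu, $mem, $sz, $rss, $tty, $state, $rest) =
        split ' ', $line, 9;
    return unless defined $rest;
    my ($start, $time, $command) =
        $rest =~ /^([A-Z][a-z]{2}\s+\d+|\S+)\s+(\d+:\d+)\s+(.*)$/
        or return;
    my ($minutes) = $time =~ /^(\d+):/;     # TIME is minutes:seconds of cpu
    return (user => $user, pid => $pid, cpu => $cpu, start => $start,
            cpu_minutes => $minutes, command => $command);
}

foreach my $line (
    "root       151  3.7  0.6 2152 1384 ?        S   Jan 22 755:21 /usr/lib/netsvc/y\n",
    "siewtk    4852  0.8  1.3 3480 3280 pts/6    S 18:07:15  0:06 rtin\n",
) {
    my %p = parse_ps_line($line);
    print "$p{user} pid=$p{pid} start='$p{start}' ",
          "cpu_minutes=$p{cpu_minutes} cmd=$p{command}\n";
}
```

This still assumes that the numeric columns never run together, which can happen in ps output when a value is wider than its column.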