FCP_ARRAY_ERR8, FCP_ARRAY_ERR9, FSCSI_ERR6…

A bunch of errors popped up on a test box: FCP_ARRAY_ERR8, FCP_ARRAY_ERR9, FSCSI_ERR6…. fget_config -Av hung and the file systems in the VG would not mount, all SAN-related symptoms. The fix in the end was to update the microcode on the Fibre Channel HBA.

Error –

LABEL:          FCP_ARRAY_ERR8
IDENTIFIER:     483C9D10

Date/Time:       Mon Apr 20 15:07:39 CDT 2009
Sequence Number: 26895
Class:           H
Type:            INFO
Resource Name:   dac0
Resource Class:  array
Resource Type:   ibm-dac-V4
Location:        U0.1-P1-I4/Q1-W200200A0B8122A16
VPD:
Manufacturer................IBM
Machine Type and Model......1742-900
Part Number.................348-0049782
ROS Level and ID............0914

Solution – update microcode on card –

Microcode downloads

http://www14.software.ibm.com/webapp/set2/firmware/gjsn

Store the rpm in /etc/microcode if you can't put it on your NIM server. Unpack the rpm and use diag to update the microcode: diag --> Task Selection (Diagnostics, Advanced Diagnostics, Service Aids, etc.) --> Microcode Tasks --> Download Microcode --> select fcsN --> follow the prompts and select /etc/microcode.
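To see which error labels dominate the log before and after the update, it can help to tally them. Here is a minimal sketch, assuming the detailed report from errpt -a has been saved to a file; the function name count_labels and the file path in the example are made up for illustration.

```shell
#!/bin/sh
# count_labels: tally the "LABEL:" lines of a saved `errpt -a` report.
# Reads the report on stdin and prints "<count> <label>", most frequent first.
count_labels() {
    awk '$1 == "LABEL:" { n[$2]++ }
         END { for (l in n) print n[l], l }' | sort -k1,1nr -k2,2
}

# Example (on the AIX box; /tmp/errpt.out is a hypothetical path):
#   errpt -a > /tmp/errpt.out
#   count_labels < /tmp/errpt.out
```

If the FCP_ARRAY and FSCSI counts stop growing after the microcode update, that is a reasonable sign the fix took.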

Quick Way to Change Oracle Env -standard .profile script

This .profile is executed when I sign on to the server, and it automatically creates an alias for each database instance, named after its Oracle SID. When I enter the Oracle SID name at the Unix prompt, my entire Unix environment is reset for that database. The following code is what I place in my .profile file:

 

   # Create one alias per SID listed in /etc/oratab (comments and "*" entries skipped)
   for DB in `cat /etc/oratab|grep -v \#|grep -v \*|cut -d":" -f1`
   do
      alias $DB='export ORAENV_ASK=NO; \
      export ORACLE_SID='$DB';\
      . $TEMPHOME/bin/oraenv; \
      export ORACLE_HOME;\
      export ORACLE_BASE=`echo $ORACLE_HOME | sed -e "s:/product/.*::g"`;\
      export DBA=$ORACLE_BASE/admin;\
      export SCRIPT_HOME=$DBA/scripts;\
      export PATH=$PATH:$SCRIPT_HOME;\
      export LIB_PATH=$ORACLE_HOME/lib64:$ORACLE_HOME/lib '
   done

 

Now, if I want to switch my environment to the PROD database, I simply type in PROD as a command at the Unix command prompt.
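To preview which aliases the loop would create without touching the environment, something like the sketch below can be run against the oratab file. The function name list_sids is made up for illustration, and the filter is slightly stricter than the grep pipeline above: it only skips lines that start with # (or are blank) rather than any line containing a #.

```shell
#!/bin/sh
# list_sids: print the SID field of every real entry in an oratab-style file.
# Skips blank lines, comment lines and the "*" placeholder entry.
list_sids() {
    awk -F: '/^[[:space:]]*(#|$)/ { next } $1 != "*" { print $1 }' "$1"
}

# Example: list_sids /etc/oratab
```

Each name printed is one alias you will be able to type at the prompt.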

How to install and run CDE on a non-graphical AIX system.

Recently came across a standalone, non-HMC system without a console and had to install the CDE/X11 filesets from the base OS media. X11.Dt was on volume 2 of the AIX base OS media; the install prompts for volume 3 if needed.

1. Install X11.Dt.* by running smitty install_all --> cd0 --> F4 --> /X11.Dt
2. Run /usr/dt/bin/dtconfig -e
3. Run the /etc/rc.dt script

IBM Network card configuration

If you change network card hardware on IBM systems, make sure “media_speed” is set to “100_Full_Duplex”, or backups will suffer severely. By default, cards are set to “Auto_Negotiation”.

Current settings:

lsattr -El ent0

busmem          0xe0080000        Bus memory address                              False
rom_mem         0xe0040000        ROM memory address                              False
busintr         179               Bus interrupt level                             False
intr_priority   3                 Interrupt priority                              False
txdesc_que_sz   512               TX descriptor queue size                        True
rxdesc_que_sz   1024              RX descriptor queue size                        True
tx_que_sz       8192              Software transmit queue size                    True
media_speed     Auto_Negotiation  Media speed                                     True
copy_bytes      2048              Copy packet if this many or less bytes          True
use_alt_addr    no                Enable alternate ethernet address               True
alt_addr        0x000000000000    Alternate ethernet address                      True
slih_hog        10                Interrupt events processed per interrupt        True
rx_hog          1000              RX buffers processed per RX interrupt           True
intr_rate       10000             Interrupt events processed per interrupt        True
compat_mode     no                Gigabit Backward compatability                  True
flow_ctrl       yes               Enable Transmit and Receive Flow Control        True
jumbo_frames    no                Transmit jumbo frames                           True
chksum_offload  yes               Enable hardware transmit and receive checksum   True
large_send      yes               Enable hardware TX TCP resegmentation           True
rxbuf_pool_sz   1024              RX descriptor queue size                        True

Checking available settings for “media_speed”

lsattr -Rl ent0 -a media_speed

10_Half_Duplex
10_Full_Duplex
100_Half_Duplex
100_Full_Duplex
Auto_Negotiation

Using WebSM, open a console to the system you want to change (the network connection drops once en0 is removed).

To correct the “media_speed” setting, do the following:

rmdev -l en0

rmdev -l ent0

chdev -l ent0 -a media_speed=100_Full_Duplex

cfgmgr

Check results:

lsattr -El ent0

media_speed 100_Full_Duplex Media speed True

Reboot the server to make sure your settings take effect.
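If you have more than a couple of boxes to check, the lsattr output is easy to test in a script. This is a minimal sketch; the function names media_speed_of and needs_fix are made up for illustration, and it only reads lsattr output, it changes nothing.

```shell
#!/bin/sh
# media_speed_of: read `lsattr -El entN` output on stdin and print the
# current media_speed value (second column of the media_speed row).
media_speed_of() {
    awk '$1 == "media_speed" { print $2 }'
}

# needs_fix: succeed (exit 0) when the speed read on stdin differs from
# the wanted value given as the first argument.
needs_fix() {
    wanted=$1
    current=$(media_speed_of)
    [ "$current" != "$wanted" ]
}

# Example (on the AIX box):
#   lsattr -El ent0 | needs_fix 100_Full_Duplex && echo "ent0 needs changing"
```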


Two excellent commands to compare two lists

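grep -v -f <list1> <list2>

comm -13 <list1> <list2>

Both print the lines of <list2> that do not appear in <list1>. grep treats each line of <list1> as a pattern (add -Fx for exact whole-line matches); comm needs both lists sorted.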
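Since comm only works on sorted input, a small wrapper saves some typing. A minimal sketch, assuming mktemp is available on the box; the function name only_in_second is made up for illustration.

```shell
#!/bin/sh
# only_in_second: print lines that appear in file $2 but not in file $1.
# comm needs sorted input, so both lists are sorted into temp files first.
only_in_second() {
    a=$(mktemp)
    b=$(mktemp)
    sort -u "$1" > "$a"
    sort -u "$2" > "$b"
    comm -13 "$a" "$b"      # -1 and -3 hide "only in first" and "in both"
    rm -f "$a" "$b"
}

# Example: only_in_second list1 list2
# Rough grep equivalent for exact whole-line matches:
#   grep -F -x -v -f list1 list2
```

The grep form keeps the original order of the second list, while the comm form returns its result sorted.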

Command line option to power off a VM on an ESX host

  1. Right-click on the virtual machine and choose Power off using the VMware Infrastructure Client.
  2. If this does not work, you must use the command line method.
  3. From the Service Console of the ESX host, run these commands:

    vmware-cmd <cfg> stop
    vmware-cmd <cfg> stop hard

    Where <cfg> is the complete path to the configuration file, which can be determined by running:

    vmware-cmd -l

  4. Run the following command to check the state of the virtual machine:

    vmware-cmd <cfg> getstate

  5. If none of the above suggestions for stopping the virtual machine work, get the virtual machine’s process ID using the following command:

    ps -auxwww | grep -i <vm name>

  6. Kill the process ID (PID) for the virtual machine (number in the second column of the previous step) using the following command:

    kill PID

  7. After issuing the kill command, wait 30 seconds and run the following command to check for the process:

    ps -auxwww | grep -i <vm name>

  8. If the process is still present, run the following command to stop the process:

    kill -9 PID

  9. Wait 30 seconds and check for the process again.

Alternate kill method

  1. Run one of the following commands to determine the VMID of the problem virtual machine:

    vm-support -x
    cat /proc/vmware/vm/*/names

  2. Run the following command to determine the master world ID for the virtual machine, using the VMID determined in the previous step for ####:

    less -S /proc/vmware/vm/####/cpu/status

  3. Find the group number: scroll over to the Group column and read the number shown there as vm.####.
  4. Run the following command to kill the virtual machine, using the group ID determined in the previous step:

    /usr/lib/vmware/bin/vmkload_app -k 9 ####
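The kill, wait, kill -9 sequence from the first list can be wrapped in one function. This is a minimal sketch for a generic Unix shell, not anything ESX-specific; the function name stop_pid and the default 30 second wait are just taken from the steps above.

```shell
#!/bin/sh
# stop_pid: send TERM to a PID, wait up to N seconds, then send KILL.
#   $1 = PID, $2 = seconds to wait (default 30)
# Returns 0 once the process is gone, 1 if it is still there after kill -9.
stop_pid() {
    pid=$1
    wait_secs=${2:-30}
    kill -0 "$pid" 2>/dev/null || return 0   # no such process we can signal
    kill "$pid" 2>/dev/null                  # polite TERM first
    i=0
    while [ "$i" -lt "$wait_secs" ]; do
        sleep 1
        kill -0 "$pid" 2>/dev/null || return 0
        i=$((i + 1))
    done
    kill -9 "$pid" 2>/dev/null               # still there: force it
    sleep 1
    ! kill -0 "$pid" 2>/dev/null
}

# Example: stop_pid 12345 30 || echo "PID 12345 survived kill -9"
```

If it returns 1, the process is stuck in the kernel and the alternate method above is the next thing to try.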

AIX – Performance Tuning Standards

  • AIO Servers

Default Values:

Minservers = 2
Maxservers = 10
Maxrequests = 4096

Rule of Thumb for Oracle Database System:

maxservers = 300
minservers = 100
maxrequests = 8192

Command:   chdev -l aio0 -P -a maxservers=$MAX -a minservers=$MIN -a maxreqs=8192

  • JFS Buffer-Cache

Default Values:

maxperm = 80%
minperm = 20%
strict_maxperm = 0

Rule of Thumb for Database System

DB is on FileSystem & Mounted as DIO

strict_maxperm = 1
maxperm = 20%
minperm  = 5%

Rule of Thumb for system with > 2 Gbyte of RAM

strict_maxperm = 1
maxperm = 20%
minperm  = 5%

Command: [AIX 5.2 and above] vmo -p -o maxclient%=20
Command: [AIX 5.2 and above] vmo -p -o strict_maxperm=1
Command: [AIX 5.2 and above] vmo -p -o maxperm%=20
Command: [AIX 5.2 and above] vmo -p -o minperm%=5
Command: [AIX 5.1 and below] vmtune -p $MINPERM  -P $MAXPERM

To view all currently set values:  [AIX 5.2 and above] vmo -a
To view an individual value:  [AIX 5.2 and above]  vmo -o maxclient%

  • Client File Pages [JFS2 Buffer Cache]

Default

maxclient = 80%
strict_maxclient = 1

Rule of Thumb for Database System

DB is on FileSystem & Mounted as DIO

maxclient = maxperm

Note: strict_maxclient by default is already turned on

Command: [AIX 5.1 and below] vmtune -t  $MAXCLIENT
Command: [AIX 5.2 and above] vmo -p -o maxclient%=20
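To check how far a box is from the database rules of thumb above, the vmo -a listing can be compared against the targets. A minimal sketch, assuming vmo -a prints one "name = value" pair per line; the function name vmo_drift is made up, and the targets are simply the values quoted above.

```shell
#!/bin/sh
# vmo_drift: read `vmo -a` output on stdin and print every tunable whose
# current value differs from the database rule-of-thumb target.
vmo_drift() {
    awk 'BEGIN { want["maxperm%"] = 20; want["minperm%"] = 5
                 want["maxclient%"] = 20; want["strict_maxperm"] = 1 }
         ($1 in want) && $2 == "=" && $3 != want[$1] {
             print $1 ": current " $3 ", target " want[$1]
         }'
}

# Example (AIX 5.2 and above): vmo -a | vmo_drift
```

No output means the four tunables already match.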

  • Maxfree/Minfree Memory [Page Stealing]

Default  Values:

minfree = 120
maxfree = 128

Rule of Thumb for System

minfree = 120 * Quantity of CPUs * Quantity of Memory Pools
maxfree = ( minfree + maxpgahead ) * Quantity of CPUs

Quantity of Memory Pools and maxpgahead values can be determined by executing: vmtune -a  and looking for the total memory pools value and maxpgahead.

Command: [AIX 5.1 and below] vmtune -f $MIN -F $MAX
Command: [AIX 5.2 and above] vmo -p -o minfree=$MIN -o maxfree=$MAX
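The arithmetic is easy to get wrong by hand, so here is a minimal sketch that applies the two formulas exactly as they are written in this note. The function name free_list_targets is made up for illustration.

```shell
#!/bin/sh
# free_list_targets: apply the minfree/maxfree rule of thumb from this note.
#   $1 = number of CPUs, $2 = number of memory pools, $3 = maxpgahead
# Prints "minfree=<n> maxfree=<n>".
free_list_targets() {
    cpus=$1
    pools=$2
    maxpgahead=$3
    minfree=$((120 * cpus * pools))
    maxfree=$(((minfree + maxpgahead) * cpus))
    echo "minfree=$minfree maxfree=$maxfree"
}

# Example: 4 CPUs, 2 memory pools, maxpgahead of 8
free_list_targets 4 2 8     # prints: minfree=960 maxfree=3872
```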

  • Fibre-Channel Device Settings (HBA)

Maximum I/O Transfer Size
Default Value
max_xfer_size = 0x100000  [ 1 MB ]

Maximum number of COMMANDS to queue to the adapter
Default Value
num_cmd_elems = 200

HBA Direct Memory Access transfer buffer
Default value
lg_term_dma = 0x200000 [ 2 MB ]

Rule of Thumb

max_xfer_size = 0x400000  [ 4 MB ]
num_cmd_elems = 512  ( 1024 if a FA is dedicated to that HBA )
lg_term_dma = 0x1000000 [  16 MB ]

Command: chdev -l fcs0 -P -a max_xfer_size=0x400000 -a num_cmd_elems=512 -a lg_term_dma=0x1000000

*Note: this changes the values for device fcs0 only; there may be multiple HBAs [i.e. fcs1, fcs2, etc.]
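With several HBAs it is safer to generate the commands first and eyeball them before running anything. A minimal sketch that only prints the chdev lines; the function name hba_chdev_cmds is made up, and the values are the rule-of-thumb numbers above.

```shell
#!/bin/sh
# hba_chdev_cmds: print (not run) one chdev command per adapter name given.
hba_chdev_cmds() {
    for hba in "$@"; do
        echo "chdev -l $hba -P -a max_xfer_size=0x400000 -a num_cmd_elems=512 -a lg_term_dma=0x1000000"
    done
}

# Example (on the AIX box), feeding in every fcs adapter lsdev reports:
#   hba_chdev_cmds $(lsdev -Cc adapter -F name | grep '^fcs')
```

Once the list looks right, pipe the output to sh to apply it.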

  • HDISK tuning – high i/o systems

On high I/O systems (like Data Warehouse), we set the following on each hdisk

Note:  hdisk can only be tuned while mount points are not mounted

Queue Depth
Default Value
queue_depth=8

Max transfer buffer
Default Value
max_transfer=

Rule of Thumb
queue_depth = [ 32 if disk is a 4 way meta ]  [ 64 if disk is a 8 way meta ]
max_transfer = 0x100000 [ 1 MB ]

The following commands should be run with hdisk? and hdiskpower? in the Defined state. The symptom of this problem is that, while attempting to add a disk to a volume group, you get a message like “extendvg: LTG must be less than or equal to max_transfer, blah, blah”

root # rmdev -l hdiskpower?
root # rmdev -l hdisk?

root # chdev -l hdiskpower? -P -a queue_depth=32 -a max_transfer=0x100000
root # cfgmgr

Note:  The -P flag on the chdev command records the change permanently in the Customized Devices object class without actually changing the device. This is useful for devices that cannot be made unavailable and cannot be changed while in the Available state.  In most cases, as in changing characteristics on a new disk, you would not use the -P flag.

Moving an LPAR to another frame

Someone asked me this and I couldn't explain it up front, so for whoever faces it next:

1. Have Storage zone the LPAR's disks to the new HBA(s).  Also have them add an additional 40GB drive for the new boot disk.  This gives us a back-out path to the old boot disk on the old frame.

2. Collect data from the current LPAR:

a. Network information: write down the IP and ipv4 alias(es) for each interface

b. Run “oslevel -r”; you will need this when setting up NIM for the mksysb recovery

c. Is the LPAR running AIO? If so, it will need to be configured after the mksysb recovery

d. Run “lspv” and save the output; it contains volume group and PVID information

e. Any other customizations you deem necessary

3. Create a mksysb backup of this LPAR

4. Reconfigure the NIM machine for this LPAR with the new Ethernet MAC address.  The foolproof method is to remove the machine and re-create it.

5. In NIM, configure the LPAR for a mksysb recovery.  Select the appropriate SPOT and LPP Source, based on the “oslevel -r” data collected in step 2.

6. Shut down the LPAR on the old frame (Halt the LPAR)

7. Move network cables, fibre cables, disk, and zoning, if needed, to the LPAR on the new frame

9. On the HMC, bring up the LPAR on the new frame in SMS mode and select a network boot.  Verify the SMS profile has only a single HBA (if Clariion attached, zoned to a single SP); otherwise the recovery will fail with a 554.

10. Follow the prompts for building a new OS.  Select the new 40GB drive as the boot disk (use the lspv info collected in Step 2 to identify the correct 40GB drive).  Leave the remaining questions (shrink file systems, recover devices, and import volume groups) at their default of NO.

11. After the LPAR has booted, from the console (the network interface may be down):

a. lspv  (note the hdisk# of the boot disk)

b. bootlist -m normal -o  (verify the boot list is set; if not, set it with the next command)

bootlist -m normal -o hdisk#

c. ifconfig en0 down  (if the interface got configured, down it)

d. ifconfig en0 detach  (and remove it)

e. lsdev -Cc adapter  (note the Ethernet adapters, e.g. ent0, ent1)

f. rmdev -dl <en#>  (remove all en devices)

g. rmdev -dl <ent#>  (remove all ent devices)

h. cfgmgr  (will rediscover the en/ent devices)

i. chdev -l <ent#> -a media_speed=100_Full_Duplex  (set on each interface; if running gigabit, leave the defaults)

j. Configure the network interfaces and aliases, using the info recorded in step 2:

mktcpip -h <hostname> -a <IP> -m <netmask> -i <en#> -g <gateway> -A no -t N/A -s

chdev -l en# -a alias4=<alias IP>,<netmask>

k. Verify that the network is working.

12. If LPAR was running AIO (data collected in Step 2), verify it is running (smitty aio)

13. Check for any other customizations which may have been made on this LPAR

14. Vary on the volume groups.  Use the “lspv” data collected in Step 2 to identify, by PVID, one hdisk in each volume group.  Run for each volume group:

a. importvg -y <vgname> hdisk#  (will vary on all hdisks in the volume group)

b. varyonvg <vgname>

c. mount all  (verify mounts are good)

15. Verify paging space is configured appropriately

a. lsps -a  (look for Active and Auto set to yes)

b. chps -ay pagingXX  (run for each paging space; sets Auto)

c. swapon /dev/pagingXX  (run for each paging space; sets Active)

16. Verify LPAR is running 64 bit

a. bootinfo -K  (if 64, you are good)

b. ln -sf /usr/lib/boot/unix_64 /unix  (if 32, change to run 64 bit)

c. ln -sf /usr/lib/boot/unix_64 /usr/lib/boot/unix

d. bosboot -ak /usr/lib/boot/unix_64

17. If LPAR has Power Path

a. Run “powermt config”  (creates the powerpath0 device)

b. Run “pprootdev on”  (sets Power Path control of the boot disk)

c. If Clariion, make configuration changes to enable SP failover:

chdev -l powerpath0 -Pa QueueDepthAdj=1

chdev -l fcsX -Pa num_cmd_elems=2048  (for each fiber adapter)

chdev -l fscsiX -Pa fc_err_recov=fast_fail  (for each fiber adapter)

d. Halt the LPAR

e. Activate the Normal profile  (if Sym/DMX, verify two HBAs in the profile)

f. If Clariion attached, have Storage add the zone to the 2nd SP

   i. Run cfgmgr  (configures the 2nd set of disks)

g. Run “pprootdev fix”  (puts the rootvg PVIDs back on the hdisks)

h. lspv | grep rootvg  (get the boot disk hdisk#s)

i. bootlist -m normal -o hdisk# hdisk#  (set the boot list with both hdisks)

20. From the HMC, remove the LPAR profile from the old frame

21. Pull cables from the old LPAR (Ethernet and fiber), deactivate patch panel ports

22. Update documentation, Server Master, AIX Hardware spreadsheet, Patch Panel spreadsheet

23. Return the old boot disk to storage.
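For step 14, matching PVIDs by eye gets tedious when there are many disks, because the hdisk numbers on the new frame rarely line up with the old ones. A minimal sketch that joins the lspv output saved in step 2 with a fresh lspv from the new LPAR and prints one importvg command per volume group; the function name importvg_cmds and the file names are made up for illustration, and nothing is executed.

```shell
#!/bin/sh
# importvg_cmds: print one importvg command per non-root volume group.
#   $1 = lspv output saved on the OLD frame (PVID -> volume group)
#   $2 = lspv output from the NEW frame     (hdisk -> PVID)
# The first new hdisk whose PVID belonged to a volume group is used for it.
importvg_cmds() {
    awk 'NR == FNR {                      # first file: remember PVID -> VG
             if ($3 != "rootvg" && $3 != "None") vg[$2] = $3
             next
         }
         ($2 in vg) && !done[vg[$2]]++ {  # second file: one disk per VG
             print "importvg -y " vg[$2] " " $1
         }' "$1" "$2"
}

# Example (file names are hypothetical):
#   lspv > /tmp/lspv.new
#   importvg_cmds /tmp/lspv.old /tmp/lspv.new
```

Review the printed commands against the old lspv listing before running them.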

Boot Problem Management – Quick Guide

LED: 553
User action: Access the rootvg.  Issue 'df -k'.  Check if /tmp, /usr or / are full.

LED: 553
User action: Access the rootvg.  Check /etc/inittab (empty, missing or corrupt?).  Check /etc/environment.

LED: 551, 555, 557
User action: Access the rootvg.  Re-create the BLV:

# bosboot -ad /dev/hdiskx

LED: 551, 552, 554, 555, 556, 557
User action: Access rootvg before mounting the rootvg filesystems.  Re-create the JFS log, then run fsck afterwards:

# logform /dev/hd8

LED: 552, 554, 556
User action: Run fsck against all rootvg filesystems.  If fsck indicates errors (not an AIXV4 filesystem), repair the superblock (each filesystem has two superblocks, one in logical block 1 and a copy in logical block 31, so copy block 31 to block 1):

# dd count=1 bs=4k skip=31 seek=1 if=/dev/hd4 of=/dev/hd4

LED: 551
User action: Access rootvg and unlock the rootvg:

# chvg -u rootvg

LED: 523 – 534
User action: ODM files are missing or inaccessible.  Restore the missing files from a system backup.

LED: 518
User action: Mount of /usr or /var failed?  Check /etc/filesystems.  Check the network (remote mount), the filesystems (fsck) and the hardware.

http://publib16.boulder.ibm.com/pseries/en_US/infocenter/base/ledsrch.htm
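The guide above is small enough to keep as a lookup function in a toolbox script. A minimal sketch that just encodes the entries from this guide; the function name led_actions is made up, and the action texts are shortened versions of the ones above.

```shell
#!/bin/sh
# led_actions: print every action from the quick guide that applies to an LED.
#   $1 = LED code, e.g. 551
led_actions() {
    led=$1
    # 523 through 534 is a range in the guide, so handle it numerically.
    if [ "$led" -ge 523 ] 2>/dev/null && [ "$led" -le 534 ]; then
        echo "Restore the missing ODM files from a system backup"
    fi
    # Remaining entries: "<LED list>|<action>", one per line.
    while IFS='|' read -r leds action; do
        case " $leds " in
            *" $led "*) echo "$action" ;;
        esac
    done <<'EOF'
553|Check whether /tmp, /usr or / are full (df -k)
553|Check /etc/inittab and /etc/environment
551 555 557|Re-create the BLV (bosboot -ad /dev/hdiskx)
551 552 554 555 556 557|Re-create the JFS log (logform /dev/hd8), then run fsck
552 554 556|Run fsck on all rootvg filesystems; repair the superblock if needed
551|Unlock the rootvg (chvg -u rootvg)
518|Check /etc/filesystems, the network (remote mount), the filesystems (fsck) and the hardware
EOF
}

# Example: led_actions 551
```

An LED that is not in the guide prints nothing, which is the cue to go to the IBM LED search page linked above.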

Ether Channel configuration

System requirements

Two network interfaces (ent0 & ent1) in “Available”  state.

1. Put in request to netdatacom team to activate two ports on patch-panel specifically for ether channel.

2. AIX 5.2 ML 03 (minimum requirement)

Procedure

Log on to the system as root from the console (VERY IMPORTANT: the network interfaces are detached in the steps below)

Check available interfaces

lsdev -Cc adapter

Down and detach the interfaces that will be used for ether channel

ifconfig en0 down detach

ifconfig en1 down detach

Remove the devices from device list

rmdev -dl ent0

rmdev -dl ent1

rmdev -dl ent2

rmdev -dl en0

rmdev -dl en1

rmdev -dl en2

Bring back the devices to device list

cfgmgr

Set up the speed of NIC’s to 100Mbps Full Duplex

chdev -l ent0 -a media_speed=100_Full_Duplex

chdev -l ent1 -a media_speed=100_Full_Duplex

Setup interfaces for ether channel via SMITTY

smit etherchannel

 Add An EtherChannel / Link Aggregation

Select primary interface ent0

Move the cursor to “Backup Adapter” and hit “ESC-4” to list devices. (Please note F4 might not work on the console.)

Choose backup adapter  — ent1

Leave other values at default and hit “Enter”

Ether channel device ent2 will be created.

Do the standard procedure for adding an IP to interface en2.  The preferred way to do this is

smit mktcpip (and select en2 as the interface).
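Before starting the procedure it is worth confirming the requirement that both adapters are in the Available state. A minimal sketch that checks a captured lsdev listing, assuming the usual "name status location description" column layout; the function name adapters_available is made up for illustration.

```shell
#!/bin/sh
# adapters_available: read `lsdev -Cc adapter` output on stdin and succeed
# only if every adapter named as an argument is listed as Available.
adapters_available() {
    listing=$(cat)
    for a in "$@"; do
        echo "$listing" |
            awk -v a="$a" '$1 == a && $2 == "Available" { ok = 1 }
                           END { exit !ok }' || return 1
    done
}

# Example (on the AIX box):
#   lsdev -Cc adapter | adapters_available ent0 ent1 || echo "not ready"
```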