If the FC adapter driver detects a lost link between the storage device and the switch, it waits a short period (about 15 seconds) to allow the fabric to stabilize. If the device is still not detected at that point, the driver starts failing all I/Os.
The default setting of fc_err_recov is "delayed_fail", which suits single-path I/O, especially when there is only a single path to paging space.
For multiple paths, use "chdev -l fscsi0 -a fc_err_recov=fast_fail -P". (Note A: setting fast_fail may decrease the I/O failure time after a link loss between the storage device and the switch, which allows faster failover to alternate paths.) (Note B: fast_fail requires a switched environment and an adapter firmware level that supports it.)
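On a multipath host the same change usually has to be made on every FC protocol device, not just fscsi0. The sketch below only prints the chdev commands so they can be reviewed before anything is run. The helper name gen_fast_fail_cmds and the device names fscsi0 and fscsi1 are assumptions for illustration. Because of the -P flag, the change is written to the ODM only and takes effect the next time the device is reconfigured, typically at reboot.

```shell
# Print (do not run) a chdev command for each FC protocol device named
# on the command line. gen_fast_fail_cmds is a hypothetical helper name.
gen_fast_fail_cmds() {
    for dev in "$@"; do
        # -P records the change in the ODM only; it is applied when the
        # device is next configured (for example, at reboot).
        printf 'chdev -l %s -a fc_err_recov=fast_fail -P\n' "$dev"
    done
}

# Example device names; substitute the fscsi devices present on your host.
gen_fast_fail_cmds fscsi0 fscsi1
```

After reviewing the printed lines, they can be run one at a time or piped to sh.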
Dynamic Tracking
With dyntrk enabled, if the N_Port ID of a device changes, the FC adapter driver re-routes the traffic for that device to the new address while the device remains online. For example, moving a cable from one switch port to another causes an N_Port ID change, and connecting switches with an inter-switch link (ISL) can do the same.
chdev -l fscsi0 -a dyntrk=yes -P (default is no)
Requirements:
- The device's PWWN (port World Wide Name) and NWWN (node World Wide Name) must remain constant.
- A device is tracked only on the same HBA.
- No tracking takes place while an AIX system dump is in progress.
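Before changing either attribute it helps to confirm the current values. The sketch below pulls a single value out of `lsattr -El <device>` output. The helper name get_attr is hypothetical, and the sample text is only an illustration of the usual column layout (attribute name, value, description, user-settable flag), not output captured from a real system.

```shell
# Extract one attribute's value from `lsattr -El <device>` output on stdin.
# get_attr is a hypothetical helper; it assumes the attribute name is in
# column 1 and its current value in column 2.
get_attr() {
    awk -v name="$1" '$1 == name { print $2 }'
}

# On an AIX host (fscsi0 is an example device name):
#   lsattr -El fscsi0 | get_attr dyntrk

# Illustrative input so the sketch can be tried on any system:
sample='dyntrk       no           Dynamic Tracking of FC Devices        True
fc_err_recov delayed_fail FC Fabric Event Error RECOVERY Policy True'

printf '%s\n' "$sample" | get_attr dyntrk
printf '%s\n' "$sample" | get_attr fc_err_recov
```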
fast_fail and dyntrk are separate features; however, changing one of them can affect how the other behaves. The four combinations of these two attributes behave as follows:
1. fc_err_recov=delayed_fail and dyntrk=no
Default setting. The FC driver does not recover if the device changes, and I/Os take longer to fail. Desirable for a single-path environment.
2. fc_err_recov=fast_fail and dyntrk=no
If the link between the storage device and the switch is lost, the FC driver queries the device again after a 15-second delay. If the device is not found, I/Os are flushed back and failed; if the device is found but its scsi_id has changed, the FC driver does not recover (I/Os fail with PERM errors).
3. fc_err_recov=delayed_fail and dyntrk=yes
If the link between the storage device and the switch is lost, the FC driver queries the device again after a 15-second delay. If the device is not found, I/Os are flushed back and failed; if the device is found but its scsi_id has changed, the FC driver re-routes traffic to the new scsi_id. (Compared with case 4, the I/O failure time is longer; the difference is more noticeable under heavy I/O traffic.)
4. fc_err_recov=fast_fail and dyntrk=yes
If the link between the storage device and the switch is lost, the FC driver queries the device again after a 15-second delay. If the device is not found, I/Os are flushed back and failed; if the device is found but its scsi_id has changed, the FC driver re-routes traffic to the new scsi_id.
With dyntrk disabled, there is a big difference between "fast_fail" and "delayed_fail"; with dyntrk enabled, the difference is smaller. This is because the functions of dyntrk and fc_err_recov overlap to some extent.
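The four cases can be condensed into a small lookup, shown here as a rough sketch. The helper name describe_fc_policy and the wording of each summary are my own shorthand for the descriptions above; nothing here queries or changes a real adapter.

```shell
# Summarize the expected driver behavior for one combination of settings.
# describe_fc_policy is a hypothetical helper, not an AIX command.
# Usage: describe_fc_policy <fc_err_recov value> <dyntrk value>
describe_fc_policy() {
    case "$1/$2" in
        delayed_fail/no)  echo "no recovery on scsi_id change; slower I/O failure (default)" ;;
        fast_fail/no)     echo "no recovery on scsi_id change; faster I/O failure" ;;
        delayed_fail/yes) echo "re-route on scsi_id change; slower I/O failure" ;;
        fast_fail/yes)    echo "re-route on scsi_id change; faster I/O failure" ;;
        *)                echo "unknown combination: $1/$2" >&2; return 1 ;;
    esac
}

describe_fc_policy fast_fail yes
```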