Patch-ID# 102580-05
Keywords: KAIO i18n HA metacvt RAID hotspares miscellaneous diskset presto
Synopsis: Solstice DiskSuite 4.0: Jumbo patch
Date: Dec/06/95

Solaris Release: 2.3, 2.4, or later

SunOS Release: 5.3, 5.4, or later

Unbundled Product: Solstice DiskSuite

Unbundled Release: 4.0

Relevant Architectures: sparc x86

BugId's fixed with this patch: 1204343 1197922 1200302 1203186 1200689
        1203565 1205220 1200486 1200301 1205423 1199594 1196387 1204530
        1207664 1203768 1204341 1205314 1209464 1210917 1211366 1209002
        1206419 1218675 1218867 1219078 1219587 1219973 1222845 1208206
        1220284 1223225 1223655 1224730 1223725 1226416

Changes incorporated in this version: 1220284 1223225 1223655 1224730
        1223725 1226416

Patches accumulated and obsoleted by this patch:

Patches which conflict with this patch:

Patches required with this patch: 101945-32 or later rev for KAIO on
        SPARC Solaris 2.4
        KAIO is not supported on Solaris 2.4 x86 at this time.

Obsoleted by: DiskSuite 4.1 or later

Files included with this patch:

/kernel/drv/md
/kernel/misc/md_hotspares
/kernel/misc/md_mirror
/kernel/misc/md_raid
/kernel/misc/md_stripe
/kernel/misc/md_trans
/usr/lib/drv/preen_md.so.1
/usr/opt/SUNWmd/locale/C/LC_MESSAGES/SUNWmd.po
/usr/opt/SUNWmd/sbin/metaclear
/usr/opt/SUNWmd/sbin/metadb
/usr/opt/SUNWmd/sbin/metadetach
/usr/opt/SUNWmd/sbin/metahs
/usr/opt/SUNWmd/sbin/metainit
/usr/opt/SUNWmd/sbin/metaoffline
/usr/opt/SUNWmd/sbin/metaonline
/usr/opt/SUNWmd/sbin/metaparam
/usr/opt/SUNWmd/sbin/metareplace
/usr/opt/SUNWmd/sbin/metaroot
/usr/opt/SUNWmd/sbin/metaset
/usr/opt/SUNWmd/sbin/metastat
/usr/opt/SUNWmd/sbin/metasync
/usr/opt/SUNWmd/sbin/metattach
/usr/opt/SUNWmd/sbin/rpc.metad
/usr/opt/SUNWmd/sbin/rpc.metamhd
/usr/opt/SUNWmd/lib/X11/uid/Metatool/physicalview.uid
/usr/opt/SUNWmd/sbin/metatool

Not installed, but included:

metacvt
metacvt.1m

        This conversion utility should be used in place of the one
        shipped on the FCS distribution. See bugIDs 1203768, 1210917,
        and 1211366 below.
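The prerequisite named under "Patches required with this patch" can be
checked before installing. The following is a minimal sketch, not part of
the patch: has_patch_rev is a hypothetical helper, and it assumes that
each installed patch appears in the listing on a line of the form
"Patch: <id>-<rev> ..." (the layout commonly printed by showrev -p).
Adjust the match if your listing differs.

```shell
#!/bin/sh
# has_patch_rev BASE MINREV  (hypothetical helper, not shipped with the patch)
# Reads a patch listing on stdin and succeeds if some line beginning with
# "Patch:" names BASE-NN with NN >= MINREV.
# Assumption: lines look like "Patch: 101945-34 Obsoletes: ... Packages: ...".
has_patch_rev() {
    base=$1
    minrev=$2
    awk -v base="$base" -v min="$minrev" '
        $1 == "Patch:" {
            # Split "101945-34" into the patch base and its revision.
            n = split($2, part, "-")
            if (n == 2 && part[1] == base && part[2] ~ /^[0-9]+$/ &&
                part[2] + 0 >= min + 0)
                found = 1
        }
        END { exit(found ? 0 : 1) }'
}
```

Possible use on a SPARC Solaris 2.4 system where KAIO is wanted:

        showrev -p | has_patch_rev 101945 32 || echo "install 101945-32 or later first"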
Problem Description:

Bugs corrected in revision 05:

1220284 - If half or more of the disks in a metaset fail, the data is lost
          after reboot
1223225 - cannot take ownership of a diskset if a mirror cannot be opened
1223655 - md does not check for FWRITE on open if databases are stale
1226416 - set remains stale after replicas are fixed

        Recovery from catastrophic diskset failure (half or more of the
        disks or database replicas inaccessible) was not possible, short
        of bringing the disks back or re-creating the diskset. This fix
        allows the user to take the diskset with the new "-M" flag,
        delete replicas down to a majority, release the diskset, re-take
        the diskset, and repair the damage.

1224730 - Mirror resync region process can deadlock

        A system failure during the resync that follows a reboot or
        takeover could deadlock the system on the next reboot or diskset
        takeover.

1223725 - Incorrect memory allocation under heavy system load

        I/O can become serialized (slow) under some heavy load
        conditions.

Bugs corrected in revision 04:

1222845 - metatool patch 3 can not be installed if /usr is under DiskSuite
          control.

        A bug in installpatch prevented installation of the DiskSuite
        jumbo patch if root or /usr was mounted on a metadevice.

1208206 - pr: Combination of Prestoserve and ODS highly unreliable on
          filesystem recovery.

        The following workaround may be used. The solution is to not
        load and enable Prestoserve until the ODS metadisk is enabled.
        This may be done as follows:

        1. Edit the /etc/system file. Add the line

                exclude: drv/pr

           This will keep the Prestoserve driver from being loaded early
           in boot.

        2. Edit the /etc/init.d/SUNWmd.init file and add the following
           lines to the end of the "start" case:

        'start')
                rm -f /tmp/.mdlock
                if [ -x "$METAINIT" -a -c "$METADEV" ]; then
                        #echo "$METAINIT -r"
                        $METAINIT -r
                        error=$?
                        #echo "$error"
                        case "$error" in
                        0|1)    ;;

                        66)     echo "Insufficient metadevice database replicas"
                                echo "located. Use metadb to delete databases which"
                                echo "are no longer in existance. Exit the shell"
                                echo "when done to continue the boot process."
                                /sbin/sulogin < /dev/console
                                echo "Resuming system initialization."
                                ;;

                        *)      echo "Unknown $METAINIT -r failure $error."
                                ;;
                        esac
                        modload /kernel/drv/pr
                        presto -p >/dev/null
                fi
                ;;

           This will cause the Prestoserve driver to be loaded after
           metadevice initialization, and flush out any data.

        3. Edit the /etc/init.d/prestoserve file and replace the
           following line:

                presto -u

           with the following line:

                presto -u /file_system1 /file_system2 ....... /file_systemN

           where /file_system1, /file_system2 ....... /file_systemN is a
           list of every filesystem to be accelerated with Prestoserve.
           The list must not include /, /usr, /usr/kvm, /var, or
           /var/adm.

Bugs corrected in revision 03:

1218675 - ODS mirroring doesn't work on metadevices build up from slices
          bigger than 4 GB

        Mirroring concatenations with components bigger than 4 GB hangs
        the driver if a component fails.

1218867 - Raid-5, unrecoverable after system crash, metareplace didn't
          reset state

        Losing all of the disks in a RAID-5 metadevice can lead to
        time-consuming repair.

        RAID has been changed to support RAID devices built on a single
        controller. This is mainly for users of SSA-100 and SSA-200 who
        are placing the entire RAID device on a single SSA.

        The change involves LAST ERRORED devices. LAST ERRORED is the
        condition that occurs when more than a single slice fails in a
        RAID device. When this occurs, writes are disallowed and opens
        return ENXIO. The RAID device must be repaired starting with the
        slice that is in ERRORED or maintenance state.

        The LAST ERRORED condition is no longer persistent. It is
        cleared at each open or mount of the RAID device. If a slice has
        genuinely failed, the error will occur again. This handles the
        situation of all the slices in a RAID device failing: LAST
        ERRORED will be cleared at the next use, such as open or mount.
        The change affects two areas of operation. The first is a device
        that does not operate at boot. The second is when a controller,
        cable, or SSA fails during operation.

        Where a controller fails to operate after a boot, any attempt to
        open or mount the RAID device results in no change of state and
        an error being logged, indicating that the open failed. As
        before, the open or mount will return ENXIO.

        When a controller fails during use, a single slice in the RAID
        device will become ERRORED, and all others will become LAST
        ERRORED. Once the problem is corrected, on the next mount or
        open all slices in LAST ERRORED will return to OKAY. Only one
        slice, the ERRORED one, will need to be enabled with
        metareplace(1M) -e.

        Since data integrity is only guaranteed for the worst case of a
        single slice failing, RAID metadevices become read-only if more
        than one error occurs. This change will cause the second error
        to be lost at the next mount or open. This is an unfortunate
        side effect that can be checked for by running metastat(1M)
        before mount or open. If a slice is LAST ERRORED, it is
        necessary to take corrective action.

1219078 - If a set becomes stale (50% replica loss) all metadevices become
          read-only

        Losing more than half of the replicas on a shared diskset can
        cause all disksets (including "local" metadevices) to become
        read-only.

1219587 - metareplace -e cNtNdNsN on offline submirror brings submirror
          online w/o resync

        The user was not prevented from replacing offlined disks. The
        correct repair method is to replace errored disks, or to detach
        the submirror, replace the disk, and re-attach the submirror.

1219973 - metaset(1M) does not always balance replicas properly on SSA's

        The database replica balancing mechanism chooses bad
        configurations if disks in a diskset are not spread evenly
        across all controllers.

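The description of 1218867 recommends checking metastat(1M) output for a
LAST ERRORED slice before a RAID metadevice is mounted or opened. A
minimal sketch of such a check follows. last_errored_slices is a
hypothetical helper, and the exact state wording printed by metastat is
an assumption: the pattern accepts both "Last Erred" and "LAST ERRORED",
in any case.

```shell
#!/bin/sh
# last_errored_slices  (hypothetical helper, not shipped with the patch)
# Reads metastat-style text on stdin and prints every line that mentions a
# last-errored state. Exit status is 0 if at least one such line was found.
# Assumption: the state appears as "Last Erred" or "LAST ERRORED".
last_errored_slices() {
    grep -i -E 'last[ _-]?err(or)?ed'
}
```

Possible use, where d10 stands for a hypothetical RAID metadevice:

        if metastat d10 | last_errored_slices; then
                echo "d10 has a LAST ERRORED slice; repair before mounting"
        fi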
Bugs corrected in revision 02:

1210917 - S94SUNWmd.cvt does not remove itself when root is not a metadevice

        Using metacvt to upgrade Online: DiskSuite revisions 2.0 - 3.0
        can fail if the root filesystem is not on a metadevice.

1211366 - Description of unavailable option, -f, is left undeleted in
          manpage/metacvt.1m.

        Usage message problem.

1209002 - Hot sparing for RAID does not happen if all hotspares are labelled

        RAID devices do not use labelled (slice starting at cylinder 0)
        hotspares.

1206419 - Bad message is used for popup title of "Slice Remove Error".

        Internationalization fix for Japan localization.

Bugs corrected in revision 01:

1204343 - KAIO patch for 494 exists, SDS needs to keep up, KAIO no build 494

        Performance improvement for users of libaio. Requires Solaris
        2.4 patch 101945-27 and 102020-05 or later for SPARC. KAIO for
        Solaris 2.4 x86 is not supported at this time.

1197922 - Sets: clearing a stripe that is in a set and has a hotspare hangs

        Clearing some stripes which have been hotspared may hang or
        panic the system in some circumstances.

1200302 - Problem for showing Component lists which is displayed on Drop
          site of Disk View.
1203186 - L10Ned message for "Show Trans..." in the Mirror Information can
          not be seen.
1200689 - Device name for "stripe x of dx" is not messaged.
1203565 - Additional error messages needing to be in I18N format
1205220 - "OK" is not good button name.
1200486 - Problem List dumps core when selecting [File] -> [Log to file...]
1200301 - A message related label named FROM_STRING for Size filter is not
          CATALOGED.
1205423 - Stripe number does not displayed correctly on Concat Information
          window.

        Internationalization fixes for Japan localization.

1199594 - RAID SETs - if an error occurs on a raid in a large set the
          system panics

        RAID metadevices in disksets may panic the system if a component
        fails.

1196387 - RAID: attach device fails, replace successful, metadeives stays
          in maintenance

        Recovery from a failed attach, while successful, can leave the
        device in a confusing state.

1204530 - Trans device showing error status if subcomponent is ok

        Confusing UFS logging device status.

1207664 - Making a slice in a one way mirror makes the tool almost unusable

        Attaching a second submirror when the first submirror is
        unreadable can corrupt the metadevice state database.

1203768 - metacvt cannot handle system fs's that the master trans device
          is a mirror.

        Using metacvt to upgrade Online: DiskSuite revisions 2.0 - 3.0
        can fail if system filesystems (those used by operating system
        install) are on UFS logging devices which have mirrored
        subdevices.

1204341 - dropping a concat into a toplevel concat causes a SEGV

1205314 - support SSA200 in the GUI

        The graphical user interface, metatool, displays a SPARCstorage
        Array model 200 with 3 trays of 2 busses each, instead of 6
        trays of 1 bus each.

1209464 - metainit ordering dependency in md.tab

        metainit -a will fail if the metadevice hierarchy is deeper than
        3 levels. This commonly occurs when a UFS logging device has a
        master subdevice consisting of a mirror with stripes with
        hotspare pools. The metacvt conversion script may also fail for
        metadevices such as these.

Patch Installation Instructions:
--------------------------------
Generic 'installpatch' and 'backoutpatch' scripts are provided within
each patch package with instructions appended to this section. Other
specific or unique installation instructions may also be necessary and
should be described below.

Special Install Instructions:
-----------------------------
The metacvt script is not installed by this patch. Users of this script
may copy it out of the patch and use it according to the directions for
the original metacvt script in the FCS product.

Instructions to install patch using "installpatch"
--------------------------------------------------
1. Become super-user.
2. Apply the patch by typing:
.
See /tmp/log. for reason for failure.
Explanation and recommended action: The installation of one of the
patch packages failed. Installpatch will backout the patch
to leave the system in its pre-patched state. See the log file
for the reason for failure. Correct the problem and
re-apply the patch.
Error message:
Pkgadd of package failed with error code .
Will not backout patch...patch re-installation.
Warning: The system may be in an unstable state!
See /tmp/log. for reason for failure.
Explanation and recommended action: The installation of one of
the patch packages failed. Installpatch will NOT backout the
patch. You may manually backout the patch using backoutpatch,
then re-apply the entire patch. Look in the log file for the
reason pkgadd failed. Correct the problem and re-apply the
patch.
Patch Installation Messages:
---------------------------
Note: the messages listed below are not necessarily considered errors
as indicated in the explanations given. These messages are, however,
recorded in the patch installation log for diagnostic reference.
Message:
Package not patched:
PKG=SUNxxxx
Original package not installed
Explanation: One of the components of the patch would have patched a
package that is not installed on your system. This is not
necessarily an error. A patch may fix a related bug for several
packages. Example: suppose a patch fixes a bug in both the
online-backup and fddi packages. If you had online-backup installed
but didn't have fddi installed, you would get the message
Package not patched:
PKG=SUNWbf
Original package not installed
This message only indicates an error if you thought the package
was installed on your system. If this is the case, take the
necessary action to install the package, backout the patch (if
it installed other packages) and re-install the patch.
Message:
Package not patched:
PKG=SUNxxx
ARCH=xxxxxxx
VERSION=xxxxxxx
Architecture mismatch
Explanation: One of the components of the patch would have patched a
package for an architecture different from your system. This is not
necessarily an error. Any patch to one of the architecture specific
packages may contain one element for each of the possible
architectures. For example, assume you are running on a sun4m. If
you were to install a patch to package SUNWcar, you would see the
following (or similar) messages:
Package not patched:
PKG=SUNWcar
ARCH=sparc.sun4c
VERSION=11.5.0,REV=2.0.18
Architecture mismatch
Package not patched:
PKG=SUNWcar
ARCH=sparc.sun4d
VERSION=11.5.0,REV=2.0.18
Architecture mismatch
Package not patched:
PKG=SUNWcar
ARCH=sparc.sun4e
VERSION=11.5.0,REV=2.0.18
Architecture mismatch
Package not patched:
PKG=SUNWcar
ARCH=sparc.sun4
VERSION=11.5.0,REV=2.0.18
Architecture mismatch
The only time these messages indicate an error condition
is if installpatch does not correctly recognize your architecture.
Message:
Package not patched:
PKG=SUNxxxx
ARCH=xxxx
VERSION=xxxxxxx
Version mismatch
Explanation: The version of software to which the patch applies is
not installed on your system. For example, if you were running SunOS
5.3, and you tried to install a patch against SunOS 5.2, you would
see the following (or similar) message:
Package not patched:
PKG=SUNWcsu
ARCH=sparc
VERSION=10.0.2
Version mismatch
This message does not necessarily indicate an error. If
the version mismatch was for a package you needed patched, either
get the correct patch version or install the correct package version.
Then backout the patch (if necessary) and re-apply.
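The "Package not patched:" messages described above all share one
layout, so the installation log can be condensed into one line per
skipped package. A minimal sketch, assuming the log entries look exactly
like the examples in this section. skipped_packages is a hypothetical
helper, and the log file is read from standard input because its full
name depends on the patch being installed.

```shell
#!/bin/sh
# skipped_packages  (hypothetical helper, not shipped with the patch)
# Reads installpatch log text on stdin and prints, for each
# "Package not patched:" entry, one line: PKG ARCH reason.
# ARCH is shown as "-" when the entry has no ARCH= line.
skipped_packages() {
    awk '
        /^[ \t]*Package not patched:/ { inblock = 1; pkg = ""; arch = "-"; next }
        inblock && /^[ \t]*PKG=/      { sub(/^[ \t]*PKG=/, "");  pkg = $0;  next }
        inblock && /^[ \t]*ARCH=/     { sub(/^[ \t]*ARCH=/, ""); arch = $0; next }
        inblock && /^[ \t]*VERSION=/  { next }
        inblock {
            # First remaining line of the entry is the reason text.
            sub(/^[ \t]+/, "")
            print pkg, arch, $0
            inblock = 0
        }'
}
```

Run over a saved log, this makes it easier to spot a package that was
expected to be patched among the architecture and version mismatches that
are harmless.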
Message:
Re-installing Patch.
Explanation: The patch has already been applied, but there is
at least one package in the patch that could be added. For
example, if you applied a patch that had both Openwindows and
Answerbook components, but your system did not have Answerbook
installed, the Answerbook parts of the patch would not have
been applied. If, at a later time, you pkgadd Answerbook, you
could re-apply the patch, and the Answerbook components of the
patch would be applied to the system.
Message:
Installpatch Interrupted.
Installpatch is terminating.
Explanation: Installpatch was interrupted during execution
(usually through pressing ^C). Installpatch will clean up
its working files and exit.
Message:
Installpatch Interrupted.
Backing out Patch...
Explanation: Installpatch was interrupted during execution
(usually through pressing ^C). Installpatch will clean up
its working files, backout the patch, and exit.
Patch Backout Errors:
---------------------
Error message:
prebackout patch exited with return code .
Backoutpatch exiting.
Explanation and corrective action: the prebackout script
supplied with the patch exited with a return code other
than 0. Generate a script trace of backoutpatch to determine
why the prebackout script failed. Correct the reason for
failure, and re-execute backoutpatch.
Error message:
postbackout patch exited with return code .
Backoutpatch exiting.
Explanation and corrective action: The postbackout script
supplied with the patch exited with a return code other than
0. Look at the postbackout script to determine why it failed.
Correct the failure and, if necessary, RE-EXECUTE THE
POSTBACKOUT SCRIPT ONLY.
Error message:
Only one service may be defined.
Explanation and corrective action: You have attempted to specify
more than one service from which to backout a patch. Different
services must have their patches backed out with different
invocations of backoutpatch.
Error message:
The -S and -R arguments are mutually exclusive.
Explanation and recommended action: You have specified both a
non-native service to backout, and a package installation root.
These two arguments are mutually exclusive. If backing out a
patch from a non-native usr partition, the -S option should be
used. If backing out a patch from a client's root
partition (either native or non-native), the -R option
should be used.
Error message:
The service cannot be found on this system.
Explanation and recommended action: You have specified a non-
native service from which to backout a patch, but the
specified service is not installed on your system. Correctly
specify the service when backing out the patch.
Error message:
Only one rootdir may be defined.
Explanation and recommended action: You have specified more than
one package install root using the -R option. The -R option
may be used only once per invocation of backoutpatch.
Error message:
The directory cannot be found on this system.
Explanation and recommended action: You have specified a
directory using the -R option which is either not mounted,
or does not exist on your system. Verify the directory name
and re-backout the patch.
Error message:
Patch has not been successfully applied to this system.
Explanation and recommended action: You have attempted to backout
a patch that is not applied to this system. If you must
restore previous versions of patched files, you may have to
restore the original files from the initial installation CD.
Error message:
Patch has not been successfully applied to this system.
Will remove directory
Explanation and recommended action: You have attempted to back
out a patch that is not applied to this system. While the
patch has not been applied, a residual
/var/sadm/patch/ (perhaps from an unsuccessful
installpatch) directory still exists. The patch cannot be
backed out. If you must restore old versions of the patched
files, you may have to restore them from the initial
installation CD.
Error message:
This patch was obsoleted by patch .
Patches must be backed out in the order in
which they were installed. Patch backout aborted.
Explanation and recommended action: You are attempting to backout
patches out of order. Patches should never be backed-out out
of sequence. This could undermine the integrity of the more
current patch.
Error message:
Patch was installed without backing up the original
files. It cannot be backed out.
Explanation and recommended action: Either the -d option of
installpatch was set when the patch was applied, or the save
area of the patch was deleted to regain space. As a result, the
original files are not saved and backoutpatch cannot be used.
The original files can only be recovered from the original
installation CD.
Error message:
pkgrm of package failed return code .
See /var/sadm/patch//log for reason for failure.
Explanation and recommended action: The removal of one of the
patch packages failed. See the log file for the reason for
failure. Correct the problem and run the backout script again.
Error message:
Restore of old files failed.
Explanation and recommended action: The backout script uses the
cpio command to restore the previous versions of the files
that were patched. The output of the cpio command should
have preceded this message. The user should take the
appropriate action to correct the cpio failure.
KNOWN PROBLEMS:
On client server machines the patch package is NOT applied
to existing clients or to the client root template space.
Therefore, when appropriate, ALL CLIENT MACHINES WILL NEED
THE PATCH APPLIED DIRECTLY USING THIS SAME INSTALLPATCH
METHOD ON THE CLIENT. See instructions above for
applying patches to a client.
A bug affecting a package utility (e.g., pkgadd, pkgrm, pkgchk)
could affect the reliability of installpatch or backoutpatch
which uses package utilities to install and backout the patch
package. It is recommended that any patch that fixes package
utility problems be reviewed and, if necessary, applied before
other patches are applied. Such existing patches are:
100901 Solaris 2.1
101122 Solaris 2.2
101331 Solaris 2.3
SEE ALSO
pkgadd, pkgchk, pkgrm, pkginfo, showrev, cpio