Patch-ID# 101318-94 Keywords: security y2000 kernel libc lockd libaio automountd sockmod ypbind NFS Synopsis: SunOS 5.3: Kernel Update (includes libc, lockd) Date: Feb/06/2001 Solaris Release: 2.3 SunOS Release: 5.3 Unbundled Product: Unbundled Release: Xref: Topic: SunOS 5.3: Kernel Update (includes libc, lockd) NOTE: Refer to Special Install Instructions section for IMPORTANT specific information on this patch. Relevant Architectures: sparc BugId's fixed with this patch: 1088703 1091548 1097418 1098381 1107880 1108615 1113339 1117303 1120225 1121069 1122992 1123140 1123788 1123876 1124354 1124745 1125134 1129167 1130721 1130726 1130791 1131237 1132086 1132302 1132554 1134617 1135394 1136034 1136580 1136864 1137581 1137587 1137670 1137798 1137978 1138196 1138207 1138924 1139124 1139146 1139327 1139493 1139753 1139765 1140047 1140209 1140378 1140503 1140610 1140626 1140802 1141072 1141642 1141654 1142151 1142365 1142479 1142583 1142622 1142662 1142882 1143043 1143231 1143439 1143479 1143567 1143962 1144086 1144228 1144308 1144536 1144683 1144765 1144922 1144962 1145129 1145242 1145401 1145415 1145421 1145457 1145471 1145542 1145573 1145617 1145661 1145746 1145757 1146065 1146159 1146534 1146549 1146597 1146726 1146790 1146808 1146840 1146912 1146922 1146924 1146985 1147165 1147226 1147349 1147620 1147647 1147964 1147977 1148003 1148354 1148668 1148689 1149088 1149105 1149399 1149458 1149518 1149714 1149774 1149928 1149929 1150058 1150304 1150306 1150417 1150491 1150596 1150613 1150765 1151044 1151137 1151159 1151192 1151509 1151592 1151598 1151619 1151643 1151704 1151883 1151965 1151999 1152033 1152150 1152168 1152251 1152282 1152410 1152482 1152682 1152710 1152922 1152960 1152977 1152995 1153024 1153051 1153178 1153229 1153241 1153253 1153274 1153324 1153458 1153707 1153790 1153911 1154060 1154325 1154452 1154502 1154515 1154770 1154975 1155130 1155136 1155298 1155505 1155515 1155701 1155803 1155948 1155951 1156103 1156132 1156518 1156550 1156556 1156649 1156947 
1157047 1157053 1157062 1157110 1157265 1157267 1157460 1157463 1157524 1157935 1157978 1157990 1158000 1158215 1158398 1158568 1158638 1158639 1158674 1159152 1159160 1159248 1159330 1159347 1159439 1159691 1159757 1159825 1159882 1159986 1160068 1160087 1160090 1160112 1160151 1160181 1160207 1160379 1160662 1160681 1160720 1161359 1161404 1161525 1162202 1162269 1162277 1162475 1162623 1162712 1162834 1163167 1163170 1163275 1163312 1163346 1163347 1163352 1163355 1163357 1163360 1163445 1163533 1163551 1163617 1163741 1163747 1163776 1163847 1163944 1163946 1164156 1164319 1164428 1164504 1164519 1164554 1164569 1164800 1164926 1165014 1165247 1165250 1165270 1165413 1165615 1165649 1165675 1165687 1165689 1165736 1165902 1165987 1166349 1166581 1166629 1166632 1166712 1166779 1166848 1166933 1167154 1167235 1167439 1167485 1167500 1167602 1167647 1168055 1168083 1168167 1168240 1168331 1168365 1168635 1168672 1169003 1169109 1169132 1169257 1169391 1169424 1169590 1169640 1169686 1169775 1169823 1169904 1169909 1169914 1169945 1170036 1170038 1170091 1170233 1170350 1170407 1170488 1170507 1170527 1170544 1170669 1170814 1170832 1171219 1171363 1171599 1171609 1171613 1171729 1171745 1171833 1171950 1172009 1172118 1172155 1172243 1172260 1172320 1172438 1172644 1172731 1172926 1172979 1172998 1173079 1173201 1173212 1173279 1173731 1173939 1173973 1174222 1174270 1174303 1174494 1174552 1174572 1174767 1174786 1174847 1174851 1174913 1174992 1175115 1175127 1175304 1175356 1175368 1175499 1175668 1175968 1176247 1176350 1176508 1177055 1177469 1177578 1177620 1177644 1178114 1178190 1178236 1178295 1178363 1178379 1178391 1178400 1178641 1178753 1178761 1178810 1178824 1178957 1178999 1179173 1179346 1179480 1179526 1179695 1179738 1179814 1179884 1180414 1180578 1180720 1181201 1181258 1181259 1181327 1181571 1182113 1182194 1182428 1182440 1182458 1182492 1182506 1182509 1182597 1182686 1182937 1183215 1183260 1183552 1183568 1183662 1183714 1183749 1183837 
1183927 1184256 1184636 1184788 1184991 1185149 1185694 1186156 1186287 1186303 1186420 1186439 1186557 1186845 1186920 1187179 1187350 1187536 1187901 1188044 1188367 1188551 1188701 1189271 1189329 1189968 1191422 1192238 1192309 1192982 1193147 1193448 1194397 1194673 1195422 1195436 1195996 1196670 1196741 1197816 1197871 1198278 1198439 1199256 1199500 1200502 1200512 1201926 1202675 1203291 1204638 1204871 1205614 1205757 1207568 1208034 1209012 1209014 1209687 1209917 1210770 1210995 1211172 1211537 1212068 1212974 1213871 1214057 1214251 1216036 1216540 1217220 1217231 1217312 1217941 1219671 1219766 1220257 1222599 1222780 1223452 1223900 1224486 1224604 1226516 1226919 1227031 1228963 1229031 1232010 1233088 1234630 1234879 1236638 1236801 1237027 1238582 1239343 1241118 1241369 1241944 1242395 1243116 1244589 1244872 1244917 1244971 1246630 1248840 1249903 1252967 1253223 1255536 1258379 1258916 1259200 1260589 1261016 1263728 1267082 3001400 4010565 4011648 4018985 4028300 4029971 4032974 4034868 4036063 4043953 4045229 4045268 4045941 4050818 4053189 4054980 4057606 4057738 4059632 4073081 4091935 4095455 4135388 4135970 4139126 4144921 4157655 4165597 4167968 4175558 4190645 4242224 4261612 4295834 4296198 4303194 Changes incorporated in this version: 4057738 4295834 4303194 4296198 Patches accumulated and obsoleted by this patch: 101267-01 101294-02 101315-01 101316-02 101319-02 101326-01 101329-16 101344-11 101346-03 101349-01 101378-21 101379-02 101406-01 101411-04 101484-03 101485-01 101489-04 101500-04 101597-02 101615-02 101637-01 101672-01 101674-01 101694-01 101831-01 101855-02 101859-01 101869-01 101881-01 102110-01 102168-01 102220-03 102445-01 103063-01 Patches which conflict with this patch: Patches required with this patch: Obsoleted by: Files included with this patch: /etc/default/utmpd /etc/fs/nfs/mount /etc/init.d/rpc /etc/init.d/utmpd /etc/lib/unix_scheme.so.1 /etc/rc0.d/K50utmpd /etc/rc0.d/K68rpc /etc/rc1.d/K50utmpd /etc/rc1.d/K67rpc 
/etc/rc2.d/S71rpc /etc/rc2.d/S88utmpd /etc/ttysrch /kadb /kernel/drv/arp /kernel/drv/cgfourteen /kernel/drv/cgsix.conf /kernel/drv/clone /kernel/drv/cn /kernel/drv/esp /kernel/drv/icmp /kernel/drv/ip /kernel/drv/isp /kernel/drv/log /kernel/drv/mm /kernel/drv/rootnex /kernel/drv/sd /kernel/drv/sx /kernel/drv/sx_cmem /kernel/drv/tcl /kernel/drv/tco /kernel/drv/tcoo /kernel/drv/tcp /kernel/drv/udp /kernel/drv/zs /kernel/exec/aoutexec /kernel/exec/elfexec /kernel/fs/autofs /kernel/fs/lofs /kernel/fs/nfs /kernel/fs/procfs /kernel/fs/tmpfs /kernel/fs/ufs /kernel/misc/seg_drv /kernel/sched/TS /kernel/strmod/arp /kernel/strmod/ldterm /kernel/strmod/ptem /kernel/strmod/sockmod /kernel/strmod/tco /kernel/strmod/tcoo /kernel/strmod/timod /kernel/sys/c2audit /kernel/sys/nfs /kernel/sys/shmsys /kernel/unix /sbin/init /sbin/su /sbin/sulogin /usr/4lib/libc.so.1.8 /usr/4lib/libc.so.2.8 /usr/bin/chkey /usr/bin/finger /usr/bin/nisaddcred /usr/bin/nischgrp /usr/bin/nischmod /usr/bin/nischown /usr/bin/nischttl /usr/bin/nisgrpadm /usr/bin/nistbladm /usr/bin/su /usr/bin/uptime /usr/bin/w /usr/include/nfs/nfs_clnt.h /usr/include/rpcsvc/nis_tags.h /usr/include/sys/ddi_impldefs.h /usr/include/sys/errno.h /usr/include/sys/proc.h /usr/include/sys/scsi/targets/sddef.h /usr/include/sys/ser_sync.h /usr/include/sys/sleepq.h /usr/include/sys/socket.h /usr/include/sys/t_lock.h /usr/include/sys/vnode.h /usr/kernel/sched/RT /usr/kvm/adb /usr/kvm/crash /usr/kvm/lib/adb/page /usr/kvm/lib/adb/pid2proc /usr/kvm/lib/adb/setproc /usr/kvm/prtdiag /usr/lib/autofs/automountd /usr/lib/fs/autofs/automount /usr/lib/fs/nfs/mount /usr/lib/fs/nfs/share /usr/lib/fs/nfs/umount /usr/lib/libaio.so.1 /usr/lib/libauth.a /usr/lib/libauth.so.1 /usr/lib/libbsm.a /usr/lib/libbsm.so.1 /usr/lib/libc.a /usr/lib/libc.so.1 /usr/lib/libnisdb.a /usr/lib/libnisdb.so.1 /usr/lib/libnsl.a /usr/lib/libnsl.so.1 /usr/lib/libp/libc.a /usr/lib/libsocket.a /usr/lib/libsocket.so.1 /usr/lib/libthread.so.1 /usr/lib/libthread_db.a 
/usr/lib/libthread_db.so.0 /usr/lib/netsvc/rusers/rpc.rusersd /usr/lib/netsvc/yp/ypbind /usr/lib/nfs/lockd /usr/lib/nis/nisaddent /usr/lib/nis/nisserver /usr/lib/pics/libc_pic.a /usr/lib/security/unix_scheme.so.1 /usr/lib/straddr.so.2 /usr/lib/utmp_update /usr/lib/utmpd /usr/sbin/newkey /usr/sbin/nis_cachemgr /usr/sbin/nislog /usr/sbin/pwconv /usr/sbin/rpc.nisd /usr/sbin/static/rcp /usr/sbin/syslogd /usr/sbin/wall /usr/ucb/users /usr/ucblib/libucb.a /usr/ucblib/libucb.so.1 Problem Description: 4057738 temporary filename security exploits 4303194 nfs client does not send an NLM UNLOCK request when client app exits 4295834 NETPATH security problem in libnsl 4296198 NIS_OPTIONS sh vars (libnsl) security problem (from 101318-93) 4175558 TZ=GMT0BST-1,M3.5.0/2:00,M10.5.0/2:00 breaks 6 times from now to 2037 4190645 Y2000 Problem in libc in function posixgetdst - Backport of 4152473 4261612 profil not disabled on exec*() as indicated in man page 4242224 memory leak in IP (from 101318-92) 4165597 getdate currently supports years 1970 and above which is a violation of x/open standards. This fix will eliminate that limitation to allow usage of years 1902 and above. This fix is a backport of 4050856 & 4036732. 4167968 su - can create corrupted environment - backport of 1214794 4157655 two buffer overflows exist in libauth 1212068 panic ( deadlock condition ) occurs on the Patch Id:T101318-74 system. 
(from 101318-91) 4144921 auditd fails to log all events during bulk audit generation 4139126 libnsl buffer overflows 4135970 usr/lib/utmpd is not in 101318-90 4135388 rpc.nisd buffer overflow 4095455 automounter security problem 4073081 SUNWcsr/install/postinstall & postremove scripts missing 1236638 *passwd* shadow file occasionally gets deleted in large user environment (from 101318-90) 4059632 kernel watchdog resets with misaligned stack 4045229 strptime and getdate year calculation not count century; strptime range checks 4050818 getdate %C (century) should use current year offset if year offset not given 4091935 bcp /usr/4lib/libc mktime() fails for specific -ve values in tm structure 4045941 bcp /usr/4lib/libc mktime() doesn't care leap year. (from 101318-89) 4053189 chkey and newkey has buffer overflow 4045268 nis_cachemgr does not verify authenticity of objects 4057606 out of domain NIS+ lookups don't work after applying fix for 4045268 4043953 kernel randomly paniced with assertion failure in callout.c, line 345 4032974 system hangs when lbolt wraps around. 1226919 ping -sv -i 127.0.0.1 224.0.0.1 causes a panic 1187901 Process hung in nanosleep 4010565 su can be interrupted by and not logged in /var/adm/log (from 101318-88) 1267082 nfs mount command should ignore bad options and complete mount 4036063 security problem with writing core files 4054980 w(1) command not year 2000 safe in Solaris 2.3 (from 101318-87) 4034868 Security hole: buffer overflow in bin password. get the effect uid of 0 ( root ) 4032974 system hangs when lbolt wraps around. 
1238582 privileged ifconfig ioctls by normal user succeed on sockets created as root 1233088 ioctl(PIOCPSINFO) is 100 times too slow on multi-threaded processes (from 101318-86) 4029971 getopt security problem 4028300 automounter security hole 4018985 Function authdes_getucred is not in libnsl.so.1 1261016 patch 101318-78 breaks multi-threaded apps 1212974 Bogus bootparam packet makes rpcbind stop working (from 101318-85) 1252967 2.3 NFS server can not handle the locking state correctly. (from 101318-84) 1255536 2.5 data fault panic in tdiraddentry accessing tmpfs 1259200 no more syslog from rpc.nisd after the fix for 1244917 in T101318-80 1249903 rpc.nisd hung in nis_list_svc on getmsg in _rcv_conn_con 4011648 Fix for bug 1248840 introduces performance degradation in rsh (from 101318-83) 1263728 Installed latest Kernel jumbo patch and now machine panics 1258379 2.3 Data fault in kmem_alloc() w/T101318-80 1258916 nis_cachemgr causing other many processes to hang in semop 1226516 Booting over the net causes the machine to panic or watchdog sometimes 1223900 alarm(2) doesn't work properly with large arguments 1211537 mxcc parity check disabling is incorrect in sun4m kernel 1182440 Some clients dump core & 5.3 lockd server hangs after heavy locking/unlocking 1142151 rpc.lockd can core dumps after going through reclaim 1141072 lockd accepts reclaim requests after grace period has expired 1241118 libthread panic in thr_join handling of zombie threads seems to be broken (from 101318-82) 1260589 syslogd only accepts messages sent to primary host name 1253223 System running 2.3 with KJP-80 on single CPU /24MB hangs in fork test case 1248840 solaris 2.3,sc2000, TCP socket can't handle FIN pkt from client surely, deadlock 1244971 solaris 2.3, patch 101318-77 has a bug, it can't handle `boot -s` correctly. 1239343 Threaded application dumped core with multi cpu machine. 1237027 If using sendmail V8 and loading aliases from YP nisaddent deletes all aliases. 
1219671 Memory is given free which was never allocated before. 1175499 repeated getdate calls leaks memory 1244589 T101398-08 can cause system hang during ddi_getlongprop() on particular prom (from 101318-81) 1246630 nisd can potentially hang if it gets a SIGCHLD/SIGHUP on an established callback 1244872 nis_cachemgr can deadlock when servers are unavailable 1234630 Client side RPC handle caching and server side fd leaks needs a general solution 1162712 RPC callbacks can fail due to TLI error 1147349 Secure-RPC server cache too small (from 101318-80) 1244917 syslog(3) does not correctly cache the file descriptor that it writes on 1242395 NIS+ TTLs for objects not correct on 2.4 slave replicas and 2.3 slave/clients. 1241369 Panic with memory addr alignments when linking a tmp vnode (duplicate) 1232010 retransmit time, 15 seconds, for NIS+ UDP queries is too long 1228963 SC2000 with 2.3 OS and T101318-76 , crashes system with no free MMU context 1143231 synchronization stubs should be exported for third party vendors (from 101318-79) 1241944 2.3 nfs client does not send an NLM UNLOCK request for blocked locks. 1166632 Xsun without preloading SX driver hangs system hard (from 101318-78) 1229031 page_unlock: page not locked panic occurring when locking address space 1222599 pullupmsg() can corrupt kernel memory and hang CPUs 1211172 Automountd fails to unmount lofs file system 1234879 system panic with auditd: zero divide trap when 101318-75 applied. 1227031 Getdate doesn't recognize leap years. 
1216036 NIS+ client library does not retransmit RPC call to rpcbind on NIS+ servers 1208034 SC1000 5.4 Data fault out of tcp_rput_data null pointer under heavy load (from 101318-77) 1223452 do not not hold the zs_high_cur spin lock when thread PIL can drop to 10 1219766 recv() returns 0 bytes if you close socket too quickly after doing a send() 1217231 deadlock between shmdt and pagefault when threads of same process 1220257 Syslog(3) possibly can be abused to gain root access on Solaris 2.x systems 1217941 Data fault from cron in anon_getpage 1209917 fddi crashes when MTU is larger than 4352 (from 101318-76) 1214251 System panics at boot time during 2nd call to srmmu_setup 1209014 uprintf can cause panic on modified OBP systems 1186287 an unprivileged user can use utmp_update to clear entries from /var/adm/utmp 1214057 crash provides open fd for /dev/mem 1207568 ticotsord driver panics due to race conditions during close 1184256 data fault in freeproc() caused by race with cfork() 1213871 srmmu_alloc panic when system assigns pid to more than max_nproc processes 1205757 init dumps core when processes exit too fast (duplicate of bug #1129167) 1129167 init dumps core when processes exit too fast. If zombie processes are created at a rate faster than init(1) can reap their status, init eventually dies with a SEGV from a stack overflow. 
1175668 automountd consumes all cpu time and loops when cd to automount directory 1217312 gethostbyname() can trash an existing open file descriptor 1210995 dump/rwall still have utmp security problems 1191422 read on af_unix socket returns 0 when other end does a write(len = 0) 1175356 automounter fails to remount unmounted directories 1217220 BSD sockets make SVR4 programs hang 1210770 2cpu machine mouse jumps with realtime cpu bound thread 1187350 nisd[xxx] WARNING: db_query::db_query: bad index 1172926 application hangs on a TCP connection if the remote system dies 1216540 potential deadlock when process auditing is enabled 1209687 Panic from audit_thread_free with saved path not empty 1186420 in.ftpd does not call sa_auth_acctmg and thus ignores password aging, etc. 1158568 recursive mutex enter on kma_lock can occur from calling kmem_freepool 1171219 kernelmap: rmap overflow 1222780 bug in ftpd that can cause it to dump core. (from 101318-75) 1209012 hard hang in findmod: The code in the while goes round and round. 1204871 nistbladm -e erroneously reports an error when modifying entry objects 1201926 strrput is calling queuerun() this may lead to a dead lock. 1193448 autofs unnecessarily blocks requests on already mounted filesystems 1183662 Search for lofs to unmount should stop on first match 1192309 machine panics with audit_finish: residue audit record 1186845 wait4() emulation doesn't handle pid < 0 1184991 panics in tmpnode_hold - data fault - TMPFS 1184636 watchdog reset when checking residency of pages segment attached SHM_SHARE_MMU 1182428 minfree does not change audit logs to alt directory on upgraded machines 1164319 sc2000 panic with sema_v turnstile corruption (from 101318-74) 1205614 Some of the exported APIs in libbc (/usr/4lib) are redundant 1200512 System panic with Deadlock condition detected:cycle in blocking chain on 2.3 1200502 fopen() and unlink() makes a corrupt file on multi cpu machine. 1197871 101318-63 or later, SS1000 tod is reset. 
1199256 "date" resets to epoch after system off for >3 days 1196741 Race in zs driver open/close routines for bidirectional port panics the system 1196670 ctl-c in TLI program causes kernel write fault/panic. 1195436 With patch T101945-20 /.profile is not executed 1195422 nis+ library can corrupt memory when servers are unresponsive 1194673 nisping -C occasionally causes rpc.nisd to hang for 10 minutes 1193147 Host map lookups can crash an NIS+ server in yp-compat mode. 1192238 "noac" mount option not honored immediately after mount 1186439 diskless don't boot after installing t101318-67 1185149 nl_langinfo is not MT-safe 1183749 rpc.nisd dumps core in xdrrec_getlong( ) 1180720 NIS+ servers hang on a getmsg() call 1179346 svc_dg_reply should retry t_sndudata if interrupted 1195996 rpcbind dumps core in routine __svcauth_sys() in 2.3 with patch 101318-69 1182509 FTP transfer hangs 1182194 system panics since kernel patch was upgraded from 101318-45 to later 1182113 another memory leak in localtime_r() 1180578 PPP stops working after system is installed with 101318-63 1180414 streams allocb failure results in data corruption 1179884 non-blocking socket connection over x25 would hang the system. 1179814 mmap() and mprotect are not working correctly. 1179738 Users with 8 characters name or more will not be logged into utemp 1178641 NFS client should fail to open files with the mandlock bit set 1176508 panic mutex adaptive exit under 2.4 fcs when accessing directory over nfs. 1172998 x86: auto_lookup(): assertion failure in mutex_exit() on non-existent fs 1175127 2.3 tcp performance over satellite/delayed links is very poor compared to 4.1.3 1171613 "last" problem - wtmpx entries not properly updated when logging out 1169775 Solaris 2.X does not correctly handle Copy-On-Write faults on a page 1169003 portmapper v.3 not compatible with v.2 bug. 
1165675 rquotad returns inappropriate error on nfs client 1162834 deadlock between prioctl() and munmap() 1177469 /proc causes page deadlock in NFS 1182597 swapped out lwp->lwp_ar0 in prgetprregs causes data fault and hang 1187536 deadlock using /proc 1188701 assertion failed: new_state != LMS_WAIT_CPU 1189271 procfs: run-on-last-close doesn't always work 1192982 deadlock condition detected: cycle in blocking chain 1198439 procfs is out-of-spec with respect to microstate accounting 1162623 saving attachments from mailtool in file causes it to dump core 1130791 2.x setsockopt SO_SNDBUF fails with protocol error for stream AF_UNIX domain 1124354 Scorpion and Gal-Ross panics on Scheduler stress test (from 101318-73) 1202675 automountd can dump core due to double endnetconfig() call (from 101318-72) 1198278 ksh loops in kernel making NFS read calls on its history file. 1197816 Banyan needs proc_ref(), proc_signal(), and prop_unref() support in 2.3 (from 101318-71) 1189968 Need strict multihoming in IP to prevent breakins over the Internet 1182506 solaris 2.3/2.4 has incorrect xon/xoff flow control for serial driver 1179526 rpc.nisd doesn't work correct on recursive group members 1188551 Creating more than 32767 files/directories loses parent directory. 1181327 5.3 ypbind doesn't bind to a 4.1.3_U1 ypserver with libc patch (or 4.1.4) 1179173 nistbladm -m overwrites entries who's keys match modified entries 1186303 process hangs after allocating shared memory and forking child process 1181259 NFS mount fails with: couldnt bind to reserved port - ESC# 11042 1178190 MT program with more than 6 thread will consume system resource totally. 1183568 NFS client get old data after file on server being updated. 1181201 port option does not work with autofs. 
1174767 readonly child rpc.nisd dumps core while parent process checkpointing 1174494 rpc.nisd core dumps with corrupt stack while checkpointing after a clean install 1183260 all SSI replicas went into full resync (from 101318-70) 1178824 When catman is invoked on a system w/BSM & 101318-54 the system crashes 1159986 lckpwdf causes passwd to crash 1188044 sc1000 with 101318-68 panics in mutex_enter 1178114 ioctl SIOCSPGRP/FIOSETOWN path broken for MT libsocket(linked to libthread) code 1159825 write attempts to locked nfs filesystem may result in EAGAIN, instead of retry 1182458 network interface can hang on NFS server with high ipReasmDuplicates counts 1175115 nfs write error "(file handle: xxx xxx" message cannot be redirected by syslog 1143479 setuid/setgid program takes on default system limits (from 101318-69) 1186557 pid_ref field wraps around manifests as kmem list corruption. (from 101318-68) 1183837 Random processes dump core on sun4d 1174552 rewind followed by read crashes on newly created file. 
1185694 systems with 7 32MB DSIMMS fail to boot
1183714 failed fork of a process with shared memory segments panics kernel
1158215 Solaris2.3: syslog(3) can't output Japanese language
1183927 SC2000 running 2.3 with 101318-64 panic: recursive mutex_enter
1182937 System panics when doing an oracle shutdown after installing 101318-64
1146790 automount fails to correctly parse multiple line mounts > 1024 bytes
1182492 autmountd's macro_expand function may cause buffer to overflow

(from 101318-67)

1182686 kernel rwlocks can allow readers and writers simultaneously
1181571 2.3 patch fix for bug 1164519 can cause panic
1164800 panic: ddi_setcallback: no callback structures

(from 101318-66)

1178957 sigurg not delivered on second oob data arrival
1164800 panic: ddi_setcallback: no callback structures
1181258 SAVECORE SEGMENTATION FAULT WHILE TRYING TO SAVE CORE
1159330 automountd unmounts the wrong lofs

(from 101318-65)

1179695 System panics in flk_get_first_sleeping_lock_on_vnode()
1177644 swift specific mmu write function doesn't flush tlb

This fixes a problem on SS5 that has to do with memory mapping of
non-memory space (i.e. SBus space). The mappings that are set up can
be incorrect.

The condition for which the process sleeps (in this case to get a file
lock) was not checked after a normal process wakeup. Procfs operations
such as truss perform a normal wakeup of the process. Because the
condition was not rechecked and was not true, a sleeping lock was
freed, and the system panicked when that lock was accessed later.

(from 101318-64)

1178999 chkey -p does not work from a remote domain
1177055 The NIS+ backend for getpublickey() trashes the stack
1178761 ufs_putapage:bn == UFS_HOLE panic when filesystem fills up...
1178753 SS20 with 7 x 32Mb Simms installed will panic and hangs.
1178363 unstrcpy() can cause an EFAULT failure when it copies certain bytes.
1178295 /usr/sbin/eeprom caused Aurora machine to panic
1177620 gettimeofday doesn't work correctly when dual processors are used
1176350 the SVS V shared memory functions are protected by a single mutex
1174913 autofs checking for local subnets doesn't work when NIS+ is the nameservice
1170407 many rcp's to localhost intermittently errors "rcp: unknown user `uid' "
1166848 L1 A and then sync locks up machine
1145415 nis_ping(3n) is not MT-safe because nis_finddirectory() is MT unsafe

This patch makes nis_finddirectory() MT-safe. This means nis_ping(3n)
is now MT-safe as well. Non-MT programs will notice no difference, and
there are no API changes (nis_finddirectory() still takes the same
arguments).

For versions greater than 2.10 of the Open Boot Prom, L1-A followed by
"sync" will sometimes hang.

Under a heavy system load NIS clients could fail to contact the ypbind
process and would then fail. This would cause a variety of symptoms
that can be traced to failed calls on getpwnam(), getpwuid(),
gethostbyname(), etc. The ypbind process has been changed to cache the
ypserv transport address in a file. Now NIS clients can get the address
of ypserv without connecting to ypbind. In addition, NIS clients will
now keep trying to get results as long as ypbind is running.

This patch fixes bug 1174913: autofs checking for local subnets doesn't
work when NIS+ is the nameservice. The problem is that when a mount
takes place, no preference is given to the interface that the client
machine is sitting on. The mount should first use the server's
interface that the client machine is attached to, and then an alternate
if that one does not respond. The cause is that the automounter looks
up the table netmask when it should look up the table netmasks.

Shminit has a typo. It should allocate an array of mutexes instead of
an array of pointers to mutexes.
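
The shminit allocation mistake can be illustrated in user space. The
following is a minimal sketch only, assuming POSIX thread mutexes in
place of kernel mutexes. The names seg_locks_alloc() and
seg_locks_free() are hypothetical and are not taken from the kernel
source.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/*
 * Hypothetical helper, not the kernel's shminit(): allocate and
 * initialize one mutex per shared memory segment.
 */
pthread_mutex_t *seg_locks_alloc(size_t nseg)
{
    /*
     * Correct: each element has the size of a mutex.  The mistaken
     * form would size each element as sizeof(pthread_mutex_t *),
     * which is the size of a pointer, not of a mutex.
     */
    pthread_mutex_t *locks = calloc(nseg, sizeof(*locks));
    size_t i;

    if (locks == NULL)
        return NULL;
    for (i = 0; i < nseg; i++) {
        if (pthread_mutex_init(&locks[i], NULL) != 0) {
            /* Undo the mutexes initialized so far. */
            while (i > 0)
                pthread_mutex_destroy(&locks[--i]);
            free(locks);
            return NULL;
        }
    }
    return locks;
}

/* Destroy every mutex and release the array. */
void seg_locks_free(pthread_mutex_t *locks, size_t nseg)
{
    size_t i;

    for (i = 0; i < nseg; i++)
        pthread_mutex_destroy(&locks[i]);
    free(locks);
}
```

With one mutex per segment, a caller locks only the entry for the
segment it is working on, for example &locks[segid].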
On MP sun4m systems, clock interrupts may be serviced more than once
per tick, causing the system's notion of time to drift.

This fixes the panic that occurs on an SS5 running Solaris 2.3 with
patch T101318-61 installed when the "eeprom" command is executed in
order to change an option setting in NVRAM.

The kernel string copy routine can cause a data fault during exec when
the string being copied contains 0x80 and is aligned in a certain way.
As most strings copied by the kernel use the 7-bit ASCII code, this
error will almost never be seen.

SS20 with 7 x 32Mb Simms installed will panic and hang.

A corrupt inode can be created when extending a UFS file and running
out of space. This can later cause a panic 'ufs_putapage:
bn==UFS_HOLE'. The original fix was incorrect. This reverses that fix
and fixes the bug correctly.

chkey does not work across NIS+ domains, and the NIS+ publickey backend
trashes the stack.

(from 101318-63)

1179480 sun4d needs to clear important registers on startup
1178810 cgfouteen driver causes applications to hang waiting for vertical retrace
1178761 ufs_putapage:bn == UFS_HOLE panic when filesystem fills up...
1178641 NFS client should fail to open files with the mandlock bit set
1178400 NFS copies btw 690MP(512) server and Sunos 4.1.3 corrupt data without any error
1178391 system with PPP device using the same IP address as le0 will stop working
1178379 semctl/ipcs fails when semid is negative with patch 101520-02
1176247 Performance is poor on sun4m Viking MP systems due to unnecessary cross calls
1176350 the SVS V shared memory functions are protected by a single mutex
1166712 significant priority inversion problems when using mmap file access.
1174303 rpc.nisd died in checkpoint without dumping cores
1169823 synctodr() : unable to sync error message ever three days or so
1169257 Automountd taking up to 10% of cpu time
1168331 system() fails VSX, test 53 and 59
1151509 automounter's built in timeout is too short for low speed lines

automountd by default only waits 15 seconds for servers to reply to its
initial connection requests. This timeout may be too short for slow
links or for very busy servers. This patch allows the system
administrator to tune the total timeout by specifying the number of
attempts (original + retries). This is done by adding a retry=n entry
to the options list for the busy server entry in the automounter map.
The default is one attempt (retry=0), when no retry=n option is
specified in the options field, or when the retry=n option is invalid.
Each retry is equivalent to approximately 30 seconds.

Since automountd is currently single-threaded, this option should be
used with care, as it will cause automountd to take more time to decide
whether a server is dead or not (reply received or not), causing
incoming autofs kernel requests to be queued for longer periods of
time.

For example, the following /etc/auto_home map uses the retry=1 option
to force automountd to send the original request, and retry it once
more, before giving up with a "server not responding" error. If the
reply is received before the next retry, there will be no
retransmission.

NOTE: It is not recommended to set this option as the map default,
since it will cause automountd to wait needlessly long for replies
from servers that really are dead. Servers that are up would have
replied without the need for retries.

    /etc/auto_home:

    # Home directory map for automounter
    #
    userx   -nosuid,hard,intr,retry=1   busy_server:/export/home/userx
    +auto_home

system() would previously fail and set errno to EINTR if it were
interrupted by a signal. This was in violation of standards. system()
will now either block or ignore signals.
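
The signal handling expected of system() can be sketched as follows.
This is an illustration of the POSIX rules (ignore SIGINT and SIGQUIT,
block SIGCHLD, and retry the wait if it is interrupted), not the code
shipped in this patch. The name system_sketch() is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for system(); returns the wait status or -1. */
int system_sketch(const char *cmd)
{
    struct sigaction ign, old_int, old_quit;
    sigset_t chld, old_mask;
    pid_t pid;
    int status = -1;

    /* Ignore SIGINT and SIGQUIT in the caller while the command runs. */
    ign.sa_handler = SIG_IGN;
    sigemptyset(&ign.sa_mask);
    ign.sa_flags = 0;
    sigaction(SIGINT, &ign, &old_int);
    sigaction(SIGQUIT, &ign, &old_quit);

    /* Block SIGCHLD so a handler cannot reap the child first. */
    sigemptyset(&chld);
    sigaddset(&chld, SIGCHLD);
    sigprocmask(SIG_BLOCK, &chld, &old_mask);

    pid = fork();
    if (pid == 0) {
        /* Child: restore the caller's signal state, then run the shell. */
        sigaction(SIGINT, &old_int, NULL);
        sigaction(SIGQUIT, &old_quit, NULL);
        sigprocmask(SIG_SETMASK, &old_mask, NULL);
        execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);
        _exit(127);
    }
    if (pid > 0) {
        /* Retry on EINTR instead of failing back to the caller. */
        while (waitpid(pid, &status, 0) == -1) {
            if (errno != EINTR) {
                status = -1;
                break;
            }
        }
    }

    /* Parent: restore the original dispositions and signal mask. */
    sigaction(SIGINT, &old_int, NULL);
    sigaction(SIGQUIT, &old_quit, NULL);
    sigprocmask(SIG_SETMASK, &old_mask, NULL);
    return status;
}
```

The return value is a wait status, so a caller would examine it with
WIFEXITED() and WEXITSTATUS() as it would for system().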
automountd should stat() the filesystem specific umount command before
attempting to fork and exec it. If the stat fails with ENOENT, use the
umount() syscall (meaning there is no filesystem specific umount).

System hangs until one hits return on the console. The system's time
then becomes out of sync by 2 hours to around 24 hours.

This bug affected the naming services drastically. A counter was
incremented twice but decremented only once, so the master server
could not spawn new processes for replication, niscat, etc. There was
also a timing issue in checkpointing that caused rpc.nisd to exit
occasionally without dumping core.

The fix for 1176247 reduces the number of cross calls on sun4m
architecture machines generated by softlocking in physio. This fix can
reduce the number of cross calls by up to a factor of six, depending on
the physio load.

The fix for 1166712 reduces the wait time for reader/writer locks when
multi-threaded applications mix Real Time threads and Time Share
threads. The Real Time threads doing i/o could get blocked behind TS
threads that were changing the address space, e.g. unmap, mmap, shmget.

The fix for 1176350 increases the concurrency through the SYS V shared
memory operations. Previously all shared memory operations were done
under a single global mutex. Now there is a mutex per shared memory
segment.

Removed obsolete error check.

When an IP address is shared between an ethernet link and a
point-to-point link, and the links go down and the point-to-point link
comes up first, the ethernet link will not be able to come up with the
shared IP address.

MPs can transmit IP packets with the same ip_id field, potentially
causing fragmented packets to be reassembled incorrectly. Normally this
is not a problem since the corruption will be detected by the UDP/TCP
checksum. However, SunOS 4.X does not by default verify the UDP
checksum, in which case the incorrectly reassembled packets can cause
NFS file corruption.
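
The README does not say how the duplicate ip_id values were
eliminated. One common way to hand out distinct identifiers from
several CPUs is an atomic counter. The sketch below shows that
technique only, assuming C11 atomics. The names next_ip_id and
ip_id_alloc() are hypothetical and do not come from the IP module.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical shared counter; static storage starts at zero. */
static atomic_uint next_ip_id;

/*
 * Return the next 16-bit identifier.  atomic_fetch_add() performs the
 * read, add and write as one indivisible step, so two CPUs calling
 * this at the same time cannot both receive the same value.  A plain
 * "id = counter; counter = id + 1;" sequence has no such guarantee.
 */
uint16_t ip_id_alloc(void)
{
    return (uint16_t)atomic_fetch_add(&next_ip_id, 1u);
}
```

The identifier still wraps after 65536 packets, as any 16-bit field
must. The sketch only removes the window in which two senders read the
same counter value.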
The NFS server will deny access to mandatory lock files. This is done
for two reasons. First, mandatory locking is not supported over NFS.
Second, it is dangerous for the server to access mandatory lock files.
It would be very easy for a normal user to completely hang the NFS
server. The user could create a file and set the mode to indicate that
it is a mandatory lock file. The user could then lock the file with a
program which just does a pause, go to an NFS client, and try to access
the file. With each request from the client, including retries, another
NFS server daemon on the server would get blocked, until the server ran
out of NFS server daemons.

A corrupt inode can be created when extending a UFS file and running
out of space. This can later cause a panic 'ufs_putapage: bn ==
UFS_HOLE'.

If more than one application is waiting for vertical retrace, one can
get hung indefinitely.

On a sun4d, during the first boot after a power-on reset, the system
may experience panics due to stale bits left over in the MFSR register
after POST (a subsequent boot succeeds). This fix ensures the MFSR is
cleared during the cpu's startup.

(from 101318-62)

1178236 System panics with data fault in free_zero_zero()

Socket interface networking programs under heavy use may panic the
machine with free_zero_zero() on the kernel call stack. This fixes the
problem in the sockmod module.

(from 101318-61)

1177578 strmakemsg/strgeterr causes panic in strrput due to NULL mblk ptr
1174847 SS5 running 4.1.3U1 - running customer application - HARD HANGS
1173939 Common mmu_mp_writepte routine for ROSS modules hurts HyperSPARC performance
1160068 init can abort with a SIGSEGV in sigpoll()

Init seems to be caught in an infinite loop, catching a SIGSEGV signal
over and over again.

The Ross hypersparc represented several changes over previous Ross
CPUs. In order to make efficient use of these changes, code has been
added to Solaris.
This code centers around the efficient use of the instruction cache and larger main cache. In addition, in order to support self-modifying code, a more efficient means of broadcasting "FLUSH" instructions has been incorporated.

This is an enhancement to the workaround created for bug 1161592. The change is local to sun4m/swift cpu code and has NO impact on other non-swift platforms.

Kernel panic in putnext/ptcwrite.

(from 101318-60)

1174786 Unnumbered interfaces with respect to PPP have problems
1173201 shutdown hangs system
1173079 System clock behaves strangely after adjtime(2) call
1172731 After PPP connects improper routing entries cause problems
1169945 rpcbind crashed with a segmentation violation
1169686 4.1.3 system on network goes down, hangs 2.3 system
1136864 page daemon can unload pages in use by /dev/mem

/dev/mem uses segkmem to map the page it's trying to read. Unfortunately, if the page in question is not on the freelist, the segkmem code will cause the mapping to be added to the page's mapping list, which allows the page daemon to find the mapping and unload it. Normally pages mapped by segkmem are locked against the page daemon, but this is not the case in segkmem_mapin when the page is found via page_numtopp. When this happens, the copyout of the page will fail and the system will panic when a mapping that is supposed to be locked is found to be invalid.
_end() + 15001488
debug_enter(0x3040,0xf5afbe24,0xe95ec54c,0x3,0xe0027dc8,0x50) + b8
complete_panic(0xe00d7924,0x0,0xfffd,0x0,0x14,0xe) + c0
do_panic(0xe00d7924,0xe95ec6cc,0xe95ec6cc,0x3e86af0,0xe4ed9c00,0x3) + 20
cmn_err(0x3,0xe00d7924,0xe95ec6e4,0x3d4,0xe5e9423c,0x3) + 1c
srmmu_unlock(0xe5e7a024,0xe00fa9d0,0xf5507000,0x1000,0xe00fa800,0xe5e2371c) + ac
segkmem_mapout(0xe00ad4e4,0xf5507000,0x1000,0xe,0xe00ad0c8,0xe00fa9d0) + 98
mmrw(0xe00e5400,0xe95ec808,0x0,0x0,0x0,0xe) + 1d8
rw(0x1000,0xe95ec920,0x1,0x0,0xe95ece94,0xf5c32d84) + 1e4

The problem shows up when a "ps" thread is running through the virtual memory area to get the address space size for a mapped file. The address space lock is held and a get attributes function is called. This initiates an nfs get attribute request. If the machine that the request is made to is not responding, the nfs request will block. The address space lock which is held by the blocked ps thread might block other processes on the local machine.

Typically when a server goes down all nfs file system activity is blocked on any clients. The nfs operation resumes once the server comes up. In this situation a server is powered down and causes a client to hang. The hang is due to a process pile-up. The client is doing a ps and its thread is holding the address space lock (as_lock) for a running process; call it A. Process A is a mapped file from the server. The client ps thread path has reached rm_assize(), which needs to get the file size, so it calls VOP_GETATTR(), which goes across the wire to the server. This operation goes nowhere because the server is not running. The as_lock held by the ps process is blocking other processes such as init.

The solution is not to go over the wire but to return a cached entry for the file size. The change is to define a new attribute flag in vnode.h called ATTR_HINT. The rm_assize() function will use this flag when it calls VOP_GETATTR().
The nfs getattr function will see that the size of the file is requested and that the passed in flag is ATTR_HINT. It will return the file size from the rnode rather than make a request to the server.

This patch fixes a problem with rpcbind periodically core dumping. The fix is contained in libnsl.

Sharing an IP address between a point-to-point interface and a numbered interface can result in invalid routing entries.

Fix strangeness with time-of-day after adjtime() call.

The local transport drivers ticots and ticotsord use a minor number algorithm which results in two endpoints referring to the same device after a long time, when the minor number space wraps around. This can result in applications which hold an endpoint open and the open/close endpoints will see a leakage over a very long period of time. The hang during shutdown is also a consequence.

When a point-to-point interface shares an IP address with a numbered interface, the point-to-point link will stop receiving packets if the numbered interface is shut down.

(from 101318-59)

1175968 non-master cpu network interfaces broken on SS1000
1151704 kernel read fault at addr=0x2000, pte=0x1 uadmin: Text fault
1140626 MP systems panics with Data Fault

The system can panic with a BAD TRAP, Data fault with the following stack backtrace:

Begin traceback...
sp = e95982b0
cpu_prop_op+0x48 @ 0xe000b2a8, fp=0xe9598328 args=14c002c b4c45e0 1 0 e00dc900 e959844c
cdev_prop_op+0x68 @ 0xe00483e0, fp=0xe9598388 args=14c002c b4c45e0 1 9 e00dc900 e959844c
e_ddi_getprop+0x48 @ 0xe003164c, fp=0xe95983e8 args=14c002c b4c45e0 1 9 e00dc900 e959844c
Sysbase+0x5b59c @ 0xf545b59c, fp=0xe9598450 args=14c002c b4c45e0 e00dc900 9 ffffffff 0
Sysbase+0x5b1b8 @ 0xf545b1b8, fp=0xe95984b0 args=14c002c 3 f63b8180 f63b8184 f546ba84 f63b8180
Sysbase+0xaa894 @ 0xf54aa894, fp=0xe9598568 args=e00e7ff4 14c002c 3 f6294604 e00fee64 f6294600
lookuppn+0x4bc @ 0xe0059560, fp=0xe95985d0 args=f658d400 f658d474 e959863c f658d408 f666ff00 0
vn_create+0x68 @ 0xe009e7f0, fp=0xe9598740 args=e95987f4 0 e95987fc 0 f5a54e08 f63d3208
mknod+0x164 @ 0xe009bd24, fp=0xe9598800 args=0 0 e9598868 1 0 e9598864
syscall+0x3d8 @ 0xe002c4c0, fp=0xe95988b8 args=e9598e94 f000 0 6000 0 1
.syscall+0xa4 @ 0xe0005d34, fp=0xe9598938 args=e00daef4 e9598eb4 0 e9598e90 fffffffc ffffffff
(unknown)+0x11660 @ 0x11660, fp=0xdffffc50 args=22ac4 61ff d32c d3 d300 d32c
End traceback...

init 6 panics

Support for SC2000E and SS1000E was patched in the 2.3 and 2.4 releases, and integrated into the 2.5 release. This fix introduced a bug which causes non-zero system boards to have tpe-link-test turned to the incorrect setting. This has the effect of rendering the additional le interfaces non-functional.

(from 101318-58)

1167235 panic data fault in strioctl - apparently doing TIOCSPGRP
1174270 ufs 'recursive mutex enter' panic.
1174851 SC2000 hang due to no ip flow control when used with FDDI board
1175304 vnode v_count is not maintained correctly

Protect with mutex the testing and setting of the session and controlling terminal related flags in the streamhead.

While running 'hsmdump' it is possible for the system to panic with "recursive mutex enter". 'hsmdump' uses a special ioctl() to put a "name lock" on the filesystem.
There is a very subtle code path in which the locking protocol is subverted, hence the panic.

In testing of patch 101318-57 for bug 1174851, a problem was found. vnode v_count numbers are not maintained correctly, causing vnodes to never disappear or, in the earliest bug, drop to zero prematurely and panic the system.

(from 101318-57)

1174851 SC2000 hang due to no ip flow control when used with FDDI board
1174572 Viking workaround enabled on parts that do not need it
1172438 apps using AUTH_DES fail when many simultaneous requests are made
1172260 2.3 <-> 4.1.2 socket connection looses sync and delays transfer of data
1172243 customer runs application from dumb terminal and system crashes.
1172009 recv() on sockets should return the error only once for SunOS 4.X compatibility
1171599 multiple simultaneous priocntl(0,0,PC_GETCID,) commands panic kernel

The problem relates to the priocntl(..., PC_GETCID, ...) system call, which returns the class id of a named scheduling class. If more than one process makes this call at the same time with the same named class, and the class does not exist in the system, the system may panic.

In SunOS 4.X sockets, when a read() or recv*() call returns an error the application can do another read()/recv*() and get an EOF. This patch applies this subtle aspect of socket semantics to SunOS 5.X.

Current patch has a bug in it. System would panic if stream event structure's pidp was null.

Crash does not display accurate data. Stream event structure member se_proc was replaced by se_pidp. The pid structure must now be read to get the process slot number.

A TCP connection might not start immediately when a window update is received after the Solaris 2.3 side has sent a zero window probe. With some TCP implementations at the remote end there will be a few seconds of delay (waiting for a retransmit timeout).

When more than 25 users try to log in to a NIS+ client simultaneously, only about 25 can make it.
The rest get the "Login incorrect" message.

Automatically apply the patch for bug 1137125 on systems that need it.

When a high bandwidth network interface is receiving a large number of packets addressed to it, but with no one bound to the specified port, then IP does a lot of processing. IP sends an ICMP unreachable packet back to the source of the original packet. This can cause large amounts of kmem to be consumed, which can cause subsequent kmem_alloc() failures, including allocb() failures in the driver for the high bandwidth interface. This can cause subsequent fragments of large IP frames to be dropped by the driver. IP will then hold on to these incomplete frames for 60 seconds, awaiting the arrival of the missing fragments, which will never show up. In the case of an NPI FDDI interface at 80Mb/S this can be 300Mbyte of kmem.

(from 101318-56)

1172243 customer runs application from dumb terminal and system crashes.
1166779 Add support for dragon+ dual power supply
1152922 prtdiag(1M) should display SBus clock frequency
1174222 5.3 automounter does not mount from 4.1.3 NFS servers with libc patch
1173212 SECURITY: su can display root password in the console
1170488 rpc.nisd running in compatibility mode dumped core
1151883 NIS compat mode does not support maps of kind "x.y" or "x.y.z"
1169909 Running xlib code in Realtime class causes code to block. in poll()
1166349 panic vn_rele: vnode ref count 0
1165687 non-blocking reads on sockets block under Solaris 2.3
1162269 all net IP broadcast packets (255.255.255.255) have a ttl of 1
1164519 Socket returns with "address already in use" because conn in "BOUND" state

Applications and environments that depend on routers forwarding broadcast packets might run into problems with the fact that IP sets the TTL of all broadcast packets to 1 (in order to avoid any broadcast storms when there are misconfigured machines on the wire).
This patch makes it possible to override the default TTL of 1.

TCP sockets that are reset by the peer will return to the BOUND state instead of the IDLE state. If the application holds this socket open, it will prevent any other application from binding to the same TCP port number. This can cause services such as FTP to hang.

The non-blocking attribute of a socket endpoint is not transferred from a non-blocking listener endpoint to an accepting endpoint. This causes some non-blocking socket programs to block. This patch fixes the problem by setting the accepting endpoint's non-blocking attribute if the listener was non-blocking.

Fix for a "panic vn_rele: vnode ref count 0" situation.

Real time stream threads will block in a poll.

NIS+ YP compat server does not support special "x.y" maps. It also dumps core when a NULL key field is received from a YP client.

If a username is too long (greater than 8 characters), when su root fails or succeeds for that user, the characters typed in as the password are echoed to the console.

automountd first makes a null RPC call to the remote portmapper (rpcbind) of the server from which it needs to mount, to determine whether the server is able to respond to mount requests. In some cases (multiple servers specified on a map entry) it would call the remote portmapper using version 3, which is not available on non-SVR4 systems. Some systems are now silent about version mismatches, which causes automountd to assume the server is dead (or at least its rpcbind/portmapper). This patch fixes this problem by always using version 2 of the portmapper protocol.

Add dual power supply support to SC2000 and SC2000E systems. Systems with dual power supplies will receive warnings on the system console when one of the redundant power supplies fails.

Modify prtdiag(1M) to indicate SBus Clock frequency. The SC2000E and SS1000E run with 25 MHz SBus clock frequency. The SC2000 and SS1000 run with 20 MHz SBus clock frequency.
This change to prtdiag(1M) makes it easy to determine the SBus clock frequency on the system.

Running applications that do I_SETSIG on console, when console is the serial port (i.e. not the frame buffer), causes the system to crash when attempting to send a signal to a process.

(from 101318-55)

1172979 spurious SIGALRM received in test program that forks child processes
1172118 hat_share fails during seg_dup if parent's L1 has lost ISM mappings
1171609 lockd swap leak
1169590 Writing directly to /dev/fd/0 will cause a panic
1167647 Auditing of some system call fail.
1163312 processes can hang during exec() when exec args area is depleted
1173731 ps hangs inside prlock function
1160112 socket library accidentally closes file descriptor on error
1120225 recv() returns EPIPE when called with MSG_PEEK
1152710 socket lib in 2.3/2.2 have problems with not clearing bad connections and errno

AF_UNIX and AF_INET sockets can sometimes get EPIPE errors for recv(MSG_PEEK).

Processes can hang during exec() when exec args area is depleted.

Doing a cat > /dev/fd/0 will panic the system. C2audit checks to make sure that there is only one file path per file struct. In the file descriptor file system a file struct is aliased through a different path. The check for duplicate path can be removed in audit.c; this will fix the panic.

During auditing of an open or creat failure due to a permission problem, the audit subsystem will fail to output the path to the file. This creates a security problem for the sites that use auditing to monitor their systems. Modified audit_savepath and audit_setf to make sure that the path token is always produced.

The fix is to a portion of the lock manager which creates client handles. The change was made to keep lockd from allocating the maximum size of 64k for send and receive buffers. The new size is 4k.

Using ISM on LX, Classics, SparcStation 5 may cause the system to panic with "invalid shared memory l1 ptp" message.
This specification of signal actions from the signal(5) manual page was being violated:

Setting a signal action to SIG_IGN for a signal that is pending causes the pending signal to be discarded, whether or not it is blocked. Any queued values pending are also discarded, and the resources used to queue them are released and made available to queue other signals.

The condition under which the pending signal was not being discarded was the specific case of SIGALRM signals generated by the setitimer(ITIMER_REAL) interface. The malfunction happens in a narrow race condition which is triggered by intensively alternating between setting a signal handler and setting the action to SIG_IGN while the itimer is active.

(from 101318-54)

1164428 System will hang when echoing msgs between ttya & ttyb
1172320 1155298 fix in patch 101318-49 breaks c2_audit
1170544 utmpd loops consuming all cpu if CDE is installed
1170669 utmpd issues error messages
1172155 utmpd - runs away consuming CPU resources
1169424 rsh hangs occasionally during high activity. daemon rshd not running
1165250 2.3 system hang with ldterm out of blocks message
1168167 BSM with nt flag and 101318 patch loaded crashes at login
1170038 Panic memory address alignment in audit_sock under 2.3 from rpcbind when audit

The problem is seen on highly active networks when a client attempts to reconnect to a TCP port whose previous connection from the same client is still in TIME-WAIT state. This can cause the server to adjust its global TCP ISS (Initial Send Sequence) number backwards. This can cause a subsequent connection establishment by the server to another system to be ignored, as others will see this sequence number as being older. The result is that applications using TCP (network sockets) hang. This ISS adjustment is done in an attempt to guarantee that ISS numbers are always greater than the last sequence number used in a previous incarnation of a server/client/port connection, while at the same time conserving the sequence number space.
But the global ISS must never move backwards in time !!!

Under certain circumstances the utmp daemon can go into an infinite loop, hanging or at least slowing down the system. This fix is distinct from an earlier runaway, 1170544.

This fix is an update to the utmp daemon, /usr/lib/utmpd, which was created in an earlier patch for CTE Esc #9122. These bugs were found during the shakeout of a pre-fcs version of 494. One problem, 1170544, occurs when a user logs in while using CDE. The other occurs on large multiuser systems such as an SC1000 or SC2000. Essentially utmpd makes extensive use of the /proc file system and the poll() system call to monitor the termination of processes that have made entries into the utmp files. The fixes to utmpd presented here work around some anomalies associated with /proc.

Patch number 101318-49 has a change not compatible with c2_audit. Customers that use it need this fix.

ptem has a write service procedure and will flow control if there is any message on the queue or if the canput into the next queue fails.

System hangs with the following messages:

Warning: ldterm (ldtermsrv/newmsg) out of blocks
Warning: ldtermsrv: out of blocks

Patch 101318 modified /usr/include/sys/strsubr.h. It moved a variable which c2audit was relying on. c2audit was not included in the patch, so when the patch was installed it would write at an old offset, clobbering the newly added variables.

(from 101318-53)

1171363 CTE101318-50 heavy xdm use panic: Deadlock condition detected in blocking chain
1091548 server does not stay alive handling multiple, serially-calling clients
1169109 setuid/setgid program takes on default system limit
1168365 Solaris 2.3/Sun4d system will panic illegal instruction in exec system call.
1168635 5.2 ss1000 crashes in dofusers when "fusers *" on large flat directory
1166848 L1 A and then sync locks up machine
1165014 autoup set to 120 - system would not do a core dump
1158398 Dump fails during sync

Problems may occur when syncing filesystems after a panic. If the sync gets hung, the system should eventually cause a panic timeout, which allows the system to continue and create the coredump. This patch addresses one problem with semaphore use during panic time which may cause the sync to hang. Also, it changes sync_timeout to be updated more frequently, timing out the sync quicker if it gets hung. Finally, changes were made to the sun4d kernel so that if the panic occurs at an IPL above clock, timeouts can still occur.

Proc structure corruption due to locking error.

Patch 101318-50 for SunOS 5.3 contains a new daemon, /usr/lib/utmpd, that maintains the consistency of /etc/utmp and /etc/wtmp (/var/adm/utmp and /var/adm/wtmp). This is so the 'who' command will show accurately who is logged in at any time. The utmp daemon relies on the polling feature of /proc to do its work. The polling feature of /proc has a lock ordering problem with respect to the poll() system call that can lead to a deadlock. The deadlock, when detected, results in a system panic:

panic: Deadlock condition detected: cycle in blocking chain

The panic has been observed when making heavy use of xdm (the X Display Manager) on an SS1000. However, the problem is generic to all machines running the 5.3 patch 101318-50.

A TLI server with the same fd for listening and accepting endpoints will fail to accept subsequent connection attempts. While the listening port remains the same for each subsequent connection, the client's port changes, so all subsequent connections are unique in terms of bindings.

The module loading subsystem is not fully MT-safe.
This can be manifested in several ways, including BAD TRAP panics while booting Japanese Solaris 2.3; BAD TRAP panics following the message "exec type 108 is already installed"; other panics or BAD TRAPs with module loading routines appearing in the backtrace; and threads appearing to be stuck waiting to load or unload a module.

Invoking a setuid program will reset resource limits to system default.

(from 101318-52)

1170350 rlogin (and services like rcp/rsh/rdist that use rcmd()) can become disabled
1170091 Patch 101318 the fix for :already allocated shared memory l1ptp, panics sun4m's
1165247 Support for IPX/SPX address family in libsocket
1170814 system deadlocks in thread_lock_high()
1170036 MXCC-based copy/zero code incorrect for sun4m
1169640 sprintf format "%.4S" prints improperly when strings include 0216 or 0217
1170527 segmentation violation in select or socket calls
1169904 syslogd core dump in ismyaddr()
1170233 Syslogd prints "???" instead of client host name

ismyaddr(nbp = 0x456e0), line 1419 in "syslogd.c"
amiloghost(), line 1599 in "syslogd.c"
init(), line 1118 in "syslogd.c"
sigacthandler(0x1, 0x0, 0xefffee08, 0xa, 0xefffec28, 0xeffff099) at 0xef6eadf8
main(argc = 1, argv = 0xeffff8fc), line 315 in "syslogd.c"

Some socket programs may experience a core dump caused by a segmentation violation under heavy use. Certain error conditions, if they happen on the system, cause corruption of an internal socket library data structure.

The printf routines can fail to count a 0x8e or 0x8f character when calculating precision in the %.s format.

On SPARCstation systems installed with SuperSPARC processors and SuperCache external cache controllers, the kernel block copy and block zero code uses the SuperCache's hardware stream copy features. When a block copy or block zero is performed the code did not wait for the last operation to finish.
If kernel code subsequently wrote to the last memory locations touched by a block copy/zero, the order of the two operations may have become reversed, with the block copy/zero data being returned on subsequent accesses instead of the kernel's data. The fix is for the kernel to wait for the stream operation to complete before returning from the block copy/zero code.

If an interrupt thread acquires a readers-writer lock held by (but being released by) the thread it pins, a deadlock can occur. This deadlock will appear in pi_waive() calling thread_lock_high(). This problem has been observed on sun4d machines, but it could potentially happen on any machine running Solaris 2.3. However, the window during which this problem could manifest itself is of small duration; the probability of occurrence is low (but it has happened).

The address and protocol family declarations AF_IPX/PF_IPX need to be added to header file socket.h and the library modified to support them. These are needed only with SPX/IPX protocol unbundled networking products.

When ISM is run on LX, Classics and ROSS 600MPs with 2.3 patch 101318-45 and above, the machine may crash.

The rlogin/rsh/rdist/rcp services may not work after they time out once because of temporary network load or connectivity problems. This fixes a problem in the fix for bug 1138924 (TCP connection in zero-window condition times out), which was fixed in rev -44 of this patch. Sites which installed patch rev -44 or later, up to the patch level where this fix ships, might have this problem.

(from 101318-51)

1143231 synchronization stubs should be exported for third party vendors

The synchronization stubs in libc should be exported for the use of other libraries. The problem is that third party vendors can't benefit from this technique in making their libraries MT safe. Currently they have no way of making a library work for both single- and multithreaded apps at the same time.
(from 101318-50)

1168083 2.3 syslogd dumps core near _netdir_getbyaddr()
1169132 Occasional failures to open symlinks during system reliability tests
1168055 BCP programs broken in libnsl
1167439 [bcp-libc] Clients of OW for 4.x don't run properly in the BCP mode.
1168672 libaio should call _sig* functions
1167154 ndd(1) causes kernel panic.
1164156 listen() can cause bound port number to silently change
1130726 rsh fails intermittently (with patch 100468-03) when running 1000's of rsh connections, some of them can hang.
1151598 pututline and pututxline can erroneously return failure
1159347 pututline() does not work properly
1165413 in.rlogind, in.telnetd do not reuse DEAD_PROCESS utmp entries
1163776 kill -9 of xterm does not clean up utmp entry
1123876 UDP can't bind to broadcast address
1156103 pwconv segfaults when last record is +/-
1167485 Solaris-2.3, patch #101318-4[45], syslog msgs not output

A UDP/ICMP application cannot bind to a broadcast IP address without this fix. While no packet should be emitted by the system with a broadcast IP source address, a bind to a broadcast IP address should be allowed. A packet sent from such an endpoint will be emitted with the source IP address of the interface. The endpoint receives the packets with the destination address set to the specific broadcast address to which the endpoint is bound.

This patch fixes the following utmp problems:

1. Duplicate utmp entries (1165413). If a program that makes an entry dies (like xterm) and then cmdtool is started, you used to see two entries. Now stale entries get cleaned up by a new program, the utmp daemon.

2. pututline could return a failure even though it made an entry (1151598). Also, if you gave it an alternate file name it wouldn't work (1159347) unless you were root.

Executing multiple "ndd /dev/tcp tcp_status" simultaneously on a multiprocessor system can cause a "Data fault" PANIC on Solaris 2.3 systems.
BCP programs compiled under 4.x coredump when Patch 101484-03 is installed.

Symlinks are sometimes resolved incorrectly.

strcmp(0x20002100, 0xdfffdf00, 0x72, 0, 0x10, 0x72656477) at 0xdf6f6158
searchhost(0x44480, 0, 0xdfffe764, 0xdfffdf08, 0xdfffe768, 0xdfffdf08) at 0xdf6011dc
_netdir_getbyaddr(0x486b0, 0x468e8, 0xdf60140c, 0, 0x44480, 0) at 0xdf600cc0
netdir_getbyaddr(0x486b0, 0, 0x468e8, 0, 0x47998, 0x46ad8) at 0xdf778bec
cvthname(nbp = 0x468e8) at 0x13f00
main(argc = 1, argv = 0xdffffd6c) at 0x1248c

pwconv was dumping core in some circumstances involving +/- type entries in the passwd file. A patch was supplied that fixed this problem (-46), but introduced a number of other problems, including putting the string "x," into the passwd file instead of "x", zeroing out the passwd aging info and removing shadow entries with *LK* or NP. This patch appears to have fixed those problems.

Incorrect accounting of the number of processes logging syslog messages was preventing more than one process from receiving syslog messages.

(from 101318-49)

1167602 stale nfs file handle, lockd unable to do cnvt
1164428 System will hang when echoing msgs between ttya & ttyb
1155298 bind of AF_UNIX address simultaneously from multiple processes can fail
1168240 flk_allocate_lock() can data fault if kernel memory exhausted
1166581 local locking fails to inform lockd when files can be closed

A bind of an address to an AF_UNIX socket can fail if there are multiple processes all doing binds at the same time and an unrelated process unlink()'s the AF_UNIX address path at the same time without closing the socket it was bound to.

The ptem module did not have a service procedure when the write put procedure was doing a putq() during STOPPED state. This resulted in depletion of message buffers.

Certain older NFS clients can cause repeated requests to unlock a stale file handle, causing the following error message to flood the console:

_nfssys: error Stale NFS file handle
lockd: unable to do cnvt.
Lock manager can leak file descriptors.

(from 101318-48)

1165736 autofs/lofs: panic : vn_rele: vnode ref count 0
1159152 zs driver latency increased by 20-30msec in 5.3
1107880 shared cd mounted w/o -r option gets multiple lockd error messages

The timeout value for the receipt of the next character is determined by the transmission speed, so that for a higher baud rate the timeout value is smaller.

It is possible for the system to panic under certain conditions related to mounting and unmounting filesystems. This will most likely show up when using loopback filesystems with the automounter.

If you have a cd-rom mounted and then shared (both read-only, as they have to be; the system won't allow you to do otherwise) and then you mount this shared filesystem on another machine without specifying read-only, then when you run answerbook off the cd you will get the following messages scrolling hundreds of times on the console of the machine which is sharing the cd:

lockd[288]: _nfssys: error Read-only file system
lockd[288]: lockd: unable to do cnvt.

(from 101318-47)

1166933 machine panic with memory address alignment in flk_insert_in_list, esc9392

The vnode is pointing to an active lock that is trashed.

(from 101318-46)

1166629 gypsy panics: data fault, booting on1093 and kernel jumbo patch 101318-45
1156103 pwconv segfaults when last record is +/-
1165689 SC2000 fails to boot with more than 25 DWIS/S SBus Cards.
1155951 TCP 3-way handshake doesn't complete is last ack is lost
1165649 ISM crash with jumbo patch 101318-36 installed
1117303 Unable to install/attach driver cgsix error
1163167 spamified memory is left uncached after unspamification
1163170 program shows linear degradation in performance
1164554 segsx_cmem_fault does not handle F_SOFTLOCK/F_SOFTUNLOCK
1165987 SS20 running 2.3 with SX crashing with could not find a free SX_hmentblk

The cgsix driver fails to work on sun4/110 machines.

The system crashes if users try to use locking operations on ISM segments.
1159882 bcopy for 4.1.3 twice as fast as bcopy

A performance improvement is obtained using the MXCC to assist bcopy when the source and/or destination memory is not cached. bcopy will use the MXCC block transfer when the following conditions are met:

- transfer length of 1 page minimum
- source and destination addresses are page aligned
- either source or destination memory is NOT cacheable.

1155951 TCP 3-way handshake doesn't complete is last ack is lost

Under heavy load tcp connections (such as rsh) can time out during the connection establishment. This happens when the SYN+ACK packet is lost.

The system hangs when devices use up all IOPB space. Hooking up more than 25 ISP controllers is an example.

A problem with password file entries which use the "+" or "-" feature, when shadow password files are constructed using pwconv, has been fixed. Previously, pwconv failed somewhat untidily for password files with this yp/nis/nis+ entry. It now works as advertised, correctly modifying password files (substituting the string "x" for encrypted passwords) and properly writing shadow password files.

Installing kernel jumbo patch 101318-41 or better on Gypsy causes the kernel to panic during boot.

(from 101318-45)

1165902 truss broken with patch 101318-42 for bugid 1160087
1156132 ioctl dose not work on Solaris2.3
1159248 kernel panics in tcp_snmp_get while doing netstat
1164569 rmdir on tmpfs w/ sticky bit set causes panic data fault if not owner of dir
1165270 system panics with freeing free fraq/block/inode
1158674 infinite loop in deadflck() hangs system

This patch uses a different algorithm to fix many file and record locking problems.

During a panic, only the panic'ing thread is allowed to run. Because of this, during a panic the thread always gets the locks it requests. In some cases I/O buffers and critical ufs data structures are locked because the buffer or data structure is in the middle of being modified and should *not* be written out.
This is the suspected cause of several ufs panics that involve inconsistent meta-data. The fix is to have the buffer and inode code respect the BUSY and IREF flags (respectively) during a panic.

Sun machines can be crashed by users, using tmpfs. Tmpfs is part of the default installation. To reproduce:

Mount /tmp as a tmpfs filesystem.
chmod 1777 /tmp
As a non-root user, mkdir /tmp/testdir
As a different non-root user, rmdir /tmp/testdir

BAD TRAP: type=9 rp=f057a72c addr=3 mmu_fsr=3a6 rw=2
rmdir: Data fault

1159248 The tcp_tcph structure is not initialized in some states of tcp, and dereferencing th_lport (or th_fport) causes a machine panic at tcp_snmp_get+188.

The zs driver was not sending ACK for the ioctls TIOCSBRK and TIOCCBRK.

The check for the lwp_sysabort flag in lwp, set via /proc, was being done before the call to issig().

(from 101318-44)

1150417 4d system running 2.3 panics with already allocated shared memory message

1138924 TCP connection in zero-window condition times out.

A two way tcp connection such as

cat /etc/termcap | rsh host2 '/usr/bin/cat -' > termcap.cat

can hang and time out.

1159757 netdir_getbyaddr(3) dumps core in syslogd daemon when running in DNS env. only
1158215 Solaris 2.3: syslog(3) can't output Japanese language

netdir_getbyaddr(3) dumps core in the syslogd daemon when running in DNS environment only.

1164156 command piped in rsh hangs in Solaris 2.3

rsh can sometimes hang due to the port number changing as part of the listen() call on the reserved port. This can also affect other applications where multiple applications contend for the same port number.

1164504 FIN_WAIT_2 connections disappearing

With the fix for 1135394, connections in FIN_WAIT_2 state might be removed too quickly when the application does a shutdown() before the close().

This patch is for customers using ISM who run into this panic: "already allocated shared memory l1 ptp"

(from 101318-43)

1155948 Sybase BCP performance poor under 2.3

Bug in IP causes TCP/IP performance degradation.

1163747 Unbundling of sendmail from patch 101318
sendmail/sendmail.mx is now unbundled from patch 101318. Patch 101318-43 or later is still needed to fix bug id 1155803, ndbm hangs when two large records hash to the same value. The unbundled sendmail patch fixes all other known sendmail problems and will work without patch 101318, with the exception of bug id 1155803. Patches 101318-35 through 101318-42 also contain the additional libc fix needed for 1155803, but MUST be installed prior to the installation of 101739 and NEVER be installed after 101739. Doing so will back out all sendmail fixes that occurred after 101371-04.

(from 101318-42)

1125134 IP wrongly sends ethernet packet to token ring and possibly other drivers
Datalink drivers that use the 'M_DATA fastpath' can in some cases receive M_DATA packets with Ethernet headers. This has been observed for token ring drivers among others. Often the Ethernet packet is destined to the Ethernet broadcast address.

1139327 remove enterq/leaveq from ttycommon
Async drivers that support the tty subsystem, calling tty_common.c routines, currently need to call the undocumented enterq/leaveq routines because this is required by ttycommon_ioctl(). Machine with a third party driver hangs at Raytheon.

1146549 bug in ip flow control cause system hang
When the Ethernet cable is disconnected in a redundant (fault-tolerant) setup, the Sun machine hangs forever.

1147977 panic: recursive mutex_enter. when doing ndd /dev/udp udp_status
Running the command 'ndd /dev/udp udp_status' will always panic the machine.

1151192 srmmu_setup panic oracle data fault srmmu_pteload.
SPARCstation Classics have panicked in srmmu_setup due to a race.

1160087 large output ALM2 doesn't respond properly to interrupt signal after XOFF
Send the M_SIG message type first so that when the stopped thread wakes up and runs, it sees the signal SIGINT and exits.

1163533 panic Deadlock condition detected: cycle in blocking chain
When running a significant load of multiple ndd commands on /dev/tcp, the machine can deadlock, resulting in a kernel panic.

(from 101318-41)

1151592 workaround needed for swift prefetch bug.
1152033 x11perf on Aurora P1.1 and S494 prealpha6 ON caused panic
This patch solves two problems:
1. It works around a prefetch CPU bug.
2. It fixes a kernel bug that causes a panic when running x11perf.

1135394 detached connections can stay in FIN_WAIT_2 forever
It is possible for TCP connections to get into the FIN_WAIT_2 state and stay there forever. These connections may prevent another application from binding to the TCP port number that the connection is bound to if the application does not enable the SO_REUSEADDR option.

1160681 find returns cannot open /: no such file or directory
When using the following find command:
  /bin/find / -type f ( -perm -4000 -o -perm -2000 ) -exec /bin/ls -lda {} \;
as root on a 2.3 machine, the command fails about half the time with:
  "cannot open /: no such file or directory"

(from 101318-40)

1155515 sockmod leaks memory when T_CONN_REQ is T_ERROR_ACK'd
This is a generic bug in sockmod but is most likely to be seen when the X25 8.0 product is running. There is a memory leak caused when connect() requests are rejected on the local machine. The X25 provider does this often enough for the memory leak to assume significant proportions.

1157265 pwconv erases the NIS entry in the passwd and shadow files
If pwconv is run with no changes to either the passwd or shadow file and a correct NIS entry in each, the NIS entry gets erased in both the passwd and shadow files. If a change is made to the passwd file or shadow file, pwconv works as documented but still erases the NIS entries.
Additional information from a duplicate of 1157265, bug id 1059438: If there is an NIS entry in /etc/shadow, pwconv gives the following message:
  /usr/sbin/pwconv: Bad entry in /etc/shadow. Conversion is not done
and will not execute. The NIS entry I had in /etc/shadow was:
  +::0:0:::

1147647 localtime_r has a memory leak

(from 101318-39)

1160720 fix for bugid 1150613 has to be backed out as it is not general
1159439 buffer cache code can deadlock
1140503 cgfourteen cursor does not turn off in response to FBIOSCURSOR ioctl
1140378 galaxy with 2 ross modules hang on Solaris 5.2 running sundiag

1132086 libc has window where programs can dump core in sigaction call
The values in the signal handler were set to the new action before a system call if the action was SIG_DFL or SIG_IGN. This resulted in ksh sometimes dumping core with SIGSEGV.

1122992 galaxy ross system hang shortly after sundiag start
Very often Ross machines would hang when running Sundiag or even just during normal system activity.

1157267 users with passwd file entries > 132 chars. cannot change passwd
Users with passwd file entries > 132 characters cannot change their passwd using the "passwd" command. The error is "username does not exist", even though the users are in the /etc/passwd and /etc/shadow files. This is also true of users whose entries come after the first entry > 132 characters in the passwd file. The users can log in, but cannot change their passwd with the "passwd" command.
This problem can be reproduced by adding an entry > 132 chars. to the /etc/passwd file and manually editing the /etc/shadow file to add an entry for this user. (Don't use pwconv because of Bug 1151625.) Log in as the user and try to change the passwd; it will fail with the error "username does not exist". Add another entry to the passwd file that is less than 132 chars., but add it after the long entry; log in as that user and try to change the passwd, and the same error appears as for the user with the long passwd entry.
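
The failure in 1157267 starts at the first passwd entry longer than 132 characters. As a rough aid for locating such an entry, the following is a minimal C sketch. The function name first_long_entry and its interface are hypothetical and are not part of any Solaris tool; the 132-character threshold is taken from the bug description above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not a Solaris API): scan passwd-style text and
 * return the 1-based number of the first entry whose length, excluding
 * the trailing newline, exceeds `limit` characters.  Returns 0 if no
 * entry is that long.
 */
static int first_long_entry(const char *text, size_t limit)
{
    int lineno = 1;

    assert(text != NULL);
    while (*text != '\0') {
        size_t len = strcspn(text, "\n");   /* length of this entry */

        if (len > limit)
            return lineno;
        text += len;
        if (*text == '\n')                  /* step over the newline */
            text++;
        lineno++;
    }
    return 0;
}
```

Calling first_long_entry(contents, 132) on the contents of a passwd file gives the entry from which, per the description above, an unpatched system would be expected to start reporting "username does not exist".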

1159439 buffer cache code can deadlock
Under extremely heavy I/O loads the system may deadlock due to a lock ordering problem in the buffer cache.

(from 101318-38)

1157990 df -k does not report correct values for tmpfs
df -k reports incorrect values on a tmpfs filesystem mounted with the size option.

1160207 panic: tmp_getapage: no anon slot when reading tmpfs files over nfs
Reading holey tmpfs files exported via NFS panics the system.

1155948 Sybase BCP performance poor under 2.3
Bug in IP causes TCP/IP performance degradation.

1140802 ttyname(), ttyname_r() library call scans entire /dev/pts dir to find tty
The ttyname() (and ttyname_r()) routine stats every entry in the /dev/pts directory until it finds the one matching the file descriptor that has been passed as an argument. This can result in too many stat system calls on large machines with many timeshare users because of the large number of /dev/pts/<###> entries being used.

(from 101318-37)

1159160 f77 compilation fails because combination of fseek and fwrite writes wrong bits
Some combinations of fseek and fwrite on tmpfs files can lead to corruption of the written file. This is present in the 101318 Rev 31 and 32 patches for 1093 as well as in 494, but not in 1093 FCS.

(from 101318-36)

1157978 interrupted back to back store can cause kernel panic
1144536 need swift idle support for next release

(from 101318-35)

1154452 serial port loses when SX context switches
1157110 Resetting SPAM chip could hang the system (due to unexpected level 15)
1157460 Nachos video frame transfer rate to SX memory is very low
1157463 Eliminating one redundant cache flush can benefit performance.

1158000 101318-32 conflicts with the X.25 patch 101524-01
Kernel panics upon X.25 bring up when both 101318-32 and 101524-01 are installed on the system.

1157524 locks are left on NFS files after the locking process is killed.
Suppose a process obtains a record lock on an NFS file and is then killed with SIGKILL ("kill -9").
The process will fail to release the record lock when it exits.

(from 101318-34)

1152960 panic srmmu_unlock() during Sybase dataserver shutdown
1153324 system gets a srmmu_pteunload panic when starting Oracle DB.
Customers have seen database programs (Informix, Oracle, Sybase) with ISM turned on cause the system to crash. There are kernel bugs fixed in this patch, but we also found some bugs in DB vendors' code which make DBs fail to start up with the fixed kernel. As a result, we are coordinating with DB vendors to fix their releases too. Here is when a patch or new release from each DB vendor will be available:
  Oracle: a fix will be available as a patch to the newly released 7.0.16 in early 3/94.
  Informix: the fix will be in 6.0.UD1, to be released in 2/94.
  Sybase: For users of the Sybase SQL Server version 10.0 or later: with certain memory configurations, a small number of sites may experience a situation where the Sybase SQL Server fails to boot. If this occurs, contact Sybase Technical Support for assistance. For users of Sybase SQL Server 4.9.2 or earlier, this patch should have no impact.

1156550 Fix for 1137125 needs to recognize newer Vikings

1152410 deadlock occurs in ufs under heavy nfs workload
Under heavy CFE (a filesystem benchmark) workload with a ss1 being pounded by a ss2 at full speed, deadlock happens, seemingly waiting to do a pagelock. In one case readdir was waiting for pagelock, and the enclosed threadlist shows a case where ufs_putpages was waiting for pagelock.

1155803 ndbm hangs when two large records hash to the same value.
The problem is that ndbm, the libc database facility, hangs when two records of over 512 bytes hash to the same value. The man page says that when this occurs an error is returned; however, this is currently not the case. This fix now generates an error per the man page.

1157047 32MB DSIMMs do not always work in SS10 and derivatives
The new 32MB DSIMMs contain two discontiguous 16MB memory regions and thus look like two DSIMMs rather than one. There currently exists code in the sun4m kernel that assumes no more than 8 regions of physical DRAM; thus four 32MB DSIMMs plus one of any other sort of DSIMM (or any other configuration using more than 8 "slots" for DRAM) will cause a kernel panic during boot. If more than 8 regions of DRAM exist, the current code does NOTHING at all (i.e., it just quits and doesn't do what it is supposed to do).

1139753 locking hangs under heavy load; disturbing ICMP messages
Under heavy loads, NFS locking clients may be unable to provide replies to their servers' occasional portmap GETPORT requests within the default RPC timeout. This in turn prevents the server from responding to outstanding locking requests from that client (and others), causing the server lockd to appear to be hung or dead.

(from 101318-33)

1142662 mlockall(MCL_CURRENT) returns EIO if used with threads
When a thread is created and mlockall(MCL_CURRENT) is called, it fails with EIO. mlockall(MCL_FUTURE) works, but this does not guarantee that the pages loaded so far have been locked in memory.

mlock/mlockall and MAP_NORESERVE: An mlock/mlockall on a mapping of /dev/zero (anonymous memory) will lock all pages in memory. However, an mlock/mlockall operation on a MAP_NORESERVE mapping of /dev/zero will NOT lock pages that have not been faulted in (i.e., have not been accessed). mlock/mlockall will only lock all existing anonymous pages in memory. Thus, applications which expect all pages in an address space to be locked in memory via mlockall(2) should ensure that all pages belonging to MAP_NORESERVE mappings, if any, are accessed before invoking mlockall(). The threads library creates all default thread stacks as a MAP_NORESERVE mapping.
Thus, applications which create threads and expect all pages to be locked via mlockall() must provide a stack which is represented by a virtual address range NOT mapped as MAP_NORESERVE.

1148689 Problem secure nfs between solaris 1.x and solaris 2.2.
After testing and investigation, the description of this bug should really be: "Under Solaris 2.x secure NFS, a non-root user on a 2.x NFS Client cannot write a large file to a securely mounted NFS File system after the NFS Server reboots (only if you did a write before the server reboots)."

(from 101318-32)

1154325 kernel route table corruption when using routed on a network with gated running
When running on a network with routing daemons that generate host routes for machines on the directly attached network (e.g. in netstat -rn:
  155.155.48.43 155.155.48.43 UGH 0 0
) the routing table will not contain any routes with 155.155.48.43 as a gateway. This will lead to lack of connectivity. in.routed will syslog messages like:
  in.routed[1923]: rtadd SIOCADDRT: Network is unreachable

1152168 system call use blu, can cause loading error.

1149458 pwconv strips out entries that begin with + from /etc/passwd
Passwd entries beginning with "+" or "-" are silently removed from the files /etc/passwd and /etc/shadow on the second invocation of pwconv(1M).

1146840 sockets suffer severe performance problems with a couple of hundred active tcp connections
1150304 tcp_eager_swap fails moving timer_mp if more than one eager connection

1088703 upstream message during I_UNLINK can cause panic.
A multiplexing driver might see messages arriving in its lower put procedure for a queue which has already been I_UNLINKed.

1097418 qprocsoff reordering problem
1151044 TCP connections hung in ESTABLISHED state.
1153024 assertion failure in strrput : ASSERT (!(stp->sd_flag & STPLEX))

1142479 infinite loop in callbparams_free
Streams drivers and modules using qtimeout(9F) or qbufcall(9F) can cause the kernel to go into an infinite loop in callbparams_free.

1155136 Recursive mutex enter in streams strrput+x980
Stressing TCP/IP on a multiprocessor can cause a recursive mutex_enter panic. The stack trace shows that mutex_enter was called by strrput.

1154975 IP perimeters cause 1000 op LADDIS drop
1145471 3-5 out of 600 concurrent tcp connections just hang and never timed out.
1142622 interactive performance poor on MP system w/CPU bound procs

1113339 t_sndrel()/shutdown() immediately after sending small dataset causes data lost
Data may get lost while using TCP (/dev/tcp in TLI and AF_INET/SOCK_STREAM sockets) when a t_sndrel()/shutdown() call is made immediately after sending a very small amount of data.

Mouse tracking is rough when an MP system is running enough CPU-bound processes to occupy all on-line processors.

Under heavy load and with lost packets, a TCP server can get into a state where connections never retransmit. The symptom is connections in ESTABLISHED state with data on the send queue shown by "netstat" and no retransmissions visible by "snoop".

(from 101318-31)

1154515 binaries compiled with "-N" on 4.1.x fail on Solaris 2.3 w/ "Exec format error"
OMAGIC binaries do not work on Solaris 2.x.

1153178 tmpfs deals incorrectly with directory permissions
1151999 problem with directory links in tmpfs - pwd gets confused

1146597 panic in strpermod_allocate when MTPERMOD driver opened twice consecutively
With X.25 8.0 it is possible for a user to panic the system by opening /dev/x25 twice (the first open will fail and the second will panic the system).

(from 101318-30)

1150613 _lwp_create doesn't pass on process priority
A new lwp created using _lwp_create() doesn't inherit the scheduling parameters properly from the parent lwp.

(from 101318-29)

1153911 compiler code reordering breaks small4m parity reporting - use volatile
During a memory error (i.e. parity error) the MFAR register is reported incorrectly. The address given will report an error from the wrong SIMM. The fix is to use the volatile type to preserve the correct MFAR address, allowing the customer to find the correct bad SIMM.

(from 101318-28)

1152995 Bad core file generated when mmap() range exceeds object size.
A program that uses mmap(2) to map a file, creates a mapping larger than the size of the file, and then aborts with a core dump will generate a core file that is not readable by a debugger.

(from 101318-27)

1144086 data fault in ts_alloc or trap in tstile_alloc when lofs filesystem mounted.
The system may run out of turnstiles (a locking resource). Running with the loopback file system may exacerbate this problem. First appeared on an SC2000.

1152995 backed out because of side effect problems

(from 101318-26)

1152995 Bad core file generated when mmap() range exceeds object size.
A program that uses mmap(2) to map a file, creates a mapping larger than the size of the file, and then aborts with a core dump will generate a core file that is not readable by a debugger.

1153790 s1093 kadb will not boot kadb on viking 3.5 systems
really fixed in this rev

(from 101318-25)

1153790 s1093 kadb will not boot kadb on viking 3.5 systems
This fix isn't right

(from 101318-24)

1152977 Interactive response suffers when CPU intensive jobs are running
When CPU intensive jobs ("main(){while(1);}", to take a trivial example) are running, interactive response can suffer badly. Symptoms include sluggish mouse pointer movement and intermittent echoing of characters in shells.

(from 101318-23)

1151159 random and strange bad behavior on 4d systems, i.e. panics, watchdogs, etc.
1153051 enabling of workaround for random and strange behavior on 4d systems
In systems utilizing SuperSPARC processors, there is a possibility of random and strange bad behavior on 4d systems. A cause for some of these problems has been identified to be, on occasion, a misoperation of the SuperSPARC processor under very limited circumstances.

(from 101318-22)

1152482 kernel panic in prgetstatus
1152251 read from PIOCOPENPD causes panic: data fault
A program which does a /proc PIOCOPENPD call, followed by a read on the resulting file descriptor after the target process exits, will trigger a panic of the system due to a DATA FAULT resulting from a dereference of a NULL pointer. This scenario can result from using the SunPro collector, which performs performance analysis on another program. This is one of the standard SPARCWorks tools. Rutgers has had at least two panics like this one:
  BAD TRAP: cpu_id=2 type=9 addr=4 rw=1 rp=e4e364
A kadb stack trace shows 'prgetstatus' called from 'prioctl'.

(from 101318-21)

1151619 sockmodwput data fault panic due to socklog problem
socklog() was being passed a NULL pointer while calculating the size of the message block. This resulted in a kernel panic with a data fault.

(from 101318-20)

1149928 TCP/IP scalability problems.
1149929 STREAMS outer perimeter scalability problems
This patch reduces the time spent locking and unlocking the outer perimeters used by TCP and IP. It also reduces the lock contention on the strmsglock (used by the STREAMS allocator) and reduces the time spent running at high IPL from the Ethernet driver.

(from 101318-19)

1146534 swift_mmu_writeptp code in wrong order causing watchdog reset.
Under heavy load, a SPARCstation 5 will watchdog reset. This has been seen running kenbus, LST, and svvs.

(from 101318-18)

1149088 tcp and sockmod does not protect against QUEUE_ptr in T_CONN_RES going away
1123140 transport providers can crash if accessing T_CON_RES QUEUE_ptr field
If TLI applications close the accepting file descriptor (passed to t_accept) while the t_accept is in progress, the kernel can panic in tcp_accept, in sockmod, or in timod. (The sockmod panic will only occur if the file descriptor that is opened by the accept() in the socket library is closed.)

(from 101318-17)

1149105 lost entries in wtmpx and wtmp.
wtmp/wtmpx and utmp/utmpx were corrupted during synchronization (update).

(from 101318-16)

1147165 Streams resources depleted suddenly (due to no syncq flow control)
A machine can rapidly run out of kernel memory under heavy load. This is signified by netstat -m (on the core dump) reporting tens of thousands of allocated messages.

1150306 data fault in background - streams close race.
The kernel can crash with a data fault. The stack trace shows background calling mutex_enter, which takes a data fault.

(from 101318-15)

1147620 system hangs in deadflck.
Under certain circumstances, the kernel may hang due to an error in file and record locking. In this case, a kernel thread will be found to be looping infinitely in deadflck().

(from 101318-14)

1139753 locking hangs under heavy load; disturbing ICMP messages.
Under heavy loads, NFS locking clients may be unable to provide replies to their servers' occasional portmap GETPORT requests within the default RPC timeout. This in turn prevents the server from responding to outstanding locking requests from that client (and others), causing the server lockd to appear to be hung or dead.

(from 101318-13)

1132554 fcntl: error No record locks available, lockd: out of lock
NFS file servers can leak record locks. Eventually all lock requests (including local locks) fail with ENOLCK. Another symptom is syslog messages from lockd (on the server) complaining that it is out of locks.
This bug can also cause the server to incorrectly grant lock requests, which can lead to corruption of user data files.

1147226 NFS locking broken when byte order is different.
Patch 101267-01 introduced a bug in NFS clients that could cause locking operations to fail if the server is not running SunOS or if the server is not a SPARC system. The symptom is syslog messages from lockd on the client complaining about malformed file handles.

(from 101318-12)

1150058 SPARCstation-10 SX Vid SIMM Cursor RAM write-enable is weak and corrupts writes.
This fix is to the Video SIMM Operating System Driver (cg14 driver) and provides a software workaround for problems observed with a broken cursor image when the cursor is written to.

(from 101318-11)

1146924 SS10-51 SS600-51 will fail "watchdog reset" or hard hang under load

(from 101318-10)

1140209 cannot exit login sessions simultaneously from alphanumeric terminals properly.
The zombie processes were not being removed by the parent process when the handler for SIGCHLD was being reset.

1142882 panic on exit
The u.u_ttyp field was being set incorrectly when a pre-svr4 module was being pushed. The old value of u.u_ttyp was not saved and later checked to see whether it needed to be reset to NULL.

(from 101318-09)

1143439 using fork() and libaio together leads to system panics
When using libaio to do asynchronous I/O in a process and also doing a fork() in the same process, there is a window in which the system will panic. The same phenomenon occurs with multi-threaded processes that use fork1() (this has been observed with SunPC and the volume manager). Finally, using a /proc tool that reads the address space of a running process, like /usr/ucb/ps -ww, can lead to a panic of the same (not identical) sort.

(from 101318-08)

1130721 panic messages are not logged in /var/adm/messages
The previous putback for this bug caused the system to panic if more than one syslogd was started.

(from 101318-07)

1146985 data fault panic in lock_try due to interval timer signal
There is a race condition in exit() and lwp_exit() where they are canceling outstanding itimer() callouts. If the race is lost, a callout remains that eventually fires and attempts to access a non-existent lwp or process, leading to a system panic.

(from 101318-06)

1130721 panic messages are not logged in /var/adm/messages
Added a postinstall script to edit etc/syslog.conf and a postremove script to remove the edits. This should have been done as part of 101318-03.

(from 101318-05)

1146912 panic: deadlock - cycle in blocking chain when using /proc to read a process
When using tools that read the address space of other processes via /proc, there is a window of vulnerability in the operating system that can cause a panic with the message: Deadlock condition detected: cycle in blocking chain. Tools that read the address space of other processes include:
  /usr/bin/truss
  /usr/ucb/ps
  /usr/bin/adb
  /opt/SUNWspro/bin/dbx
  3rd party debuggers (e.g., gdb)
The window of vulnerability is extremely small, but the problem has been seen on heavily-loaded multiprocessors.

(from 101318-04)

1144765 SunPC fails on sun4m systems running Solaris 2.3

(from 101318-03)

1130721 panic messages are not logged in /var/adm/messages
The mechanism implemented in SunOS 5.0 to save log messages produced before syslogd is started doesn't allow messages recorded in the message buffer before the reboot to be logged. This patch returns to the original method of saving log messages and corrects the problems which prompted the incorrect fix in 5.0.

(from 101318-02)

1108615 I_LOOK etc tests for end of stream by walking mid point qnext
Kernel crash (data fault). The pc is in the SAMESTR macro either in the build_sqlist function or in the getendq function.

(from 101318-01)

1139493 fcntl(2) => ENOLCK and "klm_lockctl: bad nonblk LOCK error 3"
If there are problems communicating with the lock manager on an NFS server and a blocking lock request (e.g., fcntl(..., F_SETLKW, ...)) receives a signal, the lock request might not get canceled. This would leave the file locked with no way to unlock it, short of rebooting the client or server.

(from 101485-01)

1121069 creating a.out cores can cause panics
The kernel can panic with a data fault when an a.out core file is produced because the kernel reads off the end of the user structure. This can produce "WARNING: Kernel BE" or "WARNING: Kernel TO" followed by an "Access bus error" bad trap. The pc is usually in bcopy_asm, and the stack shows the routines "core" and "aoutcore".

(from 101406-01)

1146924 The SPARCstation 10 Model 514 MP machines may not operate reliably.
For reliable operation of the SPARCstation-10 Model 514 machines, the kernel variable 'enable_sm_wa' must be assigned a value of 1.

(from 101349-01)

1137581 C2+ gets watch dog reset with Sundia
1144922 cgfourteen driver could still get remap panic
1145401 sx driver memory leak
1145746 C2+ panics when creating an X Window
The reliability lab typically runs Sundiag on machines continuously for extended periods of time (more than a week). When doing such reliability testing on the SPARCstation 10BSX machines we discovered problems:
a) Machines randomly get a watchdog reset (bug ids 1137581 and 1144922).
b) After running the machines for a period of 72 hours or greater, the machines seem to hang or behave sluggishly after exiting from Sundiag (bug id 1145401).
c) In some very rare situations, when unmapping a range of virtual addresses cloned for SX, the machine panics, because the thread unmapping the address range holds the writer's lock on the address space and then tries to acquire a reader's lock on the same address space.
(Bugid 1145746)

(from 101346-03)

1145617 NFS/NIS+ servers + clients hang in tcp_lookup
If a Solaris machine receives a TCP packet sent to the all-zeros IP address (an old broadcast address that should no longer be used), the kernel might go into an infinite loop. The loop is in drain_syncq calling tcp_rput calling tcp_lookup_listeners and then calling put.

(from 101346-02)

1145661 accept() fails with EPROTO, attempts to reconnect on socket fail.
Applications can see the socket accept() call fail with errno being EPROTO. This error indicates that the TCP 3-way open handshake failed to complete and should be handled by retrying the poll/select/accept call. This patch prevents the EPROTO errors from being returned by accept().

(from 101346-01)

1144308 Solaris crashes with urgent data RFC 1122
The machine can get a watchdog reset or alternatively hang when receiving urgent data. If it hangs, it hangs "hard", i.e. L1-A does not work, and unplugging and replugging the keyboard does not work either. A snoop trace of the last packet received should have the Urgent flag bit set and an Urgent pointer of 0. (Note: the 2.2 version of snoop does not print the Urgent pointer field; the 2.3 version does.)

(from 101326-01)

1139124 syslog does not output more than approx 100 characters, no errors reported
syslog messages longer than 100 characters result in an empty syslogd posting. Only the header of the message is printed. The message part is empty.

(from 101319-02)

1144228 Sparc center 2000 running Solaris 2.2 panics with data fault in do_urg_outofline
System panics in various places in the do_urg_outofline() routine. A typical stack trace would look like:
  do_urg_outofline()
  sockmodrsrv()
  runservice()
with a NULL message block (bp).

(from 101319-01)

1137978 telnet returning "protocol error" when attempting to telnet to netbuilder router.
From either a Solaris 2.1 or 2.2 system, telnet returns "protocol error" when telnetting into the 3com router.
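
Returning to bug 1145661 above: on a system without that fix, the entry says an EPROTO failure from accept() should be handled by retrying. The following is a minimal C sketch of such a retry. The names accept_is_transient and accept_with_retry are hypothetical, and treating EINTR and ECONNABORTED the same way as EPROTO is an assumption that goes beyond the bug description. An application built around poll or select would return to its event loop rather than call accept() again directly.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Return nonzero for accept() errors that only mean "this connection
 * attempt did not complete".  EPROTO is the case named in the bug
 * description; EINTR and ECONNABORTED are assumptions added here.
 */
static int accept_is_transient(int err)
{
    return err == EPROTO || err == EINTR || err == ECONNABORTED;
}

/*
 * Hypothetical wrapper: call accept() up to max_tries times, retrying
 * only on transient errors.  Returns the new descriptor, or -1 with
 * errno set by the last failed accept().
 */
static int accept_with_retry(int listen_fd, int max_tries)
{
    assert(max_tries > 0);
    for (int i = 0; i < max_tries; i++) {
        int fd = accept(listen_fd, NULL, NULL);

        if (fd >= 0)
            return fd;
        if (!accept_is_transient(errno))
            return -1;          /* a real error: report it */
    }
    return -1;                  /* still failing after max_tries */
}
```

Bounding the number of attempts keeps a persistently failing listener from spinning forever; the limit itself is a design choice of this sketch, not something the patch notes specify.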

(from 101294-02)

1162202 Patch 101294-01 forces input baud rate the same as output baud rate.
In zsa_open(), set the input baud rate if the default setting for cflag in /kernel/drv/options.conf wants to set the input baud rate. Otherwise set the output baud rate only.

(from 101294-01)

1138196 min baud B50 unable to receive 25 bytes in 7 seconds even if transmitted
1137587 tcflow: START and STOP characters are not read when IXON is not set
1137798 PARENB, INPCK, and PARMRK are set, three character sequence not read correctly
1138207 tcflush is not clearing the data to be read
1141642 local printer on /dev/term/a doesn't work at all on 4/50
The failures were due to bugs in the zs driver and ldterm module not handling the software flow control (tcflow) correctly.

(from 101267-01)

1142365 lockd incorrectly examines export information when comparing filehandles.
Consider a scenario where a PC application, running under WABI or SunPC, uses File Sharing to synchronize instances of itself. If one instance is running on an NFS server and another instance is running on an NFS client, the NFS server will allow access to both instances at the same time, when it should really only allow access to one at a time. This can cause data corruption.

1140047 Suppose a 3-byte (or bigger) region of an NFS file is locked. Now suppose that one or more bytes in the middle of the region are unlocked, leaving two locked regions on either side of the "hole". The client does not properly manage these two regions when they are unlocked. The problem does not appear until the server reboots and the client attempts to reclaim (relock) at least one of the regions. This can lead to situations where the server thinks a region is locked, but nobody owns the lock. The server console may display
  _nfssys: error Stale NFS file handle
if the file was deleted before the server rebooted.

1123788 lockd on an NFS client detects and filters out retransmitted requests from the client kernel.
The code to detect retransmissions does not look at the filehandle in the request. Although this does not seem to have been a problem in practice, it could conceivably lead to cases where an application gets the wrong return code from a lock request.

(from 101316-02)

1152150 RPC errors when SunUnify started in BCP mode
BCP RPC server programs (compiled on SunOS 4.x) don't respond when started from inetd over TCP in wait mode.

(from 101316-01)

1131237 socket library is not signal safe
1143043 _s_synch socket library deadlock

(from 101674-01)

1160720 backing out the change made for bugid #1150613 as it is not more general.

(from 101411-04)

1151137 file system (directory) access sometimes very slow
Under heavy filesystem (nfs and ufs) load in which the ufs inode cache is full and the dnlc (directory name lookup cache) contains mostly nfs entries, lots of CPU cycles are spent trying to free up a ufs inode from the dnlc rather than creating a new ufs inode.

(from 101411-03)

1156947 Solaris 2.3 kernel "panic:ufs_putapage:bn == UFS_HOLE"
Drivers using physio() to copy data to mmap'ed files can cause UFS_HOLE panics.

1146726 Concurrent activity inside large directories leads to erroneous results
Multiple finds within a directory will sometimes return ENOENT (entry not found) even though the entry does exist. Large directory names exacerbate this problem.

(from 101411-02)

1154060 ufs quota reports users going over quota to the console.
When a user goes over quota (either soft or hard limits for either files or disk blocks) on an NFS-mounted filesystem, a message appears on the console of the NFS server. This occurs because there is no controlling tty for the NFS server process which is acting on behalf of the user. When many users on the same NFS server go over quota at the same time, the console can become unusable due to the quantity of messages that appear on the screen.
(from 101411-01) 1098381 ufs write creates zero length file with blocks allocated when buf read faults Passing the write() call an invalid buffer pointer can cause an internal copy to fail, after space has been allocated to the end of the file for the new data. The additional blocks are not freed after the failed copy, and may later cause a UFS_HOLE panic; or may cause fsck to complain about unallocated blocks at the end of a file. (from 101329-16) 1149399 service should not allow concurrent resyncs 1160662 readonly child is updating the local database from the master 1158639 __log_resync() automatically resets the transaction log state to LOG_STABLE 1158638 update timestamp can be lost after checkpointing 1161525 checkpoint before all replicas are in sync can lead to full resyncs on replicas The NIS+ transaction log can corrupt under heavy updates and checkpointing. (from 101329-15) 1163847 automountd doesn't work with Apollo pathnames which start with // 1153274 machine panics with recursive mutex_enter while using the automounter (from 101329-14) 1156518 Cannot mount mvs/nfs mounts using autofs under Solaris 2.2 & 2.3.?? automountd wrongly assumed that the mounted filesystem had to start with a '/' (slash). This assumption may be invalid if the server is not a UNIX system. Such is the case of MVS, DOS and others. (from 101329-13) 1145421 NIS+ NIS (YP) compatibility does not handle primary host name correctly A problem occurs because the entries for hosts.byaddr are returned one at a time in the order they occur in the hosts table. The fix is to merge the entries for a particular address into one entry and return that instead. We have to be careful to not merge entries for different addresses. 1155701 memory leak found in the NIS+ server code using up all the system resource (from 101329-12) 1163275 automountd sporadically dumps core with SEGV on xdrmem_getbytes/memcpy. (from 101329-11) 1153253 rpc.nisd with PATCH#101329-04 dumps core without any notice. 
rpc.nisd crashes immediately after starting up. (from 101329-10) 1160379 Major security hole in automount. (from 101329-09) 1157062 autofs and loopback mounts in direct hierarchical maps broken. This patch allows automountd to correctly remount loopback file systems after it determines that at least one member of the hierarchy was busy and therefore could not be remounted. automountd needs to format the mount options before it passes them to /usr/lib/fs/lofs/mount. (from 101329-08) 1150491 cron dies with SIGSEGV in __nis_core_lookup Cron dumps core when the NIS+ environment is unstable. (from 101329-07) 1149774 remote users can override the way NFS filesystems are mounted to gain root access. Closes hole left by previous fix to automounter's security. Fixes options security hole in automounter when using wildcards. (from 101329-06) 1149774 remote users can override the way NFS filesystems are mounted to gain root access: security (from 101329-05) 1136034 NIS+ creates invalid hostname. NIS+ does not work correctly if the hostname in /etc/hosts file is fully qualified. (from 101329-04) 1150596 patch 101329-03 disables RPC threading. The patch 101329-03 created a problem with MT RPC. When running MT and using RPC you get the error: Assertion failed: RW_READ_HELD(&rpcaddr_cache_lock), file rpc/rpcb_clnt.c, line 127 Which means the routine check_cache() is being called without a read lock being held. This is because of this patch. This will be seen by anyone who tries to run a program that is MT while using RPC. (from 101329-03) 1145542 nisaddcred creates LOCAL entries with the wrong group ID when invoked by a non-root user who is a member of the NIS+ group for the credential table. 
(from 101329-02)

1139765 data corruption in NIS+ cache manager
1144962 rpc.nisd dumps core (while undergoing update from YP maps via nisaddent -my)
1142583 NIS+ command(s) fail to use master server
1147964 NIS+ servers start repeatedly doing FULL RESYNCS because stdio runs out of fd's
This set of fixes and workarounds addresses a number of problems found at a very large customer using only NIS+ for their name service. The fixes cover a number of memory leaks discovered by Purify, an important fix to __nis_core_lookup() (one copy in the NIS+ server and one in libnsl), and a fix/workaround for a problem with running out of open file descriptors. That problem was caused by a combination of heavy load (shift changes at the site) and the fact that stdio only allows 256 of the 1024 file descriptors to be used, so stdio opens failed and the NIS+ servers constantly did FULL resyncs. The workaround bumps TCP connection descriptors above 256 to allow stdio to use the lower numbered file descriptors for itself.

(from 101329-01)

1145573 CADDS software package fails with rpc error.
Servers using librpcsoc (source compatibility) library for service creation do not respond to client requests.

(from 101315-01)

1140610 autofs does not work with cachefs file system type
1145129 automountd doesn't follow NIS+ table paths
This patch fixes the following problems:
1. autofs will fail to mount entries from the hosts map which specify the cachefs filesystem option; such is the case of /net when the cachefs option is specified.
2. autofs mounts which trigger hierarchical mounts will fail when automountd remounts members of a hierarchy which have previously been unmounted due to an inactive filesystem unmount request. This only occurs when using cachefs.
3. autofs wrongly assumes that the backfstype option is placed last in the list of options.
4. automountd will not follow NIS+ table paths when the auto_* tables are pathed to tables in another domain.
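
Illustration for bug 1147964 (from 101329-02) above: the workaround keeps low-numbered descriptors free for stdio by moving connection descriptors above 256. One way to express that idea in C is with fcntl(F_DUPFD), as sketched below. The helper name move_fd_above is hypothetical, and this README does not show how the NIS+ server actually implements the workaround, so treat this as an assumption-laden sketch only.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: re-home `fd` at a descriptor number >= min_fd so
 * that lower numbers stay available to stdio. Returns the descriptor to
 * use from now on, or -1 with errno set. On failure the original fd is
 * left open.
 */
static int move_fd_above(int fd, int min_fd)
{
    int newfd;

    assert(min_fd >= 0);

    if (fd >= min_fd)
        return fd;                       /* already high enough */

    /* F_DUPFD returns the lowest free descriptor that is >= min_fd. */
    newfd = fcntl(fd, F_DUPFD, min_fd);
    if (newfd == -1)
        return -1;

    close(fd);                           /* release the low-numbered slot */
    return newfd;
}
```

A server would call something like move_fd_above(sock, 256) right after accepting a connection. The call fails if the process descriptor limit is not above min_fd.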
(from 101597-02) 1163445 libaio blows away user-defined signal handler when system() call is made. 1. The problem is because both aio and libc are handling signals. The following is my analysis. In the test program the calls to signal are going to libaio signal handling routines and the signal calls in the system() code gets into libc handling routines. This is where the problem comes from. 2. aiowrite/aioread may return success in case of a certain error condition. (from 101597-01) 1148003 libaio and libthread not compatible The async IO library (libaio) and the threads library (libthread) were exporting symbols that should have been static to these libraries. This caused applications that were using both threads and async IO to not work properly. Note: This should permit applications to link with libthread and libaio. It doesn't guarantee that all threaded programs can use libaio. That problem will be fixed later. (from 101859-01) 1152710 socket lib in 2.3/2.2 have problems with not clearing bad connections and errno The listening AF_UNIX socket in a server can get permanent errors should client close the socket before the connection has been accepted. The client might do this if it takes a signal causing the connect() call to return EINTR. (from 101344-11) 1172644 ps(1) very slow on dataless client with 101344-08 installed. The ps command (and any references to /proc) were slow due to an unnecessary amount of buffer flushing requests while trying to determine address space sizes. This behavior would only appear on systems with root and /usr partitions mounted remotely over NFS. (from 101344-10) 1171950 This patch fixes the bug 1171950, "kernel panics in nfs server during an nfs rename operation" and a similar problem associated with nfs link operation. 1163551 NFS area: file nfs_xdr.c xdr_createargs() doesn't initialize argp->ca_sa. This causes a null pointer reference. 
(from 101344-09)

1157053 ESC8146 System panics when doing a copy to NFS file system mounted across FDDI-S
Copying a 1 meg file causes a panic in xdr_writeargs(). Over ethernet this problem does not happen.

(from 101344-08)

1153707 s1093 panic vn_rele: vnode ref count 0
1160181 2.3 system hang due to kernel out of resources (rmalloc_wait())
1143962 client hangs when a single page cannot be pushed to a server
It is possible for the system to hang or panic in certain situations while using remotely mounted filesystems.

(from 101344-07)

1139146 fopen(fn,a/a+) in a non-writable file did not return NULL
write(2) to an NFS mounted file, opened with the O_CREAT flag, does not fail when the file has read-only permission.

(from 101344-06)

1161359 machine panic in nfs_bio by user application
A user can crash the machine by running two versions of his application. The application mmaps numeric data files that have been automounted from a remote host and does calculations on the data. If the user runs two versions of his program at the same time, it will crash. A simple program that opens the nfs file, mmaps it and closes it panics the system in nfs_bio().

(from 101344-05)

1146065 authern_marshal crash on diskless ss10 from a 20-way dragon server.
The modifications here address the problem of unsafe handling of the cached credential that the NFS client uses. The changes correct the handling so that it is mt-safe.

(from 101344-04)

1144683 nfs_inactive rfree causes mutex_exit - lock not held panic
1146065 authern_marshal crash on diskless ss10 from a 20-way dragon server.
The modifications here address two problems. The first problem is an mt-unsafe condition in the NFS client. It has to do with the way that rnodes are allocated and freed. It was possible for a client to be in the process of releasing an rnode and have it picked up by another thread before the first thread was finished with the rnode.
The second problem is some unsafe handling of the cached credential in the rnode. This cached credential is used to pass credential information around between the various layers in the NFS client. The handling of the credential was not mt-safe and so could result in client crashes. (from 101344-03) 1141654 bin access is possible It is possible to create setuid/setgid programs on a server from an insecure client. (from 101344-02) 1132302 read and write data across 5.1 and 4.1.3 NFS fails intermittently This is an NFS data corruption problem in which clients are sometimes unable to read back data that was just written to the server. This is usually characterized by the term, short read. (from 101344-01) 1146159 du, tar, bar does not work with VMS(NFS) fs due to conflicting fileoffset defs The NFS protocol specifies that a NFS_READDIR request return an opaque 32 bit cookie which is used to get the next directory entry. Solaris places this cookie in the directory offset field for use by seekdir() and other directory functions. However, some OSes such as VMS use negative numbers as cookies which causes a seekdir() on the directory to fail. The fix is to allow arbitrary seeks on NFS mounted directories. (from 101694-01) 1148354 AF_UNIX SOCK_DGRAM sockets suffer excessive data loss Even though the interface is defined to be unreliable, the instances where data is discarded can be minimized and this patch attempts to do that. (from 101831-01) 1151509 automounter's built in timeout is too short for low speed lines Some server's slow rpcbind/portmapper cause automountd to timeout when it attempts to create the client handle to communicate with the remote machine. This patch provides a larger timeout with retransmission if necessary. 
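
Illustration for bug 1146159 (from 101344-01) above: an opaque 32-bit READDIR cookie with its top bit set looks negative once it is treated as a signed directory offset. The sketch below contrasts a strict validity check with the relaxed behavior the fix describes ("allow arbitrary seeks"). The function names are hypothetical, and the strict check is only an assumption about the kind of test that rejected such cookies; the real kernel code is not shown in this README.

```c
#include <stdint.h>

/* Returns 1 if the cookie would be negative as a signed 32-bit offset. */
static int cookie_looks_negative(uint32_t cookie)
{
    return (cookie & 0x80000000u) != 0;
}

/* Assumed pre-fix behavior: refuse offsets that appear negative. */
static int seek_allowed_strict(uint32_t cookie)
{
    return !cookie_looks_negative(cookie);
}

/* Behavior described by the fix: any cookie value is an acceptable offset. */
static int seek_allowed_relaxed(uint32_t cookie)
{
    (void)cookie;   /* the cookie is opaque, so its value is not inspected */
    return 1;
}
```

The point of the relaxed version is that the client has no business interpreting the cookie at all, since the protocol defines it as opaque to everyone but the server.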
(from 101637-01)

1151643 ypbind core dumps often on heavily used mail server

(from 101500-04)

1170832 (gnu) make in parallel mode will fail on automounted file systems
Hold delivery of certain signals during lookup of automounted pathnames while the mount is taking place. Note that the signals are not blocked, only delayed. The lookup can still be interrupted with SIGINT, SIGTERM, SIGHUP and SIGQUIT. This way system calls waiting for an automount to complete will not be interrupted by signals other than those mentioned above. It now follows the same signal policy as NFS.

(from 101500-03)

1155130 Access "/home/terminator" from Frame gives "Device busy" error
A number of applications will issue lookups on automounted directories or files which time out before automountd has finished the mount. The application retries the mount, possibly a number of times, before the mount finally takes place. This generates multiple mount requests for the same filesystem. Once the first mount request successfully completes, all other queued mount requests will fail with device busy errors. This patch fixes this problem.

(from 101500-02)

1159691 automount skips entries on readdir.
1156556 first ls on a large automounted fs produces a truncated listing.

(from 101500-01)

1149714 autofs fails /home/foo during multiple simultaneous logins
1124745 autofs attempts to mount bogus directories
This patch fixes the problem with multiple simultaneous auto mounts. Previously when two or more threads triggered an auto mount, one would succeed and the remaining threads would get the "no such file or directory" error message. It also fixes a problem with indirect mounts trying to mount bogus filesystems. Basically any failed lookup on an indirect mounted filesystem would trigger a call to automountd attempting a mount of the non-existent file/directory, which obviously resulted in an unsuccessful mount.
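
Illustration for bug 1170832 (from 101500-04) above: the signal policy holds most signals while a lookup is in progress but still lets SIGINT, SIGTERM, SIGHUP and SIGQUIT through. The sketch below is a user-level analogy using the POSIX signal mask, where held signals stay pending and are delivered once the mask is restored. The function names are hypothetical, and the kernel's own mechanism for autofs is not shown in this README.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

/* Build a mask of every signal except the four that may still interrupt. */
static int build_hold_mask(sigset_t *mask)
{
    if (sigfillset(mask) == -1)
        return -1;
    if (sigdelset(mask, SIGINT) == -1 || sigdelset(mask, SIGTERM) == -1 ||
        sigdelset(mask, SIGHUP) == -1 || sigdelset(mask, SIGQUIT) == -1)
        return -1;
    return 0;
}

/* Start holding signals; the previous mask is saved for release_signals(). */
static int hold_signals(sigset_t *saved)
{
    sigset_t hold;

    if (build_hold_mask(&hold) == -1)
        return -1;
    return sigprocmask(SIG_BLOCK, &hold, saved);
}

/* Restore the saved mask; signals that arrived meanwhile are delivered. */
static int release_signals(const sigset_t *saved)
{
    return sigprocmask(SIG_SETMASK, saved, NULL);
}
```

A caller would wrap the slow operation between hold_signals() and release_signals(). In a multi-threaded program pthread_sigmask() would be the appropriate call instead of sigprocmask().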
(from 101855-02)

1186156 keyserv caches old private key after user's password is changed

(from 101855-01)

1167500 login program dumps core with bus error in _getvfsent() if invalid arguments are supplied

(from 101881-01)

1145457 ksh does not set the correct arguments for su -
3001400 "su" to special logins behaves strangely
In SunOS 5.3, a bug was introduced into the 'su' command such that all shells that were invoked using the 'su - username' command obtained the string '-su' as argv[0] of the shell. This had the effect of simulating a login from the point of view of the shell, but didn't type out /etc/motd, check mail etc. for the Korn or Bourne shell users. Also, some customer applications and scripts that check argv[0] via the output of 'ps' were confused. This fix restores the old behavior of 'su -'; that is, argv[0] of the user shell is set to the name of the shell prepended with a '-' sign. Only the special case of the default shell (a blank entry) uses '-su' as the shell name.

(from 102445-01)

1175368 SECURITY anyone can gain root access to a 2.3 machine

(from 102168-01)

1165615 nistbladm fails to add/modify NIS+ tables attributes when the column definition is not specified.

(from 102220-03)

1204638 praudit core dumps with user defined events

(from 102220-02)

1187179 in.rshd dumps core with bsm enabled

(from 102220-01)

1169914 FTP doesn't enforce access control in certain situations

(from 102361-01)

1183552 ftpd processes hang at httpd (WWW server) sites

(from 103063-01)

1160090 nis_cachemgr should delete expired dir objects only if they can be refreshed

(from 102110-01)

1171833 memory leak in thr_setspecific
1170507 libthread memory leak
1169391 thrds uses /dev/zero to initialize thrd stack via mmap(); mmap() contains garbage
This patch fixes two different problems: one has to do with a memory leak in a multi-threaded process which failed to reclaim idle threads' stacks.
The other problem has to do with libthread using /dev/zero even before any threads have been created. libthread uses /dev/zero for its own use but it should make sure that it does not start using it before any threads are created. This patch ensures that libthread will use /dev/zero only after some threads have been created. Some applications might want to close all file descriptors (say, 3 to MAX) and should continue to work if they use libthread but do not create any threads.
The thr_setspecific bug is that each thread in an MT program which uses Thread Specific Data (TSD) (the thr_setspecific(3t) interfaces) leaks 2 words plus 1 word per TSD key. So, for example, if there are 6 TSD keys, 2 + 6 = 8 words are leaked per thread created/exited, i.e. 32 bytes per thread with 4-byte words. If the rate of thread creation/destruction is high, this could lead to a substantial memory leak, especially for long running applications.

(from 101869-01)

1171745 panic, deadlock condition detected: cycle in blocking chain, with 101318-50
1171729 101318-50 utmpd loops consuming 100% CPU

(from 101672-01)

1159160 f77 compilation fails because combination of fseek/fwrite writes wrong bits

(from 101615-02)

1163347 finger displays the wrong number of users logged in
1163352 rusers displays the wrong number of users logged in
1163355 w displays the wrong number of users logged in
1163357 wall shouldn't write messages to window tools without -a option
1163346 users(1B) displays the wrong number of users logged in.
1163360 utmp_update must stamp the entries made by normal users
1160151 utmp_update shouldn't allow cmdtool to overwrite live utmp entries
1163741 sometimes utmp entries made by window tools aren't removed
1153241 getting an mmap error when I try to start any "LST" test.

(from 101615-01)

1157935 utmp entries were not being reaped in Solaris 5.3.
The utmp_update command was incorrectly marking the utmp entry as defunct.
The code now checks to see if the tty device in the entry is owned by the process modifying it.

(from 101489-04)

1156649 lwp pool shrinks and doesn't grow regardless of calls thr_setconcurrency.
When a multi-threaded program calls thr_setconcurrency() to set the number of lwps (say n) and calls thr_setconcurrency() again after five minutes, when all aged lwps have died, the resulting concurrency level is not what was requested in the second call. The problem can be explained in the following steps, where _minlwps is the level of concurrency given in the thr_setconcurrency() call and _nlwps is the current level of concurrency.
i) With the first invocation of thr_setconcurrency(), _minlwps is set to the level of concurrency requested, and accordingly the number of lwps running on the system is equal to _minlwps. This is done only when the level of concurrency present at that time (i.e. _minlwps) is less than what has been requested.
ii) After five minutes, when aged lwps are killed, _minlwps is still set to the level of concurrency requested earlier, but _nlwps (the number of lwps available now) is much less.
iii) With another invocation of thr_setconcurrency(), since _minlwps is equal to the level of concurrency requested, the library's thr_setconcurrency() simply returns even though the number of lwps available is less than _minlwps.

(from 101489-03)

1154502 libthread panic when a SIGPOLL/SIGIO signal is received during sleep
When a program linked with the threads library receives a SIGPOLL or SIGIO signal during sleep(), a threads library panic occurs.

(from 101489-02)

1153229 libthread's version of sigprocmask() is clobbering errno
libthread interposes on sigprocmask(2) to provide its own version. This version is simply a call to thr_sigsetmask(3t), thus ensuring that in a multi-threaded (MT) process, the masking operation is carried out only on the calling thread's signal mask. There is no process wide signal mask in an MT process.
The bug is that libthread's version of sigprocmask() clears "errno" on success. The bug is typically seen in code such as the following:

    ....
    write_ret = write(...);
    sigprocmask(...);
    if (write_ret == -1) {
        printf("write() failed; errno is %d\n", errno);
        exit(1);
    }
    ....

If the call to "write()" above fails, then the wrong value for errno will be printed out due to the bug that sigprocmask() clobbers errno. The work-around is to check "write_ret" *before* calling sigprocmask():

    ....
    write_ret = write(...);
    if (write_ret == -1) {
        printf("write() failed; errno is %d\n", errno);
        exit(1);
    }
    sigprocmask(...);
    ....

In any case, this is the correct programming style, since even if sigprocmask() were OK, the system would legitimately clobber errno if sigprocmask() were to fail, thus *legitimately* clobbering the error code returned by the failed "write()".
Another problem is that libraries such as libsocket which return -1 and the error code in "errno" might also encounter this libthread bug. This results in library calls potentially returning -1 but a 0 in "errno". Hence, applications which call such libraries would run into this bug indirectly with no possibility of implementing a work-around in the application.

(from 101489-01)

1146922 cond_timedwait() misses its timeout and hangs.
The cond_timedwait() interface doesn't reliably guarantee that a thread will wake up even if it has specified a timeout period. Also, it's possible for signals to be pending on a thread which are never delivered.

(from 101484-03)

1163944 libucb should call _{sigprocmask,sigfillset,sigemptyset}
1163946 BCP signal problems. sigpending, sigaddset, sigdelset and sigismember functions not working properly in BCP; handlers set up with sigaction may get the wrong signal number, also in BCP
1161404 sigprocmask() blocks wrong signals, corrupts memory under 5.x BCP
Calling sigprocmask() from BCP applications will:
1. not block the signals
2. corrupt memory

(from 101484-02)

1152682 UNIFY database can't open lock manager under BCP.
semctl() shared memory operations fail when called through semsys()

(from 101484-01)

1150765 fstatfs returns incorrect f_bavail field in binary compatibility mode; f_bavail field is incorrect in fstatfs structure

(from 101379-02)

1153458 printf sprintf fprintf leading zero format fails in source compatibility mode.
The leading zero format specifier is ignored.

(from 101379-01)

1146808 wait3 in ucblib returns an incorrect rusage structure.

(from 101378-21)

1243116 ESC: tag command queue'ing errors on Data General array box's

(from 101378-20)

1203291 ssd: watchdog reset on transport failure

(from 101378-19)

1236801 backoutpatch for 101378 does not take out changes to /etc/system

(from 101378-18)

1224486 sd: there should be retries for both read & write in case of media/hw error
1224604 sd: retries on KEY_ABORTED_COMMAND should eventually be given up

(from 101378-17)

1186920 kernel panic with data fault when "sched" is accessing diskarray
1199500 isp: with the new ISP PROM, ISP in 2.3 or 2.4 should still run ISP f/w in driver

(from 101378-16)

1189329 esp: implicit restore data pointer fix is not complete (bug id 1183215)

(from 101378-15)

1188367 isp0 : unknown capacity, disk offline

(from 101378-14)

1194397 esp: scsi_reset fails with return value zero if patch(101378-10) is applied

(from 101378-13)

1184788 ESC - A SCSI-2(?) peripheral eventually times out in SYNC mode of operation

(from 101378-12)

1183215 esp: implicit restore pointer at reconnect is incomplete

(from 101378-11)

1174992 SC2000 Plus with 17 Modules panics upon boot due to ISP firmware problem; kernel panics running ISP firmware 1.12, 1.11 and earlier.

(from 101378-10)

1152282 savecore gets segmentation violation
While trying to produce a crash dump on a sun4d, the system savecore program gets a segmentation violation.

1173279 esp: A lot of scsi warning messages are displayed.
While trying to start up heavy disk activity and then load and unload the st driver, a lot of scsi warning messages are displayed on the console window. Afterwards, the disks are inaccessible.

(from 101378-09)

1173973 esp: scsi resets occurring more often with newer fab FAS286 chips
On Sun4d systems (both sundragons and scorpions) we are getting more scsi timeout resets now with the newer reduced die FAS286 chips (2400150). The old style chip (2400121) is no longer being made. It doesn't seem to be configuration dependent. On sundragons it occurs most often on tape devices. It also shows up more often with the 1/2 height exabyte 8 mm tape drive.

(from 101378-08)

1162475 esp: Restore data pointer triggers a SCSI bus reset
If a data transfer is in progress and a restore data pointer message is issued, it appears that the ddi dma window check somehow fails. The host adapter first attempts to abort the data transfer with an abort message. Failing that, it resets the bus. In any case, this translates into a hard failure. STK uses the restore data pointer message because they do not have a look-aside data buffer to handle incompressible data. They simply issue a restore data pointer message and restart the data transfer instead. Net effect: under Solaris 2.3, you can't back up your system with an STK tape drive with data compression enabled.

(from 101378-07)

1164926 sd: need mechanism to restart command for sun4 if HBA returns TRAN_BUSY
When a user runs volume manager on a Sun4 machine and uses a CD-ROM drive, a regular command can sometimes be rejected with a message like:
    WARNING: /dev/ncr@3d,200000/sd@6,0 (sd7): transport rejected (0)

(from 101378-06)

1162277 esp: unnecessary maxdma check within esp.c
There appears to be an unnecessary DMA size check within esp.c. From what can be ascertained, the data transfers are broken into 64k transport sizes anyway, so checking for exceeding the maxdma value does not appear to be warranted.
(from 101378-05)

1163617 part of patch 101378-03 missing
For rev -01 of patch 101378, there was a prepatch script that would edit /etc/system. This script was missing from revs -03 and -04. The change to /etc/system sets scsi_options to 0x3f8.

(from 101378-04)

1155505 isp: driver should always try synchronous data xfer negotiations first
If drives are in synchronous mode when isp starts booting and scsi_options don't have SYNC flag set, isp tries to use ASYNC mode. That gives a lot of firmware errors. Hence, isp driver should always try SYNC negotiations first.

1154770 isp: upgrade f/w to 1.12
This f/w fix (isp_fw.c) fixes the null handle problem (spr893) (isp returns a response packet w/o a handle which blows up the isp driver)

;  1-7-94  ggm  [spr893] the Bus_Dev_Rst_Seq_Int  *
;               routine was changed to process    *
;               the following SCSI messages:      *
;               BUS DEVICE RESET, ABORT, and      *
;               ABORT TAG.                        *
;  1-10-94 ggm  [spr893] to correct the null      *
;               handle problem all decisions      *
;               based on the scsi_task_state are  *
;               now based on all 8 states.        *
;

(from 101378-03)

1148668 isp: timeouts and fatal errors caused by changing pkt_time.
The 1093 isp driver copies the packet resp_time over to pkt_time. This is a violation of SCSA and also causes spurious timeouts and fatal errors. Furthermore, if a packet is transported with pkt_time == 0 then this means no timeout; the fatal timeout handling code should ignore these packets.

1151965 isp: f/w version 1.11
This is a f/w release from qlogic (1.11) which will go in prom 1.17
This release fixes:
spr 898 - Enhancement Call Syserr
spr 899 - Set parity enable bit correctly
spr 900 - Fix for system hang when response queue gets full
spr 901 - Set dqcb_cmd_depth_limit correctly
spr 902 - Patches for earlier SPR894/897
spr 885 - Correct use of throttle.
(IGOR's request) (from 101378-02) 1149518 the esp driver does not handle target initiated sync mode correctly which causes a hang and timeout if the negotiation is immediately followed by a data xfer The problem is in programming the offset and period registers. The req/ack delay is correctly set in the soft copy of this register but not in the register itself. The next time the register is programmed, the soft copy value is used and then there is no problem which is why this bug was not noticed before. 1136580 Heavily loaded SS600MP with DSBE/S on SES/B panics. Please see bug itself for the actual configuration where this happens. 1145757 esp does not issue a device scsi reset when you call scsi_reset (ROUTE, RESET_TARGET) from sd driver. instead it issues a test unit ready. It will impact all target drivers that issue scsi_reset to a particular device. scsi_reset(ROUTE, RESET_ALL) works still. 1134617 Getting "WARNING: Processor level 3 SBus interrupt not serviced" message during C2+ bootup of alpha2.0-a and alpha2.0-b. (from 101378-01) These bugs deal with WIDE scsi negotiation. 1143567 esp: eliminate warning message for wide scsi negotiation rejection. Whenever a SCSI disk drive initiates a wide data transfer negotiation, the esp host adapter driver correctly rejects the message (it's not a wide host adapter). BUT, it then prints out warning messages which are displayed on both the console and logged in the messages file. This guarantees we will have customers calling up about this "problem". 1145242 isp: disable default SCSI_WIDE capability. SCSI_WIDE capability should be disabled for isp for all targets by default. only if target driver wishes, WIDE capability should be turned on by a per target basis. 1137670 sd: add support for wide data transfer negotiations. The disk driver does not check the inquiry data to see if wide xfers are supported by the drive. The driver should check the inquiry data first before asking the HBA to negotiate wide transfer size. 
This is to maintain compatibility with our current installed base of SCSI disk drives. This bug currently prevents us from mixing wide and narrow devices on an ISP since the narrow devices do not all behave correctly.

Patch Installation Instructions:
--------------------------------
Refer to the Install.info file within the patch for instructions on using the generic 'installpatch' and 'backoutpatch' scripts provided with each patch. Any other special or non-generic installation instructions should be described below.

Special Install Instructions:
-----------------------------
The running automountd needs to be stopped prior to patch installation:

1) Stop automountd
   # sh /etc/init.d/autofs stop
2) Install this patch
3) Edit the necessary entries on your automounter maps (add the retry=n option).
4) Restart automountd
   # /etc/init.d/autofs start

If point patch 101869-01 is installed on your system, please run 'backoutpatch 101869-01' before installing this patch.

Unless patch 101331-01 or later is installed, installation of this patch will result in the following warning:
    WARNING: unable to rename
This warning may be ignored; kadb is installed successfully.

Reboot after installation.

Miscellaneous Notes:
---------------------
NOTE 1: If the csh shell program is used, the csh patch is also needed to get the complete fix for bug 4032974 (system hangs when lbolt wraps around):
    101610-07 (or newer) /usr/bin/csh patch

NOTE 2: sendmail is no longer bundled with this patch and is now available as patch 101739. If specific older revisions of patch 101318 are installed after 101739 is installed, the result will be a downgraded sendmail with fewer fixes. The revisions in question are 101318-35 through 101318-42, which are all the kernel patches with sendmail bundled in.

NOTE 3: libthread.so.1 is no longer bundled with this patch and is now available as patch 102110. Patches 101318-55 through 101318-61 should not be installed after patch 102110 is installed.
SunSoft recommends that 101318-62 or later be installed instead. If you must install patches 101318-55 through 101318-61, they MUST be installed prior to the installation of 102110 and NEVER be installed after 102110. Doing so would back out all fixes for 102110.

NOTE 4: If this patch is applied to a system installed with the entire configuration, approximately 20 MB of free space in /var is required to use with the default save option of the installpatch utility.

NOTE 5: Systems running NeWSprint 2.5 should also apply the NeWSprint patch 102113 if you are printing on a NeWSprint printer. Patch 102113 prevents a hang from occurring when printing.

NOTE 6: If this patch is applied to a server, it should also be applied to dataless clients that also mount /usr from that server. Failure to do so will generate this error message when openwin is started on the client:
    "Binding Unix socket: Invalid argument"

NOTE 7:
***** Special Note for systems running Oracle, Informix, or Sybase *****
As of rev -34, there are kernel bugs fixed in this patch, but Sun also found some bugs in DB vendors' code which make DBs fail to start up with the fixed kernel. As a result, Sun is coordinating with DB vendors regarding their corresponding fixes. The complete solution is to have both the 101318 patch, rev 34 or higher, installed on Solaris 2.3 AND the corresponding fixes from the DBMS vendors installed as well. Here is when the DB vendor fixes are expected to be available:

Oracle: Oracle has a V7.0.16 patch now available free of charge to customers.

Informix: Informix's fix is included in the production version of Online 6.0.

Sybase: For users of the Sybase SQL Server version 10.0 or later. With certain memory configurations, a small number of sites may experience a situation where the Sybase SQL Server may fail to boot. If this occurs, contact Sybase Technical Support for assistance and refer to Sybase fix, EBF 2594. For users of Sybase SQL Server 4.9.2 or earlier, this bug does not exist and this patch should have no impact.

For additional detail and other work around information please see the file SPECIAL_NOTICE_DBMS also included with this patch.
*************************************************************************

NOTE 8: This patch is not applicable for the Voyager, Sun4m1 architecture.

README -- Last modified date: Tuesday, February 6, 2001