Patch-ID# 103957-28
Keywords: security nfssrv ufs klmmod klmops libthread.so.1 y2000 acl LOFS
Synopsis: SunOS 5.5 CS6400: kernel update
Date: Mar/30/00

Solaris Release: 2.5_CS6400

SunOS release: 5.5_CS6400

Unbundled Product:

Unbundled Release:

Xref: This patch available for non-CS6400 sparc as patch 103093
Xref: This patch available for x86 as patch 103094

Topic: SunOS 5.5 CS6400: kernel update
        NOTE: Refer to Special Install Instructions Section for IMPORTANT
              specific information on this patch.

        This patch is a CS6400 platform specific port of 103093-28.

Cray SPR's fixed with this patch: 98179 100433 100861 101359 101372 102615
        103171 104194 105066 105514 105825

Cray SPR's incorporated in this version:

BugId's fixed with this patch:
        1161438 1182705 1189967 1203090 1215792 1220902 1220995 1223882 1223900 1224857
        1227376 1227580 1228664 1229015 1229031 1229843 1230150 1230478 1230865 1231471
        1231759 1231871 1231997 1232825 1232869 1233049 1233084 1233088 1233175 1233514
        1234450 1234858 1234968 1235169 1236018 1237898 1238241 1238559 1238581 1238582
        1238919 1240234 1241118 1241816 1242188 1242233 1242481 1243804 1244088 1244142
        1244278 1244706 1245291 1245540 1245602 1245703 1246045 1247172 1248186 1248384
        1248925 1249250 1249319 1249985 1250351 1250620 1250937 1251421 1251423 1251430
        1253223 1253366 1253528 1253810 1256153 1256610 1257803 1258151 1258191 1259966
        1259984 1260769 1260873 1260959 1260982 1261400 1261511 1262082 1262694 1262979
        1262995 1264333 1264890 1265000 1265170 1265396 1265447 1265705 1266113 1266278
        1266767 1267447 4004147 4004575 4005615 4006846 4007937 4009069 4015176 4015191
        4015367 4015497 4016316 4017513 4017770 4019380 4022354 4022849 4024647 4025548
        4026339 4026740 4026789 4027360 4027442 4029417 4031186 4032123 4032974 4035167
        4035845 4036063 4036589 4037755 4038653 4041518 4041542 4043953 4050892 4051082
        4051257 4051590 4051899 4057818 4058892 4058904 4059632 4059736 4061967 4063932
        4067641 4067949 4071076 4092407 4096789 4102420 4110026 4125102 4145354 4171116
        4175350 4261612 4285794
        4017457 4244171 4293440

Changes incorporated in this version: 1224857 4261612 4285794 4244171 4293440

Relevant Architectures: sparc.cray4d

Patches accumulated and obsoleted by this patch:

Patches which conflict with this patch: 103084-02 103489-01 103325-03
        103164-07 103226-07 103477-13 103153-17 103093-28

Patches required with this patch:

Obsoleted by:

Files included with this patch:

/kernel/fs/lofs
/kernel/fs/nfs
/kernel/fs/ufs
/kernel/genunix
/kernel/misc/klmmod
/kernel/misc/klmops
/kernel/misc/nfssrv
/kernel/misc/tlimod
/kernel/strmod/rpcmod
/kernel/sys/doorfs
/kernel/sys/nfs
/platform/cray4d/kernel/unix
/platform/cray4d/kernel/genunix
/usr/include/sys/fs/ufs_inode.h
/usr/lib/adb/mntinfo
/usr/lib/adb/rnode
/usr/lib/adb/ufsq
/usr/lib/fs/ufs/fsck
/usr/lib/libthread.so.1

Problem Description:

4293440 Back-port fix from BugID 4083498 for Solaris 2.5
4244171 system resets when running lwp05 from the mixtress group of tests

(from 103093-28)

1224857 illegal instructions in currently unexecuted assembly code
4261612 profil not disabled on exec*() as indicated in man page
4285794 threads hang waiting for ULOCKFS_SLOCK

(from 103093-27)

1238241 data fault when calling ufs_acl_setattr with ufs_acl 0 in inode
4092407 release of i_contents lock in ufs_si_load can lead to race
4125102 ufs_itrunc()/top_end_async() deadlock

(from 103093-26)

4175350 longjmp see NULL value with jmp_buf causes csh dump core on SS20 hyperSPARC MP

(from 103093-25)

4171116 "kgmon -i" causes panic or hang the system
4102420 segv's and libthread panics when numerous pthread_cancel()'s are run
4061967 assertion failure in _disp() for cancellation test.
(from 103093-24)

4145354 Ultra 1 panic in -- segkp_fault: accessing redzone

(from 103093-23)

4110026 sol 2.5.1, sigwait() returns '-1' by SIGLWP when compile/link with '-lthread'
4096789 quota -v gives NOT STARTED output for time left column.
4037755 getting portmap RPC for every NLM RPC
1262979 inode cache consumes too much memory; system hangs
4026789 deadlock between i_contents lock and page_lock
4051899 ufs idle queue has no hysteresis control
4063932 orphan lock problem caused by sigalrm/sigintr & large packet loss

(from 103093-22)

1266113 due to memory corruption in the OS, Xsun crashes randomly on IPX

(from 103093-21)

1234968 System Panic, ufs_ifree: freeing free inode, mode= %o, ino = %d, fs = %s^J
4041542 kRPC/COTS client thinks that it is getting large records

(from 103093-20)

4071076 On 2.5 nfs server, the data over length in nfs header was written on disk.

(from 103093-19)

4041518 RFE: fix for sys hard hang during kernel coredumping, either intended or forced

(from 103093-18)

1223882 Neutron r hangs reproducibly for a back to back NFS I/O load
4059632 Kernel watchdog resets with misaligned stack

(from 103093-17)

4067949 setfacl allows ACLs to be set on a readonly NFS filesystem
4051257 watchdog reset occurs in sys_rtt when running threaded application
4043953 kernel randomly panicked with assertion failure in callout.c, line 345
4038653 nfs mount fails with fully qualified hostname > 32 char's
4026740 assert failure in segnf_gettype: seg->s_base == addr
4058892 as_getprot() needs to report real size of ISM segments
4058904 accessing addresses in ISM segments between "real" end and "segment" end loop
4059736 as_memory() does not dump ISM segments
4067641 Changing acl's on a UFS fs mounted readonly causes machine to panic
1203090 Core dump doesn't include share memory used by the

(from 103093-16)

4038653 nfs mount fails with fully qualified hostname > 32 char's
4024647 chgrp does not work on NFS mounted filesystems
4015191 nfs client leaves .nfs files on the server

(from 103093-15)

4051590 ioctl I_NREAD returns wrong value when patch 103640-08 is applied
4050892 init_swift_idle_cpu() should not search OBP for property
4031186 boot program gets hung on sun4m by level-14 clock interrupt
4026339 /usr/ucb/ps hangs while trying to get anonmap serial_lock in segvn_fault()
4006846 BAD TRAP data fault panic in sun4m locked_pgcopy() routine

(from 103093-14)

1233514 savecore does not save unix.0 on large memory (8GB) sunfire machines
1262995 dbx produces error "Cannot open "/dev/zero" in child" after a call
4015176 crash dumping on small swap device is broken
4015367 Solaris 2.5 cannot handle crash dump bigger than 2GB
4017513 sybase is getting segv on failed logins on ss10's and ss20's
4025548 estimate and print the size needed for full crash dump
4027360 system hangs during shutdown
4036589 mt application hangs if last pthread_create is allowed to exit
4057818 panic due to procfs access of nonexistent mapping

(from 103093-13)

4036063 security problem with writing core files
4032123 Panic (segkp_fault: accessing redzone) occurred on the Solaris2.4 system.
4022354 kill -9 can not kill application thread in cv_wait called from getandset()
1257803 watchdog reset encountered on a PDB 1.2 system under load
1248925 le: 2.6 debug kernel panics on sun4c
1238582 privileged ifconfig ioctls by normal user succeed on sockets created as root

(from 103093-12)

4035167 Need a new, private interface between JVM and libthread to get a thread's TOS
4032974 system hangs when lbolt wraps around.
1264890 Sun4d running 2.5.1 panics bp_map: read_hwmap failed
1262082 2.5.1 sun4d hangs w/kernelmap fragmentation

(from 103093-11)

4022849 2.5.1 kadb kernel panics with kernel heap corruption; appl hang; sys unusable
4016316 On 2.5.1 and 2.5.1 SHWP system goes into a state of soft hang.
4015497 Locking bug in I_NREAD ioctl handler.
4004575 High mutex hits, slow performance when c2auditing enabled
1245291 Bug in libthread.so(cond_timedwait()) and libposix4.so(sigtimedwait) in 2.4,2.5
1182705 Signals may orphan locks on clients

(from 103093-10)

4004147 panics in segkp_load when the file command is run
1265447 SYSTEM HANG, CLOCK THREAD IN MUTEX_ENTER WAITING FOR ANOTHER LOCK

(from 103093-09)

4009069 2.5 TCP generates wrong checksum and never recovers from error
1265396 Ctrl-C typed to dbx is sent to child debugee (not to dbx) when app uses sigwait
1249985 "deadman" doesn't work correctly on MP systems.
1233088 ioctl(PIOCPSINFO) is 100 times too slow on multi-threaded processes
1247172 threads losing signals when preempted

(from 103093-08)

1227580 cannot support high TCP connection rates: noncaput errors reported by the driver
1223900 alarm(2) doesn't work properly with large arguments
1266767 F_GETLK returns incorrect value on 2.x if a lock is pending
1261400 several processes are hung waiting for rwlock
1245540 The application which worked on non-Ultra does not work on 2.5/Ultra system.
1259966 winlock timed out causes Copy8FFB come into infinite loop on ffb single buffer

(from 103093-07)

1265705 Add hyperSPARC Colorado-4 support to S2.5 and later kernels
1264333 _lwp_suspend()/continue() interrupts blocking system calls
1262694 Solaris 2.4 hangs due to memory leak in kmem_alloc-8, kmem_alloc_24 and -40 lea
1260982 rwnext & infonext fix in 2.4 to wait to enter inner perimeter didn't make 2.5
1260959 Streams information delayed 50-100 ms until dbri driver schedules it
1253223 System running 2.3 with KJP-80 on single CPU /24MB hangs in fork test case
1238559 sun4m user process can arbitrarily dump core with kadb
1256153 watchdog after continuing from kadb
1261511 alloc_hunk() bug causes panic with 1MB CPU cache

(from 103957-06)

SPR 105825 2.5 PATCH FOR SPR 105662: DATA CORRUPTION WITH DR-MEM-DETACH ENABLED
        There are 4 symptoms and an optimization; note that only one of
        these symptoms was ever seen at a customer installation (data
        corruption). All of the problems only occur when dr-mem-detach
        is enabled:
        - data corruption (2 different bugs, same symptom)
        - assertion failure - ASSERT(PTBL_IS_LOCKED(ptbl->ptbl_flags))
          Note that asserts are only enabled in debug mode kernels.
        - assertion failure - ASSERT(pp->p_vnode)
          Note that asserts are only enabled in debug mode kernels.
        - optimization: the caged kernel messages which used to be
          emitted in debug mode are now only emitted if the
          DR_MEMDBG_CAGEDKERN flag is set in dr_mem_debug
4017457 Customer encounters poor I/O performance, is interested in freemem_lock fix

(from 103093-06)

1256610 strwrite fails to call queuerun on error path: bug performance hit
1253528 The problem is associated with the bug found in the SE5 kernel.
1251423 panic - recursive mutex_enter on lwplock
1249250 SIGSEGV handler gets truncated fault address

(from 103957-05)

105066: 2.5 PATCH FOR SPR 105031: DATA FAULT PANIC IN PAGE_LOOKUP+0X4C
        A data fault panic could occur in the exec() flow because the
        kernel attempted to access kernel addresses while accidentally
        in user context. This could occur if a process in the exec()
        flow faulted while accessing the ELF header. Very obscure bug.
105514: SYSTEM PANIC IN TRASH_USER_WINDOWS+0X6C - SUN BUG 1255692
        Invalid address panic in trash_user_windows().

(from C103093-03)

98179: THREAD_LOAD ASSUMES STACK IS IN USER SPACE WHICH IS NEVER TRUE: SUNBUG 1230150
103171: 2.5 PATCH SPR 103156: DATA FAULT PANIC IN IDLE ROUTINE - SEE SPR 101309
104194: 2.5 PATCH FOR SPR 104099: PAGE_CREATE_WAIT FAILS TO FREE P->PCF_LOCK MUTEX

(from C103093-02)

100433: 2.5 PATCH FOR SPR 99353: SYSTEM PANIC: SRMMU_UNLOCK ORACLE RUNNING AT THE TIME
        panic: srmmu_unlock [with additional arguments]
100861: 2.5 PATCH FOR SPR 100719: CS6400 PANIC IN CHECKPAGE ROUTINE UNDER 2.5 WITH MEMOR
        The system will panic with a memory address alignment error in
        the checkpage procedure (around checkpage+0x148).
101372: 2.5 PATCH FOR SPR 100404: PANIC: SRMMU_PTELOAD - PTE REMAP PANIC WITH JKP-36
        panic: srmmu_pteload - pte remap
102615: 2.5 PATCH FOR SPR 102190: SYSTEM HANG AFTER NPI FDDI DETACH ERROR
        The problem results from a call to srmmu_logger() during a
        pause_cpus() state. The srmmu_logger code indirectly attempted
        to acquire a mutex held by a different thread, which had been
        paused by the previous pause_cpus() call.
101359: 2.5 PATCH FOR SPR 101309: REPEATED DATA FAULT PANICS AT IDLE+0X130, LOOKS LIKE NULL ADDRESS IN L7
        System panics in the idle thread while running Oracle database
        testing. The cause was a porting problem from 2.4 to 2.5 in the
        Processor Partition code, due to a change in a Solaris protocol
        related to thread scheduling.
(from 103093-05)

1251421 Files may be corrupted after a power failure

(from 103093-04)

1244142 ULTRA panics with 3rd party ATM card driver
1232869 paging thresholds are too low on very big systems causing kmem alloc failures
1161438 The pageout daemon blows up when a lot of memory is added to a system
1231471 Global register %g1 gets corrupted on SPARC2 and 2.5
1243804 lockfs -h and umount of the UFS lying under a loopback file system causes panic

(from 103093-03)

1238581 indirect system calls fail on sun4u when C2 auditing is enabled
1230150 THREAD_LOAD ASSUMES STACK IS IN USER SPACE WHICH IS NEVER TRUE

(from 103093-02)

1233084 freectty set cred pointer to NULL causing other module panic the system
1231759 strioctl ic_timout changed values from seconds to milliseconds
1229031 page_unlock: page not locked panic occurring when locking address space
1220902 workaround needed for Viking Hardware Problem
1189967 real-time latency limits exceeded occasionally
1231871 cpu_surrender doesn't check for threads waiting on kp queue

(from 103093-01)

1228664 no fp queue in the signal handler mcontext for floating point ieee exceptions

(from 103489-01)

1248186 ULTRA panics with 3rd party ATM card driver

(from 103084-02)

1235169 ftp tests cause "le0 port" hang on Neutron
        ftp sessions hang neutron

(from 103084-01)

1229015 interrupt fails to get to driver

(from 103325-03)

1251430 Solaris 2.5 system panicked with message "lm_get_sysid: too many lm_sysid's"

(from 103325-02)

1251430 Solaris 2.5 system panicked with message "lm_get_sysid: too many lm_sysid's"

(from 103325-01)

1238919 mount causes the system to panic Data fault.

(from 103164-07)

1258191 msgrcv was not interrupted by thr_suspend(SIGLWP).
(from 103164-06)

1260769 MT application is dropping signal events when run on multi-processor systems

(from 103164-05)

1247172 threads losing signals when preempted

(from 103164-04)

1241118 libthread panic in thr_join handling of zombie threads seems to be broken

(from 103164-03)

1253366 threads deadlock occurs in delivering SIGIO

(from 103164-02)

1230478 deadlock in libthread

(from 103164-01)

1230865 Problem with threads and signals.

(from 103477-13)

4035845 do_unmount can hang while an NFS server is down

(from 103477-12)

4007937 Processes hang accessing files over NFS in clnt_tli_kcreate()
4005615 mounting from HP3000 takes too long because of repeated NFS_ACL retransmits

(from 103477-11)

4032974 system hangs when lbolt wraps around

(from 103477-10)

1246045 NFS/TCP client loops forever trying to bind an in use reserved port
4017770 fix to bugid 1225408 doesn't work

(from 103477-09)

4029417 bug fix for 1250937 has a bad putback for 2.5
4027442 fix for 1234450 is not complete for 2.5, 2.5.1
4019380 other access to directory hangs while HSM on server restores file

(from 103477-08)

1250937 NFS server can crash NFS client by sending bogus stat() data

(from 103477-07)

1253810 rpcmod's mir_close() routine should not block waiting for flow control

(from 103477-06)

1258151 Solaris 2.4 nfs -o noac option not working properly with novell nfs server

(from 103477-05)

1234450 NFS (VOP_WRITE &c) returns EINTR when "intr" is not specified on the mount.
1260873 Kernel memory gets corrupted when sharing and unsharing secure NFS.

(from 103477-04)

1234450 NFS (VOP_WRITE &c) returns EINTR when "intr" is not specified on the mount.
(from 103477-03)

1244706 rpcmod's mir_do_rput() can create a NULL tail_mp that can cause a crash

(from 103477-02)

1240234 NFS server does not accept lock requests from a fujitsu client

(from 103477-01)

1232825 RPC: Unable to send/receive

(from 103226-07)

1237898 nfs transfer hangs when transferring file > 8k from apollo

(from 103226-06)

1242233 during heavy read/lseek/write activity, NFSv3 client sometimes reads old data
1234858 problems with block allocation calculations in NFS V3 client and server
        Bugid 1234858 is listed in this revision because its fix
        requires /kernel/misc/nfssrv to be delivered in the patch, and
        that file was inadvertently left out of patches 103226-04 and
        103226-05.

(from 103226-05)

1241816 vi will fail with Stale NFS file handle if option nocto is set

(from 103226-04)

1234858 problems with block allocation calculations in NFS V3 client and server

(from 103226-03)

1236018 multiple big file copying causes freemem 0 on version 3 NFS

(from 103226-02)

1233175 csh can become unkillable

(from 103226-01)

1231997 f77 REWIND makes error : "eor/uio [1010] off end of record" on nfs files

(from 103153-17)

1244278 acl support is missing from lofs filesystem
        NOTE: this provides the complete fix for 1244278

(from 103153-16)

1244278 acl support is missing from lofs filesystem

(from 103153-15)

4051082 Short duration machine hangs after installation of ufs patch
1265170 .../cmd/fs.d/ufs/fsck/utilities.c will not handle 2000AD and beyond YY formats

(from 103153-14)

1265000 "panic: kernel heap corruption detected" while running TStrans (high/long)

(from 103153-13)

1259984 2.4 Sun4d hangs during shutdown or halt

(from 103153-12)

1267447 deadlock when running quotactl on heavily loaded system

(from 103153-11)

1266278 freeing free xxx panic; indirtrunc tries to free the same block twice

(from 103153-10)

1249319 lo_sync() should not flush the underlying filesystem

(from 103153-09)

1233049 System hangs when user stops thread writing to ODS logging device

(from 103153-08)

1250351 fsck mounted fs uses block rather than raw name, so error-lock state isn't fixe
1250620 fix-on-panic hard-locks trans. devices, when only error-lock is necessary

(from 103153-07)

1244088 SS2000 is completely hanging under heavy I/O - Solaris 2.4 + 101945-36

(from 103153-06)

1248384 lockfs -h and umount of the UFS lying under a loopback file system causes panic

(from 103153-05)

1245703 Deadlock condition detected: cycle in blocking chain

(from 103153-04)

1227376 panic "Deadlock condition detected: cycle in blocking chain"

(from 103153-03)

1242188 hang waiting for rwlock with holdcnt of -1 but no owner
1215792 delayed availability of freed diskspace when UFS logging with ODS4.0/3.0
1245602 Logging UFS is slower than UFS for local writes
1242481 panic: ufs_putapage: bn == UFS_HOLE
        This problem happens when an inode is in the process of being
        deleted (the delete is not yet complete) and an nfs thread
        sneaks in and tries to operate on the same inode (in this case
        a write). The nfs side checks the generation number in the
        inode to detect whether the inode it refers to is no longer
        valid. This check cannot prevent the problem, because the
        generation number is changed much later in the delete process.
        The i_contents lock is not held for the whole delete, which
        allows the nfs thread to sneak in and get the same inode while
        it is actually being deleted. The ufs_delete code checks for
        v_count > 1 to detect this situation, but these checks are not
        sufficient: the nfs thread could still come in after the last
        v_count check in ufs_delete and get the same inode.

(from 103153-02)

1220995 directory blocks not counted in quotas

(from 103153-01)

1229843 ufs `umount' on an errored device hangs system.
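
The interleaving described under 1242481 can be replayed step by step in a toy model. The sketch below is not the UFS or NFS code: the `Inode`, `FileHandle`, `nfs_handle_valid` and `replay` names are hypothetical, and the single-threaded replay only illustrates why a generation-number check that runs before the generation is bumped cannot catch an inode that is already being deleted.

```python
from dataclasses import dataclass


@dataclass
class Inode:
    gen: int                    # generation number, bumped late in the delete
    v_count: int = 1            # holds on the vnode
    blocks_freed: bool = False  # set once the delete has freed the blocks


@dataclass
class FileHandle:
    gen: int                    # generation recorded when the handle was issued


def nfs_handle_valid(fh: FileHandle, ip: Inode) -> bool:
    """Staleness check as described in the text: compare generations only."""
    return fh.gen == ip.gen


def replay():
    """Replay the problematic interleaving in a fixed order (hypothetical)."""
    ip = Inode(gen=7)
    fh = FileHandle(gen=7)
    events = []

    # Delete thread: the last v_count > 1 check finds no other holder,
    # then the lock is dropped for a while.
    events.append(("delete: other holders?", ip.v_count > 1))

    # NFS thread runs in that window: the generation still matches,
    # so the handle looks valid and the thread takes a hold.
    ok_during = nfs_handle_valid(fh, ip)
    if ok_during:
        ip.v_count += 1
    events.append(("nfs: handle valid during delete?", ok_during))

    # Delete thread resumes: blocks are freed, generation is bumped last.
    ip.blocks_freed = True
    ip.gen += 1
    events.append(("nfs: handle valid after delete?", nfs_handle_valid(fh, ip)))
    return ip, events


if __name__ == "__main__":
    for label, value in replay()[1]:
        print(label, value)
```

In this model the NFS thread ends up holding an inode whose blocks have been freed, which corresponds to the state in which a later write can find a hole where it expects a block.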

Patch Installation Instructions:
--------------------------------
Refer to the Install.info file within the patch for instructions on
using the generic 'installpatch' and 'backoutpatch' scripts provided
with each patch. Any other special or non-generic installation
instructions should be described below.

Special Install Instructions:
-----------------------------
If possible, perform patch installation in single user mode. If this
can not be done, we recommend having the system in as quiet a state
as possible: no users logged on, no user jobs running.

Reboot the system after patch installation.

NOTE 1: TO GET THE COMPLETE FIX FOR 4027360, ONE NEEDS TO ALSO INSTALL
        THE FOLLOWING PATCH:

        103712-02 (or newer) usr/kernel/fs/namefs patch

NOTE 2: TO GET THE COMPLETE FIX FOR 4032974, ONE NEEDS TO ALSO INSTALL
        THE FOLLOWING PATCHES:

        103936-03 (or newer) kernel/drv/isp patch
        104548-03 (or newer) CS6400 isp driver fixes
        104744-01 (or newer) platform/sun4m/kernel/drv/sx patch
                             (for sun4m machines only)
        102982-02 (or newer) usr/bin/csh patch

        FAILURE TO INSTALL ALL THESE PATCHES FOR 4032974 WILL CAUSE THE
        SYSTEM TO HANG AFTER 248 DAYS.

NOTE 3: TO GET THE COMPLETE FIX FOR 4035845 (do_unmount can hang while
        an NFS server is down) and 4026118 (do_unmount hold vfslist
        mutex and then hangs on NFS GETATTR call), ONE NEEDS TO ALSO
        INSTALL THE FOLLOWING PATCHES:

        103256-03 (or newer) kernel/fs/cachefs patch
        103492-03 (or newer) kernel/fs/autofs patch

NOTE 4: Due to bugfixes 4026740, 4058892, 4058904 and 4059736 in
        103093-17, it is recommended that one installs the following
        patches:

        103328-04 (or newer) kernel/fs/procfs patch
        105312-01 (or newer) kernel/exec/elfexec patch
        105316-01 (or newer) usr/bin/gcore patch

NOTE 5: TO GET COMPLETE FIX FOR BUGID 4102420 (SEGV's AND LIBTHREAD
        PANICS WHEN NUMEROUS pthread_cancel()'s ARE RUN), ONE ALSO
        NEEDS TO INSTALL THE LIBC PATCH (103187-41 or newer).

NOTE 6: IF YOU HAVE SOLSTICE DISKSUITE 4.1 INSTALLED, YOU'LL NEED TO
        INSTALL THE SOLSTICE DISKSUITE 4.1 JUMBO PATCH TO GET THE
        COMPLETE FIX FOR BUGID 4125102 (ufs_itrunc()/top_end_async()
        deadlock).

        104172-17 (or newer) Solstice DiskSuite 4.1: Jumbo patch
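
The 248-day figure in NOTE 2 is consistent with a tick counter overflowing. As a rough check, assuming (this README does not say so) that lbolt is a signed 32-bit counter incremented at HZ = 100 ticks per second, the uptime before it wraps can be worked out as follows. The function name and constants are illustrative only.

```python
# Assumptions, not taken from this README: lbolt is a signed 32-bit
# tick counter and the system clock runs at 100 ticks per second.
HZ = 100
LBOLT_BITS = 32
SECONDS_PER_DAY = 24 * 60 * 60


def days_until_wrap(hz: int = HZ, bits: int = LBOLT_BITS) -> float:
    """Days of uptime before a signed counter of `bits` bits overflows."""
    max_ticks = 2 ** (bits - 1)  # first tick count that no longer fits
    return max_ticks / hz / SECONDS_PER_DAY


if __name__ == "__main__":
    print(f"{days_until_wrap():.2f} days")  # prints "248.55 days"
```

Under these assumptions the counter wraps a little after 248 days of continuous uptime, so a system rebooted more often than that would not reach the wrap.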