-----------------------------------------------------------------------
NOV-PER1.DOC -- 19970526 -- Email thread on NetWare Performance Aspects
-----------------------------------------------------------------------
Feel free to add or edit this document and then email it back to faq@jelyon.com

Date: 19 Feb 1997
From: Floyd Maxwell
Subject: NetWare "Speed" numbers for various machines/processors

[Floyd: Compiled from submissions by various individuals]

386/33                        =    320
486DX 33 EISA                 =    906
486DX 50                      =  1,370
486DX2-66 ISA                 =  1,830
486DX4/100                    =  2,741
Acer Altos P60                =  3,294
Pentium 66                    =  3,660
HP Netserver LF 5/66          =  3,660
Pentium 90                    =  4,592
HP Netserver LC 5/100         =  4,920
HP LH 5/100                   =  5,490
Compaq Proliant 4500 P100     =  5,490
Clone P120                    =  6,605
Micron P133                   =  7,284
Clone P133                    =  7,302
HP Netserver LC 5/133         =  7,322
Gigabyte HX MB w Pentium 133  =  7,339
P200                          = 11,009
PPro/150                      = 12,350
Alpha AXP c'93                = 13,000
Compaq Proliant 5000 PPro166  = 13,728
Gigabyte HX M/B w AMD K6/166  = 13,761
Dell PowerEdge P180           = 14,757
Gateway PPro200               = 16,429
Clone PPro200                 = 16,429

---------

Date: Thu, 3 Apr 1997 15:27:11 PST
From: Kevin Miller
Subject: Re: Novell on Pentium Pro -Reply

According to one of the TechShare 1996 conferences, INW has optimizations for the Pentium Pro processor.

------------------------------

Date: Thu, 28 Sep 1995 12:53:02 -0600
From: Joe Doupnik
Subject: Re: FWD: General networks statistics

>I am having troubles with my connection from my server to my printserver. Could anyone explain these lines to me:

You left out the interesting values at the top of the list, particularly the total number of sent and received packets.

> Checksum Errors: 4,898
What it says: packets damaged, so checksums fail.

> Hardware Receive Mismatch Count: 4,821,452
Big trouble. The packet says it's one length, but the packet handler was told another length. See below on trashing that lan adapter.

> Adapter Reset Count: 14
It died and was reset by the driver. Sign of trouble in the server too.

>Adapter Operating Time Stamp: 513
>Send OK Single Collision Count: 29,880
>Send OK Multiple Collision Count: 50,591
>Send OK But Deferred: 204,857
Things are queued up a lot.

>Send Abort from Late Collision: 672
Your network is broken. Late collisions should almost never happen. Look for broken client boards, wiring much too long, boxes on the verge of collapse.

>Send Abort from Excess Collisions: 359
Normal but a tad large.

>Rx Length Error: 45
>Rx Late Collision: 255
Your network is broken, same as above.

>It's a Digital-PCI-Turbo-etc. card, type: DC21040 (=drivername)
In addition to checking your wiring plant, replace this board and its driver.
        Joe D.

>These numbers are collected in 5 days....TIA

------------------------------

Date: Mon, 6 Nov 1995 17:03:37 -0600
From: Tim Copley
Subject: Anyone know where PERFORM3 is ? -Reply

ftp://tribble.missouri.edu/pub/utils/prfrm3.zip

---------

ftp://tui.lincoln.ac.nz/MISC/PERFORM3.ZIP

Gunnar Jensen

---------

From: David Hanson
Subject: Re: Benchmarking Servers

>Utility to do benchmarking on 3.12/4.1 servers to determine what needs to be added or changed to boost performance.

A handy tool for a quick benchmark is netlab2.usu.edu/apps/PERFORM3. I use it for loading the network, then I use its output along with the MONITOR statistics to assess performance and locate bottlenecks.
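[Editor's note: the following is a small illustrative sketch, not from the original thread, showing how Joe D.'s reading of the MONITOR LAN statistics above can be applied mechanically. The grouping of counters into "normal activity" versus "investigate" follows his comments; the Python form and the idea of a triage script are the editor's own.]

    # Quick triage of NetWare MONITOR LAN counters, following the comments above:
    # deferred sends and ordinary collisions are just Ethernet doing its job,
    # while late collisions, receive mismatches and adapter resets mean trouble.
    stats = {                                   # sample values from the message above
        "Checksum Errors": 4898,
        "Hardware Receive Mismatch Count": 4821452,
        "Adapter Reset Count": 14,
        "Send OK Single Collision Count": 29880,
        "Send OK Multiple Collision Count": 50591,
        "Send OK But Deferred": 204857,
        "Send Abort from Late Collision": 672,
        "Send Abort from Excess Collisions": 359,
        "Rx Late Collision": 255,
    }

    NORMAL_ACTIVITY = {
        "Send OK Single Collision Count",
        "Send OK Multiple Collision Count",
        "Send OK But Deferred",
    }
    TROUBLE = {                                 # these should sit at or near zero
        "Send Abort from Late Collision",
        "Rx Late Collision",
        "Hardware Receive Mismatch Count",
        "Adapter Reset Count",
    }

    for name, value in stats.items():
        if name in TROUBLE and value > 0:
            print(f"INVESTIGATE  {name}: {value:,}")
        elif name in NORMAL_ACTIVITY:
            print(f"normal       {name}: {value:,}")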
------------------------------

Date: Wed, 6 Dec 1995 09:28:21 GMT
From: Phil Randal
Subject: Re: Has anyone used the Balance NLM

>I have been looking at spec sheets from NSI Inc. of New Jersey giving info on their balance.nlm, which allows you to use two or more NIC cards in a server and have all the cards on the same network segment.
>
>I'm wondering if anyone can share any experiences with this product?

Before you spend your hard-earned bucks, try Novell's NLSP router, which supports load-balancing. For NW3.12 it's ftp://ftp.novell.com/pub/updates/nwos/nw312/ipxrt3.exe
The NW4.10 version is 41rt2.exe somewhere on the NW 4 tree.

------------------------------

Date: Thu, 22 Feb 1996 11:55:18 -0600
From: David Kobbervig
Subject: Re: how to run a nlm in ring 1 or 3?

>We have a Novell 4.1 server. I want to run a specific nlm in ring 1, 2 or 3. What do I have to do?

You can't run anything in rings 1 and 2; they are theoretical. To use domains, you need to LOAD DOMAIN in startup.ncf, then at the console type DOMAIN=<domain name>, where <domain name> is either OS_PROTECTED (ring 3) or OS (ring 0). Then you load your .NLM. Switch domains by typing DOMAIN=<domain name> again. Type DOMAIN to see what domain you are in.

---------

Date: Thu, 28 Mar 1996 10:18:08 -0600
From: Joe Doupnik
Subject: Re: Low priority threads

>Does anyone have a few words of advice on upgrading low priority threads? Is it a good idea, is it a bad idea, what effect does it have on NDS traffic etc.??

Novell says NEVER upgrade low priority threads.

------------------------------

Date: Wed, 8 May 1996 22:18:30 GMT
From: Jesse Oliveira Jr
Subject: Re: Monitoring server statistics

>Anyone know of a good (cheap/free) program which will monitor and record server statistics such as processor utilization, disk writes/dirty cache buffers?

Check out:
http://www.avanti-tech.com/ (NConsole and NodeInfo)
or
http://www.bmc.com/ (NetTune and NetReport)
Both are good and +- cheap!

------------------------------

Date: Fri, 28 Jun 1996 17:03:03 -0600
From: Joe Doupnik
Subject: Re: Single Workstation Test with Perform3 Sp

>I think our NW3.12 setup is way too slow. I've done a test using Perform3 on a single workstation and have gotten a max throughput of only 680 Kb/s. We're running NW3.12 on a 486DX2-66 with 32 MB RAM. Does anyone know off-hand what type of throughput I should be typically getting with one workstation with no other traffic on the network???
----------
What were your expectations? Be aware that Perform3 is extremely vulnerable to caching effects and makes a poor benchmark in many cases. Turn off caching in the client. An average IPX throughput for file transfers is about 350KB/sec, including Packet Burst. That's in the 0.5-2KB block request range. Slow networks are best approached by putting a packet snoop program on the wire to investigate deeply. Novell makes a fine one, Lanalyzer for Windows.

Here's a tiny piece of useless information for you on real network performance versus user perception of it. Testing my TCP/IP code in Kermit. Just for fun I changed the code to discard every third packet but report success to the higher levels (simulating packet loss on the wire), and discard every third again but with a lan adapter failure (simulating too many broken boards out there), and let through only one out of every three transmissions. That's a 66% failure rate (2 of every 3 packets really never hit the wire). TCP transfers still worked, slower than normal, but about what I am accustomed to while working cross continent.
That's a terrible loss rate, and yet it would be marginally acceptable in terms of performance (after all, it did work). Another of this kind. An Ethernet board by a well known vendor (no, not 3Com) generated bad frames (CRC errors galore) when faced with competing traffic. Throughput was hardly affected yet network monitor alarms went off all over the place, and my NW 3.12 server netlab2 crashed from bad frames. An ordinary user would be unaware of the situation. The board is no longer in production (but it certainly makes an interesting stress test article). And yet another. Some machines are pretty awful about dealing with extended memory (raw stuff) and lose interrupts and worse when switching between real and protected mode. (Hint, put your RAMDRIVE.SYS in expanded memory for smoother performance). So the DOS memory manager made a hash of operations of a particular Ethernet board driver, causing lots of board transmission failures. But the user didn't even notice until a special test program revealed the sorry state of affairs. A useful bit of information is that some boards + drivers can fail to send frames when pushed a little. Eventually timers recover lost NCP requests, and that causes slow downs. Joe D. ------------------------------ Date: Wed, 3 Jul 1996 17:27:55 +1200 From: "Baird, John" Subject: Re: 3.12 boot time >>Can someone tell me why my NW3.12 server takes 25 minutes to boot up, 15 >>if partitions are fully sync. >> > >I have a server that takes a while to mount the Users volume. My >experience with this is that there are many, many small temporary files >that are created and deleted on this volume. If I do a Purge /all from >the root (which purges anywhere from 10,000-50,000 files), it cuts the >mount time considerably. You might try this and if it works possibly set >automatic purging to on. Experience with 3.11 points to the number of directory entries allocated, not the number actually used, being the major determinant of mount time for volumes, and hence server boot time. After an end of year cleanup on one of our student servers, a 1 GB student volume with only 25 MB of files remaining (all deleted files had been purged) was taking 6 mins to mount. A 2 GB applications volume on the same server with 1.85 GB in use took 2.25 mins to mount. The student volume had 462,000 directory entries of which 7,800 were in use, whereas the apps volume had 78,000 directory entries of which 47,000 were used. After deleting and recreating the student volume, and restoring the files, it mounted within a few seconds. ------------------------------ Date: Mon, 26 Aug 1996 17:41:24 +0100 From: "David W. Hanson" Subject: Re: NIC Card Performance >I am getting the following on my NIC card from MONTIOR.NLM > >Total Send OK Byte Count Low (very high error numbers) >Total Recieve OK Byte Count Low (nearly as high) > >My total send counter gets up to 2 million in 1 day of uptime. OK means OK. Large OK numbers are -good- things, they are the number of Good bytes the NIC handled. They are not errors, they are statistics. They are represented by unsigned long integers, so they are broken into two parts, Low and High. Once the Low counter reaches 4,294,967,295 the High counter will increment, and the Low counter will start counting again at 0. The reason this "error" is taking over your "LAN card" is because your server works! 
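[Editor's note: a tiny sketch, not from the original message, of how the Low/High counter halves described above combine into one figure. The counter values are invented for illustration.]

    # Reassembling a MONITOR byte counter reported as two unsigned 32-bit halves.
    # The Low half wraps after 4,294,967,295 (2**32 - 1) and the High half then
    # increments, exactly as described above.
    send_ok_high = 3            # hypothetical "Total Send OK Byte Count High"
    send_ok_low  = 1_234_567    # hypothetical "Total Send OK Byte Count Low"

    total_bytes = send_ok_high * 2**32 + send_ok_low
    print(f"Total bytes sent OK: {total_bytes:,}")   # 12,886,136,455 for these values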
------------------------------ Date: Wed, 11 Sep 1996 19:18:14 -0600 From: Joe Doupnik Subject: Re: Help --Netware upgrate >At our site, we use a file server running Netware 3.12(250users) >system with a single 486-66CPU, 64MRAM. it contains 8gb scsi hard >disks and FDDI backbone connect to a switch hub. almost 100 users >connect to the switch hub. all these users have the same network >segment ( the net id is the same ). > >Here is my problem. When all of our users connect to the network, the >utilization of the file server up to 80% (can obtain it from >fileserver's monitor screen)and its response becomes very slow. So we >decide to upgrade it. but how can I find it need a cpu upgrade or >network innovation or increase the RAM? Any help will be appreciated. ---------- Well stated question. Applying engineering techniques we look for areas of heavy usage which can be a bottleneck. Before touching the server put a packet monitor on the wire to see what is happening. A good one for IPX work of this kind is Novell's Lanalyzer for Windows. If the traffic rate on a single wire is above say 1000 packets/sec sustained then the wire is overloaded. Partition the network. Second, use Monitor again and look at the disk channel numbers. You could have a slow disk system. Double check by performing an NCOPY of a large file from one location to another on the server (all versions except one or two perform the copying within the server, use that packet monitor to be sure). EISA/PCI boards give much better performance than ISA bus boards, even if both kinds are bus masters. Did you allocate sufficient directory cache buffers? Third, look at printing activity. Printing takes lots of cpu resources all by itself. What else is running in the server? If there are lots of nice components but not needed all the time try unloading those NLMs. Fourth, pay attention to the lan adapters in the server and clients. In the server these should be bus master adapters, not simple ISA bus units. Did you allocate plenty of receive cache buffers, and does Monitor show many errors or "ECB not available" counts? Does Lanalyzer show many "server overload" error messages? 80% utilization of the cpu is common and nothing to worry about by itself, but it is high if it continues that way most of the day. It does say the cpu is working hard (as if you have simple ISA bus lan adapters) but it still has unused capacity. A very simple minded test program, and I emphasize that it is not reproducing activities similar to your users, is Novell's old Perform3 which you may find on netlab2.usu.edu in directory apps (file prfm3.zip). Run this on all clients at the same time and note the results. For reference get file dsktests.txt from the same directory. Finally, there are a great many comments on performance in the list's FAQ (most easily seen by a web server to say netlab1.usu.edu and then to a mirror site closer to you). Reviewing the large amount of material can yield items of value. Joe D. ------------------------------ Date: Fri, 27 Sep 1996 17:50:38 -0600 From: Joe Doupnik Subject: Re: Netware 4.1 on a 33 MHZ server? John L. Stevens --- Computer Specialist wrote: >"John S. Chute" wrote: >>According to a Netware 4.1 book by QUE, it is either not wise or not >>possible to install Netware 4.1 on a server with less than a 50 MHz >>processor. We only have a 33 MHz processor. Do you know if this might have >>been a reason why I had so much difficulty installing 4.1? 
Or, if it can be >>installed what kind of performance hindrances and related problems might ensue? > >I had a moment of panic. But then I remembered, that the 486 - 33 MHz >machine on which I installed Netware 4.1 as a Web server is still >running after many months and the installation went very nicely. > >I guess my 33 MHz machine doesn't read the QUE books. ;) ----------- Stress test result, run an hour ago. Server = 486-33 EISA bus, NW 4.11/beta, 32MB (going to 64MB when SIMMs arrive), 6GB SCSI storage, NE-3200 Ethernet boards, Adaptec 2742A SCSI controller (all EISA bus master components). Clients = 28 Pentium 90's, 32MB, NE-2000 clone Ethernet boards. Test = type WIN (for Win3.1) simultaneously (+/- 1 sec) and watch. Result = no sweat. Server had one peak of 90% utilization, average was about 75%. All machines performed just fine. One subnet worked faster than the other (of two in the test). Things worked as quickly as the two Ethernet coax lines would allow. Inference = the server wasn't even working hard and I could have added another 20 clients to the mix if I had enough people to press the Enter keys. Observation = even with only 32MB in the server it performed well. Decent system bus and bus master boards count for a great deal, as does tightly written systems software. Next question please. Joe D. ------------------------------ Date: Wed, 2 Oct 96 08:45:08 -0700 From: Randy Grein To: "NetWare 4 list" Subject: Re: Pentium Pro & Netware >Is there a real benefit in using a Pentium Pro server over a Pentium >of the same clockspeed. Will Netware utilize the features of the >Pro? If by that you mean is the compiler optimized for Pentium Pro action, the answer for 4.10 is obviously no. The answer for 4.11 MAY be yes (I seem to remember a discussion about this, but none of the details). The larger question is do you need it? Compaq has a very nice white paper detailing stress tests of a single processor P133 Proliant 4500 supporting 2500 + users. (Netbench 3, 500 workstations simulating a load of 5-10 workstations each.) Processor utilization maxed out well before 500 workstations, but throughput continued to climb nicely. When all is said and done a modern server is a multi-processor machine. It's just that most of them aren't Intel X86. ------------------------------ Date: Tue, 01 Oct 1996 09:32:57 -0700 From: Darren Rogers To: netw4-l@bgu.edu Subject: Re: Pentium Pro & Netware -Reply Netware may not be optimized for the Pentium Pro, but the Pro is definitely optimized to handle 32bit code. It also does SMP better than a pentium. ------------------------------ Date: Tue, 8 Oct 1996 10:11:26 -700 From: Michael Sahs Subject: Re: Extremely slow 4.1 network Just a bit of info that might help Peter Nikolai with his slow server problems. I experienced a similar slowdown when I went from Netware 3.11 to 4.1 on a Compaq Proliant 1500. Tweaking cache buffers, file buffers, directory cache etc. only brought limited improvement. (There are a number of performance-related parameters which can and should be tweaked. Most of them are in well documented in the red books.) What I finally found was that I was actually double verifying every transaction. The Proliant had verification enabled in the setup and the 4.1 had verification enabled. When I disabled the 4.1 verification, I got an instant increase in performance speed. Some of the slow-down you are noticing will probably never go away. Your Public and System directories are huge now in comparison to 3.11. 
This may effect the speed of some commands. MAP and LOGIN don't seem to be as responsive as they were in 3.11. Also, I suspect that, because of the way the caching works, larger, more active networks perform better than smaller. This would be because the larger network is more likely to have the public and common stuff in cache. I may be way off with most of this rambling and would certainly welcome any tips and suggestions on enhancing performance on a 4.1 network. Another part of your setup you might want to look at is the compression. Is it enabled? What is the schedule? If your network swings into compression cycle during prime time, you might experience what appears to be a lock up or major slow down. ------------------------------ Date: Tue, 8 Oct 1996 13:29:58 -0600 From: Joe Doupnik Subject: Re: Perform3 >>I have been using Perform3 for some years now to test throughput on >>our LANs from server to workstation and have always been confident of >>the results that are produced. >> >>I run the tests usally for 30 seconds (possibly 60) using an IPX data >>length of 1436 bytes as this fits into exactly one ethernet frame. If >>I get 500Kbs + on shared 10Mb or 400Kbs + on switched 10Mb then I am >>reasonably happy that throughput for the users is acceptable. In a >>test environment I am hoping for 750Kbs + shared and 500Kbs + >>switched. >> >>The other day using a Compaq 5/60 running Perform3 (downloaded from >>Netlab2) to a 3.12 server (Compaq 5/75 patched to 312PT8) I let it run >>for 5 minutes without stepping up the data size. Performance dropped >>to 70Kbs!. I checked Client VLM versions and they are the latest Ver >>1.20B. I then ran it again to double check and put the HP Advisor on >>to see if I could see any collisions/runts etc. but all appeared >>normal. I then ran it from an HP Vectra XM4 5/120 running Netx to a HP >>Netserver LF 5/90 (also latest patches) and got exactly the same >>except this happens after 2 + minutes when using Netx. Both PC were >>running simple net.cfg files with only Link Driver setting using >>Ethernet_802.3 and a First Network Drive. Nothing fancy. During the >>tests both servers showed 5 - 10 % utilization and as it was in a test >>environment there was no other devices on the network the servers >>should have plenty of time to service the requests. Both PC were using >>the AMD PCNTNW v 3.10 driver and both servers were using the e100B.Lan >>driver for the Intel ether Pro/100 PCI NIC. Both servers had 96Mb >>memory and where running defualt settings. >> >>My question : >> >>Is this a Perform bug that occurs after 2- 5 minutes or is this some >>net.cfg setting/ NIC problem?. Has anyone seen this problem before? >>Its not critical but I am curious as to why this occurs. > > What COULD be happening is that on the short-cycle tests you are >writint to & reading from the server cache, which exists in RAM. During the >longer runs (like you would get with graphics files) you were overflowing >the cache and forcing the server to write directly to disk. > >> Both servers had 96Mb memory and where running defualt settings. > > This would explain the virtually identical performance: They are >overflowing at about the same time. As an experiment, throw in another 64 >MB into one of the servers and repeat the tests... I'll wager a six-pack >that the server with more RAM offers a faster sustained transfer rate to >the network! > > [Actually, I've seen this on AppleShare 4.2 servers with 21 MB >Photoshop files*... 
>And AppleShare Server 4.2 uses essentially the same caching scheme as NetWare 4.1. I use either LaCie TimeDrive or FWB's HDT Bench Test -- disk drive timing apps -- to measure sustained transfer rates to remote drives.]
>
> Dan Schwartz
-----------
I beg to differ on this one. Perform3 is an especially limited and simple minded program. When one does a test with a 4096 byte buffer it first creates a 4096 byte file and then repeatedly reads it over and over again, like this Lanalyzer summary snippet:

No.  Source        Destination   Layer  Size  Summary
388  EDU-USU-NETL  This_Worksta  ncp    1516  Burst Packet; 1432 bytes
389  EDU-USU-NETL  This_Worksta  ncp    1324  Burst Packet; 1240 bytes
390  This_Worksta  EDU-USU-NETL  ncp    0108  Req Burst Read 4096 bytes
391  EDU-USU-NETL  This_Worksta  ncp    1516  Burst Packet; 1432 bytes
392  EDU-USU-NETL  This_Worksta  ncp    1516  Burst Packet; 1432 bytes
393  EDU-USU-NETL  This_Worksta  ncp    1324  Burst Packet; 1240 bytes
394  This_Worksta  EDU-USU-NETL  ncp    0108  Req Burst Read 4096 bytes
395  EDU-USU-NETL  This_Worksta  ncp    1516  Burst Packet; 1432 bytes
396  EDU-USU-NETL  This_Worksta  ncp    1516  Burst Packet; 1432 bytes
397  EDU-USU-NETL  This_Worksta  ncp    1324  Burst Packet; 1240 bytes
398  This_Worksta  EDU-USU-NETL  ncp    0108  Req Burst Read 4096 bytes
399  EDU-USU-NETL  This_Worksta  ncp    1516  Burst Packet; 1432 bytes
400  EDU-USU-NETL  This_Worksta  ncp    1516  Burst Packet; 1432 bytes

While this is going on the server's disk light keeps flashing. Hint: caching isn't doing anything. Adding memory won't help. Look at the server to see. Guess which end has to work hard keeping up with the traffic: hint, the receiver/client. This also shows the critical path is the lan adapter and driver, and especially Packet Burst. Not-so-hot lan adapters will drop packets under load (client and/or server) and that shows up strongly in Pburst sequences. Please put a monitor on the wire to see. Recall, PBurst spaces out packets if necessary to fit conditions, and that is likely happening in your tests.

To make sense of tests it is often helpful to know what is interesting plus what the test is doing. You can see what Perform3 is doing easily enough, but the hard part is applying its results to the "interesting" portion of the problem, and often it doesn't apply well. A much more stressful simple minded test is iozone, which first writes and then reads files of various lengths up to many MB. Please see netlab2.usu.edu, cd apps, for Perform3, iozone, and especially file dsktests.txt.

Incisive stress tests tend to be just that, incisive, and hence pick out one of the many links in the chain. There aren't publicly available programs to do that job, alas. General tests tend to work pretty well, such as typing WIN or bringing up Excel and other real world activities. That's "interesting."
        Joe D.

P.S. Become TSR-free on the client, and that includes never running DOS PRINT (a horrid TSR which eats systems for lunch).

------------------------------

Date: Tue, 8 Oct 1996 21:12:55 -0600
From: Joe Doupnik
Subject: Re: Sustained transfer, part two

>OK, I want to now present two 2-node LANs, each with the same server and client hardware, and generic (NetWare, AppleShare, NT, Unix, etc...) server software.
>
>LAN 1: The client is writing a 9 MB file to the server, and the server has 96 MB of RAM.
>
>LAN 2: The client is writing a 200 MB file to the server, and the server has 16 MB of RAM.
>
>Now, which LAN will have the higher sustained transfer rate?!
----------
Uh, er, is it "It all depends" again? Thought so.
Disk system versus lan speed versus memory I presume. Something to keep in mind about caches. Caching goes to pieces when competing disjoint events occur, incoherence. The tests below had basically empty machines to work with and hence maximised caching efficiency to about as far as it can go. Active servers have fragmented caches. Here are some real numbers from a very simple program, iozone.exe, in directory apps on netlab2.usu.edu. It does not confine it self to one 4096 byte file or smaller as does perform3. Nevertheless, this is a highly stilted test unrepresentative of normal lan activity. There are four test machines, three NW servers and one UnixWare box just for grins. Client is a Pentium-100, VLMs, NE-2000, Packet Burst is active, DOS 6.22. The smaller NW server (netlab4) worked the hardest with the largest cpu utilization (very high) and dirty cache buffer backlog, netlab3 also worked nearly as hard, and netlab2, the slowest cpu but the best bus + lan adapter, hardly worked at all. Here we see combinations of i/o request sizes (very important), overall file length, machine buses, and lan adapter classes. The ISA bus lan adapters caused very high cpu utilization figures. Because the file is read back just after being written caching has the maximum possible effect/benefit. Had we waited before reading back, and had a normal mix of other activity the figures would have been different. The two most similar machines are the first two: netlab4 and netlab3. Basically the same o/s version, different memory capacity, similar bus but netlab4 has a slower disk system. The UnixWare machine largely cached everything so the bytes/sec figures are rather large, and it had no lan adapters to traverse. The bottom line seems to be, um, not all that much difference in the reported numbers, but the servers certainly loaded down differently. Try the experiment at your place for comparison's sake. Joe D. -------------- IOZONE: Performance Test of Sequential File I/O -- V1.15 (5/1/92) By Bill Norcott Operating System: MS-DOS IOZONE: auto-test mode Netlab4: 486-50 DX/2 ISA, 16MB, DEC Etherworks 3 (~NE2000), NetWare 4.10, Adaptec 1542B SCSI ISA busmaster, older 3.5" drive. 
About 1600 free cache buffers

MB  reclen  bytes/sec  bytes/sec
            written    read
 1     512     276669     273066
 1    1024     353055     335008
 1    2048     444311     442437
 1    4096     579323     635500
 1    8192     733269     733269
 2     512     265126     278506
 2    1024     341000     347210
 2    2048     443372     438734
 2    4096     637432     465000
 2    8192     794375     708497
 4     512     264124     281875
 4    1024     335008     351871
 4    2048     424095     454420
 4    4096     587437     592415
 4    8192     726915     574562
 8     512     268435     257556
 8    1024     329870     310344
 8    2048     416308     381821
 8    4096     567564     337841
 8    8192     690989     295373
16     512     266983     259990
16    1024     330195     317509
16    2048     409400     369298
16    4096     484190     379918
16    8192     480998     421962
Completed series of tests
-----------------------------------------
Netlab3: 486-66 DX/2 EISA, 32MB, NE-2000, NetWare 4.11, Adaptec 2742 SCSI, HP 2GB drive
About 4900 free cache buffers

MB  reclen  bytes/sec  bytes/sec
            written    read
 1     512     284939     280367
 1    1024     367921     353055
 1    2048     476625     455902
 1    4096     655360     659481
 1    8192     733269     794375
 2     512     283016     278506
 2    1024     363457     360335
 2    2048     454913     471270
 2    4096     670016     595781
 2    8192     828913     720670
 4     512     281875     283782
 4    1024     361889     360026
 4    2048     473932     466033
 4    4096     663655     631672
 4    8192     812849     755730
 8     512     282254     282349
 8    1024     359409     360180
 8    2048     469950     468375
 8    4096     652302     636464
 8    8192     741698     771011
16     512     280743     282587
16    1024     358104     361889
16    2048     462819     469161
16    4096     633820     638888
16    8192     756071     775287
Completed series of tests
-----------------------------------------
Netlab2: 486-33 EISA, 64MB, NE-3200, NetWare 3.12, Adaptec 2742 SCSI, Seagate Hawk 2GB drive
About 9000 free cache buffers

MB  reclen  bytes/sec  bytes/sec
            written    read
 1     512     247890     233016
 1    1024     353055     329740
 1    2048     489988     453929
 1    4096     635500     560735
 1    8192     800439     680893
 2     512     249660     227210
 2    1024     332353     331827
 2    2048     254817     454913
 2    4096     453929     569878
 2    8192     318232     680893
 4     512     235767     232758
 4    1024     339344     332090
 4    2048     274676     451972
 4    4096     522980     574562
 4    8192     611414     675411
 8     512     236765     233146
 8    1024     338659     332749
 8    2048     313592     455902
 8    4096     397752     571820
 8    8192     395689     682000
16     512     237536     233503
16    1024     339070     332353
16    2048     304818     455160
16    4096     408901     574168
16    8192     564699     677046
Completed series of tests
-----------------------------------------
Netlab1: Dual P-100 EISA, 48MB, SMC 8013 ~(NE2000), UnixWare 2.10, Adaptec 2742 SCSI, Seagate Hawk 2GB drive. It's Unix, it's all one big buffer anyway. See dsktests.txt in directory apps on netlab2.usu.edu for the same test with much less memory (see cliff edge).

MB  reclen  bytes/sec  bytes/sec
            written    read
 1     512    3615779    6553600
 1    1024    5518821    8738133
 1    2048    6990506   10485760
 1    4096    9532509   11650844
 1    8192   11650844   14979657
 2     512    2953735    6553599
 2    1024    4766254    8738133
 2    2048    6990506   11037642
 2    4096    9118052   13107199
 2    8192   11650844   14979657
 4     512    2853268    6452775
 4    1024    5178153    8924051
 4    2048    6990506   11037642
 4    4096    6990506   12710012
 4    8192   11335956   15534459
 8     512    3084047    6403517
 8    1024    4534382    8924051
 8    2048    6875908   11184810
 8    4096    8830113   12710012
 8    8192   11184810   15252014
16     512    2943371    6452775
16    1024    4415056    8924051
16    2048    5805265   11184810
16    4096    7049250   12614448
16    8192    9020008   15252014
Completed series of tests

---------

Date: Wed, 9 Oct 1996 12:33:49 -0600
From: Joe Doupnik
Subject: Re: Perform3

>I think there are a few more things here to consider:
>
>I believe Perform to be running from the client NIC to the Server NIC and then into server cache, testing NIC and LAN throughput. No disk reads or writes take place (only the output file if you specified a network drive) as the disk light during these tests is barely active.
> >Cache buffers throughout this test remained above 60% available so I >don't think more memory would make a difference. The LAN monitor >showed Pburst to be active and no drop in packets per second during >these tests was recorded. It appeared to have negotiated a window size >of 7 and was sending back an acknowledgement after every 7 packets. >Utilization also remained constant at 65% which indicated to me that >no drop in throuput was being recorded. If spacing was taking place >then wouldn't there be less packets on the LAN therefore utilization >and packets per second would decrease? > >I do agree however, that Perform is not a "real world test" but it >does test LAN bandwidth and NIC capacity taking the disk i/o bus out >of the equation and as such is a reasonabe benchmarking tool. > >If there is no bottle neck on the avaialble memory buffers and no >bottleneck on the disk I/o channels then why does Perform drop off >after 5 minutes +.? ------------- MONITOR samples once per second, which is slow. Cache buffers are misreported much of the time (we don't see the transient borrowing clearly at all). PBurst can begin to fail and request repeats. Those can't be seen by eye or MONITOR, but can with a wire snoop program. You can also turn off PBurst to double check. I don't know what boxes exist between your client and server, nor what the contending traffic looks like. My suggestion is put a snoop program on the wire and see what is happening. Crystal ball work from here tends to use a lot of listserver bandwidth, as for example last night's iozone test results. A snoop pgm atuned to IPX work is Novell's Lanalyzer for Windows, and there are others. Joe D. ------------------------------ Date: Sat, 12 Oct 96 22:31:32 -0700 From: Randy Grein To: "NetWare 4 list" Subject: Re: Packet Burst >Packet burst is built into NetWare 3.12 and will be used automatically when >an application makes a read or write request that requires more than one >data packet. Something else to consider: Packet burst will increase >performance providing there is sufficient badwidth. Low bandwidth can >decrease your network performance Packet burst is indeed built into 3.12 (it can be added to 3.11 no cost), but low bandwidth will NOT decrease performance - in fact, that's sort of what it's designed for. A low bandwidth WAN link, for example has fairly high latency - traditional IPX requires an ack packet for every transmitted packet. What's more they're sent as 512 byte packets, instead of the max possible (1514 for ethernet, 4k for token-ring, and whatever the WAN protocol supports). So your session is waiting for an ack after every packet, and they're divided up into 3 times as many. Enter burst mode. Typical busts are around 9-10 packets in size, with a single ack at the end - resulting in 1/30 the idle time waiting for acks during file transfer. Of course this isn't the whole story. Novell's had some issues with the sliding window and hogging the entire bandwidth, especially with the early releases, but an up to date server and workstation are very well behaved. In addition, this does nothing for database queries and client/server applications that typically respond in .5k chunks. In addition, a considerable amount of overhead it taken simply by file searches - it's amazing how much time, in fact. However, the benefits of packet burst are quite real - every measurement I've taken indicates a 50% increase in performance on a local segement, and 100% or more over a single WAN link. 
If you have to jump several routers it's even better - Novell claims up to 500%, and I tend to believe that. FWIW this is an excellent reason to NOT use the MS ipx/spx emulation if possible. Not only does it have problems on heavily loaded networks, it doesn't include burst mode or LIP. ------------------------------ Date: Mon, 14 Oct 1996 18:49:45 -0600 From: Joe Doupnik Subject: Re: LAN Performance and Packet Burst >I have a weird problem and I need any ideas/assistance from the list. >Scenario: I have a 4.1 server that is attached to a 100BaseT network. >If I perform a benchmark test with the old PERFORM3 program I see >strange numbers. If all works well the performance will run as high as >6200 KBps. Unfortunately most of the time the performance runs >extremely poor at 50-200 KBps. I have tried varies clients as well as >different Ethernet cards in the server. I finally was able to >troubleshoot it down to a packet burst related issue. If I turn packet >burst on at the client the performance is mixed like I stated above. >If I turn it off, the performance is same evey time I run the >benchmark at 4000 KBps. I have checked the NCP stats screen on the >console and everything looks fine, even when it is giving poor >performance. I have the latest patches and all appears fine on the >server regardless of the performance level at a given time. I have >even stuck a sniffer on the line and all I can see is an large delay >between packets when the poor performance is taking place. Any ideas? ----------- There is a known but not yet openly documented problem with PBurst timing on at least the server side under these conditions. One suggestion is to load up the very latest in lan drivers on both ends of the connection, plus the usual patches and updates. Very poor performance, your 50-200Kbps figures, are indicative of serious troubles in the driver and maybe even boards (can't stand the fast back to back packet burst). But we can't say for sure without more snooping on the wire in detail. When packets are lost this way PBurst can form an opinion that it is is own fault and back off. Please do remember that Perform3 is simple indeed, and more importantly that caching in the client will make a hash of timing values. Perform3 uses only a 4096 BYTE file, max, which is so easy to cache. Finally, don't let MONITOR distort your measurements: never snoop on an individual connection because updating that screen can easily half or worse your throughput. Joe D. ------------------------------ Date: Tue, 15 Oct 1996 20:19:47 -0600 From: Joe Doupnik Subject: Re: block size and its ramifications >I am curious to hear thoughts on block size, compression (on or off) and >sub-allocation settings for volumes, including the sys volume. - I have >recently inherited a WAN consisting of five 4.1 servers and one 3.11 >server. All but one of the five 4.1 servers have their volumes set to 4k >blocks, no compression and no sub-allocation. However one server has all >volumes set to 64k blocks, with compression turned on and sub-allocation >turned on. It appears as if whomever installed this server did not know >what he/she was doing - I can find no reason to have such large block >sizes, but would like some feed back before I go through the hassle of >destroying the server and tampering with the NDS. --------- You have made a smart decision to dig deeper before tweaking. 
In this case I recommend reading the NW 4 manuals on each item above to learn how NW 4 does them (differently than you would think from NW 3 experience). Then look at file distributions and usages, think, think some more, and only then consider optimizing things. Keep in mind that changing block sizes means rebuilding the volume from scratch, and that's serious with NDS. In any case, get confirmed-good backups before messing about, and remember you can't restore an NDS machine from tapes without a bunch of planning and reading (see dsmaint). You will find that many NW 4 managers turn on subblock allocation, use 64KB block sizes, and turn off disk compression. This works fine in most environments, but not in all.
        Joe D.

------------------------------

Date: Sat, 19 Oct 1996 20:14:02 +0100
From: Richard Letts
Subject: Re: IP tunnel / IPX WAN links

>>In the opinion of people in this list, what is the preferred method of interconnecting remote servers? At the moment, we have an all-bridged network, but are planning to go to a routed one later in the year (after the switched hubs).
>>
>>What are the advantages/disadvantages of using IP tunneling?
>>What are the advantages/disadvantages of using IPX over WAN links?
>>(assuming the bridges will bridge IPX)
>
>I'd use the Netware/IP route. IPX is too bulky on the line.

This is where I write "what a load of drivel". NetWare/IP encapsulates the NCP packets inside a UDP datagram, so the same amount of data flows up and down the link as would with straight IPX. You might get a performance benefit if you're using SAP/RIP routing, but I'd recommend the IPX router upgrade and running NLSP, as this is slightly less chatty. The big problem with NCP over a WAN link is the round-trip time, but then you get this with all file-serving protocols (SMB, NFS).

------------------------------

Date: Mon, 21 Oct 1996 08:53:02 +0200
From: Henno Keers
Subject: Re: IP tunnel / IPX WAN links

>In the opinion of people in this list, what is the preferred method of interconnecting remote servers? At the moment, we have an all-bridged network, but are planning to go to a routed one later in the year (after the switched hubs).

I would route IPX instead of IP; see arguments below:

>What are the advantages/disadvantages of using IP tunneling?

Advantage:
- Easy on TCP/IP orientated network managers
- Only one type of configuration on your routers
Disadvantage:
- You cannot control IPX traffic like RIP/SAP since they are wrapped in IP packets.

>What are the advantages/disadvantages of using IPX over WAN links?

Advantage:
- Tight control of traffic over WAN links
- You must have a properly defined IPX naming and numbering standard
Disadvantage:
- More complicated router configurations
- You must have a properly defined IPX naming and numbering standard

>(assuming the bridges will bridge IPX)

A bridge is a MAC level device; it does not see the difference between IP and IPX, hence they are far less suitable on WAN links.

------------------------------

Date: Wed, 23 Oct 1996 20:48:06 -0600
From: Joe Doupnik
Subject: Packets/sec story, with numbers

I just got back to my office tonight after being called with the classical message "The network is slow!" in one of my student labs. What happened is sort of interesting. The NW 4.11 file server (a 486-33 EISA bus machine) registered 100% cpu utilization, dstrace said not fully synchronized, users were grumbling audibly. Monitor also said a couple of lan adapters had very high "No ECB available" counts and climbing quickly.
The packets sent and received numbers were also climbing rapidly. All were in the many millions. Normally there are no "No ECB available" counts. What the heck was going on? I took another sip of coffee and thought and poked the server keyboard. No, the wiring (coax) was just fine. Plenty of server memory. The directly attached printers were going fine, but user apps were extremely sluggish. The server did look ok except for the ECB loss rate and the cpu utilization value. Oh. There is someone running Lanalyzer for Windows, and Wow! Look at the packets per second dial over in the red zone! 3000 pkts/sec!!! Yikes, no wonder things were slow with that traffic rate as competition. Capture a few thousand packets (while grumbles increase in volume) and they were tinygrams, 64B guys, zillions of them, TCP/IP Telnet, going from one wire in the room through the server to another wire in the room. Ah ha! I know what is happening. It's my grad Computer Networks class doing their homework assignment. That was to measure throughput versus packet size (TCP MSS) for various situations, with MS-DOS Kermit acting as the TCP app at both ends of the connection. Sure enough, someone had tried an MSS (the TCP payload) of 16 bytes, generating tinygrams. ECB's were not available because the packet rate was high enough to exceed the rate at which the server could supply buffers to the NE-3200 boards, and some packets were consequently lost. (No problem, TCP repeats them and thus keeps adding load). Once we stopped the file transfer test everything was perfectly normal again. The overhead handling tinygrams with a bus master board is often greater than with a simpler port i/o or shared memory board because of the busy work setting up a block transfer. Hence a simpler board, say an NE-2000 flavor, would have done less work moving tinygrams than the better board, and the other way around with larger pkts. A general rule of thumb on 10Mbps Ethernet is 1000 packets per second is a hefty load on machines. Here we were at triple that rate, and in this case the server was acting as an IP router rather than as a file server so we did not have delays while the server's disk drives were accessed. MS-DOS Kermit has rather efficient Kermit and TCP/IP protocol stack code, in this case altogether too efficient. By the way, the throughput was only 684 file data bytes/sec, compared to about 160KB/sec with normal sized IP packets. All that overhead of headers with just a few data bytes, plus repeats for lost frames. If the server were less strong, or some other item in the server were consuming lots of resources (say running a tape drive for backups), then the same loss of ECBs would arise simply because the server could not attend to every arriving packet. In this case the server was healthy but the packet rate was way out of bounds. And we see the server is not an IP router which routes packets "at wire speed" comfortably. It certainly can be handy to be the system manager as well as the course instructor because I didn't have to blame anyone else (this time). Joe D. ------------------------------ Date: Mon, 11 Nov 1996 10:23:13 -0600 From: Joe Doupnik Subject: Re: What is an acceptable proportion of collisions on ethernet? >We are running Novell Netware v3.12 over Ethernet 802.3 (CAT5 and ThickNet >cable). > >I have been looking at the LAN/WAN Information in Monitor to get some idea of >the proportion of packets sent by our file servers which are involved in some >sort of collision. 
>
>For example, for one of our servers running an NE3210 card I get
>
>Generic Statistics:
>  Total Packets Sent:                278,466,792
>  Total Packets Received:            285,057,031
>  No ECB Available Count:            0
>  Send OK Single Collision Count:    3,481,217
>  Send OK Multiple Collision Count:  11,945,893
>  Send OK But Deferred:              0
>  Send Abort from Late Collision:    0
>  Send Abort from Excess Collisions: 0
>
>Proportion of packets affected by collisions: 5.5%
>(Send OK single + Send OK multiple + Send OK deferred + Send Aborts) / total sent
>
>This is the top of the range; another server located elsewhere chugs along at 0.03%.
>
>Ideally I'd like to see 0 collisions.
>
>Any ideas on what proportion is acceptable?
----------------
Don't tilt at windmills. Forget about collision counts until the excess collision count is very large. A collision is Ethernet's access control mechanism, nothing more than that. The only way you will see a zero collision count is if there is only one station on the wire, one. Collisions are not bad; they merely reflect activity. Please take a course or read the IEEE 802.3 specs, and/or get onto NEWS and follow along comp.dcom.lans.ethernet, where this urban legend stuff comes up time after time.
        Joe D.

[A short worked calculation based on these counters follows the next message.]

------------------------------

Date: Wed, 13 Nov 1996 15:34:15 +1000
From: Greg J Priestley
Subject: Re: Netware 4.1 32 Bit Native?

>What an interesting thread. Almost no attention to detail, and hence nothing much has been said in a great many words.
>...
>Comments on "optimized for Pentium Pro" are about as interesting. What vendor is mass marketing a product which can run only that particular cpu and not others? Does it say so in the product specs? What might one mean by "optimized" and could you care less?
> Joe D.

The Pentium Pro uses a highly pipelined architecture with all these whiz-bang features like out-of-order execution, predictive branching, register renaming and the like. Check out http://www.intel.com/procs/ppro/info/index.htm for all the low-down information on what all of this is. By re-ordering code execution, it is possible for the Pentium Pro to run quicker than if the code is not optimised. Such re-ordering of code can help the accuracy of "predictive" branching, prevent stalls in the pipeline (that is, the processor being momentarily left with nothing to do because it has mispredicted what is happening), etc.

In a nutshell, optimised for Pentium Pro means just that. The code will run quicker on a Pentium Pro than the same code which had not been optimised. This allows them to squeeze a few extra horsepower out of the chip. Further, optimised for Pentium Pro need not necessarily mean that it cannot run on earlier processors. It just means the instructions will allow the Pentium Pro to run more efficiently. Note that the Pentium Pro does have an extra instruction, which means that if this instruction were used in software then it wouldn't be downwardly compatible and hence would only run on a Pentium Pro - but then they would describe it as Pentium Pro only/above or something similar.

Ignoring all of the above, the important thing is real world performance for your application. For a look at the various performance indexes, I suggest you look at http://pentium.intel.com/procs/perf/highend/index.htm
Why are there a number of indexes? To reflect different usage of the processor - say Windows 95 vs Windows NT vs whatever.
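[Editor's note: the following is a minimal sketch, not part of the original thread, of the collision-proportion arithmetic quoted above. The figures come from the Generic Statistics in that message; the point, per Joe D.'s reply, is that the only counters really worth watching are the abort counters, which here are zero.]

    # Collision-proportion arithmetic from the MONITOR Generic Statistics above.
    total_sent       = 278_466_792
    single_collision = 3_481_217     # Send OK Single Collision Count
    multi_collision  = 11_945_893    # Send OK Multiple Collision Count
    deferred         = 0             # Send OK But Deferred
    late_aborts      = 0             # Send Abort from Late Collision
    excess_aborts    = 0             # Send Abort from Excess Collisions

    affected = single_collision + multi_collision + deferred + late_aborts + excess_aborts
    print(f"Proportion of sends touched by a collision: {affected / total_sent:.1%}")
    # -> about 5.5%, which is harmless: these frames still went out OK.

    # The counters that actually signal trouble are the aborts.
    if late_aborts or excess_aborts:
        print("Look for a broken wiring plant or failing boards.")
    else:
        print("No late or excess collision aborts: nothing to chase here.")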
------------------------------ Date: Wed, 13 Nov 1996 18:16:13 +0000 From: Richard Letts Subject: Re: Maximum Service Processes >We have a problem with a number of our NetWare 4.1 servers with Palindrome >Network Archivist failing to run overnight. This can happen when the >number of Service Processes has reached the same as the Maximum Number. >The maximum is set to 40 which according to the Red Books is the default. >Our experts say that this should not be causing the fault but if the server >is downed and brought up again PNA will run perfectly. This seems to be >only affecting PNA as users have no problems during the day. > >1. Has anyone else experienced this sort of problem, and > >2. What are your thoughts regarding increasing the Maximum Processes >from the default. This may be true, as a service process can sometimes take a long time to complete, and if you're in a large network of servers with a large number of replicas then you'll start to use alot of service processes. On superservers I'd reccomend a maximum of ~100 if you've LOTS of extra memory then more may be worthwhile, however you'll need to ensure yuou aren't bottlenecking on disk IO. (if you've RAID controllers, are all of the disks in the array?) ------------------------------ Date: Tue, 3 Dec 1996 10:55:48 -0600 From: Joe Doupnik Subject: Re: Pburst at 64kbps >I have been testing a 64kbps WAN link by copying several files from a >workstation on one end of the link to a server on the other end. > >I noticed that pburst did not fare well in this situation (many >missing pieces that had to be resent). In fact, at one point, it got >into a loop where one missing piece was being continually re-requested >and dropped, hanging the copy operation. > >When I disable pburst, the files appear to copy fine, albiet slowly. > >Why does pburst do so poorly over this link while a non-burst copy >(apparently) does fine? ----------- I should bring this one to class. It is a classical situation of inadequate end to end flow control, with a limited WAN link in the middle, and the inherent difficulty of end nodes negotiating a suitably small burst length because they have no direct knowlege of the weak link in the middle. If your WAN link is a couple of comms black boxes plus Telco wire then look at the manuals on those boxes and find their buffer capacity. Set Pburst to stay within that buffer capacity (after allowance for competing traffic). The controls are lines Pburst read window= and Pburst write window= in net.cfg. The number is garbled in Novell docs and I interpret it to be the number of full length frames (576 bytes each when crossing an IPX router), though it can also be interpreted as the number of kilobytes in a burst. There is a third parameter in net.cfg, lip start size= which tells Pburst the longest frame length to try across the link; units is bytes. I might add that TCP does handle this situation by design. It's called congestion control and avoidance, and congestion here is too many packets queued for transmission across the slow link. It observes the link by testing it and then adapting to the link capacity dynamically as traffic load changes; there is no negotiation between clueless end nodes. Were I designing the comms black boxes I'd provide "back pressure" on local senders when the transmission buffer fills, by causing a fake collision or a full duplex Ethernet busy signal (preamble bits going on and on as per Richard Steven's suggestion). 
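[Editor's note: a minimal back-of-the-envelope sketch, not from the original thread, of the window-sizing advice above. The 8 KB buffer figure is an invented example; check your own link equipment's manual. The NET.CFG parameter names are the ones quoted in the message above.]

    # Sizing the Packet Burst window to stay inside a WAN box's buffer,
    # following the advice above. All figures here are invented examples.
    link_buffer_bytes = 8192      # hypothetical buffer capacity of the comms box
    routed_frame_size = 576       # full-length frame when crossing an IPX router
    headroom          = 0.5       # leave room for competing traffic

    window_frames = int(link_buffer_bytes * headroom) // routed_frame_size

    # NET.CFG lines to try (the units are ambiguous in the docs:
    # frames per burst or kilobytes per burst; see the discussion above):
    print(f"pburst read window = {window_frames}")
    print(f"pburst write window = {window_frames}")
    print(f"lip start size = {routed_frame_size}")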
Etherswitches have exactly the same problem, but at higher speeds, when many inputs compete for time on the same output port. All in all, some very basic networking concepts are involved here. We will see them again on the final, so now is a good time to ask questions... Joe D. ------------------------------ Date: Thu, 12 Dec 1996 08:58:45 +0000 From: Richard Letts Subject: Re: NetWare Storage Methods >>NetWare actually benefits in some ways from physical fragmentation. > >Huh?! Could you please elaborate a bit? Does it have to do with the >caching algorithm used? No, to do with the fact that the fileserver normally has a whole bunch of users reading and writing data to the fileserver. It would be difficult for the server to ensure that all of the files were contiguous. A simple test: shuffle a pack of playing cards and deal them out into four piles, one for each suit. Did each pile grow at about the same rate? place the four piles one on top of each other without shuffling and repeat the test. Did each pile grow at the same rate this time? The same is true with NetWare, because the data is fragmented (shuffled) across the disk it ensures that users get 'fair' share of the resources of the server. the 'dealing' mechanism in my anaolgy above is the elevator seeking, where Netware re-orders the disk requests so it reads data from the disk as the head moves across the surface of the disk. Other mechanisms such as the turbo-fat, extensive directory caching, etc means Netware knows where all the parts of the file are on the disk without having to check first. (It knows where all the aces are) If anyone is interested in Operating System design, I found Andrew Tanenbaum's "Operating System Design and Implementation" a good book on the fundamental services provided by operating systems. [Pub. Prentice Hall, available in a red student edition] ------------------------------ Date: Sat, 11 Jan 97 12:58:34 CDT From: John_Cochran@odp.tamu.edu To: netw4-l@bgu.edu, kknoll@cpuoptions.com Subject: Success with Netware and Digital! I hope no one minds, but I wanted to share this information. I think it might come in handy if anyone is ever looking to build a rather large Netware system. We have recently completed the installation of a new Netware system which I think is probably unique in its configuration. Below are the specs for the hardware which were researched and decided to use. First a little history. This particular server has been running Netware 3.12 for about a year and a half. It's original configuration included several small Micropolis RAID arrays for a total of 16gigs of drive space. This server is mainly used by our publications group which work with some rather large files. These files have a lifetime of anywhere from 6 months to a couple of years. Basically they have to stay on-line for modifications until the book is published. We quickly outgrew the 16gigs of drive space. Having many problems with backup times, printing, etc... we were forced to come up with a solution. Quite often these users will send 100-300meg print jobs. In Netware 3.12, that really blows the idea of only having a 500meg or even 1gig SYS volume. Backup time on Netware 3.12 with ArcServe 5.01g for this 16gigs to a Quantum DLT4000 drive was taking about 12-16 hours (for full backups). Of course, we have MAC and NFS name spaces to deal with since a lot of these volumes are exported to UNIX and MAC environments. System reboots would take ~30-45 minutes for the server to come back on-line and mount all volumes. 
Without upgrading the CPU, memory, or tape devices, we have seen a substantial improvement in backup time by upgrading to Netware 4.1. I have upgraded this server to Netware 4.1 and ArcServe 6.0, and added a 100gig RAID array. Well actually, with the hot spares and RAID 5, we ended up with ~76gigs of actual drive space on the new array. We also kept around one of the fast wide Micropolis RAID units (8gig) for use as the SYS volume.

After the upgrade, I was able to back up 18gigs of data in only 3 hours and 20 minutes! (A short throughput calculation follows near the end of this message.) Guess there is a lot to be said for Netware 4.1 and ArcServe 6.0 versus Netware 3.12 and ArcServe 5.01g. This is much better than even I had hoped for! The server can boot in about 10-15 minutes now even with the additional drive space. Netware 4.1 does much better caching of disk/directory data than Netware 3.12. As I said, no hardware was changed other than the addition of this new RAID array.

The only piece of the puzzle left is the addition of a new DLT7000 tape unit (which I should have in my hands sometime in February). This unit is a fast wide SCSI device versus the DLT4000, which is only fast SCSI. The DLT7000 is 4 times faster than the DLT4000 as far as raw throughput goes and can hold 35gigs of noncompressed data and up to 70gigs of compressed data.

I am also working on completing the installation of a new ethernet switch from Cabletron. The switch has 24 switched 10 ports and 12 switched 100Base-TX ports. This switch will collapse our backbone (thicknet ethernet) onto the switched 10 ports and provide 100megabit to my server farm.

So, here is the equipment that we have. Most of this was purchased in September of '95. Only the new RAID controller and drives are recent.

Digital Prioris HX 590
- Pentium 90 server with 6 EISA and 6 PCI slots
- 512megs of RAM
- 3 Adaptec 2940UW SCSI controllers
- 1 Adaptec 2740 SCSI controller
- 1 3Com Etherlink III NIC (PCI)
- Running Netware 4.1 with ArcServe 6.0

Micropolis Gandiva RAID unit
- 3 4.3gig drives on fast and wide SCSI
- 16megs RAM cache
- Provides 8gigs of space for SYS volume
- RAID 5

CMD CRD-5500 RAID controller
- 64megs RAM cache
- Provides 4 SE Fast Wide SCSI drive channels (has 8 ports, 4 in use)
- Provides 1 SE Fast Wide SCSI host channel
- Redundant A/C power inputs

DEC SW800 cabinet
- Digital StorageWorks storage cabinet
- Redundant 208V three-phase A/C power supplies
- Redundant fans on separate power sources
- Contains 4 StorageWorks shelves with room for more

DEC BA350 shelf
- StorageWorks shelf for SCSI components
- Each shelf has redundant power supplies, each one supplied by a separate 208V power source
- Each shelf has 6 4.3gig Fast Wide SCSI drives

Quantum DLT4000 tape unit
- 20/40gig DLT drive
- On its own Adaptec SCSI adapter

The drive array's basic configuration is set up as follows: The CMD controller is set up as three separate RAID 5 arrays. Each array is mapped to a separate SCSI LUN on the host SCSI channel. There are also three spare drives; two are hot spares and the third is a warm spare. The spares act as spares for the entire RAID system, capable of being a spare to any of the three arrays. Each array is split into 8gig volumes for use by Netware. Using the SW800 cabinet and the CMD RAID controller, we have plenty of room to grow. We can easily add another 100-200gigs of drive space when we need it.

My thanks to CPU Options and specifically Karry Knoll, who helped me extensively with the system configuration for the new RAID array.
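[Editor's note: a small arithmetic check, not from the original message, turning the backup times quoted above (about 16 gigs in roughly 12-16 hours before the upgrade, 18 gigs in 3 hours 20 minutes after) into MB/sec on the same DLT4000 drive.]

    # Backup throughput implied by the figures in the message above.
    MB_PER_GIG = 1024

    old_rate = 16 * MB_PER_GIG / (14 * 3600)           # ~16 gigs in 12-16 hours (call it 14)
    new_rate = 18 * MB_PER_GIG / (3 * 3600 + 20 * 60)  # 18 gigs in 3 hours 20 minutes

    print(f"NW 3.12 + ArcServe 5.01g: about {old_rate:.2f} MB/sec")
    print(f"NW 4.1  + ArcServe 6.0:   about {new_rate:.2f} MB/sec")
    print(f"Roughly a {new_rate / old_rate:.0f}x improvement with the same tape drive")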
If any of you have any technical questions about this setup, feel free to
email me.

------------------------------

Date: Fri, 17 Jan 1997 13:44:06 -0600
From: "Mike Avery"
To: netw4-l@bgu.edu
Subject: Re: Client 32 and Windows NT

> I have one building that connects to my server, and the building is
> having traffic problems. Most of the users there use either VLMS
> (DOS/Windows) or Client32 (win95) to connect to my server, so I was
> able to increase the IPX/SPX counts to suitable numbers, and they
> are all able to connect.

Shared LAN Cache lets users cache data on their local drives. The vendor
claims you can see throughput increases of up to 300%. When a user accesses a
program, SLC checks the date stamps on the files, makes sure the cached
version is current, and then uses the local version. It runs about $40.00 a
node and is the best performance boost I've seen.

Sometimes moving to a faster topology just moves the bottleneck to the server.
SLC doesn't; it uses local resources. The product is a read-only cache and
maintains concurrency, so it's a safe product, and it's easy to install.

http://www.lancache.com

------------------------------

Date: Sun, 26 Jan 1997 12:50:06 +1300
From: "Baird, John"
Subject: Re: Out of control ECB's

>I have seen several conversations about ECB problems. Seems to me that the
>problems occur either from too low packet receive buffers or perhaps a bad
>NIC, CABLE, etc.
>
>I have a fully patched 4.1 HP 4/66 server that runs with no ECB counts. All
>of a sudden, the server hangs at 100% and the ECBs climb out of sight. I
>have set the packet receive buffers at: (Minimum = 500 and Maximum = 1500).
>Upgrade Low Priority Threads = ON. I am also running NFS 2.1 with the 1.99
>patch installed.
>
>We generally have around 190 people logged in with 3200 files open. The
>server is working pretty hard at an average 80% utilization.
>
>I don't see any indications of problems in the error log.
>I suspect that the NIC may be failing, but when I down the server and
>restart, she runs fine.

I suspect this might be a simple case of the server being unable to handle the
load. Are your packet receive buffers at 1500 when ECBs skyrocket? Are your
file service processes at the maximum when ECBs skyrocket? What is the adaptor
queue depth?

Packet receive buffers are used for temporary storage of both incoming and
outgoing packets. File service requests are moved from packet receive buffers
to file service processes (which are buffers for handling file service
requests). If you have a disk channel bottleneck, FSPs can hit the max,
incoming requests remain lurking in PRBs for lack of FSPs, and so PRBs can hit
the max shortly after you run out of FSPs.

The adaptor queue depth indicates the number of packets waiting to be
transmitted, and a high number here indicates high network load or possible
network problems. I assume these packets are occupying PRBs (Joe D can correct
me if I'm wrong here). A high adaptor queue depth can occur with a busy
network and lots of incoming packet burst read requests, where a single
incoming request can result in 6-10 (maybe more) large outgoing packets.

Check out these numbers and you should get a better feel for what is
happening.

------------------------------

Date: Sat, 25 Jan 1997 17:10:12 -0600
From: Joe Doupnik
Subject: Re: Out of control ECB's

[Floyd: Above message reply snipped]
----------
That's all correct, John. Well stated.
Many lan adapters have room for one or two transmitted packets in the memory
on-board the adapter itself.
The rest are queued within NetWare. My undergrad class has an assignment to
tinker with the division of lan adapter memory, changing the proportion
assigned to reception versus transmission by revising the ODI MLID/board
driver, and to note the differences in performance when sending and receiving
files.
It is not uncommon to have a failing disk system cause growth of queues and
the above symptoms. So can a failing lan wiring plant, a failing lan adapter,
IRQ/port conflicts, and so on.
Joe D.

------------------------------

Date: Fri, 31 Jan 1997 12:17:30 -0600
From: Joe Doupnik
Subject: Re: Packets/Segment

>Given a situation where there are 60 nodes and 8 servers on a network, what
>is a "normal" number of packets on a segment over 60 seconds?
>
>The reason why I ask is because Managewise keeps generating an alarm that the
>packets have exceeded 80000 over 60 seconds for 1 segment. Now we do have
>graphics workstations sharing graphics files over the network, so I would
>expect the traffic to be pretty high. But before I go adjusting my alarm,
>I'd like to better evaluate it.
-----------
Recall my "rule of thumb" that a busy Ethernet carries about 1000 pkts/sec?
(Your 80,000 packets over 60 seconds works out to about 1,333 pkts/sec.) It's
not the wire that is of concern, because it can carry much more; it's the
ability of stations to deal with large streams of closely spaced packets. In
short, it's the strength of the stations which counts more than how warm the
wire gets. Just turn off most alarms in LZFW/Managewise.
Joe D.

------------------------------

Date: Mon, 3 Feb 1997 17:29:59 +0000
From: Richard Letts
Subject: Re: Changing Volume Block Sizes

>On dirty cache buffers. Those are buffers queued for disk writing,
>naturally. NW uses a write-behind strategy through those queues (hope the
>power does not fail in the middle). The slower the disk system and/or the
>more items in the queue the longer it takes to drain the queue. Add to this
>reading back from disk for validity and the time grows. Such queues are
>perhaps THE major use of memory by a client, included under the average
>of 400KB or less server memory per client login. So far I haven't seen
>queues become large because of block suballocation; if anything
>suballocation speeds the process.
> Joe D.

There are two queues associated with disk writes: the dirty buffer queue and
the disk write queue. NetWare only allows a certain number of write requests
to enter the write queue at a time. If you have a fast CPU/disk/disk
controller you might like to experiment with allowing more entries in the
write queue. With hardware RAID controllers you may see a benefit from this,
as the RAID controller can re-order the requests internally. If you're running
on a standard IDE interface then you might not buy very much with this
strategy.

If the dirty buffer pool is full and the server needs to read a block, it has
to wait for a request to leave the write queue, etc. With NW3.x this could
lead to undesirable behaviour: if all buffers were dirty, the server would
stop responding to requests until the dirty buffer queue was EMPTY,
effectively stopping the server until all data was written.

---------

Date: Tue, 4 Feb 1997 11:20:57 +1300
From: "Baird, John"
Subject: Re: Changing Volume Block Sizes

> The last time we went over this territory the question arose:
>do files always start fresh blocks. I think the answer was: yes but only
>at file creation time, but later the o/s will move around the material
>to fill up partial blocks. Some clarification on both halves of this
>operation would be appreciated.

That's how I understand it.
The number of files on a volume with suballocation enabled can exceed the
number of blocks. It makes sense that a newly created file be allocated a new
block, as its (future) size at that time is unknown, and redoing the
suballocation of the file as it is expanded prior to closure would be a waste
of effort. A file can be created but not written to if the user's free space,
as determined by a directory quota, is less than the block size, which
indicates that a whole block is allocated when the file is first written to.

---------

Date: Tue, 4 Feb 1997 08:59:14 +1000
From: Mark Cramer
Subject: Re: Changing Volume Block Sizes

Files don't always start at the beginning of a block. They'll start out that
way, but once you go under 1000 free blocks (64M at a 64K block size) NW will
start shuffling things around, and you can have files NOT starting at the
beginning of a block; then you can add more files to the volume. A 10M volume
(160 64K blocks) can hold 297 33K files, at which point it's full -- I've done
it as a test. (Without suballocation each 33K file would tie up a whole 64K
block, so only 160 would fit; with suballocation the 297 files use
297 x 33K, or roughly 9.6M, of the 10M volume.)

------------------------------

Date: Sun, 16 Feb 1997 13:06:46 -0600
From: Mike Connell
To: netw4-l@ecnet.net
Subject: Re: MONITOR, LAN Statistics

>In monitoring the LAN statistics in MONITOR for a Compaq EtherExpress
>EISA 10MB NIC, I've noticed that the "SEND OK BUT DEFERRED" counter is
>set to 320.

Here's an excerpt from the readme documentation:

Send OK But Deferred
The number of packets deferred before transmission: This counter contains the
number of packets whose transmission was delayed on the first attempt because
the medium was busy. Packets involved in any collisions are not counted.
Frames that wait before transmission are counted. This statistic will be
incremented often during normal operation on a busy network.

Deferred transmissions occur when the network is extremely busy -- so busy
that the NIC did not try to transmit. High counts of multiple collisions and
excessive collisions also occur. Deferred transmissions indicate that this
segment of the LAN is overcrowded. Reduce the traffic by reorganizing the LAN.
For example, if you have 100 stations on one Ethernet bus, break it into two
Ethernet segments by adding a NIC to your server. In this way you can balance
the load by putting 50 stations on one segment and 50 on the other. If a few
isolated stations create the traffic, put them on a separate segment.

------------------------------

Date: Tue, 18 Feb 1997 02:50:03 -0600
From: "Kevin McIntosh"
Subject: Re: Retain Trustee rights

If you're using a SCSI controller and devices that ALL support parity, make
sure it's on for each device. Then you can set "enable read after write
verify" to OFF, because the hardware does it. This saves CPU cycles, big-time.

Also, watch the "LRU Sitting Time" in MONITOR, under Cache Utilization. This
is your file cache, and it works on a FIFO basis. I typically record the time
of the oldest file in memory every hour or so for about a week. If I see it
routinely around 30 minutes or below, the server needs more memory. This helps
performance because disk reads are in milliseconds and memory reads are in
nanoseconds (i.e., more files cached in memory = fewer HDD reads). Forget the
formulas we used with v3.xx. The "LRU Sitting Time" is a much better indicator
and takes into account real-life server usage.

There are some "rules of thumb" for packet receive buffers and Directory Cache
Buffers. I'll look through my "cheat sheets" and send you the info.
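A minimal sketch of how the settings Kevin mentions (plus the packet receive
buffer starting points Randy Grein gives in the next message) might be
expressed as server SET commands. The parameter names are the ones commonly
documented for NetWare 3.x/4.x, the values are only starting points to be
tested, and depending on the version some of these belong in STARTUP.NCF
rather than AUTOEXEC.NCF -- confirm spellings and placement against the SET
documentation on your own server:

    SET ENABLE DISK READ AFTER WRITE VERIFY = OFF
    SET MINIMUM PACKET RECEIVE BUFFERS = 100
    SET MAXIMUM PACKET RECEIVE BUFFERS = 200

Turn off the software verify only if every drive and controller in the chain
really does its own parity/verification, as Kevin says; otherwise leave it ON.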
------------------------------

Date: Wed, 19 Feb 97 06:42:09 -0800
From: Randy Grein
Subject: Re: Retain Trustee rights

>There are some "rules of thumb" for packet receive buffers and Directory
>Cache Buffers. I'll look through my "Cheat-sheets" and send you the info.

There should be sufficient packet receive buffers such that the No ECB
Available counter isn't incrementing - generally start with a minimum of 100,
max of 200. And you're right about the directory cache buffers; it makes a
HUGE difference. I've found that setting a minimum of 800, a max of 1000+, and
a reuse time of at least 30 seconds really helps performance. Also, bump up
the simultaneous directory cache buffer writes and simultaneous disk cache
writes to about 40 and 500, respectively.

------------------------------

Date: Fri, 21 Feb 97 16:22:13
From: supervis@gtlcmh.usa.com ("Bodjack, Bruce")
To: netw4-l@bgu.edu
Subject: 4.11 Serv. Parameters

REPORT: Improvement noticed after optimizing server parameters on a 4.11
server. The GOAL of reducing my CPU utilization has been achieved by resetting
a couple of parameters suggested by Randy Grein.

BACKGROUND: My server is a 3-year-old P60 with 64 megs of RAM running two
caching controllers (a 1gig mirror and a 6gig RAID5) and one 100TX LAN
segment. The duty of this server is user files and shared, metered Windows 3.1
applications, which I consider a low demand.

CHANGES:
Directory Caching Buffers
  Directory Cache Buffer NonReferenced Delay: from 5.5 to 30 seconds
  Maximum Concurrent Directory Cache Writes: from 20 to 40
  Maximum Directory Cache Buffers: from 100 to 1000
  Minimum Directory Cache Buffers: from 200 to 800
File Caching Buffers
  Maximum Concurrent Disk Cache Writes: from 100 to 200

Subjective: My server was often spiking to 100% utilization, ran 60% to 80%
most of the time, and would idle around 12%. Response time on the system was
sluggish. Today was not a high-utilization day, but it handled the load at 20%
to 30% utilization most of the time, spiked up to 80%, and now idles at 2%.
Response time has improved some. I see this as a major improvement and I'm
looking forward to higher loads next week for a more comprehensive test. Cache
Buffers (under Resource Utilization) has dropped from 58% to 55%.

Downside: 64 megs is not enough, and I believe my response time has not
improved as much as the utilization has dropped because my server needs more
memory. This has also been observed by watching the LRU Sitting Time (under
Cache Utilization -- thanks Kevin McIntosh), which averages 17 minutes.

Overall very pleased! Again and again, thanks for the help!

---------

Date: Thu, 20 Feb 97 23:03:36 -0800
From: Randy Grein
Subject: Re: 4.11 Serv. Parameters

>Overall very pleased! Again and again, thanks for the help!

You're welcome. And get that RAM!

------------------------------

Date: Mon, 3 Mar 1997 17:39:56 -0600
From: Joe Doupnik
Subject: Re: NW 4.11 as a router?

>Does anyone have any information on the forwarding rate (in terms of
>packets/sec) possible with Novell 4.11 acting almost purely as a
>router? (Routing IP, IPX and AppleTalk.) I am certain the NIC and
>CPU make a large difference. Right now, I am considering a couple of
>Cogent Quartet cards (four 100Mbps ports per card), but would consider
>recommendations for anything that would be faster...
>
>Alternately, are there any freely available benchmarks that can be
>set up to measure something like this...
-----------
You'll need to run those benchmarks.
NW is a software IP router, which means routing takes significant CPU cycles.
But since you asked, I can give you a measurement done here. 486-33 EISA bus
server, NE-3200 EISA lan adapters. Send tinygrams as fast as a Pentium-90 can
generate them, almost, from one subnet to another on the same server. At 3000
packets/sec the server was nearly unable to do anything else. But it kept
running and running; INW 4.11 is strong. That's 3000 in one port and 3000 out
the other, per second, continuously. That is a big number.

The packet generator was one you can use without any trouble: MS-DOS Kermit.
Tell Kermit SET TCP MSS 20, SET SEND PACK 2000, SET REC PACKET 2000, SET
WINDOW 4, make a connection to another MSK acting in server mode with the same
SETs, and move a big file one way or the other. MSK's TCP/IP stack is fast,
and the limiting factor here was the NE-2000 ODI driver in each client. Best
to use MSK 3.15 beta 18 because it has dead-lan-driver recovery code stronger
than in v3.14; see today's list for a pointer. As can be guessed, setting the
TCP Max Seg Size to, say, 20 bytes creates an enormous number of minimal
length IP datagrams (64 bytes, natch), and the long Kermit packet window can
keep the wire busy solidly. Put the file source and destination on a RAM drive
for throughput, or on the file server for max server stress.

Note this is 10Mbps Ethernet. What is difficult to predict is performance on
100Mbps Ethernet, because the quality of drivers varies a lot, as does the
native efficiency of boards. I'm teaching a lan driver design class, ODI
drivers this year, and we make measurements. The very few benchmarks I have of
Intel EtherExpress Pro/100 boards (no letter, older units) show we gain a
factor of 2-3 over 10Mbps Ethernet, and that of course is much less than one
hopes for. Over the past year this has become an EE course, with the active
help of Novell for $oftware and the donations of the Intel boards from two
Novell list members. (Thanks to all for making this happen.) I don't have any
other 100Mbps gear at this time. The Intel boards are pretty efficient, a
great deal more so than NE-2000 boards at the same Ethernet speed.

To check IPX routing (a different pathway through the server) use program
iozone (see netlab2.usu.edu, directory apps). It is a clear, simple file
writer/reader with source code, so there is nothing hidden. For your purposes
here is such a test (desktop to INW 4.11 server):

(Disk read after writing turned off. Intel EtherExpress Pro/100 PCI on
Pentium 100 client, Intel EtherExpress Pro/100 EISA on 486-66 DX/2 server)

  IOZONE: Performance Test of Sequential File I/O -- V1.15 (5/1/92)
          By Bill Norcott
  Operating System: MS-DOS
  IOZONE: auto-test mode

  MB  reclen  bytes/sec written  bytes/sec read
   1     512       733269            733269
   1    1024      1191563           1115506
   1    2048      1588751           1476867
   1    4096      2139951           1906501
   1    8192      2383127           2097151
   2     512       647269            683111
   2    1024      1271001           1158647
   2    2048      1530767           1416994
   2    4096      2231012           1823610
   2    8192      2526689           2139951
   4     512       700217            694421
   4    1024      1252031           1106676
   4    2048      1625699           1441341
   4    4096      2184533           1864135
   4    8192      2542002           2016492
   8     512       678689            710297
   8    1024      1201806           1106676
   8    2048      1591766           1388842
   8    4096      1441341           1796275
   8    8192      2629657           2066159
  16     512       712105            651289
  16    1024      1179003            799676   <<< Note: running out of
  16    2048      1519675            821204       cache buffers for blocks
  16    4096      1542023            612083       queued to disk; reads hit
  16    8192      1754938            846052       the slow disk drive heavily

  Completed series of tests

Joe D.
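To make the tinygram recipe above concrete, the MS-DOS Kermit settings Joe
lists could be collected into a short TAKE file along these lines. The host
and file names are placeholders, and the TCP connection command is given from
memory, so treat this as a sketch and check the MS-DOS Kermit documentation
for your version:

    ; both PCs: tiny TCP segments plus long Kermit packets and windows
    set tcp mss 20
    set send packet-length 2000
    set receive packet-length 2000
    set window 4
    ; receiving PC: enter server mode and wait
    ;     server
    ; sending PC: connect across the NW router to the receiving PC
    ; and push a large file
    set port tcp/ip other-pc.example.edu
    send bigfile.dat

Put the test file on a RAM drive, as Joe suggests, so the disks at either end
don't become the limiting factor.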
------------------------------

Date: Fri, 14 Mar 1997 10:18:14 -0600
From: Joe Doupnik
Subject: Re: 3.12/window3.1 problem

>I have a user on a 3.12 fileserver that has Windows 3.1 on her local
>hard drive. When she logs on to the network and goes to her local
>Windows and opens FileMaker Pro, it takes 1.5 minutes to open this
>program. If she reboots and goes into Windows from the C:\ without
>getting on the network, it takes 9 seconds to open FileMaker Pro. If
>I log in to a different fileserver (we have 7 to choose from) and go
>into her local Windows, it takes 9 seconds to open FileMaker Pro.
---------------
That Windows application is trying to read all kinds of directory information
from the file server. Look at MONITOR to see it; get a copy of Lanalyzer and
watch the wire. Also, you must have slow servers to require 1.5 minutes to
scan directories.
Joe D.

---------

Date: Fri, 14 Mar 1997 14:01:49 -0500
From: Larry Hansford
Subject: Re: 3.12/window3.1 problem

>I have a user on a 3.12 fileserver that has Windows 3.1 on her local
>hard drive. When she logs on to the network and goes to her local
>Windows and opens FileMaker Pro, it takes 1.5 minutes to open this
>program.

You might want to check the Path statement in the PC. You may also want to
check the Working Directory setting in the Windows icon to ensure it is set to
the C: directory with the program files. It would appear from your description
that it is searching all the search paths on the server before searching the
C: drive for the programs. There may also be a difference in the number of
search drives mapped via the login scripts on the various servers.

---------

Date: Sat, 15 Mar 1997 00:11:36 -0800
From: "Philip J. Koenig"
Subject: Re: 3.12/window3.1 problem

>I have a user on a 3.12 fileserver that has Windows 3.1 on her local
>hard drive. When she logs on to the network and goes to her local
>Windows and opens FileMaker Pro, it takes 1.5 minutes to open this
>program.

FileMaker Pro is an unusual app which has embedded support for a variety of
network protocols, including IPX. Basically it is designed to look for a
FileMaker Pro "server" on the network, which is a copy of the application that
opens the database and then "broadcasts" the contents over IPX (or IP, or
AppleTalk).

In addition to Joe's suggestion that it may be searching for directory
information, I suspect that it may also be looking for something with the same
internal network number as the server you have problems logging into, or is
otherwise paying undue attention to that particular machine. If you don't have
a FileMaker Pro server on your LAN, I suggest checking to see that the network
element of the program is turned off. Whenever you see that little moving
"sine wave" thingy when you run File/Open in FileMaker Pro, it is in the
process of searching for an FMPro server on the network to connect to.

------------------------------

Date: Sat, 22 Mar 1997 13:32:07 -0600
From: Joe Doupnik
Subject: Re: Getting performance stats for Win95 network connections?

>Can anyone tell me a way to get performance and error statistics for the
>network connections of Windows 95 workstations on a NetWare 3.12
>network? I'd like to know traffic and error statistics by workstation.
--------
Use an SNMP monitoring tool and put the SNMP client support on the
workstations. That will tell little about "performance", and stats on errors
don't tell much unless things are in really bad shape anyway. Putting a
monitor such as Lanalyzer on the wire can reveal quite a bit.
Making up a big matrix of traffic patterns and so on is the job for expensive
SNMP/RMON instrumentation.
Joe D.

------------------------------

Date: Tue, 15 Apr 1997 10:35:27 -0600
From: Joe Doupnik
Subject: Re: Backbone - FDDI or Fast Ethernet?

>I am trying to put together a backbone for my Netware
>servers, which will also include our Cisco 4500 and
>some NT boxes (web servers etc.).
>
>Does anyone have any experience with FDDI vs. fast ethernet?
>I know about ethernet collisions, but I can overcome those by
>going to an ethernet switch (e.g. Catalyst). My understanding
>is, I can then go to full duplex and get a theoretical 200Mbps.

Forget collisions. The wires must be shared, and how they are shared is of
little concern. Intelligent network design accommodates the paths of most
traffic by separating them as much as possible. Etherswitches are bridges,
that's all, and face identical traffic convergence/congestion problems (basic
physics).

>Of course, the FDDI vendors claim FDDI is still faster in a real
>world environment. Is this just marketing mumbo jumbo?

This seems like solutions in search of problems. What is your traffic rate,
now and over the next two years, and where is it flowing? What is your budget?
Fiber costs about $1K per attachment point, and up. And FDDI introduces IP
fragmentation problems if you go overboard. What are your system goals in
terms of required performance? Recall that 100Mbps Ethernet yields only a two
to maybe three times throughput improvement over 10Mbps Ethernet with today's
software. On the other hand, you probably are well served most of the time by
regular 10Mbps Ethernet. Put your measured traffic numbers into this equation!

>Anyone with experience or insight or both, I would love to
>hear from you. I would also be interested in listservs, web
>pages, or other resources discussing fast ethernet, FDDI,
>Ultra SCSI, and other LAN technologies. My boss is a
>gadget freak, and he always wants the latest and greatest
>technology. As such, I am always on the "bleeding edge".

Please obtain a NEWS feed and talk to the forums dedicated to such topics.
There are thousands of NEWS groups; a few are even usable. Set your
requirements, budget, and expectations. Then shop the marketplace for
components. Test some. Iterate a couple of times. In short, research the
problem for both internal constraints and what's available from vendors. If
your boss simply wants to play with shiny new toys, please take the discussion
elsewhere.
Joe D.

------------------------------

Date: Mon, 28 Apr 1997 18:55:58 -0600
From: Joe Doupnik
Subject: Re: Locating a faulty nic without a lan analyser

>A few months ago we had trouble with our NW3.11 server 'freezing' on an
>irregular basis at irregular times. I had put a version of Netscape on the
>server for it to be used by a certain course. This caused utilisation to
>go very high and freeze the server at times, so I put a copy on each local
>hard disk instead. The server hanging was not solely due to Netscape.
>
>After that I did the usual memory, cabling, and termination checks. All
>were OK. I surface-tested the server's hard disk and those of the PCs
>- all OK. Due to a high 'space in use by deleted files' I set the print
>queue and Pegasus spool directories to be 'purge immediate'. That made
>a difference. Lately, however, the hangs are recurring.
>
>Our department does not have LANalyzer. The server had 3 labs of
>differing PCs attached to it, with each lab split by way of a repeater.
>I tried to locate a faulty NIC by doing a loopback test on each PC's NIC.
>However, I could only do this for a lab of new Dells, as the diagnostic
>programs for the other labs of older Dells and Digital DECs would not
>find the network cards on these machines.
>
>So basically, without LANalyzer, are there other tests I can use to
>check if a NIC in a PC in the other two labs is causing the server to
>freeze?
-----------
The non-thinking approach is to unplug items one by one, starting at hubs and
working down to workstation wires. But this may lead you in circles. And the
reason can well be far too much traffic, with lots of physical broadcasts
amongst it, for the health and well-being of all but very robust stations. You
do indicate a large number of stations sharing the wires. Identifying the
weakest stations isn't solving the core problem.

I would recommend you do spend the money for Lanalyzer. It will likely pay for
itself many times over in the next few years as you begin to understand the
traffic and what extra boxes (bridges, switches, routers) to place where,
strategically. Remember, the traffic rates aren't going down, and network life
will become more difficult if nothing is done.

In addition to traffic analysis one must pay attention to the capabilities of
components, particularly the lan and disk systems in servers. Too many folks
think in terms of IDE-based el cheapo Windows machines when choosing
components, and they deservedly lose. Good quality SCSI drives and controllers
are mandatory. Smart choice of lan adapters is equally important but much more
difficult to quantify from looking at adverts. Pay attention to trade press
benchmark reviews, please do read the fine print before jumping to
conclusions, and test before making major purchases.

Bridges can be stressed beyond their capabilities. It's happened here in a big
way. So step up to multiple bridges (Etherswitches) when that topology makes
sense (and it does not make sense if all traffic goes to one place). And
bridges introduce one-packet-time delays which impact throughput (don't use
cut-through techniques, please).

To know what to buy, and where to place it, takes measuring and thinking. Get
LZFW and start the measurements part; I think you will be amazed/appalled at
what you find.
Joe D.

------------------------------

Date: Mon, 26 May 1997 15:50:08 -0600
From: Joe Doupnik
Subject: Re: Packet Burst and Packet Size NW 3.x & 4.x

>Can someone help me with the syntax for setting up Packet Burst and Large
>Internet Packet size on 3.1x and 4.1x LANs? I've found a lot of info on
>this in principle, but I'm not sure of the best syntax to use to implement
>it in the autoexec.ncf and users' net.cfg files.
>
>Are there other parameters that need to be set to enable these features?
>
>Are there any disadvantages to turning these features on for a NetWare LAN
>running on ethernet and accessing a wide-area SMDS network via Cisco routers?
>
>How much of a performance improvement should one expect from these changes?
----------
First, see your client documentation for syntax, and (gasp) even the NW server
documentation. NET.CFG has perhaps four lines of interest, all under the major
heading NetWare DOS Requester:

PB buffers = (0 for no PBurst, any other value to get PBurst)
PBurst read window = (number of KB)
PBurst write window = (number of KB)
LIP start size = (1500 to start probing at this length and shorter)

But you must test these parameters on your system.
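By way of illustration, a client NET.CFG fragment using those four lines might
look roughly like this. The buffer and window values (4, 16, 10) are
placeholders only and must be tested as Joe says; the LIP start size of 1500
is the figure he gives; and exact keyword spellings vary between client
versions, so check your client documentation first:

    NetWare DOS Requester
         PB BUFFERS = 4
         PBURST READ WINDOW = 16
         PBURST WRITE WINDOW = 10
         LIP START SIZE = 1500

Setting PB BUFFERS = 0 is the quick way to take Packet Burst out of the
picture while testing a slow wide-area link such as the SMDS circuit above.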
PBurst itself can go unstable under stress, as can be observed via a wire
monitor. If it does so, then throughput varies up and down by a factor of at
least two, generally drops a lot, and hurts comms in the meantime. The PBurst
read/write window sizes are tied to the ability of your lan adapters and
routers to absorb traffic, and the maximum is the memory size on the board
minus one packet (to go the other way). Observe the PB results with a wire
monitor.

Your long slow link very likely cannot take the stress of PBurst, but only
testing will reveal its characteristics. This is a fundamental flow control
problem: a fast transmitter and a slow comms link, where feedback is minimal
and buffering capacity is unknown. While PB reduces the number of ACK packets
flowing uphill, it also sends back-to-back packets downhill, where the packet
rate and capacity become very serious concerns.

Testing tools vary. Try regular apps, try perform3 from Novell (rather too
simple), try iozone (more like apps). The latter two are on netlab2.usu.edu,
cd apps (or netlab1.usu.edu, cd pub/mirror/apps). Please do watch the wire(!),
with, say, Novell's Lanalyzer product.
Joe D.

------------------------------