This page serves as a repository of information about networking in the STAR online environment.
The network layout at the STAR experiment has grown from a base laid over ten years ago, with a number of people working on it and adding devices over time with little coordination or standardization. As a result we have, to put it bluntly, a huge mess of a network: a mix of hardware vendors and media, with cables going all over the place, many of which are unlabelled and now buried to the point of untraceability. We have SOHO switches scattered throughout, of various brands, ages and capabilities. (It was only about one year ago that all hubs were finally replaced with switches, or so I think – I haven’t found any hubs since then.) There are a handful of “managed” switches, but they are generally lower-end models and we have not taken advantage of even their limited monitoring capabilities. (In the case of the LinkSys switches purchased one year ago, I found their management web interface poor – slow, buggy and not very helpful.)
In addition to the general messiness, a big (and growing) concern has been that during each of the past several years there have been a handful of periods of instability on the starp network, typically lasting from a few minutes to hours (or even indefinitely in the most recent cases, which were resolved hastily with switch hardware replacements in the middle of RHIC runs). The cause(s) of these instabilities have never been understood. The instabilities have typically manifested as slow communications or a complete lack of communication with devices on the South Platform (historically, most often VME processors). Speculation has tended to focus on ITD security scanning. While such scanning has been shown to be potentially disruptive to some individual devices and services, broad effects on whole segments of the network have never been conclusively demonstrated, nor has there been a testable, plausible explanation for the mechanism of such instability.
The past year included the two most significant episodes of instability yet on starp, in which LinkSys SLM 2048 switches (after weeks or months of stability) developed problems that appeared similar to prior issues, only more severe. The two switches had been purchased as a replacement (plus a spare) for a Catalyst 1900 on the South Platform. When the first started showing signs of trouble, it was replaced by the second, which failed spectacularly later in the run: it became completely unresponsive to its web interface and to pings, and appeared to be transmitting packets only occasionally. (After all devices were removed and the switch was rebooted, it returned to normal on the lab bench, but it has not been put back into service.)
At that point, all devices were removed from the LinkSys switch and routed through a pair of unmanaged SOHO switches, each of which links to an old 3Com switch on the first floor. Since then no more instabilities have been noted, but this has left a physical cabling mess and a network layout that is quite awkward. (Further adding to the trouble, at least one of the SOHO switches has a history of sensitivity to power fluctuations, occasionally needing to be power-cycled after power dips or outages.)
In addition, there have been superficially similar episodes of problems on the DAQ/TRG network, which shares no networking hardware with the starp network. As far as I know, these episodes resolved themselves spontaneously. (Is this true?) Speculation has centered on “odd” networked devices (such as oscilloscopes) generating unusual traffic, but here too there is no conclusive evidence of the cause. With no explanation in hand, it seems likely this behavior will be encountered again.
There are currently several “core” pieces. “Core” is defined somewhat loosely as anything connecting many devices or requiring relatively high performance:
1. ITD’s main switch in the DAQ room
2. DAQ’s event builder switch in the DAQ room
3. the starp switch on the South Platform
4. the DAQ/TRG switch on the South Platform
5. the Force 10 switches for the HPSS network in the DAQ room
It seems likely that any reshape will have to include those same core components, though perhaps some combinations are possible at the hardware level using VLANs or other technologies (for instance, combining starp and DAQ/TRG on the platform onto a single large switch).
ITD’s main switch chassis (item 1 above) is in the networking rack in the northwest corner of the DAQ room. It is managed by ITD; STAR has no way to interact with this switch at the software/configuration level.
Slot 1: WS-X4013 (Supervisor II Engine, fiber uplink to 515 and local management port)
Slot 2: WS-X4548-GB-RJ45 (48 copper 1 Gb/s ports at 8:1 oversubscription); port 43 is on the 162 subnet, the rest are on subnet 60
Slot 3: WS-X4232-RJ-XX (32 copper 100 Mb/s ports) plus a WS-U5404-FX-MT daughter card with 4 MT-RJ fiber ports at 100 Mb/s
Slot 4: WS-X4148-RJ (48 copper 100 Mb/s ports) - mix of subnets 60 and 162?
Slot 5: WS-X4148-RJ (48 copper 100 Mb/s ports) - all subnet 60?
Slot 6: WS-X4306-GB (6 GBIC (not mini-GBIC!) ports, 3 of which have 1000BASE-SX modules with SC connectors)
Here we can keep miscellaneous files documenting the state of the network.
First, I have attached an image showing the current (late 2009/early 2010) switch layout and links in the WAH. ("WAH_switches.pdf")
Then there is an "after" picture with a rough idea of the patch panel placement to replace most of the unmanaged switches. ("WAH_patch_panels.pdf")
For the South Platform, a more refined patch panel plan was put together in June 2010 ("Network Plan for South Platforms.doc")
There is an attachment with general guidelines for installing UTP ("Cat5e_Network_cable.ppt")
WAH: (starp and DAQ/TRG devices are scattered throughout these locations. I am going to use the term “satellite racks” to include all locations within the C-AD PASS system that are NOT on the South Platform. Also, note that the satellite racks are semi-mobile, and the entire detector platform (North and South) can move into the Assembly Building):
- PMD racks: ~3 devices on starp and ~3 on DAQ/TRG
- FMS/FPD east side: Handful of devices on DAQ/TRG and on starp
- Southwest corner work area: rarely more than two systems here, but might want starp, “trailers” and DAQ/TRG networks here for use as needed
- EEMC racks, west side: Handful of devices on DAQ/TRG and on starp
- FPD/FMS west racks: Handful of devices on DAQ/TRG and on starp
- PP2PP east and west: at least one VME processor on DAQ/TRG on each side - these are in the RHIC tunnel, technically not in the WAH.
- South platform – (IMPORTANT NOTE: The south platform must remain electrically isolated from the rest of the facility – there can be no conducting cables running from the South Platform to other locations)
o First floor: Three rows of 8-9 racks each (volatile, in that subsystems and components are installed or removed each year)
o Second floor: Three rows of 8-9 racks each (volatile)
- North platform: currently unoccupied, but has had devices in the past and a switch on the starp network is still present there, with a fiber link back to the South Platform (somewhere!)
Control Room:
- Perimeter (~3 dozen PCs), almost all on starp, but
o 2-3 on DAQ/TRG
o 4-5 on C-AD 108
o 1-2 on C-AD 90 network?
o Numerous small unmanaged switches in this room currently
DAQ Room: (The highest network performance in the entire facility is needed in rack row DA, including at minimum a 56-port switch with non-blocking/line-rate 1 Gb/s inter-links on the DAQ/TRG network)
- three “rows” plus two networking racks:
o the “old” network rack and the “new” network rack near the northwest corner
o rack row “DA” on west side (nearest the Control Room)
o shelf row in the middle with a rack at each end
o East row: ~6 stand-alone starp servers (one of which has a DAQ/TRG connection as well), along with a handful of VME devices on starp. DAQ or trigger might have a device or two here. The rack space is primarily occupied by devices on a C-AD network.
GMR:
- 3 PCs – generally stable area.
Clean room:
- several jacks needed; network use may vary among starp, DAQ/TRG and the 130.199.162 subnet depending on the active use at any time
1006C and 1006D (trailers):
- typically only subnet 130.199.162 is needed here.
Online network reshape notes from the week of Oct. 18, 2009
During this week, three meetings were held to discuss the STAR online networking reshape plans.
The first meeting included Jeff Landgraf, Wayne Betts, Dan Orsatti (ITD) and Frank Burstein (ITD). At this meeting the ITD network engineers presented two proposals for core network components, based on information previously provided to them by STAR. The two options were Force 10-based and Cisco-based, with costs of approximately $150,000 and $100,000 respectively. Both included a shared infrastructure for the DAQ/TRG and STARP networks, with switch redundancy in the DAQ room to handle the two networks and meet DAQ’s relatively high performance needs there. These ITD options are generally smart, expandable, highly configurable and well-supported by ITD, and meet the initial requirements.
However, in informal discussions since then, Bill Christie suggested that we should consider the possibility of radiation damage and/or radiation-induced errors in any electronic equipment in the WAH. While this had been mentioned as a possibility in the past, it was not generally taken seriously by those of us in STAR looking after the networks, nor is there any way for us to test it to a standard of “beyond reasonable doubt” (or any other standard, really). At Bill’s suggestion, we (Jeff L., Wayne B., Jack E., Yuri G. and Bill C.) met with three members of C-AD’s networking group, who stated they were certain that radiation could impair switches and strongly advised that the equipment ITD proposed was inappropriate for a radiation area. They also provided feedback from individuals at two other laboratories indicating that networking equipment in radiation areas is subject to upsets, with one explanation involving charge trapping in metal-oxide semiconductors. At face value this would suggest that newer (thus generally smaller) electronic components would be less susceptible; however, my intuition is that smaller electronics are denser and more easily upset by a smaller deposited charge, and thus might be more susceptible.
Here are excerpts from the other labs:
From JLab: "The flash memory loses its ability to hold data, making it useless. We have worked around the problem by pulling cable or fiber back to lower radiation areas wherever we can. Because we made these cabling changes when we were only using cisco fixed-configuration 100Mbit switches ( 29XX models), I have no data for Gigabit switches. Since our experience is that it's the flash memory that fails, I'd expect no better performance from any other switches. All of our switches that use modular supervisor modules are outside of radiation areas."
From FermiLab: "The typical devices used employ metal oxide semiconductors and the lock up happens when ionizing radiation is trapped in the gate region of the devices. We see this happen at our two detectors (CDF and DZero) when losses go up and power supplies circuits latch up. The other thing working in the positive direction is that when IC feature sizes go down, there is less likelihood for the charge to get trapped so they are more radiation tolerant. Having said all that I can't answer your specific question because we don't put switches or routers in the tunnel at all."
All this said, the general consensus was that we should move as much “intelligence” as far away from the beam line as reasonably possible. (Until now, the “big” switches on the platform have actually been about as close to the beam line as possible!) This means putting any switches in rack rows 1C. Given both the cost and the radiation concern, we (the STAR personnel) agreed to investigate less expensive switches than ITD’s suggestions, while trying to provide some level of intelligence for monitoring. We also reached a consensus that the DAQ/TRG and STARP networks should use common hardware whenever possible, and that we should work to remove as many SOHO-type unmanaged switches as time permits (replacing them with well-documented and labelled patch panels feeding back to core switches). The C-AD personnel also recommended Cisco’s 2950, 2960 and 3750 switches, and Garrett products in general. One more miscellaneous tidbit from Jack: we should avoid LanCast media convertors.
The final meeting of the week included Jerome, Wayne and Matt Ahrenstein; Jerome was briefed on the two prior meetings and generally agreed with the direction we are taking. At this meeting we selected an additional area to try to clean up before the run: the racks on the west side, where there are at least four 8-port unmanaged switches (3 on DAQ/TRG and one on STARP). Jerome also suggested we consult with Shigeki from the RACF about the whole affair, and is trying to arrange such a meeting as soon as possible.
In addition to this, Jeff has also stated that while either ITD solution would meet DAQ’s needs for several years, he believes he can obtain adequate performance for far less money with lower end equipment. Here is Jeff's latest on the DAQ needs for the network:
"My target is 20Gb/sec network capability across switches. In likely scenarios, the network capability would be significantly higher than this because hi bandwidth nodes would all be on the same switch (ironically, the cheaper switches mostly seem to be line-speed switches internally, unlike the big cisco switches...) However, in the current year, I'll have a hard limit of 12 gigabit ethernet cards incoming on EVBs for a hard max of 12Gb/sec. The projected desired data, according to the trigger board is around 6Gb/sec (600MB/sec). I don't expect much more than a factor of two through the EVBs above this 600MB/sec in the lifetime of STAR (meaning current TPC + HFT + FGT), although there are big uncertainties particularly for the HFT. The one lump in the planning involves potential L3 farms - and I don't know how this will play out. There are many scenarios some of which would not impact the network (ie... specialized hardware plugged into the TPX machines...), but my current approach is that the network needs will have to be incorporated in the L3 farm design plan..."
Where does this leave us? We need to quickly evaluate options for the “big” switches for the DAQ room and the South Platform. The DAQ and Trigger groups have 3(?) similar managed switches that might be adequate for the South Platform (including a spare), and we should look into the Cisco models suggested by C-AD. We should also let ITD make another round of suggestions based on our discussions to date, and especially focus with them on what to do with the large ITD switch in the DAQ room that currently has the link to the rest of the campus “public” network. And we need to do this rather quickly.
Do we support multiple networks on single switches with VLANs, switch port segmentation or other means? For instance, at remote spots, like PMD’s racks, can we put in a single switch and have it handle both starp and DAQ/TRG? Daniel Orsatti's most recent advice was leaning towards having a few large switches in four or five core places with VLANs and installing patch panels at or near the various locations needing network connections.
Is there a single brand/line of switch equipment that meets most or all of our goals? Can we get a line of switch products that includes a range from small (~8 port) switches up to the large switches required for DAQ’s event builders or ITD’s main switch, such that they can interoperate and be part of shared monitoring? (If we go with a patch-panels-to-big-switches approach, then the small switches would not be necessary.)
What kind of monitoring can we expect and how much effort will it take for it to be useful? SNMP-based? Nagios? Etc…
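To make the SNMP option concrete, below is a minimal sketch (not a finished tool) of the kind of per-port polling that any SNMP-capable managed switch should support. It assumes the net-snmp command-line tools are installed on the monitoring host, that the switch allows SNMP v2c read access to the standard IF-MIB, and that a read-only community string is configured; the switch name is taken from elsewhere on this page, and the community string and port index are placeholders.

# Minimal sketch of per-port traffic polling via SNMP (IF-MIB octet counters).
# Assumes net-snmp's "snmpget" is installed and the switch answers SNMP v2c
# reads.  The community string and port index below are placeholders.
import subprocess
import time

SWITCH = "splat-s60.starp.bnl.gov"   # example switch name from this page
COMMUNITY = "public"                 # placeholder read-only community string
PORT_INDEX = 3                       # ifIndex of the switch port to watch

def get_counter(oid):
    """Fetch a single SNMP counter value and return it as an integer."""
    out = subprocess.check_output(
        ["snmpget", "-v2c", "-c", COMMUNITY, "-Oqv", SWITCH, oid])
    return int(out.decode().strip())

# Sample the in/out octet counters twice, 60 seconds apart, and report rates.
oids = ["IF-MIB::ifInOctets.%d" % PORT_INDEX,
        "IF-MIB::ifOutOctets.%d" % PORT_INDEX]
first = [get_counter(oid) for oid in oids]
time.sleep(60)
second = [get_counter(oid) for oid in oids]
for oid, before, after in zip(oids, first, second):
    # These are 32-bit counters and will wrap; a real tool (Nagios, Cacti,
    # etc.) would handle rollover and keep history.
    print("%s: %.1f kB/s" % (oid, (after - before) / 60.0 / 1024.0))

Nagios (or a similar package) would essentially wrap checks like this with thresholds, history and alerting, so most of the effort will be in deciding what to watch and what counts as “unusual”.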
Can we setup a shared but “private” monitoring network for the managed switches, such that starp and DAQ/TRG monitoring share the same infrastructure? (Most likely, yes.)
Can fiber connectors be easily changed/replaced/repaired? STAR apparently does not have the tools to terminate fibers at this point. Do we want to acquire the tools and know-how to do this, or continue to rely on ITD and/or folks like Frank Naase (C-AD) who have done most of our fiber termination to date?
The goal of the online networking reshape is to provide a stable and well-understood networking environment with the possibility of future expansion to meet STAR’s foreseeable needs over time. The physical layout needs to be well understood, with elements of redundancy and/or easily swapped parts on hand as much as possible. The devices on the network should be known, including their location, what other systems they are expected to interact with and traffic volumes. Significant networking errors should be detected at the switch level and allow for troubleshooting without significant disruption to large parts of the network.
Along the way, it will be very useful to increase the availability of knowledge and sources of assistance related to the network. Naturally this calls for a well-documented network in any case. Consolidating networking hardware into a common brand or line for the multiple online networks (which are currently a hodgepodge) may reduce the number of errors encountered, improve the ability of STAR's personnel to understand more facets of the networking environment, and allow for better monitoring of network performance. Our network should mesh well with existing ITD infrastructure so that their expertise can be brought to bear as needed. However, ITD expertise cannot be the sole source of support for the online networks – at least two individuals in STAR (but not many more than that) should have broad access to realtime network data and configuration. STAR’s 24-hour on-call experts (DAQ and online computing in particular) need to be able to respond quickly to incidents and gather clues and information from all sources.
I think we need to start from the core and work outwards. This will allow us to finish as much as possible before the run starts and start to see the most benefits as early as possible. The two big pieces at the core (in order of importance) are:
1. DAQ’s event builder switch, which calls for 56 (let’s say 64) non-blocking/line speed 1Gb/s ports. No matter what, this piece needs to be put in place before the run starts. We can probably limp by with everything else as it exists now if we have to, but this has to be a new piece of hardware in place before December 1 (is this a reasonable deadline?).
2. Whatever ITD wants to replace the current Catalyst 4000-series chassis and blades in the DAQ room.
After this, the next items for consideration/replacement are the starp and DAQ/TRG switches on the South Platform.
Then it is on to the satellite racks in the WAH with their relatively small number of devices.
Then the DAQ room, cleaning up the handful of unmanaged switches that exist for both starp and DAQ/TRG.
Control Room clean-up. The available wall jacks in the Control Room are insufficient for the number of devices, and many of the jacks are inaccessible behind the west side console, but at least this area is always accessible and has had few problems, so it isn’t a high priority.
This documents the Network Power Switch plugs used to remotely power cycle STAR's network switches in the Wide Angle Hall.
Updated February 8, 2019 (Ideally, STAR's RackTables would be the definitive source for this information, but it is far from complete.)
ID | Location | Switch IP name | NPS IP name | NPS plug | NPS access method | NPS type |
SW22 | east racks | east-trg-sw.trg.bnl.local | pxl-nps.starp.bnl.gov | 8 | telnet, http (ssh and https available, but not enabled) | APC AP7901 (August 2015) |
SW56 | east racks | east-s60.starp.bnl.gov | eastracks-nps.trg.bnl.local | 8 | ssh (slow to respond to initial connection) | APC AP7901 (August 2012) |
SW59 | SP 1C4 | splat-s60.starp.bnl.gov | netpower1.starp.bnl.gov | 3 | telnet, http | APC |
SW2 | SP 1C4 | splat-trg2.trg.bnl.local | netpower1.starp.bnl.gov | 1 | telnet, http | APC |
SW27 | SP 1C4 | switch1.trg.bnl.local | netpower1.starp.bnl.gov | 2 | telnet, http | APC |
SW60 | SP 1C4 | splat-s60-2.starp.bnl.gov | netpower2.starp.bnl.gov | A1 | ssh (has key for wbetts) | WTI NPS-8 |
SW28 | SP 1C4 | switchplat.scaler.bnl.local | netpower2.starp.bnl.gov | A2 | ssh (has key for wbetts) | WTI NPS-8 |
SW55 | west racks | west-s60.starp.bnl.gov | westracks-nps.trg.bnl.local | 1 | ssh, http | APC |
SW30 | west racks | switch2.trg.bnl.local | eemc-pwrs1.starp.bnl.gov | A4 | telnet | old WTI |
SW51 | NP 1st floor | nplat-s60.starp.bnl.gov | north-nps1.starp.bnl.gov | 1 | telnet, ssh, http | APC AP7900B (January 2019) |
A. Only use managed switches and have each networked device plug directly into a managed switch port.
- Eliminate all “dumb” consumer/SOHO/desktop switches – they are not robust, add to confusion when troubleshooting and prevent isolation of individual devices
- allow the blocking of any single device at any time through its nearest switch’s management interface
- block the addition of any new, unknown nodes and/or be informed of anything showing up unexpectedly (see the sketch following this list)
- ability to monitor individual ports for traffic volumes, link settings, errors, major links going down, preferably with some history/logging.
- allow real-time monitoring and alerts for unusual events (capabilities will be hardware/vendor dependent and subject to available time to develop monitoring tools and become familiar with capabilities)
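As a rough illustration of the “unknown nodes” item above, a script could periodically walk a managed switch’s MAC forwarding table and compare it against a list of known devices. This is only a sketch under assumptions: net-snmp’s snmpwalk is installed, the switch supports the standard BRIDGE-MIB over SNMP v2c, and the whitelist below is hypothetical (in practice it might come from RackTables).

# Minimal sketch: walk a switch's MAC forwarding table (BRIDGE-MIB) and flag
# addresses not on a list of known devices.  Assumes net-snmp's "snmpwalk" is
# installed and the switch answers SNMP v2c reads with the community string
# given below (a placeholder).
import subprocess

SWITCH = "east-s60.starp.bnl.gov"    # example switch name from this page
COMMUNITY = "public"                 # placeholder read-only community string

# Hypothetical whitelist of known MAC addresses on this switch.
KNOWN_MACS = {"0:11:22:33:44:55"}

output = subprocess.check_output(
    ["snmpwalk", "-v2c", "-c", COMMUNITY, "-Oqv",
     SWITCH, "BRIDGE-MIB::dot1dTpFdbAddress"]).decode()

for line in output.splitlines():
    # The exact MAC formatting depends on the net-snmp version and MIB files,
    # so a real tool would normalize both sides before comparing.
    mac = line.strip().strip('"').lower()
    if mac and mac not in KNOWN_MACS:
        print("Unexpected MAC address seen on %s: %s" % (SWITCH, mac))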
B. All devices should be within 10-15 feet of a “core” patch panel or network switch.
- Individuals working on detector subsystems should not have to install network cables that cross rack rows, go from one floor (or room) to another, etc.
- Piecemeal additions of network segments by subsystems should not be done – that is to say, no one should be adding switches to the network other than core personnel using “approved” devices consistent with the rest of the network components.
- This calls for cabled and labeled patch panels and/or switches liberally placed throughout the WAH, the Control Room and the DAQ Room.
C. Some degree of “commonality” between the infrastructures of the starp and DAQ/TRG networks: the same line of hardware, media convertors (when needed), switches and monitoring tools, possibly even shared switches with VLANs. This is a big question – are VLANs viable for sharing switch hardware between starp and DAQ/TRG? A shared “private” management network for the switches is likely a good idea.
D. An easily extensible network, such that new locations can be added easily, and existing locations can have additional capacity added and subtracted in accord with the other goals.
E. Redundant links (fibers or copper, as appropriate) available between all linked core components (preferably with automatic failover).
F. Spares on hand for just about everything – a good reason to use as few models of hardware as possible. If we develop a plan with 10 small 8-port switches in various locations, ideally all 10 will be identical and we will have one or two spares on the shelf at all times.
G. All network components should be on UPS power so that short and/or localized power outages do not bring down portions of the network. This is not terribly important, but should be kept in mind and allowed for when feasible.
H. (Added after the initial items above) Move IC-based devices (switches) away from the beam line and attempt to reduce their radiation load. Our working hypothesis, based on anecdotal evidence, is that at least some of the networking problems last year were caused by radiation-induced errors. The two "big" switches on the South Platform have historically been in just about the WORST place for radiation load, so these need to be moved away from the beam line.
Document everything!
All hardware with an IP address should be labelled.
All installed cables should have a label on each end that is adequate to quickly locate the other end.
All patch panel ports with cables connected should be labelled appropriately to identify the other end.
All network equipment (switches, patch panels, cable runs, etc.) needs to be documented, preferably in appropriate documents in Drupal.
Copper connections:
Use Cat5e or higher-grade cables.
Use yellow cables for devices connected to the STARP network (130.199.60-61.x IP addresses).
Use green cables for devices connected to the DAQ/TRG network (172.16.x.x IP addresses).
Use colors other than yellow and green for any other network connections.
Use T568A termination when adding connectors to bare cable.
Fiber Connections:
Use 50 micron multi-mode fiber.
Use 1000Base-SX fiber transceivers where possible.
“starp”: 130.199.60.0/23
“DAQ/TRG”: 172.16.0.0/16 (non-routed)
“HPSS”: RCF network for DAQ → HPSS transfers
“Alexei”: Alexei’s video camera and laser network (currently consisting of a switch on the South Platform and a switch in the DAQ room connected by a fiber pair?). This network includes 3-4 PCs, some running obsolete Windows OSes (e.g. Win 98). No devices on this network are dual-homed, so it is very isolated from everything else and is mentioned here only for completeness.
“trailers”: 130.199.162.x - includes wired connections for printers, visitors’ laptops and workstations not directly involved in operations; jacks on this network may exist outside of the trailers, for instance in the Control Room for visitors’ laptops while on shift.
“Wireless”: Not really relevant conceptually, but there are also three ITD wireless access points in the area.
“C-AD 108” and “C-AD 90”: C-AD has at least two networks operating in the DAQ and Control Rooms; these are left entirely in C-AD’s hands, but are mentioned here for the sake of completeness.
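Purely as an illustration of how the subnets above map onto devices (and onto the cable-color convention given earlier), a small helper like the following could classify an address. This is a sketch with assumptions: the trailers network is taken to be a /24 based on “130.199.162.”, and anything outside the listed ranges is simply reported as unknown.

# Minimal sketch: map an IP address to one of the networks listed above and to
# the cable color from the copper-cabling guidelines.  The /24 mask for the
# "trailers" subnet is an assumption based on "130.199.162." above.
import ipaddress

NETWORKS = [
    ("starp",    ipaddress.ip_network("130.199.60.0/23"),  "yellow"),
    ("DAQ/TRG",  ipaddress.ip_network("172.16.0.0/16"),    "green"),
    ("trailers", ipaddress.ip_network("130.199.162.0/24"), "other (not yellow/green)"),
]

def classify(addr):
    """Return (network name, cable color) for the given IP address string."""
    ip = ipaddress.ip_address(addr)
    for name, net, color in NETWORKS:
        if ip in net:
            return name, color
    return "unknown", "other (not yellow/green)"

# Example: an address in the starp range.
print(classify("130.199.61.10"))   # -> ('starp', 'yellow')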