I was recently involved in rebuilding a NetApp FAS 3140 filer from scratch, and I also had to set up the networking for both the filer and a Cisco Catalyst 4507. I decided to share my findings in the hope that this post will be useful to other admins out there. Hopefully I’ll have some time to write part two of this post, going into more detail and performance analysis.
A quick recap of link aggregation
There are two possible ways of forming an aggregated link: static and dynamic. With static link aggregation, no verification is made to ensure that the other end is also configured. With dynamic link aggregation, information is exchanged between both endpoints and the link will only form if both ends are configured correctly. The two dynamic link aggregation protocols available in the Cisco world are LACP and PAgP, but NetApp does not support PAgP, so your choices for aggregated links are dynamic LACP or a static configuration. There are a few other advantages to the dynamic method; for instance, LACP monitors the state of the links and can remove them from the channel if it detects a loss of flow.
I would like to emphasize that bundling two ports together will not give you twice the speed. You will, however, ensure that a single stream of data (for example, a file copy) cannot consume all of your available bandwidth, because each stream stays on one member link. The traffic from multiple streams is distributed across the members of the channel according to the load balancing method, but the rate of any one stream is still limited to the speed of the underlying individual link. Another very important point is that none of the load balancing methods take link utilization into account. Therefore, if you choose an inappropriate load balancing method for your environment, you could end up with a situation where all or most of the traffic is sent over the same link.
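To make the point about load balancing concrete, here is a minimal Python sketch. It is a toy model only: the XOR-and-modulo hash, the function names and the addresses are my own assumptions for illustration, and it is not the algorithm the switch or the filer actually implements. It shows that a given source/destination pair always lands on the same link, and that a method which looks only at the destination puts everything on one link when all hosts talk to a single filer address.

```python
import ipaddress


def link_for_src_dst_ip(src_ip: str, dst_ip: str, n_links: int) -> int:
    """Toy src-dst-ip hash: XOR both addresses, then reduce to a link index."""
    src = int(ipaddress.ip_address(src_ip))
    dst = int(ipaddress.ip_address(dst_ip))
    return (src ^ dst) % n_links


def link_for_dst_ip(dst_ip: str, n_links: int) -> int:
    """Toy dst-ip hash: only the destination address decides the link."""
    return int(ipaddress.ip_address(dst_ip)) % n_links


FILER = "192.168.10.10"                             # hypothetical filer address
HOSTS = [f"192.168.10.{n}" for n in range(20, 28)]  # hypothetical client addresses
N_LINKS = 2

# Hosts sending to the filer, hashed on source and destination address.
spread = [link_for_src_dst_ip(h, FILER, N_LINKS) for h in HOSTS]
# The same traffic hashed on destination only: every frame targets the filer.
skewed = [link_for_dst_ip(FILER, N_LINKS) for _ in HOSTS]

print("src-dst-ip:", spread)
print("dst-ip    :", skewed)
```

With these made-up addresses the first list alternates between the two links, while the second list puts all eight flows on the same link. Neither toy hash looks at how busy a link is, which is exactly the limitation described above.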
I personally use LACP whenever I can, so the example configuration below is for LACP. NetApp introduced support for it in Ontap 7.2.1. For more detailed information please refer to the Network Management Guide, available from NetApp NOW. I noticed an interesting suggestion in this guide: “It is best that the network interfaces that are part of an interface group are on the same network adapter.” Isn’t redundancy one of the reasons why interface groups are used?
In this scenario trunk links are used between the switch and the storage system, and there are three VLANs that are already configured on the switch. Anyway, everyone knows how to create a VLAN on a Cisco switch! If you don’t know which ports are connected to the filer, use the “show cdp neighbors” command (the filer must be running at least Ontap 8.0.2).
The configuration looks like this:
ciscoswitch(config)#default interface range Gi1/1 , Gi2/1
ciscoswitch(config)#interface range Gi1/1 , Gi2/1
ciscoswitch(config-if-range)#switchport trunk encapsulation dot1q
ciscoswitch(config-if-range)#switchport mode trunk
ciscoswitch(config-if-range)#switchport trunk allowed vlan 10,11,12
ciscoswitch(config-if-range)#spanning-tree portfast trunk
ciscoswitch(config-if-range)#channel-group 10 mode active
There are a few things I should note about this:
- The port-channel interface will not come up until the corresponding interface group on the NetApp is configured and has an IP address assigned, either directly or on a VLAN.
- The load balancing method on the switch, in this case, is src-dst-ip. Run “show etherchannel load-balance” to verify your settings. This can be changed with “port-channel load-balance <method>”, but unfortunately it is a switch-wide setting.
- For static Etherchannel instead of LACP use “channel-group <number> mode on”.
- The first line is totally optional, but I like to know that all interfaces start from the same base configuration (identical settings on all member ports are actually a requirement).
- If you are not using VLANs, you would configure these ports as access ports with “switchport mode access”.
- Use the corresponding port-channel interface to make further changes to the channel. These will then apply to all interfaces in the channel. If you misconfigure individual ports, they automatically go into the suspended (s) state until all interfaces in the channel have the same settings again.
- If the port-channel you specify does not already exist, it will be created automatically. To remove the channel interface run “no interface port-channel <number>”.
- On the Cisco side you can add links to or remove links from the channel without any disruption. On the NetApp, however, you can only “hot-add” interfaces. Removing an interface requires the interface group to be configured down first.
- If you have a modular or stacked switch, distribute your Etherchannel links across modules or stack members (this also applies to the NetApp). I’m using modules 1 and 2 on the Catalyst 4507.
- There are, of course, many other options you can set, such as the maximum MTU value and various spanning tree protection/optimization settings.
To verify the status of the newly created Etherchannel run “show etherchannel <number> summary”. You should see output similar to this (the sample below comes from channel group 53, so the numbers differ from the configuration above):
Group  Port-channel  Protocol    Ports
------+-------------+-----------+---------------------------------------
53     Po53(SU)      LACP        Gi1/1(P)   Gi2/1(P)
In the Port-channel column, (SU) indicates that the link is up and running. Individual ports should have (P) next to them, meaning that the port is an active member of the channel. If the corresponding ports on the NetApp are not configured for LACP you will see (SD) and (w) instead: the channel is down and the ports are waiting to be aggregated. I believe the port state will change to (s, suspended) after a while.
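If you check many channels, reading the flags by eye gets tedious. The following is a small Python sketch that parses one data row of the summary output and lists the ports that are not bundled. The function names are hypothetical, and it assumes the row has the same whitespace-separated layout as the sample above; real output may have more columns or wrapped lines.

```python
import re

# "Name(flags)" as printed in the summary, e.g. "Po53(SU)" or "Gi1/1(P)".
FLAGGED = re.compile(r"^(?P<name>[^()]+)\((?P<flags>[^()]+)\)$")


def split_flagged(token: str) -> tuple[str, str]:
    """Split 'Gi1/1(P)' into ('Gi1/1', 'P')."""
    match = FLAGGED.match(token)
    if match is None:
        raise ValueError(f"not a name(flags) token: {token!r}")
    return match.group("name"), match.group("flags")


def parse_summary_row(row: str) -> dict:
    """Parse one data row: group, port-channel, protocol, then member ports."""
    group, channel, protocol, *ports = row.split()
    channel_name, channel_flags = split_flagged(channel)
    return {
        "group": int(group),
        "channel": channel_name,
        "channel_flags": channel_flags,
        "protocol": protocol,
        "ports": dict(split_flagged(p) for p in ports),
    }


def unbundled_ports(parsed: dict) -> list[str]:
    """Ports whose flag is anything other than P (active member of the channel)."""
    return [name for name, flags in parsed["ports"].items() if flags != "P"]


row = "53     Po53(SU)      LACP        Gi1/1(P)   Gi2/1(P)"
info = parse_summary_row(row)
print(info["channel"], info["channel_flags"], unbundled_ports(info))
```

For the healthy sample row the list of unbundled ports is empty. A row showing (SD) with (w) on both ports would return both port names.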
On the NetApp side, the corresponding configuration looks like this:
filer>ifconfig e3a down
filer>ifconfig e4a down
filer>ifgrp create lacp ifgrp_test -b ip e3a e4a
filer>vlan create ifgrp_test 10 11 12
filer>ifconfig ifgrp_test-10 192.168.10.10 netmask 255.255.255.0 partner ifgrp_test
filer>ifconfig ifgrp_test-11 192.168.11.10 netmask 255.255.255.0 partner ifgrp_test
filer>ifconfig ifgrp_test-12 192.168.12.10 netmask 255.255.255.0 partner ifgrp_test
- Use “cdpd show-neighbors” to see which ports are connected to the switch (cdpd is available in Ontap 8.0.2 and later).
- The general syntax of the first command is ifgrp create [single|multi|lacp] ifgrp_name -b [rr|mac|ip] [interface_list]
- The interfaces need to be in a down state before they are added to the interface group.
- Use “multi” for static Etherchannel instead of LACP, and use the appropriate load balancing mechanism.
- You can add alias addresses to the interface group with “ifconfig <ifgrp name> alias <ip address>”.
- The maximum length of the ifgrp name is 15 characters. However, if you plan to add VLANs, the VLAN ID is appended to the ifgrp name with a dash in the middle (ifgrp_test-10). If the total length exceeds 15 characters, the ifgrp name will be truncated when you run “ifconfig -a”.
- The “partner <interface name>” statement refers to the interface on the partner system in an HA pair, to which this interface should fail over.
- To check the status of your interface group run “ifgrp status <ifgrp name>”. Hopefully all links within the group are in the up state at this stage. You will see “broken” or “lag_inactive” if there is a problem with one of the ports or with the entire group.
- And finally, remember to update your RC file with the new configuration!
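Since the same handful of lines has to be typed on the console and then repeated in the RC file, it can help to generate them once and check the name length up front. This is only a sketch in Python: the helper names are hypothetical, the 15-character limit is the one quoted in the notes above, and the output simply reproduces the command sequence shown earlier with your own values filled in.

```python
MAX_NAME_LEN = 15  # limit quoted in the notes above


def vlan_interface_name(ifgrp: str, vlan_id: int) -> str:
    """Build '<ifgrp>-<vlan>' and refuse names longer than the limit."""
    name = f"{ifgrp}-{vlan_id}"
    if len(name) > MAX_NAME_LEN:
        raise ValueError(
            f"{name!r} is {len(name)} characters, limit is {MAX_NAME_LEN}"
        )
    return name


def filer_commands(ifgrp, ports, vlan_ips, netmask, partner):
    """Return the filer commands for an LACP ifgrp with tagged VLANs.

    vlan_ips maps a VLAN ID to the IP address for that VLAN interface.
    """
    # Validate every VLAN interface name before emitting anything.
    names = {vlan: vlan_interface_name(ifgrp, vlan) for vlan in vlan_ips}
    port_list = " ".join(ports)
    vlan_list = " ".join(str(vlan) for vlan in vlan_ips)
    cmds = [f"ifconfig {port} down" for port in ports]
    cmds.append(f"ifgrp create lacp {ifgrp} -b ip {port_list}")
    cmds.append(f"vlan create {ifgrp} {vlan_list}")
    for vlan, ip in vlan_ips.items():
        cmds.append(f"ifconfig {names[vlan]} {ip} netmask {netmask} partner {partner}")
    return cmds


commands = filer_commands(
    ifgrp="ifgrp_test",
    ports=["e3a", "e4a"],
    vlan_ips={10: "192.168.10.10", 11: "192.168.11.10", 12: "192.168.12.10"},
    netmask="255.255.255.0",
    partner="ifgrp_test",
)
for line in commands:
    print(line)
```

A name such as ifgrp_storage01 combined with VLAN 10 would be rejected here before it ever reaches the filer, which is easier to notice than a truncated name in “ifconfig -a”.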
If the link does not come up, use the two commands mentioned earlier to verify the configuration: “ifconfig -a” on the NetApp and “show etherchannel <number> summary” on the Cisco. The most common reasons why the link might be down are:
- One of the devices is configured for static Etherchannel, but the other one is set to LACP
- There is no IP address associated with the interface group on the NetApp, either assigned directly or to a VLAN. There is an option in Cisco IOS to make the Etherchannel Layer 3, but this is not required in this scenario.
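The two checks above can be written down as a tiny Python helper, which is handy as a pre-flight checklist in a build script. The function name and the mapping are my own illustration and cover only the modes mentioned in this post: “active” or “on” on the switch, and the ifgrp types on the filer.

```python
# Switch channel-group mode -> ifgrp type that matches it on the filer.
# Only the modes discussed in this post are covered.
MATCHING_IFGRP_TYPE = {"active": "lacp", "on": "multi"}


def diagnose(switch_mode: str, filer_type: str, filer_has_ip: bool) -> list[str]:
    """Return the likely reasons why the aggregated link will not come up."""
    if switch_mode not in MATCHING_IFGRP_TYPE:
        raise ValueError(f"unsupported switch mode in this sketch: {switch_mode!r}")
    problems = []
    wanted = MATCHING_IFGRP_TYPE[switch_mode]
    if filer_type != wanted:
        problems.append(
            f"mode mismatch: switch is '{switch_mode}', so the ifgrp should be "
            f"'{wanted}', not '{filer_type}'"
        )
    if not filer_has_ip:
        problems.append("no IP address on the ifgrp or on any of its VLANs")
    return problems


print(diagnose("active", "lacp", True))
print(diagnose("on", "lacp", False))
```

An empty list means neither of the two common causes applies and it is time to look elsewhere, for example at cabling or at per-port settings on the switch.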
There are some potential performance issues that I should mention. For some reason many of the hosts in our environment were getting performance ten times slower than it should have been. For instance, a standard Windows file copy would run at a rate of ~100mb/s for some hosts, but only ~10mb/s for others. Of course, we initially suspected an issue somewhere in the environment, but after some thorough testing we just could not find any problem with the hosts, the switches, or anything else. It would happen only when the Etherchannel was being used, in both static and LACP mode.
If you experience similar issues, I suggest downgrading your Ontap version to 8.0.1, if possible. Something in Ontap 8.0.2 and above seems to be causing this. Also, try creating an interface group with the onboard ports only. I found that this issue does not occur with the onboard ports. I realize that if your filer is in production there is only so much you can do in terms of testing. But I’m very curious to see if anyone else has this problem, and if there are other possible solutions.