iSCSI and Jumbo Frames

I’ve recently been working on a project to deploy a couple of Pure Storage FlashArray //M10s, and rather than using Fibre Channel we opted for 10Gb Ethernet (admittedly for reasons of cost) with iSCSI as the transport mechanism.

Whenever you read up on iSCSI (and NFS for that matter) there inevitably ends up being a discussion around the MTU size.  My thinking here is that if your network has sufficient bandwidth to handle Jumbo Frames and large MTU sizes, then it should be done.

Now I’m not going to ramble on about enabling Jumbo Frames exactly, but I am going to focus on the MTU size.

What is MTU?

MTU stands for Maximum Transmission Unit.  It defines the largest packet, in bytes, that can be carried in a single frame across the network.  The default MTU size is 1500 bytes.  Whether that be Red Hat Enterprise Linux, Fedora, Slackware, Ubuntu, Microsoft Windows (pick a version), Cisco IOS or Juniper’s JunOS, it has in my experience always been 1500 (though that’s not to say that some specialist providers may not change this default value for black box solutions).

So what is a Jumbo Frame?

The internet is pretty much unified on the idea that any frame carrying more than the 1500 byte default can be considered a jumbo frame.  Typically you would want to enable this for specific needs such as NFS and iSCSI, where the bandwidth is at least 1Gbps or, better, 10Gbps.
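To put a rough number on why jumbo frames are attractive for storage traffic, here is a minimal sketch that compares how much of each frame on the wire is actual TCP payload.  The overhead figures (preamble, Ethernet header, FCS, inter-frame gap, and option-free IPv4 and TCP headers) are standard values I am assuming, not something measured on the network described here.

```python
# Rough wire efficiency of TCP over Ethernet for a given MTU.
# Assumed overheads (standard values, not taken from the post):
ETH_WIRE_OVERHEAD = 8 + 14 + 4 + 12   # preamble/SFD + header + FCS + inter-frame gap
TCP_IP_HEADERS = 20 + 20              # IPv4 + TCP headers, no options


def tcp_efficiency(mtu: int) -> float:
    """Fraction of on-the-wire bytes that are application payload."""
    payload = mtu - TCP_IP_HEADERS
    return payload / (mtu + ETH_WIRE_OVERHEAD)


if __name__ == "__main__":
    for mtu in (1500, 9000):
        print(f"MTU {mtu}: {tcp_efficiency(mtu):.1%} payload")
```

The gain per frame is only a few percent; the bigger win is usually fewer frames (and therefore fewer interrupts and less per-packet work) for the same amount of data.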

MTU sizing

A lot of what I had read in the early days about this topic suggests that you should set the MTU to 9000 bytes, so what should you be mindful of when doing so?

Well, let’s take an example.  You have a requirement to enable jumbo frames and you have set an MTU size of 9000 across your entire environment;

  • virtual machine interfaces
  • physical network interfaces
  • fabric interconnects
  • and core switches

So you enable an MTU of 9000 everywhere, and you then test your shiny new jumbo frame enabled network by way of a large ping;

Linux

$ ping -s 9000 -M do 192.168.1.1

Windows

> ping -l 9000 -f -t 192.168.1.1

Both of the above perform the same job.  They will attempt to send an ICMP ping;

  • To our chosen destination – 192.168.1.1
  • With an ICMP payload of 9000 bytes (option -l 9000 or -s 9000), remember the default MTU is 1500 so this is definitely a Jumbo packet
  • Where the request is not fragmented, thus ensuring that a packet of such a size can actually reach the intended destination without being reduced

The key to the above examples is the “-f” (Windows) and “-M do” (Linux).  This will enforce the requirement that the packet can be sent from your server/workstation to its intended destination without the size of the packet being messed with aka fragmented (as that would negate the whole point of using jumbo frames).

If you do not receive a normal ping response back which states its size as being 9000 then something is not configured correctly.

The error might look like the following;

ping: local error: Message too long, mtu=1500
ping: local error: Message too long, mtu=1500

The above error is highlighting the fact that we are attempting to send a packet which is bigger than the local NIC is configured to handle.  It is telling us the MTU is set at 1500 bytes.  In this instance we would need to reconfigure our network card to handle the jumbo sized packets.
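If you would rather check the local interface before reaching for ping, Linux exposes the configured MTU under /sys/class/net.  The following is a minimal sketch of reading it; the function names and the 9000 byte target are my own choices for illustration.

```python
from pathlib import Path


def interface_mtu(name: str, sysfs_root: str = "/sys/class/net") -> int:
    """Return the MTU Linux reports for an interface via sysfs."""
    return int((Path(sysfs_root) / name / "mtu").read_text().strip())


def needs_reconfiguring(name: str, wanted: int = 9000,
                        sysfs_root: str = "/sys/class/net") -> bool:
    """True when the interface MTU is smaller than the size we want to send."""
    return interface_mtu(name, sysfs_root) < wanted
```

Calling needs_reconfiguring("eth0") on a machine with the default settings would be expected to return True, where "eth0" stands in for whatever your NIC is actually called.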

Now let’s take a look at what happens with the ICMP ping request and its size.  As a test I have pinged the localhost interface on my machine and I get the following;

[toby@testbox ~]$ ping -s 9000 -M do localhost
PING localhost(localhost (::1)) 9000 data bytes
9008 bytes from localhost (::1): icmp_seq=1 ttl=64 time=0.142 ms
9008 bytes from localhost (::1): icmp_seq=2 ttl=64 time=0.148 ms
9008 bytes from localhost (::1): icmp_seq=3 ttl=64 time=0.145 ms
^C
--- localhost ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2085ms
rtt min/avg/max/mdev = 0.142/0.145/0.148/0.002 ms

Firstly notice the size of each request.  The initial request may have been 9000 however that doesn’t take into account the need for the header to be added to the packet, so that it can be correctly sent over your network or the Internet.  Secondly notice that the packet was received without any fragmentation (note I used the “-M do” option to ensure fragmentation couldn’t take place).  In this instance the loopback interface is configured with a massive MTU of 65536 bytes and so all worked swimmingly.

Note that the size reported in each reply is actually 9008 bytes.

The 9000 bytes of data grew by 8 bytes due to the addition of the ICMP header mentioned above, making the total 9008 bytes.  The IP header is added on top of that again (a further 20 bytes for IPv4, or 40 bytes for IPv6 as in the localhost example), and it counts towards the MTU as well.

My example above stated that the MTU had been set to 9000 on ALL devices.  In this instance the packets will never get to their intended destination without being fragmented, as 9008 bytes plus the IP header is bigger than 9000 bytes (stating the obvious I know).
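The arithmetic is easy to get wrong, so here is a small sketch of it.  The value given to ping with -s (or -l on Windows) is only the ICMP payload; the 8 byte ICMP header and the IP header (20 bytes for IPv4 without options, 40 bytes for IPv6) are added afterwards and count towards the MTU.  The function names are mine.

```python
ICMP_HEADER = 8                 # ICMP echo header, the extra 8 bytes ping reports
IP_HEADER = {4: 20, 6: 40}      # IPv4 without options / fixed IPv6 header


def packet_size(payload: int, ip_version: int = 4) -> int:
    """Size of the IP packet produced by a ping with the given payload."""
    return payload + ICMP_HEADER + IP_HEADER[ip_version]


def largest_payload(mtu: int, ip_version: int = 4) -> int:
    """Biggest ping payload that fits in one unfragmented packet."""
    return mtu - ICMP_HEADER - IP_HEADER[ip_version]


if __name__ == "__main__":
    print(packet_size(9000))       # IP packet produced by "ping -s 9000"
    print(largest_payload(9000))   # payload to use when the MTU really is 9000
```

So to prove that a path with an MTU of exactly 9000 is working over IPv4, the payload to test with would be 8972, e.g. ping -s 8972 -M do 192.168.1.1 on Linux or ping -l 8972 -f 192.168.1.1 on Windows.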

The resolution

The intermediary devices (routers, bridges, switches and firewalls) will need an MTU size that is bigger than 9000 and be sized sufficiently to accept the desired packet size.  A standard Ethernet frame (according to Cisco) would require an additional 18 bytes on top of the 9000 for the payload.  And it would be wise to actually specify a bit higher.  So, an MTU size of 9216 bytes would be better as it would allow enough headroom for everything to pass through nicely.
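As a quick sanity check of those numbers, the 18 bytes break down as a 14 byte Ethernet header plus a 4 byte frame check sequence.  The sketch below also allows for an optional 4 byte 802.1Q VLAN tag, which is my own addition rather than something covered above.

```python
ETH_HEADER = 14   # destination MAC + source MAC + EtherType
ETH_FCS = 4       # frame check sequence
VLAN_TAG = 4      # optional 802.1Q tag (an assumption, not mentioned in the post)


def frame_size(mtu: int, vlan_tagged: bool = False) -> int:
    """Ethernet frame size needed to carry a packet of the given MTU."""
    return mtu + ETH_HEADER + ETH_FCS + (VLAN_TAG if vlan_tagged else 0)


def fits(mtu: int, switch_limit: int = 9216, vlan_tagged: bool = False) -> bool:
    """True when a full-size frame fits within the switch's configured limit."""
    return frame_size(mtu, vlan_tagged) <= switch_limit
```

With a host MTU of 9000 the frame comes to 9018 bytes (9022 with a VLAN tag), so a switch set to 9216 has headroom to spare.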

Focusing on the available options in a Windows world

And here is the real reason for this post.  Microsoft, in all their wisdom, provide you with a drop-down box to select the required predefined MTU size for your NICs.  With Windows 2012 R2 (possibly slightly earlier versions too), the nearest size you can set via the network card configuration GUI is 9014.  This would result in the packet being fragmented or in the case of iSCSI it would potentially result in very poor performance.  The MTU 9014 isn’t going to work if the rest of the network or the destination device are set at 9000.

The lesson here is to make sure that both source and destination machines have an MTU of equal size and that anything in between is able to support a higher MTU size than 9000.  And given that Microsoft have hardcoded the GUI with a specific number of options, you will probably want to configure your environment to handle this slightly higher size.
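That lesson can be written down as a simple check.  This is only a sketch under my own assumptions: the MTU values are typed in by hand rather than discovered, and an intermediate device is flagged when its MTU is smaller than that of the endpoints.

```python
def check_path(source_mtu: int, destination_mtu: int,
               intermediate_mtus: list[int]) -> list[str]:
    """Return a list of human-readable problems; empty means the path looks sane."""
    problems = []
    if source_mtu != destination_mtu:
        problems.append(
            f"endpoint MTUs differ: {source_mtu} vs {destination_mtu}")
    biggest_endpoint = max(source_mtu, destination_mtu)
    for hop, mtu in enumerate(intermediate_mtus, start=1):
        if mtu < biggest_endpoint:
            problems.append(
                f"hop {hop} MTU {mtu} is smaller than endpoint MTU {biggest_endpoint}")
    return problems
```

For example, check_path(9014, 9000, [9216]) reports the endpoint mismatch described above, while check_path(9000, 9000, [9216, 9216]) comes back empty.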

Note.  1Gbps Ethernet only supported a maximum MTU size of 9000, so although Jumbo Frames can be enabled you would need to reduce the MTU size slightly on the source and destination servers, with everything in between set at 9000.

Featured image credit; TaylorHerring.  As bike frames go, the Penny Farthing could well be considered to have a jumbo frame.

CentOS 7 – Watch out for nmtui

It would appear that I have been caught out twice now due to the way that nmtui (Network Manager Text User Interface) works.  I have been messing around with various internal sandboxed networks in my VM environment and (I can only assume in my haste), I have entered the IP address of a second NIC without full regard for the on screen prompts.

In nmtui, there is one field missing which is quite common in many other tools.  Take a look at the following and tell me what’s missing;

[Screenshot: nmtui_sandbox_config]

So what field do you think is missing?

Now although the information is all on screen in the screenshot above, there is one thing that may not be obvious.  In the Addresses field you specify not only the IP address but also the subnet mask in CIDR (which stands for Classless Inter-Domain Routing) notation.

If you happen to enter an IP address without thinking about it and don’t specify the netmask or CIDR, nmtui assumes that you are only referring to a /32, a.k.a. a netmask of 255.255.255.255, which for the uninitiated means just that IP.  It assumes that there is nothing beyond that IP address.  Its world is only itself.
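Python’s ipaddress module happens to make the same assumption when no prefix is given, which makes it a handy way to see what a bare address really means.  This is just an illustration of the /32 behaviour, not how nmtui works internally, and the address is an example.

```python
import ipaddress


def describe(address: str) -> str:
    """Summarise the netmask and network implied by an address string."""
    iface = ipaddress.ip_interface(address)
    net = iface.network
    return (f"{iface.ip} netmask {iface.netmask} "
            f"({net.num_addresses} address(es) in {net})")


if __name__ == "__main__":
    print(describe("192.168.20.1"))      # no prefix, treated as a /32
    print(describe("192.168.20.1/25"))   # prefix given explicitly
```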

In the good old days where I used to configure the IP address via ifcfg-eth* files, I also remembered to enter the NETMASK= line, and therefore never had this issue.
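If, like me, you still think in NETMASK= terms, converting a dotted netmask into the CIDR form that nmtui wants is a one-liner.  A minimal sketch, with helper names of my own choosing:

```python
import ipaddress


def netmask_to_prefix(netmask: str) -> int:
    """Convert a dotted netmask such as 255.255.255.0 into a prefix length."""
    return ipaddress.ip_network(f"0.0.0.0/{netmask}").prefixlen


def to_cidr(ip: str, netmask: str) -> str:
    """Build the address/prefix string to type into the Addresses field."""
    return f"{ip}/{netmask_to_prefix(netmask)}"
```

For instance, to_cidr("192.168.20.1", "255.255.255.128") gives the string 192.168.20.1/25.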

Anyway rant over.  Hopefully twice is enough, because if nothing else, if I have name resolution errors in my logs again, I will be making sure my netmask is set correctly, before thinking that tomcat is having issues.

Grrrrrr

Featured image credit:  Thanks to versageek for making the Network Spagetti image available on Flickr.com.

Back to basics – Setting a DHCP server

Following on from other posts in the “back to basics” series (Local yum repository and setting up an internal DNS server), here is the next step in the process of building an infrastructure services server.

So we have DNS in place and working.  Now let’s make sure that none of the client machines in the lab have to be configured with an IPv4 address manually.

First things first, let’s get dhcp installed;

[root@rhc-server etc]# yum install dhcp -y

Before I look to configure DHCP for my needs, let’s just have a quick look at the example configuration file.  This is good for a number of reasons; sometimes you will see certain options being used which you would not have used if it weren’t for having seen them in an example.

[root@rhc-server etc]# cat /usr/share/doc/dhcp*/dhcpd.conf.example
# dhcpd.conf
#
# Sample configuration file for ISC dhcpd
#

# option definitions common to all supported networks...
option domain-name "example.org";
option domain-name-servers ns1.example.org, ns2.example.org;

default-lease-time 600;
max-lease-time 7200;

# Use this to enble / disable dynamic dns updates globally.
#ddns-update-style none;

# If this DHCP server is the official DHCP server for the local
# network, the authoritative directive should be uncommented.
#authoritative;

# Use this to send dhcp log messages to a different log file (you also
# have to hack syslog.conf to complete the redirection).
log-facility local7;

# No service will be given on this subnet, but declaring it helps the
# DHCP server to understand the network topology.

subnet 10.152.187.0 netmask 255.255.255.0 {
}

# This is a very basic subnet declaration.

subnet 10.254.239.0 netmask 255.255.255.224 {
range 10.254.239.10 10.254.239.20;
option routers rtr-239-0-1.example.org, rtr-239-0-2.example.org;
}

# This declaration allows BOOTP clients to get dynamic addresses,
# which we don't really recommend.

subnet 10.254.239.32 netmask 255.255.255.224 {
range dynamic-bootp 10.254.239.40 10.254.239.60;
option broadcast-address 10.254.239.31;
option routers rtr-239-32-1.example.org;
}

# A slightly different configuration for an internal subnet.
subnet 10.5.5.0 netmask 255.255.255.224 {
range 10.5.5.26 10.5.5.30;
option domain-name-servers ns1.internal.example.org;
option domain-name "internal.example.org";
option routers 10.5.5.1;
option broadcast-address 10.5.5.31;
default-lease-time 600;
max-lease-time 7200;
}

# Hosts which require special configuration options can be listed in
# host statements.   If no address is specified, the address will be
# allocated dynamically (if possible), but the host-specific information
# will still come from the host declaration.

host passacaglia {
hardware ethernet 0:0:c0:5d:bd:95;
filename "vmunix.passacaglia";
server-name "toccata.fugue.com";
}

# Fixed IP addresses can also be specified for hosts.   These addresses
# should not also be listed as being available for dynamic assignment.
# Hosts for which fixed IP addresses have been specified can boot using
# BOOTP or DHCP.   Hosts for which no fixed address is specified can only
# be booted with DHCP, unless there is an address range on the subnet
# to which a BOOTP client is connected which has the dynamic-bootp flag
# set.
host fantasia {
hardware ethernet 08:00:07:26:c0:a5;
fixed-address fantasia.fugue.com;
}

# You can declare a class of clients and then do address allocation
# based on that.   The example below shows a case where all clients
# in a certain class get addresses on the 10.17.224/24 subnet, and all
# other clients get addresses on the 10.0.29/24 subnet.

class "foo" {
match if substring (option vendor-class-identifier, 0, 4) = "SUNW";
}

shared-network 224-29 {
subnet 10.17.224.0 netmask 255.255.255.0 {
option routers rtr-224.example.org;
}
subnet 10.0.29.0 netmask 255.255.255.0 {
option routers rtr-29.example.org;
}
pool {
allow members of "foo";
range 10.17.224.10 10.17.224.250;
}
pool {
deny members of "foo";
range 10.0.29.10 10.0.29.230;
}
}

As we can see from the above, the examples have provided us with pretty much everything we need to know and more to get things up and running.  Below is the configuration file that I created;

[root@rhc-server dhcp]# cat dhcpd.conf
#
#  lab.tobyheywood.com dhcp daemon configuration file
#
#  2016-02-22 - Initial creation
#

# Define which IP to listen on.  NOTE. daemon can only listen to one
# IP at a time if defined.
local-address 192.168.20.1;

# option definitions common to all supported networks...
option domain-name "lab.tobyheywood.com";
option domain-name-servers ns.lab.tobyheywood.com;

default-lease-time 600;
max-lease-time 7200;

# Use this to enble / disable dynamic dns updates globally.
#ddns-update-style interim;

# This is the authoritative DHCP server.
authoritative;

# Use this to send dhcp log messages to a different log file (you also
# have to hack syslog.conf to complete the redirection).
log-facility local7;

# My management network on a separate interface
# Included in configuration otherwise I get errors in journalctl
# **** NOT IN USE ****
subnet 192.168.122.0 netmask 255.255.255.0 {
}

# The lab network
subnet 192.168.20.0 netmask 255.255.255.128 {
range 192.168.20.50 192.168.20.99;
option routers rtr.lab.tobyheywood.com;
}

Hopefully the comments above are sufficient to give you a good idea of what each bit is there for and what it does.
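One thing dhcpd will complain about at start-up is a range that does not sit inside its subnet declaration, so it can be worth checking the numbers before restarting the daemon.  The following is a rough sketch using the lab values from my configuration; the function name is mine and it is no substitute for dhcpd’s own checks.

```python
import ipaddress


def range_fits(subnet: str, netmask: str, first: str, last: str) -> bool:
    """True when first..last is an ordered range of usable hosts in the subnet."""
    network = ipaddress.ip_network(f"{subnet}/{netmask}")
    low, high = ipaddress.ip_address(first), ipaddress.ip_address(last)
    reserved = (network.network_address, network.broadcast_address)
    return low <= high and all(
        ip in network and ip not in reserved for ip in (low, high))


if __name__ == "__main__":
    # The lab network declaration from the configuration above.
    print(range_fits("192.168.20.0", "255.255.255.128",
                     "192.168.20.50", "192.168.20.99"))
```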

Now let’s start the dhcpd daemon and check everything is working as it should.

[root@rhc-server dhcp]# systemctl enable dhcpd
[root@rhc-server dhcp]# systemctl start dhcpd
[root@rhc-server dhcp]# systemctl list-units | grep named
named.service        loaded active running     Berkeley Internet Name Domain (DNS)
[root@rhc-server dhcp]# systemctl list-units | grep -e dhcpd
dhcpd.service        loaded active running     DHCPv4 Server Daemon

And a quick review of the logs shows we are cooking with gas!

Feb 22 22:11:13 rhc-server systemd[1]: Starting DHCPv4 Server Daemon...
Feb 22 22:11:13 rhc-server systemd[1]: Started DHCPv4 Server Daemon.
Feb 22 22:11:13 rhc-server dhcpd[3145]: Internet Systems Consortium DHCP Server 4.2.5
Feb 22 22:11:13 rhc-server dhcpd[3145]: Copyright 2004-2013 Internet Systems Consortium.
Feb 22 22:11:13 rhc-server dhcpd[3145]: All rights reserved.
Feb 22 22:11:13 rhc-server dhcpd[3145]: For info, please visit https://www.isc.org/software/dhcp/
Feb 22 22:11:13 rhc-server dhcpd[3145]: Not searching LDAP since ldap-server, ldap-port and ldap-base-dn were not specified in the...ig file
Feb 22 22:11:13 rhc-server dhcpd[3145]: Internet Systems Consortium DHCP Server 4.2.5
Feb 22 22:11:13 rhc-server dhcpd[3145]: Copyright 2004-2013 Internet Systems Consortium.
Feb 22 22:11:13 rhc-server dhcpd[3145]: All rights reserved.
Feb 22 22:11:13 rhc-server dhcpd[3145]: For info, please visit https://www.isc.org/software/dhcp/
Feb 22 22:11:13 rhc-server dhcpd[3145]: Wrote 1 leases to leases file.
Feb 22 22:11:13 rhc-server dhcpd[3145]: Listening on LPF/ens8/52:54:00:ca:b3:a6/192.168.20.0/25
Feb 22 22:11:13 rhc-server dhcpd[3145]: Sending on   LPF/ens8/52:54:00:ca:b3:a6/192.168.20.0/25
Feb 22 22:11:13 rhc-server dhcpd[3145]: Listening on LPF/ens3/52:54:00:2b:da:2b/192.168.122.0/24
Feb 22 22:11:13 rhc-server dhcpd[3145]: Sending on   LPF/ens3/52:54:00:2b:da:2b/192.168.122.0/24
Feb 22 22:11:13 rhc-server dhcpd[3145]: Sending on   Socket/fallback/fallback-net
Feb 22 22:13:35 rhc-server dhcpd[3145]: DHCPDISCOVER from 52:54:00:a6:a4:fa via ens8
Feb 22 22:13:35 rhc-server dhcpd[2682]: DHCPDISCOVER from 52:54:00:a6:a4:fa via ens8
Feb 22 22:13:36 rhc-server dhcpd[3145]: DHCPOFFER on 192.168.20.50 to 52:54:00:a6:a4:fa (rhc-client) via ens8
Feb 22 22:13:36 rhc-server dhcpd[2682]: ns.lab.tobyheywood.com: host unknown.
Feb 22 22:13:36 rhc-server dhcpd[2682]: rtr.lab.tobyheywood.com: host unknown.
Feb 22 22:13:36 rhc-server dhcpd[2682]: DHCPOFFER on 192.168.20.50 to 52:54:00:a6:a4:fa (rhc-client) via ens8
Feb 22 22:13:37 rhc-server dhcpd[3145]: DHCPREQUEST for 192.168.20.50 (192.168.20.1) from 52:54:00:a6:a4:fa (rhc-client) via ens8
Feb 22 22:13:37 rhc-server dhcpd[3145]: DHCPACK on 192.168.20.50 to 52:54:00:a6:a4:fa (rhc-client) via ens8
Feb 22 22:13:37 rhc-server dhcpd[2682]: DHCPREQUEST for 192.168.20.50 (192.168.20.1) from 52:54:00:a6:a4:fa (rhc-client) via ens8
Feb 22 22:13:37 rhc-server dhcpd[2682]: DHCPACK on 192.168.20.50 to 52:54:00:a6:a4:fa (rhc-client) via ens8

So with the exception of the two “host unknown” errors we are looking good!  So that will do for now.
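When the log gets longer than this, picking the “host unknown” names out by eye gets tedious.  Here is a small sketch that pulls them out of dhcpd log lines; the regular expression is based purely on the format of the lines shown above, so treat it as an assumption about the log format rather than anything official.

```python
import re

# Matches lines such as:
#   ... dhcpd[2682]: ns.lab.tobyheywood.com: host unknown.
HOST_UNKNOWN = re.compile(r"dhcpd\[\d+\]: (?P<host>\S+): host unknown\.")


def unknown_hosts(lines):
    """Return the distinct host names dhcpd could not resolve, in log order."""
    seen = []
    for line in lines:
        match = HOST_UNKNOWN.search(line)
        if match and match.group("host") not in seen:
            seen.append(match.group("host"))
    return seen
```

Feeding it the output of journalctl -u dhcpd (split into lines) would give a short list of names to go and check in DNS.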

Time to go investigate the host unknown issue, ggrrrrrr!  Can I ping it? Yes I can!

Enabling GNS3 to talk to its host and beyond

I’m currently working my way through a CCNA text book and reached a point where I need to be able to perform some tasks which rely on connecting the virtual network environment inside of GNS3 to the host machine, for the purpose of connecting to a tftp service (just in case you were curious).

After a little googling it became apparent that this is indeed possible, though most of the guides focused on using GNS3 on a Windows machine, whereas I am very much a Linux guy.

So as a reminder to myself, but also as a helpful reference for others, here is what I had to do on my Fedora 22 machine.

I did this two ways: first using the traditional Linux tools, then again using the newer ip command (to make sure that I am utilising the latest tools for the job).

Standard method (from the command line)

$ sudo dnf install tunctl
$ sudo tunctl -t tap0 -u toby
$ sudo ifconfig tap0 10.0.1.10 netmask 255.255.255.0 up
$ sudo firewall-cmd --zone=FedoraWorkstation --add-interface=tap0 --permanent

 

Using the ip command from iproute2 (from the command line)

$ sudo ip tuntap add dev tap1 mode tap user toby
$ sudo ip addr add 10.0.0.10/255.255.255.0 dev tap1
$ sudo ip link set tap1 up
$ sudo firewall-cmd --zone=FedoraWorkstation --add-interface=tap1
$ sudo ip addr show tap1
11: tap1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 500
    link/ether 26:2b:e4:a0:54:ba brd ff:ff:ff:ff:ff:ff
    inet 10.0.0.10/24 scope global tap1
       valid_lft forever preferred_lft forever
    inet6 fe80::242b:e4ff:fea0:54ba/64 scope link
       valid_lft forever preferred_lft forever

Configuring the interface on the Cisco router inside of GNS3

router1#conf t
Enter configuration commands, one per line.  End with CNTL/Z.
router1(config)#int f0/0
router1(config-if)#ip address 10.0.0.1 255.255.255.0
router1(config-if)#no shut
router1(config-if)#end
router1#write mem
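The tap device on the host and the router interface have to sit in the same subnet for the two to talk directly, which is easy to get wrong when juggling 10.0.0.x and 10.0.1.x addresses.  A minimal sketch of that check, using the addresses from this post (the function name is mine):

```python
import ipaddress


def same_subnet(a: str, b: str) -> bool:
    """True when two address/prefix strings belong to the same network."""
    return ipaddress.ip_interface(a).network == ipaddress.ip_interface(b).network


if __name__ == "__main__":
    print(same_subnet("10.0.0.10/24", "10.0.0.1/24"))   # tap1 and router f0/0
    print(same_subnet("10.0.1.10/24", "10.0.0.1/24"))   # tap0 and router f0/0
```

With the values above, tap1 shares a subnet with the router but tap0 (10.0.1.10) does not, so the router would need an address in 10.0.1.0/24 if you went with the tap0 example instead.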

The bit of config inside GNS3

I nearly forgot to write this section.  Doh!  Anyway, it’s lucky for everyone that I remembered, so without any further padding… seriously, no more padding… the config in GNS3…

[Screenshot: GNS3_Side_Panel]
Select Cloud from the side panel in GNS3.

Next we need to configure the cloud… (hint: right click the cloud and select configure).

[Screenshot: GNS3_tap1_configuration_screen]
Select TAP and then type in the name of the tap device created, in my case tap1.

The final step is to draw the virtual connection between the cloud and the router (making sure to map it to the correct interface).

At this point we should be in a happy place.

Proof that it works

# ping -c 5 10.0.0.10
PING 10.0.0.10 (10.0.0.10) 56(84) bytes of data.
64 bytes from 10.0.0.10: icmp_seq=1 ttl=64 time=0.103 ms
64 bytes from 10.0.0.10: icmp_seq=2 ttl=64 time=0.096 ms
64 bytes from 10.0.0.10: icmp_seq=3 ttl=64 time=0.050 ms
64 bytes from 10.0.0.10: icmp_seq=4 ttl=64 time=0.058 ms
64 bytes from 10.0.0.10: icmp_seq=5 ttl=64 time=0.103 ms

--- 10.0.0.10 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 3999ms
rtt min/avg/max/mdev = 0.050/0.082/0.103/0.023 ms
# ping -c 5 10.0.0.1
PING 10.0.0.1 (10.0.0.1) 56(84) bytes of data.
64 bytes from 10.0.0.1: icmp_seq=1 ttl=255 time=9.16 ms
64 bytes from 10.0.0.1: icmp_seq=2 ttl=255 time=5.64 ms
64 bytes from 10.0.0.1: icmp_seq=3 ttl=255 time=11.2 ms
64 bytes from 10.0.0.1: icmp_seq=4 ttl=255 time=7.29 ms
64 bytes from 10.0.0.1: icmp_seq=5 ttl=255 time=2.98 ms

--- 10.0.0.1 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4005ms
rtt min/avg/max/mdev = 2.980/7.266/11.253/2.847 ms

router1#ping 10.0.0.10

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 10.0.0.10, timeout is 2 seconds:
.!!!!
Success rate is 80 percent (4/5), round-trip min/avg/max = 4/9/12 ms
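In the Cisco output each “!” is a reply and each “.” is a timeout, which is where the 80 percent comes from.  A single dropped first packet like this is commonly put down to ARP still resolving, though I have not verified that here.  A tiny sketch of the sum:

```python
def success_rate(result: str) -> int:
    """Percentage of echo replies in a Cisco-style ping result such as '.!!!!'."""
    if not result:
        raise ValueError("empty ping result")
    replies = result.count("!")
    return 100 * replies // len(result)


if __name__ == "__main__":
    print(success_rate(".!!!!"))
```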

I do believe that about covers it.

GNS3 Installation on Fedora 22

Following on from one of my earlier posts installing GNS3 on Fedora, CentOS and RHEL, it would appear that things have changed a little once you are on Fedora 22.

The following details the steps required to get up and running with GNS3 on Fedora 22.

When I tried to install the pre-requisites based on my original post, I got the following error;

Error: Transaction check error:
file /usr/bin/pylupdate4 conflicts between attempted installs of python3-PyQt4-devel-4.11.3-5.fc22.i686 and python3-PyQt4-devel-4.11.3-5.fc22.x86_64
file /usr/bin/pyrcc4 conflicts between attempted installs of python3-PyQt4-devel-4.11.3-5.fc22.i686 and python3-PyQt4-devel-4.11.3-5.fc22.x86_64

Obtaining the files

If you download the latest GNS3 zip file it will contain pretty much everything you need (apart from the required packages to build the different applications, see below).

https://community.gns3.com/community/software/download/

Note.  You will need to register.

Pre-reqs – GNS3 Generic

sudo dnf install python3-setuptools python3-devel python3-sip.i686 python3-sip.x86_64 python3-PyQt4.i686 python3-PyQt4.x86_64 python3-PyQt4-devel.i686 python3-net*

Note.  Fedora 22 now uses dnf for package management and yum is deprecated.  If you do run yum it will redirect your request to dnf.

Installation of dynamips

You will need to make sure you have the required pre-req packages as detailed below.

$ sudo dnf install gcc gcc-c++ elfutils-libelf-devel libuuid-devel cmake

And then the process to build remains the same;

$ cd /path/to/gns3_source/dynamips_extracted_folder
$ mkdir build
$ cd build
$ cmake ..
$ sudo make install

Installation of GNS3-GUI

$ cd /path/to/gns3_source/gns3-gui
$ sudo python3 setup.py install

Installation of GNS3-Server

$ cd ../gns3-server-1.3.7/
$ sudo python3 setup.py install

Installation of IOUYAP

$ sudo dnf install gcc flex bison glibc-devel iniparser-devel

Some packages may already be installed and will be skipped.

$ bison --yacc -dv netmap_parse.y
$ flex netmap_scan.l
$ gcc -Wall -g *.c -o iouyap -liniparser -lpthread

At this point there is still VPCS to configure, but as I’m not intending on using that right now, I shall leave that for another day.