vSphere 6.7 – operation failed for an undetermined reason.

Recently I’ve upgraded a test vCenter appliance to 6.7 Update 1 and found that a long-standing issue is, well, an even longer-standing issue.

I was attempting to deploy an OVA file via the vCenter HTML5 (and subsequently the Flex) client and was confronted with the following warning;

The operation failed for an undetermined reason. Typically this problem occurs due to certificates that the browser does not trust. If you are using self-signed or custom certificates, open the URL below in a new browser tab and accept the certificate, then retry the operation.


If this does not resolve the problem, other possible solutions are shown in this KB article:

Annoyingly, I was unable to copy and paste the KB URL, and whatever I typed in manually seemed to fail, but after a small amount of google use I got to the article.

The KB article discusses numerous issues relating to the TLS/SSL certificates being used.

In my case the certificates are all good; they are custom certs, but valid, with a number of SANs to cover all bases. However, it still failed even after following the advice in the KB.

For me the final fix was to log into the web UI of one of the ESXi 6.7 servers (which DO have self-signed certificates) and deploy the OVA from there.

Prior to logging in I did have to accept the self-signed certificate, which was clearly the underlying issue. My vCenter has legitimate SSL/TLS certs issued by an internal CA; however, the ESXi hosts do not.

This post exists primarily so that if I hit this issue again I will remember what to do. Failing that, I can search my blog :).

Featured image credit goes to: Kristine Cecilia, the image is entitled FAIL

Ever wondered how many “raw” devices you can create on a Linux server?

I have to admit this isn’t a question that crops up all that often, but given what I have been through over recent months, I thought I’d share with the world.

Let’s start from the beginning…

What are raw devices?

Raw devices are used when there is a need to present underlying storage devices as character devices, which will typically be consumed by an RDBMS.

Some examples of database technologies where “raws” might be used would be;

  • SAP IQ
  • SAP Replication Server
  • MySQL

The list is not limited to the above, but they are the ones I know about.

Why use raw devices?

Well, there is a belief (and probably some, maybe lots, of evidence) that file systems introduce extra overhead when processing I/O.  Though I would question whether, in this day and age of enterprise SSD and NVMe based storage solutions, it is anywhere near as relevant as it used to be.

So the idea is simple.  Present a raw character device (or several) to the database technology being used and allow it to handle the file system type tasks, thereby removing the (perceived?) overhead of a file system.

For me, I can only think this would be really relevant in particularly demanding, low-latency environments, and even then I would be keen to see some baseline figures to confirm whether the extra admin overhead raw devices bring is worth it.

Hang on! You enticed me in with how many raw devices can you create?  So tell me!!

OK, the greatest number of raw devices you can create on a Red Hat Enterprise Linux/CentOS 7 system is… 8192.

That’s 0 all the way up to 8191.

Would I really ever need that many raw devices?

Well, maybe.  I doubt you would ever really create 8192 raw devices, as this would require you to effectively provision the same number of storage LUNs.

So how did I stumble across this (potentially unimportant) fact?

Well, whilst working on the requirements from my local DBA team,  I was also attempting to introduce a level of standardisation, so it was easy to see what the raw devices were used for.

Raw devices are numbered, and it doesn’t look like you can use alphabetical characters in the name, so a numbering standard had to be created.  For example;

raw devices numbered 1100-1150 might be the raws used to store the actual data, 1300-1320 might contain the logs, and 1500-1510 might be for temp tables.
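To give a flavour of how the raws themselves get created, the udev rules might look something like the following.  This is a sketch only; the kernel device names and the raw numbers are invented for illustration.

```
# /etc/udev/rules.d/60-raw.rules (example - device names are made up)
# Bind a data LUN to raw1100 and a log LUN to raw1300
ACTION=="add", KERNEL=="dm-2", RUN+="/bin/raw /dev/raw/raw1100 %N"
ACTION=="add", KERNEL=="dm-3", RUN+="/bin/raw /dev/raw/raw1300 %N"
# Hand ownership of the resulting character devices to the DBAs
ACTION=="add", KERNEL=="raw*", OWNER="oracle", GROUP="dba", MODE="0660"
```

Reload with udevadm control –reload-rules and trigger, or simply reboot, and the bound devices appear under /dev/raw/.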

So if you also include a bit of future proofing, and have a large enough number of storage LUNs to provision for use directly by the RDBMS, then you could quickly find yourself constrained if you don’t plan ahead.

Anyway, the above was found out because I started to get strange problems when trying to create the udev rules which would create the raw devices, so I had a fun hour of trying to work out the magic number.

For raw devices (if not much else) it is 8191 – the maximum raw device number you can use.

Improving your email reputation

In the past I have looked at Sender Policy Framework (SPF), which checks the bounce address of an email; SenderID, which checks the FROM address part of the email header; and DomainKeys Identified Mail (DKIM), which signs your emails to confirm that they were indeed sent from an authorised email server (one which has a published public key in DNS).

Now the final piece in the email puzzle is DMARC.

What is DMARC?

Very briefly, it stands for Domain-based Message Authentication, Reporting and Conformance.

What does it do?

Two things;

  • Via DNS you (the sender) publish, under your domain’s zone, an instruction that states whether or not a receiving email server should trust your SPF, SenderID and DKIM (all three are also published under your domain’s DNS zone).  You can say whether they should reject or quarantine emails that purport to come from your email server but don’t (or do nothing at all).
  • You receive information, in XML format, about how your domain’s reputation is faring from a given receiving email server, as and when your users send emails to third parties.

What does the DNS record look like?

It looks like the following;

_dmarc IN TXT "v=DMARC1; p=none; rua=mailto:postmaster@lab.tobyheywood.com"

OK, so if we break this down a little we have the following components;

  • “_dmarc” – This is the name of the TXT record, which recipient email servers will try to retrieve from your DNS when configured to use DMARC.
  • “IN” – Standard part of a BIND DNS record.  Means Internet, nothing more, nothing less.
  • “TXT” – This is the record type.  DMARC utilises the bog standard TXT record type as this was seen as the quickest method to adoption, rather than treading the lengthy path towards a new DNS record type.
  • Now we have the actual payload (note they are separated by semicolons);
    • “v=DMARC1” – This is the version of DMARC
    • “p=none” – We are effectively saying do nothing; receiving servers are not being asked to take any action when the SPF, SenderID or DKIM checks fail
    • “rua=mailto:postmaster@lab.tobyheywood.com” – This doesn’t need to be included, but if you do include it you can expect to receive reports from the 3rd party email servers that have received email(s) from your domain, confirming what they thought of them

Are there other options?

Yes, though I am only going to focus on a couple here;

The “p=” option has three values that you can use;

  • none – Effectively do nothing, this should be set initially whilst you get things set up. Once you have confirmed things look good, then you can start to be a bit more forceful in what you would like other email providers to do with messages which do not come from your email server.
  • “quarantine” – This is where they would potentially pass the email on for further screening, or simply decide to put it into the spam/junk folder.
  • “reject” – This is you saying that if a 3rd party receives an email, supposedly from you, but that wasn’t sent from one of the list of approved email servers (SPF or SenderID) or if it doesn’t pass the DKIM test then it should be rejected and not even delivered.
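Once you are happy with what the reports are telling you, the record can be tightened up.  As an illustration (reusing my lab domain, plus the optional pct tag, which limits the percentage of failing messages the policy is applied to), a quarantine record might look like this;

```
_dmarc IN TXT "v=DMARC1; p=quarantine; pct=50; rua=mailto:postmaster@lab.tobyheywood.com"
```

You can check what any domain currently publishes with a simple TXT lookup, for example: dig +short TXT _dmarc.lab.tobyheywood.com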

You’ve set your _dmarc record, now what?

We will assume that your DNS zone has replicated to all of your DNS servers and that you have correctly configured your email server to sign your outbound emails with your DKIM private key.

At this point I would highly advise going to https://www.mail-tester.com and sending a test email (with a realistic subject and a paragraph or two of readable text) to the email address they provide.

Once mail-tester.com has received your test email, they will process the email headers to confirm whether or not SPF, SenderID, DKIM and DMARC are all correctly configured and working.

It is possible that if your DNS servers are not completely aligned and up-to-date, mail-tester.com may be unable to provide an accurate report.  If that happens give it 12 hours and repeat the test again.


Credit: Thanks to Mr Darkroom for making the featured image called Checkpoint available on Flickr.

Upgrading Fedora 25 to Fedora 26

Normally, I perform my OS upgrades by way of a clean install.  This time round though, I thought I’d give the upgrade process a try, given Fedora are pushing it quite a lot this time round.

The actual process took about 30-35 minutes on my machine, and that’s including the time required to download the software updates in the first place.

The upgrade process was started from the Software GUI.  Clicking the install button results in the PC rebooting and then running in “no man’s land” for a while whilst the updates are applied.  During the process all you really have to watch is a small bit of text in the upper left corner of your screen.
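For reference, the same upgrade can also be driven from the command line via the dnf system-upgrade plugin.  I took the GUI route this time, so treat this as an untested sketch;

```
# CLI equivalent of the Software GUI upgrade
sudo dnf install dnf-plugin-system-upgrade
sudo dnf system-upgrade download --releasever=26
sudo dnf system-upgrade reboot   # reboots into the offline upgrade
```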

Once the upgrade has completed, the PC reboots.

The first thing you notice is that grub now has a new kernel version to boot from.  Admittedly not overly noteworthy for me this time around, as I’m just upgrading my day-to-day machine and don’t really need to consider what new features there are in the kernel on this occasion.  And if it breaks something, then I will enhance my knowledge whilst fixing whatever goes wrong.

Next up I have the usual prompt for my disk encryption password and then shortly after that the login prompt.

Upon entering my password, my screen flickered, the screen went grey and the mouse pointer was relocated right into the centre of my screen.  At this point my PC locked up.  Awesome! Just what I wanted.

A brief bit of googling didn’t really show anything specific for Fedora 26, but it did yield a link to the Common Fedora 25 Bugs page.  The more interesting part described my exact problem: frozen grey screen after upgrade.

So, it looks like it is my fault, well, sort of.  I happen to have installed the EasyScreenCast Gnome plugin and this seems to upset things.  I left that enabled and installed; however, I removed (as advised in the F25 bugs page) the package “clutter-gst2”.

A quick reboot and my issue was resolved!  Yay google and the Fedora wiki to the rescue. And now to have a look at what has changed.

Red Hat Satellite Server – Fatal error in Python code occurred [[6]]

I have embraced Red Hat Satellite server in a big way over the past year and try to use it wherever possible, though not for everything.

One of the features I started using to simplify life, whilst I look at other configuration management systems, was Configuration Channels.  These allow you to provide a central repository of files and binaries which can be deployed to a server during the initial kickstart deployment process.

Some changes had been made a month or so ago, to ensure that a specific configuration channel would be included in future deployments by way of updating the Activation Key for that deployment type in Satellite server.  Seems innocent enough at this point.  It is worth noting that there were other configuration channels associated with this activation key.

At the same time I had also added a couple of packages to the software package list which were also required at time of deployment.

Now, I rely on scripts which have been deployed to a server to complete some post-build tasks.  The first thing I noticed after a test deployment was a complete lack of any scripts where I expected them to be.  The configuration channels had created the required folder structure but had then stopped completely and gone no further.  The error the Satellite server reported back to me was… well, not massively helpful;

Fatal error in Python code occurred [[6]]

Nothing more, nothing less.

At this point I started trying to remember what I had added (thankfully not too hard, as I document things quite heavily 🙂 ).  Here, roughly, are the steps I took to work out where the issue resided;

  • Remove the additional packages I had specified for this particular build – made no difference
  • Remove the most recently added configuration channel – made no difference
  • Tested another Red Hat Enterprise Linux 7 build (not using this particular kickstart profile) – success, so the issue would appear to be limited to this one profile
  • Remove the other configuration channels that were added some time before the last one – failed; the configuration channels still would not deploy. But wait, there was light at the end of the tunnel!

But, following this last step, the error message changed from something not very helpful to something quite helpful indeed!  The message stated that permissions could not be applied as per those stipulated against specific files in the configuration channel.

So it transpires that it was a permissions resolution issue.  Well, more a group resolution issue really.  There were a couple of files which were set to be deployed with a specific group.  The group in question is served from an LDAP server, and the newly built machine wasn’t configured at that point to talk to the LDAP server; for this particular deployment we didn’t want auto registration with the LDAP services.

So the lesson here is make small changes, test frequently and make sure you document what you have done.  Or use a configuration management system which is version controlled, so you can easily roll back.

Just so we are clear, I was running Red Hat Satellite Server 5.7 (fully patched) on RHEL 6.8 and trying to deploy RHEL 7.3.  My adventure to upgrade Satellite server to version 6.2 will be coming to a blog post soon.

So, it would appear this story comes with a lesson attached (free of charge) that all should take note of – “Always make one change at a time and test, or as near to one as you can”.

Featured image credit: Charly W Karl posted e.Deorbit closing on target satellite on Flickr.  Thanks very much.

iSCSI and Jumbo Frames

I’ve recently been working on a project to deploy a couple of Pure Storage FlashArray //M10s, and rather than using Fibre Channel we opted for 10Gb Ethernet (admittedly for reasons of cost), using iSCSI as the transport mechanism.

Whenever you read up on iSCSI (and NFS for that matter) there inevitably ends up being a discussion around the MTU size.  My thinking here is that if your network has sufficient bandwidth to handle jumbo frames and large MTU sizes, then it should be done.

Now I’m not going to ramble on about enabling Jumbo Frames exactly, but I am going to focus on the MTU size.

What is MTU?

MTU stands for Maximum Transmission Unit.  It defines the maximum size of a network frame that you can send in a single data transmission across the network.  The default MTU size is 1500 bytes.  Whether it be Red Hat Enterprise Linux, Fedora, Slackware, Ubuntu, Microsoft Windows (pick a version), Cisco IOS or Juniper’s JunOS, it has in my experience always been 1500 (though that’s not to say that some specialist providers may not change this default for black box solutions).

So what is a Jumbo Frame?

The internet is pretty much unified on the idea that any packet or frame which is above the 1500-byte default can be considered a jumbo frame.  Typically you would want to enable this for specific needs such as NFS and iSCSI, where the bandwidth is at least 1Gbps or, better, 10Gbps.

MTU sizing

A lot of what I read in the early days about this topic suggests that you should set the MTU to 9000 bytes, so what should you be mindful of when doing so?

Well, let’s take an example: you have a requirement to enable jumbo frames and you have set an MTU size of 9000 across your entire environment;

  • virtual machine interfaces
  • physical network interfaces
  • fabric interconnects
  • and core switches

So you enable an MTU of 9000 everywhere, and you then test your shiny new jumbo frame enabled network by way of a large ping;


$ ping -s 9000 -M do


> ping -l 9000 -f -t

Both of the above perform the same job.  They will attempt to send an ICMP ping;

  • To our chosen destination –
  • With a packet size of 9000 bytes (option -l 9000 or -s 9000), remember the default is 1500 so this is definitely a Jumbo packet
  • Where the request is not fragmented, thus ensuring that a packet of such a size can actually reach the intended destination without being reduced

The key to the above examples is the “-f” (Windows) and “-M do” (Linux) options.  These enforce the requirement that the packet be sent from your server/workstation to its intended destination without the size of the packet being messed with, aka fragmented (as that would negate the whole point of using jumbo frames).

If you do not receive a normal ping response back which states its size as being 9000 then something is not configured correctly.

The error might look like the following;

ping: local error: Message too long, mtu=1500
ping: local error: Message too long, mtu=1500

The above error is highlighting the fact that we are attempting to send a packet which is bigger than the local NIC is configured to handle.  It is telling us the MTU is set at 1500 bytes.  In this instance we would need to reconfigure our network card to handle the jumbo sized packets.
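On the Linux side the reconfiguration itself is simple enough.  A sketch, assuming the interface is called eth0 (adjust to suit your system);

```
# Raise the MTU at runtime
ip link set dev eth0 mtu 9000

# Confirm the change took effect
ip link show dev eth0

# To persist across reboots on RHEL/CentOS, add the line
#   MTU=9000
# to /etc/sysconfig/network-scripts/ifcfg-eth0
```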

Now let’s take a look at what happens with the ICMP ping request and its size.  As a test I have pinged the localhost interface on my machine and I get the following;

[toby@testbox ~]$ ping -s 9000 -M do localhost
PING localhost(localhost (::1)) 9000 data bytes
9008 bytes from localhost (::1): icmp_seq=1 ttl=64 time=0.142 ms
9008 bytes from localhost (::1): icmp_seq=2 ttl=64 time=0.148 ms
9008 bytes from localhost (::1): icmp_seq=3 ttl=64 time=0.145 ms
--- localhost ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2085ms
rtt min/avg/max/mdev = 0.142/0.145/0.148/0.002 ms

Firstly, notice the size of each request.  The initial request may have been 9000 bytes; however, that doesn’t take into account the header which needs to be added to the packet so that it can be correctly sent over your network or the Internet.  Secondly, notice that the packet was received without any fragmentation (note I used the “-M do” option to ensure fragmentation couldn’t take place).  In this instance the loopback interface is configured with a massive MTU of 65536 bytes, and so all worked swimmingly.

Note that the final packet size is actually 9008 bytes.

The packet size increased by 8 bytes due to the addition of the ICMP header mentioned above, making the total 9008 bytes.

My example above stated that the MTU had been set to 9000 on ALL devices.  In this instance the packets will never get to their intended destination without being fragmented, as 9008 bytes is bigger than 9000 bytes (stating the obvious, I know).
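Working the arithmetic the other way tells you the largest ping you can expect to succeed.  A quick back-of-envelope sketch for IPv4 (IPv6 headers are larger, so the number shrinks further);

```shell
# Largest unfragmentable ICMP echo payload through a 9000-byte MTU:
# the MTU minus the 20-byte IPv4 header minus the 8-byte ICMP header.
mtu=9000
max_payload=$((mtu - 20 - 8))
echo "$max_payload"   # prints 8972
```

So on a 9000-byte path, ping -s 8972 -M do should succeed where -s 9000 cannot.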

The resolution

The intermediary devices (routers, bridges, switches and firewalls) will need an MTU bigger than 9000, sized sufficiently to accept the desired packet size.  A standard Ethernet frame (according to Cisco) requires an additional 18 bytes on top of the 9000-byte payload.  And it would be wise to specify a bit higher still.  So an MTU size of 9216 bytes would be better, as it allows enough headroom for everything to pass through nicely.

Focusing on the available options in a Windows world

And here is the real reason for this post.  Microsoft, in all their wisdom, provide you with a drop-down box to select a predefined MTU size for your NICs.  With Windows 2012 R2 (possibly slightly earlier versions too), the nearest size you can set via the network card configuration GUI is 9014.  This would result in the packet being fragmented, or in the case of iSCSI it would potentially result in very poor performance.  An MTU of 9014 isn’t going to work if the rest of the network or the destination device is set at 9000.

The lesson here is to make sure that both source and destination machines have an MTU of equal size, and that anything in between must be able to support a higher MTU than 9000.  And given that Microsoft have hardcoded the GUI with a specific set of options, you will probably want to configure your environment to handle this slightly higher size.

Note.  1Gbps Ethernet only supported a maximum MTU size of 9000, so although Jumbo Frames can be enabled you would need to reduce the MTU size slightly on the source and destination servers, with everything in between set at 9000.

Featured image credit; TaylorHerring.  As bike frames go, the Penny Farthing could well be considered to have a jumbo frame.

A Step-by-Step Guide to Installing Spacewalk on CentOS 7

Please note.  This is for an outdated version of Spacewalk.

It would appear that during an upgrade of my blog at some point over the past year, I have managed to wipe out the original how to guide to installing Spacewalk on CentOS 7, so here we go again.

A step-by-step guide to installing Spacewalk on CentOS 7.  Just in case you weren’t aware, Spacewalk is the upstream project for Red Hat Satellite Server.

Assumptions;
  • You know the basic idea behind Spacewalk, if not see here
  • You have a vanilla VM with CentOS 7.2 installed which was deployed as a “minimal” installation
  • You have subsequently run an update to make sure you have the latest patches
  • You have root access or equivalent via sudo
  • You have got vim installed (if not, running the following command should fix that)
    yum install vim -y
  • The machine you intend to install Spacewalk onto has access to the internet


Firstly, we need to install and/or create the necessary YUM repo files that will be used to install Spacewalk directly from the official Spacewalk yum repository, along with all its associated dependencies.

  1. Run the following command as root on your spacewalk VM
    rpm -Uvh http://yum.spacewalkproject.org/2.5/RHEL/7/x86_64/spacewalk-repo-2.5-3.el7.noarch.rpm
  2. You then need to manually configure another yum repository for JPackage, which is a dependency for Spacewalk, by running the following (you will need to be the root user to do this);
    sudo -i
    cat > /etc/yum.repos.d/jpackage-generic.repo << EOF
    [jpackage-generic]
    name=JPackage generic
    mirrorlist=http://www.jpackage.org/mirrorlist.php?dist=generic&type=free&release=5.0
    enabled=1
    gpgcheck=1
    gpgkey=http://www.jpackage.org/jpackage.asc
    EOF
  3. And then we also need to install the EPEL yum repository configuration for CentOS 7;
    rpm -Uvh https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm

Installation: Embedded Database

Spacewalk utilises a database back end to store the required information about your environment.  The two options are PostgreSQL and Oracle.  Neither would be my preference but I always opt for the lesser of two evils – PostgreSQL.

The installation is a piece of cake, and can be performed by issuing the following command at the command line;

yum install spacewalk-setup-postgresql -y

During the process you should be prompted to accept the Spacewalk GPG key. You will need to enter “y” to accept!

Installation: Spacewalk

Now, things have been made pretty easy for you so far.  And we won’t stop now.  To install all of the required packages for Spacewalk just run the following;

yum install spacewalk-postgresql

And let it download everything you need.  In all (at the time of writing) there were 379 packages totalling 563M.

Again you will likely be prompted to import the Fedora EPEL (7) GPG key.  This is necessary so just type “y” and give that Enter key a gentle tap.

And.. you will also be prompted to import the JPackage Project GPG key.  Same process as above – “y” followed by Enter.

During the installation you will see a lot of text scrolling up the screen.  This will be a mix of general package installation output from yum and some commands that the RPM package will initiate to set and define such things as SELinux contexts.

The key thing is you should see right at the end “Complete!”.  You know you are in a good place at this point.

Security: Setting up the firewall rules

CentOS 7 and (for that matter) Red Hat Enterprise Linux 7 ship with firewalld as standard.  Now, I’m not completely sold on firewalld, but I’m sticking with it; should you decide you want to use iptables instead (and you have taken steps to make sure it is enabled), I have provided the firewall rules required for both;


firewall-cmd --zone=public --add-service=http
firewall-cmd --zone=public --add-service=http --permanent
firewall-cmd --zone=public --add-service=https
firewall-cmd --zone=public --add-service=https --permanent

Note.  Make sure you have double dashes/hyphens if you copy and paste as I have seen the pasted text only using a single hyphen.

Skip to the section after iptables if you have applied the above configuration!


Now, as iptables can be configured in all manner of ways, I’m just going to provide the basics; if your set-up is more customised than the default, then you probably don’t need me telling you how to set up iptables.

I will just make one assumption though.  That the default INPUT policy is set to DROP and that you do not have any DROP or REJECT lines at the end of your INPUT chain.

iptables -A INPUT -p tcp --dport 80 -j ACCEPT
iptables -A INPUT -p tcp --dport 443 -j ACCEPT

And don’t forget to save your firewall rules;

# service iptables save

Configuring Spacewalk

Right then, still with me?  Awesome, so let’s continue with getting Spacewalk up and running.  At this point there is one fundamental thing you need…

You must have a resolvable Fully Qualified Domain Name (FQDN).  For my installation I have fudged it and added the FQDN to the hosts file, as I intend to build the rest of my new lab environment using Spacewalk.
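For completeness, the fudge amounts to nothing more than a line in /etc/hosts along these lines (the IP address and names here are examples only);

```
# /etc/hosts - example entry, adjust the IP and FQDN to suit
192.168.0.10   spacewalk.lab.example.com   spacewalk
```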

So assuming you have followed everything above, we can now simply run the following;

spacewalk-setup
Note.  The above assumes you have the embedded PostgreSQL database and not a remote DB, or the Oracle DB option.  Just saying.

So you should see something like the following (it may take quite some time for many of the tasks to complete, so bear with it);

[root@spacewalk ~]# spacewalk-setup
* Setting up SELinux..
** Database: Setting up database connection for PostgreSQL backend.
** Database: Installing the database:
** Database: This is a long process that is logged in:
** Database:   /var/log/rhn/install_db.log
*** Progress: ###
** Database: Installation complete.
** Database: Populating database.
*** Progress: ###########################
* Configuring tomcat.
* Setting up users and groups.
** GPG: Initializing GPG and importing key.
** GPG: Creating /root/.gnupg directory
You must enter an email address.
Admin Email Address? toby@lab.tobyhewood.com
* Performing initial configuration.
* Configuring apache SSL virtual host.
Should setup configure apache's default ssl server for you (saves original ssl.conf) [Y]? 
** /etc/httpd/conf.d/ssl.conf has been backed up to ssl.conf-swsave
* Configuring jabberd.
* Creating SSL certificates.
CA certificate password? 
Re-enter CA certificate password? 
Organization? Toby Heywood
Organization Unit [spacewalk]? 
Email Address [toby@lab.tobyhewood.com]? 
City? London
State? London
Country code (Examples: "US", "JP", "IN", or type "?" to see a list)? GB
** SSL: Generating CA certificate.
** SSL: Deploying CA certificate.
** SSL: Generating server certificate.
** SSL: Storing SSL certificates.
* Deploying configuration files.
* Update configuration in database.
* Setting up Cobbler..
Cobbler requires tftp and xinetd services be turned on for PXE provisioning functionality. Enable these services [Y]? y
* Restarting services.
Installation complete.
Visit https://spacewalk to create the Spacewalk administrator account.

Now at this point you are almost ready to break open a beer and give yourself a pat on the back.  But let’s finalise the installation first.

Creating your Organisation
(that’s Organization for the Americans)

Setting up your organisation requires only a few simple things to be provided.

  • Click the Create Organization button and you should finally see a screen similar to the following;
    Set up your Spacewalk organization.
  • The last thing to do now you have your shiny new installation of Spacewalk is to perform a few sanity checks;
    Successful installation of Spacewalk.
  • Navigate to Admin > Task Engine Status and confirm that everything looks healthy and that the Scheduling Service is showing as “ON”
  • You can also take a look at my earlier blog post – spacewalk sanity checking – about some steps I previously took to make sure everything was running.

And there we go, you have installed Spacewalk.

Security Broken by Design

Admit it. You, just like me, use Google every day to answer those tough questions that we face daily.

Sometimes we will ask it how to get us home from somewhere we have never been before – “OK Google, take me home” – other times we might be close to starvation (relatively speaking) – “show me interesting recipes” or “OK Google, give me directions to the nearest drive through McDonalds” – but where I use it most is at work, where I search for such mundane things as “rsyslog remote server configuration”. Yes, I know, I could just look at the man page for rsyslog.conf, but Google seems to have worked its way into my head so much that it is often the first place I look.

Right… back to the topic at hand – Security Broken by Design.

So whilst googling how to set up a remote syslog server I read through one person’s blog post and an alarm bell started to ring!

This particular post had correctly suggested the configuration for rsyslog on both the client and server, but then went on (in a very generic way) to instruct readers to open up firewall ports on the clients.

This highlighted a fundamental lack of understanding on the part of the individual whose blog I was reading. You only need to open up ports 514/tcp or 514/udp to enable rsyslog to function on the server-side.  The connection is initiated from the client NOT the server.  Granted, in a completely hardened installation it is likely that outbound ports will need to be enabled.  BUT, where security is concerned, I feel that things should not be taken for granted or worse, assumed!

This generic discussion about security seems completely idiotic! The likes of Red Hat, Ubuntu and almost all other distributions now enable firewalls by default.  And the normal fashion for such a thing is to allow “related” and “established” traffic to flow out of your network card to the LAN and potentially beyond, but (and more importantly) to block non-essential traffic inbound to your machine.

If you are working in a hardened environment then one of the two options below would be better suited for your server;
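By way of illustration only (a sketch, assuming rsyslog forwarding over 514/tcp and a default-deny OUTPUT chain, which is not the default on most distributions), the client-side rules might look something like this;

```
# firewalld - permit outbound syslog from a hardened client
firewall-cmd --permanent --direct --add-rule ipv4 filter OUTPUT 0 -p tcp --dport 514 -j ACCEPT
firewall-cmd --reload

# iptables - the equivalent rule
iptables -A OUTPUT -p tcp --dport 514 -j ACCEPT
service iptables save
```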

So in short.

Please think before you make potentially unnecessary changes to your workstations and servers!

Thanks to Sarah Joy for posting the featured image Leader Lock on Flickr.

Openfire Server-to-Server connectivity issue

Recently, I’ve been working on deploying a clustered Instant Messaging (IM) chat service in my lab, and after setting up the clustering by way of the Hazelcast plugin, I found that I was getting some rather strange errors written to the log files, which suggested that server-to-server connectivity was not being successfully initiated.

Here is a snippet from the log file;

2016.09.14 17:51:13 WARN [Server SR - 16593225]: org.jivesoftware.openfire.net.SocketReader - Closing session due to incorrect hostname in stream header. Host: of1.lab.tobyheywood.com. Connection: org.jivesoftware.openfire.net.SocketConnection@c53ac0 socket: Socket[addr=/,port=44042,localport=5269] session: null
 2016.09.14 17:51:13 WARN [Server SR - 3158473]: org.jivesoftware.openfire.net.SocketReader - Closing session due to incorrect hostname in stream header. Host: of1.lab.tobyheywood.com. Connection: org.jivesoftware.openfire.net.SocketConnection@1f8e35b socket: Socket[addr=/,port=44043,localport=5269] session: null
 2016.09.14 17:51:13 WARN [pool-10-thread-3]: org.jivesoftware.openfire.server.ServerDialback[Acting as Originating Server: Create Outgoing Session from: openfire.lab.tobyheywood.com to RS at: of1.lab.tobyheywood.com (port: 5269)] - Unable to create a new outgoing session
 2016.09.14 17:51:13 WARN [pool-10-thread-3]: org.jivesoftware.openfire.session.LocalOutgoingServerSession[Create outgoing session for: openfire.lab.tobyheywood.com to of1.lab.tobyheywood.com] - Unable to create a new session: Dialback (as a fallback) failed.
 2016.09.14 17:51:13 WARN [pool-10-thread-3]: org.jivesoftware.openfire.session.LocalOutgoingServerSession[Authenticate local domain: 'openfire.lab.tobyheywood.com' to remote domain: 'of1.lab.tobyheywood.com'] - Unable to authenticate: Fail to create new session.

Now, as part of my investigation into the issue I noticed that the servers were not listening on the server-to-server port (TCP port 5269).  Which in all honesty confused me even more.

A bit of Googling later (admit it, we all do it sometimes), and I had found the solution.


From the Openfire web UI, navigate to the following location and set the STARTTLS Policy to “Required”.

  • Server > Server Settings > Server to Server > STARTTLS Policy = Required.

Do this for both (or all) nodes and restart the services.  You should find that things are looking happier.

kernel: BUG: soft lockup – CPU#0 stuck for 67s!

Over the weekend I was confronted by the above error being repeated on the console of a VM running Oracle RDBMS.

This error occurs when there is a shortage of CPU resources.  For me the solution was a quick shutdown of the VM and an increase in the available CPU resources.  However, there are more ways to skin this cat…

There is also a kernel parameter which can be tweaked;


Where “x” is the threshold (in seconds) you want to allow the kernel to wait before deciding there has been a soft lockup.
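On recent kernels, I believe the tunable being referred to is kernel.watchdog_thresh (older kernels exposed kernel.softlockup_thresh instead).  Assuming that parameter, the tweak would look something like this in /etc/sysctl.conf;

```
# /etc/sysctl.conf - assuming kernel.watchdog_thresh is the relevant
# parameter on your kernel (older kernels used kernel.softlockup_thresh)
kernel.watchdog_thresh = 30
```

Apply it without a reboot using sysctl -p.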

The Red Hat documentation showed a threshold of 30 seconds.  So I would recommend a bit of experimentation if you feel that 30 seconds is not high enough.  Or throw more resources at it.

Featured image: The Old Lockup was made available by John Powell on Flickr.