13-06-2014 03:47 PM
Like many people here, I've been trying to get a SureSignal (v1 and v3) working with a BT broadband connection. Following extensive testing, I'm pretty sure I've identified the root cause - which, unfortunately, appears to be on BT's network.
Firstly, the internet connection from BT is provided over a standard phone line using either ADSL (for conventional broadband) or VDSL (for BT Infinity). The connection is established using PPPoA or PPPoE - the former for conventional broadband, and the latter for Infinity. The PPP connection allows data to be transferred between your router and a device inside BT's network - it is effectively a tunnelled connection. This connection has an MTU associated with it, which defines the maximum packet size that can be conveyed across the link.
For a standard wired Ethernet segment, the MTU is almost always 1500 bytes. For a PPPoE session, the MTU is usually set to 1492 bytes, as the PPPoE protocol has a 6-byte header, and the PPP protocol uses an additional two bytes. This means that the maximum packet size conveyed by the PPPoE connection is 1492 bytes.
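The arithmetic above can be written out as a small sketch. The constant and function names are purely illustrative; the byte counts are the ones quoted in the post.

```python
# Header sizes in bytes, as quoted in the post above.
ETHERNET_MTU = 1500
PPPOE_HEADER = 6
PPP_PROTOCOL_FIELD = 2


def pppoe_link_mtu(ethernet_mtu: int = ETHERNET_MTU) -> int:
    """Largest IP packet that fits in one PPPoE-encapsulated Ethernet frame."""
    return ethernet_mtu - PPPOE_HEADER - PPP_PROTOCOL_FIELD


print(pppoe_link_mtu())  # 1492
```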
In order to transfer more than 1492 bytes, it is necessary to split the packet into multiple parts, i.e. fragment it. The transmitting host will automatically try to determine the path MTU (i.e. the effective MTU between devices). It is possible to verify this under Linux using the 'tracepath' command. This works according to RFC 1191 - it sends a large packet to the remote host, and watches for any ICMP replies. If a router along the path can't transfer the packet, it will return an ICMP response indicating that the packet is too large, and that it requires fragmentation. Running the tracepath command from inside my network to a host on the internet correctly indicates that the path MTU of my infinity connection is 1492 bytes.
Now, if the same command is run from a host on the internet, the path MTU is reported as 1500 bytes. This indicates that the router that needs to reduce the size of the MTU is not sending the ICMP 'too large' response. The router responsible for sending this is the PPP concentrator on BT's network. It needs to send this response to allow remote hosts to correctly discover the path MTU, but for some reason, it doesn't. So we effectively have an internet connection that doesn't obey the rules.
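To make the difference between the two directions concrete, here is a toy simulation of RFC 1191 discovery. It is only a sketch: the `Hop` class, the hop names and the `sends_icmp` flag are made up for illustration, and real tools such as tracepath do more than this (they also vary the TTL).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Hop:
    name: str
    next_link_mtu: int        # MTU of the link this hop forwards onto
    sends_icmp: bool = True   # False models a router that drops silently


def forward(hops: List[Hop], size: int) -> Optional[int]:
    """Send one don't-fragment packet of `size` bytes along the path.

    Returns the MTU reported in an ICMP 'fragmentation needed' reply, or
    None if no ICMP came back (packet delivered, or silently dropped).
    """
    for hop in hops:
        if size > hop.next_link_mtu:
            return hop.next_link_mtu if hop.sends_icmp else None
    return None


def discover_path_mtu(hops: List[Hop], first_hop_mtu: int = 1500) -> int:
    """Lower the estimate each time an ICMP reply reports a smaller MTU."""
    estimate = first_hop_mtu
    while True:
        reported = forward(hops, estimate)
        if reported is None:
            return estimate  # no complaint heard, so the sender keeps this value
        estimate = reported


outbound = [Hop("home router", 1492)]
inbound_ok = [Hop("internet router", 1500), Hop("PPP concentrator", 1492)]
inbound_silent = [Hop("internet router", 1500),
                  Hop("PPP concentrator", 1492, sends_icmp=False)]

print(discover_path_mtu(outbound))        # 1492
print(discover_path_mtu(inbound_ok))      # 1492
print(discover_path_mtu(inbound_silent))  # 1500
```

The last case is the one I believe I'm seeing: the sender never hears a complaint, so it carries on believing the path MTU is 1500.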
Why does this matter? When the SureSignal attempts to establish communication with the Vodafone servers, it is trying to set up an IPSEC VPN. As part of the negotiation, the Vodafone server will try to send a certificate to the SureSignal. This is quite large, and results in a fragmented packet. The Vodafone server automatically splits the packet into 1500-octet chunks, as this is the MTU of the segment that the machine is connected to.
When the 1500-octet packet reaches the BT PPP concentrator, the packet clearly exceeds the 1492-byte MTU, and an ICMP 'too large' packet should be sent back to the Vodafone server. In turn, the server will reduce the packet size, and the packet will be sent through the PPP link successfully. However, as the ICMP response isn't being sent, the packet is too large to fit through the PPP tunnel, and is dropped by the BT PPP concentrator. How nice. We (on the customer side of the BT PPP tunnel) would never know - the packet simply wouldn't arrive. In fact, if you trace packets on this interface, you can see the final small fragment arriving - clearly useless without the previous packet data.
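A rough sketch of why only the small tail fragment turns up. The 1700-byte payload is a made-up figure (I don't know the real certificate size), IP options are assumed absent, and the "silently drops anything oversized" behaviour is my theory of what the concentrator does rather than something I can confirm.

```python
from typing import List

IP_HEADER = 20   # bytes, assuming no IP options
UDP_HEADER = 8


def fragment_sizes(udp_payload: int, mtu: int) -> List[int]:
    """Total IP packet size of each fragment of one UDP datagram."""
    remaining = udp_payload + UDP_HEADER
    # Every fragment except the last carries a multiple of 8 data bytes.
    per_fragment = (mtu - IP_HEADER) // 8 * 8
    sizes = []
    while remaining > 0:
        chunk = min(per_fragment, remaining)
        sizes.append(chunk + IP_HEADER)
        remaining -= chunk
    return sizes


def survivors(sizes: List[int], link_mtu: int) -> List[int]:
    """Fragments that fit through a link that silently drops oversized packets."""
    return [s for s in sizes if s <= link_mtu]


sent = fragment_sizes(udp_payload=1700, mtu=1500)  # 1700 is a hypothetical size
print(sent)                   # [1500, 248]
print(survivors(sent, 1492))  # [248]
```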
So the fix is simple: BT needs to make their concentrator return the ICMP response as per RFC 1191. This would allow the Vodafone server to reduce the packet size, and in turn the IPSEC VPN could be established between the Vodafone server and the SureSignal. Now - I've said it is simple but I really have no idea whether this is even possible - I don't work for BT, and have no knowledge of their configuration. However, it is a national issue, and they do need to fix it.
As to why some routers work and some don't: it is possible to change the MTU of the PPP connection to more than 1492 bytes. Setting the MTU to 1500 should work, as it will match the corresponding MTU on Vodafone's Ethernet segment. It doesn't fix the problem, but it does allow you to work around it. BT allows you to increase the PPP link MTU, but the PPP client needs to support RFC 4638.
However, there's another caveat: if (like me) you're trying to make this work on a conventional ADSL connection using a Vigor 120 ADSL modem, then you may be out of luck. The Vigor 120 essentially converts between PPPoE and PPPoA, allowing a router to connect to the BT ADSL network using PPPoE. Because of the 8-byte PPPoE overhead, a PPP link MTU of 1500 requires the Ethernet segment connected to the modem to have an MTU of 1508. This is known as a baby jumbo frame. If you can find a router that supports this (I'm using a MikroTik RouterBoard) then great - but unfortunately the Vigor doesn't support an Ethernet MTU of greater than 1500 bytes. I have identified another modem to try (D-Link DSL320B) but I have yet to experiment with it.
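The baby jumbo requirement is the same 8-byte sum run in reverse. A minimal sketch, with hypothetical function names and the modem's maximum Ethernet MTU passed in as a plain number:

```python
PPPOE_OVERHEAD = 6 + 2  # PPPoE header plus PPP protocol field, in bytes


def ethernet_mtu_needed(ppp_mtu: int) -> int:
    """Ethernet MTU required to carry a PPPoE session with the given PPP MTU."""
    return ppp_mtu + PPPOE_OVERHEAD


def modem_can_carry(ppp_mtu: int, modem_max_ethernet_mtu: int) -> bool:
    """True if the modem's Ethernet side is big enough for that PPP MTU."""
    return ethernet_mtu_needed(ppp_mtu) <= modem_max_ethernet_mtu


print(ethernet_mtu_needed(1500))    # 1508
print(modem_can_carry(1500, 1500))  # False
print(modem_can_carry(1492, 1500))  # True
```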
The long and short of it is: BT need to fix their broken implementation to send the correct ICMP response for packets which exceed the PPP MTU. Vodafone could theoretically reduce their Ethernet MTU to achieve the same result, but this really isn't the right solution to the problem - BT needs to make its network behave as per the standards. You may have some success changing the MTU of the PPP connection (some of the BT hubs support this), but again, you're only really working around the problem.
I have raised this as an issue with BT (who, incidentally, are familiar with the SureSignal issue, and are blaming Vodafone). I will post updates if I receive a sensible reply... in the meantime, I'm happy to help anyone who wants a better understanding of the problem. Hope this is useful to someone.
13-06-2014 04:09 PM
That is WAY over my head but it sounds like you definitely know your stuff! Does it explain my problem, which is that my SS v1 usually works fine with my BT HH3 (Infinity 2 connection) but twice, once a few months ago and again the other day (still not resolved this time), it just stopped working? Why would it work 99% of the time - for around a year now - but then suddenly have moments of not working? Does that fit your account above?
13-06-2014 04:21 PM
It's unlikely to be the same problem - in my case, it just plain refuses to work. Unfortunately the SureSignal provides precious little information to determine what's going on - you really need to get down to packet level to see what it's up to. I guess there are a few possibilities in your case: a firmware upgrade to the BT hub could have caused problems (this issue from March looks interesting) - another possibility is that the BT dynamic IP address might not be in a known range to Vodafone, and needs whitelisting.
Initially, we upgraded our SS1 to a SS3 in the hope that they would have fixed the firmware - this was before we understood where the problem was in our case.
13-06-2014 05:22 PM
This issue always strikes me as very odd. I've used an SS2 and an SS3 with a HH4 and HH5 with both ADSL and Infinity and never had an issue with connectivity.
13-06-2014 05:38 PM
I agree that it is strange. If the MTU on the HH was set to 1500, you wouldn't see the problem. Also - it's not beyond the realms of possibility that only some of BT's PPP concentrators are affected by this issue, which could also contribute to the randomness of the problem.
We did have a SS1 working on a BT ADSL connection using an old 2-wire hub. When we changed to a different router and modem combination, it stopped working. The extra 8 bytes associated with the PPPoE header were the cause. The SS would not work on my home (Infinity) connection because I can't set an MTU higher than 1492 with the BT business hub 3.
If you've got time, it would be very interesting to perform an inbound tracepath to your BT public IP address. You can find your public IP address here and run a tracepath from here. If your path MTU is in the region of 1492, it suggests that you're connected to a concentrator that is returning the correct ICMP packets. If your path MTU is 1500, either your router's MTU is set to 1500, or the concentrator is not returning the correct ICMP packets.
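For anyone who runs the test, this is roughly how I would read the result. It is just my reasoning written down as a hypothetical helper, not anything official: `reported_pmtu` is what the inbound tracepath shows and `router_ppp_mtu` is whatever your own router is configured with.

```python
def interpret_inbound_pmtu(reported_pmtu: int, router_ppp_mtu: int) -> str:
    """Rough reading of an inbound tracepath result (illustrative only)."""
    if reported_pmtu < 1500:
        return "concentrator appears to be sending ICMP 'fragmentation needed'"
    if router_ppp_mtu >= 1500:
        return "link really carries 1500 bytes, so there is nothing to discover"
    return "link MTU is below 1500 but nobody told the sender: ICMP is missing"


print(interpret_inbound_pmtu(1492, 1492))
print(interpret_inbound_pmtu(1500, 1500))
print(interpret_inbound_pmtu(1500, 1492))
```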
Ultimately I'm trying to provide the results of my investigation to try and get to the bottom of this once and for all. It is a problem that seems to have affected some users since the SS1 was first released, and BT/Vodafone don't seem to be able to fix it. Hopefully the information in this thread will help someone on the BT or Vodafone side to home in on the actual problem.
13-06-2014 05:50 PM
14-06-2014 05:47 PM
16-06-2014 10:24 AM
I'll be sure to let you know the result of my communication with BT. Sadly they only seem to want to communicate by letter, so I'm not holding out much hope that it will reach the right person to deal with this.
19-11-2014 09:11 AM
Patrick,
Did you ever manage to solve your issue here?
I encountered problems with Sure Signal the moment I upgraded from ADSL to VDSL (BT Infinity). I did exactly the same as you and upgraded from SS1 to SS3 but with no luck either.
After staggering around between BT and VF support trying to find a solution, I have come to exactly the same conclusion as you. Shame I didn't find your post 2 weeks ago.
Anyhow, I would be very interested to hear how you've got on.
BN
19-11-2014 09:44 AM
Was the issue ever solved? Not at all, however I have learnt quite a bit more about what's going on.
I actually have two connections in different locations - one is a BT infinity connection, and the other is a BT ADSL broadband connection. Neither connection worked with the standard BT business hub.
Since I posted this, I've done quite a bit of work to determine what I think is going on. Now - here's the thing: I can only theorise as to what's going on here, as both Vodafone and BT tech support are about as useful as a used teabag - neither is able (or willing) to answer the technical questions that I have posed - which means that I can't confirm whether what I'm seeing is in fact true or not.
I replaced the router on both connections with a Mikrotik RB2011-series router. This isn't your average consumer-grade device - it has more functionality than you can imagine, and has been used to successfully get a VSS working in both locations. I wouldn't consider the issue resolved by the way - it's working in spite of Vodafone and BT, not because of them, if that makes sense.
Anyway - my BT Infinity connection eventually started working (funnily enough, after I leaned on BT). The VSS connected and has remained connected for several months now. Happy days. However, the ADSL connection stubbornly refused to connect. Now - both of these connections should (technically) be the same - so why does it only work in one location? I suspect this may be down to BT configuration on the BRAS (Broadband Remote Access Server).
Many people have observed that Vodafone requires an MTU of 1500 all the way to the VSS on the customer site. When a large UDP packet is sent (as is the case when the Vodafone server starts to negotiate the IPSec tunnel encryption), the payload is split into several 1500-byte packets. When a BT ADSL (or infinity) connection is set up, the MTU is actually a little smaller than this to account for the additional PPP headers. This shouldn't normally present a problem as there are mechanisms to deal with this, but Vodafone have helpfully blocked this mechanism on their firewall.
What's supposed to happen is this: when a large packet arrives at a router which has a lower MTU on the other side, an ICMP 'fragmentation required' packet is generated, and sent back to the sending host. This should in turn cause the sender to reduce the size of the packet and re-send. This process should not be confused with TCP MSS clamping - the VSS boxes use UDP to negotiate tunnel encryption - so MSS clamping isn't relevant here. Now - here's the killer: have you ever tried to ping the servers that the VSS boxes try to contact? No reply? This is very bad. What this means is that Vodafone have helpfully blocked inbound ICMP packets which are used by both ping and, critically, the 'fragmentation required' mechanism. Let me say that again: Vodafone have blocked the very packets that would allow their servers to deal with differing MTU values.
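For the curious, this is roughly what the 'fragmentation required' message looks like on the wire: an ordinary ICMP packet (type 3, code 4) with the next-hop MTU in the header, as laid out in RFC 1191. A firewall that drops all ICMP therefore drops this along with ping. The function names are mine, and the quoted original packet is just zero bytes as a placeholder.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """RFC 1071 one's-complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def fragmentation_needed(next_hop_mtu: int, original: bytes) -> bytes:
    """Build ICMP type 3 (destination unreachable), code 4, per RFC 1191.

    `original` is the offending packet's IP header plus the first 8 bytes
    of its payload, which the router quotes back to the sender.
    """
    header = struct.pack("!BBHHH", 3, 4, 0, 0, next_hop_mtu)
    checksum = internet_checksum(header + original)
    return struct.pack("!BBHHH", 3, 4, checksum, 0, next_hop_mtu) + original


def parse_next_hop_mtu(message: bytes) -> int:
    """What the sending host reads out of the message to shrink its packets."""
    icmp_type, code, _checksum, _unused, mtu = struct.unpack("!BBHHH", message[:8])
    if (icmp_type, code) != (3, 4):
        raise ValueError("not a 'fragmentation needed' message")
    return mtu


quoted = bytes(28)  # placeholder for the quoted IP header + 8 payload bytes
msg = fragmentation_needed(1492, quoted)
print(parse_next_hop_mtu(msg))  # 1492
```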
I asked Vodafone tech support about this, and was told that it had been blocked for 'security'. Clearly Vodafone's network technicians have no clue about low-level network protocols. But given Vodafone's track record with customer service elsewhere, this isn't really a surprise. Note also that BT could still be involved here, but it's impossible for me to sniff packets on BT's network side, so I can't 100% prove that my theory is correct.
Effectively this means that Vodafone's servers require the MTU to be 1500 to the VSS because their servers don't know any different. Clever, eh?
So what options do you have? Several, but none of them are fixing the problem - they are all workarounds:
Personally, I'd rather that Vodafone simply removed the ICMP block on their firewall. Anyone who blocks the entire ICMP protocol for 'security' deserves to be re-employed elsewhere. This issue, which seems to affect hundreds of customers (and, incidentally, has made Vodafone millions in revenue from selling the VSS boxes), could be fixed for everyone in less than 5 minutes by a competent network engineer.
Sorry for the length - hope there was some useful information...