
File downloads are truncated on Three Broadband

electricworry
Active

I'm looking for some verification about whether the issue I have is isolated to me, my area, or if it's a general Three-wide problem as I think it is.

I use Three 5G broadband and I'm about 50 metres away from the gNodeB so I've got excellent uninterrupted signal. It's not a Layer 1 problem I'm facing. The problem I have is that TCP connections are terminated prematurely (i.e. a RST packet is sent) before all data is received. Here's a simple test to verify if you have the problem or not.
The following command will attempt to download an 8MiB file (all NULs) from a website in AWS. It should work the same on Linux, macOS, and modern Windows computers. For me, it fails with the error "curl: (18) transfer closed with XXXXXX bytes remaining to read", which is the problem.

curl -H "Connection: close" https://electricworry.net/test-8 -o test-curl

If you're not comfortable connecting to my server, the following third-party download test should produce the same result (it does for me!):

curl -H "Connection: close" https://files.testfile.org/ZIPC/15MB-Corrupt-Testfile.Org.zip -o test-curl
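
For anyone who would rather script the check than read curl's output by eye, here is a minimal Python sketch of the same test. It is not the original poster's tooling: the names `fetch_size` and `verdict` are made up for illustration, and it relies only on the standard library's `http.client`. It makes one GET request with the chosen Connection header and compares the bytes actually received with the advertised Content-Length.

```python
import http.client
import urllib.parse


def fetch_size(url, connection="close", chunk=65536):
    """Return (expected_bytes, received_bytes) for one HTTPS GET request.

    `connection` is sent as the Connection request header ("close" or
    "keep-alive"). expected_bytes is -1 if no Content-Length was sent.
    """
    parts = urllib.parse.urlsplit(url)
    conn = http.client.HTTPSConnection(parts.netloc, timeout=30)
    received = 0
    try:
        conn.request("GET", parts.path or "/", headers={"Connection": connection})
        resp = conn.getresponse()
        expected = int(resp.getheader("Content-Length", "-1"))
        while True:
            try:
                data = resp.read(chunk)
            except (http.client.IncompleteRead, OSError):
                # A reset or early close lands here; keep the count so far.
                break
            if not data:
                break
            received += len(data)
    finally:
        conn.close()
    return expected, received


def verdict(expected, received):
    """Turn the two byte counts into a short human-readable result."""
    if expected < 0:
        return "unknown (no Content-Length)"
    if received == expected:
        return "complete"
    return f"truncated: {expected - received} bytes missing"
```

Calling `verdict(*fetch_size("https://electricworry.net/test-8", "close"))` and then the same with `"keep-alive"` should mirror the two curl runs, assuming the test file is still being served.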

When I tested, I collected a packet capture at both sides and I can see that my server sends the whole 8MiB file in the TLS session and then terminates the connection with a RST packet at the end (which it does because we sent a "Connection: close" header). However on my client side, only half of the file comes through before the session is impolitely terminated.

Would people on Three 5G broadband mind testing please to help confirm/deny whether this is a general problem or an individual one?

I've done a lot of testing over the past month and I've got a hypothesis.

  • Comparing the server and client packet captures, the packets do not match up; the sequence and ack numbers - though they start the same - end up being very different. It appears that something in the middle is buffering the stream and ACKing the packets on my behalf.
  • The problem only happens when I'm on my Three 5G Broadband service. If I take my laptop into work, the problem is gone. The problem doesn't occur when I use my Giffgaff mobile as a hotspot either.
  • The problem exists on all websites (I suffer *a lot* from Ubuntu APT packages being half-downloaded and rejected on my workstation).
  • Since the clocks on my server and client are synchronised as closely as possible with NTP, I can compare progress of the stream at both sides. When my server has finished transmitting (and received the final ACK) it correctly sends a RST packet according to the standard. However, at that same moment the client has not received the whole stream (we're about half-way) and I certainly haven't sent an ACK for all of it. Then a RST comes in, tearing down the session before it's finished and truncating the download.
  • The problem only happens if "Connection: close" header is used. If "Connection: keep-alive" is used, then it's the responsibility of the client to terminate the connection once it's done. In this case, no problem! However, a lot of things don't use that. A web browser generally uses keep-alive for efficiency - hence 99% of users won't encounter or know about the problem - but a lot of systems (e.g. APT, Ansible) will use "close", which is why it's such a problem for me in my work.
  • Changing APN and PDP type in the router has zero impact; it doesn't matter whether I'm using IPv4, IPv6, IPv4v6, APN "3internet", "3secure", or "three.co.uk". The problem for me is general.

Ultimately, my hypothesis is that Three have some sort of connection buffering to optimise the user experience or maybe to prevent wasted re-transmissions, but there's a glaring bug in it whereby it resets the connection and discards the buffer it holds for the session once the server has closed the connection. This would make sense for an ISP based solely on a Radio Access Network: if clients sit in grey spots where the connection can drop momentarily much of the time, it is helpful to buffer the lost packets for the clients rather than have the server spamming their link with retries of the unACKd packets (and further polluting the radio waves). So I think Three ACKing the packets on my behalf is by design. Only the implementation is bad, in that it mistakenly assumes it can throw away the buffer when the server terminates the connection.
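
To make the hypothesis concrete, here is a toy simulation of it in Python. It is only a sketch under stated assumptions, not a model of Three's actual network: the function `simulate`, the tick-based timing, and the two rates are all invented for illustration. The rates are chosen so the server side runs twice as fast as the radio side, which reproduces the "about half" seen in the tests.

```python
from collections import deque


def simulate(total_segments, server_rate, radio_rate, drain_before_close):
    """Return how many segments reach the client.

    Each tick the server hands up to `server_rate` segments to a middlebox,
    which ACKs them immediately and queues them. The middlebox forwards up
    to `radio_rate` segments per tick to the client. When the server has
    sent everything (and seen every ACK) it closes the connection.
    """
    buffer = deque()
    sent = 0
    delivered = 0
    while sent < total_segments:
        burst = min(server_rate, total_segments - sent)
        buffer.extend(range(sent, sent + burst))  # ACKed on the client's behalf
        sent += burst
        for _ in range(min(radio_rate, len(buffer))):
            buffer.popleft()
            delivered += 1
    if drain_before_close:
        # Expected behaviour: flush what is queued, then pass the close on.
        delivered += len(buffer)
        buffer.clear()
    else:
        # Hypothesised bug: the queue is thrown away and the client gets a RST.
        buffer.clear()
    return delivered


print(simulate(8192, 2, 1, drain_before_close=False))  # 4096
print(simulate(8192, 2, 1, drain_before_close=True))   # 8192
```

In this model the keep-alive case corresponds to `drain_before_close=True`: the server never closes first, so the middlebox has time to empty its queue before the client ends the session.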

Any help/testing/solidarity would be much appreciated because Three technical support have been zero help since I raised it with them over a month ago. I sent over detailed evidence, but all they can muster is a call occasionally to incorrectly restate the problem and ask if I'm still having it. Really awful experience; I've never seen a team so completely unable to escalate to responsible people who might actually be able to help eventually.

35 REPLIES
Gardinerr
Fledgling

I tried both commands but they failed (after about a minute) with a SSL connect error:

LibreSSL SSL_connect: SSL_ERROR_SYSCALL in connection to electricworry.net:443 

I thought this might be something to do with the way the Mac resolves hostnames under IPv6. Consequently, I switched to an IPv4-only connection but got the same error.

Sorry, definitely not a problem at your end. The server fell over (which is what always happens when I run Apache on a low-RAM server, due to massive probing from botnets).

It's running again. You've got from now until next time it falls over (could be hours or days); after that I'm switching back to Nginx.

jr0
Rising star

Over broadband

[Screenshot: jr0_0-1741963977445.png]

Over Three mobile phone hotspot

[Screenshot: jr0_1-1741964060204.png]

 

electricworry
Active

It's really interesting that the problem doesn't appear with a mobile phone hotspot but does with the ZTE and (if Gardinerr is right) the Zyxel routers. I've still not tried putting the SIM card in my mobile phone but I might get a chance over the weekend.

PeteG
Community Support Team

If you get a chance, please report your findings; it's all information I can add to the report.

Thanks.
Pete. 



Mod tip! The author of a post can hit 'Accept as Solution', to highlight a reply that helped solved their query.


electricworry
Active

Ok, I put the 3 SIM into my mobile phone and I connected my computer to the phone's hotspot. I also ensured that my phone's WiFi (client) was turned off so that there's no doubt my phone is routing via the mobile data and not some other access point.

The test produced the same results:

electricworry@BOB1:~/projects/download-test$ curl -H "Connection: close" https://electricworry.net/test-8 -o test-curl
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
 50 8192k   50 4140k    0     0  1991k      0  0:00:04  0:00:02  0:00:02 1991k
curl: (18) transfer closed with 4148646 bytes remaining to read

electricworry@BOB1:~/projects/download-test$ curl -H "Connection: keep-alive" https://electricworry.net/test-8 -o test-curl
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 8192k  100 8192k    0     0  3402k      0  0:00:02  0:00:02 --:--:-- 3403k

So that rules out the device as the cause. There has to be some kind of connection optimisation system on the network with an implementation flaw. That or I'm being monitored by the authorities and there's a bug in their platform. 😁
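
As a quick sanity check on the numbers in the failed run above, the byte count in curl's error message can be turned into a fraction of the file. This is a small illustrative sketch; the helper name `received_from_curl_error` is made up, and the only inputs are the error line quoted in this post and the 8MiB file size.

```python
import re


def received_from_curl_error(message, total_bytes):
    """Return (bytes_received, fraction_received) from a curl error-18 line."""
    match = re.search(r"transfer closed with (\d+) bytes remaining", message)
    if match is None:
        raise ValueError("not a curl error-18 message")
    remaining = int(match.group(1))
    received = total_bytes - remaining
    return received, received / total_bytes


TOTAL = 8 * 1024 * 1024  # the 8MiB test file, i.e. 8388608 bytes
msg = "curl: (18) transfer closed with 4148646 bytes remaining to read"
received, fraction = received_from_curl_error(msg, TOTAL)
print(received, f"{fraction:.1%}")  # 4239962 50.5%
```

That matches the 4140k that curl's progress meter reports as received, so the download stopped at almost exactly the half-way point.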

PeteG
Community Support Team

Thanks for doing that. 

Pete.



PeteG
Community Support Team

Hello. 

Wow, that's really interesting. Your testing should be useful. I don't know the answer and don't have a way of finding out directly, so I'll need to speak to another team internally to find out more. Hopefully I can find some useful info to come back to you with.

Pete.



TheEnglishman
Regular

When I called the support people they went through some tests and said there wasn’t a problem.

Can you at least acknowledge there is an issue and that you’re investigating? It’s not hard to test and confirm/deny.

Thanks

Yes, I agree. @PeteG, please see my responses sent today. @TheEnglishman has exactly the same issue as me. I really hope that you can get this issue in front of someone in the network engineering team to investigate. No offence to the first-line support, but I don't think that they are equipped to investigate; they are working from a script of checking the signal quality and the computer configuration. I believe I've provided enough evidence to reproduce the problem reliably and to rule out all the usual suspects (such as signal quality or computer misconfiguration).

Due to the technical nature of the problem I don't expect you to get a solution in 5 minutes, but I really want a conclusion one way or the other in the long term because this makes the service completely unusable for certain use cases: for me, downloading Linux packages or running configuration management tools such as Ansible, and for @TheEnglishman, downloading WordPress packages.

At the very least I'd like you to confirm there's an issue and set an expectation, whether that's "won't fix" or "working on a fix".