MikroTik local link up / down on high traffic

One of our RouterBOARD routers, a MikroTik RB750 (mipsbe with an Atheros 7240 switch chip) running v6.35.2 (the latest version), has been crashing constantly under high network load. One particularly interesting detail is that it crashed on traffic that goes through the internal switch, but not on other traffic.

The log file shows the following pattern:
may/13 21:23:17 interface,info ether1-gateway link up (speed 100M, full duplex)
may/13 21:23:17 interface,info ether2-master-local link up (speed 100M, full duplex)
may/13 21:23:17 interface,info ether4-slave-local link up (speed 100M, full duplex)
may/13 21:24:21 system,info sntp change time May/13/2016 21:23:38 => May/13/2016 21:24:21
may/13 21:26:22 interface,info ether2-master-local link down
may/13 21:26:22 interface,info ether4-slave-local link down
may/13 21:26:24 interface,info ether2-master-local link up (speed 100M, full duplex)
may/13 21:26:24 interface,info ether4-slave-local link up (speed 100M, full duplex)
may/13 21:29:16 interface,info ether2-master-local link up
may/13 21:29:16 interface,info ether4-slave-local link down
may/13 21:29:17 interface,info ether2-master-local link down
may/13 21:29:18 interface,info ether2-master-local link up (speed 100M, full duplex)
may/13 21:29:18 interface,info ether4-slave-local link up (speed 100M, full duplex)
may/13 21:36:01 interface,info ether4-slave-local link down
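
To see how often each port drops, a log like the one above can be summarized with a few lines of Python. This is only a minimal sketch: the function names `count_link_downs` and `flapping_interfaces` and the threshold are made up for illustration, and the regular expression assumes the exact line layout shown in the excerpt.

```python
import re
from collections import Counter

# Matches lines such as:
#   may/13 21:26:22 interface,info ether2-master-local link down
# The layout is assumed from the log excerpt; other RouterOS topics are ignored.
LINK_EVENT = re.compile(
    r"^\S+ \d{2}:\d{2}:\d{2} interface,info (?P<iface>\S+) link (?P<state>up|down)\b"
)


def count_link_downs(log_text):
    """Return a Counter of 'link down' events per interface name."""
    downs = Counter()
    for line in log_text.splitlines():
        match = LINK_EVENT.match(line.strip())
        if match and match.group("state") == "down":
            downs[match.group("iface")] += 1
    return downs


def flapping_interfaces(log_text, threshold=2):
    """Return the sorted names of interfaces with at least `threshold` link-down events."""
    downs = count_link_downs(log_text)
    return sorted(name for name, count in downs.items() if count >= threshold)


# A few lines copied from the log above.
SAMPLE_LOG = """\
may/13 21:24:21 system,info sntp change time May/13/2016 21:23:38 => May/13/2016 21:24:21
may/13 21:26:22 interface,info ether2-master-local link down
may/13 21:26:22 interface,info ether4-slave-local link down
may/13 21:26:24 interface,info ether2-master-local link up (speed 100M, full duplex)
may/13 21:26:24 interface,info ether4-slave-local link up (speed 100M, full duplex)
may/13 21:29:16 interface,info ether2-master-local link up
may/13 21:29:17 interface,info ether4-slave-local link down
may/13 21:29:17 interface,info ether2-master-local link down
"""

if __name__ == "__main__":
    for name, count in sorted(count_link_downs(SAMPLE_LOG).items()):
        print(name, count)
```

In this sample only the switched ports (ether2 and ether4) show up, which matches the observation that the gateway port stays up while the switch ports flap.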

A quick search on the internet shows that we are not alone. Port flapping is widespread in the MikroTik world. There are many reports of similar problems dating back to 2011, but no solution.

My guess is that the switch chip is broken, dead, malfunctioning, or buggy, or that the switch “part” of the MikroTik router is sensitive to voltage or current changes.

Either way, we solved this by disabling the switch and putting each port on a different subnet (bridging may also work). Now all traffic goes through the CPU, and even though MikroTik advertises that the switch runs at wire speed, we noticed that traffic through the CPU performs even better.

Mystic TCP/IP packet loss and MTU 1504 in Windows 7

We recently observed some strange things in our local area network. Some Internet resources, mostly from Microsoft, failed to open in any web browser (IE, Firefox, Chrome, Opera) or even via Telnet to port 80. As it happened on all computers, we thought this was an ISP issue and wrote to their support. The ISP shortly responded that everything worked on their side.

Some random/example resources that didn’t work:

We started to dig deeper, and after some troubleshooting and debugging found that Microsoft resources did not open because the CDN hosts they rely on did not send any data. The strange thing was that the TCP connection was established, but no data came our way (later I learned that a partial first packet did come through, but it was not visible in the Telnet console).

Some of the problematic CDNs:

  • media.ch9.ms
  • ajax.aspnetcdn.com
  • static.ch9.ms

It is important to note that at this point we thought this was a DNS issue, because everything worked well through our second ISP. We put the IP addresses obtained via the second ISP into the Windows hosts file, and the sites seemed to start working.

Days passed… conversations with our ISP… most of the Internet works for us, including Gmail, news, etc… Microsoft websites still do not work…

The ISP sends a technician to check the issue on site. He comes with a laptop, and to my big surprise, everything works flawlessly on his PC.

We start to debug: comparing IP addresses and DNS, changing IPs, changing DNS, using Google DNS 8.8.8.8, switching cables… and nothing works on our Windows 7 machines, while everything still works on his laptop.

Out of curiosity I start a virtual machine with Windows 7, open IE and… Microsoft sites open in the virtual machine, which runs on the same physical PC.

OK. Now I try disabling Windows Firewall, antivirus, etc…

Now I clearly see that the problem is related to our Windows PCs, not the ISP, so I start to think of all the dark scenarios: rootkit, virus, broken hardware driver, broken hardware on all our PCs simultaneously… still no progress…

I start Wireshark. The connection to the CDN is established… but no data arrives except a partial first packet. Looking closer, Wireshark shows multiple [TCP Previous segment lost] and [A segment before this frame was lost] notes. I search for this on Google, and one topic talks about rare TCP segment loss:
TCP PREVIOUS SEGMENT LOST #REALLY RARE CASE#

This must be it, because it looks like Rare Case 🙂

From this point it was straightforward. The article talks about an MTU size mismatch. First, I check the MTU of my network adapter. Unfortunately my NIC does not support changing the MTU via the GUI, so I use netsh (see “How do I change the MTU setting in Windows 7?”).

To view the MTU, use the following command:
netsh interface ipv4 show subinterfaces
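
With many PCs to check, the output of that command can be scanned automatically. The sketch below assumes the usual English-locale table layout (four numeric columns followed by the interface name). The sample text is illustrative rather than captured from a real machine, and the helper names `parse_subinterfaces` and `unexpected_mtus` are made up for this example.

```python
def parse_subinterfaces(output):
    """Parse `netsh interface ipv4 show subinterfaces` text into (name, mtu) pairs.

    Assumes each data row is: MTU, MediaSenseState, Bytes In, Bytes Out, Interface.
    Header and separator rows are skipped because their first columns are not numeric.
    """
    rows = []
    for line in output.splitlines():
        parts = line.split(None, 4)
        if len(parts) == 5 and all(p.isdigit() for p in parts[:4]):
            rows.append((parts[4].strip(), int(parts[0])))
    return rows


def unexpected_mtus(rows, expected=1500):
    """Return interfaces whose MTU differs from `expected`, ignoring the loopback."""
    return [
        (name, mtu)
        for name, mtu in rows
        if mtu != expected and "Loopback" not in name
    ]


# Illustrative output only; the byte counters are invented.
SAMPLE_OUTPUT = """\
   MTU  MediaSenseState   Bytes In  Bytes Out  Interface
------  ---------------  ---------  ---------  -------------
4294967295                1          0      13487  Loopback Pseudo-Interface 1
  1504                1   98765432    1234567  Local Area Connection 2
"""

if __name__ == "__main__":
    for name, mtu in unexpected_mtus(parse_subinterfaces(SAMPLE_OUTPUT)):
        print(f"{name}: MTU {mtu}")
```

On a real machine the text could come from `subprocess.run(["netsh", "interface", "ipv4", "show", "subinterfaces"], capture_output=True, text=True).stdout`. Localized Windows versions may print different headers, but the numeric-row check should still apply.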

For me it had mysteriously changed from 1500 to 1504. I still do not know why or how it was changed.
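
Why would four extra bytes matter? One plausible explanation (my assumption, not something confirmed by the captures) is that the PC advertises a TCP MSS derived from its own MTU, so the remote server sends full-size segments that are 4 bytes too large for a path limited to the standard Ethernet MTU of 1500. The arithmetic, assuming plain 20-byte IPv4 and TCP headers without options:

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
TCP_HEADER = 20   # bytes, assuming no TCP options


def advertised_mss(mtu):
    """Largest TCP payload that fits into one IPv4 packet of size `mtu`."""
    return mtu - IPV4_HEADER - TCP_HEADER


def full_packet_size(mss):
    """Size of an IPv4 packet carrying a full-size TCP segment of `mss` bytes."""
    return mss + IPV4_HEADER + TCP_HEADER


local_mtu = 1504  # the value found on the affected PCs
path_mtu = 1500   # standard Ethernet MTU, assumed for the rest of the path

mss = advertised_mss(local_mtu)
packet = full_packet_size(mss)
oversize_by = packet - path_mtu

if __name__ == "__main__":
    print(f"advertised MSS: {mss}")
    print(f"full-size packet: {packet} bytes, {oversize_by} bytes over the path MTU")
```

Under these assumptions, small responses would still fit while any full-size data packet would be too large for the path. That would be consistent with connections being established and most of the Internet working while large CDN downloads stalled.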

To change it back to 1500 (the default for Ethernet), start an elevated command prompt (cmd.exe run as Administrator) and enter the following, one line at a time:
netsh
interface
ipv4
set subinterface "Local Area Connection" mtu=1500 store=persistent

In my case the interface was named “Local Area Connection 2”.
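
The same change can be issued as a single non-interactive netsh command, which is handy when several PCs need the fix. The following is a sketch only: `build_set_mtu_command` and `apply_mtu` are hypothetical helper names, and actually running the command requires an elevated (Administrator) process on Windows.

```python
import subprocess


def build_set_mtu_command(interface, mtu=1500, persistent=True):
    """Build the argument list for `netsh interface ipv4 set subinterface`."""
    store = "persistent" if persistent else "active"
    return [
        "netsh", "interface", "ipv4", "set", "subinterface",
        interface, f"mtu={mtu}", f"store={store}",
    ]


def apply_mtu(interface, mtu=1500):
    """Run the command; raises CalledProcessError if netsh reports a failure."""
    subprocess.run(build_set_mtu_command(interface, mtu), check=True)


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch is safe to try anywhere.
    print(subprocess.list2cmdline(build_set_mtu_command("Local Area Connection 2")))
```

Passing the interface name as its own list element lets Python add the quotes around names that contain spaces.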

See also:
The default MTU sizes for different network topologies