Bandwidth management using traffic shaping with Meraki MX Security Appliances

At Christian Aid, our field offices quite often have limited bandwidth, particularly at sites in Africa.  In some cases we are using expensive VSAT links for these sites, so I am very interested in traffic shaping to ensure we get the best value and don’t spend too much to make our applications work for the local teams.

I haven’t found there to be a lot of material available on traffic shaping strategies.  It is easy enough to find instructions on how to set up traffic shaping, but little out there that explains the theory.  And the theory would assist in working out what settings one wants to apply.  So I’ve ended up taking the following approach based on my own made up theory. Comments would be appreciated!

Set your uplink bandwidth as accurately as possible

For most offices we have a contracted bandwidth level.  On the Traffic shaping page for a Meraki MX network I can set this using a slider, or by clicking through to details where I can enter more specific values (necessary where the up and down bandwidth are different, which is often the case).  Here is an image of the UI for this:

Traffic Shaping Uplinks

Note that there are WAN 2 and Cellular options – the Meraki equipment allows for a second failover/load balancing uplink, as well as a 3/4G uplink using a USB dongle.

This is important for two reasons.  Firstly, by setting the available bandwidth here you are ensuring that any queuing that results from bandwidth demand exceeding supply occurs at the MX appliance where you can manage it, rather than at the ISP equipment where you probably can’t.

Secondly, the settings here allow traffic shaping rules to be based on proportions of your available bandwidth.  If you leave these at 100 Mbps (the default) then the priority settings for traffic shaping rules will not work effectively!

Note – I am not sure what to do in the case of a contended connection where you know both the maximum burst rate and the committed information rate (CIR) you may see most of the time.  Please comment if you know about this!

Apply device bandwidth limits if you have limited bandwidth

You want to ensure that you prevent a single user (or device) from hogging all the bandwidth.

Traffic Shaping Global Limits

Using the Global bandwidth limits you can specify the maximum bandwidth available to a single client.  There is also a feature called SpeedBurst which allows a client to burst above this at the start of a single download for 10 seconds.  This reduces the amount that the bandwidth limitation will be noticed by users.  I am not clear whether this has been effective, but it certainly hasn’t hurt.

I’m not sure of the best way to determine the per-client limit.  It’s a balance: you want to prevent a single client from hogging all the bandwidth, but if only one client is using the connection you don’t necessarily want to stop them using it.  Currently the SpeedBurst setting is the only tool that allows any sort of compromise.

If anyone has any ideas of how to determine a per-client limit, please leave a comment!

Note that you should set the Per-client limit here lower than the overall bandwidth – otherwise it will not achieve anything.
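As a rough illustration of that last point (not a recommendation for any particular value), the arithmetic below shows how many clients running flat out at the cap it takes to fill the uplink. The function name and the 1000 Kb/s and 250 Kb/s figures are made up for the example.

```python
import math


def clients_to_saturate(uplink_kbps: float, per_client_kbps: float) -> int:
    """Smallest number of clients, each running at the per-client cap,
    whose combined demand fills the uplink."""
    if uplink_kbps <= 0 or per_client_kbps <= 0:
        raise ValueError("bandwidth values must be positive")
    return math.ceil(uplink_kbps / per_client_kbps)


# Hypothetical figures: 1000 Kb/s downlink, 250 Kb/s per-client cap.
print(clients_to_saturate(1000, 250))   # 4

# A cap equal to (or above) the uplink lets one client fill the link,
# which is why such a cap achieves nothing.
print(clients_to_saturate(1000, 1000))  # 1
```

The lower the cap, the more clients it takes to saturate the link, but the slower a lone user's downloads will be once any burst allowance has run out.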

Traffic shaping rules

Now those basic global settings are out of the way, you can start shaping the different kinds of traffic passing through the device.

The first thing you need to do is think about the different kinds of traffic – I find it helpful to make five categories and identify the kinds of traffic in each:

  • Realtime traffic
  • Background traffic
  • Things you want to limit but not block
  • Things you want to block
  • Everything else

The first three are the ones you want to create traffic shaping rules for.  Blocking is achieved through the firewall (and possibly through another filtering solution such as OpenDNS about which I may write soon).  Everything else you just leave alone.

Slicing up the pie

Consider the uplink bandwidth you set as 100%.  One element of traffic shaping is limiting different kinds of traffic to a portion of this hundred percent.  Decide the proportions for each category you want to limit.  I do it as follows:

  • Background traffic – 30% of bandwidth
  • Other things I want to limit but not block – 20%

This leaves 50% of my bandwidth free for everything else, even when the background and discouraged traffic is running at full whack.  Here is a schematic that explains better:

Meraki Traffic Shaping Share

Calculate the actual bandwidth limits using the percentages you decide and the uplink bandwidth you set.
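That calculation can be sketched in a few lines of Python. The uplink figures here (1000 Kb/s down, 500 Kb/s up) are hypothetical, and the function and category names are chosen only for this example; substitute whatever you entered as your uplink bandwidth.

```python
def share_limits(uplink_down_kbps, uplink_up_kbps, shares):
    """Turn percentage shares of the uplink into (down, up) limits in Kb/s."""
    limits = {}
    for name, percent in shares.items():
        down = round(uplink_down_kbps * percent / 100)
        up = round(uplink_up_kbps * percent / 100)
        limits[name] = (down, up)
    return limits


# Percentages from the list above; the uplink speeds are assumed values.
SHARES = {"background": 30, "discouraged": 20}
limits = share_limits(1000, 500, SHARES)

print(limits)
# {'background': (300, 150), 'discouraged': (200, 100)}

# Share left for realtime traffic and everything else when both are maxed out.
print(100 - sum(SHARES.values()))  # 50
```

The resulting down and up figures are what go into the custom bandwidth limit of each traffic shaping rule.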

Realtime traffic

This is traffic that is critical to your operations and where any delay will be immediately noticeable – for example voice traffic, live video (video conferences rather than streaming concert footage) and remote desktop solutions such as Citrix or RDP.

In my opinion this is the only kind of traffic you want to give high priority to.  Don’t be tempted to put email traffic or things that non IT folk might consider high priority in here.

Here is my rule:

Traffic Shaping Real Time Rule

In the definition box I have used a combination of custom rules and Meraki’s predefined categories.

In my organisation I want to prioritise:

  • voice traffic – that means Skype and Lync (now Skype for Business).  Skype is included in the predefined VoIP & Video Conferencing rule, but Lync isn’t yet, so using the available documentation about Lync traffic I specified the ports and hosts used in Lync communication.  Update: Meraki support confirmed Skype for Business is included in the Skype rule.
  • video conference traffic – again Skype covers some of this, but I’ve also added port and host values for VSee, which isn’t yet in Meraki’s predefined rules
  • remote desktop traffic – Citrix and RDP – again these aren’t in the predefined rules, so I’ve specified our internal hostnames – citrixfarm and the FQDN citrixfarm.caid.local.  I’ve also specified port 3389, which is used by RDP traffic.

I’ve allowed this traffic to ignore the per-client limit – if people doing a video conference need it then let them have it even if it gets in the way of other traffic.

I’ve also specified the Priority as high. Initially I understood this as a QoS setting – if a queue formed at the device, then let this traffic through first.  After reading Meraki’s documentation on prioritisation more deeply I now think this works a bit differently, and it may be better to just use this setting rather than the custom bandwidth limits I’ve specified below.

Lastly there is DSCP tagging – here I am tagging packets matching the rule with DSCP value 7.  I think this may result in upstream devices that do traffic shaping recognising the traffic as needing some kind of priority, but I am unsure how effective this is.  Can’t hurt, right?
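For anyone checking the tags in a packet capture: the DSCP field is the upper six bits of what used to be the Type of Service byte in the IPv4 header (the Traffic Class byte in IPv6), so a tool that shows the whole byte will show the DSCP value shifted left by two bits. A small sketch of that conversion follows. The function names are made up for this example, and it assumes the number shown in the dashboard is the literal DSCP code point.

```python
def dscp_to_tos(dscp: int) -> int:
    """Return the TOS / Traffic Class byte for a DSCP value,
    leaving the two low-order ECN bits at zero."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP is a 6-bit field (0-63)")
    return dscp << 2


def tos_to_dscp(tos: int) -> int:
    """Extract the DSCP value from a TOS / Traffic Class byte."""
    if not 0 <= tos <= 255:
        raise ValueError("TOS is a single byte (0-255)")
    return tos >> 2


# The DSCP values used by the three rules in this post.
for dscp in (7, 3, 0):
    print(dscp, hex(dscp_to_tos(dscp)))
# 7 0x1c
# 3 0xc
# 0 0x0
```

Whether any upstream device acts on these values depends entirely on how that device is configured.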

Background traffic

TrafficShaping-BackgroundRule

In my analysis of our use of bandwidth in each of our field offices I found that the number one application that was likely to hog bandwidth was email – mostly internal organisational email.  Not torrenting, video or anything else.  Our staff just send and receive a lot of email with large attachments.  The worst cases happen when someone needs to resync their entire mailbox with Outlook.

Staff consider email to be the highest priority, but in most cases they won’t notice if syncing is slow.  Since uncontrolled email regularly hogs the connections, this should be limited.  In our case we use Office 365 email which is matched under the Windows Live Hotmail and Outlook rule (perhaps Meraki will update this label sometime soon?).

I’ve also put software updates in here – both from public update sources such as Windows Update, and from our internal WSUS server lps253gbr.  When Microsoft release a lot of patches this often ties up limited bandwidth connections for many hours.  We still want the updates, but not at the expense of other things working.  I’m hoping that Microsoft’s use of peer to peer technology in Windows 10 and Windows Update for Business will help us here.

Microsoft Skydrive (now called OneDrive) is also in here.  We are planning on using OneDrive for Business to allow syncing of content – but any syncing tool is potentially a bandwidth hog, so I am trying to manage it.

I’ve specified that traffic matching this rule should be limited to 30% of the uplink – approximately 333 Kb/s down and 150 Kb/s up.  As mentioned previously the Priority setting may be a better way of applying the limits, but I’m not confident enough I understand why yet, so I am still manually setting the bandwidth as well.

Finally I’m tagging this with DSCP value 3.  Again, I’m not quite sure I understand which setting to use and may change this later.

Discouraged traffic

Traffic Shaping - Discouraged Rule

The third and last rule is for traffic that I would like to discourage but don’t want to block altogether.

Apple.com is listed here in order to control automatic updating of iOS clients such as iPhones – these updates are often huge, and while I don’t want out of date clients on our networks, I really don’t want their updating to prevent day to day work by other people.

The other items are generally related to non-work related traffic.  It is debatable whether these should really be limited as much as this.  For the moment I’m throttling them in most locations.

The bandwidth is limited to 20% of the uplink – 200 Kb/s down and 100 Kb/s up.  The priority is set as Low and the DSCP tag is 0.

Blocking traffic

Traffic Shaping - Blocking Rules

There isn’t a lot of traffic that I think we should be blocking – it is really things that potentially jeopardise our operations or reputation – namely bittorrenting copyrighted material, which could result in our ISPs disconnecting us, or getting threatening messages from litigating bodies or worse.  This blocking isn’t done on the Traffic Shaping page, but instead the Firewall page.

I’ve only selected Peer to Peer and Web file sharing.  Peer to Peer blocks BitTorrent and similar traffic, whereas Web file sharing blocks the sites where you find torrent files – e.g. The Pirate Bay, KickassTorrents etc.

How effective is this?

I had the opportunity to confirm that these rules could improve the experience of using a saturated connection when our office in Delhi started to complain about slow speeds and disconnections.

Before applying traffic shaping rules based on the above the connection looked like this:

TrafficShaping-CongestedLink

The “plateau” shape of this graph is classic saturation – more bandwidth is being demanded of the link than is available.  Looking at the specific traffic that was happening at the time, I concluded that no-one was abusing the link – it was just the combination of so many staff doing work related things demanding more bandwidth than was available.
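If you have usage figures as numbers rather than a graph, a crude way to spot the same plateau is to count how many samples sit at or near the configured uplink limit. This is only a sketch: the function name, the 95% threshold and the sample values are all invented for illustration and do not come from the Meraki dashboard.

```python
def saturated_fraction(samples_kbps, uplink_kbps, threshold=0.95):
    """Fraction of usage samples at or above `threshold` of the uplink."""
    if not samples_kbps:
        return 0.0
    ceiling = uplink_kbps * threshold
    hits = sum(1 for sample in samples_kbps if sample >= ceiling)
    return hits / len(samples_kbps)


# Made-up samples for a 1000 Kb/s link: a long flat top near the limit.
samples = [400, 980, 990, 1000, 995, 985, 990, 600]
print(saturated_fraction(samples, 1000))  # 0.75
```

A high fraction sustained over working hours suggests the link is saturated rather than merely busy in short bursts.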

After applying traffic shaping rules the link looked far healthier:

Traffic Shaping - After Rules

Staff reported that the connection had improved.  I feel this validated the principles of my rules.  In other locations I think that applying these rules has reduced our need to pay for increased bandwidth – and that was one of the main selling features of adopting Meraki MX Security Appliances in the first place.

However, I still have questions and am not sure I am applying the best possible rules for what I want to achieve.  Should I use only the Priority settings, or only the custom bandwidth limits?  What are appropriate DSCP tags to apply, and are they at all effective?

If you have any thoughts on this, please leave a comment.

Meraki Webinar, live usage view and Nepal MX64W online!

Meraki Webinar Screenshot

I made a webinar with Meraki about Christian Aid’s use of their technology.  I cover why we decided to replace our Cisco ASA VPN infrastructure with Meraki MX Security Appliances, and what benefits we experienced – both expected and unexpected.

You’ll need to register to watch.  Once you have registered, why not attend a live webinar and get a free Meraki AP with a 3-year license?

Nepal First Two Days

Our Regional ICT Service Manager, Sanjay, has just enabled our first MX64W Security Appliance and WiFi in Kathmandu, Nepal.  Here is a screenshot of the first two days’ use.

I’ve also enabled a live public view of traffic on our WiFi access points in Freetown.

Hoping to see Kindu, Democratic Republic of Congo on Meraki next week, and Managua, Nicaragua shortly after.

Christian Aid Meraki Network in Numbers

Here are some details about our global Meraki network:

2 models of office (soon to be 3)

We have a variety of different sized offices and, since we use donated money, a responsibility to spend it wisely.  So we opted to go with two models.  One for larger offices, and one for smaller offices.

Large office

The large offices have between 10 and 50 staff, and are often spread over quite a large area, and sometimes multiple floors.  For these offices we chose the MX80 Security Appliance, with one or two MR16 Access Points.

Small office

A small office is anything with more than 1 and fewer than 10 staff.  For these offices we opted for an MX60W Security Appliance, which combines the functionality of a Security Appliance with a WiFi radio.

Emergency/micro office

I’ve recently been looking with interest at the Z1 teleworker gateway, which has all the functionality of the MX60W, but at a lower price, and more compact.  I’ll be shipping off our first Z1 to our office in Tacloban, Philippines later this month.  Given the cost and compact nature, I hope we can keep some of these in reserve for emergency response – they will make setting up a secure working environment in situations like Nepal much faster and easier.

27 offices in 26 developing countries

Christian Aid Meraki Networks

Some networks are combined in the above map – there are three networks in DR Congo for example. Offices in Kathmandu, Nepal, Managua, Nicaragua and Tacloban, Philippines are yet to come online.

7 offices in UK

UK Meraki offices Aug 3 2015

This will increase to cover a total of 18 offices, including our offices in Dublin, Ireland and Madrid, Spain.

1.06 TB transferred by over 4000 clients in the last week

Clearly we have a lot of people visiting our offices.  We don’t have nearly that many staff!

72 devices deployed

  • 1 MR12 Access Point (our first device, given free by Meraki to get us hooked – it worked)
  • 18 MR16 Access Points
  • 16 MR18 Access Points
  • 16 MX60W Security Appliances with WiFi
  • 13 MX80 Security Appliances
  • 1 MX100 Security Appliance (at HQ as the VPN hub)
  • 1 MX64W Security Appliance with WiFi
  • 2 MX64 Security Appliances (no WiFi)
  • 1 Z1 Teleworker Gateway – waiting to be shipped to Tacloban, Philippines.

My Meraki Journey

One of the things I’ve enjoyed most in the last couple of years in my job has been rolling out and using Cisco Meraki networking equipment across the 29 offices Christian Aid has in the developing world.

The whole experience of learning about the devices and service, designing our implementation, rolling out, and then supporting and using Meraki daily has been fun, which hasn’t been something I’ve found myself saying about other projects I’ve worked on such as SharePoint or Cisco ASAs. Meraki just hasn’t been as challenging, but in a good way – I’ve managed to achieve so much without increasing blood pressure or tearing any hair out on the way. I have had to put my technical knowledge and skills to use for sure, but in a more graceful way that has just been more…pleasant.

In upcoming posts I want to share what I have done with Meraki, because I think that anyone who wants to achieve the same things with their network can do it most easily with Meraki and should consider it.

In posts over the next couple of weeks I’ll be covering the following things:

  • Global VPN
  • Standardised WiFi networks
  • Mobile Device Management
  • Bandwidth Management
  • Sharing networks and internet connections with other organisations and the public

I hope you are interested in coming along with me for some of the journey.

Running Windows Remote Server Admin Tools with a different account

Using a separate admin account is common in the Unix world. At Christian Aid we adopted separate admin accounts for staff in the ICT Services teams to give increased security.

One annoying thing about this is that Windows tools based in MMC don’t easily run as a different user AND with elevated permissions (confusingly referred to as Run as Administrator in the UI). We had been working around this by remoting to a server and then running the tools from there while logged in with an admin account.

That’s a bit of a pain though, right? It would be much better to just run the tools locally as the admin user.  It can be done by editing the shortcut to each item in Administrative Tools like this:

runas.exe /user:DOMAIN\adminuser "cmd /c Start /B app.msc"

Obviously adjust DOMAIN\adminuser as appropriate.

Putting the whole “normal” run command behind a cmd is necessary for some applications that require additional flags, and works for those that don’t too.

Here is a list of commands that work on my copy of Windows 7:

  • Administrative Center: runas.exe /user:DOMAIN\adminuser "cmd /c Start /B dsac.exe"
  • Domains and Trusts: runas.exe /user:DOMAIN\adminuser "cmd /c Start /B %SystemRoot%\system32\domain.msc"
  • Sites and Services: runas.exe /user:DOMAIN\adminuser "cmd /c Start /B dssite.msc"
  • Users and Computers: runas.exe /user:DOMAIN\adminuser "cmd /c Start /B dsa.msc"
  • DNS: runas.exe /user:DOMAIN\adminuser "cmd /c Start /B dnsmgmt.msc /s"
  • Group Policy Management: runas.exe /user:DOMAIN\adminuser "cmd /c Start /B gpmc.msc"

Use this approach for any application that needs to both run as a different user (and always the same user) and run with elevated privileges.
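If you have a lot of shortcuts to edit, the target strings can be generated rather than typed by hand. The sketch below only builds the text to paste into each shortcut. The function name is invented for this example, and DOMAIN\adminuser is the same placeholder used above.

```python
def runas_target(account: str, command: str) -> str:
    """Build a shortcut target that runs `command` as `account` via cmd."""
    return f'runas.exe /user:{account} "cmd /c Start /B {command}"'


# Placeholder account; replace with your own admin account.
ACCOUNT = r"DOMAIN\adminuser"

# A few of the tools listed above, with any extra flags they need.
TOOLS = {
    "Users and Computers": "dsa.msc",
    "Sites and Services": "dssite.msc",
    "DNS": "dnsmgmt.msc /s",
}

for name, command in TOOLS.items():
    print(f"{name}: {runas_target(ACCOUNT, command)}")
```

Each printed line has the same form as the entries in the list above, so adding another tool is just a matter of adding its command to the dictionary.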

/savecred security hole

Anyone using this can add a /savecred flag to the runas command, which allows storage of credentials.  The first time you use a shortcut like this, you’ll get asked for the user’s password in a command window.  The /savecred flag means the credentials will get stored in Windows Credential Manager, and you won’t need to enter them every time.  That’s convenient, but it does mean that if the computer and Windows account are compromised, an attacker is a click away from your admin interfaces!