Cloud managed VPN at Christian Aid using Meraki Security Appliances

When we decided to replace our aging Cisco ASA firewalls, one thing we knew for sure was that we needed a VPN to give staff working in our remote offices access to the internal resources located at our HQ in London.

On the ASAs we used the Easy VPN feature to do this.  This is different from traditional IPSEC VPNs in that you don’t need to know the IP address for each node.  Since several ISPs we use for our global offices only give us dynamic IP addresses, a traditional IPSEC VPN isn’t possible for us.  Easy VPN gets around this by treating each ASA as a VPN client – the remote ASA 5505s initiate the VPN connection to a known hostname or IP address at the HQ.

This worked really well with the Cisco ASAs, so we wanted a VPN that was just as easy to roll out and maintain.  The Meraki security appliances proved to be even easier.

Basic Configuration

The interface for a site to site VPN is very simple with only three options to select for our purpose:

Mode

Meraki VPN Mode

Three settings:

  1. Disabled (i.e. no VPN)
  2. Split tunnel (only traffic to and from VPN connected networks goes over the VPN tunnel)
  3. Full tunnel (all internet traffic from the device is sent over the VPN tunnel)

At Christian Aid we use the Split tunnel option.  We know that the majority of traffic from each site is to and from the Internet (increasingly so as we move our own resources to the cloud), and having that traffic first travel via HQ is not very efficient and would hammer our HQ internet connection.
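To make the difference between the three modes concrete, here is a minimal Python sketch of the routing decision each mode implies.  It is only an illustration: the function name `next_hop` and the `10.0.0.0/8` subnet are assumptions of mine, not anything taken from the Meraki dashboard.

```python
import ipaddress

# Hypothetical subnets reachable over the VPN (illustrative values only).
VPN_SUBNETS = [ipaddress.ip_network("10.0.0.0/8")]


def next_hop(destination: str, mode: str) -> str:
    """Return "vpn" or "internet" for a destination under a given tunnel mode."""
    if mode == "disabled":
        # No VPN at all: everything follows the normal internet route.
        return "internet"
    if mode == "full":
        # Full tunnel: all traffic is sent over the VPN.
        return "vpn"
    if mode == "split":
        # Split tunnel: only traffic for VPN-connected networks uses the tunnel.
        addr = ipaddress.ip_address(destination)
        in_vpn = any(addr in net for net in VPN_SUBNETS)
        return "vpn" if in_vpn else "internet"
    raise ValueError(f"unknown mode: {mode}")


print(next_hop("10.1.2.3", "split"))  # internal resource
print(next_hop("8.8.8.8", "split"))   # general internet traffic
```

With split tunnel, only the first lookup would use the tunnel, which is why this mode keeps ordinary internet traffic away from the HQ connection.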

Topology

Meraki VPN Topology

Even simpler – only two options.  The first option, Connect Directly, creates a mesh VPN – each node connects to every other node.  Traditional IPSEC VPNs can do this, but you have to configure the fixed public IP address of each node on all the other nodes.  The Meraki cloud handles this automatically.  Easy VPN did not allow us this option at all, so we never had to face the challenge of keeping IP addresses updated across over 40 locations.

A benefit of a mesh VPN is that clients at remote sites can communicate directly without passing through HQ.  This lends itself to distributed Active Directory networks – with a mesh VPN, domain controllers at different locations can replicate with each other.  Previously replication had to come via our headquarters.  A mesh VPN promised better replication and redundancy – if our headquarters went down for any reason, our DCs could still replicate.

We also wondered whether peer to peer communication solutions such as Skype would work better, although we haven’t really answered that yet.

There is a hidden third option that appears if you select Connect directly to only one peer that allows a hybrid of the two – if you select hub and spoke mode you can select multiple hubs – a spoke can create VPN tunnels to multiple locations.

You can also configure different topology settings for each node.

This allows you a great level of flexibility.  In our case, sites that have a DC are set to connect directly to all VPN peers.  Sites without a DC are set to only connect to our HQ.  This results in the DC sites being meshed so they can replicate, but non-DC sites aren’t part of the mesh.
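As a rough sketch of what this hybrid looks like, the Python below builds the set of tunnels for a handful of sites and compares it with a full mesh.  The site names, the `tunnels` and `full_mesh_count` helpers, and the assumption that the hub is always part of the mesh are all mine, for illustration only.

```python
from itertools import combinations


def full_mesh_count(n: int) -> int:
    """Number of tunnels when every one of n sites connects to every other."""
    return n * (n - 1) // 2


def tunnels(sites: dict, hub: str) -> set:
    """sites maps site name -> True if the site hosts a DC.

    DC sites (and the hub) are meshed with each other; every other site
    only gets a tunnel to the hub.
    """
    meshed = [name for name, has_dc in sites.items() if has_dc or name == hub]
    links = {frozenset(pair) for pair in combinations(meshed, 2)}
    for name in sites:
        if name not in meshed:
            links.add(frozenset((name, hub)))
    return links


# Hypothetical sites: the hub, two DC sites and two small sites without a DC.
example = {
    "HQ": True,
    "dc-site-1": True,
    "dc-site-2": True,
    "small-1": False,
    "small-2": False,
}

print(len(tunnels(example, "HQ")))   # tunnels in the hybrid layout
print(full_mesh_count(len(example)))  # tunnels if everything were meshed
```

For these five made-up sites the hybrid needs 5 tunnels against 10 for a full mesh, and the gap grows quickly: a full mesh of 40 sites would be 780 tunnels.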

More of this later when I discuss Non-Meraki VPN Peers.

NAT Traversal

Meraki VPN NAT Traversal

This sets up port forwarding through the Meraki device (not through the ISP equipment which may be doing NAT – more on that later).  In our case Automatic has always worked fine.

Advanced configuration options

As mentioned above you can use the Topology option if you have more complex needs.  It is also possible to add non-Meraki VPN peers that can be joined to the mesh.  A future plan we are hoping to implement involves creating Domain Controllers on the Azure cloud, to make our Active Directory infrastructure site redundant.  The thinking is that we can then add these DCs as non-Meraki VPN peers, and they will then be linked to   As things currently stand, it looks like these may be one way only like Easy VPN clients.  At some point in the future we hope it is possible to create virtual Meraki VPN peers on the Azure cloud that can be configured to work exactly the same as a hardware MX device.  That would allow us to use Azure hosted servers to provide cloud based authentication, Active Directory, print driver deployment, RADIUS – my colleagues and I see a lot of potential in this. I guess I should make a wish

Outcome and monitoring

With three basic settings in place across our MX devices, we had a working VPN network up and running so quickly we actually spent some time looking for additional settings before we realised we’d completed the job!  Meraki MX devices easily met or exceeded our requirements in this area.

Here is a map of our 45 sites connected by VPN at the time of writing this article.

Christian Aid Meraki Network Sept 2015

Note that sites physically close together (at this scale) are grouped.

Custom Pie Charts in Meraki

Custom Pie charts is a feature in Meraki that helps you to answer some of the most common questions that are often asked about how internet connections are used – the most common of all being “how much of our bandwidth is being used for non work activities?”

After you have defined what constitutes work traffic, you get a pie chart that breaks down all the usage into categories you specify.  This is displayed on the Clients page in the dashboard – click the right arrow from the default Applications chart to get to it most quickly.

Meraki Custom Pie Chart

Hovering over a slice of the pie will give you the specific details about the category represented.

Clicking the More link under the chart displays a table showing the details about all the categories.

The chart is also shown on individual client pages, showing the breakdown for that particular client.

How to create your custom pie charts

The first thing to note is that charts are created per network.  If you are using a network as a template, its chart will get copied as part of the setup process, but if you adjust the chart later, you’ll need to apply the change to other networks as well.  I haven’t explored the Configuration Templates feature of Meraki yet, but one might suppose it resolves this issue.  Please comment if you know whether this is the case or not.

The next thing you need to do is decide what constitutes work traffic.  If everything is hosted within your LAN or at fixed data centres, then that is pretty straightforward.  If you are using hosted services such as Office 365, then things are slightly more complicated.

A simple example

Meraki’s documentation about how to configure this is a bit hidden in this knowledgebase article.  I hope I can build on that by working through an example.

At the start of this example we’ll assume that work traffic is just on the LAN, and everything else isn’t work.  Obviously that isn’t very realistic but we’ll build on it.

In this example, the LAN uses addresses on a class A network, represented by 10.0.0.0/8.

Everything else is represented by 0.0.0.0/0.

The interface for creating slices is on the General page of a network, under the Traffic Analysis section.

Meraki Define Slices

Above I have selected to use Detailed Traffic Analysis – this keeps a record of all the hostnames and IP addresses requested, and is also useful to leave enabled for your out of the box application analysis.

I also added a slice in order to show the boxes.  In the name column, you provide the Name for the slice.  The signature is based on HTTP hostname (not just HTTP traffic is measured though), the IP subnet or address, ports or a combination of IP addresses and ports.

Let’s apply our work and non work rules to start creating the pie chart.

Meraki Pie Chart Example 1

Note that the rules are applied starting with the first.  Anything that matches a rule will not be matched further down, so put your catch-all rule (0.0.0.0/0) at the bottom.  You can drag your rules up and down using the drag icon.
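The first-match behaviour is easy to picture in code.  This is only a sketch of the idea using Python’s standard `ipaddress` module, not Meraki’s implementation; the `classify` helper and the sample addresses are made up for the example.

```python
import ipaddress

# Slices in evaluation order; the catch-all goes last.
SLICES = [
    ("Work", ipaddress.ip_network("10.0.0.0/8")),
    ("Not Work", ipaddress.ip_network("0.0.0.0/0")),
]


def classify(ip: str, slices=SLICES) -> str:
    """Return the name of the first slice whose network contains ip."""
    addr = ipaddress.ip_address(ip)
    for name, net in slices:
        if addr in net:
            return name
    return "Unmatched"


print(classify("10.20.30.40"))    # LAN address
print(classify("93.184.216.34"))  # anything else

# With the catch-all first, the Work slice can never match.
print(classify("10.20.30.40", list(reversed(SLICES))))
```

The last call shows why the order matters: once 0.0.0.0/0 is at the top, every address lands in Not Work.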

Once the traffic has been analysed, you’d get a simple custom pie chart showing Work (or internal) and Not Work (or external) traffic.

In the real world, there are probably some externally hosted services that constitute work.  Let’s say that staff access email hosted on Office 365 via the web interface, and files are stored in an Office 365 SharePoint instance.

Meraki Custom Pie Chart Example 2

Perhaps you want to monitor the use of a specific system hosted on your LAN.  As long as you can identify traffic to and from it via a hostname or IP address, you can graph it.  In this case, it’s a CRM system accessed via the hostname companycrm.

Meraki Custom Pie Chart Example 3

By placing it above the Work rule, this will be matched first which is necessary to graph it separately.
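Here is a small sketch of why the CRM slice has to sit above the Work slice.  It assumes a rule matches on an exact hostname or on an IP network, which is a simplification of mine; the CRM server address `10.0.5.10` and the helper names are hypothetical.

```python
import ipaddress


def matches(rule: dict, hostname, ip: str) -> bool:
    """A rule matches on any listed hostname or any listed network."""
    if hostname is not None and hostname in rule["hosts"]:
        return True
    addr = ipaddress.ip_address(ip)
    return any(addr in ipaddress.ip_network(net) for net in rule["nets"])


def classify(rules: list, hostname, ip: str) -> str:
    """Return the name of the first rule that matches the flow."""
    for rule in rules:
        if matches(rule, hostname, ip):
            return rule["name"]
    return "Unmatched"


CRM = {"name": "CRM", "hosts": ["companycrm"], "nets": []}
WORK = {"name": "Work", "hosts": [], "nets": ["10.0.0.0/8"]}
OTHER = {"name": "Not Work", "hosts": [], "nets": ["0.0.0.0/0"]}

# CRM above Work: the CRM traffic gets its own slice.
print(classify([CRM, WORK, OTHER], "companycrm", "10.0.5.10"))

# CRM below Work: the LAN rule matches first and swallows it.
print(classify([WORK, CRM, OTHER], "companycrm", "10.0.5.10"))
```

Because the CRM server lives on the LAN, its traffic also matches the Work rule, so the more specific rule has to come first to be counted separately.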

As you can see you can build up your rules to get quite complex reports on bandwidth use.  It’s very flexible so you can use it to slice and dice your bandwidth reports in the way most useful to you.

After you save the changes, it will take a while for Meraki’s servers to crunch the data and produce a graph (and it seems to show partial graphs during the process) so leave it a couple of hours or overnight before marvelling at your charts.

If you find this useful, please share your experience in the comments, and please do share any tips that make this feature any more useful.

Bandwidth management using traffic shaping with Meraki MX Security Appliances

At Christian Aid, our field offices quite often have limited bandwidth, particularly at sites in Africa.  In some cases we are using expensive VSAT links for these sites, so I am very interested in traffic shaping to ensure we get the best value and don’t spend too much to make our applications work for the local teams.

I haven’t found there to be a lot of material available on traffic shaping strategies.  It is easy enough to find instructions on how to set up traffic shaping, but little out there that explains the theory.  And the theory would assist in working out what settings one wants to apply.  So I’ve ended up taking the following approach based on my own made up theory. Comments would be appreciated!

Set your uplink bandwidth as accurately as possible

For most offices we have a contracted bandwidth level.  In Meraki MX networks, under Traffic shaping, I can set this using a slider, or by clicking details, where I can enter more specific values (necessary where the up and down bandwidth are different, which is often the case).  Here is an image of the UI for this:

Traffic Shaping Uplinks

Note that there are WAN 2 and Cellular options – the Meraki equipment allows for a second failover/load balancing uplink, as well as a 3/4G uplink using a USB dongle.

This is important for two reasons.  Firstly, by setting the available bandwidth here you are ensuring that any queuing that results from bandwidth demand exceeding supply occurs at the MX appliance where you can manage it, rather than at the ISP equipment where you probably can’t.

Secondly, the settings here allow traffic shaping rules to be based on proportions of your available bandwidth.  If you leave these at 100 Mbps (the default) then the priority settings for traffic shaping rules will not work effectively!

Note – I am not sure what to do in case of a contended connection where you know the maximum burst rate and the CIR you may see most of the time.  Please comment if you know about this!

Apply device bandwidth limits if you have limited bandwidth

You want to ensure that you prevent a single user (or device) from hogging all the bandwidth.

Traffic Shaping Global Limits

Using the Global bandwidth limits you can specify the maximum bandwidth available to a single client.  There is also a feature called SpeedBurst which allows a client to burst above this at the start of a single download for 10 seconds.  This reduces the amount that the bandwidth limitation will be noticed by users.  I am not clear whether this has been effective, but it certainly hasn’t hurt.

I’m not sure of the best way to determine the per-client limit.  It’s a balance: you want to prevent a single client from hogging all the bandwidth, but if there is only one client using the connection you don’t necessarily want to stop them using it.  Currently the SpeedBurst setting is the only tool that allows any sort of compromise.

If anyone has any ideas of how to determine a per-client limit, please leave a comment!

Note that you should set the Per-client limit here lower than the overall bandwidth – otherwise it will not achieve anything.

Traffic shaping rules

Now those basic global settings are out of the way, you can start shaping the different kinds of traffic passing through the device.

The first thing you need to do is think about the different kinds of traffic – I find it helpful to make five categories and identify the kinds of traffic in each:

  • Realtime traffic
  • Background traffic
  • Things you want to limit but not block
  • Things you want to block
  • Everything else

The first three are the ones you want to create traffic shaping rules for.  Blocking is achieved through the firewall (and possibly through another filtering solution such as OpenDNS about which I may write soon).  Everything else you just leave alone.

Slicing up the pie

Consider the uplink bandwidth you set as 100%.  One element of traffic shaping is limiting different kinds of traffic to a portion of this hundred percent.  Decide the proportions for each category you want to limit.  I do it as follows:

  • Background traffic – 30% of bandwidth
  • Other things I want to limit but not block – 20%

This leaves 50% of my bandwidth if the background and discouraged traffic is running at full whack.  Here is a schematic that explains better:

Meraki Traffic Shaping Share

Calculate the actual bandwidth limits using the percentages you decide and the uplink bandwidth you set.
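The arithmetic is simple enough to script if you have many sites.  In this sketch the uplink of 1000 Kb/s down and 500 Kb/s up is an assumed example value, and the function and category names are my own; only the 30% and 20% shares come from the split above.

```python
def limits(uplink_kbps: dict, shares_percent: dict) -> dict:
    """Return per-category (down, up) limits in Kb/s, plus what is left over."""
    result = {}
    for category, percent in shares_percent.items():
        result[category] = (
            uplink_kbps["down"] * percent // 100,
            uplink_kbps["up"] * percent // 100,
        )
    remaining_percent = 100 - sum(shares_percent.values())
    result["remaining"] = (
        uplink_kbps["down"] * remaining_percent // 100,
        uplink_kbps["up"] * remaining_percent // 100,
    )
    return result


# Assumed example uplink; substitute the values you set for your own site.
uplink = {"down": 1000, "up": 500}
shares = {"background": 30, "discouraged": 20}

for category, (down, up) in limits(uplink, shares).items():
    print(f"{category}: {down} Kb/s down, {up} Kb/s up")
```

For the assumed uplink this gives 300/150 Kb/s for background traffic and 200/100 Kb/s for discouraged traffic, leaving 500/250 Kb/s even when both are running flat out.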

Realtime traffic

This is the kind of traffic where any delays will be immediately noticeable and are critical to your operations – for example voice traffic, live video (video conferences rather than streaming concert footage) and remote desktop solutions such as Citrix or RDP.

In my opinion this is the only kind of traffic you want to give high priority to.  Don’t be tempted to put email traffic or things that non IT folk might consider high priority in here.

Here is my rule:

Traffic Shaping Real Time Rule

In the definition box I have used a combination of custom rules and Meraki’s predefined categories.

In my organisation I want to prioritise:

  • voice traffic – that means Skype, Lync (now Skype for Business) – Skype is included in the predefined VoIP & Video Conferencing rule, but Lync isn’t yet, so using the available documentation about Lync traffic I specified ports and hosts used in Lync communication.  Update: Meraki support confirmed Skype for Business is included in the Skype rule.
  • video conference traffic – again Skype covers some of this, but also port and host values for VSee which isn’t yet in Meraki’s predefined rules
  • remote desktop traffic – Citrix and RDP – again this isn’t in the specified rules, so I’ve specified our internal hostnames – citrixfarm and the FQDN citrixfarm.caid.local.  I’ve also specified the ports 3389 which is used by RDP traffic.

I’ve allowed this traffic to ignore the per-client limit – if people doing a video conference need it then let them have it even if it gets in the way of other traffic.

I’ve also specified the Priority as high.  Initially I understood this as a QoS setting – if a queue formed at the device, then let this traffic through first.  After reading Meraki’s documentation on prioritisation more deeply I now think this works a bit differently, and it may be better to just use this setting rather than the custom bandwidth limits I’ve specified below.

Lastly there is DSCP tagging – here I am tagging packets matching the rule with DSCP value 7.  I think this may result in the traffic being recognised by upstream devices doing traffic shaping as needing some kind of priority, but I am unsure how effective this is.  Can’t hurt, right?
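For anyone wondering what a DSCP number actually changes in a packet: the value is carried in the upper six bits of the DS field in the IP header (the byte that used to be called ToS).  The sketch below only illustrates that bit layout.  It says nothing about how the MX or any upstream device treats a given value, and the function names are mine.

```python
def dscp_to_tos(dscp: int) -> int:
    """Return the DS/ToS byte for a DSCP value (upper six bits of the byte)."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be between 0 and 63")
    return dscp << 2


def tos_to_dscp(tos: int) -> int:
    """Recover the DSCP value from a DS/ToS byte, ignoring the two ECN bits."""
    return (tos & 0xFF) >> 2


for value in (0, 3, 7, 46):
    print(f"DSCP {value:2d} -> DS byte {dscp_to_tos(value):3d} ({dscp_to_tos(value):08b})")
```

DSCP 46 is included because it is the Expedited Forwarding code point commonly used for voice; whether tagging helps still depends entirely on the devices upstream honouring it.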

Background traffic

TrafficShaping-BackgroundRule

In my analysis of our use of bandwidth in each of our field offices I found that the number one application that was likely to hog bandwidth was email – mostly internal organisational email.  Not torrenting, video or anything else.  Our staff just send and receive a lot of email with large attachments.  The worst cases happen when someone needs to resync their entire mailbox with Outlook.

Staff consider email to be the highest priority, but in most cases they won’t notice if syncing is slow.  Since uncontrolled email regularly hogs the connections, this should be limited.  In our case we use Office 365 email which is matched under the Windows Live Hotmail and Outlook rule (perhaps Meraki will update this label sometime soon?).

I’ve also put software updates in here – both from public update sources such as Windows Update, and also our internal WSUS server lps253gbr.  When Microsoft releases a lot of patches this often ties up limited bandwidth connections for many hours.  We still want the updates, but not at the expense of other things working.  I’m hoping that Microsoft’s use of peer to peer technology in Windows 10 and Windows Update for Business will help us here.

Microsoft Skydrive (now called OneDrive) is also in here.  We are planning on using OneDrive for Business to allow syncing of content – but any syncing tool is potentially a bandwidth hog, so I am trying to manage it.

I’ve specified that traffic matching this rule should be limited to 30% of the uplink – approximately 333 Kb/s down and 150 Kb/s up.  As mentioned previously, the Priority setting may be a better way of applying the limits, but I’m not yet confident I understand why, so I am still manually setting the bandwidth as well.

Finally I’m tagging this with DSCP value 3.  Again, I’m not quite sure I understand which setting to use and may change this later.

Discouraged traffic

Traffic Shaping - Discouraged Rule

The third and last rule is for traffic that I would like to discourage but don’t want to block altogether.

Apple.com is listed here in order to control automatic updating of iOS clients such as iPhones – these updates are often huge, and while I don’t want out of date clients on our networks, I really don’t want their updating to prevent day to day work by other people.

The other items are generally related to non-work related traffic.  It is debatable whether these should really be limited as much as this.  For the moment I’m throttling them in most locations.

The bandwidth is limited to 20% of the uplink – 200 Kb/s down and 100 Kb/s up.  The priority is set as Low and the DSCP tag is 0.

Blocking traffic

Traffic Shaping - Blocking Rules

There isn’t a lot of traffic that I think we should be blocking – it is really things that potentially jeopardise our operations or reputation – namely bittorrenting copyrighted material, which could result in our ISPs disconnecting us, or in us getting threatening messages from litigating bodies, or worse.  This blocking isn’t done on the Traffic Shaping page, but on the Firewall page.

I’ve only selected Peer to Peer and Web file sharing.  Peer to peer blocks BitTorrent and similar traffic, whereas Web file sharing blocks the sites where you find torrent files – e.g. Pirate bay, Kick Ass Torrents etc.

How effective is this?

I had the opportunity to confirm that these rules could improve the experience of using a saturated connection when our office in Delhi started to complain about slow speeds and disconnections.

Before applying traffic shaping rules based on the above the connection looked like this:

TrafficShaping-CongestedLink

The “plateau” shape of this graph is classic saturation – the site is demanding more bandwidth than the link can supply.  Looking at the specific traffic that was happening at the time, I concluded that no-one was abusing the link – it was just the combination of so many staff doing work related things demanding more bandwidth than was available.
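If you wanted to spot this kind of plateau without eyeballing graphs, one crude approach is to check how much of the time usage sits close to the uplink limit.  This is a heuristic of my own rather than anything the dashboard does, and the thresholds (within 90% of the uplink for at least half the samples) are arbitrary assumptions.

```python
def looks_saturated(samples_kbps, uplink_kbps, near=0.9, min_fraction=0.5):
    """True if at least min_fraction of samples reach `near` times the uplink."""
    if not samples_kbps:
        return False
    high = sum(1 for sample in samples_kbps if sample >= near * uplink_kbps)
    return high / len(samples_kbps) >= min_fraction


# Made-up throughput samples (Kb/s) against an assumed 1000 Kb/s uplink.
plateau = [950, 980, 990, 400, 970]
spiky = [100, 950, 200, 300]

print(looks_saturated(plateau, 1000))
print(looks_saturated(spiky, 1000))
```

A link that spikes to the limit occasionally is fine; it is the sustained flat top that suggests demand is being held back.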

After applying traffic shaping rules the link looked far healthier:

Traffic Shaping - After Rules

Staff reported that the connection had improved.  I feel this validated the principles of my rules.  In other locations I think that applying these rules has reduced our need to pay for increased bandwidth – and that was one of the main selling points of adopting Meraki MX Security Appliances in the first place.

However, I still have questions and am not sure I am applying the best possible rules.  Should I use only the Priority settings, or only the custom bandwidth limits?  What are appropriate DSCP tags to apply, and are they at all effective?

If you have any thoughts on this, please leave a comment.

Meraki Webinar, live usage view and Nepal MX64W online!

Meraki Webinar Screenshot

I made a webinar with Meraki about Christian Aid’s use of their technology.  I cover why we decided to replace our Cisco ASA VPN infrastructure with Meraki MX Security Appliances, and what benefits we experienced – both expected and unexpected.

You’ll need to register to watch.  Once you have registered, why not attend a live webinar and get a free Meraki AP with a 3 year license?

Nepal First Two Days

Our Regional ICT Service Manager, Sanjay, has just enabled our first MX64W Security Appliance and WiFi in Kathmandu, Nepal.  Here is a screen shot of the first two days’ use.

I’ve also enabled a live public view of traffic on our WiFi access points in Freetown.

Hoping to see Kindu, Democratic Republic of Congo on Meraki next week, and Managua, Nicaragua shortly after.

Christian Aid Meraki Network in Numbers

Here are some details about our global Meraki network:

2 models of office (soon to be 3)

We have a variety of different sized offices and, since we use donated money, a responsibility to spend it wisely.  So we opted to go with two models.  One for larger offices, and one for smaller offices.

Large office

The large offices have between 10 and 50 staff, and are often spread over quite a large area, and sometimes multiple floors.  For these offices we chose the MX80 Security Appliance, with one or two MR16 Access Points.

Small office

A small office is anything with more than 1 and fewer than 10 staff.  For these offices we opted for an MX60W Security Appliance, which combines the functionality of a Security Appliance with a WiFi radio.

Emergency/micro office

I’ve recently been looking with interest at the Z1 teleworker gateway, which has all the functionality of the MX60W, but at a lower price, and more compact.  I’ll be shipping off our first Z1 to our office in Tacloban, Philippines later this month.  Given the cost and compact nature, I hope we can keep some of these in reserve for emergency response – they will make setting up a secure working environment in situations like Nepal much faster and easier.

27 offices in 26 developing countries

Christian Aid Meraki Networks

Some networks are combined in the above map – there are three networks in DR Congo for example. Offices in Kathmandu, Nepal, Managua, Nicaragua and Tacloban, Philippines are yet to come online.

7 offices in UK

UK Meraki offices Aug 3 2015

This will increase to cover a total of 18 offices, including our offices in Dublin, Ireland and Madrid, Spain.

1.06 TB transferred by over 4000 clients in the last week

Clearly we have a lot of people visiting our offices.  We don’t have nearly that many staff!

72 devices deployed

  • 1 MR12 Access Point (our first device, given free by Meraki to get us hooked – it worked)
  • 18 MR16 Access Points
  • 16 MR18 Access Points
  • 16 MX60W Security Appliances with WiFi
  • 13 MX80 Security Appliances
  • 1 MX100 Security Appliance (at HQ as the VPN hub)
  • 1 MX64W Security Appliance with WiFi
  • 2 MX64 Security Appliances (no WiFi)
  • 1 Z1 Teleworker Gateway – waiting to be shipped to Tacloban, Philippines.