Categories
CCNP Enterprise Core (350-401) Cisco Revision Topics Services

Congestion Management: Queuing Algorithms

There are many different types of queuing algorithms, though not all may be suitable in modern networks that can carry many different media types.

FIFO – First In, First Out

The first packet to be placed on an output interface is the first packet that will leave that interface. All traffic belongs to the same class in first in, first out queuing.

RR – Round Robin

Queues are serviced in sequence, one after the other, and each queue processes one packet at a time. There is no queue starvation with round robin, since every queue gets an opportunity to send a packet in every round. No queue has priority over the others, and if the packet sizes from all queues are roughly the same, bandwidth is shared equally. There is no mechanism to prioritise traffic in round robin.

WRR – Weighted Round Robin

Weighted round robin adds prioritisation to the round robin queuing algorithm. It allows a weight to be assigned to each queue, and based on that weight the queue gets a proportionate share of the interface bandwidth that may be more or less than that of the other queues.
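The weighting can be sketched as each queue being allowed up to `weight` packets per scheduling round (a simplified packet-based model; the function and packet names are illustrative, not Cisco's implementation):

```python
from collections import deque

def weighted_round_robin(queues, weights):
    """Serve the queues in turn; a queue with weight w may send
    up to w packets per round (simplified packet-based WRR)."""
    served = []
    while any(queues):
        for q, w in zip(queues, weights):
            for _ in range(w):
                if q:
                    served.append(q.popleft())
    return served

# Queue A weighted 2, queue B weighted 1: A gets roughly
# two service slots for every one that B gets.
a = deque(["a1", "a2", "a3", "a4"])
b = deque(["b1", "b2"])
order = weighted_round_robin([a, b], [2, 1])
```

With equal packet sizes this approximates a 2:1 bandwidth split; with weights of 1 everywhere it degenerates to plain round robin.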

CQ – Custom Queuing

Custom Queuing is Cisco's implementation of the weighted round robin algorithm. It creates a set of 16 queues with a round robin scheduler and first in, first out queuing within each queue.

Each queue can be assigned a portion of the link bandwidth for a selected traffic type. If a portion of the reserved link bandwidth goes unused, other queues can utilise that unused bandwidth. Custom queuing can suffer from long delays, and inherits the foundational issues of first in, first out queuing.

PQ – Priority Queuing

Priority queuing creates four queues assigned to different priorities: high, medium, normal, and low. There is first in, first out queuing within each queue. The high-priority queue is always served first, and lower priority queues are served only once the higher priority queues are empty.

If a higher priority packet arrives while a lower priority queue is being serviced, the lower priority queue is interrupted and the higher priority queue is cleared before servicing of the lower priority queue resumes.

If a higher priority queue is always being serviced, the lower priority queues can suffer from starvation.
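The strict servicing order, and the starvation risk it creates, can be sketched as follows (illustrative names, not Cisco's implementation):

```python
from collections import deque

def priority_dequeue(queues):
    """Return the next packet to send: always drain the highest-priority
    non-empty queue first (queues ordered high -> medium -> normal -> low)."""
    for q in queues:
        if q:
            return q.popleft()
    return None

high = deque(["h1", "h2"])
low = deque(["l1"])
queues = [high, deque(), deque(), low]
out = [priority_dequeue(queues) for _ in range(3)]
# the low queue is only reached once every higher queue is empty
```

If `high` were refilled continuously, `l1` would never be served, which is exactly the starvation problem described above.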

WFQ – Weighted Fair Queuing

The weighted fair queuing algorithm divides the interface bandwidth by the number of flows, weighted by IP Precedence, to allocate bandwidth fairly among all flows. WFQ provides good service for high priority real time flows, but it cannot provide a fixed bandwidth guarantee for any particular flow.

CBWFQ – Class-Based Weighted Fair Queuing

One of the two algorithms recommended for modern networks. Class-based weighted fair queuing allows up to 256 queues to be created, each serving a traffic class. Each queue is serviced based on the bandwidth that has been assigned to its class. Weighted fair queuing functionality is extended to support these user-defined traffic classes.

Packet classification is done in this queuing algorithm by looking at traffic descriptors such as quality of service markings, protocols, access control lists and input interfaces.

Once a flow has been classified by its traffic descriptors, it can be assigned a specific amount of bandwidth, a weight, a queue limit and a maximum packet limit.

The bandwidth that is assigned to a class is a minimum amount of bandwidth that will be provided to that class during periods of congestion on the interface.

The queue limit for the class is the maximum number of packets allowed to be buffered in that class queue.

If the queue reaches the configured packet limit, excess packets will be dropped.

This algorithm provides no guarantee of low latency, so it is not suitable for real time flows such as audio and video calls.

LLQ – Low Latency Queuing

Low latency queuing is the other recommended algorithm for modern networks, and combines class-based weighted fair queuing with priority queuing. It makes up for the disadvantage CBWFQ faces with real time traffic by guaranteeing both low latency and bandwidth.

Traffic assigned to the strict-priority queue is serviced up to its assigned bandwidth before the other CBWFQ queues are serviced.

Real time traffic must be assigned to the strict-priority queue in order for packets to move quickly.

There can be multiple classes defined for servicing real time traffic with separate bandwidth guarantees on each, but there is a single priority queue that combines all of these classes together to be serviced first.

If a traffic class is not utilising its assigned bandwidth, that bandwidth can be used by other classes.

If there is congestion on the link, real time traffic that goes beyond its assigned bandwidth will be policed to ensure that non-priority traffic is not starved. The policing rate must be specified as a fixed amount of bandwidth or as a percentage of the interface bandwidth.
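A minimal sketch of the LLQ idea: the strict-priority queue is served first but only while it has policer credit, so the class-based queues are not starved. Real LLQ polices by bandwidth rate rather than a simple packet count, and all names here are illustrative:

```python
from collections import deque

def llq_dequeue(priority_q, cbwfq_queues, pq_tokens):
    """One scheduler step: serve the strict-priority queue while it
    holds packets AND the policer still has credit; otherwise fall
    back to the class-based queues (simple round robin here)."""
    if priority_q and pq_tokens > 0:
        return priority_q.popleft(), pq_tokens - 1
    for q in cbwfq_queues:
        if q:
            return q.popleft(), pq_tokens
    return None, pq_tokens

# Voice has credit for 2 packets; the third voice packet must yield
# so that the data class is not starved during congestion.
voice = deque(["v1", "v2", "v3"])
data = deque(["d1"])
tokens = 2
sent = []
for _ in range(3):
    pkt, tokens = llq_dequeue(voice, [data], tokens)
    sent.append(pkt)
```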


Policing and Shaping: Types of Policers

Single-Rate Two-Colour Markers and Policers

The first policers implemented used a single-rate two-colour model based on a single token bucket algorithm.

For this type of policer, traffic either conforms to the committed information rate or exceeds it. Traffic can be marked down or dropped in either of these two states.
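The single token bucket decision can be sketched as follows, assuming packets are represented as (arrival time in seconds, size in bits) pairs and the bucket starts full (an illustrative model, not router code):

```python
def two_colour_police(packets, cir_bps, bc_bits):
    """Single-rate two-colour policer sketch: one token bucket of depth
    bc_bits, refilled at cir_bps. Returns 'conform' or 'exceed' per packet."""
    tokens = bc_bits        # bucket starts full
    last = 0.0
    results = []
    for t, size in packets:
        # replenish tokens for the time elapsed, capped at the burst size
        tokens = min(bc_bits, tokens + (t - last) * cir_bps)
        last = t
        if size <= tokens:
            tokens -= size
            results.append("conform")   # transmit
        else:
            results.append("exceed")    # drop or mark down
    return results
```

For example, with a CIR of 8,000 bps and a 4,000-bit bucket, two back-to-back packets can exhaust the bucket, and a later packet conforms again once tokens have been replenished.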

Single-Rate Three-Colour Markers and Policers

A single rate three colour policer algorithm is based on RFC 2697.

This type of policer uses two token buckets. Traffic can be classed as conforming to, exceeding, or violating the committed information rate.

Traffic can be marked down or dropped on any of these three traffic states.

The first token bucket operates in a similar way to a single-rate two colour system. The difference between this system and the earlier two-colour system is that instead of discarding excess tokens in the first token bucket, they are stored in the second bucket to be used later for temporary bursts that exceed the committed information rate.

Tokens that are placed in the second bucket are known as excess burst. Excess burst is the maximum number of bits that can exceed the committed burst size.

The two bucket algorithm works well with TCP, causing fewer TCP retransmissions and making more efficient utilisation of bandwidth. It is an ideal policer to use with Assured Forwarding (AF) classes.

This system defines the three-colour mechanism, with three different classes for traffic.

Conforming

Traffic under the committed burst size is classed as conforming, or ‘green’. This traffic is normally transmitted and left unmarked.

Exceeding

Traffic over the committed burst size but under the excess burst size is classed as exceeding, or ‘yellow’. Exceeding traffic can be dropped, or marked down and transmitted.

Violating

Traffic over the committed burst size and excess burst size is classed as violating, or ‘red’. This type of traffic is usually dropped but can be marked down instead.

Using this three-colour system makes sense when different actions are taken at each level, e.g. marking down exceeding traffic and dropping violating traffic.
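The two-bucket behaviour described above can be sketched as follows, loosely after RFC 2697 in colour-blind mode (variable names are illustrative; tokens that overflow the committed bucket spill into the excess bucket):

```python
def sr_tcm(packets, cir_bps, cbs_bits, ebs_bits):
    """Single-rate three-colour marker sketch (after RFC 2697,
    colour-blind mode). packets are (arrival time, size in bits)."""
    tc, te = cbs_bits, ebs_bits   # both buckets start full
    last = 0.0
    colours = []
    for t, size in packets:
        fill = (t - last) * cir_bps
        last = t
        # tokens beyond what the committed bucket can hold spill to excess
        spare = max(0.0, fill - (cbs_bits - tc))
        tc = min(cbs_bits, tc + fill)
        te = min(ebs_bits, te + spare)
        if size <= tc:
            tc -= size
            colours.append("green")    # conforming
        elif size <= te:
            te -= size
            colours.append("yellow")   # exceeding
        else:
            colours.append("red")      # violating
    return colours
```

A burst that drains the committed bucket is still serviced from the excess bucket (yellow) before anything is classed red.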

Parameters Used

The single-rate three colour marker and policer uses the following parameters to meter traffic:

  1. Committed Information Rate
  2. Committed Burst Size
  3. Excess Burst Size
  4. Committed Burst Size Bucket Token Count
  5. Excess Burst Size Bucket Token Count
  6. Incoming Packet Length

Two-Rate Three-Colour Markers and Policers

The two-rate three-colour system is based on RFC 2698 and is similar to the single-rate three-colour policer.

The difference between the two is that a single-rate three-colour policer relies on excess tokens from the committed burst size bucket, which introduces a level of variability and unpredictability in traffic flows. The two-rate three-colour marker addresses this by using two distinct rates: the committed information rate and the peak information rate.

The two rate three colour marker and policer allows for a sustained excess rate based on the peak information rate. This allows for different actions to be performed on the traffic that exceeds different burst values. Violating traffic can be dropped at a defined rate; this is not possible with a single rate system.

The two-rate three-colour system uses two token buckets. Instead of transferring excess unused tokens from the committed burst size bucket to the excess burst bucket, the policer has two entirely separate buckets filled at two different token rates.

The excess burst bucket is filled with peak information rate tokens, and the committed burst size bucket is filled with committed information rate tokens. The excess burst represents the peak limit of traffic that can be sent during a sub-second interval.

The initial check is whether the traffic is within the peak information rate; if it is not, it is in the violating state. If it is, the committed information rate comparison is then made to determine whether the traffic is conforming or exceeding.
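That check order can be sketched as follows, loosely after RFC 2698 in colour-blind mode, with the peak information rate tested first (illustrative names, not router code):

```python
def tr_tcm(packets, cir_bps, cbs_bits, pir_bps, pbs_bits):
    """Two-rate three-colour marker sketch (after RFC 2698, colour-blind
    mode): two independent buckets filled at two different rates."""
    tc, tp = cbs_bits, pbs_bits   # committed and peak buckets start full
    last = 0.0
    colours = []
    for t, size in packets:
        dt = t - last
        last = t
        tc = min(cbs_bits, tc + dt * cir_bps)
        tp = min(pbs_bits, tp + dt * pir_bps)
        if size > tp:
            colours.append("red")      # over PIR: violating
        elif size > tc:
            tp -= size
            colours.append("yellow")   # over CIR, within PIR: exceeding
        else:
            tp -= size
            tc -= size
            colours.append("green")    # within CIR: conforming
    return colours
```

Because the peak bucket drains independently, a sustained excess rate up to the PIR can be marked yellow at a predictable rate, which the single-rate policer cannot do.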

Parameters Used

  1. Committed Information Rate
  2. Peak Information Rate
  3. Committed Burst Size
  4. Peak Burst Size
  5. Committed Burst Size Token Count
  6. Peak Burst Size Token Count
  7. Incoming Packet Length

Policing and Shaping: Single Token Bucket Algorithm Example

An example of how a single token bucket algorithm works:

There is a 1 Gbps interface, with a policer defined with a committed information rate of 120 Mbps and a committed burst size of 12 Mb. A committed time interval cannot be configured in Cisco IOS but can be calculated.

Committed time interval = ( Committed Burst Size in bits / Committed Information Rate in bits per second) x 1000

Committed Time Interval = ( 12,000,000 / 120,000,000) * 1000 = 100ms

Once the committed time interval has been calculated, the number of committed time intervals in a second can be calculated:

Committed Time Intervals per Second = 1000 / Committed Time Interval

Committed Time Intervals per Second = 1000 / 100 = 10

Suppose a continuous stream of 1500-byte (12,000-bit) packets is processed by the token bucket algorithm. Only a committed burst size of 12 Mb can be consumed by packets within each committed time interval.

Number of packets that can conform with each committed time interval = Committed Burst Size / packet size in bits (rounded down)

Number of packets that can conform with each committed time interval = 12,000,000 / 12,000 = 1000 packets

Any additional packets beyond a total of 1000 will be dropped or marked down.

To figure out how many packets would be sent in one second:

Packets per second = Number of packets that conform within each committed time interval * committed time intervals per second

Packets per second = 1000 * 10 = 10,000 packets

To calculate the committed information rate for the 10,000 packets per second, use the following formula:

CIR = Packets per second * Packet size in bits

CIR = 10,000 packets per second * 12,000 bits = 120,000,000 bps = 120Mbps

To calculate the time interval it would take for 1000 packets to be sent at interface line rate:

Time Interval Line Rate = (Committed Burst Size / Interface Speed) * 1000

Time Interval Line Rate = (12 Mb / 1 Gbps) * 1000

Time Interval Line Rate = (12,000,000 bits / 1,000,000,000 bps) * 1000 = 12 milliseconds

The recommended values for the committed time interval range from 8 ms to 125 ms. Shorter time intervals are recommended to reduce delays for sensitive traffic such as voice or video. Intervals longer than 125 ms are not recommended, as the delay becomes too large.
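The worked numbers above can be verified in a few lines of Python:

```python
# Check the worked single token bucket example figures
cir_bps = 120_000_000           # committed information rate: 120 Mbps
bc_bits = 12_000_000            # committed burst size: 12 Mb
packet_bits = 12_000            # one 1500-byte packet
line_rate_bps = 1_000_000_000   # 1 Gbps interface

tc_ms = bc_bits * 1000 / cir_bps            # committed time interval (ms)
intervals_per_s = 1000 / tc_ms              # intervals per second
pkts_per_interval = bc_bits // packet_bits  # conforming packets per interval
pkts_per_s = pkts_per_interval * intervals_per_s
cir_check = pkts_per_s * packet_bits        # recovers the configured CIR
line_ms = bc_bits * 1000 / line_rate_bps    # time to send Bc at line rate
```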


Reverse Path Forwarding (RPF)

Reverse Path Forwarding is an algorithm that helps prevent loops and ensures that multicast traffic arrives on the correct interface.

The RPF algorithm has several features:

  • When a router receives a multicast packet on the interface that it uses to send unicast packets towards the source, that interface is the RPF interface.
  • When a packet arrives on the RPF interface, the router forwards the packet out of the interfaces in the outgoing interface list.
  • When a packet arrives on an interface that is not the RPF interface, the router discards it to try to prevent loops from occurring.
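The RPF decision in the list above can be sketched as follows (the dictionary is a stand-in for a real unicast routing table lookup, and the interface names are hypothetical):

```python
def rpf_check(source_ip, in_interface, unicast_table):
    """RPF decision sketch: unicast_table maps a source address to the
    interface the router would use to reach that source. Forward only
    if the packet arrived on that interface; otherwise discard it."""
    return "forward" if unicast_table.get(source_ip) == in_interface else "discard"

# Hypothetical route: source 10.1.1.1 is reached via Gi0/1,
# so Gi0/1 is the RPF interface for that source.
table = {"10.1.1.1": "Gi0/1"}
```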

PIM will utilise source trees between the source and last hop router, and between the source and the rendezvous point. Shared trees are established between the rendezvous point and the last hop routers. RPF checks are performed differently for each tree:

  1. If a PIM router has an (S, G) entry in the multicast routing table (a shortest path tree), the router will perform an RPF check against the IP address of the source of the multicast packet.
  2. If a PIM router has no source-tree state, it is considered to be in a shared tree state. The router will perform an RPF check on the address of the rendezvous point, which is known when members join the group.

Sparse mode uses the RPF lookup function to determine where joins and prunes need to be sent. (S, G) joins are sent towards the source, and (*, G) joins are sent towards the rendezvous point.


PIM Sparse Mode – Designated Routers

Where there are multiple PIM-SM routers on a subnet, an election is held to determine a designated router (DR). The designated router helps prevent duplicate multicast traffic from being sent to the rendezvous point.

In an election between PIM sparse mode routers, the highest priority wins the election. By default the priority of PIM routers is 1. If all routers in a subnet have the same priority, the election tiebreaker is the highest IP address in the subnet.
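The election rule can be sketched as a simple comparison (the router addresses below are documentation examples, not from any real topology):

```python
import ipaddress

def elect_dr(routers):
    """PIM-SM DR election sketch: highest priority wins; a tie is
    broken by the highest IP address on the subnet. routers is a
    list of (priority, ip_address_string) tuples."""
    return max(routers, key=lambda r: (r[0], ipaddress.ip_address(r[1])))

# All routers at the default priority of 1: the highest address wins
subnet = [(1, "192.0.2.1"), (1, "192.0.2.9"), (1, "192.0.2.5")]
```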

On the first hop router, the designated router is responsible for encapsulating register messages (in unicast) that are originated by a source, with a destination of the rendezvous point.

On a last hop router, the designated router is responsible for sending PIM join and prune messages towards the rendezvous point to inform it about group membership. The last hop router is also responsible for the shortest path tree switchover.

If there were no designated router, all of the last hop routers on the same subnet would send PIM joins upstream. This may result in duplicate multicast traffic arriving on the LAN.

For first hop routers, if there is no elected designated router, multiple register messages may be sent to the rendezvous point.

If no hello messages are received within 105 seconds, 3.5 times the hello interval, a new designated router will be elected.


PIM Shortest Path Tree Switchover

PIM sparse mode allows the last hop router to switch from the shared tree to a shortest path tree for a specific source.

This is the default behaviour in Cisco routers and happens after the first multicast packet is received from the rendezvous point via the shared tree, even if the shortest path is through the rendezvous point.

When the last hop router receives the first multicast packet from the rendezvous point, it will become aware of the IP address of the multicast source. The last hop router will check its unicast routing table to see where the shortest path to the source is. It will send an (S, G) PIM join hop-by-hop towards the first hop router to form a shortest path tree.

If the shortest path tree upstream is out of an interface that is not currently the reverse path forwarding interface, the router will send a PIM prune message to the rendezvous point through the existing RPF interface, and change the RPF interface to the interface that is part of the shortest path tree towards the source of the multicast stream. Pruning the branch helps stop duplicate traffic from reaching the last hop router from multiple directions.

After the pruning, if the rendezvous point has no other interfaces that are interested in receiving the multicast traffic, it will send a prune message towards the first hop router. If there are routers between the rendezvous point and the first hop router, the prune messages will travel hop-by-hop along the path.


PIM Terminology

Reverse Path Forwarding (RPF) Interface

The reverse path forwarding interface is the interface with the lowest cost path to the IP address of the root of the shortest path tree (the source of the multicast stream).

The lowest cost is based on the factors of administrative distance and metric. If there are multiple interfaces with the same cost, a tiebreaker is carried out with the interface of the highest IP address elected as the preferred path.
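The selection and tiebreak can be sketched as follows (the candidate tuples and interface names are illustrative):

```python
import ipaddress

def select_rpf_interface(candidates):
    """RPF interface selection sketch: candidates are
    (admin_distance, metric, interface_ip, name) tuples. The lowest
    (AD, metric) pair wins; a cost tie is broken by the highest IP."""
    best = min(candidates,
               key=lambda c: (c[0], c[1], -int(ipaddress.ip_address(c[2]))))
    return best[3]

# Equal-cost candidates: the interface with the higher address wins
paths = [(110, 20, "10.0.0.1", "Gi0/1"), (110, 20, "10.0.0.2", "Gi0/2")]
```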

RPF Neighbour

The RPF neighbour is the PIM neighbour on the RPF interface.

Upstream

Upstream is towards the root of the tree. This can be towards the source of the multicast traffic in a shortest path tree or towards the rendezvous point router in a shared tree.

Upstream Interface

The upstream interface is the interface towards the root of the tree. It can also be known as the RPF interface or the incoming interface.

Downstream

Downstream is away from the source of the tree towards the receiving host.

Downstream Interface

A downstream interface is an interface that is used to forward multicast traffic towards the receiving hosts.

Incoming Interface (IIF)

An incoming interface accepts multicast traffic coming from a source of multicast traffic. It is the same as an RPF interface.

Outgoing Interface (OIF)

An outgoing interface is used to forward multicast traffic down the tree towards the receiving hosts.

Outgoing Interface List (OIL)

An outgoing interface list is a collection of outgoing interfaces that are forwarding multicast traffic to the same group.

Last Hop Router (LHR)

The last hop router is the router that is directly attached to receiving hosts. It is also known as the leaf router. It sends PIM joins upstream towards the rendezvous or the source.

First Hop Router (FHR)

The first hop router is directly attached to the multicast source. It is also known as the root router. It sends register messages towards the rendezvous point.

Multicast Routing Information Base (MRIB)

The multicast routing information base is a topology table, also known as the multicast route table, or mroute table. The multicast routing information base contains the source (S), group (G), incoming interfaces, outgoing interfaces, and RPF information for each multicast route.

Multicast Forwarding Information Base (MFIB)

The multicast forwarding information base contains multicast forwarding information to make forwarding faster through hardware.

Multicast State

The multicast forwarding state is used by a router to forward multicast traffic. It is composed of entries found in the mroute table (S, G, IIF, OIF).


PIM Distribution Trees

A multicast router will create a distribution tree that will define the path multicast traffic follows to reach receiver devices.

There are two types of multicast distribution trees: source trees, also known as shortest path trees (SPTs), and shared trees.

Source Tree

A source tree is a multicast distribution tree where the source of the multicast stream is the root of the tree. There are branches that form a tree through the network to the receivers.

When the source tree has been built, it will utilise the shortest path from the source of the multicast traffic to the destination hosts. This is where the source tree gets the definition: shortest path tree.

The shortest path tree uses a forwarding state, known as (Source, Group), or (S, G).

The source is the sender of the multicast packets, or the source of the packet flows. The group is the multicast group address.

Since shortest path trees begin at the source of the multicast stream, every sending source requires its own shortest path tree.

Shared Tree

A shared tree is a multicast distribution tree where the root is not at the source of the multicast stream. The root of a shared tree is defined at a router known as the rendezvous point, or RP.

Shared trees can also be known as RP trees (RPT) due to this characteristic.

Multicast traffic is forwarded down the shared tree according to the group address that the packets are addressed to, regardless of the source address.

The forwarding state of an RPT is known as (*, G).

Shared trees require fewer multicast entries. If more sources are introduced utilising the same multicast group, it will not require a separate entry in a shared tree as it would in a shortest path tree.

This advantage also works to the shared tree's disadvantage. All receiver hosts in the shared tree will receive traffic sent by any host to the multicast group address. This can generate a lot of unwanted network traffic.

Without policing of the source address in a shared tree, a security issue is introduced, as any host can send data to the shared multicast address.


Multicast Addressing

The Internet Assigned Numbers Authority (IANA) assigned the IP Class D space for multicast addressing: 224.0.0.0/4.

This includes the range 224.0.0.0 to 239.255.255.255.

Address Range                    Description
224.0.0.0 – 224.0.0.255          Local Network Control
224.0.1.0 – 224.0.1.255          Internetwork Control
224.0.2.0 – 224.0.255.255        Ad-hoc 1
224.1.0.0 – 224.1.255.255        Reserved
224.2.0.0 – 224.2.255.255        SDP/SAP
224.3.0.0 – 224.4.255.255        Ad-hoc 2
224.5.0.0 – 224.255.255.255      Reserved
225.0.0.0 – 231.255.255.255      Reserved
232.0.0.0 – 232.255.255.255      Source Specific Multicast
233.0.0.0 – 233.251.255.255      GLOP
233.252.0.0 – 233.255.255.255    Ad-hoc 3
234.0.0.0 – 238.255.255.255      Reserved
239.0.0.0 – 239.255.255.255      Administratively Scoped

Multicast ranges and their definitions

Local Network Control (224.0.0.0/24)

Addresses in this range are never forwarded outside of a broadcast domain. They are used for protocol control traffic, such as all hosts in the subnet (224.0.0.1), all routers in the subnet (224.0.0.2), and all PIM routers (224.0.0.13).

Internetwork Control (224.0.1.0/24)

Addresses in the internetwork control block may be forwarded across broadcast domains, and even across the internet. Examples of protocols that use the internetwork control range are NTP (224.0.1.1), Cisco RP Announce (224.0.1.39), and Cisco RP Discovery (224.0.1.40).

Source Specific Multicast (232.0.0.0/8)

The 232.0.0.0/8 range is used by SSM. Source Specific Multicast is an extension to PIM defined in RFC 4607. SSM forwards traffic to receivers only from the multicast sources that the receivers have explicitly requested.

GLOP (233.0.0.0/8)

Addresses in the GLOP block are statically assigned. They are intended for domains with a 16-bit AS number, mapping the domain's AS number, expressed in octets as X and Y, into the middle two octets of the GLOP block (233.X.Y.0/24).

Domains with 32-bit AS numbers can utilise space in Ad-hoc 3 or utilise IPv6 multicast addressing.
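The GLOP mapping can be sketched in a few lines (AS 5662 is the example used in RFC 3180):

```python
def glop_prefix(as_number):
    """Map a 16-bit AS number into its GLOP /24 (233.X.Y.0/24),
    where X and Y are the high and low octets of the AS number."""
    if not 0 <= as_number <= 0xFFFF:
        raise ValueError("GLOP only covers 16-bit AS numbers")
    x, y = as_number >> 8, as_number & 0xFF
    return f"233.{x}.{y}.0/24"

# AS 5662 -> octets 22 and 30 -> 233.22.30.0/24
```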

Administratively Scoped Block (239.0.0.0/8)

These addresses are specified in RFC 2365 and are limited to a local group or organisation. They are similar to private IP unicast ranges such as 10.0.0.0/8 and 192.168.0.0/16, and network administrators can utilise this range as they please without worrying that it will conflict with another organisation on the internet. Even though SSM has been assigned 232.0.0.0/8, it is not uncommon to see it utilise 239.0.0.0/8.

Well Known Multicast Addresses

Multicast Address    Description
224.0.0.1            All hosts in subnet
224.0.0.2            All routers in subnet
224.0.0.5            All OSPF routers
224.0.0.6            All OSPF DRs
224.0.0.9            All RIPv2 routers
224.0.0.10           All EIGRP routers
224.0.0.13           All PIM routers
224.0.0.18           VRRP
224.0.0.22           IGMPv3
224.0.0.102          HSRPv2 and GLBP
224.0.1.1            NTP
224.0.1.39           Cisco RP Announce (Auto RP)
224.0.1.40           Cisco RP Discovery (Auto RP)

Well known multicast addresses


BGP Path Attributes: Minimum Cluster List Length

With a tie on the router ID, it falls to the minimum cluster list length to try to break the deadlock.

This step uses the cluster list to locate the path that has travelled the fewest iBGP advertisement hops.

The cluster list is a non-transitive BGP attribute that is appended to by a route reflector with its cluster ID.

Cluster IDs are used by route reflectors as a loop-prevention mechanism.

This cluster ID is only locally significant and is not advertised between autonomous systems.
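The length comparison itself is simple to sketch (the cluster IDs below are illustrative, and this is only the one tiebreak step, not a full BGP best-path implementation):

```python
def prefer_by_cluster_list(path_a, path_b):
    """Prefer the path whose cluster list is shorter, i.e. the one
    that passed through fewer route reflectors. Each path is
    represented here only by its list of cluster IDs."""
    return path_a if len(path_a) <= len(path_b) else path_b

once_reflected = ["1.1.1.1"]                 # one reflection hop
twice_reflected = ["1.1.1.1", "2.2.2.2"]     # two reflection hops
```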