BGP MED Explained

As you probably know, BGP has many attributes that can be used to influence path selection, each with its own characteristics and uses. Today we are going to look specifically at the BGP MED (Multi-Exit Discriminator) attribute, which is typically used to influence how traffic comes back into your own autonomous system. We can usually accomplish the same thing with the AS_PATH attribute by using AS path prepending, but let’s focus on MED. In my experience and studies, MED is one of the most confusing and most often misunderstood topics. Let’s go through some of the misconceptions and see how we can actually use MED to our advantage in the real world or in a CCIE lab. Here is our topology for today:

We have four routers, each in its own AS: R1 in AS 100, R2 in AS 200, R4 in AS 400 and R5 in AS 500. R1 advertises its Loopback0 interface, 1.1.1.1/32, into BGP to both R2 and R4. (Yes, I know there is no R3… a carryover from my days with IPexpert. My home lab is still built around that topology. Call it bad practice, call it habit… whatever.) R5 therefore receives the prefix 1.1.1.1/32 from both R2 and R4. If we leave everything at the defaults, which path will it choose? In this case it actually depends… let’s walk through the BGP best-path selection algorithm. If you are studying for a CCNP or a CCIE, there is no reason you should not have the following link bookmarked: http://www.cisco.com/en/US/tech/tk365/technologies_tech_note09186a0080094431.shtml

1) Highest Weight
2) Highest Local Preference
3) Prefer Locally Originated routes over externally learned routes
4) Shortest AS Path
5) Lowest Origin Type
6) Lowest MED
7) Prefer eBGP over iBGP
8) Prefer the path with the lowest IGP metric to the BGP next-hop
9) Determine if multiple paths need to be installed (BGP multipath)
10) When both paths are external, prefer the one that was learned first (the oldest path)
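
As a rough mental model, the first six steps plus the "oldest path" tie-breaker can be thought of as a sort key. The Python sketch below is only an illustration, not how IOS implements the algorithm. The `Path` class and its field names are hypothetical, the defaults (local preference 100, a missing MED treated as 0) are assumptions based on common Cisco behavior, steps 7 through 9 are left out, and MED is compared naively across all paths, which is a simplification.

```python
from dataclasses import dataclass
from typing import List

# Origin codes ranked as in step 5: IGP (i) < EGP (e) < incomplete (?)
ORIGIN_RANK = {"i": 0, "e": 1, "?": 2}

@dataclass
class Path:
    next_hop: str
    as_path: List[int]
    weight: int = 0                   # step 1: higher wins
    local_pref: int = 100             # step 2: higher wins (assumed default)
    locally_originated: bool = False  # step 3
    origin: str = "i"                 # step 5
    med: int = 0                      # step 6: lower wins (missing MED assumed 0)
    learned_order: int = 0            # step 10: lower means learned earlier

def sort_key(p: Path):
    # Negate the "highest wins" attributes so that min() picks the best path.
    return (
        -p.weight,
        -p.local_pref,
        0 if p.locally_originated else 1,
        len(p.as_path),               # step 4: shortest AS path
        ORIGIN_RANK[p.origin],
        p.med,
        p.learned_order,
    )

def best_path(paths: List[Path]) -> Path:
    return min(paths, key=sort_key)

# R5's two copies of 1.1.1.1/32 with everything left at the defaults.
via_r4 = Path("45.45.45.4", [400, 100], learned_order=2)
via_r2 = Path("25.25.25.2", [200, 100], learned_order=1)
print(best_path([via_r4, via_r2]).next_hop)  # 25.25.25.2
```

With every attribute tied, only the order in which the paths were learned separates them.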

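Well, we have not changed anything yet, so both paths tie on every attribute: same weight and local preference, an AS path two ASes long, origin IGP, and both learned via eBGP. That leaves step 10, the oldest path. Since R2 happened to boot up first and peer with R5 first, R5 prefers the path via R2. Let’s see that in action: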

R5#show ip bgp
BGP table version is 2, local router ID is 45.45.45.5
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*  1.1.1.1/32       45.45.45.4                             0 400 100 i
*>                  25.25.25.2                             0 200 100 i

OK, great. Now let’s say we need R5 to prefer the path via R4 instead. For whatever reason (uncooperative ISP, people who don’t know how to do their job, policy, or simply a lab requirement) you are not allowed to use weight or local preference on any router, and you are not allowed to change the BGP origin type or use AS path prepending either. That leaves us with MED. The best-path algorithm says lowest MED wins, and on Cisco IOS a path that arrives without a MED is treated as having a MED of 0. That is kind of irritating, because we cannot go lower than the default, but OK. Sounds simple: let’s just make the MED on the R1/R2 peering higher. How about 100? R2 should get a MED of 100 from R1, while R4 will get the default MED of 0. When R2 and R4 relay the prefix to R5, the copy that passed through R4 with a MED of 0 should win, right? Perfect, let’s try that out:

R1(config)#route-map SET-MED-100 permit 10
R1(config-route-map)#set metric 100
R1(config-route-map)#exit
R1(config)#router bgp 100
R1(config-router)#neighbor 12.12.12.2 route-map SET-MED-100 out
R1(config-router)#end
R1#clear ip bgp * soft

Let’s verify on R2 and R4

R2#show ip bgp
BGP table version is 3, local router ID is 25.25.25.2
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*> 1.1.1.1/32       12.12.12.1             100             0 100 i

R4#show ip bgp
BGP table version is 2, local router ID is 45.45.45.4
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*  1.1.1.1/32       45.45.45.5                             0 500 200 100 i
*>                  14.14.14.1               0             0 100 i

We can see that R2 has the metric set to 100 on the prefix 1.1.1.1/32, and that R4 still has the default metric of 0. But hmm, that is pretty interesting: why is R4 learning the prefix from both R1 and R5?! Let’s look at R5:

R5#sh ip bgp
BGP table version is 2, local router ID is 45.45.45.5
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*  1.1.1.1/32       45.45.45.4                             0 400 100 i
*>                  25.25.25.2                             0 200 100 i

Oh. Well, first of all, our MED doesn’t appear to have done anything! The metric is not set on the prefix that went from R1 to R2 and up to R5. This also answers our question from above: since R5 still prefers the path via R2, that is the best path in its BGP table, and R5 advertises its best path over to R4. That is why we see the prefix twice on R4.

OK, so after all that what have we learned? Well, if you read the BGP best path selection very carefully you will see the following under the MED section:

This comparison only occurs if the first (the neighboring) AS is the same in the two paths. Any confederation sub-ASs are ignored.

In other words, MEDs are compared only if the first AS in the AS_SEQUENCE is the same for multiple paths. Any preceding AS_CONFED_SEQUENCE is ignored.

So basically what it says is that the MED value is not even looked at during best-path selection if the two paths for a prefix were learned from two different neighboring autonomous systems. In other words, to actually be able to use MED, the paths have to come from the same neighboring AS. MED was designed for two directly connected ASes with more than one link between them, so that one AS can tell the other which entry point it prefers. This is ONE problem people usually have with understanding MED.

The second misunderstanding I usually see is that people forget that MED is an optional non-transitive attribute. What does that mean? Optional means that not every BGP implementation is required to support it. Non-transitive is the more interesting one. For MED, it means that a value received from a neighboring AS may be passed to iBGP peers inside the receiving AS, but it is not passed on to other neighboring ASes. Oh.
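
The comparison rule quoted above can be expressed as a small predicate. This is a minimal sketch rather than any vendor’s implementation: the function names are hypothetical, AS paths are modeled as plain lists of AS numbers with the neighboring AS first, and confederation segments are ignored entirely.

```python
from typing import Sequence

def neighbor_as(as_path: Sequence[int]) -> int:
    """Return the first AS in the path, i.e. the AS the route was learned from.

    Assumes a non-empty, eBGP-learned AS path.
    """
    return as_path[0]

def med_is_compared(path_a: Sequence[int], path_b: Sequence[int]) -> bool:
    """Default behavior: MED is only considered when both paths
    were learned from the same neighboring AS."""
    return neighbor_as(path_a) == neighbor_as(path_b)

# R5's two paths to 1.1.1.1/32: learned from AS 200 and from AS 400.
print(med_is_compared([200, 100], [400, 100]))  # False

# Hypothetical case with two links to the same neighboring AS 200.
print(med_is_compared([200, 100], [200, 100]))  # True
```

The first call is exactly R5’s situation in this lab, which is why any MED on either path is skipped at step 6.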

Let’s put that into perspective for our situation. First of all, we are setting an optional non-transitive attribute on R1, so right away we are out of luck, because R2 and R4 will not pass that MED along to R5. Secondly, even if they did pass the MED up to R5, it wouldn’t matter, because as we saw the MED is only compared if the paths come from the same neighboring AS. What to do, what to do? First, let’s clean up R1:

R1(config)#router bgp 100
R1(config-router)#no neighbor 12.12.12.2 route-map SET-MED-100 out
R1(config-router)#do clear ip bgp * soft
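
To make the first problem concrete, here is a toy model of what happens to the MED as the prefix crosses AS boundaries. It is a sketch under simplifying assumptions: `Route` and `advertise_ebgp` are hypothetical names, only eBGP sessions are modeled, and an outbound policy is reduced to an optional MED value.

```python
from dataclasses import dataclass, replace
from typing import List, Optional

@dataclass(frozen=True)
class Route:
    prefix: str
    as_path: List[int]
    med: Optional[int] = None  # None means no MED attached

def advertise_ebgp(route: Route, local_as: int,
                   outbound_med: Optional[int] = None) -> Route:
    """Advertise a route to a neighbor in another AS.

    The local AS is prepended to the AS path. Any MED that was received
    is dropped; the only MED the neighbor sees is one set by the
    sender's own outbound policy (if any).
    """
    return replace(route, as_path=[local_as] + route.as_path, med=outbound_med)

# R1 (AS 100) originates the prefix and sets MED 100 toward R2.
at_r2 = advertise_ebgp(Route("1.1.1.1/32", []), local_as=100, outbound_med=100)

# R2 (AS 200) re-advertises to R5 with no outbound policy of its own.
at_r5 = advertise_ebgp(at_r2, local_as=200)

print(at_r2.med, at_r2.as_path)  # 100 [100]
print(at_r5.med, at_r5.as_path)  # None [200, 100]
```

The MED set on R1 is visible on R2 but is gone by the time the route reaches R5, which matches the empty Metric column in R5’s table.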

So, to solve problem #1, why don’t we set the MED outbound from R2 to R5 instead? Great. But what about problem #2? Even if R5 gets the MED from R2, it won’t matter, because R5 is receiving the same 1.1.1.1/32 prefix from two different ASes. Well, there is a well-known little command we can use to fix that on R5: bgp always-compare-med, under the BGP process. Let’s get to work then.

First, we configure R2 to set the MED to 100 on its peering to R5:

R2#conf t
Enter configuration commands, one per line.  End with CNTL/Z.
R2(config)#route-map SET-MED-100 permit 10
R2(config-route-map)#set metric 100
R2(config-route-map)#router bgp 200
R2(config-router)#neighbor 25.25.25.5 route-map SET-MED-100 out
R2(config-router)#do clear ip bgp * soft

Next, we tell R5 to always compare MED values, even if the paths come from two different ASes:

R5(config)#router bgp 500
R5(config-router)#bgp always-compare-med
R5(config-router)#do clear ip bgp * soft

Let’s check it out now

R5#show ip bgp
BGP table version is 3, local router ID is 45.45.45.5
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*> 1.1.1.1/32       45.45.45.4                             0 400 100 i
*                   25.25.25.2             100             0 200 100 i

NICE!!!
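
The change in outcome can be reproduced with a small pairwise comparison. This is an illustrative sketch only: `Path` and `pick` are hypothetical names, just steps 4, 6 and 10 are modeled (everything else is assumed tied), and a missing MED is treated as 0, as assumed for Cisco IOS.

```python
from typing import List, NamedTuple

class Path(NamedTuple):
    next_hop: str
    as_path: List[int]
    med: int = 0            # a missing MED is assumed to count as 0
    learned_order: int = 0  # lower means learned earlier

def pick(a: Path, b: Path, always_compare_med: bool = False) -> Path:
    """Choose between two eBGP paths using steps 4, 6 and 10 only."""
    # Step 4: shortest AS path.
    if len(a.as_path) != len(b.as_path):
        return a if len(a.as_path) < len(b.as_path) else b
    # Step 6: lowest MED, but only if the paths are comparable.
    same_neighbor = a.as_path[0] == b.as_path[0]
    if (always_compare_med or same_neighbor) and a.med != b.med:
        return a if a.med < b.med else b
    # Step 10: oldest path.
    return a if a.learned_order <= b.learned_order else b

via_r2 = Path("25.25.25.2", [200, 100], med=100, learned_order=1)
via_r4 = Path("45.45.45.4", [400, 100], learned_order=2)

print(pick(via_r2, via_r4).next_hop)                           # 25.25.25.2
print(pick(via_r2, via_r4, always_compare_med=True).next_hop)  # 45.45.45.4
```

Without the flag, the MED of 100 is ignored and the older path via R2 wins; with it, the path via R4 wins on its lower MED.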

There are two lessons to remember when dealing with MED

1) Remember, it is an optional non-transitive attribute. A MED sent to a neighboring AS is not passed on by that AS to other ASes, so it won’t “bounce” through more than one AS.
2) By default, the MED is only compared when the paths for a prefix are learned from the same neighboring AS. If the prefix is learned from two different ASes and we still want MED to be considered, we need to configure the bgp always-compare-med command!

I hope this blog has been informative and useful for you all! Until next time keep studying hard!

Joe Astorino, CCIE #24347
