Introduction
In part 2 we continue with the integration of a PVE cluster with a Cisco 9kv running NX-OS.
Figure 1 shows the network diagram.

We are using the 9000v as an eBGP peer for the PVE cluster. Any device that supports BGP can replace the Cisco 9k.
Setup
We have a PVE cluster. However, we are now connecting it to two tenants and accessing them via eBGP peering. This neatly separates the functions of the cluster (internal machines) from access to external clients.
A few notes:
1) We have two zones configured in the cluster.
a) Zone 1 for tenant A (TNA), 10.10.100.0/24
b) Zone 2 for tenant B (TNB), 10.10.200.0/24
c) Make sure the containers within each zone can ping each other.
2) Once again we use a manual fabric. This time though we configure OSPF so the cluster and the Cisco are reachable sans static routes.
Configuration
PVE
Configure a standard PVE cluster.
1) Configure the DATA segment. Add a new vmbr1 bridge on each node and give it the corresponding IP address.
2) Create an EVPN controller (DC1) and add each IP configured in (1) to it: 192.168.1.41, 192.168.1.42 and 192.168.1.43. Use ASN 65000, the default.
3) Create an eBGP controller and add 192.168.1.10 (IP of the Cisco). Use PX3 as the source of the peering.
4) Create two zones:
a) v100 and v200. Put them under the controller that you just created.
b) Assign vxlan:10000 to v100, vxlan:20000 to v200.
c) Create two VNETS:
1. TEST1: vni 11000 on v100
2. TEST2: vni 21000 on v200
d) Assign 10.10.100.0/24 (GW 10.10.100.1) to TEST1, 10.10.200.0/24 (GW 10.10.200.1) to TEST2.
e) Configure SNAT if you wish. IMPORTANT: make PX3 the exit node for the zones. This will guarantee that BGP routes are advertised correctly.
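Before typing the addressing from step (4) into the GUI, it can help to sanity-check it. The following is a minimal sketch, not part of PVE: it only uses Python's standard `ipaddress` module, and the `VNETS` table and `check_vnets` helper are hypothetical names introduced here. It verifies that each gateway sits inside its subnet and that the two tenant subnets do not overlap.

```python
import ipaddress

# VNET name -> (subnet, gateway), copied from steps (c) and (d) above.
VNETS = {
    "TEST1": ("10.10.100.0/24", "10.10.100.1"),
    "TEST2": ("10.10.200.0/24", "10.10.200.1"),
}


def check_vnets(vnets):
    """Return a list of problems; an empty list means the plan is consistent."""
    problems = []
    nets = {}
    for name, (subnet, gateway) in vnets.items():
        net = ipaddress.ip_network(subnet)
        if ipaddress.ip_address(gateway) not in net:
            problems.append(f"{name}: gateway {gateway} is outside {net}")
        nets[name] = net
    names = sorted(nets)
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if nets[first].overlaps(nets[second]):
                problems.append(f"{first} and {second} overlap")
    return problems


if __name__ == "__main__":
    print(check_vnets(VNETS) or "VNET addressing is consistent")
```

Overlapping tenant subnets would make the per-tenant prefix lists on the 9k ambiguous, so catching that early is cheap insurance.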
Create the file “snd-local” under “/etc/network/interfaces.d” and add the following to it:
auto dummy_DC1
iface dummy_DC1 inet static
    address 10.10.10.3/32
    link-type dummy
    ip-forward 1

auto vmbr1
iface vmbr1 inet static
    address 192.168.1.43/24
    ip-forward 1
Repeat the above on every node, making sure to enter the correct IP addresses for each one. Here DC1 refers to the controller name created earlier.
Create “frr.conf.local” in the “/etc/frr/” directory:
router ospf
 ospf router-id 10.10.10.3
exit
!
interface dummy_DC1
 ip ospf area 0
 ip ospf passive
exit
!
interface vmbr1
 ip ospf area 0
exit
Do the same on each node and enter the correct router-id.
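Since the same two files are repeated on every node with only the addresses changed, a small generator avoids copy-and-paste mistakes. This is a hedged sketch, not a PVE tool: the `NODES` table and `render` helper are hypothetical, and the dummy addresses for nodes 1 and 2 are an assumption that simply follows the 10.10.10.3 pattern shown for node 3.

```python
# Per-node values. The vmbr1 addresses come from step (2) above; the dummy
# addresses for nodes 1 and 2 are ASSUMED to follow the 10.10.10.3 pattern
# that the article shows for node 3.
NODES = {
    1: {"data_ip": "192.168.1.41", "router_id": "10.10.10.1"},
    2: {"data_ip": "192.168.1.42", "router_id": "10.10.10.2"},
    3: {"data_ip": "192.168.1.43", "router_id": "10.10.10.3"},
}

INTERFACES_TEMPLATE = """\
auto dummy_{controller}
iface dummy_{controller} inet static
    address {router_id}/32
    link-type dummy
    ip-forward 1

auto vmbr1
iface vmbr1 inet static
    address {data_ip}/24
    ip-forward 1
"""

FRR_TEMPLATE = """\
router ospf
 ospf router-id {router_id}
exit
!
interface dummy_{controller}
 ip ospf area 0
 ip ospf passive
exit
!
interface vmbr1
 ip ospf area 0
exit
"""


def render(node, controller="DC1"):
    """Return (interfaces snippet, frr.conf.local text) for one node."""
    values = dict(NODES[node], controller=controller)
    return INTERFACES_TEMPLATE.format(**values), FRR_TEMPLATE.format(**values)


if __name__ == "__main__":
    for node in sorted(NODES):
        interfaces, frr = render(node)
        print(f"# --- node {node}: interfaces snippet ---\n{interfaces}")
        print(f"# --- node {node}: frr.conf.local ---\n{frr}")
```

The output still has to be copied to each node by hand; the sketch only guarantees that the dummy address and the OSPF router-id stay in step.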
Create “/etc/default/frr” and add “ospfd=yes”. This will prevent PVE from disabling “ospfd” under the daemons setting every time you apply new SDN settings.
Apply the SDN settings and make sure there are no issues; if there are, correct them.
Finally, create containers in each zone and make sure they can ping each other. Check that OSPF is running and peering is up. Use “vtysh” to do so; the syntax is similar to that of Cisco’s CLI.
Instead of creating dummy interfaces you could also have used the standard loopback interface “lo” that Linux uses. I did it this way to show how to manipulate the creation of interfaces outside the GUI.
Cisco 9000v
Configuration
Below are the pertinent parts of the 9k configuration.
nv overlay evpn
feature ospf
feature bgp
feature pim
feature fabric forwarding
feature interface-vlan
feature vn-segment-vlan-based
feature nv overlay
Next the vrf part of the setup.
ip prefix-list tna seq 10 permit 10.10.100.0/24
ip prefix-list tna-export seq 10 permit 172.16.10.0/24
ip prefix-list tna-export seq 20 permit 192.168.0.0/30
ip prefix-list tnb seq 10 permit 10.10.200.0/24
ip prefix-list tnb-export seq 10 permit 172.16.20.0/24
ip prefix-list tnb-export seq 20 permit 192.168.4.0/30
route-map tna-export permit 10
  match ip address prefix-list tna-export
route-map tna-import permit 10
  match ip address prefix-list tna
route-map tnb-export permit 10
  match ip address prefix-list tnb-export
route-map tnb-import permit 10
  match ip address prefix-list tnb
vrf context management
vrf context tna
  rd 10.10.10.10:100
  address-family ipv4 unicast
    route-target import 10.10.10.10:100
    route-target export 10.10.10.10:100
    import vrf default map tna-import
    export vrf default map tna-export
vrf context tnb
  rd 10.10.10.10:200
  address-family ipv4 unicast
    route-target import 10.10.10.10:200
    route-target export 10.10.10.10:200
    import vrf default map tnb-import
    export vrf default map tnb-export
We need to leak routes between the global routing table (GRT) and the VRFs TNA and TNB. The reason is that the eBGP controller on the cluster sends the routes of the VNETs to the 9k, and they need to be injected into the corresponding tenants.
We must also make sure we do not leak routes from VRF TNA into VRF TNB or vice versa. When the routes from the tenants are sent back to the cluster, PVE will put them in the correct VNET.
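The effect of the prefix lists and route maps can be modelled in a few lines. This is a simplified sketch under stated assumptions, not a simulation of NX-OS: it treats every prefix-list entry as an exact match (none of them uses `le` or `ge`), it applies the export maps first and the import maps second, and the routes in `TENANT_ROUTES` are assumed announcements (192.168.0.4/30 in particular is a guess based on the tenant B link address). The names `simulate`, `CLUSTER_ROUTES` and `TENANT_ROUTES` are hypothetical.

```python
import ipaddress

# Prefix lists copied from the configuration above.
PREFIX_LISTS = {
    "tna": ["10.10.100.0/24"],
    "tna-export": ["172.16.10.0/24", "192.168.0.0/30"],
    "tnb": ["10.10.200.0/24"],
    "tnb-export": ["172.16.20.0/24", "192.168.4.0/30"],
}

# Which prefix list guards each leak direction, per VRF.
VRF_MAPS = {
    "tna": {"import": "tna", "export": "tna-export"},
    "tnb": {"import": "tnb", "export": "tnb-export"},
}

# Routes the cluster sends into the GRT (the two VNET subnets).
CLUSTER_ROUTES = {"10.10.100.0/24", "10.10.200.0/24"}

# ASSUMED routes announced by each tenant router into its VRF.
TENANT_ROUTES = {
    "tna": {"172.16.10.0/24", "192.168.0.0/30"},
    "tnb": {"172.16.20.0/24", "192.168.0.4/30"},
}


def permitted(prefix, list_name):
    """Exact-match lookup, as for a prefix-list entry without le/ge."""
    net = ipaddress.ip_network(prefix)
    return any(net == ipaddress.ip_network(p) for p in PREFIX_LISTS[list_name])


def simulate(grt, vrfs):
    """Return (GRT, per-VRF tables) after export-to-GRT and import-from-GRT."""
    grt_after = set(grt)
    for vrf, routes in vrfs.items():
        grt_after |= {r for r in routes if permitted(r, VRF_MAPS[vrf]["export"])}
    vrfs_after = {
        vrf: set(routes) | {r for r in grt_after if permitted(r, VRF_MAPS[vrf]["import"])}
        for vrf, routes in vrfs.items()
    }
    return grt_after, vrfs_after


if __name__ == "__main__":
    grt, vrfs = simulate(CLUSTER_ROUTES, TENANT_ROUTES)
    print("GRT:", sorted(grt))
    for name in sorted(vrfs):
        print(f"VRF {name}:", sorted(vrfs[name]))
```

Under this model each VRF picks up only its own VNET subnet from the GRT, so nothing from tenant B lands in tenant A or the other way round. The assumed 192.168.0.4/30 route stays inside VRF tnb because the tnb-export list names 192.168.4.0/30 instead.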
Next we set the correct IP addresses.
interface Ethernet1/1
  no switchport
  mac-address 0000.1111.1111
  ip address 192.168.1.10/24
  no shutdown
interface Ethernet1/2
interface Ethernet1/3
interface Ethernet1/4
interface Ethernet1/5
interface Ethernet1/6
  no switchport
  mac-address 0000.2222.1234
  vrf member tna
  ip address 192.168.0.1/30
  no shutdown
interface Ethernet1/7
  no switchport
  mac-address 0000.3333.1234
  vrf member tnb
  ip address 192.168.0.5/30
  no shutdown
Finally we set the mgmt VRF address (if we want to access the switch via SSH) and the loopback address.
Then we configure eBGP back to the cluster and the per-VRF BGP sessions to the tenants.
interface mgmt0
  vrf member management
  ip address 192.168.100.10/24
interface loopback0
  ip address 10.10.10.10/32
icam monitor scale
line console
  exec-timeout 0
line vty
  exec-timeout 0
boot nxos bootflash:/nxos.9.3.5.bin sup-1
router bgp 65100
  router-id 10.10.10.10
  address-family ipv4 unicast
    network 10.10.200.0/24
  neighbor 192.168.1.43
    remote-as 65000
    address-family ipv4 unicast
      soft-reconfiguration inbound
    address-family l2vpn evpn
      send-community
      send-community extended
  vrf tna
    address-family ipv4 unicast
      network 10.10.100.0/24
    neighbor 192.168.0.2
      remote-as 64000
      address-family ipv4 unicast
        send-community
        send-community extended
  vrf tnb
    address-family ipv4 unicast
      network 10.10.200.0/24
    neighbor 192.168.0.6
      remote-as 63000
      address-family ipv4 unicast
        send-community
        send-community extended
!
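A quick way to catch addressing slips in the configuration above is to check that every BGP neighbor sits on a subnet directly connected in the same VRF, which is what a plain eBGP session (no multihop) expects. The following is a minimal sketch with the addresses copied from the interface and BGP sections; the `INTERFACES`, `NEIGHBORS` and `unreachable_neighbors` names are hypothetical.

```python
import ipaddress

# Routed interfaces per VRF, copied from the interface configuration above.
INTERFACES = {
    "default": ["192.168.1.10/24"],
    "tna": ["192.168.0.1/30"],
    "tnb": ["192.168.0.5/30"],
}

# BGP neighbors per VRF, copied from the "router bgp 65100" section.
NEIGHBORS = {
    "default": ["192.168.1.43"],
    "tna": ["192.168.0.2"],
    "tnb": ["192.168.0.6"],
}


def unreachable_neighbors(interfaces, neighbors):
    """Return (vrf, neighbor) pairs that are not on a connected subnet."""
    bad = []
    for vrf, peers in neighbors.items():
        nets = [ipaddress.ip_interface(i).network for i in interfaces.get(vrf, [])]
        for peer in peers:
            if not any(ipaddress.ip_address(peer) in net for net in nets):
                bad.append((vrf, peer))
    return bad


if __name__ == "__main__":
    print(unreachable_neighbors(INTERFACES, NEIGHBORS) or "all neighbors are directly connected")
```

An empty result only says the addressing is consistent; it says nothing about whether the sessions actually come up.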
Remarks:
1) Two tenants are created: tenant A (TNA) and tenant B (TNB).
2) In Figure 1, TNA and TNB are routers capable of peering via BGP.
3) As stated, two VRFs are created: tna and tnb.
4) By default the global routing table (GRT) holds only the routes for the cluster, and each tenant VRF (TNA, TNB) holds only the routes for its own tenant. We need to leak routes so that each side is aware of the other. From the security perspective this is not an issue; if you are the provider you need access to all routes.
a) However, you need to have prefix lists and route maps to prevent accidentally leaking routes between tenants.
b) Pay attention under the VRF definitions and how we apply route maps to inject routes to the GRT, TNA and TNB.
5) The peering to PX3 brings the routes from and to the cluster.
Testing
Cisco 9000v
We should now see that the cluster and the 9k have exchanged routes.
On the 9k, the standard IOS-style commands show you the status of BGP peering.
nxos-9k# sh ip bgp summary
BGP summary information for VRF default, address family IPv4 Unicast
BGP router identifier 10.10.10.10, local AS number 65100
BGP table version is 9, IPv4 Unicast config peers 1, capable peers 1
5 network entries and 6 paths using 980 bytes of memory
BGP attribute entries [5/860], BGP AS path entries [3/18]
BGP community entries [0/0], BGP clusterlist entries [0/0]

Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
192.168.1.43    4 65000      41      38        9    0    0 00:01:41 2
This of course shows peering to PX3 because we are doing IPv4 unicast to it.
Next we can see routes that the 9k is getting from the cluster and what we will be sending to it.
nxos-9k# sh ip bgp
BGP routing table information for VRF default, address family IPv4 Unicast
BGP table version is 9, Local Router ID is 10.10.10.10
Status: s-suppressed, x-deleted, S-stale, d-dampened, h-history, *-valid, >-best
Path type: i-internal, e-external, c-confed, l-local, a-aggregate, r-redist, I-injected
Origin codes: i - IGP, e - EGP, ? - incomplete, | - multipath, & - backup, 2 - best2

   Network            Next Hop            Metric     LocPrf     Weight Path
*>e10.10.100.0/24     192.168.1.43             0                     0 65000 ?
  l10.10.200.0/24     0.0.0.0                           100      32768 i
*>e                   192.168.1.43             0                     0 65000 ?
*>e172.16.10.0/24     192.168.0.2              0                     0 64000 i
*>e172.16.20.0/24     192.168.0.6              0                     0 63000 i
*>e192.168.0.0/30     192.168.0.2              0                     0 64000 ?
As you can see, the 10.10.x.0/24 routes are the containers. We also see the 172.16.x.0 routes we injected from the tenants; they should appear within the cluster.
Finally if we do:
nxos-9k# sh bgp l2vpn evpn
nxos-9k# sh bgp l2vpn evpn summary
BGP summary information for VRF default, address family L2VPN EVPN
BGP router identifier 10.10.10.10, local AS number 65100
BGP table version is 2, L2VPN EVPN config peers 1, capable peers 0
0 network entries and 0 paths using 0 bytes of memory
BGP attribute entries [0/0], BGP AS path entries [0/0]
BGP community entries [0/0], BGP clusterlist entries [0/0]

Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
192.168.1.43    4 65000     220     216        0    0    0 00:10:37 0 (No Cap)
You may think this is an error that needs to be resolved. It is not: we are trying to do EVPN to a device that is not configured for it (the cluster). As you may have noticed, we left the commands for the address family L2VPN EVPN in the configuration. They are not needed and should be taken out.
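If you want to check the two summaries from a script rather than by eye, the neighbor line is easy to split into columns. The sketch below is a hedged example, not a Cisco tool: the sample text is the neighbor table from the two outputs above, `parse_neighbors` and `describe` are hypothetical helpers, and reading "(No Cap)" as "this address family was not negotiated with the peer" is an interpretation consistent with the explanation above.

```python
IPV4_SUMMARY = """\
Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
192.168.1.43    4 65000      41      38        9    0    0 00:01:41 2
"""

EVPN_SUMMARY = """\
Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
192.168.1.43    4 65000     220     216        0    0    0 00:10:37 0 (No Cap)
"""


def parse_neighbors(summary_text):
    """Return one dict per neighbor row found below the 'Neighbor V AS' header."""
    rows = []
    in_table = False
    for line in summary_text.splitlines():
        fields = line.split()
        if fields[:2] == ["Neighbor", "V"]:
            in_table = True
            continue
        if in_table and len(fields) >= 10:
            rows.append({
                "neighbor": fields[0],
                "remote_as": int(fields[2]),
                "up_down": fields[8],
                # Everything after Up/Down belongs to State/PfxRcd.
                "state": " ".join(fields[9:]),
            })
    return rows


def describe(row):
    """Turn the State/PfxRcd column into a short human-readable status."""
    state = row["state"]
    if "(No Cap)" in state:
        return "address family not negotiated"
    if state.isdigit():
        return f"established, {state} prefixes received"
    return f"not established ({state})"


if __name__ == "__main__":
    for label, text in (("ipv4 unicast", IPV4_SUMMARY), ("l2vpn evpn", EVPN_SUMMARY)):
        for row in parse_neighbors(text):
            print(label, row["neighbor"], describe(row))
```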
PVE Cluster
At a command prompt issue “vtysh”. This is a command that you should know by heart if you use FRR (which PVE uses).
We should see routes.
pve-3# sh bgp ipv4 unicast
BGP table version is 5, local router ID is 192.168.1.43, vrf id 0
Default local pref 100, local AS 65000
Status codes:  s suppressed, d damped, h history, u unsorted, * valid, > best, = multipath,
               i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes:  i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found

    Network          Next Hop            Metric LocPrf Weight Path
 *> 10.10.100.0/24   0.0.0.0(pve-3)@10<       0         32768 ?
 *> 10.10.200.0/24   0.0.0.0(pve-3)@15<       0         32768 ?
 *> 172.16.10.0/24   192.168.1.10                           0 65100 64000 i
 *> 172.16.20.0/24   192.168.1.10                           0 65100 63000 i
 *> 192.168.0.0/30   192.168.1.10                           0 65100 64000 ?

Displayed 5 routes and 5 total paths
As you can see, we have the 10.10.x.x routes from our containers and, in addition, the 172.16.x.x routes from the external tenants.
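The Path column tells you which tenant each external route came from: the rightmost AS number is the originator, and 65100 in front of it is the 9k. As a small illustration (a sketch only; the `ROUTES` list is typed in from the table above and `origin_as`, `classify` and `TENANT_BY_AS` are hypothetical names), the mapping can be spelled out like this:

```python
# (prefix, Path column) pairs taken from the FRR table above.
ROUTES = [
    ("10.10.100.0/24", "?"),
    ("10.10.200.0/24", "?"),
    ("172.16.10.0/24", "65100 64000 i"),
    ("172.16.20.0/24", "65100 63000 i"),
    ("192.168.0.0/30", "65100 64000 ?"),
]

# Tenant AS numbers, from the "remote-as" lines under "vrf tna" / "vrf tnb".
TENANT_BY_AS = {64000: "TNA", 63000: "TNB"}


def origin_as(path):
    """Return the rightmost AS number in the path, or None for a local route."""
    asns = [int(token) for token in path.split() if token.isdigit()]
    return asns[-1] if asns else None


def classify(routes):
    """Map each prefix to 'local', a tenant name, or 'AS<n>' if unknown."""
    result = {}
    for prefix, path in routes:
        asn = origin_as(path)
        if asn is None:
            result[prefix] = "local"
        else:
            result[prefix] = TENANT_BY_AS.get(asn, f"AS{asn}")
    return result


if __name__ == "__main__":
    for prefix, owner in classify(ROUTES).items():
        print(f"{prefix:18} {owner}")
```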
Tenants
You should be able to ping from the tenants. Here is tenant B:
tnb#sh ip int bri
Interface                  IP-Address      OK? Method Status                Protocol
FastEthernet0/0            192.168.0.6     YES NVRAM  up                    up
FastEthernet0/1            172.16.20.1     YES NVRAM  up                    up
Loopback0                  172.16.1.2      YES NVRAM  up                    up
tnb#ping 10.10.200.20 so
tnb#ping 10.10.200.20 source 172.16.20.1
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 10.10.200.20, timeout is 2 seconds:
Packet sent with a source address of 172.16.20.1
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 8/12/20 ms
tnb#ping 10.10.200.10 source 172.16.20.1
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 10.10.200.10, timeout is 2 seconds:
Packet sent with a source address of 172.16.20.1
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 8/10/12 ms
tnb#
Notice we need to source the ping from the right interface. And for those from Missouri, the Show-Me State, here is proof that we're not pulling a fast one.
From the container of tenant A:
test100:~# ifconfig
eth0      Link encap:Ethernet  HWaddr BC:24:11:79:34:C1
          inet addr:10.10.100.10  Bcast:0.0.0.0  Mask:255.255.255.0
          inet6 addr: fe80::be24:11ff:fe79:34c1/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1450  Metric:1
          RX packets:8 errors:0 dropped:0 overruns:0 frame:0
          TX packets:9 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:656 (656.0 B)  TX bytes:726 (726.0 B)

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

test100:~# ping 172.16.10.1
PING 172.16.10.1 (172.16.10.1): 56 data bytes
64 bytes from 172.16.10.1: seq=0 ttl=252 time=8.330 ms
64 bytes from 172.16.10.1: seq=1 ttl=252 time=9.146 ms
^C
--- 172.16.10.1 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 8.330/8.738/9.146 ms
We can ping router TNA!
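A likely reason the ping from tnb has to be sourced from 172.16.20.1: without the source option the router uses the address of its outgoing interface (192.168.0.6), and the BGP table on the cluster holds no prefix covering that address, so the replies would have no route back. The sketch below illustrates that check. It assumes the cluster has no default or static route that would cover the address, and `CLUSTER_KNOWN` and `has_return_route` are hypothetical names.

```python
import ipaddress

# External prefixes the cluster learned from the 9k, per the FRR table
# in the PVE Cluster section.
CLUSTER_KNOWN = ["172.16.10.0/24", "172.16.20.0/24", "192.168.0.0/30"]


def has_return_route(source_ip, known=CLUSTER_KNOWN):
    """True if some learned prefix covers the source address of the ping."""
    addr = ipaddress.ip_address(source_ip)
    return any(addr in ipaddress.ip_network(prefix) for prefix in known)


if __name__ == "__main__":
    for source in ("172.16.20.1", "192.168.0.6"):
        verdict = "return route exists" if has_return_route(source) else "no return route"
        print(f"{source}: {verdict}")
```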
Conclusion
This lab shows how to add an eBGP peer so you can connect external clients to the PVE cluster. Hope you enjoyed it.
Take care,
Ciao.
