Networking useful stuff

Thursday, April 2, 2026

FRR Troubleshooting: Fixing Kernel Route Redistribution Issues on Boot

If you are running FRRouting (FRR), you might have encountered a frustrating quirk: after a system reboot, kernel routes (like a static default route) fail to redistribute into RIP or OSPF. Curiously, as soon as you manually restart the FRR service, everything works perfectly.

Here is a breakdown of why this happens and how to fix it for good.

The Problem

During a cold boot, FRR starts its daemons (Zebra, RIPd, etc.), but routes defined at the OS level aren’t being advertised to neighbors. This breaks connectivity after an automated reboot and forces manual intervention (systemctl restart frr), which defeats the purpose of an automated routing stack.

The Root Cause: The Startup "Race Condition"

The culprit is typically a race condition between your network manager (like Netplan or systemd-networkd) and the FRR service.

Interface Renaming: Modern tools like Netplan often rename interfaces during boot (e.g., from eth0 to enp1s0). If FRR initializes while an interface is in transition, Zebra may ignore routes associated with a "missing" or "changing" interface.

The "Up" State: If Zebra scans the kernel routing table before the physical or virtual interface is fully marked as "UP" and operational, the routing protocols (like RIP) will deem the route invalid for redistribution.

Dependency Timing: By default, the FRR service might attempt to load before the network is "fully online," leading to a failed initial synchronization between the kernel and Zebra.

Step-by-Step Solution

The fix involves modifying the systemd unit file to ensure FRR waits until the network stack is completely ready and interface names are finalized.

1. Edit the Service Configuration

Instead of editing the file in /lib/systemd/system/ directly, use a systemd "override" to keep your changes safe during updates:

sudo systemctl edit frr.service

2. Adjust Network Dependencies

In the editor that opens, insert the following lines:

[Unit]
After=network-online.target
Wants=network-online.target

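This tells systemd that FRR should only start after the system reports that the network is fully up and functional (network-online.target). Keep in mind that network-online.target only delays startup when a wait-online service is enabled, for example systemd-networkd-wait-online.service or NetworkManager-wait-online.service, depending on which network manager you use.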
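As a quick sanity check, the drop-in written by systemctl edit (normally /etc/systemd/system/frr.service.d/override.conf) can be inspected to confirm that both directives landed in the [Unit] section. The Python sketch below works on the text of such a file; the function name missing_directives and the sample text are made up for this illustration, and it is not a full systemd unit parser.

```python
# Directives the override is expected to carry in its [Unit] section.
REQUIRED = {"After": "network-online.target", "Wants": "network-online.target"}

def missing_directives(unit_text):
    """Return the required [Unit] directives that the drop-in text does not set."""
    found = {key: set() for key in REQUIRED}
    section = None
    for raw in unit_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1]
            continue
        if section == "Unit" and "=" in line:
            key, value = line.split("=", 1)
            key = key.strip()
            if key in found:
                # One line may list several space-separated units.
                found[key].update(value.split())
    return sorted(k for k, unit in REQUIRED.items() if unit not in found[k])

override = """[Unit]
After=network-online.target
Wants=network-online.target
"""
print(missing_directives(override))  # -> []
```

An empty list means both lines are present; anything else names the directive that still needs to be added.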

3. Reload and Apply

Save the file and exit the editor. Then, reload the systemd manager to pick up the changes:

sudo systemctl daemon-reload

Verification

On your next reboot, you can verify that Zebra is correctly picking up the routes by checking your FRR logs or using the following debug commands in vtysh:

vtysh
debug zebra kernel
debug rip events

Summary

In modern Linux distributions where interface naming and IP addressing are dynamic, startup timing is everything. Forcing FRR to wait for the network to be "online" ensures that when the daemons ask the kernel for routes, the kernel actually has the final, stable answers ready to give.

Monday, December 4, 2023

How to create an IPv6 route to null/blackhole in Linux

Case:

How to create an IPv6 route to null/blackhole in Linux

Command:

ip -6 route add blackhole fd00:12:34::0/48
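If the prefix comes from a script or a form, it may be worth validating it before running the command. The sketch below uses Python's standard ipaddress module to reject prefixes with host bits set and to print the command in normalised form; the function name blackhole_command is made up for this example, and the sketch only builds the command string without running it.

```python
import ipaddress

def blackhole_command(prefix):
    """Build the ip(8) command for an IPv6 blackhole route from a validated prefix."""
    # strict=True raises ValueError when host bits are set (e.g. fd00:12:34::1/48).
    network = ipaddress.IPv6Network(prefix, strict=True)
    return f"ip -6 route add blackhole {network.with_prefixlen}"

print(blackhole_command("fd00:12:34::0/48"))
```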

I hope it is useful

Thursday, December 1, 2022

An Interesting Change Is Coming to BGP

A route leak is defined as the propagation of routing announcement(s) beyond their intended scope (RFC 7908). But why do route leaks occur? The reasons are varied and include errors (typos when entering a number), ignorance, lack of filters, social engineering, and others.

Although there are several ways to prevent route leaks and, in fact, their number has decreased over the past three years (thanks to RPKI, IRR, and other mechanisms), I will try to explain what I believe BGP configurations will look like in the future. To do so, I will talk about RFC 9234, Route Leak Prevention and Detection Using Roles in UPDATE and OPEN Messages. And the part I would like to highlight is “role detection” as, after this RFC, in the future, we will assign roles in our BGP configurations.

To understand what we want to achieve, let’s recall some typical situations for an ISP:

a new customer comes along with whom we will speak BGP,

a connection to an IXP,

the ISP buys capacity from a new upstream provider,

a new private peering agreement.

In all these cases decisions need to be made. There are multiple ways to configure BGP, including route maps, AS filters, prefix lists, communities, ACLs, and others. We may even be using more than one of these options.

This is where RFC 9234 enters the picture: the document establishes the roles in the BGP OPEN message, i.e., it establishes an agreement on the relationship on each BGP session between autonomous systems. For example, let’s say that I am a router and I speak to another router and tell them that I am a “customer”; in turn, the other router’s BGP session can say “I am your provider.” Based on this exchange, all configurations (i.e., filters) will be automatic, which should help reduce route leaks.

These capabilities are then negotiated in the BGP OPEN message.

The RFC defines five roles:

Provider – sender is a transit provider to neighbor;

Customer – sender is a transit customer of neighbor;

RS – sender is a Route Server, usually at an Internet exchange point (IX);

RS-client – sender is client of an RS;

Peer – sender and neighbor are peers.

How are these roles configured?

If, for example, on a relationship in a BGP session between ASes, the local AS role is performed by the Provider, the remote AS role must be performed by the Customer and vice versa. Likewise, if the local AS role is performed by a Route Server (RS), the remote AS role must be performed by an RS-Client and vice versa. Local and remote AS roles can also be performed by two Peers (see table).
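The pairing rule described above can be written down as a small lookup. This is only a Python sketch of the rule as stated in this post, not code from any BGP implementation; the lowercase role strings and the function name roles_match are choices made for the illustration.

```python
# Allowed (local role, remote role) pairs, as described in the paragraph above.
ALLOWED_PAIRS = {
    ("provider", "customer"),
    ("customer", "provider"),
    ("rs", "rs-client"),
    ("rs-client", "rs"),
    ("peer", "peer"),
}

def roles_match(local_role, remote_role):
    """Return True when the two announced roles form a valid pair."""
    return (local_role.lower(), remote_role.lower()) in ALLOWED_PAIRS

print(roles_match("Provider", "Customer"))  # True
print(roles_match("Provider", "Peer"))      # False
```

Any pair outside the set is a role mismatch, which is the situation the RFC is designed to catch when the session is opened.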

An example is included below.

BGP Capabilities

BGP capabilities are what the router advertises to its BGP peers to tell them which features it can support and, if possible, it will try to negotiate that capability with its neighbors. A BGP router determines the capabilities supported by its peer by examining the list of capabilities in the OPEN message. This is similar to a meeting between two multilingual individuals, one of whom speaks English, Spanish and Portuguese, while the other speaks French, Chinese and English. The common language between them is English, so they will communicate in that language. But they will not do so in French, as only one of them speaks this language. This is basically what has allowed BGP to grow so much with only a minor impact on our networks, as it incorporates these backward compatibility notions that work seamlessly.
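In code, the language analogy is simply a set intersection. The Python sketch below shows the idea; the capability names in the second example are informal labels chosen for illustration, not identifiers taken from any router software.

```python
def negotiated(local_caps, remote_caps):
    """Return the capabilities both speakers announced, i.e. the ones that get used."""
    return set(local_caps) & set(remote_caps)

# The two multilingual individuals from the analogy above.
print(negotiated({"English", "Spanish", "Portuguese"},
                 {"French", "Chinese", "English"}))  # {'English'}

# Two BGP speakers; the names below are illustrative labels only.
router_a = {"multiprotocol", "route-refresh", "4-octet-as", "role"}
router_b = {"multiprotocol", "route-refresh", "4-octet-as"}
print(sorted(negotiated(router_a, router_b)))
```

Router B never announced the role capability, so it is simply not used on that session, and everything else keeps working as before.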

This RFC has added a new capability.

Does this code work? Absolutely. Here’s an example in FRR:

Strict Mode

Capabilities are generally negotiated between the BGP speakers, and only the capabilities supported by both speakers are used. If the Strict Mode option is configured, both routers must support the BGP Role capability: a router in strict mode rejects the session when its neighbor does not announce a role.

In conclusion, I believe the approach described in RFC 9234 will be the future of BGP configuration worldwide, greatly reducing route leaks and improper Internet advertisements. It will make BGP configuration easier and serve as a complement to RPKI and IRR for reducing route leaks and allowing for cleaner routing tables.

The full presentation offered during LACNIC 38 LACNOG 2022 is available at:

https://news.lacnic.net/en/events/an-interesting-change-is-coming-to-bgp

Thursday, September 3, 2020

Solved: Closing connection because of an I/O error in FRR - at least in Ubuntu

If you are getting this message in FRR:

Closing connection because of an I/O error

The solution is straightforward. You have to compile FRR with this flag:

--enable-systemd

So, it would be something like:

./configure \
    --prefix=/usr \
    --includedir=\${prefix}/include \
    --enable-exampledir=\${prefix}/share/doc/frr/examples \
    --bindir=\${prefix}/bin \
    --sbindir=\${prefix}/lib/frr \
    --libdir=\${prefix}/lib/frr \
    --libexecdir=\${prefix}/lib/frr \
    --localstatedir=/var/run/frr \
    --sysconfdir=/etc/frr \
    --with-moduledir=\${prefix}/lib/frr/modules \
    --with-libyang-pluginsdir=\${prefix}/lib/frr/libyang_plugins \
    --enable-configfile-mask=0640 \
    --enable-logfile-mask=0640 \
    --enable-snmp=agentx \
    --enable-multipath=64 \
    --enable-user=frr \
    --enable-group=frr \
    --enable-vty-group=frrvty \
    --with-pkg-git-version \
    --enable-systemd \
    --with-pkg-extra-version=-MyOwnFRRVersion

You can follow the official build instructions below and add the flag described above:

http://docs.frrouting.org/projects/dev-guide/en/latest/building-frr-for-ubuntu2004.html

Monday, July 8, 2019

BGP: To filter or not to filter by prefix size. That is the question

Introduction

To write this post, the R+D team and the WARP team worked together, after analyzing several security incidents related to BGP, network reachability, network hijacks and network visibility.

As you probably know, in the BGP world there are dozens of ways to filter prefixes. The goal of this post is to offer some recommendations for a more stable network, keeping your prefixes as visible as possible and, of course, reducing calls to the NOC.

Scenario

Many ISPs around the world cannot (or do not wish to) receive the full routing table (DFZ), which at the time of writing is about 750,000 prefixes (IPv4).

This may be due to some of the following causes, among others:
The router does not have enough RAM to learn all the prefixes (note that there may also be several BGP sessions at the same time)
The network admin wants to save CPU cycles on their devices
The upstream provider is not announcing the full routing table
The network admin wishes to keep the network simple and easy to manage

Either way, the end result is that the router is not learning the full routing table.

Problem

Not learning the full routing table can cause a number of partial failures that end up as connectivity problems, user complaints, trouble reaching some web sites and more.

Why?

Imagine the following case:

1. I have a router (owned by EXAMPLE) on the Internet that is learning ONLY a partial DFZ
2. The router mentioned in “1” only learns “big networks”, /20 and larger. That is, the router learns /20, /19, /18, etc. (we are, of course, talking about IPv4)
3. Given the configuration described in “2”, the router will not learn prefixes such as /21, /22, /23 or /24
4. So far so good. However, somewhere on the Internet, someone hijacks a /21 belonging to the company ACME (a deliberate hijack, a misconfiguration, whatever)
5. ACME decides to announce more specific prefixes, so it takes its /21 and announces eight /24 prefixes to the DFZ
6. Because of the filters configured by EXAMPLE, it will never learn the legitimate /24 prefixes advertised by ACME
7. EXAMPLE will keep learning the hijacked /21, which will obviously cause connectivity problems towards the legitimate owner of the network

Topology

The following diagram represents the hypothesis presented in the previous point in a graphic manner to facilitate its understanding.

Recommendation

The following recommendations apply only to networks that CANNOT LEARN the full DFZ. Once again, they are the result of studying many cases of connectivity problems and network hijacks:

Do not filter out more specific networks. That is, it is better to learn the more specific networks such as /24, /23 and /22 (in the IPv4 world)

Filter using AS_PATH (for example, accept paths 2, 3 or 4 ASes deep)

Please create your ROAs and use RPKI

Some examples (Cisco-like)

1. Learning only /22, /23 or /24:

router bgp 65002
neighbor 10.0.0.1 remote-as 65001
neighbor 10.0.0.1 route-map FILTRO-IN in
!
ip prefix-list SMALLNETWORKS seq 5 permit 0.0.0.0/0 ge 22 le 24
!
route-map FILTRO-IN permit 10
match ip address prefix-list SMALLNETWORKS
!
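To make the ge/le semantics of the prefix list concrete, here is a small Python sketch of the same test: a route is accepted when it falls inside the base prefix and its mask length is between ge and le. The function name prefix_list_permits is made up for this example and the routes use documentation prefixes; it mimics only the single permit entry above, not a full IOS prefix list.

```python
import ipaddress

def prefix_list_permits(route, base="0.0.0.0/0", ge=22, le=24):
    """Mimic 'permit <base> ge <ge> le <le>' for one IPv4 route."""
    net = ipaddress.IPv4Network(route)
    base_net = ipaddress.IPv4Network(base)
    # The route must sit inside the base prefix and have ge <= length <= le.
    return net.subnet_of(base_net) and ge <= net.prefixlen <= le

for route in ["203.0.113.0/24", "198.51.100.0/22", "10.0.0.0/21", "10.0.0.0/8"]:
    print(route, prefix_list_permits(route))
```

With the values from SMALLNETWORKS, the /24 and the /22 are accepted while the /21 and the /8 are not.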

2. Learning only routes that are at most two ASes away from us:

router bgp 65001
neighbor 10.0.0.2 remote-as 65002
neighbor 10.0.0.2 route-map ASFILTER-IN in

!
ip as-path access-list 5 permit ^[0-9]+_$
ip as-path access-list 5 permit ^[0-9]+ [0-9]+_$
!
route-map ASFILTER-IN permit 10
match as-path 5
!
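As I read them, the two as-path entries accept paths made of exactly one or exactly two AS numbers. The Python sketch below expresses that intent as a length check rather than re-implementing the IOS regular expressions; the function name permits_as_path and the sample paths are made up for this illustration.

```python
def permits_as_path(as_path, max_len=2):
    """Accept an AS path (space-separated ASNs) that is 1 to max_len ASNs long."""
    asns = as_path.split()
    return 0 < len(asns) <= max_len and all(asn.isdigit() for asn in asns)

for path in ["65002", "65002 65010", "65002 65010 65020", "65002 65002 65002"]:
    print(repr(path), permits_as_path(path))
```

Note that a prepended path counts every repetition, so a neighbor that prepends its own AS three times is filtered just like a path that really crosses three networks.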

More information

Prefix hijack demonstration 1 / 2 (in Spanish):
https://www.youtube.com/watch?v=X5RNSs8y8Ao&t=39s

Prefix hijack demonstration 2/2 (in Spanish):
https://www.youtube.com/watch?v=m51WtuEZOKI

BGP Prefix-Based Outbound Route Filtering
http://www.cisco.com/c/en/us/td/docs/ios/12_2s/feature/guide/fsbgporf.html

Sample BGP configuration with two different service providers (multihoming) (in Spanish)
http://www.cisco.com/cisco/web/support/LA/7/75/75930_27.html

Resource Certification (RPKI) (in Spanish)
http://www.lacnic.net/web/lacnic/certificacion-de-recursos-rpki

General Information on Resource Certification (RPKI) (in Spanish)
http://www.lacnic.net/web/lacnic/informacion-general-rpki

Authors:

Dario Gomez (https://twitter.com/daro_ua)

Alejandro Acosta (https://twitter.com/ITandNetworking)

Monday, February 29, 2016

Read a BGP live stream from CAIDA

Objective
Read a BGP live stream from CAIDA and inject it into a BGP session

What do we need
bgpreader, from the bgpstream core package provided by CAIDA
bgp_simple.pl, available on GitHub

Overview
We will read the BGP live stream feed using bgpreader and redirect its standard output to a named pipe (created with mkfifo), which a Perl script called bgpsimple reads. The same script establishes the BGP session with a BGP speaker and announces the prefixes received in the stream.

LAB Topology
The configuration was already tested in Cisco & Quagga
The BGP Speaker (Cisco/Quagga) has the IPv4 address 192.168.1.1
The BGP Simple Linux box has the IP 192.168.1.2

How does it work?
bgpreader can write its output in the -m format used by libbgpdump (by RIPE NCC), which is the very same format bgpsimple expects as input. That's why myroutes is a pipe file (created with mkfifo).
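For readers who have not used named pipes before, the Python sketch below reproduces the plumbing in miniature: one side writes lines into a FIFO while the other side reads them as they arrive, which is the role myroutes plays between bgpreader and bgp_simple.pl. The function name relay_through_fifo is made up, the sample line is only illustrative of a pipe-separated record and was not captured from a real feed, and the sketch assumes a Unix-like system where os.mkfifo is available.

```python
import os
import tempfile
import threading

def relay_through_fifo(lines):
    """Write lines into a named pipe from one thread and read them back from another."""
    with tempfile.TemporaryDirectory() as tmp:
        fifo = os.path.join(tmp, "myroutes")
        os.mkfifo(fifo)  # same effect as running: mkfifo myroutes

        def producer():
            # Opening for writing blocks until a reader opens the pipe.
            with open(fifo, "w") as writer:
                for line in lines:
                    writer.write(line + "\n")

        thread = threading.Thread(target=producer)
        thread.start()
        with open(fifo) as reader:  # plays the part of bgp_simple.pl
            received = [line.rstrip("\n") for line in reader]
        thread.join()
        return received

sample = ["BGP4MP|1453912260|A|192.0.2.1|65001|198.51.100.0/24|65001 65002|IGP|192.0.2.1|0|0||NAG||"]
print(relay_through_fifo(sample))
```

Nothing is stored on disk: the reader sees each line only once the writer produces it, which is why the two programs have to run at the same time.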

Steps:

INSTALL BGP READER - UBUNTU 15.04

First install some general packages:
apt-get install apt-file libsqlite3-dev libsqlite3 libmysqlclient-dev libmysqlclient
apt-get install libcurl-dev libcurl autoconf git libssl-dev
apt-get install build-essential zlib1g-dev libbz2-dev
apt-get install libtool git
apt-get install zlib1g-dev

Also install wandio (wandio-1.0.3):
git clone https://github.com/alistairking/wandio
cd wandio
./bootstrap.sh
./configure && make && make install

To test wandio:
wandiocat http://www.apple.com/library/test/success.html

Download bgp reader tarball from:
https://bgpstream.caida.org/download

#ldconfig (before testing)

#mkfifo myroutes

to test bgpreader:
./bgpreader -p caida-bmp -w 1453912260 -m
(wait some seconds and then you will see something)

# git clone https://github.com/xdel/bgpsimple

Finally run everything
In two separate terminals (or any other way you would like to do it):

./bgpreader -p caida-bmp -w 1453912260 -m > /usr/src/bgpsimple/myroutes
./bgp_simple.pl -myas 65000 -myip 192.168.1.2 -peerip 192.168.1.1 -peeras 65000 -p myroutes

One more time, what happens behind the scenes?
bgpreader reads an online feed from a project called caida-bmp, starting at timestamp 1453912260 (Jan 27 2016, 16:31 UTC), in "-m" format, that is, libbgpdump format (see references). Its standard output is sent to the file /usr/src/bgpsimple/myroutes, which is a "pipe file". At the same time, bgp_simple.pl creates an iBGP session against peer 192.168.1.1/AS65000 (a BGP speaker such as Quagga or Cisco). bgp_simple.pl reads the myroutes file and sends what it sees there through the iBGP session.
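The value passed to -w is a Unix timestamp, so picking a different starting point means converting a date to seconds since the epoch and back. A minimal Python sketch of the conversion follows; the helper names window_start and to_timestamp are made up for this example.

```python
from datetime import datetime, timezone

def window_start(timestamp):
    """Convert a Unix timestamp (as passed to bgpreader -w) to a UTC datetime."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc)

def to_timestamp(year, month, day, hour=0, minute=0):
    """Convert a UTC date and time to the integer timestamp expected by -w."""
    return int(datetime(year, month, day, hour, minute, tzinfo=timezone.utc).timestamp())

print(window_start(1453912260).isoformat())  # 2016-01-27T16:31:00+00:00
print(to_timestamp(2016, 1, 27, 16, 31))     # 1453912260
```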

Important information
- The BGP session won't be established until there is something in the file myroutes
- eBGP multihop sessions are allowed
- You have to wait a short time (a few seconds) until bgpreader actually starts to see something and bgp_simple.pl starts announcing to the BGP peer

References / More information:
-Part of the work was based on:
http://evilrouters.net/2009/08/21/getting-bgp-routes-into-dynamips-with-video/

- Caida BGP Stream:
https://bgpstream.caida.org/

- bgpreader info:
https://bgpstream.caida.org/docs/tools/bgpreader

- RIPE NCC libbgpdump:
http://www.ris.ripe.net/source/bgpdump/

- Introduction of "Named Pipes" (pipe files in Linux):
http://www.linuxjournal.com/article/2156

Tuesday, February 17, 2015

Solution to quagga vtysh "Exiting: failed to connect to any daemons."

Description:
When you run vtysh from the Linux shell to connect to the Quagga daemons (such as bgpd, ospfd, etc.), it returns the following error: "Exiting: failed to connect to any daemons."

Just like this:

alejandro@miserver:~$ vtysh -d bgpd
Exiting: failed to connect to any daemons.

alejandro@miserver:~$ vtysh
Exiting: failed to connect to any daemons.

Solution:
The solution is to add the user who runs vtysh to the quagga group. To do this, edit the /etc/group file.
After editing, the entry in /etc/group should look something like:

quagga:x:1003:alejandro

You can list multiple users, separated by commas with no spaces:

quagga:x:1003:alejandro,john

This is necessary because vtysh tries to connect to the daemons using UNIX domain sockets and not all users (for security reasons) have access to these sockets.
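To double-check an entry before logging out and back in, the group line can be parsed the same way the system reads it: four colon-separated fields, with the members as a comma-separated list. The Python sketch below does that on a line of text; the function names are made up for this example, and it only looks at supplementary membership in that one line, not at a user's primary group.

```python
def group_members(group_line):
    """Split one /etc/group line into (name, gid, members)."""
    name, _password, gid, members = group_line.strip().split(":")
    return name, int(gid), [member for member in members.split(",") if member]

def can_use_vtysh(user, group_line, vty_group="quagga"):
    """Return True when the user is listed as a member of the vty group."""
    name, _gid, members = group_members(group_line)
    return name == vty_group and user in members

print(can_use_vtysh("john", "quagga:x:1003:alejandro,john"))   # True
print(can_use_vtysh("john", "quagga:x:1003:alejandro, john"))  # False
```

The second call shows why a stray space after the comma matters: in this sketch the entry is read as " john", which does not match the user name.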

Another solution:
Another option is available at compile time, where you can specify the Linux/UNIX group that owns the sockets mentioned above. Example:

./configure --enable-vty-group=group

Good luck, I hope this helped.

Thursday, February 21, 2013

Advertising IPv6 Routes Between IPv4 BGP Peers (Cisco)

Situation:
  I want to advertise IPv6 networks / prefixes over an IPv4 eBGP session
History:
  Although not common, this case may occur in some situations.
For example, at the moment I have a Cisco router that supports IPv6 routing but does not support IPv6 BGP neighbors
Error (just in case):
(You are probably seeing the messages below)     :)

*Mar 1 02:05:00.663: BGP: 1.1.1.1 Advertised Nexthop ::FFFF:1.1.1.1: Non-local or Nexthop and peer Not on same interface
*Mar 1 02:05:00.663: BGP(1): 1.1.1.1 rcv UPDATE w/ attr: nexthop ::FFFF:1.1.1.1, origin i, metric 0, originator 0.0.0.0, path 1, community , extended community
*Mar 1 02:05:00.667: BGP(1): 1.1.1.1 rcv UPDATE about 2001:db8::/32 -- DENIED due to:
*Mar 1 02:05:00.667: BGP(0): Revise route installing 1 of 1 route for 10.0.0.0/24 -> 1.1.1.1 to main IP table
*Mar 1 02:05:00.771: BGP(0): 1.1.1.1 computing updates, afi 0, neighbor version 0, table version 25, starting at 0.0.0.0

Solution:
  Fortunately, BGP supports carrying routing information for different protocols (e.g. IPv6). It is therefore possible to exchange IPv6 prefixes over IPv4 eBGP sessions.
Configuration:
  In this basic scenario, with R1 <--> R2 connected back-to-back, the configuration is as follows (the prefix announced by R1 is learned by R2).

R1:
!
interface Ethernet1/0
ip address 1.1.1.1 255.255.255.252
full-duplex
ipv6 address 2001:db8::1/64
ipv6 enable
!
router bgp 1
no synchronization
bgp router-id 1.1.1.1
bgp log-neighbor-changes
neighbor 1.1.1.2 remote-as 2
neighbor 1.1.1.2 ebgp-multihop 2
no auto-summary
!
address-family ipv6
neighbor 1.1.1.2 activate
network 2001:db8::/32
no synchronization
redistribute static
exit-address-family
!
ipv6 route 2001:db8::/32 Null0

R2:
!
interface Ethernet1/0
ip address 1.1.1.2 255.255.255.252
full-duplex
ipv6 address 2001:db8::2/64
ipv6 enable
!
router bgp 2
no synchronization
bgp router-id 1.1.1.2
bgp log-neighbor-changes
neighbor 1.1.1.1 remote-as 1
neighbor 1.1.1.1 ebgp-multihop 2
no auto-summary
!
address-family ipv6
neighbor 1.1.1.1 activate
neighbor 1.1.1.1 route-map IPv6-NextHop in
exit-address-family
!
route-map IPv6-NextHop permit 10
set ipv6 next-hop 2001:db8::1
!
"The trick":
  * The session must be eBGP multihop; if not, R2 will not learn the prefix (the same error as seen above). I admit I do not fully understand why this happens, but after reading some documents it looks like the router complains that the next-hop address and the peering address are in different subnets (which makes sense: one is IPv6 and the other IPv4!).
  * On R2 (which receives the prefix) there must be an inbound route-map forcing the next hop to R1's IPv6 address
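The next hop shown in the logs, ::FFFF:1.1.1.1, is the IPv4-mapped IPv6 form of the IPv4 peering address 1.1.1.1, and it does not belong to the 2001:db8::/64 link between the routers. That is consistent with the complaint in the error message, although it is my reading of the logs and not something taken from Cisco documentation. The Python sketch below just makes the two facts visible; the function name describe_next_hop is made up for this example.

```python
import ipaddress

def describe_next_hop(next_hop, link_prefix):
    """Report whether an IPv6 next hop is IPv4-mapped and whether it is on the given link."""
    address = ipaddress.IPv6Address(next_hop)
    return {
        "ipv4_mapped": address.ipv4_mapped,  # None unless the address is ::ffff:a.b.c.d
        "on_link": address in ipaddress.IPv6Network(link_prefix),
    }

# Next hop as received from R1 over the IPv4 session.
print(describe_next_hop("::FFFF:1.1.1.1", "2001:db8::/64"))
# Next hop after the route-map on R2 rewrites it.
print(describe_next_hop("2001:db8::1", "2001:db8::/64"))
```

The first address maps back to 1.1.1.1 and is off-link; the second is a regular IPv6 address on the shared /64, which is what the route-map IPv6-NextHop sets.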
After applying ebgp-multihop (everything works):
*Mar 1 02:01:42.539: BGP(1): 1.1.1.1 rcvd UPDATE w/ attr: nexthop ::FFFF:1.1.1.1, origin i, metric 0, path 1
*Mar 1 02:01:42.539: BGP(1): 1.1.1.1 rcvd 2800:26::/32
*Mar 1 02:01:42.543: BGP(0): Check route installing 1 of 1 route for 10.0.0.0/24 -> 1.1.1.1 to main IP table
*Mar 1 02:01:42.543: BGP(1): Check for installing route 2001:db8::/32 -> 2001:db8::1 (::) to main IPv6 table

More information:
- https://supportforums.cisco.com/docs/DOC-21110
- http://ieoc.com/forums/p/15154/130174.aspx

I hope it's useful!