Friday, March 8, 2024

BGP Stream: An Analysis of One Year of BGP Incidents

04/03/2024


By Alejandro Acosta, R&D Coordinator at LACNIC

LACNIC presents the first webpage designed to show incidents and an analysis of Border Gateway Protocol (BGP) measurement data in Latin America and the Caribbean.

MAIN INCIDENTS. In addition to a summary of the information, the page shows three main types of events: possible network hijacks, BGP outages, and route leaks.

Possible hijacks refers to the illegitimate takeover of groups of IP addresses by corrupting Internet routing tables. This typically occurs when an Autonomous System announces a prefix that it does not originate.

Outages refers to the loss of visibility of network prefixes by a majority group of sensors.

Route leaks, as the name suggests, refers to the —potentially— unintentional announcement of a network prefix via BGP. For example, in a private peering traffic exchange, when one of the participants announces the peer’s prefix to the Internet. This case is the most difficult for algorithms to detect, so some of these incidents are not identified.

How is the data obtained?

This initiative uses Cisco BGP Stream, an automated process that selects the largest and most important incidents, providing information on the nature of the event and the ASNs involved.

The information is openly published, as LACNIC believes that it is important for engineers, network administrators, and organizations to gain insights into the most common incidents in the region and raise awareness about the situation.

This allows quickly investigating events, the rapid development of complex prototypes and tools, as well as large-scale monitoring applications (e.g., detecting connectivity outages, attacks, or BGP hijacks).

Using a system developed by LACNIC’s R&D department, raw data is collected, plotted, identified, cleaned, stored in a database, and later used to produce statistics and graphs. This occurs automatically every 24 hours.

RESULTS. During the study period —February 2023 to February 2024— we found the results shown in the charts below, which compare BGP events worldwide vs BGP events in our region.

A comparison between the global chart and the chart specific to the LAC region shows a similar pattern in the order of the most common incidents, with outages being the most frequent type of incident, followed by possible hijacks, and finally prefix leaks. It should also be noted that outages represent a higher percentage of the total number of incidents in our region than at the global level.

An analysis of the results table showing worldwide BGP events vs BGP events in our region reveals the following:

TOP 5 countries in our region with the highest number of BGP outages

Outages 
CCEvents
BR781
AR99
HT24
MX22
CL17

TOP 5 countries in our region with the highest number of possible Hijacks

Expected CCDetected CCEvents
BRBR67
BRnone35
PYBR24
BRUS22
BRCN9

TOP 3 countries in our region with the highest number of route leaks

Origin CCLeaker CCEvents
VEVE7
MXMX5
CLPA2

Impact

In this first year of operation, LACNIC has observed a reduction in BGP incidents. Several reasons for this have been identified, including a) the deployment and adoption of Resource Certification (RPKI), b) LACNIC’s Internet Routing Registry (IRR), and the adoption of RFC 9234 (Route Leak Prevention and Detection Using Roles in UPDATE and OPEN Messages).

The adoption of these tools is being driven by better operator practices and ISOC’s promotion of MANRS.

Conclusions

Possible hijacks, outages, and route leaks are the most common types of BGP incidents. During the initial year of data collection, a decrease in the number of cases was observed. However, it is expected that they will not disappear entirely in the near future. Implementing robust redundancy and resiliency measures in networks is crucial, as is the early detection and prevention of possible hijacks to ensure the integrity and reliability of Internet routes.

At LACNIC, our goal is to raise awareness and encourage ISPs and organizations to be prepared to handle these incidents efficiently when they occur.

References

https://stats.labs.lacnic.net/BGP/bgpstream-lac-region.html

https://stats.labs.lacnic.net/BGP/bgpstream.html

https://bgpstream.crosswork.cisco.com/