Discussion:
[nvo3] WGLC comments on draft-ietf-bfd-vxlan
Anoop Ghanwani
2018-11-08 09:58:12 UTC
Here are my comments.

Thanks,
Anoop

==

Philosophical

Since VXLAN is not an IETF standard, should we be defining a standard for
running BFD on it? Should we define BFD over Geneve instead which is the
official WG selection? Is that going to be a separate document?

Technical

Section 1:

This part needs to be rewritten:
The individual racks may be part of a different Layer 3 network, or they
could be in a single Layer 2 network. The VXLAN segments/overlays are
overlaid on top of Layer 3 network. A VM can communicate with another VM
only if they are on the same VXLAN segment.
It's hard to parse and, given IRB, the last sentence above is wrong.

Section 3:
Most deployments will have VMs with only L2 capabilities that
may not support L3.
Are you suggesting most deployments have VMs with no IP
addresses/configuration?
Having a hierarchical OAM model helps localize faults though it requires
additional consideration.
What are the additional considerations?

Would be useful to add a reference to RFC 8293 in case the reader would
like to know more about service nodes.

Section 4
Separate BFD sessions can be established between the VTEPs (IP1 and IP2)
for monitoring each of the VXLAN tunnels (VNI 100 and 200).
IMO, the document should mention that this could lead to scaling issues
given that VTEPs can support well in excess of 4K VNIs. Additionally, we
should mention that with IRB, a given VNI may not even exist on the
destination VTEP. Finally, what is the benefit of doing this? There may
be certain corner cases where it's useful (vs a single BFD session between
the VTEPs for all VNIs) but it would be good to explain what those are.

Sections 5.1 and 6.1

In 5.1 we have
The inner MAC frame carrying the BFD payload has the
following format:
.... Source IP: IP address of the originating VTEP. Destination IP: IP
address of the terminating VTEP.
In 6.1 we have
Since multiple BFD sessions may be running between two
VTEPs, there needs to be a mechanism for demultiplexing received BFD
packets to the proper session. The procedure for demultiplexing
packets with Your Discriminator equal to 0 is different from [RFC5880
<https://tools.ietf.org/html/rfc5880>]. For such packets, the BFD
session MUST be identified using the inner headers, i.e., the source
IP and the destination IP present in the IP header carried by the
payload of the VXLAN encapsulated packet.
How does this work if the source IP and dest IP are the same as specified
in 5.1?
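
(For reference, a minimal sketch, mine and not from the draft, of how the
encapsulation in Sections 4 and 5.1 stacks up. Port numbers are per RFC
7348 and RFC 5881; all other values are illustrative. It shows why the
inner IP pair alone cannot tell two sessions apart.)

# Illustrative layering of a BFD control packet carried over VXLAN.
bfd_over_vxlan = {
    "outer_ip":  {"src": "IP1", "dst": "IP2"},    # VTEP addresses
    "outer_udp": {"src": 54321, "dst": 4789},     # 4789 = VXLAN (RFC 7348)
    "vxlan":     {"vni": 100},
    "inner_eth": {"src": "MAC1", "dst": "MAC2"},
    "inner_ip":  {"src": "IP1", "dst": "IP2"},    # same VTEP addresses (5.1)
    "inner_udp": {"src": 49153, "dst": 3784},     # 3784 = BFD (RFC 5881)
    "bfd":       {"my_discriminator": 17, "your_discriminator": 0},
}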

Editorial

- Terminology section should be renamed to acronyms.
- Document would benefit from a thorough editorial scrub, but maybe that
will happen once it gets to the RFC editor.

Section 1
"Virtual eXtensible Local Area Network" (VXLAN) [RFC7348
<https://tools.ietf.org/html/rfc7348>]. provides an encapsulation scheme
that allows virtual machines (VMs) to communicate in a data center network.
This is not accurate. VXLAN allows you to implement an overlay to decouple
the address space of the attached hosts from that of the network.

Section 7

VTEP's -> VTEPs
Greg Mirsky
2018-11-09 01:50:27 UTC
Hi Anoop,
thank you for your thorough review and the comments. I'm traveling over the
weekend and will respond in detail later next week.

Regards,
Greg
Post by Anoop Ghanwani
Here are my comments.
Thanks,
Anoop
==
Philosophical
Since VXLAN is not an IETF standard, should we be defining a standard for
running BFD on it? Should we define BFD over Geneve instead which is the
official WG selection? Is that going to be a separate document?
Technical
The individual racks may be part of a different Layer 3 network, or they
could be in a single Layer 2 network. The VXLAN segments/overlays are
overlaid on top of Layer 3 network. A VM can communicate with another VM
only if they are on the same VXLAN segment.
It's hard to parse and, given IRB, the last sentence above is wrong.
Most deployments will have VMs with only L2 capabilities that
may not support L3.
Are you suggesting most deployments have VMs with no IP
addresses/configuration?
Having a hierarchical OAM model helps localize faults though it requires
additional consideration.
What are the additional considerations?
Would be useful to add a reference to RFC 8293 in case the reader would
like to know more about service nodes.
Section 4
Separate BFD sessions can be established between the VTEPs (IP1 and IP2)
for monitoring each of the VXLAN tunnels (VNI 100 and 200).
IMO, the document should mention that this could lead to scaling issues
given that VTEPs can support well in excess of 4K VNIs. Additionally, we
should mention that with IRB, a given VNI may not even exist on the
destination VTEP. Finally, what is the benefit of doing this? There may
be certain corner cases where it's useful (vs a single BFD session between
the VTEPs for all VNIs) but it would be good to explain what those are.
Sections 5.1 and 6.1
In 5.1 we have
The inner MAC frame carrying the BFD payload has the
... Source IP: IP address of the originating VTEP. Destination IP: IP
address of the terminating VTEP.
In 6.1 we have
Since multiple BFD sessions may be running between two
VTEPs, there needs to be a mechanism for demultiplexing received BFD
packets to the proper session. The procedure for demultiplexing
packets with Your Discriminator equal to 0 is different from [RFC5880
<https://tools.ietf.org/html/rfc5880>]. For such packets, the BFD
session MUST be identified using the inner headers, i.e., the source
IP and the destination IP present in the IP header carried by the
payload of the VXLAN encapsulated packet.
How does this work if the source IP and dest IP are the same as specified
in 5.1?
Editorial
- Terminology section should be renamed to acronyms.
- Document would benefit from a thorough editorial scrub, but maybe that
will happen once it gets to the RFC editor.
Section 1
"Virtual eXtensible Local Area Network" (VXLAN) [RFC7348
<https://tools.ietf.org/html/rfc7348>]. provides an encapsulation scheme
that allows virtual machines (VMs) to communicate in a data center network.
This is not accurate. VXLAN allows you to implement an overlay to
decouple the address space of the attached hosts from that of the network.
Section 7
VTEP's -> VTEPs
Greg Mirsky
2018-11-09 01:50:59 UTC
Hi Anoop,
thank you for your thorough review and the comments you've shared. Please
find my answers below tagged GIM>>.

Regards,
Greg
Post by Anoop Ghanwani
Here are my comments.
Thanks,
Anoop
==
Philosophical
Since VXLAN is not an IETF standard, should we be defining a standard for
running BFD on it? Should we define BFD over Geneve instead which is the
official WG selection? Is that going to be a separate document?
Technical
The individual racks may be part of a different Layer 3 network, or they
could be in a single Layer 2 network. The VXLAN segments/overlays are
overlaid on top of Layer 3 network. A VM can communicate with another VM
only if they are on the same VXLAN segment.
It's hard to parse and, given IRB, the last sentence above is wrong.
Most deployments will have VMs with only L2 capabilities that
may not support L3.
Are you suggesting most deployments have VMs with no IP
addresses/configuration?
Having a hierarchical OAM model helps localize faults though it requires
additional consideration.
What are the additional considerations?
Would be useful to add a reference to RFC 8293 in case the reader would
like to know more about service nodes.
Section 4
Separate BFD sessions can be established between the VTEPs (IP1 and IP2)
for monitoring each of the VXLAN tunnels (VNI 100 and 200).
IMO, the document should mention that this could lead to scaling issues
given that VTEPs can support well in excess of 4K VNIs. Additionally, we
should mention that with IRB, a given VNI may not even exist on the
destination VTEP. Finally, what is the benefit of doing this? There may
be certain corner cases where it's useful (vs a single BFD session between
the VTEPs for all VNIs) but it would be good to explain what those are.
Sections 5.1 and 6.1
In 5.1 we have
The inner MAC frame carrying the BFD payload has the
... Source IP: IP address of the originating VTEP. Destination IP: IP
address of the terminating VTEP.
In 6.1 we have
Since multiple BFD sessions may be running between two
VTEPs, there needs to be a mechanism for demultiplexing received BFD
packets to the proper session. The procedure for demultiplexing
packets with Your Discriminator equal to 0 is different from [RFC5880
<https://tools.ietf.org/html/rfc5880>]. For such packets, the BFD
session MUST be identified using the inner headers, i.e., the source
IP and the destination IP present in the IP header carried by the
payload of the VXLAN encapsulated packet.
How does this work if the source IP and dest IP are the same as specified
in 5.1?
Editorial
- Terminology section should be renamed to acronyms.
- Document would benefit from a thorough editorial scrub, but maybe that
will happen once it gets to the RFC editor.
Section 1
"Virtual eXtensible Local Area Network" (VXLAN) [RFC7348
<https://tools.ietf.org/html/rfc7348>]. provides an encapsulation scheme
that allows virtual machines (VMs) to communicate in a data center network.
This is not accurate. VXLAN allows you to implement an overlay to
decouple the address space of the attached hosts from that of the network.
Section 7
VTEP's -> VTEPs
Greg Mirsky
2018-11-13 19:34:30 UTC
Hi Anoop,
many thanks for the thorough review and detailed comments. Please find my
answers, this time for real, in-line tagged GIM>>.

Regards,
Greg
Post by Anoop Ghanwani
Here are my comments.
Thanks,
Anoop
==
Philosophical
Since VXLAN is not an IETF standard, should we be defining a standard for
running BFD on it? Should we define BFD over Geneve instead which is the
official WG selection? Is that going to be a separate document?
GIM>> IS-IS is not on the Standards Track either, but that has not
prevented the IETF from developing tens of Standards Track RFCs using
RFC 1142 as the normative reference, until RFC 7142 reclassified it as
Historic. A similar path was followed with IS-IS TE: RFC 3784 was
published and used until it was obsoleted by RFC 5305 four years later.
I understand that a Down Reference, i.e., using an Informational RFC as
a normative reference, is not an unusual situation.
Technical
The individual racks may be part of a different Layer 3 network, or they
could be in a single Layer 2 network. The VXLAN segments/overlays are
overlaid on top of Layer 3 network. A VM can communicate with another VM
only if they are on the same VXLAN segment.
It's hard to parse and, given IRB,
GIM>> Would the following text be acceptable:
OLD TEXT:
VXLAN is typically deployed in data centers interconnecting
virtualized hosts, which may be spread across multiple racks. The
individual racks may be part of a different Layer 3 network, or they
could be in a single Layer 2 network. The VXLAN segments/overlays
are overlaid on top of Layer 3 network.
NEW TEXT:
VXLAN is typically deployed in data centers interconnecting the
virtualized hosts of a tenant. VXLAN addresses the requirements of the
Layer 2 and Layer 3 data center network infrastructure in the presence
of VMs in a multi-tenant environment, discussed in Section 3 of
[RFC7348], by providing a Layer 2 overlay scheme on a Layer 3 network.

A VM can communicate with another VM only if they are on the same
VXLAN segment.
Post by Anoop Ghanwani
the last sentence above is wrong.
GIM>> Section 4 in RFC 7348 states:
Only VMs within the same VXLAN segment can communicate with each other.
Post by Anoop Ghanwani
Most deployments will have VMs with only L2 capabilities that
may not support L3.
Are you suggesting most deployments have VMs with no IP
addresses/configuration?
GIM>> Would re-word as follows:
OLD TEXT:
Most deployments will have VMs with only L2 capabilities that
may not support L3.
NEW TEXT:
Deployments may have VMs with only L2 capabilities that do not support L3.
Post by Anoop Ghanwani
Having a hierarchical OAM model helps localize faults though it requires
additional consideration.
What are the additional considerations?
GIM>> For example, coordination of BFD intervals across the OAM layers.
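
One illustrative instance of that coordination (all numbers made up): the
lower OAM layer should detect, and ideally repair, a fault before the
layer above it declares failure, e.g.:

# Illustrative BFD timer coordination across OAM layers.
detect_mult = 3
underlay_tx_interval_ms = 10    # server layer, e.g., the IP underlay path
overlay_tx_interval_ms = 300    # client layer, e.g., the VXLAN tunnel

underlay_detection_ms = detect_mult * underlay_tx_interval_ms   # 30 ms
overlay_detection_ms = detect_mult * overlay_tx_interval_ms     # 900 ms

# An underlay fault should be detected (and possibly repaired) well
# before the overlay session expires, so the overlay does not flap.
assert underlay_detection_ms < overlay_detection_ms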
Post by Anoop Ghanwani
Would be useful to add a reference to RFC 8293 in case the reader would
like to know more about service nodes.
GIM>> I have to admit that I don't see how RFC 8293, "A Framework for
Multicast in Network Virtualization over Layer 3", is related to this
document. Please point me to the relevant text of the document.
Post by Anoop Ghanwani
Section 4
Separate BFD sessions can be established between the VTEPs (IP1 and IP2)
for monitoring each of the VXLAN tunnels (VNI 100 and 200).
IMO, the document should mention that this could lead to scaling issues
given that VTEPs can support well in excess of 4K VNIs. Additionally, we
should mention that with IRB, a given VNI may not even exist on the
destination VTEP. Finally, what is the benefit of doing this? There may
be certain corner cases where it's useful (vs a single BFD session between
the VTEPs for all VNIs) but it would be good to explain what those are.
GIM>> Will add text in the Security Considerations section that VTEPs
should have a limit on the number of BFD sessions.
Post by Anoop Ghanwani
Sections 5.1 and 6.1
In 5.1 we have
The inner MAC frame carrying the BFD payload has the
... Source IP: IP address of the originating VTEP. Destination IP: IP
address of the terminating VTEP.
In 6.1 we have
Since multiple BFD sessions may be running between two
VTEPs, there needs to be a mechanism for demultiplexing received BFD
packets to the proper session. The procedure for demultiplexing
packets with Your Discriminator equal to 0 is different from [RFC5880
<https://tools.ietf.org/html/rfc5880>]. For such packets, the BFD
session MUST be identified using the inner headers, i.e., the source
IP and the destination IP present in the IP header carried by the
payload of the VXLAN encapsulated packet.
How does this work if the source IP and dest IP are the same as specified
in 5.1?
GIM>> You're right, the destination and source IP addresses are likely
the same in this case. Will add that the source UDP port number, along
with the pair of IP addresses, MUST be used to demux received BFD control
packets. Would you agree that will be sufficient?
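
A rough sketch of that demultiplexing rule (the names and structure are
illustrative only, not draft text):

# Illustrative demultiplexing of a received VXLAN-encapsulated BFD packet.
sessions_by_disc = {}   # Your Discriminator -> session state
sessions_by_key = {}    # (inner src IP, inner dst IP, inner UDP src port)

def demux_bfd(inner_src_ip, inner_dst_ip, inner_udp_sport, your_disc):
    if your_disc != 0:
        # Normal RFC 5880 rule: Your Discriminator selects the session.
        return sessions_by_disc.get(your_disc)
    # Your Discriminator == 0 (initial packets): fall back to the inner
    # IP pair plus the inner UDP source port, since the inner addresses
    # (VTEP to VTEP) may be identical for every session between the pair.
    return sessions_by_key.get((inner_src_ip, inner_dst_ip,
                                inner_udp_sport))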
Post by Anoop Ghanwani
Editorial
- Terminology section should be renamed to acronyms.
GIM>> Accepted
Post by Anoop Ghanwani
- Document would benefit from a thorough editorial scrub, but maybe that
will happen once it gets to the RFC editor.
GIM>> Will certainly get helpful comments from the ADs and the RFC Editor.
Post by Anoop Ghanwani
Section 1
"Virtual eXtensible Local Area Network" (VXLAN) [RFC7348
<https://tools.ietf.org/html/rfc7348>]. provides an encapsulation scheme
that allows virtual machines (VMs) to communicate in a data center network.
This is not accurate. VXLAN allows you to implement an overlay to
decouple the address space of the attached hosts from that of the network.
GIM>> Thank you for the suggested text. Will change as follows:
OLD TEXT:
"Virtual eXtensible Local Area Network" (VXLAN) [RFC7348]. provides
an encapsulation scheme that allows virtual machines (VMs) to
communicate in a data center network.
NEW TEXT:
"Virtual eXtensible Local Area Network" (VXLAN) [RFC7348]. provides
an encapsulation scheme that allows building an overlay network by
decoupling the address space of the attached virtual hosts from that of
the network.
Post by Anoop Ghanwani
Section 7
VTEP's -> VTEPs
GIM>> Yes, thank you.
Anoop Ghanwani
2018-11-13 20:30:27 UTC
Hi Greg,

Please see inline prefixed with [ag].

Thanks,
Anoop
Post by Greg Mirsky
Hi Anoop,
many thanks for the thorough review and detailed comments. Please find my
answers, this time for real, in-line tagged GIM>>.
Regards,
Greg
Post by Anoop Ghanwani
Here are my comments.
Thanks,
Anoop
==
Philosophical
Since VXLAN is not an IETF standard, should we be defining a standard for
running BFD on it? Should we define BFD over Geneve instead which is the
official WG selection? Is that going to be a separate document?
GIM>> IS-IS is not on the Standards Track either, but that has not
prevented the IETF from developing tens of Standards Track RFCs using
RFC 1142 as the normative reference, until RFC 7142 reclassified it as
Historic. A similar path was followed with IS-IS TE: RFC 3784 was
published and used until it was obsoleted by RFC 5305 four years later.
I understand that a Down Reference, i.e., using an Informational RFC as
a normative reference, is not an unusual situation.
[ag] OK. I'm not an expert on this part so unless someone else that is an
expert (chairs, AD?) can comment on it, I'll just let it go.
Post by Greg Mirsky
Post by Anoop Ghanwani
Technical
The individual racks may be part of a different Layer 3 network, or they
could be in a single Layer 2 network. The VXLAN segments/overlays are
overlaid on top of Layer 3 network. A VM can communicate with another VM
only if they are on the same VXLAN segment.
It's hard to parse and, given IRB,
VXLAN is typically deployed in data centers interconnecting
virtualized hosts, which may be spread across multiple racks. The
individual racks may be part of a different Layer 3 network, or they
could be in a single Layer 2 network. The VXLAN segments/overlays
are overlaid on top of Layer 3 network.
VXLAN is typically deployed in data centers interconnecting the
virtualized hosts of a tenant. VXLAN addresses the requirements of the
Layer 2 and Layer 3 data center network infrastructure in the presence
of VMs in a multi-tenant environment, discussed in Section 3 of
[RFC7348], by providing a Layer 2 overlay scheme on a Layer 3 network.
[ag] This is a lot better.
Post by Greg Mirsky
A VM can communicate with another VM only if they are on the same
VXLAN segment.
Post by Anoop Ghanwani
the last sentence above is wrong.
Only VMs within the same VXLAN segment can communicate with each other.
[ag] VMs on different segments can communicate using routing/IRB, so even
RFC 7348 is wrong. Perhaps the text should be modified to say -- "In the
absence of a router in the overlay, a VM can communicate...".
Post by Greg Mirsky
Post by Anoop Ghanwani
Most deployments will have VMs with only L2 capabilities that
may not support L3.
Are you suggesting most deployments have VMs with no IP
addresses/configuration?
Most deployments will have VMs with only L2 capabilities that
may not support L3.
Deployments may have VMs with only L2 capabilities that do not support L3.
[ag] I still don't understand this. What does it mean for a VM to not
support L3? No IP address, no default GW, something else?
Post by Greg Mirsky
Post by Anoop Ghanwani
Having a hierarchical OAM model helps localize faults though it requires
additional consideration.
What are the additional considerations?
GIM>> For example, coordination of BFD intervals across the OAM layers.
[ag] Can we mention them in the draft?
Post by Greg Mirsky
Post by Anoop Ghanwani
Would be useful to add a reference to RFC 8293 in case the reader would
like to know more about service nodes.
GIM>> I have to admit that I don't see how RFC 8293, "A Framework for
Multicast in Network Virtualization over Layer 3", is related to this
document. Please point me to the relevant text of the document.
[ag] The RFC discusses the use of service nodes, which are mentioned here.
Post by Greg Mirsky
Post by Anoop Ghanwani
Section 4
Separate BFD sessions can be established between the VTEPs (IP1 and IP2)
for monitoring each of the VXLAN tunnels (VNI 100 and 200).
IMO, the document should mention that this could lead to scaling issues
given that VTEPs can support well in excess of 4K VNIs. Additionally, we
should mention that with IRB, a given VNI may not even exist on the
destination VTEP. Finally, what is the benefit of doing this? There may
be certain corner cases where it's useful (vs a single BFD session between
the VTEPs for all VNIs) but it would be good to explain what those are.
GIM>> Will add text in the Security Considerations section that VTEPs
should have a limit on the number of BFD sessions.
[ag] I was hoping for two things:
- A mention about the scalability issue right where per-VNI BFD is
discussed. (Not sure why that is a security issue/consideration.)
- What is the benefit of running BFD per VNI between a pair of VTEPs?
Post by Greg Mirsky
Post by Anoop Ghanwani
Sections 5.1 and 6.1
In 5.1 we have
The inner MAC frame carrying the BFD payload has the
... Source IP: IP address of the originating VTEP. Destination IP: IP
address of the terminating VTEP.
In 6.1 we have
Since multiple BFD sessions may be running between two
VTEPs, there needs to be a mechanism for demultiplexing received BFD
packets to the proper session. The procedure for demultiplexing
packets with Your Discriminator equal to 0 is different from [RFC5880
<https://tools.ietf.org/html/rfc5880>]. For such packets, the BFD
session MUST be identified using the inner headers, i.e., the source
IP and the destination IP present in the IP header carried by the
payload of the VXLAN encapsulated packet.
How does this work if the source IP and dest IP are the same as specified
in 5.1?
GIM>> You're right, the destination and source IP addresses are likely
the same in this case. Will add that the source UDP port number, along
with the pair of IP addresses, MUST be used to demux received BFD control
packets. Would you agree that will be sufficient?
[ag] Yes, I think that should work.
Post by Greg Mirsky
Post by Anoop Ghanwani
Editorial
[ag] Agree with all comments on this section.
Post by Greg Mirsky
Post by Anoop Ghanwani
- Terminology section should be renamed to acronyms.
GIM>> Accepted
Post by Anoop Ghanwani
- Document would benefit from a thorough editorial scrub, but maybe that
will happen once it gets to the RFC editor.
GIM>> Will certainly get helpful comments from the ADs and the RFC Editor.
Post by Anoop Ghanwani
Section 1
"Virtual eXtensible Local Area Network" (VXLAN) [RFC7348
<https://tools.ietf.org/html/rfc7348>]. provides an encapsulation scheme
that allows virtual machines (VMs) to communicate in a data center network.
This is not accurate. VXLAN allows you to implement an overlay to
decouple the address space of the attached hosts from that of the network.
"Virtual eXtensible Local Area Network" (VXLAN) [RFC7348]. provides
an encapsulation scheme that allows virtual machines (VMs) to
communicate in a data center network.
"Virtual eXtensible Local Area Network" (VXLAN) [RFC7348]. provides
an encapsulation scheme that allows building an overlay network by
decoupling the address space of the attached virtual hosts from that of
the network.
Post by Anoop Ghanwani
Section 7
VTEP's -> VTEPs
GIM>> Yes, thank you.
Greg Mirsky
2018-11-14 17:45:19 UTC
Hi Anoop,
thank you for the expedient response. I am glad that some of my responses
have addressed your concerns. Please find followup notes in-line tagged
GIM2>>. I've attached the diff to highlight the updates applied in the
working version. Let me know if these are acceptable changes.

Regards,
Greg
Post by Anoop Ghanwani
Hi Greg,
Please see inline prefixed with [ag].
Thanks,
Anoop
Post by Greg Mirsky
Hi Anoop,
many thanks for the thorough review and detailed comments. Please find my
answers, this time for real, in-line tagged GIM>>.
Regards,
Greg
Post by Anoop Ghanwani
Here are my comments.
Thanks,
Anoop
==
Philosophical
Since VXLAN is not an IETF standard, should we be defining a standard
for running BFD on it? Should we define BFD over Geneve instead which is
the official WG selection? Is that going to be a separate document?
GIM>> IS-IS is not on the Standards Track either, but that has not
prevented the IETF from developing tens of Standards Track RFCs using
RFC 1142 as the normative reference, until RFC 7142 reclassified it as
Historic. A similar path was followed with IS-IS TE: RFC 3784 was
published and used until it was obsoleted by RFC 5305 four years later.
I understand that a Down Reference, i.e., using an Informational RFC as
a normative reference, is not an unusual situation.
[ag] OK. I'm not an expert on this part so unless someone else that is an
expert (chairs, AD?) can comment on it, I'll just let it go.
Post by Greg Mirsky
Post by Anoop Ghanwani
Technical
The individual racks may be part of a different Layer 3 network, or they
could be in a single Layer 2 network. The VXLAN segments/overlays are
overlaid on top of Layer 3 network. A VM can communicate with another VM
only if they are on the same VXLAN segment.
It's hard to parse and, given IRB,
VXLAN is typically deployed in data centers interconnecting
virtualized hosts, which may be spread across multiple racks. The
individual racks may be part of a different Layer 3 network, or they
could be in a single Layer 2 network. The VXLAN segments/overlays
are overlaid on top of Layer 3 network.
VXLAN is typically deployed in data centers interconnecting the
virtualized hosts of a tenant. VXLAN addresses the requirements of the
Layer 2 and Layer 3 data center network infrastructure in the presence
of VMs in a multi-tenant environment, discussed in Section 3 of
[RFC7348], by providing a Layer 2 overlay scheme on a Layer 3 network.
[ag] This is a lot better.
Post by Greg Mirsky
A VM can communicate with another VM only if they are on the same
VXLAN segment.
Post by Anoop Ghanwani
the last sentence above is wrong.
Only VMs within the same VXLAN segment can communicate with each other.
[ag] VMs on different segments can communicate using routing/IRB, so even
RFC 7348 is wrong. Perhaps the text should be modified to say -- "In the
absence of a router in the overlay, a VM can communicate...".
Post by Greg Mirsky
Post by Anoop Ghanwani
Most deployments will have VMs with only L2 capabilities that
may not support L3.
Are you suggesting most deployments have VMs with no IP
addresses/configuration?
Most deployments will have VMs with only L2 capabilities that
may not support L3.
Deployments may have VMs with only L2 capabilities that do not support L3.
[ag] I still don't understand this. What does it mean for a VM to not
support L3? No IP address, no default GW, something else?
GIM2>> A VM communicates with its VTEP, which, in turn, originates the
VXLAN tunnel. A VM is not required to have an IP address, as it is the
VTEP's IP address that the VM's MAC is associated with. As for the
gateway, RFC 7348 discusses a VXLAN gateway as the device that forwards
traffic between VXLAN and non-VXLAN domains. Considering all that, would
the following change be acceptable:
OLD TEXT:
Most deployments will have VMs with only L2 capabilities that
may not support L3.
NEW TEXT:
Most deployments will have VMs with only L2 capabilities and no IP
address assigned.
Post by Anoop Ghanwani
Post by Greg Mirsky
Post by Anoop Ghanwani
Having a hierarchical OAM model helps localize faults though it requires
additional consideration.
What are the additional considerations?
GIM>> For example, coordination of BFD intervals across the OAM layers.
[ag] Can we mention them in the draft?
Post by Greg Mirsky
Post by Anoop Ghanwani
Would be useful to add a reference to RFC 8293 in case the reader would
like to know more about service nodes.
GIM>> I have to admit that I don't see how RFC 8293, "A Framework for
Multicast in Network Virtualization over Layer 3", is related to this
document. Please point me to the relevant text of the document.
[ag] The RFC discusses the use of service nodes, which are mentioned here.
Post by Greg Mirsky
Post by Anoop Ghanwani
Section 4
Separate BFD sessions can be established between the VTEPs (IP1 and IP2)
for monitoring each of the VXLAN tunnels (VNI 100 and 200).
IMO, the document should mention that this could lead to scaling issues
given that VTEPs can support well in excess of 4K VNIs. Additionally, we
should mention that with IRB, a given VNI may not even exist on the
destination VTEP. Finally, what is the benefit of doing this? There may
be certain corner cases where it's useful (vs a single BFD session between
the VTEPs for all VNIs) but it would be good to explain what those are.
GIM>> Will add text in the Security Considerations section that VTEPs
should have a limit on the number of BFD sessions.
- A mention about the scalability issue right where per-VNI BFD is
discussed. (Not sure why that is a security issue/consideration.)
GIM2>> I've added the following sentence in both places:
The implementation SHOULD have a reasonable upper bound on the number of
BFD sessions that can be created between the same pair of VTEPs.
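
Purely as an illustration of that bound (the knob and the names are made
up, not from the draft):

# Illustrative enforcement of a configurable per-VTEP-pair session limit.
MAX_BFD_SESSIONS_PER_VTEP_PAIR = 16  # hypothetical operator-set knob

def create_bfd_session(session_table, vtep_pair):
    existing = session_table.setdefault(vtep_pair, [])
    if len(existing) >= MAX_BFD_SESSIONS_PER_VTEP_PAIR:
        raise RuntimeError("BFD session limit reached for %s" % (vtep_pair,))
    session = {"state": "Down"}  # stand-in for real BFD session state
    existing.append(session)
    return session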
Post by Anoop Ghanwani
- What is the benefit of running BFD per VNI between a pair of VTEPs?
GIM2>> An alternative would be to run CFM between VMs, if there's a need
to monitor the liveliness of a particular VM. Again, this is optional.
Post by Anoop Ghanwani
Post by Greg Mirsky
Post by Anoop Ghanwani
Sections 5.1 and 6.1
In 5.1 we have
The inner MAC frame carrying the BFD payload has the
... Source IP: IP address of the originating VTEP. Destination IP: IP
address of the terminating VTEP.
In 6.1 we have
Since multiple BFD sessions may be running between two
VTEPs, there needs to be a mechanism for demultiplexing received BFD
packets to the proper session. The procedure for demultiplexing
packets with Your Discriminator equal to 0 is different from [RFC5880
<https://tools.ietf.org/html/rfc5880>]. For such packets, the BFD
session MUST be identified using the inner headers, i.e., the source
IP and the destination IP present in the IP header carried by the
payload of the VXLAN encapsulated packet.
How does this work if the source IP and dest IP are the same as
specified in 5.1?
GIM>> You're right, the destination and source IP addresses are likely
the same in this case. Will add that the source UDP port number, along
with the pair of IP addresses, MUST be used to demux received BFD control
packets. Would you agree that will be sufficient?
[ag] Yes, I think that should work.
Post by Greg Mirsky
Post by Anoop Ghanwani
Editorial
[ag] Agree with all comments on this section.
Post by Greg Mirsky
Post by Anoop Ghanwani
- Terminology section should be renamed to acronyms.
GIM>> Accepted
Post by Anoop Ghanwani
- Document would benefit from a thorough editorial scrub, but maybe that
will happen once it gets to the RFC editor.
GIM>> Will certainly get helpful comments from the ADs and the RFC Editor.
Post by Anoop Ghanwani
Section 1
"Virtual eXtensible Local Area Network" (VXLAN) [RFC7348
<https://tools.ietf.org/html/rfc7348>]. provides an encapsulation
scheme that allows virtual machines (VMs) to communicate in a data center
network.
This is not accurate. VXLAN allows you to implement an overlay to
decouple the address space of the attached hosts from that of the network.
"Virtual eXtensible Local Area Network" (VXLAN) [RFC7348]. provides
an encapsulation scheme that allows virtual machines (VMs) to
communicate in a data center network.
"Virtual eXtensible Local Area Network" (VXLAN) [RFC7348]. provides
an encapsulation scheme that allows building an overlay network by
decoupling the address space of the attached virtual hosts from that of
the network.
Post by Anoop Ghanwani
Section 7
VTEP's -> VTEPs
GIM>> Yes, thank you.
Anoop Ghanwani
2018-11-15 07:00:37 UTC
Hi Greg,

Please see inline prefixed with [ag2].

Thanks,
Anoop
Post by Greg Mirsky
Hi Anoop,
thank you for the expedient response. I am glad that some of my responses
have addressed your concerns. Please find followup notes in-line tagged
GIM2>>. I've attached the diff to highlight the updates applied in the
working version. Let me know if these are acceptable changes.
Regards,
Greg
Post by Anoop Ghanwani
Hi Greg,
Please see inline prefixed with [ag].
Thanks,
Anoop
Post by Greg Mirsky
Hi Anoop,
many thanks for the thorough review and detailed comments. Please find
my answers, this time for real, in-line tagged GIM>>.
Regards,
Greg
Post by Anoop Ghanwani
Here are my comments.
Thanks,
Anoop
==
Philosophical
Since VXLAN is not an IETF standard, should we be defining a standard
for running BFD on it? Should we define BFD over Geneve instead which is
the official WG selection? Is that going to be a separate document?
GIM>> IS-IS is not on the Standards Track either, but that has not
prevented the IETF from developing tens of Standards Track RFCs using
RFC 1142 as the normative reference, until RFC 7142 reclassified it as
Historic. A similar path was followed with IS-IS TE: RFC 3784 was
published and used until it was obsoleted by RFC 5305 four years later.
I understand that a Down Reference, i.e., using an Informational RFC as
a normative reference, is not an unusual situation.
[ag] OK. I'm not an expert on this part so unless someone else that is
an expert (chairs, AD?) can comment on it, I'll just let it go.
Post by Greg Mirsky
Post by Anoop Ghanwani
Technical
The individual racks may be part of a different Layer 3 network, or
they could be in a single Layer 2 network. The VXLAN segments/overlays are
overlaid on top of Layer 3 network. A VM can communicate with another VM
only if they are on the same VXLAN segment.
It's hard to parse and, given IRB,
VXLAN is typically deployed in data centers interconnecting
virtualized hosts, which may be spread across multiple racks. The
individual racks may be part of a different Layer 3 network, or they
could be in a single Layer 2 network. The VXLAN segments/overlays
are overlaid on top of Layer 3 network.
VXLAN is typically deployed in data centers interconnecting the
virtualized hosts of a tenant. VXLAN addresses the requirements of the
Layer 2 and Layer 3 data center network infrastructure in the presence
of VMs in a multi-tenant environment, discussed in Section 3 of
[RFC7348], by providing a Layer 2 overlay scheme on a Layer 3 network.
[ag] This is a lot better.
Post by Greg Mirsky
A VM can communicate with another VM only if they are on the same
VXLAN segment.
Post by Anoop Ghanwani
the last sentence above is wrong.
Only VMs within the same VXLAN segment can communicate with each other.
[ag] VMs on different segments can communicate using routing/IRB, so even
RFC 7348 is wrong. Perhaps the text should be modified so say -- "In the
absence of a router in the overlay, a VM can communicate...".
Post by Greg Mirsky
Post by Anoop Ghanwani
Most deployments will have VMs with only L2 capabilities that
may not support L3.
Are you suggesting most deployments have VMs with no IP
addresses/configuration?
Most deployments will have VMs with only L2 capabilities that
may not support L3.
Deployments may have VMs with only L2 capabilities that do not support L3.
[ag] I still don't understand this. What does it mean for a VM to not
support L3? No IP address, no default GW, something else?
GIM2>> A VM communicates with its VTEP, which, in turn, originates the
VXLAN tunnel. A VM is not required to have an IP address, as it is the
VTEP's IP address that the VM's MAC is associated with. As for the
gateway, RFC 7348 discusses a VXLAN gateway as the device that forwards
traffic between VXLAN and non-VXLAN domains.
Most deployments will have VMs with only L2 capabilities that
may not support L3.
Most deployments will have VMs with only L2 capabilities and no IP
address assigned.
[ag2] Do you have a reference for this (i.e. that most deployments have VMs
without an IP address)? Normally I would think VMs would have an IP
address. It's just that they are segregated into segments and, without an
intervening router, they are restricted to communicate only within their
subnet.
Post by Greg Mirsky
Post by Anoop Ghanwani
Post by Greg Mirsky
Post by Anoop Ghanwani
Having a hierarchical OAM model helps localize faults though it
requires additional consideration.
What are the additional considerations?
GIM>> For example, coordination of BFD intervals across the OAM layers.
[ag] Can we mention them in the draft?
Post by Greg Mirsky
Post by Anoop Ghanwani
Would be useful to add a reference to RFC 8293 in case the reader would
like to know more about service nodes.
GIM>> I have to admit that I don't see how RFC 8293, "A Framework for
Multicast in Network Virtualization over Layer 3", is related to this
document. Please point me to the relevant text of the document.
[ag] The RFC discusses the use of service nodes, which are mentioned here.
Post by Greg Mirsky
Post by Anoop Ghanwani
Section 4
Separate BFD sessions can be established between the VTEPs (IP1 and
IP2) for monitoring each of the VXLAN tunnels (VNI 100 and 200).
IMO, the document should mention that this could lead to scaling issues
given that VTEPs can support well in excess of 4K VNIs. Additionally, we
should mention that with IRB, a given VNI may not even exist on the
destination VTEP. Finally, what is the benefit of doing this? There may
be certain corner cases where it's useful (vs a single BFD session between
the VTEPs for all VNIs) but it would be good to explain what those are.
GIM>> Will add text in the Security Considerations section that VTEPs
should have a limit on the number of BFD sessions.
- A mention about the scalability issue right where per-VNI BFD is
discussed. (Not sure why that is a security issue/consideration.)
The implementation SHOULD have a reasonable upper bound on the number of
BFD sessions that can be created between the same pair of VTEPs.
[ag2] What are the criteria for determining what is reasonable?
Post by Greg Mirsky
- What is the benefit of running BFD per VNI between a pair of VTEPs?
GIM2>> An alternative would be to run CFM between VMs, if there's a need
to monitor the liveliness of a particular VM. Again, this is optional.
[ag2] I'm not sure how running per-VNI BFD between the VTEPs allows one to
monitor the liveliness of VMs.
Post by Greg Mirsky
Post by Anoop Ghanwani
Post by Greg Mirsky
Post by Anoop Ghanwani
Sections 5.1 and 6.1
In 5.1 we have
The inner MAC frame carrying the BFD payload has the
... Source IP: IP address of the originating VTEP. Destination IP: IP
address of the terminating VTEP.
In 6.1 we have
Since multiple BFD sessions may be running between two
VTEPs, there needs to be a mechanism for demultiplexing received BFD
packets to the proper session. The procedure for demultiplexing
packets with Your Discriminator equal to 0 is different from [RFC5880
<https://tools.ietf.org/html/rfc5880>]. For such packets, the BFD
session MUST be identified using the inner headers, i.e., the source
IP and the destination IP present in the IP header carried by the
payload of the VXLAN encapsulated packet.
How does this work if the source IP and dest IP are the same as
specified in 5.1?
GIM>> You're right, the destination and source IP addresses are likely
the same in this case. Will add that the source UDP port number, along
with the pair of IP addresses, MUST be used to demux received BFD control
packets. Would you agree that will be sufficient?
[ag] Yes, I think that should work.
Post by Greg Mirsky
Post by Anoop Ghanwani
Editorial
[ag] Agree with all comments on this section.
Post by Greg Mirsky
Post by Anoop Ghanwani
- Terminology section should be renamed to acronyms.
GIM>> Accepted
Post by Anoop Ghanwani
- Document would benefit from a thorough editorial scrub, but maybe
that will happen once it gets to the RFC editor.
GIM>> Will certainly get helpful comments from the ADs and the RFC Editor.
Post by Anoop Ghanwani
Section 1
"Virtual eXtensible Local Area Network" (VXLAN) [RFC7348
<https://tools.ietf.org/html/rfc7348>]. provides an encapsulation
scheme that allows virtual machines (VMs) to communicate in a data center
network.
This is not accurate. VXLAN allows you to implement an overlay to
decouple the address space of the attached hosts from that of the network.
"Virtual eXtensible Local Area Network" (VXLAN) [RFC7348]. provides
an encapsulation scheme that allows virtual machines (VMs) to
communicate in a data center network.
"Virtual eXtensible Local Area Network" (VXLAN) [RFC7348]. provides
an encapsulation scheme that allows building an overlay network by
decoupling the address space of the attached virtual hosts from that
of the network.
Post by Anoop Ghanwani
Section 7
VTEP's -> VTEPs
GIM>> Yes, thank you.
Greg Mirsky
2018-11-17 01:28:55 UTC
Hi Anoop,
thank you for the discussion. Please find my responses tagged GIM3>>. Also,
I've attached the diff and the updated working version of the draft. I hope
we're converging.

Regards,
Greg
Post by Anoop Ghanwani
Hi Greg,
Please see inline prefixed with [ag2].
Thanks,
Anoop
Post by Greg Mirsky
Hi Anoop,
thank you for the expedient response. I am glad that some of my responses
have addressed your concerns. Please find followup notes in-line tagged
GIM2>>. I've attached the diff to highlight the updates applied in the
working version. Let me know if these are acceptable changes.
Regards,
Greg
Post by Anoop Ghanwani
Hi Greg,
Please see inline prefixed with [ag].
Thanks,
Anoop
Post by Greg Mirsky
Hi Anoop,
many thanks for the thorough review and detailed comments. Please find
my answers, this time for real, in-line tagged GIM>>.
Regards,
Greg
Post by Anoop Ghanwani
Here are my comments.
Thanks,
Anoop
==
Philosophical
Since VXLAN is not an IETF standard, should we be defining a standard
for running BFD on it? Should we define BFD over Geneve instead which is
the official WG selection? Is that going to be a separate document?
GIM>> IS-IS is not on the Standards Track either, but that has not
prevented the IETF from developing tens of Standards Track RFCs using
RFC 1142 as the normative reference, until RFC 7142 reclassified it as
Historic. A similar path was followed with IS-IS TE: RFC 3784 was
published and used until it was obsoleted by RFC 5305 four years later.
I understand that a Down Reference, i.e., using an Informational RFC as
a normative reference, is not an unusual situation.
[ag] OK. I'm not an expert on this part so unless someone else that is
an expert (chairs, AD?) can comment on it, I'll just let it go.
Post by Greg Mirsky
Post by Anoop Ghanwani
Technical
The individual racks may be part of a different Layer 3 network, or
they could be in a single Layer 2 network. The VXLAN segments/overlays are
overlaid on top of Layer 3 network. A VM can communicate with another VM
only if they are on the same VXLAN segment.
It's hard to parse and, given IRB,
VXLAN is typically deployed in data centers interconnecting
virtualized hosts, which may be spread across multiple racks. The
individual racks may be part of a different Layer 3 network, or they
could be in a single Layer 2 network. The VXLAN segments/overlays
are overlaid on top of Layer 3 network.
VXLAN is typically deployed in data centers interconnecting the
virtualized hosts of a tenant. VXLAN addresses the requirements of the
Layer 2 and Layer 3 data center network infrastructure in the presence
of VMs in a multi-tenant environment, discussed in Section 3 of
[RFC7348], by providing a Layer 2 overlay scheme on a Layer 3 network.
[ag] This is a lot better.
Post by Greg Mirsky
A VM can communicate with another VM only if they are on the same
VXLAN segment.
Post by Anoop Ghanwani
the last sentence above is wrong.
Only VMs within the same VXLAN segment can communicate with each other.
[ag] VMs on different segments can communicate using routing/IRB, so
even RFC 7348 is wrong. Perhaps the text should be modified to say -- "In
the absence of a router in the overlay, a VM can communicate...".
Post by Greg Mirsky
Post by Anoop Ghanwani
Most deployments will have VMs with only L2 capabilities that
may not support L3.
Are you suggesting most deployments have VMs with no IP
addresses/configuration?
Most deployments will have VMs with only L2 capabilities that
may not support L3.
Deployments may have VMs with only L2 capabilities that do not support L3.
[ag] I still don't understand this. What does it mean for a VM to not
support L3? No IP address, no default GW, something else?
GIM2>> A VM communicates with its VTEP, which, in turn, originates the
VXLAN tunnel. A VM is not required to have an IP address, as it is the
VTEP's IP address that the VM's MAC is associated with. As for the
gateway, RFC 7348 discusses a VXLAN gateway as the device that forwards
traffic between VXLAN and non-VXLAN domains.
Most deployments will have VMs with only L2 capabilities that
may not support L3.
Most deployments will have VMs with only L2 capabilities and no IP
address assigned.
[ag2] Do you have a reference for this (i.e. that most deployments have
VMs without an IP address)? Normally I would think VMs would have an IP
address. It's just that they are segregated into segments and, without an
intervening router, they are restricted to communicate only within their
subnet.
GIM3>> Would the following text be acceptable:

Deployments might have VMs with only L2 capabilities and no IP address
assigned or, in other cases, VMs that are assigned an IP address but are
restricted to communicating only within their subnet.
Post by Anoop Ghanwani
Post by Greg Mirsky
Post by Anoop Ghanwani
Post by Greg Mirsky
Post by Anoop Ghanwani
Having a hierarchical OAM model helps localize faults though it
requires additional consideration.
What are the additional considerations?
GIM>> For example, coordination of BFD intervals across the OAM layers.
[ag] Can we mention them in the draft?
Post by Greg Mirsky
Post by Anoop Ghanwani
Would be useful to add a reference to RFC 8293 in case the reader
would like to know more about service nodes.
GIM>> I have to admit that I don't see how RFC 8293, "A Framework for
Multicast in Network Virtualization over Layer 3", is related to this
document. Please point me to the relevant text of the document.
[ag] The RFC discusses the use of service nodes, which are mentioned here.
Post by Greg Mirsky
Post by Anoop Ghanwani
Section 4
Separate BFD sessions can be established between the VTEPs (IP1 and
IP2) for monitoring each of the VXLAN tunnels (VNI 100 and 200).
IMO, the document should mention that this could lead to scaling
issues given that VTEPs can support well in excess of 4K VNIs.
Additionally, we should mention that with IRB, a given VNI may not even
exist on the destination VTEP. Finally, what is the benefit of doing
this? There may be certain corner cases where it's useful (vs a single BFD
session between the VTEPs for all VNIs) but it would be good to explain
what those are.
GIM>> Will add text in the Security Considerations section that VTEPs
should have a limit on the number of BFD sessions.
- A mention about the scalability issue right where per-VNI BFD is
discussed. (Not sure why that is a security issue/consideration.)
The implementation SHOULD have a reasonable upper bound on the number of
BFD sessions that can be created between the same pair of VTEPs.
[ag2] What are the criteria for determining what is reasonable?
GIM3>> I usually understand that as a requirement to make it
controllable, i.e., to have a configurable limit. Thus it will be up to
a network operator to set the limit.
Post by Anoop Ghanwani
Post by Greg Mirsky
- What is the benefit of running BFD per VNI between a pair of VTEPs?
GIM2>> An alternative would be to run CFM between VMs, if there's a need
to monitor the liveliness of a particular VM. Again, this is optional.
[ag2] I'm not sure how running per-VNI BFD between the VTEPs allows one to
monitor the liveliness of VMs.
Post by Greg Mirsky
Post by Anoop Ghanwani
Post by Greg Mirsky
Post by Anoop Ghanwani
Sections 5.1 and 6.1
In 5.1 we have
The inner MAC frame carrying the BFD payload has the
... Source IP: IP address of the originating VTEP. Destination IP: IP
address of the terminating VTEP.
In 6.1 we have
Since multiple BFD sessions may be running between two
VTEPs, there needs to be a mechanism for demultiplexing received BFD
packets to the proper session. The procedure for demultiplexing
packets with Your Discriminator equal to 0 is different from [RFC5880
<https://tools.ietf.org/html/rfc5880>]. For such packets, the BFD
session MUST be identified using the inner headers, i.e., the source
IP and the destination IP present in the IP header carried by the
payload of the VXLAN encapsulated packet.
How does this work if the source IP and dest IP are the same as
specified in 5.1?
GIM>> You're right, the destination and source IP addresses are likely
the same in this case. Will add that the source UDP port number, along
with the pair of IP addresses, MUST be used to demux received BFD control
packets. Would you agree that will be sufficient?
[ag] Yes, I think that should work.
Post by Greg Mirsky
Post by Anoop Ghanwani
Editorial
[ag] Agree with all comments on this section.
Post by Greg Mirsky
Post by Anoop Ghanwani
- Terminology section should be renamed to acronyms.
GIM>> Accepted
Post by Anoop Ghanwani
- Document would benefit from a thorough editorial scrub, but maybe
that will happen once it gets to the RFC editor.
GIM>> Will certainly get helpful comments from the ADs and the RFC Editor.
Post by Anoop Ghanwani
Section 1
"Virtual eXtensible Local Area Network" (VXLAN) [RFC7348
<https://tools.ietf.org/html/rfc7348>]. provides an encapsulation
scheme that allows virtual machines (VMs) to communicate in a data center
network.
This is not accurate. VXLAN allows you to implement an overlay to
decouple the address space of the attached hosts from that of the network.
"Virtual eXtensible Local Area Network" (VXLAN) [RFC7348]. provides
an encapsulation scheme that allows virtual machines (VMs) to
communicate in a data center network.
"Virtual eXtensible Local Area Network" (VXLAN) [RFC7348]. provides
an encapsulation scheme that allows building an overlay network by
decoupling the address space of the attached virtual hosts from that
of the network.
Post by Anoop Ghanwani
Section 7
VTEP's -> VTEPs
GIM>> Yes, thank you.
Anoop Ghanwani
2018-11-20 20:14:42 UTC
Hi Greg,

Please see inline prefixed by [ag3].

Thanks,
Anoop
Post by Greg Mirsky
Hi Anoop,
thank you for the discussion. Please find my responses tagged GIM3>>.
Also, I've attached the diff and the updated working version of the
draft. I hope we're converging.
Regards,
Greg
Post by Anoop Ghanwani
Hi Greg,
Please see inline prefixed with [ag2].
Thanks,
Anoop
Post by Greg Mirsky
Hi Anoop,
thank you for the expedient response. I am glad that some of my
responses have addressed your concerns. Please find followup notes in-line
tagged GIM2>>. I've attached the diff to highlight the updates applied in
the working version. Let me know if these are acceptable changes.
Regards,
Greg
Post by Anoop Ghanwani
Hi Greg,
Please see inline prefixed with [ag].
Thanks,
Anoop
Post by Greg Mirsky
Hi Anoop,
many thanks for the thorough review and detailed comments. Please find
my answers, this time for real, in-line tagged GIM>>.
Regards,
Greg
Post by Anoop Ghanwani
Here are my comments.
Thanks,
Anoop
==
Philosophical
Since VXLAN is not an IETF standard, should we be defining a standard
for running BFD on it? Should we define BFD over Geneve instead which is
the official WG selection? Is that going to be a separate document?
GIM>> IS-IS is not on the Standards Track either, but that has not
prevented the IETF from developing tens of Standards Track RFCs using
RFC 1142 as the normative reference, until RFC 7142 reclassified it as
Historic. A similar path was followed with IS-IS TE: RFC 3784 was
published and used until it was obsoleted by RFC 5305 four years later.
I understand that a Down Reference, i.e., using an Informational RFC as
a normative reference, is not an unusual situation.
[ag] OK. I'm not an expert on this part so unless someone else that is
an expert (chairs, AD?) can comment on it, I'll just let it go.
Post by Greg Mirsky
Post by Anoop Ghanwani
Technical
The individual racks may be part of a different Layer 3 network, or
they could be in a single Layer 2 network. The VXLAN segments/overlays are
overlaid on top of Layer 3 network. A VM can communicate with another VM
only if they are on the same VXLAN segment.
It's hard to parse and, given IRB,
VXLAN is typically deployed in data centers interconnecting
virtualized hosts, which may be spread across multiple racks. The
individual racks may be part of a different Layer 3 network, or they
could be in a single Layer 2 network. The VXLAN segments/overlays
are overlaid on top of Layer 3 network.
VXLAN is typically deployed in data centers interconnecting the
virtualized hosts of a tenant. VXLAN addresses the requirements of the
Layer 2 and Layer 3 data center network infrastructure in the presence
of VMs in a multi-tenant environment, discussed in Section 3 of
[RFC7348], by providing a Layer 2 overlay scheme on a Layer 3 network.
[ag] This is a lot better.
Post by Greg Mirsky
A VM can communicate with another VM only if they are on the same
VXLAN segment.
Post by Anoop Ghanwani
the last sentence above is wrong.
Only VMs within the same VXLAN segment can communicate with each other.
[ag] VMs on different segments can communicate using routing/IRB, so
even RFC 7348 is wrong. Perhaps the text should be modified to say -- "In
the absence of a router in the overlay, a VM can communicate...".
Post by Greg Mirsky
Post by Anoop Ghanwani
Most deployments will have VMs with only L2 capabilities that
may not support L3.
Are you suggesting most deployments have VMs with no IP
addresses/configuration?
Most deployments will have VMs with only L2 capabilities that
may not support L3.
Deployments may have VMs with only L2 capabilities that do not support L3.
[ag] I still don't understand this. What does it mean for a VM to not
support L3? No IP address, no default GW, something else?
GIM2>> A VM communicates with its VTEP, which, in turn, originates the
VXLAN tunnel. A VM is not required to have an IP address, as it is the
VTEP's IP address that the VM's MAC is associated with. As for the
gateway, RFC 7348 discusses a VXLAN gateway as the device that forwards
traffic between VXLAN and non-VXLAN domains.
Most deployments will have VMs with only L2 capabilities that
may not support L3.
Most deployments will have VMs with only L2 capabilities and no IP
address assigned.
[ag2] Do you have a reference for this (i.e. that most deployments have
VMs without an IP address)? Normally I would think VMs would have an IP
address. It's just that they are segregated into segments and, without an
intervening router, they are restricted to communicate only within their
subnet.
Deployments might have VMs with only L2 capabilities and no IP address
assigned or, in other cases, VMs that are assigned an IP address but are
restricted to communicating only within their subnet.
[ag3] Yes, this is better.
Post by Greg Mirsky
Post by Anoop Ghanwani
Post by Greg Mirsky
Post by Anoop Ghanwani
Post by Greg Mirsky
Post by Anoop Ghanwani
Having a hierarchical OAM model helps localize faults though it
requires additional consideration.
What are the additional considerations?
GIM>> For example, coordination of BFD intervals across the OAM layers.
[ag] Can we mention them in the draft?
Post by Greg Mirsky
Post by Anoop Ghanwani
Would be useful to add a reference to RFC 8293 in case the reader
would like to know more about service nodes.
GIM>> I have to admit that I don't see how RFC 8293, "A Framework for
Multicast in Network Virtualization over Layer 3", is related to this
document. Please point me to the relevant text of the document.
[ag] The RFC discusses the use of service nodes, which are mentioned here.
Post by Greg Mirsky
Post by Anoop Ghanwani
Section 4
Separate BFD sessions can be established between the VTEPs (IP1 and
IP2) for monitoring each of the VXLAN tunnels (VNI 100 and 200).
IMO, the document should mention that this could lead to scaling
issues given that VTEPs can support well in excess of 4K VNIs.
Additionally, we should mention that with IRB, a given VNI may not even
exist on the destination VTEP. Finally, what is the benefit of doing
this? There may be certain corner cases where it's useful (vs a single BFD
session between the VTEPs for all VNIs) but it would be good to explain
what those are.
GIM>> Will add text in the Security Considerations section that VTEPs
should have a limit on the number of BFD sessions.
- A mention about the scalability issue right where per-VNI BFD is
discussed. (Not sure why that is a security issue/consideration.)
The implementation SHOULD have a reasonable upper bound on the number of
BFD sessions that can be created between the same pair of VTEPs.
[ag2] What are the criteria for determining what is reasonable?
GIM3>> I usually understand that as a requirement to make it
controllable, i.e., to have a configurable limit. Thus it will be up to
a network operator to set the limit.
Post by Anoop Ghanwani
Post by Greg Mirsky
- What is the benefit of running BFD per VNI between a pair of VTEPs?
GIM2>> An alternative would be to run CFM between VMs, if there's a need
to monitor the liveliness of a particular VM. Again, this is optional.
[ag2] I'm not sure how running per-VNI BFD between the VTEPs allows one
to monitor the liveliness of VMs.
[ag3] I think you missed responding to this. I'm not sure of the value of
running BFD per VNI between VTEPs. What am I getting that is not covered
by running a single BFD session with VNI 0 between the VTEPs?
Sections 5.1 and 6.1
In 5.1 we have
The inner MAC frame carrying the BFD payload has the following format:
... Source IP: IP address of the originating VTEP. Destination IP: IP
address of the terminating VTEP.
In 6.1 we have
Since multiple BFD sessions may be running between two
VTEPs, there needs to be a mechanism for demultiplexing received BFD
packets to the proper session. The procedure for demultiplexing
packets with Your Discriminator equal to 0 is different from [RFC5880].
For such packets, the BFD session MUST be identified
using the inner headers, i.e., the source IP and the destination IP
present in the IP header carried by the payload of the VXLAN
encapsulated packet.
How does this work if the source IP and dest IP are the same as
specified in 5.1?
GIM>> You're right, the destination and source IP addresses are likely the
same in this case. Will add that the source UDP port number, along with the
pair of IP addresses, MUST be used to demux received BFD control packets.
Would you agree that will be sufficient?
[ag] Yes, I think that should work.
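To make the agreed rule concrete, a minimal sketch in Python; the parsed-packet
fields and table names are hypothetical, not from the draft:

def demux_bfd(session_by_disc, session_by_flow, pkt):
    if pkt["your_discriminator"] != 0:
        # Usual RFC 5880 case: Your Discriminator alone selects the session.
        return session_by_disc.get(pkt["your_discriminator"])
    # Your Discriminator == 0: the inner source and destination IPs are both
    # VTEP addresses and may be identical across sessions, so the source port
    # of the inner UDP header disambiguates, per the agreement above.
    key = (pkt["inner_src_ip"], pkt["inner_dst_ip"], pkt["inner_udp_sport"])
    return session_by_flow.get(key)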
Editorial
[ag] Agree with all comments on this section.
- Terminology section should be renamed to acronyms.
GIM>> Accepted
- Document would benefit from a thorough editorial scrub, but maybe
that will happen once it gets to the RFC editor.
GIM>> Will certainly have helpful comments from ADs and RFC editor.
Section 1
"Virtual eXtensible Local Area Network" (VXLAN) [RFC7348
<https://tools.ietf.org/html/rfc7348>]. provides an encapsulation
scheme that allows virtual machines (VMs) to communicate in a data center
network.
This is not accurate. VXLAN allows you to implement an overlay to
decouple the address space of the attached hosts from that of the network.
"Virtual eXtensible Local Area Network" (VXLAN) [RFC7348]. provides
an encapsulation scheme that allows virtual machines (VMs) to
communicate in a data center network.
"Virtual eXtensible Local Area Network" (VXLAN) [RFC7348]. provides
an encapsulation scheme that allows building an overlay network by
decoupling the address space of the attached virtual hosts from that
of the network.
Section 7
VTEP's -> VTEPs
GIM>> Yes, thank you.
Greg Mirsky
2018-11-22 00:36:18 UTC
Permalink
Hi Anoop,
apologies for the miss. Is it the last outstanding? Let's bring it to the
front then.

- What is the benefit of running BFD per VNI between a pair of VTEPs?
GIM2>> An alternative would be to run CFM between VMs, if there's a
need to monitor the liveness of a particular VM. Again, this is optional.
[ag2] I'm not sure how running per-VNI BFD between the VTEPs allows one
to monitor the liveness of VMs.
[ag3] I think you missed responding to this. I'm not sure of the value of
running BFD per VNI between VTEPs. What am I getting that is not covered
by running a single BFD session with VNI 0 between the VTEPs?

GIM3>> I've misspoken. Non-zero VNI is recommended to be used to
demultiplex BFD sessions between the same VTEPs. In section 6.1:
The procedure for demultiplexing
packets with Your Discriminator equal to 0 is different from
[RFC5880]. For such packets, the BFD session MUST be identified
using the inner headers, i.e., the source IP and the destination IP
present in the IP header carried by the payload of the VXLAN
encapsulated packet. The VNI of the packet SHOULD be used to derive
interface-related information for demultiplexing the packet.

Hope that clarifies the use of non-zero VNI in VXLAN encapsulation of a BFD
control packet.

Regards,
Greg
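Read literally, the quoted 6.1 clause amounts to something like the following
sketch, where the VNI-to-interface mapping is illustrative only:

vni_to_interface = {100: "tenant-a-bd", 200: "tenant-b-bd"}  # example values

def interface_for_vni(vni):
    # The VNI in the VXLAN header selects interface-related state consulted
    # while demultiplexing packets with Your Discriminator == 0; the reserved
    # VNI 0 falls through to a default, VTEP-level context.
    return vni_to_interface.get(vni, "vtep-default")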
Anoop Ghanwani
2018-11-22 06:59:56 UTC
Permalink
Hi Greg,

See below prefixed with [ag4].

Thanks,
Anoop
GIM3>> I've misspoken. Non-zero VNI is recommended to be used to
demultiplex BFD sessions between the same VTEPs. In section 6.1:
The procedure for demultiplexing
packets with Your Discriminator equal to 0 is different from
[RFC5880]. For such packets, the BFD session MUST be identified
using the inner headers, i.e., the source IP and the destination IP
present in the IP header carried by the payload of the VXLAN
encapsulated packet. The VNI of the packet SHOULD be used to derive
interface-related information for demultiplexing the packet.
Hope that clarifies the use of non-zero VNI in VXLAN encapsulation of a
BFD control packet.
[ag4] This tells me how the VNI is used for BFD packets being
sent/received. What is the use case/benefit of doing that? I am creating
a special interface with VNI 0 just for BFD. Why do I now need to run BFD
on any/all of the other VNIs? As a developer, if I read this spec, should
I be building this capability or not? Basically what I'm getting at is I
think the draft should recommend using VNI 0. If there is a convincing use
case for running BFD over other VNIs serviced by that VTEP, then that needs
to be explained. But as I mentioned before, this leads to scaling issues.
So given the scaling issues, it would be good if an implementation only
needed to worry about sending BFD messages on VNI 0.
Greg Mirsky
2018-11-22 20:27:56 UTC
Permalink
Hi Anoop,
apologies if my explanation was not clear. Non-zero VNIs are recommended to
be used by a VTEP that received a BFD control packet with a zero Your
Discriminator value. BFD control packets with a non-zero Your Discriminator
value will be demultiplexed using only that value. As for the special role
of VNI 0, section 7 of the draft states the following:
BFD session MAY be established for the reserved VNI 0. One way to
aggregate BFD sessions between VTEP's is to establish a BFD session
with VNI 0. A VTEP MAY also use VNI 0 to establish a BFD session
with a service node.
Would you suggest changing the normative language in this text?

Regards,
Greg

PS. Happy Thanksgiving to All!
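For concreteness, a byte-level sketch of the VNI-0 encapsulation under
discussion: an 8-byte VXLAN header (RFC 7348) followed by a 24-byte BFD
control packet (RFC 5880). The outer IP/UDP and inner Ethernet/IP/UDP headers
are omitted, and the timer values are placeholders, not recommendations:

import struct

def vxlan_header(vni):
    # Flags byte 0x08 sets the I bit (VNI valid); the 24-bit VNI sits above
    # an 8-bit reserved field, hence the left shift by 8.
    return struct.pack("!II", 0x08 << 24, vni << 8)

def bfd_control(my_disc, your_disc):
    vers_diag = (1 << 5) | 0   # version 1, diag 0
    state_flags = 1 << 6       # state Down, no flags set
    detect_mult, length = 3, 24
    return struct.pack("!BBBBIIIII",
                       vers_diag, state_flags, detect_mult, length,
                       my_disc, your_disc,
                       1000000, 1000000, 0)  # timers in microseconds

payload = vxlan_header(0) + bfd_control(my_disc=0x11111111, your_disc=0)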
Anoop Ghanwani
2018-11-23 18:47:23 UTC
Permalink
Hi Greg,

I would recommend the following change.

OLD

7. Use of reserved VNI

BFD session MAY be established for the reserved VNI 0. One way to
aggregate BFD sessions between VTEP's is to establish a BFD session
with VNI 0. A VTEP MAY also use VNI 0 to establish a BFD session
with a service node.

NEW

7. Use of reserved VNI

In most cases, only a single BFD session is necessary for a given VTEP
to monitor the reachability to a remote VTEP, regardless of the number of
VNIs in common. When a single session is used to monitor reachability to
the remote VTEP, an implementation SHOULD use a VNI of 0.

Thanks,
Anoop
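The NEW text boils down to bookkeeping like the following sketch (names are
illustrative): one session per remote VTEP, established on the reserved VNI 0.

RESERVED_VNI = 0

def monitor_remote_vtep(sessions, local_ip, remote_ip):
    key = (local_ip, remote_ip)
    if key not in sessions:
        # One session per VTEP pair, regardless of how many VNIs they share.
        sessions[key] = {"vni": RESERVED_VNI, "state": "Down"}
    return sessions[key]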
Greg Mirsky
2018-11-23 21:10:16 UTC
Permalink
Hi Anoop,
thank you for the concise text. I think I've got the idea. Would the minor
tweak be acceptable?

In most cases, a single BFD session is sufficient for the given VTEP to
monitor the reachability of a remote VTEP, regardless of the number of
VNIs in common. When the single BFD session is used to monitor
reachability of the remote VTEP, an implementation SHOULD use a VNI of 0.

Regards,
Greg
Anoop Ghanwani
2018-11-23 22:46:38 UTC
Permalink
Hi Greg,

That is fine.

Thanks,
Anoop
Greg Mirsky
2018-11-24 00:30:53 UTC
Permalink
Hi Anoop,
thank you for your comments and the discussion, much appreciated. All that
helped to improve the specification. I've uploaded the -04 version with
updates resulting from your comments and our discussion. Hope I've got them
all right; please let me know. Attached is the diff and the new version of
the draft.

Regards,
Greg