[nvo3] WGLC comments on draft-ietf-bfd-vxlan

Discussion:

Anoop Ghanwani

2018-11-08 09:58:12 UTC

Here are my comments.

Thanks,
Anoop

==

Philosophical

Since VXLAN is not an IETF standard, should we be defining a standard for
running BFD on it? Should we define BFD over Geneve instead which is the
official WG selection? Is that going to be a separate document?

Technical

Section 1:

This part needs to be rewritten:
The individual racks may be part of a different Layer 3 network, or they
could be in a single Layer 2 network. The VXLAN segments/overlays are
overlaid on top of Layer 3 network. A VM can communicate with another VM
only if they are on the same VXLAN segment.
It's hard to parse and, given IRB, the last sentence above is wrong.

Section 3:
Most deployments will have VMs with only L2 capabilities that
may not support L3.
Are you suggesting most deployments have VMs with no IP
addresses/configuration?
Having a hierarchical OAM model helps localize faults though it requires
additional consideration.
What are the additional considerations?

Would be useful to add a reference to RFC 8293 in case the reader would
like to know more about service nodes.

Section 4
Separate BFD sessions can be established between the VTEPs (IP1 and IP2)
for monitoring each of the VXLAN tunnels (VNI 100 and 200).
IMO, the document should mention that this could lead to scaling issues
given that VTEPs can support well in excess of 4K VNIs. Additionally, we
should mention that with IRB, a given VNI may not even exist on the
destination VTEP. Finally, what is the benefit of doing this? There may
be certain corner cases where it's useful (vs a single BFD session between
the VTEPs for all VNIs) but it would be good to explain what those are.

Sections 5.1 and 6.1

In 5.1 we have
The inner MAC frame carrying the BFD payload has the
following format:
.... Source IP: IP address of the originating VTEP. Destination IP: IP
address of the terminating VTEP.
In 6.1 we have
Since multiple BFD sessions may be running between two
VTEPs, there needs to be a mechanism for demultiplexing received BF

packets to the proper session. The procedure for demultiplexing
packets with Your Discriminator equal to 0 is different from[RFC5880
<https://tools.ietf.org/html/rfc5880>].

*For such packets, the BFD session MUST be identified*

*using the inner headers, i.e., the source IP and the destination IP
present in the IP header carried by the payload of the VXLAN*

*encapsulated packet.*
How does this work if the source IP and dest IP are the same as specified
in 5.1?

Editorial

- Terminology section should be renamed to acronyms.
- Document would benefit from a thorough editorial scrub, but maybe that
will happen once it gets to the RFC editor.

Section 1
"Virtual eXtensible Local Area Network" (VXLAN) [RFC7348
<https://tools.ietf.org/html/rfc7348>]. provides an encapsulation scheme
that allows virtual machines (VMs) to communicate in a data center network.
This is not accurate. VXLAN allows you to implement an overlay to decouple
the address space of the attached hosts from that of the network.

Section 7

VTEP's -> VTEPs

Greg Mirsky

2018-11-09 01:50:27 UTC

Permalink

Hi Anoop,
thank you for your thorough review and the comments. I'm traveling over the
weekend and will respond in details later next week.

Regards,
Greg

Post by Anoop Ghanwani
Here are my comments.
Thanks,
Anoop
==
Philosophical
Since VXLAN is not an IETF standard, should we be defining a standard for
running BFD on it? Should we define BFD over Geneve instead which is the
official WG selection? Is that going to be a separate document?
Technical
The individual racks may be part of a different Layer 3 network, or they
could be in a single Layer 2 network. The VXLAN segments/overlays are
overlaid on top of Layer 3 network. A VM can communicate with another VM
only if they are on the same VXLAN segment.
It's hard to parse and, given IRB, the last sentence above is wrong.
Most deployments will have VMs with only L2 capabilities that
may not support L3.
Are you suggesting most deployments have VMs with no IP
addresses/configuration?
Having a hierarchical OAM model helps localize faults though it requires
additional consideration.
What are the additional considerations?
Would be useful to add a reference to RFC 8293 in case the reader would
like to know more about service nodes.
Section 4
Separate BFD sessions can be established between the VTEPs (IP1 and IP2)
for monitoring each of the VXLAN tunnels (VNI 100 and 200).
IMO, the document should mention that this could lead to scaling issues
given that VTEPs can support well in excess of 4K VNIs. Additionally, we
should mention that with IRB, a given VNI may not even exist on the
destination VTEP. Finally, what is the benefit of doing this? There may
be certain corner cases where it's useful (vs a single BFD session between
the VTEPs for all VNIs) but it would be good to explain what those are.
Sections 5.1 and 6.1
In 5.1 we have
The inner MAC frame carrying the BFD payload has the
... Source IP: IP address of the originating VTEP. Destination IP: IP
address of the terminating VTEP.
In 6.1 we have
Since multiple BFD sessions may be running between two
VTEPs, there needs to be a mechanism for demultiplexing received BF
packets to the proper session. The procedure for demultiplexing
packets with Your Discriminator equal to 0 is different from[RFC5880 <https://tools.ietf.org/html/rfc5880>].
*For such packets, the BFD session MUST be identified*
*using the inner headers, i.e., the source IP and the destination IP
present in the IP header carried by the payload of the VXLAN*
*encapsulated packet.*
How does this work if the source IP and dest IP are the same as specified
in 5.1?
Editorial
- Terminology section should be renamed to acronyms.
- Document would benefit from a thorough editorial scrub, but maybe that
will happen once it gets to the RFC editor.
Section 1
"Virtual eXtensible Local Area Network" (VXLAN) [RFC7348
<https://tools.ietf.org/html/rfc7348>]. provides an encapsulation scheme
that allows virtual machines (VMs) to communicate in a data center network.
This is not accurate. VXLAN allows you to implement an overlay to
decouple the address space of the attached hosts from that of the network.
Section 7
VTEP's -> VTEPs

Greg Mirsky

2018-11-09 01:50:59 UTC

Permalink

Hi Anoop,
thank you for your thorough review and the comments you've shared. Please
find my answers below tagged GIM>>.

Regards,
Greg

Greg Mirsky

2018-11-13 19:34:30 UTC

Permalink

Hi Anoop,
many thanks for the thorough review and detailed comments. Please find my
answers, this time for real, in-line tagged GIM>>.

Regards,
Greg

GIM>> Would the following text be acceptable:
OLD TEXT:
VXLAN is typically deployed in data centers interconnecting
virtualized hosts, which may be spread across multiple racks. The
individual racks may be part of a different Layer 3 network, or they
could be in a single Layer 2 network. The VXLAN segments/overlays
are overlaid on top of Layer 3 network.
NEW TEXT:
VXLAN is typically deployed in data centers interconnecting virtualized
hosts of a tenant. VXLAN addresses requirements of the Layer 2 and
Layer 3 data center network infrastructure in the presence of VMs in
a multi-tenant environment, discussed in section 3 [RFC7348], by
providing Layer 2 overlay scheme on a Layer 3 network.

A VM can communicate with another VM only if they are on the same
VXLAN segment.

Post by Anoop Ghanwani
the last sentence above is wrong.

GIM>> Section 4 in RFC 7348 states:
Only VMs within the same VXLAN segment can communicate with each other.

Post by Anoop Ghanwani
Most deployments will have VMs with only L2 capabilities that
may not support L3.
Are you suggesting most deployments have VMs with no IP
addresses/configuration?

GIM>> Would re-word as follows:
OLD TEXT:
Most deployments will have VMs with only L2 capabilities that
may not support L3.
NEW TEXT:
Deployments may have VMs with only L2 capabilities that do not support L3.

Post by Anoop Ghanwani
Having a hierarchical OAM model helps localize faults though it requires
additional consideration.
What are the additional considerations?

GIM>> For example, coordination of BFD intervals across the OAM layers.

Post by Anoop Ghanwani
Would be useful to add a reference to RFC 8293 in case the reader would
like to know more about service nodes.

GIM>> I have to admit that I don't find how RFC 8293 A Framework for
Multicast in Network Virtualization over Layer 3 is related to this
document. Please help with additional reference to the text of the
document.

Post by Anoop Ghanwani
Section 4
Separate BFD sessions can be established between the VTEPs (IP1 and IP2)
for monitoring each of the VXLAN tunnels (VNI 100 and 200).
IMO, the document should mention that this could lead to scaling issues
given that VTEPs can support well in excess of 4K VNIs. Additionally, we
should mention that with IRB, a given VNI may not even exist on the
destination VTEP. Finally, what is the benefit of doing this? There may
be certain corner cases where it's useful (vs a single BFD session between
the VTEPs for all VNIs) but it would be good to explain what those are.

GIM>> Will add text in the Security Considerations section that VTEPs
should have limit on number of BFD sessions.

Post by Anoop Ghanwani
Sections 5.1 and 6.1
In 5.1 we have
The inner MAC frame carrying the BFD payload has the
... Source IP: IP address of the originating VTEP. Destination IP: IP
address of the terminating VTEP.
In 6.1 we have
Since multiple BFD sessions may be running between two
VTEPs, there needs to be a mechanism for demultiplexing received BF
packets to the proper session. The procedure for demultiplexing
packets with Your Discriminator equal to 0 is different from[RFC5880 <https://tools.ietf.org/html/rfc5880>].
*For such packets, the BFD session MUST be identified*
*using the inner headers, i.e., the source IP and the destination IP
present in the IP header carried by the payload of the VXLAN*
*encapsulated packet.*
How does this work if the source IP and dest IP are the same as specified
in 5.1?

GIM>> You're right, Destination and source IP addresses likely are the same
in this case. Will add that the source UDP port number, along with the pair
of IP addresses, MUST be used to demux received BFD control packets. Would
you agree that will be sufficient?

Post by Anoop Ghanwani
Editorial
- Terminology section should be renamed to acronyms.

GIM>> Accepted

Post by Anoop Ghanwani
- Document would benefit from a thorough editorial scrub, but maybe that
will happen once it gets to the RFC editor.

GIM>> Will certainly have helpful comments from ADs and RFC editor.

Post by Anoop Ghanwani
Section 1
"Virtual eXtensible Local Area Network" (VXLAN) [RFC7348
<https://tools.ietf.org/html/rfc7348>]. provides an encapsulation scheme
that allows virtual machines (VMs) to communicate in a data center network.
This is not accurate. VXLAN allows you to implement an overlay to
decouple the address space of the attached hosts from that of the network.

GIM>> Thank you for the suggested text. Will change as follows:
OLD TEXT:
"Virtual eXtensible Local Area Network" (VXLAN) [RFC7348]. provides
an encapsulation scheme that allows virtual machines (VMs) to
communicate in a data center network.
NEW TEXT:
"Virtual eXtensible Local Area Network" (VXLAN) [RFC7348]. provides
an encapsulation scheme that allows building an overlay network by
decoupling the address space of the attached virtual hosts from that of
the network.

Post by Anoop Ghanwani
Section 7
VTEP's -> VTEPs

GIM>> Yes, thank you.

Anoop Ghanwani

2018-11-13 20:30:27 UTC

Permalink

Hi Greg,

Please see inline prefixed with [ag].

Thanks,
Anoop

Post by Greg Mirsky
Hi Anoop,
many thanks for the thorough review and detailed comments. Please find my
answers, this time for real, in-line tagged GIM>>.
Regards,
Greg

[ag] OK. I'm not an expert on this part so unless someone else that is an
expert (chairs, AD?) can comment on it, I'll just let it go.