[nvo3] Tsvart last call review of draft-ietf-nvo3-vmm-04
Bob Briscoe
2018-09-04 02:44:52 UTC
Reviewer: Bob Briscoe
Review result: Not Ready

I have been selected as the Transport Directorate reviewer for this draft. The
Transport Directorate seeks to review all transport or transport-related drafts
as they pass through IETF last call and IESG review, and sometimes on special
request. The purpose of the review is to provide assistance to the Transport
ADs. For more information about the Transport Directorate Reviews and the
Transport Area Review Team, please see
https://trac.ietf.org/trac/tsv/wiki/TSV-Directorate-Reviews

In this case, very few of the review comments relate to transport issues,
although the greatest issue concerns a desire that the network could pause or
stop connections during L3 VM Mobility, which is certainly a transport issue.

==Summary==

The technical aspects of the draft concerning L2 VM mobility (within a subnet)
seem sound. However, this is only part of the draft, which has the following
issues:

#. The introduction does not say what the purpose of publishing this draft is.
It seems that, rather than describing a specific protocol or protocols, it
intends to describe the overall system procedure that would typically be used
in DCs for VM mobility. It is tagged as a BCP, but it does not say who needs
this BCP, why it is useful for the IETF to publish this BCP, how wide the
authors' knowledge is of current practice (given DCs are private), or why this
is a BCP rather than a protocol spec.

The draft starts out (S.3) as if it intends to say what a good VM Mobility
protocol should or shouldn't do, but the rest of the document doesn't give any
reasoning for these recommendations, it just asserts what appears to be one
view of how a whole VM Mobility system works, sometimes referring to one
example protocol RFC for a component part, but more often with no references or
details.

#. It does not seem as if the NVO WG has discussed the purpose of using
normative text in this draft. See detailed comments.

#. The draft silently slips back and forth between VM mobility and VM
redundancy, without recognizing the differences. See detailed comments.

#. Please adopt different terminology than "source NVE" and "destination NVE",
which are really poor choices of terms for an intermediate node. See detailed
comments. Why not use "old NVE" and "new NVE", which is what you mean?

#. Applicability is fairly clearly outlined, but it is not clear whether hosts
corresponding with the mobile VMs are part of the same controlled environment
or on the uncontrolled public Internet. See detailed comments.

#. Section 4.2.1 on L3 VM mobility reads like some potential
half-thought-through ideas on how to solve L3 mobility, rather than current
practice, let alone best current practice. Either current practice should be
described instead, or the scope of the draft should be narrowed solely to L2 VM
mobility. See detailed comments.

#. The VM's file system is described as state that moves with the VM (S.6), but
VM mobility solutions often move the VM but stitch it back to its (unmoved)
storage. Conversely, the storage can also move independent of the VM.

#. The draft omits some of the security, transport and management aspects of VM
mobility. See detailed comments.

#. The draft reads as if different sections have been written by different
authors and no-one has edited the whole to give it a coherent structure, or to
ensure consistency (both technical and editorial) between the parts. See
detailed comments.

#. The quality of the English grammar does not allow a reviewer to concentrate
on the technical aspects rather than the English. It would have been useful if
one of the English-speaking co-authors had improved the English before
submission for review. See detailed comments.

==Detailed Comments==

===#. Normative statements===

In the body of the document, there is just one occurrence of normative text
(actually two "MUST"s, but both state a common requirement - just written
separately for IPv4 and IPv6). This merely serves to imply that everything else
the document says is less important or optional, which was probably not the
intention.

At the start there is a requirements section, which states what a VM Mobility
protocol "SHOULD" or "SHOULD NOT" do. I think this is intended as a set of
goals for the rest of the document. If so, these "SHOULDs" are not intended to
apply to implementations, so they ought not to be capitalized.

The first requirement, "Data center network SHOULD support virtual machine
mobility in IPv6", is written as a requirement on all DC networks, not on
implementations. I assume this was intended to read as "Data center network
virtual machine mobility protocols SHOULD support IPv6". Even then, it doesn't
really add anything to say VM mobility should support v6 and it should support
v4. An L2 solution won't, while an L3 solution will undoubtedly support at
least one of them.

I'm not sure that 'protocol' is the right word anyway; I think 'VM Mobility
procedure' would be a better phrase, because it includes steps such as
suspending the VM, which is more than a protocol.

The requirement "Virtual machine mobility protocol MAY support host routes to
accomplish virtualization", is not followed up at all in the rest of the draft.
Even if this requirement stays, the last 3 words should be deleted.

By the end of the draft, the solution falls far short of the most relevant
"Requirements" anyway, so one assumes the title of the section ought to have
been "Goals". Specifically, even in the simpler case of L2 VM mobility, S.4.1
says that triangular routing and tunnelling persist "until a neighbour cache
entry times out". A cache timeout is about 10 orders of magnitude longer than
the requirement to only persist "while handling packets in flight", which would
be a few milliseconds at most (the time for packets to clear the network that
were already launched into flight when the old VM stopped).

In any case, it would be preferable for the draft to give rationale for these
requirements, rather than just assert them. This would help to shed light on
the merits of the different trade-offs that solutions choose.

===#. Mobility vs. Redundancy===

Redundancy and mobility have a lot of similarities, but they have different
goals. With mobility, it is necessary to know the exact instant when one set of
state is identical to the other so it can hand over. With redundancy, the aim
is to keep two (or more) sets of state evolving through the same sequence of
changes, but there is no need to know the point at which one is the same as the
other was at a certain point.

The draft slips from mobility to resilience in the following places:
* S.2. Terminology: Warm VM Mobility is defined without any ending, as if it is
permanent replication.
* S.7. "Handling of Hot, Warm and Cold Virtual Machine Mobility" is actually
all about redundancy, and doesn't address mobility explicitly.

===#. Terminology===

Packets run from the source at A to the destination at B via NVE1, then via
NVE2. Please don't call NVE1 and NVE2 the source NVE and the destination NVE.
In future, no-one will thank you for the apparent contradictions when they
continually stumble over phrases like this one in S.4.1: "...send their packets
to the source NVE".

The term "packets in flight" is used incorrectly to refer to all the packets
sent to the old NVE after the VM has moved, even if they were launched into
flight long after the old VM stopped receiving packets.

BTW, I think s/before/after/ in: "that have old ARP or neighbor cache entry
before VM or task migration".

I think: s/IP-based VM mobility/L3 VM mobility/ throughout, because "based"
sounds (to me) like the mobility control protocol is over (i.e. based on) IP.

===#. Applicability===

In section 4.2 it says that the protocol mostly used as the IP based task
migration protocol is ILA. This implies that all hosts corresponding with the
mobile VMs are either part of the same controlled environment, or they are
proxied via nodes that are part of the same controlled environment (I only have
passing knowledge of ILA, but I understand that it depends on ILA routers on
the path). If I am correct, this aspect of scope needs to be made clear from
the start.

Also under the heading of applicability, the sentence "Since migrations should
be relatively rare events" appears very late in the document (S.4.2.1). The
assumed level of churn ought to be stated nearer the start.

===#. L3 Mobility===
L2 VM mobility is independent of the application, because resolution of L2
mappings is delegated to the stack. In contrast, L3 VM mobility is only
feasible under certain conditions, because an application needs an IP address
to open a socket (resolution of DNS names is not delegated to the stack, and
apps can use IP addresses directly anyway).
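
As a minimal illustration of why the application, rather than the stack, ends up holding the IP address, the following Python sketch resolves a name and then connects to the literal address it got back. The helper name open_connection and the loopback listener are hypothetical stand-ins chosen for this example, not anything taken from the draft.

```python
import socket


def open_connection(host, port):
    """Resolve host inside the application, then connect to the literal IP."""
    # Name resolution is an explicit library call made by the application;
    # the transport stack never sees the name, only the resulting address.
    family, socktype, proto, _canon, sockaddr = socket.getaddrinfo(
        host, port, type=socket.SOCK_STREAM)[0]
    sock = socket.socket(family, socktype, proto)
    # From here on the connection is tied to this IP address. If the peer VM
    # moved to a different address, nothing in this code would learn of it.
    sock.connect(sockaddr)
    return sock, sockaddr[0]


if __name__ == "__main__":
    # Hypothetical local listener standing in for a service on a VM.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    conn, ip = open_connection("127.0.0.1", listener.getsockname()[1])
    print("application is holding address", ip)
    conn.close()
    listener.close()
```

The address returned here is exactly what an application would cache or log, which is why L3 mobility needs either application support or an identifier/locator split.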

Examples of the 'certain conditions':
a) /All/ applications used in the whole DC load balancing scheme contain IP
address migration logic for /all/ their connections;
b) VMs running solely applications that support IP address migration register
this fact with the NVA, and it only selects such VMs for mobility.
c) An abstraction is layered over /all/ the IP addresses exposed to
applications (at both ends) so that the IP addresses that applications use are
solely identifiers (e.g. ILA, LISP, HIP), not also locators.

The introduction says the draft is about VM mobility in a multi-tenant DC, so
the DC admin will not know the range of applications being used. This excludes
condition (a) above. When the draft says "...if all applications running are
known to handle this gracefully...", it doesn't quantify just how restrictive
this condition is, and it gives no explanation of how this knowledge might be
'known' or which function within the system 'knows' it.

S.4.2.1 contains what seems like plenty of arm-waving.
* "TCP connections could be automatically closed in the network stack during a
migration event."
o There is no TCP connection state in the network stack.
o Even if the network starts to drop every packet, the TCP connection
state persists in the end-points for a duration of the order of 30-90
minutes (OS-dependent) before TCP deems the connection is broken.
o Other transport protocols have similar designs (including the app-layer
of protocols over UDP).
* "More involved approach to connection migration":
o pausing the connection [does this refer to an actual feature of any
L4 protocol?]
o packaging connection state and sending to target [does
this assume logic written into the application, or is this assuming the
stack handles this and the app is restricted to using some form of
separate identifier/locator addresses?]
o instantiating connection state in the peer stack [ditto?].

There's some arm-waving in S.7 too:
"Cold Virtual Machine mobility is facilitated by the VM initially
sending an ARP or Neighbor Discovery message at the destination NVE
but the source NVE not receiving any packets inflight."
[How is it arranged for the source NVE not to receive any packets in flight?]

And in S.7:
"In hot
standby option, regarding TCP connections, one option is to start
with and maintain TCP connections to two different VMs at the same
time."
[This sounds like resilience logic has been written into the application,
which would be a special case but not something VM mobility infrastructure
could depend on.]

===#. Gaps===
#. Security Considerations: repeats issues in other drafts that are not
specific to mobility, but it does not mention any security issues specifically
due to VM mobility. It says that address spoofing may arise in a DC (sort-of
implying it is worse than in non-DC environments, but not saying why). The
handshake at the start of a connection (e.g. TCP, SCTP, QUIC) checks for source
address spoofing. So L3 VM mobility would be more vulnerable to source address
spoofing in cases where the mobile VM was the connection initiator and there
was not a new handshake after the move. However, this draft does not contain
any detailed mobility protocols, so it is not possible to identify any specific
security flaws.

#. Transport Issues: Effect of delay on the transport: Cold mobility introduces
significant delay, and other forms less, but still some delay. It should be
pointed out that some applications (e.g. real-time) will therefore not be
useful if subjected to VM mobility. Similarly, even a short period of delay
will drive most congestion controls to severely reduce throughput. These points
might be self-evident, but perhaps they should be stated explicitly.

BTW, in the L3 VM mobility case, the draft often refers to TCP connections, but
the address bindings of any transport protocols would have to be migrated due
to VM mobility (e.g. SCTP; sequences of datagrams over UDP; streams over UDP
such as with RTP, QUIC).

#. Management Issues: perhaps the draft ought to recommend statistics gathering
(e.g. time taken, amount of duplicate data) to aid a DC's future decisions on
the cost-benefit of moving a VM. The OPSDIR review says a BCP does not /have/
to describe management issues, but this document seems to describe a whole
system procedure, not just a protocol, which then surely includes the
management plane.

===#. Incoherent Structure===

S.4.1. happens to talk about VMs moving, while S.4.2. happens to talk about
tasks moving, but this is not the distinguishing aspect of these two sections
(anyway, S.2. says "the draft uses task and VM interchangeably"):
* "4.1 VM Migration" is about "L2 VM Mobility" so this ought to be the section
heading,
* "4.2 Task Migration" is about "L3 VM Mobility" so this ought to be the
section heading.
It would also help not to switch from VM to task across these sections - it's
just a distraction.

S.4.1 needs better signposting of where each sub-case ends (subsections might
be useful to solve this):
* IPv4
* end-user client
* 2 paras starting "All NVEs communicating with this virtual machine..." [Not
clear that the end-user case has ended and we have returned to the general IPv4
case?]
* IPv6 [Strictly, it still hasn't said whether the end-user client case has
ended.] [Also, it doesn't explain why there is no need for an end-user client
case under IPv6?]

Sections 5 & 6 seem to be about either L2 or L3 mobility, whereas Sections 7 &
8 seem to be restricted to L2.

The draft vacillates over what to do with packets arriving at the old NVE in
the L3 case (see also L3 mobility above):
* S.4.2 first says packets are dropped, possibly with an ICMP error message;
o then later it says they are silently dropped;
o then in the very next sentence it says either silently drop them or forward
them to the new location
* S.5 says they should not be lost, but instead delivered to the destination
hypervisor
o then it describes how they are tunnelled (which is not the same as
"forwarding").

The order in which all the stages of mobility are given is jumbled up across
sections that also appear in arbitrary order:
* S.5 prepares, establishes, uses, then stops a tunnel, but it doesn't say
where the other stages fit between these steps
o When tunneling packets, it talks about the *migrating* VM not the
*migrated* VM, which implies tunnelling has started before the new VM
is running. Does this imply there is a huge buffer?
o It says "Stop Tunneling Packets - When source NVE stops receiving packets
destined to..." but it is never clear when a source has stopped sending
packets to a destination, unless it explicitly closes the connection (e.g.
with a FIN in the case of TCP). Often there are long gaps between packets,
because many flows are 'thin' (meaning the application frequently has
nothing to send). These gaps can last for milliseconds, hours or even
days without any implication that the connection has ended.
* Then S.6. describes moving state, but doesn't say that this is not after the
previous tunnelling steps (or where it fits within those steps).
* Then S.7 describes hot, warm and cold mobility, but doesn't lay out the
tunnelling or steps to move state in each case.
* Then S.8 says it's about VM life-cycle, but just gives the very first 3 steps
for allocation of resources to a VM, then abruptly ends, without even starting
the VM, let alone getting to move it.
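
To illustrate the thin-flow problem in the "Stop Tunneling Packets" step, here is a small Python sketch. It assumes the old NVE uses an idle timeout to decide that packets have stopped, which is an assumption made purely for illustration (the draft does not say how the decision is made). The function name and the timings are hypothetical.

```python
def premature_stops(arrivals_ms, idle_timeout_ms):
    """Return the times (ms) at which an idle-timeout rule would tear down
    the tunnel although a later packet for the moved VM still arrives."""
    stops = []
    for prev, nxt in zip(arrivals_ms, arrivals_ms[1:]):
        if nxt - prev > idle_timeout_ms:
            # The timer expires idle_timeout_ms after the last packet seen,
            # yet the flow resumes later at time nxt.
            stops.append(prev + idle_timeout_ms)
    return stops


# Hypothetical thin flow: two packets, an hour of silence, two more packets.
arrivals = [0, 5, 3_600_000, 3_600_002]
print(premature_stops(arrivals, idle_timeout_ms=60_000))  # [60005]
```

Any timeout short enough to end triangular routing promptly would cut off the later packets of such a flow, while one long enough to be safe would keep the tunnel up for hours.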

S.5 exhibits another inconsistency by talking about the hypervisor, not the NVE.

===#. Nits===

Nits with the English are too numerous to mention them all. Below are pointers
to general problems as well as some individual instances.

S.4
"Layer 2 and Layer 3 protocols are described next. In the following
sections, we examine more advanced features."
s/following/subsequent/

S.4.1
Expand WSC, MSC and NVA on first use.

s/the VM moves in the same link/the VM moves in the same subnet/

"i.e. end-user clients ask for the same MAC address upon migration. [...] to
ensure that the same IPv4 address is assigned to the VM." I think s/IPv4/MAC/
was intended?

" All NVEs communicating with this virtual machine uses the old ARP
entry. If any VM in those NVEs need to talk to the new VM in the
destination NVE, it uses the old ARP entry."
Repetition: these 2 sentences say the same. (The mistake is also repeated when
these 2 sentences are repeated for IPv6).

S.4.2.1
s/Push the new mapping to hosts./Push the new mapping to communicating hosts./

S.5.
The IPv4/IPv6 pairs of paras for "tunnel establishment" and "tunneling packets"
only differ in the words "IPv4"/"IPv6". So in each case a single para could be
given for IP (irrespective of whether v4 or v6).
Linda Dunbar
2018-09-10 18:16:58 UTC
Bob,

Thank you very much for reviewing the draft and providing in-depth comments. I am very sorry for the delayed response due to travel.

Replies to your comments are inserted below marked by [Linda]:


-----Original Message-----
From: Bob Briscoe [mailto:***@bobbriscoe.net]
Sent: Monday, September 03, 2018 9:45 PM
To: tsv-***@ietf.org
Cc: ***@ietf.org; ***@ietf.org; draft-ietf-nvo3-***@ietf.org
Subject: Tsvart last call review of draft-ietf-nvo3-vmm-04

Reviewer: Bob Briscoe
Review result: Not Ready

I have been selected as the Transport Directorate reviewer for this draft. The Transport Directorate seeks to review all transport or transport-related drafts as they pass through IETF last call and IESG review, and sometimes on special request. The purpose of the review is to provide assistance to the Transport ADs. For more information about the Transport Directorate Reviews and the Transport Area Review Team, please see https://trac.ietf.org/trac/tsv/wiki/TSV-Directorate-Reviews

In this case, very very few of the review comments relate to transport issues, although the greatest issue concerns a desire that the network could pause or stop connections during L3 VM Mobility, which is certainly a transport issue.

[Linda] There is “Hot Migration”, in which the transport service continues, and there is “Cold Migration”, a common practice in many data centers, which stops the task running at the old location and moves it to the new location before restarting it, as described under Task Migration.
Would it be helpful to add this description to the draft?


==Summary==

The technical aspects of the draft concerning L2 VM mobility (within a subnet) seem sound. However, this is only part of the draft, which has the following
issues:

#. The introduction does not say what the purpose of publishing this draft is.
It seems that, rather than describing a specific protocol or protocols, it intends to describe the overall system procedure that would typically be used in DCs for VM mobility. It is tagged as a BCP, but it does not say who needs this BCP, why it is useful for the IETF to publish this BCP, how wide the authors' knowledge is of current practice (given DCs are private), or why this is a BCP rather than a protocol spec.

[Linda] The first paragraph on Page 3 describes why VM Mobility is needed. Would it be helpful to move this paragraph to the beginning of the Introduction Section?
“Virtualization which is being used in almost all of today’s data
centers enables many virtual machines to run on a single physical
computer or compute server. Virtual machines (VM) need hypervisor
running on the physical compute server to provide them shared
processor/memory/storage. Network connectivity is provided by the
network virtualization edge (NVE) [RFC8014]. Being able to move VMs
dynamically, or live migration, from one server to another allows for
dynamic load balancing or work distribution and thus it is a highly
desirable feature [RFC7364].”


The draft starts out (S.3) as if it intends to say what a good VM Mobility protocol should or shouldn't do, but the rest of the document doesn't give any reasoning for these recommendations, it just asserts what appears to be one view of how a whole VM Mobility system works, sometimes referring to one example protocol RFC for a component part, but more often with no references or details.

[Linda] Would it be helpful to move the paragraph above to the beginning of the Introduction Section, so that the audience is aware of why VM Mobility is needed, and then follow it with what a good VM Mobility protocol should or shouldn't do?

#. It does not seem as if the NVO WG has discussed the purpose of using normative text in this draft. See detailed comments.

[Linda] The “Intended status” of the draft is “Best Current Practice”, so none of the text is “normative”. Is that okay?

#. The draft silently slips back and forth between VM mobility and VM redundancy, without recognizing the differences. See detailed comments.

[Linda] There is only one usage of “redundancy” in the entire document, used under the context of “Hot standby option”, indicating the “redundancy” of “the VMs in both primary and secondary domains have identical information and can provide services simultaneously as in load-share mode of operation” being expensive.

#. Please adopt different terminology than "source NVE" and "destination NVE", which are really poor choices of terms for an intermediate node. See detailed comments. Why not use "old NVE" and "new NVE", which is what you mean?
[Linda] Thanks for the suggestion. We will change to “old NVE” and “new NVE”.

#. Applicability is fairly clearly outlined, but it is not clear whether hosts corresponding with the mobile VMs are part of the same controlled environment or on the uncontrolled public Internet. See detailed comments.
[Linda] “Hosts” are the applications running on the VMs. They are within the same controlled environment, not on the uncontrolled public Internet.


#. Section 4.2.1 on L3 VM mobility reads like some potential half-thought-through ideas on how to solve L3 mobility, rather than current practice, let alone best current practice. Either current practice should be described instead, or the scope of the draft should be narrowed solely to L2 VM mobility. See detailed comments.
[Linda] This is referring to “Cold Migration”, which is a common practice in many data centers.

#. The VM's file system is described as state that moves with the VM (S.6), but VM mobility solutions often move the VM but stitch it back to its (unmoved) storage. Conversely, the storage can also move independent of the VM.
[Linda] It depends. When a VM moves to a different zone, the storage/files can become inaccessible.

#. The draft omits some of the security, transport and management aspects of VM mobility. See detailed comments.
[Linda] Can you provide some text?

#. The draft reads as if different sections have been written by different authors and no-one has edited the whole to give it a coherent structure, or to ensure consistency (both technical and editorial) between the parts. See detailed comments.

[Linda] We can improve this.


#. The quality of the English grammar does not allow a reviewer to concentrate on the technical aspects rather than the English. It would have been useful if one of the English-speaking co-authors had improved the English before submission for review. See detailed comments.
[Linda] Can you help, perhaps by becoming a co-author to improve the English?

==Detailed Comments==

===#. Normative statements===

In the body of the document, there is just one occurrence of normative text (actually two "MUST"s, but both state a common requirement - just written separately for IPv4 and IPv6). This merely serves to imply that everything else the document says is less important or optional, which was probably not the intention.
[Linda] The goal is to indicate that any solution for moving the VM “MUST” follow this rule. That makes sense, doesn't it?

At the start there is a requirements section, which states what a VM Mobility protocol "SHOULD" or "SHOULD NOT" do. I think this is intended as a set of goals for the rest of the document. If so, these "SHOULDs" are not intended to apply to implementations, so they ought not to be capitalized.

[Linda] okay, will change.


The first requirement, "Data center network SHOULD support virtual machine mobility in IPv6", is written as a requirement on all DC networks, not on implementations. I assume this was intended to read as "Data center network virtual machine mobility protocols SHOULD support IPv6". Even then, it doesn't really add anything to say VM mobility should support v6 and it should support v4. A L2 solution won't. While undoubtedly, a L3 solution will at least support one of them.
[Linda]Agree. Will change it to “Data center that support IPv6 address should 
”

I'm not sure that 'protocol' is the right word anyway; I think 'VM Mobility procedure' would be a better phrase, because it includes steps such as suspending the VM, which is more than a protocol.
[Linda] yes. Will change to “Procedure”.

The requirement "Virtual machine mobility protocol MAY support host routes to accomplish virtualization", is not followed up at all in the rest of the draft.
Even if this requirement stays, the last 3 words should be deleted.

[Linda] will change to “Host Route can be used to support the Virtual Machine Mobility Procedure.”

By the end of the draft, the solution falls far short of the most relevant "Requirements" anyway, so one assumes the title of the section ought to have been "Goals". Specifically, even in the simpler case of L2 VM mobility, S.4.1 says that triangular routing and tunnelling persist "until a neighbour cache entry times out". A cache timeout is about 10 orders of magnitude longer than the requirement to only persist "while handling packets in flight", which would be a few milliseconds at most (the time for packets to clear the network that were already launched into flight when the old VM stopped).

Whatever, it would be preferable for the draft to give rationale for these requirements, rather than just assert them. This would help to shed light on the merits of the different trade offs that solutions choose.

[Linda] Agree, will add.

===#. Mobility vs. Redundancy===

Redundancy and mobility have a lot of similarities, but they have different goals. With mobility, it is necessary to know the exact instant when one set of state is identical to the other so it can hand over. With redundancy, the aim is to keep two (or more) sets of state evolving through the same sequence of changes, but there is no need to know the point at which one is the same as the other was at a certain point.
[Linda] Agree with what you said. There is only one usage of “redundancy” in the entire document, used under the context of “Hot standby option”, indicating the “redundancy” of “the VMs in both primary and secondary domains have identical information and can provide services simultaneously as in load-share mode of operation” being expensive.

The draft slips from mobility to resilience in the following places:
* S.2. Terminology: Warm VM Mobility is defined without any ending, as if it is permanent replication.
* S.7. "Handling of Hot, Warm and Cold Virtual Machine Mobility" is actually all about redundancy, and doesn't address mobility explicitly.

[Linda] Will add definitions of “hot migration”, “cold migration”, and “warm migration”.

===#. Terminology===

Packets run from the source at A to the destination at B via NVE1, then via NVE2. Please don't call NVE1 and NVE2 the source NVE and the destination NVE.
In future, no-one will thank you for the apparent contradictions when they continually stumble over phrases like this one in S.4.1: "...send their packets to the source NVE".

The term "packets in flight" is used incorrectly to refer to all the packets sent to the old NVE after the VM has moved, even if they were launched into flight long after the old VM stopped receiving packets.

[Linda] Thanks for the comments. Will change.

BTW, I think s/before/after/ in: "that have old ARP or neighbor cache entry before VM or task migration".

I think: s/IP-based VM mobility/L3 VM mobility/ throughout, because "based"
sounds (to me) like the mobility control protocol is over (i.e. based on) IP.

===#. Applicability===

In section 4.2 it says that the protocol mostly used as the IP based task migration protocol is ILA. This implies that all hosts corresponding with the mobile VMs are either part of the same controlled environment, or they are proxied via nodes that are part of the same controlled environment (I only have passing knowledge of ILA, but I understand that it depends on ILA routers on the path). If I am correct, this aspect of scope needs to be made clear from the start.

Also under the heading of applicability, the sentence "Since migrations should be relatively rare events" appears very late in the document (S.4.2.1). The assumed level of churn ought to be stated nearer the start.

[Linda] yes, under the same controlled environment.

===#. L3 Mobility===
L2 VM mobility is independent of the application, because resolution of L2 mappings is delegated to the stack. In contrast, L3 VM mobility is only feasible under certain conditions, because an application needs an IP address to open a socket (resolution of DNS names is not delegated to the stack, and apps can use IP addresses directly anyway).

Examples of the 'certain conditions':
a) /All/ applications used in the whole DC load balancing scheme contain IP address migration logic for /all/ their connections;
b) VMs running solely applications that support IP address migration register this fact with the NVA, and it only selects such VMs for mobility.
c) An abstraction is layered over /all/ the IP addresses exposed to applications (at both ends) so that the IP addresses that applications use are solely identifiers (e.g. ILA, LISP, HIP), not also locators.

The introduction says the draft is about VM mobility in a multi-tenant DC, so the DC admin will not know the range of applications being used. This excludes condition (a) above. When the draft says "...if all applications running are known to handle this gracefully...", it doesn't quantify just how restrictive this condition is, and it gives no explanation of how this knowledge might be 'known' or which function within the system 'knows' it.

S.4.2.1 contains what seems like plenty of arm-waving.
* "TCP connections could be automatically closed in the network stack during a migration event."
o There is no TCP connection state in the network stack.
o Even if the network starts to drop every packet, the TCP connection
state persists in the end-points for a duration of the order of 30-90
minutes (OS-dependent) before TCP deems the connection is broken.
o Other transport protocols have similar designs (including the app-layer
of protocols over UDP).
* "More involved approach to connection migration":
o pausing the connection [does this refer to an actual feature of any
L4 protocol?]
o packaging connection state and sending to target [does
this assume logic written into the application, or is this assuming the
stack handles this and the app is restricted to using some form of
separate identifier/locator addresses?]
o instantiating connection state in the peer stack [ditto?].

There's some arm-waving in S.7 too:
"Cold Virtual Machine mobility is facilitated by the VM initially
sending an ARP or Neighbor Discovery message at the destination NVE
but the source NVE not receiving any packets inflight."
[How is it arranged for the source NVE not to receive any packets in flight?]

And in S.7:
"In hot standby option, regarding TCP connections, one option is to
start with and maintain TCP connections to two different VMs at the
same time."
[This sounds like resilience logic has been written into the application,
which would be a special case but not something VM mobility infrastructure
could depend on.]

[Linda] will add.

===#. Gaps===
#. Security Considerations: repeats issues in other drafts that are not specific to mobility, but it does not mention any security issues specifically due to VM mobility. It says that address spoofing may arise in a DC (sort-of implying it is worse than in non-DC environments, but not saying why). The handshake at the start of a connection (e.g. TCP, SCTP, QUIC) checks for source address spoofing. So L3 VM mobility would be more vulnerable to source address spoofing in cases where the mobile VM was the connection initiator and there was not a new handshake after the move. However, this draft does not contain any detailed mobility protocols, so it is not possible to identify any specific security flaws.

#. Transport Issues: Effect of delay on the transport: Cold mobility introduces significant delay, and other forms less, but still some delay. It should be pointed out that some applications (e.g. real-time) will therefore not be useful if subjected to VM mobility. Similarly, even a short period of delay will drive most congestion controls to severely reduce throughput. These points might be self-evident, but perhaps they should be stated explicitly.

BTW, in the L3 VM mobility case, the draft often refers to TCP connections, but the address bindings of any transport protocols would have to be migrated due to VM mobility (e.g. SCTP; sequences of datagrams over UDP; streams over UDP such as with RTP, QUIC).

#. Management Issues: perhaps the draft ought to recommend statistics gathering (e.g. time taken, amount of duplicate data) to aid a DC's future decisions on the cost-benefit of moving a VM. The OPSDIR review says a BCP does not /have/ to describe management issues, but this document seems to describe a whole system procedure, not just a protocol, which then surely includes the management plane.

[Linda] can you become a co-author and add those in?
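
The statistics gathering suggested under Management Issues could be as small as one record per move. The sketch below is only an illustration of that idea: the record fields (time taken, amount of duplicate data) come from the review's examples, while the class name, field names and summary function are hypothetical and not part of the draft or of any NVO3 specification.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class MigrationRecord:
    """One VM move, as a DC operator might log it (hypothetical schema)."""
    vm_id: str
    duration_s: float       # time taken to complete the move
    duplicate_bytes: int    # data delivered to both the old and new location


def summarize(records):
    """Aggregate per-move records into figures for a cost-benefit decision."""
    if not records:
        return {"moves": 0, "mean_duration_s": 0.0, "total_duplicate_bytes": 0}
    return {
        "moves": len(records),
        "mean_duration_s": mean(r.duration_s for r in records),
        "total_duplicate_bytes": sum(r.duplicate_bytes for r in records),
    }
```

A DC operator could compare such summaries before and after a change in mobility procedure to judge whether moving VMs is worth its cost.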

===#. Incoherent Structure===

S.4.1. happens to talk about VMs moving, while S.4.2. happens to talk about tasks moving, but this is not the distinguishing aspect of these two sections (anyway, S.2. says "the draft uses task and VM interchangeably"):
* "4.1 VM Migration" is about "L2 VM Mobility" so this ought to be the section heading,
* "4.2 Task Migration" is about "L3 VM Mobility" so this ought to be the section heading.
It would also help not to switch from VM to task across these sections - it's just a distraction.

S.4.1 needs better signposting of where each sub-case ends (Subsections might be useful to solve this):
* IPv4
* end-user client
* 2 paras starting "All NVEs communicating with this virtual machine..." [Not clear that the end-user case has ended and we have returned to the general IPv4 case?]
* IPv6 [Strictly, it still hasn't said whether the end-user client case has ended.] [Also, it doesn't explain why there is no need for an end-user client case under IPv6?]

Sections 5 & 6 seem to be about either L2 or L3 mobility, whereas Sections 7 & 8 seem to be restricted to L2.

The draft vacillates over what to do with packets arriving at the old NVE in the L3 case (see also L3 mobility above):
* S.4.2 first says packets are dropped, possibly with an ICMP error message;
o then later it says they are silently dropped;
o then in the very next sentence it says either silently drop them or forward them to the new location
* S.5 says they should not be lost, but instead delivered to the destination hypervisor
o then it describes how they are tunnelled (which is not the same as "forwarding").

The order in which all the stages of mobility are given is jumbled up across sections that also appear in arbitrary order:
* S.5 prepares, establishes, uses then stops a tunnel, but it doesn't say where the other stages fit between these steps
o When tunneling packets, it talks about the *migrating* VM not the *migrated* VM, which implies tunnelling has started before the new VM is running. Does this imply there is a huge buffer?
o It says "Stop Tunneling Packets - When source NVE stops receiving packets destined to..." but it is never clear when a source has stopped sending packets to a destination, unless it explicitly closes the connection (e.g. with a FIN in the case of TCP). Often there are long gaps between packets, because many flows are 'thin' (meaning the application frequently has nothing to send). These gaps can last for milliseconds, hours or even days without any implication that the connection has ended.
* Then S.6. describes moving state, but doesn't say that this is not after the previous tunnelling steps (or where it fits within those steps).
* Then S.7 describes hot, warm and cold mobility, but doesn't lay out the tunnelling or steps to move state in each case.
* Then S.8 says it's about VM life-cycle, but just gives the very first 3 steps for allocation of resources to a VM, then abruptly ends, without even starting the VM, let alone getting to move it.
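
The point about thin flows can be made concrete with a small sketch. It assumes, purely for illustration, that the old NVE decides it has "stopped receiving packets" by means of an idle timer and then removes the tunnel for good; the draft does not say how the decision is made, and the function and parameter names here are hypothetical.

```python
def packets_lost_to_idle_timeout(arrival_times_s, idle_timeout_s):
    """Count packets that arrive after an idle timer has removed the tunnel.

    arrival_times_s: sorted arrival times (seconds) at the old NVE.
    Assumption: the tunnel is removed once no packet has been seen for
    idle_timeout_s seconds, and it is not re-created afterwards.
    """
    if not arrival_times_s:
        return 0
    last_seen = arrival_times_s[0]
    for index, t in enumerate(arrival_times_s[1:], start=1):
        if t - last_seen > idle_timeout_s:
            # This packet and every later one find no tunnel.
            return len(arrival_times_s) - index
        last_seen = t
    return 0


if __name__ == "__main__":
    # A thin flow: a short burst, an hour of silence, then two more packets.
    thin_flow = [0.0, 0.01, 0.02, 3600.0, 3600.01]
    print(packets_lost_to_idle_timeout(thin_flow, idle_timeout_s=30.0))
```

With a 30 second timer the two packets sent after the hour-long gap belong to a connection that never ended, yet they arrive after the tunnel has gone. Any finite timer has this problem for some flow, which is why the stopping condition needs to be specified rather than inferred from silence.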

S.5 exhibits another inconsistency by talking about the hypervisor, not the NVE.

==#. Nits==

Nits with the English are too numerous to mention them all. Below are pointers to general problems as well as some individual instances.

S.4
"Layer 2 and Layer 3 protocols are described next. In the following
sections, we examine more advanced features."
s/following/subsequent/

S.4.1
Expand WSC, MSC and NVA on first use.

s/the VM moves in the same link/the VM moves in the same subnet/

"i.e. end-user clients ask for the same MAC address upon migration. [...] to ensure that the same IPv4 address is assigned to the VM." I think s/IPv4/MAC/ was intended?

" All NVEs communicating with this virtual machine uses the old ARP
entry. If any VM in those NVEs need to talk to the new VM in the
destination NVE, it uses the old ARP entry."
Repetition: these 2 sentences say the same. (The mistake is also repeated when these 2 sentences are repeated for IPv6).

S.4.2.1
s/Push the new mapping to hosts./Push the new mapping to communicating hosts./

S.5.
The IPv4/IPv6 pairs of paras for "tunnel establishment" and "tunneling packets" only differ in the words "IPv4"/"IPv6". So in each case a single para could be given for IP (irrespective of whether v4 or v6).

Thank you very much.

Linda Dunbar
Black, David
2018-09-14 01:02:54 UTC
Permalink
Post by Linda Dunbar
#. It does not seem as if the NVO WG has discussed the purpose of using normative text in this draft. See detailed comments.
[Linda] The “Intended status” of the draft is “Best Current Practice”. So none of the text is “normative”. Is that okay?
Not really – this draft might be better targeted as “Informational” as it is not a comprehensive review of current practice (best or otherwise) nor an overall set of recommendations, e.g., as Bob wrote “it just asserts what appears to be one view of how a whole VM Mobility system works.”

Thanks, --David

From: Tsv-art [mailto:tsv-art-***@ietf.org] On Behalf Of Linda Dunbar
Sent: Monday, September 10, 2018 2:17 PM
To: Bob Briscoe; tsv-***@ietf.org
Cc: ***@ietf.org; ***@ietf.org; draft-ietf-nvo3-***@ietf.org
Subject: Re: [Tsv-art] Tsvart last call review of draft-ietf-nvo3-vmm-04

Bob,

Thank you very much for reviewing the draft and providing in-depth comments. I am very sorry for the delayed response due to traveling.

Replies to your comments are inserted below marked by [Linda]:


-----Original Message-----
From: Bob Briscoe [mailto:***@bobbriscoe.net]
Sent: Monday, September 03, 2018 9:45 PM
To: tsv-***@ietf.org<mailto:tsv-***@ietf.org>
Cc: ***@ietf.org<mailto:***@ietf.org>; ***@ietf.org<mailto:***@ietf.org>; draft-ietf-nvo3-***@ietf.org<mailto:draft-ietf-nvo3-***@ietf.org>
Subject: Tsvart last call review of draft-ietf-nvo3-vmm-04

Reviewer: Bob Briscoe
Review result: Not Ready

I have been selected as the Transport Directorate reviewer for this draft. The Transport Directorate seeks to review all transport or transport-related drafts as they pass through IETF last call and IESG review, and sometimes on special request. The purpose of the review is to provide assistance to the Transport ADs. For more information about the Transport Directorate Reviews and the Transport Area Review Team, please see ​https://trac.ietf.org/trac/tsv/wiki/TSV-Directorate-Reviews

In this case, very very few of the review comments relate to transport issues, although the greatest issue concerns a desire that the network could pause or stop connections during L3 VM Mobility, which is certainly a transport issue.

[Linda] There is “Hot Migration”, in which the transport service continues, and there is “Cold Migration”, a common practice in many data centers, which stops the task running at the old place and moves it to the new place before restarting it, as described in the Task Migration section.
Is it helpful to add this description to the draft?


==Summary==

The technical aspects of the draft concerning L2 VM mobility (within a subnet) seem sound. However, this is only part of the draft, which has the following
issues:

#. The introduction does not say what the purpose of publishing this draft is.
It seems that, rather than describing a specific protocol or protocols, it intends to describe the overall system procedure that would typically be used in DCs for VM mobility. It is tagged as a BCP, but it does not say who needs this BCP, why it is useful for the IETF to publish this BCP, how wide the authors' knowledge is of current practice (given DCs are private), or why this is a BCP rather than a protocol spec.

[Linda] The first paragraph on Page 3 has the description why VM Mobility is needed. Is it helpful to move this paragraph to the beginning of the Introduction Section?
“Virtualization which is being used in almost all of today’s data
centers enables many virtual machines to run on a single physical
computer or compute server. Virtual machines (VM) need hypervisor
running on the physical compute server to provide them shared
processor/memory/storage. Network connectivity is provided by the
network virtualization edge (NVE) [RFC8014]. Being able to move VMs
dynamically, or live migration, from one server to another allows for
dynamic load balancing or work distribution and thus it is a highly
desirable feature [RFC7364].”


The draft starts out (S.3) as if it intends to say what a good VM Mobility protocol should or shouldn't do, but the rest of the document doesn't give any reasoning for these recommendations, it just asserts what appears to be one view of how a whole VM Mobility system works, sometimes referring to one example protocol RFC for a component part, but more often with no references or details.

[Linda] Is it helpful to move the paragraph above to the beginning of the Introduction Section? So that audience is aware of why VM Mobility is needed. And then follow up with what a good VM Mobility protocol should or shouldn't do?

#. It does not seem as if the NVO WG has discussed the purpose of using normative text in this draft. See detailed comments.

[Linda] The “Intended status” of the draft is “Best Current Practice”. So none of the text is “normative”. Is that okay?

#. The draft silently slips back and forth between VM mobility and VM redundancy, without recognizing the differences. See detailed comments.

[Linda] There is only one usage of “redundancy” in the entire document, used under the context of “Hot standby option”, indicating the “redundancy” of “the VMs in both primary and secondary domains have identical information and can provide services simultaneously as in load-share mode of operation” being expensive.

#. Please adopt different terminology than "source NVE" and "destination NVE", which are really poor choices of terms for an intermediate node. See detailed comments. Why not use "old NVE" and "new NVE", which is what you mean?
[Linda] Thanks for the suggestion. We will change to “Old NVE”, and “new NVE”.

#. Applicability is fairly clearly outlined, but it is not clear whether hosts corresponding with the mobile VMs are part of the same controlled environment or on the uncontrolled public Internet. See detailed comments.
[Linda] “Hosts” are the apps running on the VM. They are under the same controlled environment, not on the uncontrolled public Internet.


#. Section 4.2.1 on L3 VM mobility reads like some potential half-thought-through ideas on how to solve L3 mobility, rather than current practice, let alone best current practice. Either current practice should be described instead, or the scope of the draft should be narrowed solely to L2 VM mobility. See detailed comments.
[Linda] This is referring to “Cold Migration”, which is a common practice in many data centers.

# The VM's file system is described as state that moves with the VM (S.6), but VM mobility solutions often move the VM but stitch it back to its (unmoved) storage. Conversely, the storage can also move independent of the VM.
[Linda] It depends. When a VM moves to a different zone, the storage/file can become inaccessible.

#. The draft omits some of the security, transport and management aspects of VM mobility. See detailed comments.
[Linda] Can you provide some text?

#. The draft reads as if different sections have been written by different authors and no-one has edited the whole to give it a coherent structure, or to ensure consistency (both technical and editorial) between the parts. See detailed comments.

[Linda] we can improve.


#. The quality of the English grammar does not allow a reviewer to concentrate on the technical aspects rather than the English. It would have been useful if one of the English-speaking co-authors had improved the English before submission for review. See detailed comments.
[Linda] can you help? Becoming a co-author to improve?

==Detailed Comments==

===#. Normative statements===

In the body of the document, there is just one occurrence of normative text (actually two "MUST"s, but both state a common requirement - just written separately for IPv4 and IPv6). This merely serves to imply that everything else the document says is less important or optional, which was probably not the intention.
[Linda] The goal is to indicate that any solution for moving the VM “MUST” follow this rule. These make sense, don’t they?

At the start there is a requirements section, which states what a VM Mobility protocol "SHOULD" or "SHOULD NOT" do. I think this is intended as a set of goals for the rest of the document. If so, these "SHOULDs" are not intended to apply to implementations, so they ought not to be capitalized.

[Linda] okay, will change.


The first requirement, "Data center network SHOULD support virtual machine mobility in IPv6", is written as a requirement on all DC networks, not on implementations. I assume this was intended to read as "Data center network virtual machine mobility protocols SHOULD support IPv6". Even then, it doesn't really add anything to say VM mobility should support v6 and it should support v4. A L2 solution won't. While undoubtedly, a L3 solution will at least support one of them.
[Linda] Agree. Will change it to “Data center that support IPv6 address should ”

I'm not sure that 'protocol' is the right word anyway; I think 'VM Mobility procedure' would be a better phrase, because it includes steps such as suspending the VM, which is more than a protocol.
[Linda] yes. Will change to “Procedure”.

The requirement "Virtual machine mobility protocol MAY support host routes to accomplish virtualization", is not followed up at all in the rest of the draft.
Even if this requirement stays, the last 3 words should be deleted.

[Linda] will change to “Host Route can be used to support the Virtual Machine Mobility Procedure.”

By the end of the draft, the solution falls far short of the most relevant "Requirements" anyway, so one assumes the title of the section ought to have been "Goals". Specifically, even in the simpler case of L2 VM mobility, S.4.1 says that triangular routing and tunnelling persist "until a neighbour cache entry times out". A cache timeout is about 10 orders of magnitude longer than the requirement to only persist "while handling packets in flight", which would be a few milliseconds at most (the time for packets to clear the network that were already launched into flight when the old VM stopped).

Whatever, it would be preferable for the draft to give rationale for these requirements, rather than just assert them. This would help to shed light on the merits of the different trade offs that solutions choose.

[Linda] Agree, will add.

===#. Mobility vs. Redundancy===

Redundancy and mobility have a lot of similarities, but they have different goals. With mobility, it is necessary to know the exact instant when one set of state is identical to the other so it can hand over. With redundancy, the aim is to keep two (or more) sets of state evolving through the same sequence of changes, but there is no need to know the point at which one is the same as the other was at a certain point.
[Linda] Agree with what you said. There is only one usage of “redundancy” in the entire document, used under the context of “Hot standby option”, indicating the “redundancy” of “the VMs in both primary and secondary domains have identical information and can provide services simultaneously as in load-share mode of operation” being expensive.

The draft slips from mobility to resilience in the following places:
* S.2. Terminology: Warm VM Mobility is defined without any ending, as if it is permanent replication.
* S.7. "Handling of Hot, Warm and Cold Virtual Machine Mobility" is actually all about redundancy, and doesn't address mobility explicitly.

[Linda] Will add the definition “Hot Migration”, “cold migration”, and “warm migration”.

===#. Terminology===

Packets run from the source at A to the destination at B via NVE1, then via NVE2. Please don't call NVE1 and NVE2 the source NVE and the destination NVE.
In future, no-one will thank you for the apparent contradictions when they continually stumble over phrases like this one in S.4.1: "...send their packets to the source NVE".

The term "packets in flight" is used incorrectly to refer to all the packets sent to the old NVE after the VM has moved, even if they were launched into flight long after the old VM stopped receiving packets.

[Linda] thanks for the comments. Will change.

BTW, I think s/before/after/ in: "that have old ARP or neighbor cache entry before VM or task migration".

I think: s/IP-based VM mobility/L3 VM mobility/ throughout, because "based"
sounds (to me) like the mobility control protocol is over (i.e. based on) IP.

===#. Applicability===

In section 4.2 it says that the protocol mostly used as the IP based task migration protocol is ILA. This implies that all hosts corresponding with the mobile VMs are either part of the same controlled environment, or they are proxied via nodes that are part of the same controlled environment (I only have passing knowledge of ILA, but I understand that it depends on ILA routers on the path). If I am correct, this aspect of scope needs to be made clear from the start.

Also under the heading of applicability, the sentence "Since migrations should be relatively rare events" appears very late in the document (S.4.2.1). The assumed level of churn ought to be stated nearer the start.

[Linda] yes, under the same controlled environment.

Linda Dunbar
2018-09-14 02:31:45 UTC
Permalink
David,

Thank you for the suggestion, we will change the status to “Informational”.

Linda

From: Black, David [mailto:***@dell.com]
Sent: Thursday, September 13, 2018 8:03 PM
To: Linda Dunbar <***@huawei.com>; Bob Briscoe <***@bobbriscoe.net>; tsv-***@ietf.org
Cc: ***@ietf.org; ***@ietf.org; draft-ietf-nvo3-***@ietf.org; Black, David <***@dell.com>
Subject: RE: Tsvart last call review of draft-ietf-nvo3-vmm-04
Post by Linda Dunbar
#. It does not seem as if the NVO WG has discussed the purpose of using normative text in this draft. See detailed comments.
[Linda] The “Intended status” of the draft is “Best Current Practice”. So none of the text is “normative”. Is that okay?
Not really – this draft might be better targeted as “Informational” as it is not a comprehensive review of current practice (best or otherwise) nor an overall set of recommendations, e.g., as Bob wrote “it just asserts what appears to be one view of how a whole VM Mobility system works.”

Thanks, --David

From: Tsv-art [mailto:tsv-art-***@ietf.org] On Behalf Of Linda Dunbar
Sent: Monday, September 10, 2018 2:17 PM
To: Bob Briscoe; tsv-***@ietf.org<mailto:tsv-***@ietf.org>
Cc: ***@ietf.org<mailto:***@ietf.org>; ***@ietf.org<mailto:***@ietf.org>; draft-ietf-nvo3-***@ietf.org<mailto:draft-ietf-nvo3-***@ietf.org>
Subject: Re: [Tsv-art] Tsvart last call review of draft-ietf-nvo3-vmm-04

Bob,

Thank you very much for reviewing the draft and providing in-depth comments. I am very sorry for the delayed response due to traveling.

Replies to your comments are inserted below marked by [Linda]:


-----Original Message-----
From: Bob Briscoe [mailto:***@bobbriscoe.net]
Sent: Monday, September 03, 2018 9:45 PM
To: tsv-***@ietf.org<mailto:tsv-***@ietf.org>
Cc: ***@ietf.org<mailto:***@ietf.org>; ***@ietf.org<mailto:***@ietf.org>; draft-ietf-nvo3-***@ietf.org<mailto:draft-ietf-nvo3-***@ietf.org>
Subject: Tsvart last call review of draft-ietf-nvo3-vmm-04

Reviewer: Bob Briscoe
Review result: Not Ready

I have been selected as the Transport Directorate reviewer for this draft. The Transport Directorate seeks to review all transport or transport-related drafts as they pass through IETF last call and IESG review, and sometimes on special request. The purpose of the review is to provide assistance to the Transport ADs. For more information about the Transport Directorate Reviews and the Transport Area Review Team, please see ​https://trac.ietf.org/trac/tsv/wiki/TSV-Directorate-Reviews

In this case, very very few of the review comments relate to transport issues, although the greatest issue concerns a desire that the network could pause or stop connections during L3 VM Mobility, which is certainly a transport issue.

[Linda] There is “Hot Migration”, in which the transport service continues, and there is “Cold Migration”, a common practice in many data centers, which stops the task running at the old place and moves it to the new place before restarting it, as described in the Task Migration section.
Is it helpful to add this description to the draft?


==Summary==

The technical aspects of the draft concerning L2 VM mobility (within a subnet) seem sound. However, this is only part of the draft, which has the following issues:

#. The introduction does not say what the purpose of publishing this draft is.
It seems that, rather than describing a specific protocol or protocols, it intends to describe the overall system procedure that would typically be used in DCs for VM mobility. It is tagged as a BCP, but it does not say who needs this BCP, why it is useful for the IETF to publish this BCP, how wide the authors' knowledge is of current practice (given DCs are private), or why this is a BCP rather than a protocol spec.

[Linda] The first paragraph on Page 3 has the description why VM Mobility is needed. Is it helpful to move this paragraph to the beginning of the Introduction Section?
“Virtualization which is being used in almost all of today’s data
centers enables many virtual machines to run on a single physical
computer or compute server. Virtual machines (VM) need hypervisor
running on the physical compute server to provide them shared
processor/memory/storage. Network connectivity is provided by the
network virtualization edge (NVE) [RFC8014]. Being able to move VMs
dynamically, or live migration, from one server to another allows for
dynamic load balancing or work distribution and thus it is a highly
desirable feature [RFC7364].”


The draft starts out (S.3) as if it intends to say what a good VM Mobility protocol should or shouldn't do, but the rest of the document doesn't give any reasoning for these recommendations, it just asserts what appears to be one view of how a whole VM Mobility system works, sometimes referring to one example protocol RFC for a component part, but more often with no references or details.

[Linda] Is it helpful to move the paragraph above to the beginning of the Introduction Section? So that audience is aware of why VM Mobility is needed. And then follow up with what a good VM Mobility protocol should or shouldn't do?

#. It does not seem as if the NVO WG has discussed the purpose of using normative text in this draft. See detailed comments.

[Linda] The “Intended status” of the draft is “Best Current Practice”. So all the text are not “normative”. Is it Okay?

#. The draft silently slips back and forth between VM mobility and VM redundancy, without recognizing the differences. See detailed comments.

[Linda] There is only one usage of “redundancy” in the entire document, used under the context of “Hot standby option”, indicating the “redundancy” of “the VMs in both primary and secondary domains have identical information and can provide services simultaneously as in load-share mode of operation” being expensive.

#. Please adopt different terminology than "source NVE" and "destination NVE", which are really poor choices of terms for an intermediate node. See detailed comments. Why not use "old NVE" and "new NVE", which is what you mean?
[Linda] Thanks for the suggestion. We will change to “Old NVE”, and “new NVE”.

#. Applicability is fairly clearly outlined, but it is not clear whether hosts corresponding with the mobile VMs are part of the same controlled environment or on the uncontrolled public Internet. See detailed comments.
[Linda] “Hosts” are the applications running on the VM. They are under the same controlled environment, not on the uncontrolled public Internet.


#. Section 4.2.1 on L3 VM mobility reads like some potential half-thought-through ideas on how to solve L3 mobility, rather than current practice, let alone best current practice. Either current practice should be described instead, or the scope of the draft should be narrowed solely to L2 VM mobility. See detailed comments.
[Linda] This is referring to “Cold Migration”, which is a common practice in many data centers.

#. The VM's file system is described as state that moves with the VM (S.6), but VM mobility solutions often move the VM but stitch it back to its (unmoved) storage. Conversely, the storage can also move independent of the VM.
[Linda] It depends. When a VM moves to a different zone, the storage/file can become inaccessible.

#. The draft omits some of the security, transport and management aspects of VM mobility. See detailed comments.
[Linda] Can you provide some text?

#. The draft reads as if different sections have been written by different authors and no-one has edited the whole to give it a coherent structure, or to ensure consistency (both technical and editorial) between the parts. See detailed comments.

[Linda] we can improve.


#. The quality of the English grammar does not allow a reviewer to concentrate on the technical aspects rather than the English. It would have been useful if one of the English-speaking co-authors had improved the English before submission for review. See detailed comments.
[Linda] can you help? Becoming a co-author to improve?

==Detailed Comments==

===#. Normative statements===

In the body of the document, there is just one occurrence of normative text (actually two "MUST"s, but both state a common requirement - just written separately for IPv4 and IPv6). This merely serves to imply that everything else the document says is less important or optional, which was probably not the intention.
[Linda] The goal is to indicate that any solution for moving the VM “MUST” follow this rule. They make sense, don’t they?

At the start there is a requirements section, which states what a VM Mobility protocol "SHOULD" or "SHOULD NOT" do. I think this is intended as a set of goals for the rest of the document. If so, these "SHOULDs" are not intended to apply to implementations, so they ought not to be capitalized.

[Linda] okay, will change.


The first requirement, "Data center network SHOULD support virtual machine mobility in IPv6", is written as a requirement on all DC networks, not on implementations. I assume this was intended to read as "Data center network virtual machine mobility protocols SHOULD support IPv6". Even then, it doesn't really add anything to say VM mobility should support v6 and it should support v4. A L2 solution won't. While undoubtedly, a L3 solution will at least support one of them.
[Linda]Agree. Will change it to “Data center that support IPv6 address should 
”

I'm not sure that 'protocol' is the right word anyway; I think 'VM Mobility procedure' would be a better phrase, because it includes steps such as suspending the VM, which is more than a protocol.
[Linda] yes. Will change to “Procedure”.

The requirement "Virtual machine mobility protocol MAY support host routes to accomplish virtualization", is not followed up at all in the rest of the draft.
Even if this requirement stays, the last 3 words should be deleted.

[Linda] will change to “Host Route can be used to support the Virtual Machine Mobility Procedure.”

By the end of the draft, the solution falls far short of the most relevant "Requirements" anyway, so one assumes the title of the section ought to have been "Goals". Specifically, even in the simpler case of L2 VM mobility, S.4.1 says that triangular routing and tunnelling persist "until a neighbour cache entry times out". A cache timeout is about 10 orders of magnitude longer than the requirement to only persist "while handling packets in flight", which would be a few milliseconds at most (the time for packets to clear the network that were already launched into flight when the old VM stopped).

Whatever, it would be preferable for the draft to give rationale for these requirements, rather than just assert them. This would help to shed light on the merits of the different trade offs that solutions choose.

[Linda] Agree, will add.

===#. Mobility vs. Redundancy===

Redundancy and mobility have a lot of similarities, but they have different goals. With mobility, it is necessary to know the exact instant when one set of state is identical to the other so it can hand over. With redundancy, the aim is to keep two (or more) sets of state evolving through the same sequence of changes, but there is no need to know the point at which one is the same as the other was at a certain point.
[Linda] Agree with what you said. There is only one usage of “redundancy” in the entire document, used under the context of “Hot standby option”, indicating the “redundancy” of “the VMs in both primary and secondary domains have identical information and can provide services simultaneously as in load-share mode of operation” being expensive.

The draft slips from mobility to resilience in the following places:
* S.2. Terminology: Warm VM Mobility is defined without any ending, as if it is permanent replication.
* S.7. "Handling of Hot, Warm and Cold Virtual Machine Mobility" is actually all about redundancy, and doesn't address mobility explicitly.

[Linda] Will add definitions of “Hot Migration”, “Cold Migration”, and “Warm Migration”.

===#. Terminology===

Packets run from the source at A to the destination at B via NVE1, then via NVE2. Please don't call NVE1 and NVE2 the source NVE and the destination NVE.
In future, no-one will thank you for the apparent contradictions when they continually stumble over phrases like this one in S.4.1: "...send their packets to the source NVE".

The term "packets in flight" is used incorrectly to refer to all the packets sent to the old NVE after the VM has moved, even if they were launched into flight long after the old VM stopped receiving packets.

[Linda] Thanks for the comments. Will change.

BTW, I think s/before/after/ in: "that have old ARP or neighbor cache entry before VM or task migration".

I think: s/IP-based VM mobility/L3 VM mobility/ throughout, because "based" sounds (to me) like the mobility control protocol is over (i.e. based on) IP.

===#. Applicability===

In section 4.2 it says that the protocol mostly used as the IP based task migration protocol is ILA. This implies that all hosts corresponding with the mobile VMs are either part of the same controlled environment, or they are proxied via nodes that are part of the same controlled environment (I only have passing knowledge of ILA, but I understand that it depends on ILA routers on the path). If I am correct, this aspect of scope needs to be made clear from the start.

Also under the heading of applicability, the sentence "Since migrations should be relatively rare events" appears very late in the document (S.4.2.1). The assumed level of churn ought to be stated nearer the start.

[Linda] yes, under the same controlled environment.

===#. L3 Mobility===
L2 VM mobility is independent of the application, because resolution of L2 mappings is delegated to the stack. In contrast, L3 VM mobility is only feasible under certain conditions, because an application needs an IP address to open a socket (resolution of DNS names is not delegated to the stack, and apps can use IP addresses directly anyway).

Examples of the 'certain conditions':
a) /All/ applications used in the whole DC load balancing scheme contain IP address migration logic for /all/ their connections;
b) VMs running solely applications that support IP address migration register this fact with the NVA, and it only selects such VMs for mobility.
c) An abstraction is layered over /all/ the IP addresses exposed to applications (at both ends) so that the IP addresses that applications use are solely identifiers (e.g. ILA, LISP, HIP), not also locators.
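
Condition (b) could be sketched roughly as follows. This is only an illustration under stated assumptions: the `VM` record, the per-application support flag and the `eligible_for_l3_mobility` helper are hypothetical names, not anything defined by the draft or by an NVA implementation.

```python
from dataclasses import dataclass, field


@dataclass
class VM:
    """Hypothetical registration record held by the NVA for one VM."""
    name: str
    # Maps application name -> True if the app declares that it can
    # survive an IP address change (assumed, self-reported flag).
    apps: dict = field(default_factory=dict)

    def migration_capable(self) -> bool:
        # Conservative choice: a VM with no registered apps is not
        # treated as capable, because nothing is known about it.
        return bool(self.apps) and all(self.apps.values())


def eligible_for_l3_mobility(vms):
    """Return the names of VMs running solely migration-aware apps."""
    return [vm.name for vm in vms if vm.migration_capable()]


if __name__ == "__main__":
    fleet = [
        VM("vm-a", {"web": True, "cache": True}),
        VM("vm-b", {"web": True, "legacy-db": False}),
        VM("vm-c"),
    ]
    print(eligible_for_l3_mobility(fleet))
```

A single application without migration support is enough to exclude the whole VM, which gives a feel for how restrictive the condition is.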

The introduction says the draft is about VM mobility in a multi-tenant DC, so the DC admin will not know the range of applications being used. This excludes condition (a) above. When the draft says "...if all applications running are known to handle this gracefully...", it doesn't quantify just how restrictive this condition is, and it gives no explanation of how this knowledge might be 'known' or which function within the system 'knows' it.

S.4.2.1 contains what seems like plenty of arm-waving.
* "TCP connections could be automatically closed in the network stack during a migration event."
o There is no TCP connection state in the network stack.
o Even if the network starts to drop every packet, the TCP connection
state persists in the end-points for a duration of the order of 30-90
minutes (OS-dependent) before TCP deems the connection is broken.
o Other transport protocols have similar designs (including the app-layer
of protocols over UDP).
* "More involved approach to connection migration":
o pausing the connection [does this refer to an actual feature of any
L4 protocol?]
o packaging connection state and sending to target [does this assume
logic written into the application, or is this assuming the stack
handles this and the app is restricted to using some form of separate
identifier/locator addresses?]
o instantiating connection state in the peer stack [ditto?].

There's some arm-waving in S.7 too:
"Cold Virtual Machine mobility is facilitated by the VM initially
sending an ARP or Neighbor Discovery message at the destination NVE
but the source NVE not receiving any packets inflight."
[How is it arranged for the source NVE not to receive any packets in flight?]

And in S.7:
"In hot
standby option, regarding TCP connections, one option is to start
with and maintain TCP connections to two different VMs at the same
time."
[This sounds like resilience logic has been written into the application,
which would be a special case but not something VM mobility infrastructure
could depend on.]

[Linda] will add.

===#. Gaps===
#. Security Considerations: repeats issues in other drafts that are not specific to mobility, but it does not mention any security issues specifically due to VM mobility. It says that address spoofing may arise in a DC (sort-of implying it is worse than in non-DC environments, but not saying why). The handshake at the start of a connection (e.g. TCP, SCTP, QUIC) checks for source address spoofing. So L3 VM mobility would be more vulnerable to source address spoofing in cases where the mobile VM was the connection initiator and there was not a new handshake after the move. However, this draft does not contain any detailed mobility protocols, so it is not possible to identify any specific security flaws.

#. Transport Issues: Effect of delay on the transport: Cold mobility introduces significant delay, and other forms less, but still some delay. It should be pointed out that some applications (e.g. real-time) will therefore not be useful if subjected to VM mobility. Similarly, even a short period of delay will drive most congestion controls to severely reduce throughput. These points might be self-evident, but perhaps they should be stated explicitly.

BTW, in the L3 VM mobility case, the draft often refers to TCP connections, but the address bindings of any transport protocols would have to be migrated due to VM mobility (e.g. SCTP; sequences of datagrams over UDP; streams over UDP such as with RTP, QUIC).

#. Management Issues: perhaps the draft ought to recommend statistics gathering (e.g. time taken, amount of duplicate data) to aid a DC's future decisions on the cost-benefit of moving a VM. The OPSDIR review says a BCP does not /have/ to describe management issues, but this document seems to describe a whole system procedure, not just a protocol, which then surely includes the management plane.
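
As a rough sketch of the kind of statistics gathering meant here, assuming one record is kept per completed move: the `MigrationRecord` fields and the `summarize` helper are hypothetical and only illustrate "time taken" and "amount of duplicate data"; the draft defines neither.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class MigrationRecord:
    """Hypothetical per-migration measurement kept by the DC operator."""
    vm_id: str
    started_s: float       # time the move began (seconds)
    finished_s: float      # time the VM was serving at the new location
    bytes_total: int       # all bytes transferred for the move
    bytes_duplicate: int   # bytes sent more than once (e.g. re-copied state)

    @property
    def duration_s(self) -> float:
        return self.finished_s - self.started_s

    @property
    def duplicate_fraction(self) -> float:
        # Guard against division by zero for an empty transfer.
        return self.bytes_duplicate / self.bytes_total if self.bytes_total else 0.0


def summarize(records):
    """Aggregate records into figures usable for a cost-benefit decision."""
    if not records:
        return {"count": 0, "mean_duration_s": 0.0, "mean_duplicate_fraction": 0.0}
    return {
        "count": len(records),
        "mean_duration_s": mean(r.duration_s for r in records),
        "mean_duplicate_fraction": mean(r.duplicate_fraction for r in records),
    }
```

Which quantities are worth recording in practice would depend on the deployment; the two above are simply the ones named in the comment.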

[Linda] can you become a co-author and add those in?

===#. Incoherent Structure===

S.4.1. happens to talk about VMs moving, while S.4.2. happens to talk about tasks moving, but this is not the distinguishing aspect of these two sections (anyway, S.2. says "the draft uses task and VM interchangeably"):
* "4.1 VM Migration" is about "L2 VM Mobility" so this ought to be the section heading,
* "4.2 Task Migration" is about "L3 VM Mobility" so this ought to be the section heading.
It would also help not to switch from VM to task across these sections - it's just a distraction.

S.4.1 needs better signposting of where each sub-case ends (Subsections might be useful to solve this):
* IPv4
* end-user client
* 2 paras starting "All NVEs communicating with this virtual machine..." [Not clear that the end-user case has ended and we have returned to the general IPv4 case?]
* IPv6 [Strictly, it still hasn't said whether the end-user client case has ended.] [Also, it doesn't explain why there is no need for an end-user client case under IPv6?]

Sections 5 & 6 seem to be about either L2 or L3 mobility, whereas Sections 7 & 8 seem to be restricted to L2.

The draft vacillates over what to do with packets arriving at the old NVE in the L3 case (see also L3 mobility above):
* S4.2 first says packets are dropped, possibly with an ICMP error message;
o then later it says they are silently dropped;
o then in the very next sentence it says either silently drop them or forward them to the new location
* S.5 says they should not be lost, but instead delivered to the destination hypervisor
o then it describes how they are tunnelled (which is not the same as "forwarding").

The order in which all the stages of mobility are given is jumbled up across sections that also appear in arbitrary order:
* S.5 prepares, establishes, uses then stops a tunnel, but it doesn't say where the other stages fit between these steps
o When tunneling packets, it talks about the *migrating* VM not the *migrated* VM, which implies tunnelling has started before the new VM is running. Does this imply there is a huge buffer?
o It says "Stop Tunneling Packets - When source NVE stops receiving packets destined to..." but it is never clear when a source has stopped sending packets to a destination, unless it explicitly closes the connection (e.g. with a FIN in the case of TCP). Often there are long gaps between packets, because many flows are 'thin' (meaning the application frequently has nothing to send). These gaps can last for milliseconds, hours or even days without any implication that the connection has ended.
* Then S.6. describes moving state, but doesn't say that this is not after the previous tunnelling steps (or where it fits within those steps).
* Then S.7 describes hot, warm and cold mobility, but doesn't lay out the tunnelling or steps to move state in each case.
* Then S.8 says it's about VM life-cycle, but just gives the very first 3 steps for allocation of resources to a VM, then abruptly ends, without even starting the VM, let alone getting to move it.
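
The point about 'thin' flows can be illustrated with a small sketch. It assumes, purely for illustration, that the old NVE stops tunnelling once it has seen no packets for some idle timeout; the function name and the timeout are hypothetical, and the draft does not say how "stops receiving packets" would be detected.

```python
def packets_after_premature_stop(arrival_times_s, idle_timeout_s):
    """Return arrival times of packets that follow a silence longer than
    idle_timeout_s.

    Under the assumed idle-timeout rule, each such packet reaches the old
    NVE after it has already decided the flow was over, even though the
    connection was never closed.
    """
    late = []
    for prev, cur in zip(arrival_times_s, arrival_times_s[1:]):
        if cur - prev > idle_timeout_s:
            late.append(cur)
    return late


if __name__ == "__main__":
    # A thin flow: two short bursts and one packet an hour later.
    arrivals = [0.0, 0.01, 5.0, 5.02, 3600.0]
    print(packets_after_premature_stop(arrivals, idle_timeout_s=1.0))
```

Raising the timeout only moves the problem, because gaps of hours or days are still consistent with a live connection.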

S.5 exhibits another inconsistency by talking about the hypervisor, not the NVE.

===#. Nits===

Nits with the English are too numerous to mention them all. Below are pointers to general problems as well as some individual instances.

S.4
"Layer 2 and Layer 3 protocols are described next. In the following
sections, we examine more advanced features."
s/following/subsequent/

S.4.1
Expand WSC, MSC and NVA on first use.

s/the VM moves in the same link/the VM moves in the same subnet/

"i.e. end-user clients ask for the same MAC address upon migration. [...] to ensure that the same IPv4 address is assigned to the VM." I think s/IPv4/MAC/ was intended?

" All NVEs communicating with this virtual machine uses the old ARP
entry. If any VM in those NVEs need to talk to the new VM in the
destination NVE, it uses the old ARP entry."
Repetition: these 2 sentences say the same. (The mistake is also repeated when these 2 sentences are repeated for IPv6).

S.4.2.1
s/Push the new mapping to hosts./Push the new mapping to communicating hosts./

S.5.
The IPv4/IPv6 pairs of paras for "tunnel establishment" and "tunneling packets" only differ in the words "IPv4"/"IPv6". So in each case a single para could be given for IP (irrespective of whether v4 or v6).

Thank you very much.

Linda Dunbar
Behcet Sarikaya
2018-09-14 14:25:21 UTC
Permalink
Post by Bob Briscoe
#. It does not seem as if the NVO WG has discussed the purpose of using
normative text in this draft. See detailed comments.
Post by Bob Briscoe
[Linda] The “Intended status” of the draft is “Best Current Practice”.
So all the text are not “normative”. Is it Okay?
Not really – this draft might be better targeted as “Informational” as it
is not a comprehensive review of current practice (best or otherwise) nor
an overall set of recommendations, e.g., as Bob wrote “it just asserts what
appears to be one view of how a whole VM Mobility system works.”
At present we have no intention of changing the intended status because
that decision should be deferred until IESG Review where we expect to
receive an authoritative view.

Behcet
Post by Bob Briscoe
Thanks, --David
Dunbar
*Sent:* Monday, September 10, 2018 2:17 PM
*Subject:* Re: [Tsv-art] Tsvart last call review of draft-ietf-nvo3-vmm-04
Bob,
Thank you very much for reviewing the draft and providing in-depth
comments. I am very sorry for the delayed response due to traveling.
-----Original Message-----
Sent: Monday, September 03, 2018 9:45 PM
Subject: Tsvart last call review of draft-ietf-nvo3-vmm-04
Reviewer: Bob Briscoe
Review result: Not Ready
I have been selected as the Transport Directorate reviewer for this draft.
The Transport Directorate seeks to review all transport or
transport-related drafts as they pass through IETF last call and IESG
review, and sometimes on special request. The purpose of the review is to
provide assistance to the Transport ADs. For more information about the
Transport Directorate Reviews and the Transport Area Review Team, please
see https://trac.ietf.org/trac/tsv/wiki/TSV-Directorate-Reviews
In this case, very very few of the review comments relate to transport
issues, although the greatest issue concerns a desire that the network
could pause or stop connections during L3 VM Mobility, which is certainly a
transport issue.
[Linda] There is “Hot Migration”, in which transport service continues, and
there is “Cold Migration”, a common practice in many data centers, which
stops the task at the old location and moves it to the new location before
restarting it, as described in the Task Migration section.
Would it be helpful to add this description to the draft?
==Summary==
The technical aspects of the draft concerning L2 VM mobility (within a
subnet) seem sound. However, this is only part of the draft, which has the
following issues:
#. The introduction does not say what the purpose of publishing this draft is.
It seems that, rather than describing a specific protocol or protocols, it
intends to describe the overall system procedure that would typically be
used in DCs for VM mobility. It is tagged as a BCP, but it does not say who
needs this BCP, why it is useful for the IETF to publish this BCP, how wide
the authors' knowledge is of current practice (given DCs are private), or
why this is a BCP rather than a protocol spec.
[Linda] The first paragraph on Page 3 has the description why VM Mobility
is needed. Is it helpful to move this paragraph to the beginning of the
Introduction Section?
*“Virtualization which is being used in almost all of today’s data*
*centers enables many virtual machines to run on a single physical*
*computer or compute server. Virtual machines (VM) need hypervisor*
*running on the physical compute server to provide them shared*
*processor/memory/storage. Network connectivity is provided by the*
*network virtualization edge (NVE) [RFC8014]. Being able to move VMs*
*dynamically, or live migration, from one server to another allows for*
*dynamic load balancing or work distribution and thus it is a highly*
*desirable feature [RFC7364].”*
The draft starts out (S.3) as if it intends to say what a good VM Mobility
protocol should or shouldn't do, but the rest of the document doesn't give
any reasoning for these recommendations, it just asserts what appears to be
one view of how a whole VM Mobility system works, sometimes referring to
one example protocol RFC for a component part, but more often with no
references or details.
[Linda] Is it helpful to move the paragraph above to the beginning of the
Introduction Section? So that audience is aware of why VM Mobility is
needed. And then follow up with what a good VM Mobility protocol should or
shouldn't do?
#. It does not seem as if the NVO WG has discussed the purpose of using
normative text in this draft. See detailed comments.
[Linda] The “Intended status” of the draft is “Best Current Practice”. So
all the text are not “normative”. Is it Okay?
#. The draft silently slips back and forth between VM mobility and VM
redundancy, without recognizing the differences. See detailed comments.
[Linda] There is only one usage of “redundancy” in the entire document,
used under the context of “Hot standby option”, indicating the
“redundancy” of “the VMs in both primary and secondary domains have
identical information and can provide services simultaneously as in
load-share mode of operation” being expensive.
#. Please adopt different terminology than "source NVE" and "destination
NVE", which are really poor choices of terms for an intermediate node. See
detailed comments. Why not use "old NVE" and "new NVE", which is what you
mean?
[Linda] Thanks for the suggestion. We will change to “Old NVE”, and “new
NVE”.
#. Applicability is fairly clearly outlined, but it is not clear whether
hosts corresponding with the mobile VMs are part of the same controlled
environment or on the uncontrolled public Internet. See detailed comments.
[Linda] “Hosts” are the applications running on the VM. They are under the
same controlled environment, not on the uncontrolled public Internet.
#. Section 4.2.1 on L3 VM mobility reads like some potential
half-thought-through ideas on how to solve L3 mobility, rather than current
practice, let alone best current practice. Either current practice should
be described instead, or the scope of the draft should be narrowed solely
to L2 VM mobility. See detailed comments.
[Linda] This is referring to “Cold Migration”, which is a common practice
in many data centers.
#. The VM's file system is described as state that moves with the VM (S.6),
but VM mobility solutions often move the VM but stitch it back to its
(unmoved) storage. Conversely, the storage can also move independent of the
VM.
[Linda] It depends. When a VM moves to a different zone, the storage/file
can become inaccessible.
#. The draft omits some of the security, transport and management aspects
of VM mobility. See detailed comments.
[Linda] Can you provide some text?
#. The draft reads as if different sections have been written by different
authors and no-one has edited the whole to give it a coherent structure, or
to ensure consistency (both technical and editorial) between the parts. See
detailed comments.
[Linda] we can improve.
#. The quality of the English grammar does not allow a reviewer to
concentrate on the technical aspects rather than the English. It would have
been useful if one of the English-speaking co-authors had improved the
English before submission for review. See detailed comments.
[Linda] can you help? Becoming a co-author to improve?
==Detailed Comments==
===#. Normative statements===
In the body of the document, there is just one occurrence of normative
text (actually two "MUST"s, but both state a common requirement - just
written separately for IPv4 and IPv6). This merely serves to imply that
everything else the document says is less important or optional, which was
probably not the intention.
[Linda] The goal is to indicate that any solution for moving the VM “MUST”
follow this rule. They make sense, don’t they?
At the start there is a requirements section, which states what a VM
Mobility protocol "SHOULD" or "SHOULD NOT" do. I think this is intended as
a set of goals for the rest of the document. If so, these "SHOULDs" are not
intended to apply to implementations, so they ought not to be capitalized.
[Linda] okay, will change.
The first requirement, "Data center network SHOULD support virtual machine
mobility in IPv6", is written as a requirement on all DC networks, not on
implementations. I assume this was intended to read as "Data center network
virtual machine mobility protocols SHOULD support IPv6". Even then, it
doesn't really add anything to say VM mobility should support v6 and it
should support v4. A L2 solution won't. While undoubtedly, a L3 solution
will at least support one of them.
[Linda]Agree. Will change it to “Data center that support IPv6 address
should 
”
I'm not sure that 'protocol' is the right word anyway; I think 'VM
Mobility procedure' would be a better phrase, because it includes steps
such as suspending the VM, which is more than a protocol.
[Linda] yes. Will change to “Procedure”.
The requirement "Virtual machine mobility protocol MAY support host routes
to accomplish virtualization", is not followed up at all in the rest of the
draft.
Even if this requirement stays, the last 3 words should be deleted.
[Linda] will change to “Host Route can be used to support the Virtual
Machine Mobility Procedure.”
By the end of the draft, the solution falls far short of the most relevant
"Requirements" anyway, so one assumes the title of the section ought to
have been "Goals". Specifically, even in the simpler case of L2 VM
mobility, S.4.1 says that triangular routing and tunnelling persist "until
a neighbour cache entry times out". A cache timeout is about 10 orders of
magnitude longer than the requirement to only persist "while handling
packets in flight", which would be a few milliseconds at most (the time for
packets to clear the network that were already launched into flight when
the old VM stopped).
Whatever, it would be preferable for the draft to give rationale for these
requirements, rather than just assert them. This would help to shed light
on the merits of the different trade offs that solutions choose.
[Linda] Agree, will add.
===#. Mobility vs. Redundancy===
Redundancy and mobility have a lot of similarities, but they have
different goals. With mobility, it is necessary to know the exact instant
when one set of state is identical to the other so it can hand over. With
redundancy, the aim is to keep two (or more) sets of state evolving through
the same sequence of changes, but there is no need to know the point at
which one is the same as the other was at a certain point.
[Linda] Agree with what you said. There is only one usage of “redundancy”
in the entire document, used under the context of “Hot standby option”,
indicating the “redundancy” of “the VMs in both primary and secondary
domains have identical information and can provide services simultaneously
as in load-share mode of operation” being expensive.
The draft slips from mobility to resilience in the following places:
* S.2. Terminology: Warm VM Mobility is defined without any ending, as if
it is permanent replication.
* S.7. "Handling of Hot, Warm and Cold Virtual Machine Mobility" is
actually all about redundancy, and doesn't address mobility explicitly.
[Linda] Will add definitions of “Hot Migration”, “Cold Migration”, and
“Warm Migration”.
===#. Terminology===
Packets run from the source at A to the destination at B via NVE1, then
via NVE2. Please don't call NVE1 and NVE2 the source NVE and the
destination NVE.
In future, no-one will thank you for the apparent contradictions when they
continually stumble over phrases like this one in S.4.1: "...send their
packets to the source NVE".
The term "packets in flight" is used incorrectly to refer to all the
packets sent to the old NVE after the VM has moved, even if they were
launched into flight long after the old VM stopped receiving packets.
[Linda] Thanks for the comments. Will change.
BTW, I think s/before/after/ in: "that have old ARP or neighbor cache
entry before VM or task migration".
I think: s/IP-based VM mobility/L3 VM mobility/ throughout, because "based"
sounds (to me) like the mobility control protocol is over (i.e. based on) IP.
===#. Applicability===
In section 4.2 it says that the protocol mostly used as the IP based task
migration protocol is ILA. This implies that all hosts corresponding with
the mobile VMs are either part of the same controlled environment, or they
are proxied via nodes that are part of the same controlled environment (I
only have passing knowledge of ILA, but I understand that it depends on ILA
routers on the path). If I am correct, this aspect of scope needs to be
made clear from the start.
Also under the heading of applicability, the sentence "Since migrations
should be relatively rare events" appears very late in the document
(S.4.2.1). The assumed level of churn ought to be stated nearer the start.
[Linda] yes, under the same controlled environment.
===#. L3 Mobility===
L2 VM mobility is independent of the application, because resolution of L2
mappings is delegated to the stack. In contrast, L3 VM mobility is only
feasible under certain conditions, because an application needs an IP
address to open a socket (resolution of DNS names is not delegated to the
stack, and apps can use IP addresses directly anyway).
a) /All/ applications used in the whole DC load balancing scheme contain
IP address migration logic for /all/ their connections;
b) VMs running solely applications that support IP address migration
register this fact with the NVA, and it only selects such VMs for mobility.
c) An abstraction is layered over /all/ the IP addresses exposed to
applications (at both ends) so that the IP addresses that applications use
are solely identifiers (e.g. ILA, LISP, HIP), not also locators.
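Condition (c) can be illustrated with a minimal sketch. This is not ILA, LISP or HIP; the MappingTable class, its method names and the example addresses are all hypothetical, chosen only to show that the application holds a stable identifier while the mobility system rewrites the locator behind it.

```python
# Hypothetical illustration of identifier/locator separation (condition c).
# Not ILA, LISP or HIP: just a mapping table that a mobility system updates.

class MappingTable:
    """Maps a stable identifier (what the application sees) to a locator."""

    def __init__(self):
        self._locator_of = {}

    def register(self, identifier, locator):
        self._locator_of[identifier] = locator

    def move(self, identifier, new_locator):
        # Only the mapping changes; the identifier held by applications does not.
        if identifier not in self._locator_of:
            raise KeyError(identifier)
        self._locator_of[identifier] = new_locator

    def resolve(self, identifier):
        return self._locator_of[identifier]


table = MappingTable()
table.register("vm-42", "192.0.2.10")   # locator reachable via the old NVE
peer_handle = "vm-42"                   # what a corresponding application stores

before = table.resolve(peer_handle)
table.move("vm-42", "198.51.100.7")     # the VM moves to a different subnet
after = table.resolve(peer_handle)      # same handle, new locator
```

The point of the sketch is that nothing the application stored had to change. Under conditions (a) or (b), by contrast, the application itself would have to learn the new address.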
The introduction says the draft is about VM mobility in a multi-tenant DC,
so the DC admin will not know the range of applications being used. This
excludes condition (a) above. When the draft says "...if all applications
running are known to handle this gracefully...", it doesn't quantify just
how restrictive this condition is, and it gives no explanation of how this
knowledge might be 'known' or which function within the system 'knows' it.
S.4.2.1 contains what seems like plenty of arm-waving.
* "TCP connections could be automatically closed in the network stack
during a migration event."
    o There is no TCP connection state in the network stack.
    o Even if the network starts to drop every packet, the TCP connection
    state persists in the end-points for a duration of the order of 30-90
    minutes (OS-dependent) before TCP deems the connection is broken.
    o Other transport protocols have similar designs (including the
    app-layer of protocols over UDP).
    o pausing the connection [does this refer to an actual feature of any
    L4 protocol?]
    o packaging connection state and sending to target [does this assume
    logic written into the application, or is this assuming the stack
    handles this and the app is restricted to using some form of separate
    identifier/locator addresses?]
    o instantiating connection state in the peer stack [ditto?].
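The point that TCP connection state lives in the end-points can be seen with a small loopback sketch. It assumes only the Python standard library and a usable loopback interface; it does not emulate packet loss or migration, it merely shows that each end holds its own copy of the connection state while no packets are being exchanged.

```python
import socket

# Listener and client on loopback; the kernel completes the handshake.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))          # port 0: let the OS pick a free port
listener.listen(1)

client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(listener.getsockname())
server, _ = listener.accept()

# No data is sent from here on. Each end-point still holds the connection
# state (local and remote address/port, pending error) on its own.
client_view = (client.getsockname(), client.getpeername())
server_view = (server.getpeername(), server.getsockname())
pending_error = client.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)

for s in (client, server, listener):
    s.close()
```

Both views describe the same 4-tuple, and neither end reports an error merely because the path is silent. Anything in the network that wanted to "close" this connection would have to signal the end-points.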
"Cold Virtual Machine mobility is facilitated by the VM initially
sending an ARP or Neighbor Discovery message at the destination NVE
but the source NVE not receiving any packets inflight."
[How is it arranged for the source NVE not to receive any packets in flight?]
"In hot
standby option, regarding TCP connections, one option is to start
with and maintain TCP connections to two different VMs at the same
time."
[This sounds like resilience logic has been written into the application,
which would be a special case but not something VM mobility infrastructure
could depend on.]
[Linda] will add.
===#. Gaps===
#. Security Considerations: repeats issues in other drafts that are not
specific to mobility, but it does not mention any security issues
specifically due to VM mobility. It says that address spoofing may arise in
a DC (sort-of implying it is worse than in non-DC environments, but not
saying why). The handshake at the start of a connection (e.g. TCP, SCTP,
QUIC) checks for source address spoofing. So L3 VM mobility would be more
vulnerable to source address spoofing in cases where the mobile VM was the
connection initiator and there was not a new handshake after the move.
However, this draft does not contain any detailed mobility protocols, so it
is not possible to identify any specific security flaws.
#. Transport Issues: Effect of delay on the transport: Cold mobility
introduces significant delay, and other forms less, but still some delay.
It should be pointed out that some applications (e.g. real-time) will
therefore not be useful if subjected to VM mobility. Similarly, even a
short period of delay will drive most congestion controls to severely
reduce throughput. These points might be self-evident, but perhaps they
should be stated explicitly.
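As a rough illustration of the congestion control point, here is an idealised sketch of the standard TCP response to a retransmission timeout (RFC 5681: ssthresh becomes max(FlightSize/2, 2*SMSS) and cwnd drops to one segment). It assumes the pause is long enough to trigger a timeout, counts the window in whole segments, and models slow start as a doubling per round trip capped at ssthresh; the function names are hypothetical.

```python
SMSS = 1  # count the window in segments for simplicity


def after_timeout(flight_size):
    """Return (cwnd, ssthresh) after a retransmission timeout (RFC 5681)."""
    ssthresh = max(flight_size // 2, 2 * SMSS)
    cwnd = 1 * SMSS                      # loss window: one segment
    return cwnd, ssthresh


def rtts_to_recover(target, cwnd, ssthresh):
    """Count round trips until cwnd is back at `target`, with no further loss."""
    rtts = 0
    while cwnd < target:
        if cwnd < ssthresh:
            cwnd = min(cwnd * 2, ssthresh)   # slow start (idealised doubling)
        else:
            cwnd += SMSS                     # congestion avoidance
        rtts += 1
    return rtts


# A flow that had 100 segments in flight when the VM paused.
cwnd, ssthresh = after_timeout(flight_size=100)
recovery = rtts_to_recover(100, cwnd, ssthresh)
```

In this model the flow restarts at 1 segment with ssthresh at 50, and needs 56 round trips to get back to its previous window, even though the interruption itself may have been brief.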
BTW, in the L3 VM mobility case, the draft often refers to TCP
connections, but the address bindings of any transport protocols would have
to be migrated due to VM mobility (e.g. SCTP; sequences of datagrams over
UDP; streams over UDP such as with RTP, QUIC).
#. Management Issues: perhaps the draft ought to recommend statistics
gathering (e.g. time taken, amount of duplicate data) to aid a DC's future
decisions on the cost-benefit of moving a VM. The OPSDIR review says a BCP
does not /have/ to describe management issues, but this document seems to
describe a whole system procedure, not just a protocol, which then surely
includes the management plane.
[Linda] can you become a co-author and add those in?
===#. Incoherent Structure===
S.4.1. happens to talk about VMs moving, while S.4.2. happens to talk
about tasks moving, but this is not the distinguishing aspect of these two
sections (anyway, S.2. says "the draft uses task and VM interchangeably"):
* "4.1 VM Migration" is about "L2 VM Mobility" so this ought to be the
section heading,
* "4.2 Task Migration" is about "L3 VM Mobility" so this ought to be the
section heading.
It would also help not to switch from VM to task across these sections -
it's just a distraction.
S.4.1 needs better signposting of where each sub-case ends (Subsections
might be useful to solve this):
* IPv4
* end-user client
* 2 paras starting "All NVEs communicating with this virtual machine..."
[Not clear that the end-user case has ended and we have returned to the
general IPv4 case?]
* IPv6 [Strictly, it still hasn't said whether the end-user client case has
ended.] [Also, it doesn't explain why there is no need for an end-user
client case under IPv6?]
Sections 5 & 6 seem to be about either L2 or L3 mobility, whereas Sections
7 & 8 seem to be restricted to L2.
The draft vacillates over what to do with packets arriving at the old NVE
in the L3 case (see also L3 mobility above):
* S4.2 first says packets are dropped, possibly with an ICMP error message;
    o then later it says they are silently dropped;
    o then in the very next sentence it says either silently drop them or
    forward them to the new location
* S.5 says they should not be lost, but instead delivered to the
destination hypervisor
    o then it describes how they are tunnelled (which is not the same as
    "forwarding").
The order in which all the stages of mobility are given is jumbled up
across sections that also appear in arbitrary order:
* S.5 prepares, establishes, uses then stops a tunnel, but it doesn't say
where the other stages fit between these steps
    o When tunneling packets, it talks about the *migrating* VM not the
    *migrated* VM, which implies tunnelling has started before the new VM
    is running. Does this imply there is a huge buffer?
    o It says "Stop Tunneling Packets - When source NVE stops receiving
    packets destined to..." but it is never clear when a source has stopped
    sending packets to a destination, unless it explicitly closes the
    connection (e.g. with a FIN in the case of TCP). Often there are long
    gaps between packets, because many flows are 'thin' (meaning the
    application frequently has nothing to send). These gaps can last for
    milliseconds, hours or even days without any implication that the
    connection has ended.
* Then S.6. describes moving state, but doesn't say that this is not after
the previous tunnelling steps (or where it fits within those steps).
* Then S.7 describes hot, warm and cold mobility, but doesn't lay out the
tunnelling or steps to move state in each case.
* Then S.8 says it's about VM life-cycle, but just gives the very first 3
steps for allocation of resources to a VM, then abruptly ends, without even
starting the VM, let alone getting to move it.
S.5 exhibits another inconsistency by talking about the hypervisor, not the NVE.
==#. Nits==
Nits with the English are too numerous to mention them all. Below are
pointers to general problems as well as some individual instances.
S.4
"Layer 2 and Layer 3 protocols are described next. In the following
sections, we examine more advanced features."
s/following/subsequent/
S.4.1
Expand WSC, MSC and NVA on first use.
s/the VM moves in the same link/the VM moves in the same subnet/
"i.e. end-user clients ask for the same MAC address upon migration. [...]
to ensure that the same IPv4 address is assigned to the VM." I think
s/IPv4/MAC/ was intended?
" All NVEs communicating with this virtual machine uses the old ARP
entry. If any VM in those NVEs need to talk to the new VM in the
destination NVE, it uses the old ARP entry."
Repetition: these 2 sentences say the same. (The mistake is also repeated
when these 2 sentences are repeated for IPv6).
S.4.2.1
s/Push the new mapping to hosts./Push the new mapping to communicating hosts./
S.5.
The IPv4/IPv6 pairs of paras for "tunnel establishment" and "tunneling packets"
only differ in the words "IPv4"/"IPv6". So in each case a single para
could be given for IP (irrespective of whether v4 or v6).
Thank you very much.
Linda Dunbar
Spencer Dawkins at IETF
2018-09-14 14:51:19 UTC
Permalink
If I might offer an opinion here ...
Post by Behcet Sarikaya
Post by Linda Dunbar
Post by Bob Briscoe
#. It does not seem as if the NVO WG has discussed the purpose of
using normative text in this draft. See detailed comments.
Post by Bob Briscoe
[Linda] The “Intended status” of the draft is “Best Current Practice”.
So all the text are not “normative”. Is it Okay?
Not really – this draft might be better targeted as “Informational” as it
is not a comprehensive review of current practice (best or otherwise) nor
an overall set of recommendations, e.g., as Bob wrote “it just asserts what
appears to be one view of how a whole VM Mobility system works.”
At present we have no intention of changing the intended status because
that decision should be deferred until IESG Review where we expect to
receive an authoritative view.
Right - this is pretty clear in
https://tools.ietf.org/html/rfc2026#section-6.1.2. The IESG is supposed to
make sure the status coming out makes sense, whether the intended status
going in made sense or not.

Having said that, I would encourage people to take their best shot at
recommending the intended status going into IESG Review, because having 15
people who haven't thought about the intended status as much as other
people should have, trying to figure that out during a telechat week makes
more sense if the document comes in with an obviously inappropriate
intended status - if you can send a document with an appropriate intended
status, the document is more likely to come out with the right status, in
my experience.

Do the right thing, of course!

Spencer
Black, David
2018-09-18 15:15:05 UTC
Permalink
In that case 


IETF Last Call Comment: The status of this document should be “Informational” not “Best Current Practice” as the “Practice” described in this document appears to be neither “Best” nor “Current” (in the sense of widely deployed). If this change of status is made, then all RFC 2119 keywords ought to be removed from this document.

Thanks, --David

From: Spencer Dawkins at IETF [mailto:***@gmail.com]
Sent: Friday, September 14, 2018 10:51 AM
To: ***@ieee.org
Cc: Black, David; IETF list; ***@ietf.org; Bob Briscoe; Linda Dunbar; tsv-***@ietf.org; draft-ietf-nvo3-***@ietf.org
Subject: Re: [Tsv-art] Tsvart last call review of draft-ietf-nvo3-vmm-04

If I might offer an opinion here ...
Post by Linda Dunbar
#. It does not seem as if the NVO WG has discussed the purpose of using normative text in this draft. See detailed comments.
[Linda] The “Intended status” of the draft is “Best Current Practice”. So all the text are not “normative”. Is it Okay?
Not really – this draft might be better targeted as “Informational” as it is not a comprehensive review of current practice (best or otherwise) nor an overall set of recommendations, e.g., as Bob wrote “it just asserts what appears to be one view of how a whole VM Mobility system works.”



At present we have no intention of changing the intended status because that decision should be deferred until IESG Review where we expect to receive an authoritative view.

Right - this is pretty clear in https://tools.ietf.org/html/rfc2026#section-6.1.2. The IESG is supposed to make sure the status coming out makes sense, whether the intended status going in made sense or not.

Having said that, I would encourage people to take their best shot at recommending the intended status going into IESG Review, because having 15 people who haven't thought about the intended status as much as other people should have, trying to figure that out during a telechat week makes more sense if the document comes in with an obviously inappropriate intended status - if you can send a document with an appropriate intended status, the document is more likely to come out with the right status, in my experience.

Do the right thing, of course!

Spencer
Bob Briscoe
2018-09-18 23:12:18 UTC
Permalink
Behcet,

In my experience, if you have to change the intended status of a draft,
you need to make significant changes to the text and often to the whole
structure of the draft. So it's very painful if the intended status
changes right near the end of the process, and therefore preferable by
far to get consensus on what the intended status should be when the
draft is first adopted by the WG.



Bob
Post by Spencer Dawkins at IETF
If I might offer an opinion here ...
On Fri, Sep 14, 2018 at 9:26 AM Behcet Sarikaya
Post by Bob Briscoe
#. It does not seem as if the NVO WG has discussed the
purpose of using normative text in this draft. See detailed
comments.
Post by Bob Briscoe
[Linda] The “Intended status” of the draft is “Best Current
Practice”. So all the text are not “normative”. Is it Okay?
Not really – this draft might be better targeted as
“Informational” as it is not a comprehensive review of current
practice (best or otherwise) nor an overall set of
recommendations, e.g., as Bob wrote “it just asserts what
appears to be one view of how a whole VM Mobility system works.”
At present we have no intention of changing the intended status
because that decision should be deferred until IESG Review where
we expect to receive an authoritative view.
Right -  this is pretty clear in
https://tools.ietf.org/html/rfc2026#section-6.1.2. The IESG is
supposed to make sure the status coming out makes sense, whether the
intended status going in made sense or not.
Having said that, I would encourage people to take their best shot at
recommending the intended status going into IESG Review, because
having 15 people who haven't thought about the intended status as much
as other people should have, trying to figure that out during a
telechat week makes more sense if the document comes in with an
obviously inappropriate intended status - if you can send a document
with an appropriate intended status, the document is more likely to
come out with the right status, in my experience.
Do the right thing, of course!
Spencer
_______________________________________________
Tsv-art mailing list
https://www.ietf.org/mailman/listinfo/tsv-art
--
________________________________________________________________
Bob Briscoe http://bobbriscoe.net/
Bob Briscoe
2018-09-18 23:02:54 UTC
Permalink
Linda,

Until we can all understand the answers to the following two questions,
I don't think we can discuss what track this draft ought to be on, let
alone move on to your responses to all my other points.

1/ Applicability

You say this draft solely applies to connections with both ends within
the controlled DC environment. But the draft says it's about
multi-tenant DCs. Are there any multi-tenant DCs that restrict all VMs
to only communicate with other VMs within the same controlled DC
environment?

2/ Purpose of publishing as an RFC
Post by Bob Briscoe
#. The introduction does not say what the purpose of publishing this draft is.
[Linda] The first paragraph on Page 3 has the description why VM
Mobility is needed.
Whether VM Mobility is needed was not my question. My question was what
is the purpose of the IETF publishing an RFC about VM Mobility? And
particularly, what is /this/ RFC intended to achieve?

Are the authors trying to argue for a particular approach vs. others?
Are you trying to write a tutorial? Are you trying to give the pros and
cons of different approaches? Are you trying to give advice on good
practice (with the implication that alternative practices are less
good)? Are you trying to clarify ideas by writing them down? Are you
trying to outline the implications of VM Mobility for other protocols
being developed within the NVO WG?




Bob
Post by Bob Briscoe
Bob,
Thank you very much for reviewing the draft and provided in-depth
comments. I am very sorry for the delayed response due to traveling.
-----Original Message-----
Sent: Monday, September 03, 2018 9:45 PM
Subject: Tsvart last call review of draft-ietf-nvo3-vmm-04
Reviewer: Bob Briscoe
Review result: Not Ready
I have been selected as the Transport Directorate reviewer for this
draft. The Transport Directorate seeks to review all transport or
transport-related drafts as they pass through IETF last call and IESG
review, and sometimes on special request. The purpose of the review is
to provide assistance to the Transport ADs. For more information about
the Transport Directorate Reviews and the Transport Area Review Team,
please see ​https://trac.ietf.org/trac/tsv/wiki/TSV-Directorate-Reviews
In this case, very very few of the review comments relate to transport
issues, although the greatest issue concerns a desire that the network
could pause or stop connections during L3 VM Mobility, which is
certainly a transport issue.
[Linda] There is “Hot Migration” with transport service continuing,
and there is a “Cold Migration”, which is a common practice in many
data centers, which stop the task running on the old place and move to
the new place before restart as described in the Task Migration.
Is it helpful to add this description to the draft?
==Summary==
The technical aspects of the draft concerning L2 VM mobility (within a
subnet) seem sound. However, this is only part of the draft, which has
the following
#. The introduction does not say what the purpose of publishing this draft is.
It seems that, rather than describing a specific protocol or
protocols, it intends to describe the overall system procedure that
would typically be used in DCs for VM mobility. It is tagged as a BCP,
but it does not say who needs this BCP, why it is useful for the IETF
to publish this BCP, how wide the authors' knowledge is of current
practice (given DCs are private), or why this is a BCP rather than a
protocol spec.
[Linda] The first paragraph on Page 3 has the description why VM
Mobility is needed. Is it helpful to move this paragraph to the
beginning of the Introduction Section?
/“//Virtualization which is being used in almost all of today’s data/
/centers enables many virtual machines to run on a single physical/
/computer or compute server. Virtual machines (VM) need hypervisor/
/running on the physical compute server to provide them shared/
/processor/memory/storage. Network connectivity is provided by the/
/network virtualization edge (NVE) [RFC8014]. Being able to move VMs/
/dynamically, or live migration, from one server to another allows for/
/dynamic load balancing or work distribution and thus it is a highly/
/desirable feature [RFC7364].//”/
The draft starts out (S.3) as if it intends to say what a good VM
Mobility protocol should or shouldn't do, but the rest of the document
doesn't give any reasoning for these recommendations, it just asserts
what appears to be one view of how a whole VM Mobility system works,
sometimes referring to one example protocol RFC for a component part,
but more often with no references or details.
[Linda] Is it helpful to move the paragraph above to the beginning of
the Introduction Section? So that audience is aware of why VM Mobility
is needed. And then follow up with what a good VM Mobility protocol
should or shouldn't do?
#. It does not seem as if the NVO WG has discussed the purpose of
using normative text in this draft. See detailed comments.
[Linda] The “Intended status” of the draft is “Best Current Practice”.
So all the text are not “normative”. Is it Okay?
#. The draft silently slips back and forth between VM mobility and VM
redundancy, without recognizing the differences. See detailed comments.
[Linda] There is only one usage of “redundancy” in the entire
document, used under the context of “Hot standby option”, indicating 
the “redundancy” of “the VMs in both primary and secondary domains
have identical information and can provide services simultaneously as
in load-share mode of operation” being expensive.
#. Please adopt different terminology than "source NVE" and
"destination NVE", which are really poor choices of terms for an
intermediate node. See detailed comments. Why not use "old NVE" and
"new NVE", which is what you mean?
[Linda] Thanks for the suggestion. We will change to “Old NVE”, and
“new NVE”.
#. Applicability is fairly clearly outlined, but it is not clear
whether hosts corresponding with the mobile VMs are part of the same
controlled environment or on the uncontrolled public Internet. See
detailed comments.
[Linda] “Hosts” are the Apps running on the VMs. They are under the
same controlled environment, not on the uncontrolled public Internet.
#. Section 4.2.1 on L3 VM mobility reads like some potential
half-thought-through ideas on how to solve L3 mobility, rather than
current practice, let alone best current practice. Either current
practice should be described instead, or the scope of the draft should
be narrowed solely to L2 VM mobility. See detailed comments.
[Linda] This is referring to “Cold Migration”, which is a common
practice in many data centers.
# The VM's file system is described as state that moves with the VM
(S.6), but VM mobility solutions often move the VM but stitch it back
to its (unmoved) storage. Conversely, the storage can also move
independent of the VM.
[Linda] It depends. When a VM moves to a different zone, the
storage/file can become inaccessible.
#. The draft omits some of the security, transport and management
aspects of VM mobility. See detailed comments.
[Linda] Can you provide some text?
#. The draft reads as if different sections have been written by
different authors and no-one has edited the whole to give it a
coherent structure, or to ensure consistency (both technical and
editorial) between the parts. See detailed comments.
[Linda] we can improve.
#. The quality of the English grammar does not allow a reviewer to
concentrate on the technical aspects rather than the English. It would
have been useful if one of the English-speaking co-authors had
improved the English before submission for review. See detailed comments.
[Linda] can you help?  Becoming a co-author to improve?
==Detailed Comments==
===#. Normative statements===
In the body of the document, there is just one occurrence of normative
text (actually two "MUST"s, but both state a common requirement - just
written separately for IPv4 and IPv6). This merely serves to imply
that everything else the document says is less important or optional,
which was probably not the intention.
[Linda] The goal is to indicate that any solution for moving the VM “MUST”
follow this rule. They make sense, don’t they?
At the start there is a requirements section, which states what a VM
Mobility protocol "SHOULD" or "SHOULD NOT" do. I think this is
intended as a set of goals for the rest of the document. If so, these
"SHOULDs" are not intended to apply to implementations, so they ought
not to be capitalized.
[Linda] okay, will change.
The first requirement, "Data center network SHOULD support virtual
machine mobility in IPv6", is written as a requirement on all DC
networks, not on implementations. I assume this was intended to read
as "Data center network virtual machine mobility protocols SHOULD
support IPv6". Even then, it doesn't really add anything to say VM
mobility should support v6 and it should support v4. A L2 solution
won't. While undoubtedly, a L3 solution will at least support one of them.
[Linda]Agree. Will change it to “Data center that support IPv6 address
should 
”
I'm not sure that 'protocol' is the right word anyway; I think 'VM
Mobility procedure' would be a better phrase, because it includes
steps such as suspending the VM, which is more than a protocol.
[Linda] yes. Will change to “Procedure”.
The requirement "Virtual machine mobility protocol MAY support host
routes to accomplish virtualization", is not followed up at all in the
rest of the draft.
Even if this requirement stays, the last 3 words should be deleted.
[Linda] will change to “Host Route can be used to support the Virtual
Machine Mobility Procedure.”
By the end of the draft, the solution falls far short of the most
relevant "Requirements" anyway, so one assumes the title of the
section ought to have been "Goals". Specifically, even in the simpler
case of L2 VM mobility, S.4.1 says that triangular routing and
tunnelling persist "until a neighbour cache entry times out". A cache
timeout is about 10 orders of magnitude longer than the requirement to
only persist "while handling packets in flight", which would be a few
milliseconds at most (the time for packets to clear the network that
were already launched into flight when the old VM stopped).
Whatever, it would be preferable for the draft to give rationale for
these requirements, rather than just assert them. This would help to
shed light on the merits of the different trade offs that solutions
choose.
[Linda] Agree, will add.
===#. Mobility vs. Redundancy===
Redundancy and mobility have a lot of similarities, but they have
different goals. With mobility, it is necessary to know the exact
instant when one set of state is identical to the other so it can hand
over. With redundancy, the aim is to keep two (or more) sets of state
evolving through the same sequence of changes, but there is no need to
know the point at which one is the same as the other was at a certain
point.
[Linda] Agree with what you said. There is only one usage of
“redundancy” in the entire document, used under the context of “Hot
standby option”, indicating  the “redundancy” of  “the VMs in both
primary and secondary domains have identical information and can
provide services simultaneously as in load-share mode of operation”
being expensive.
* S.2. Terminology: Warm VM Mobility is defined without any ending, as
if it is permanent replication. * S.7. "Handling of Hot, Warm and Cold
Virtual Machine Mobility" is actually all about redundancy, and
doesn't address mobility explicitly.
[Linda] Will add the definition “Hot Migration”, “cold migration”, and
“warm migration”.
===#. Terminology===
Packets run from the source at A to the destination at B via NVE1,
then via NVE2. Please don't call NVE1 and NVE2 the source NVE and the
destination NVE.
In future, no-one will thank you for the apparent contradictions when
they continually stumble over phrases like this one in S.4.1: "...send
their packets to the source NVE".
The term "packets in flight" is used incorrectly to refer to all the
packets sent to the old NVE after the VM has moved, even if they were
launched into flight long after the old VM stopped receiving packets.
[Linda] Thanks for the comments. Will change.
BTW, I think s/before/after/ in: "that have old ARP or neighbor cache
entry before VM or task migration".
I think: s/IP-based VM mobility/L3 VM mobility/ throughout, because "based"
sounds (to me) like the mobility control protocol is over (i.e. based on) IP.
===#. Applicability===
In section 4.2 it says that the protocol mostly used as the IP based
task migration protocol is ILA. This implies that all hosts
corresponding with the mobile VMs are either part of the same
controlled environment, or they are proxied via nodes that are part of
the same controlled environment (I only have passing knowledge of ILA,
but I understand that it depends on ILA routers on the path). If I am
correct, this aspect of scope needs to be made clear from the start.
Also under the heading of applicability, the sentence "Since migrations
should be relatively rare events" appears very late in the document
(S.4.2.1). The assumed level of churn ought to be stated nearer the start.
[Linda] yes, under the same controlled environment.
===#. L3 Mobility===
L2 VM mobility is independent of the application, because resolution
of L2 mappings is delegated to the stack. In contrast, L3 VM mobility
is only feasible under certain conditions, because an application
needs an IP address to open a socket (resolution of DNS names is not
delegated to the stack, and apps can use IP addresses directly anyway).
a) /All/ applications used in the whole DC load balancing scheme
contain IP address migration logic for /all/ their connections;
b) VMs running solely applications that support IP address migration
register this fact with the NVA, and it only selects such VMs for
mobility.
c) An abstraction is layered over /all/ the IP addresses exposed to
applications (at both ends) so that the IP addresses that applications
use are solely identifiers (e.g. ILA, LISP, HIP), not also locators.
The introduction says the draft is about VM mobility in a multi-tenant
DC, so the DC admin will not know the range of applications being
used. This excludes condition (a) above. When the draft says "...if
all applications running are known to handle this gracefully...", it
doesn't quantify just how restrictive this condition is, and it gives
no explanation of how this knowledge might be 'known' or which
function within the system 'knows' it.
S.4.2.1 contains what seems like plenty of arm-waving.
* "TCP connections could be automatically closed in the network stack
during a migration event."
        o There is no TCP connection state in the network stack.
        o Even if the network starts to drop every packet, the TCP
        connection state persists in the end-points for a duration of
        the order of 30-90 minutes (OS-dependent) before TCP deems the
        connection is broken.
        o Other transport protocols have similar designs (including the
        app-layer of protocols over UDP).
        o pausing the connection [does this refer to an actual feature
        of any L4 protocol?]
        o packaging connection state and sending to target [does this
        assume logic written into the application, or is this assuming
        the stack handles this and the app is restricted to using some
        form of separate identifier/locator addresses?]
        o instantiating connection state in the peer stack [ditto?].
  "Cold Virtual Machine mobility is facilitated by the VM initially
   sending an ARP or Neighbor Discovery message at the destination NVE
   but the source NVE not receiving any packets inflight."
   [How is it arranged for the source NVE not to receive any packets in
   flight?]
  "In hot
   standby option, regarding TCP connections, one option is to start
   with and maintain TCP connections to two different VMs at the same
   time."
   [This sounds like resilience logic has been written into the
   application, which would be a special case but not something VM
   mobility infrastructure could depend on.]
[Linda] will add.
===#. Gaps===
#. Security Considerations: repeats issues in other drafts that are
not specific to mobility, but it does not mention any security issues
specifically due to VM mobility. It says that address spoofing may
arise in a DC (sort-of implying it is worse than in non-DC
environments, but not saying why). The handshake at the start of a
connection (e.g. TCP, SCTP, QUIC) checks for source address spoofing.
So L3 VM mobility would be more vulnerable to source address spoofing
in cases where the mobile VM was the connection initiator and there
was not a new handshake after the move. However, this draft does not
contain any detailed mobility protocols, so it is not possible to
identify any specific security flaws.
#. Transport Issues: Effect of delay on the transport: Cold mobility
introduces significant delay, and other forms less, but still some
delay. It should be pointed out that some applications (e.g.
real-time) will therefore not be useful if subjected to VM mobility.
Similarly, even a short period of delay will drive most congestion
controls to severely reduce throughput. These points might be
self-evident, but perhaps they should be stated explicitly.
BTW, in the L3 VM mobility case, the draft often refers to TCP
connections, but the address bindings of any transport protocols would
have to be migrated due to VM mobility (e.g. SCTP; sequences of
datagrams over UDP; streams over UDP such as with RTP, QUIC).
#. Management Issues: perhaps the draft ought to recommend statistics
gathering (e.g. time taken, amount of duplicate data) to aid a DC's
future decisions on the cost-benefit of moving a VM. The OPSDIR review
says a BCP does not /have/ to describe management issues, but this
document seems to describe a whole system procedure, not just a
protocol, which then surely includes the management plane.
[Linda] can you become a co-author and add those in?
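As a rough illustration of the statistics gathering suggested under
Management Issues, the sketch below records the time taken and the amount
of duplicate data for each move and then summarises them. All names here
(MigrationRecord, summarize, the field names) are hypothetical and do not
come from the draft or from any NVO3 specification.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class MigrationRecord:
    """One VM move, as a management plane might log it (hypothetical)."""
    vm_id: str
    duration_s: float        # time taken by the move
    bytes_transferred: int   # total data sent to the new location
    duplicate_bytes: int     # data that was sent more than once


def summarize(records: List[MigrationRecord]) -> Dict[str, float]:
    """Aggregate per-move records into figures for cost-benefit decisions."""
    count = len(records)
    if count == 0:
        return {"count": 0, "mean_duration_s": 0.0, "duplicate_fraction": 0.0}
    total_bytes = sum(r.bytes_transferred for r in records)
    duplicate = sum(r.duplicate_bytes for r in records)
    return {
        "count": count,
        "mean_duration_s": sum(r.duration_s for r in records) / count,
        "duplicate_fraction": duplicate / total_bytes if total_bytes else 0.0,
    }
```

A DC operator could compare such summaries across hot, warm and cold moves
before deciding whether moving a given VM is worth the cost.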
===#. Incoherent Structure===
S.4.1. happens to talk about VMs moving, while S.4.2. happens to talk
about tasks moving, but this is not the distinguishing aspect of these
two sections (anyway, S.2. says "the draft uses task and VM
interchangeably"):
* "4.1 VM Migration" is about "L2 VM Mobility" so this ought to be the
section heading,
* "4.2 Task Migration" is about "L3 VM Mobility" so this ought to be the
section heading.
It would also help not to switch from VM to task across these sections
- it's just a distraction.
S.4.1 needs better signposting of where each sub-case ends
(Subsections might be useful to solve this):
* IPv4
* end-user client
* 2 paras starting "All NVEs communicating with this virtual
machine..." [Not clear that the end-user case has ended and we have
returned to the general IPv4 case?]
* IPv6 [Strictly, it still hasn't said whether the end-user client case
has ended.] [Also, it doesn't explain why there is no need for an
end-user client case under IPv6?]
Sections 5 & 6 seem to be about either L2 or L3 mobility, whereas
Sections 7 & 8 seem to be restricted to L2.
The draft vacillates over what to do with packets arriving at the old
NVE in the L3 case (see also L3 mobility above):
* S4.2 first says packets are dropped, possibly with an ICMP error
message;
  o then later it says they are silently dropped;
  o then in the very next sentence it says either silently drop them
  or forward them to the new location
* S.5 says they should not be lost, but instead delivered to the
destination hypervisor
  o then it describes how they are tunnelled (which is not the same as
  "forwarding").
The order in which all the stages of mobility are given is jumbled up
across sections that also appear in arbitrary order:
* S.5 prepares, establishes, uses then stops a tunnel, but it doesn't say
where the other stages fit between these steps
        o When tunneling packets, it talks about the *migrating* VM not
        the *migrated* VM, which implies tunnelling has started before the
        new VM is running. Does this imply there is a huge buffer?
        o It says "Stop Tunneling Packets - When source NVE stops
        receiving packets destined to..." but it is never clear when a
        source has stopped sending packets to a destination, unless it
        explicitly closes the connection (e.g. with a FIN in the case of
        TCP). Often there are long gaps between packets, because many
        flows are 'thin' (meaning the application frequently has nothing
        to send). These gaps can last for milliseconds, hours or even
        days without any implication that the connection has ended.
* Then S.6. describes moving state, but doesn't say that this is not
after the previous tunnelling steps (or where it fits within those
steps).
* Then S.7 describes hot, warm and cold mobility, but doesn't lay out
the tunnelling or steps to move state in each case.
* Then S.8 says it's about VM life-cycle, but just gives the very first
3 steps for allocation of resources to a VM, then abruptly ends,
without even starting the VM, let alone getting to move it.
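The 'thin flow' objection to "Stop Tunneling Packets" can be made concrete
with a toy calculation. The sketch below assumes a hypothetical old NVE
that tears down its tunnel once it has seen no packets for idle_timeout
seconds; the function name, the timeout and the arrival times are invented
for illustration and are not taken from the draft. It lists the packets
that would arrive after such a gap, i.e. after the tunnel had already been
removed even though the connection was still alive.

```python
from typing import List


def packets_after_idle_gap(arrival_times: List[float],
                           idle_timeout: float) -> List[int]:
    """Return indices of packets that arrive after a silence longer than
    idle_timeout, measured from the previous packet of the same flow."""
    late = []
    for i in range(1, len(arrival_times)):
        gap = arrival_times[i] - arrival_times[i - 1]
        if gap > idle_timeout:
            late.append(i)
    return late


# A thin flow: two packets close together, then an hour of silence,
# then another burst, then roughly a day of silence.
arrivals = [0.0, 0.002, 3600.0, 3600.5, 90000.0]
stranded = packets_after_idle_gap(arrivals, idle_timeout=60.0)
```

With a 60 second timeout, the packets at indices 2 and 4 arrive after the
tunnel would have been stopped, although no end-point ever closed the
connection. Any fixed timeout has the same problem for some flow.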
S.5 exhibits another inconsistency by talking about the hypervisor, not the NVE.
==#. Nits==
Nits with the English are too numerous to mention them all. Below are
pointers to general problems as well as some individual instances.
S.4
  "Layer 2 and Layer 3 protocols are described next.  In the following
   sections, we examine more advanced features."
        s/following/subsequent/
S.4.1
Expand WSC, MSC and NVA on first use.
s/the VM moves in the same link/the VM moves in the same subnet/
"i.e. end-user clients ask for the same MAC address upon migration.
[...] to ensure that the same IPv4 address is assigned to the VM." I
think s/IPv4/MAC/ was intended?
"  All NVEs communicating with this virtual machine uses the old ARP
   entry.  If any VM in those NVEs need to talk to the new VM in the
   destination NVE, it uses the old ARP entry."
Repetition: these 2 sentences say the same. (The mistake is also
repeated when these 2 sentences are repeated for IPv6).
S.4.2.1
s/Push the new mapping to hosts./Push the new mapping to communicating hosts./
S.5.
The IPv4/IPv6 pairs of paras for "tunnel establishment" and "tunneling packets"
only differ in the words "IPv4"/"IPv6". So in each case a single para
could be given for IP (irrespective of whether v4 or v6).
Thank you very much.
Linda Dunbar
_______________________________________________
Tsv-art mailing list
https://www.ietf.org/mailman/listinfo/tsv-art
--
________________________________________________________________
Bob Briscoe http://bobbriscoe.net/
Behcet Sarikaya
2018-09-19 14:46:11 UTC
Permalink
Hi Bob,

Thank you for your comments.
The authors are currently discussing your points and we will come up with a
revision soon after the discussions are over.

Regards,
Behcet
Post by Bob Briscoe
Linda,
Until we can all understand the answers to the following two questions, I
don't think we can discuss what track this draft ought to be on, let alone
move on to your responses to all my other points.
1/ Applicability
You say this draft solely applies to connections with both ends within the
controlled DC environment. But the draft says it's about multi-tenant DCs.
Are there any multi-tenant DCs that restrict all VMs to only communicate
with other VMs within the same controlled DC environment?
2/ Purpose of publishing as an RFC
#. The introduction does not say what the purpose of publishing this draft is.
[Linda] The first paragraph on Page 3 has the description why VM Mobility
is needed.
Whether VM Mobility is needed was not my question. My question was what is
the purpose of the IETF publishing an RFC about VM Mobility? And
particularly, what is /this/ RFC intended to achieve?
Are the authors trying to argue for a particular approach vs. others? Are
you trying to write a tutorial? Are you trying to give the pros and cons of
different approaches? Are you trying to give advice on good practice (with
the implication that alternative practices are less good)? Are you trying
to clarify ideas by writing them down? Are you trying to outline the
implications of VM Mobility for other protocols being developed within the
NVO WG?
Bob
Bob,
Thank you very much for reviewing the draft and providing in-depth
comments. I am very sorry for the delayed response due to traveling.
-----Original Message-----
Sent: Monday, September 03, 2018 9:45 PM
Subject: Tsvart last call review of draft-ietf-nvo3-vmm-04
Reviewer: Bob Briscoe
Review result: Not Ready
I have been selected as the Transport Directorate reviewer for this draft.
The Transport Directorate seeks to review all transport or
transport-related drafts as they pass through IETF last call and IESG
review, and sometimes on special request. The purpose of the review is to
provide assistance to the Transport ADs. For more information about the
Transport Directorate Reviews and the Transport Area Review Team, please
see https://trac.ietf.org/trac/tsv/wiki/TSV-Directorate-Reviews
In this case, very very few of the review comments relate to transport
issues, although the greatest issue concerns a desire that the network
could pause or stop connections during L3 VM Mobility, which is certainly a
transport issue.
[Linda] There is “Hot Migration”, in which transport service continues, and
there is “Cold Migration”, a common practice in many data centers, which
stops the task running in the old place and moves it to the new place
before restarting it, as described under Task Migration.
Would it be helpful to add this description to the draft?
==Summary==
The technical aspects of the draft concerning L2 VM mobility (within a
subnet) seem sound. However, this is only part of the draft, which has the
following issues:
#. The introduction does not say what the purpose of publishing this draft is.
It seems that, rather than describing a specific protocol or protocols, it
intends to describe the overall system procedure that would typically be
used in DCs for VM mobility. It is tagged as a BCP, but it does not say who
needs this BCP, why it is useful for the IETF to publish this BCP, how wide
the authors' knowledge is of current practice (given DCs are private), or
why this is a BCP rather than a protocol spec.
[Linda] The first paragraph on Page 3 has the description why VM Mobility
is needed. Is it helpful to move this paragraph to the beginning of the
Introduction Section?
“Virtualization which is being used in almost all of today’s data
centers enables many virtual machines to run on a single physical
computer or compute server. Virtual machines (VM) need hypervisor
running on the physical compute server to provide them shared
processor/memory/storage. Network connectivity is provided by the
network virtualization edge (NVE) [RFC8014]. Being able to move VMs
dynamically, or live migration, from one server to another allows for
dynamic load balancing or work distribution and thus it is a highly
desirable feature [RFC7364].”
The draft starts out (S.3) as if it intends to say what a good VM Mobility
protocol should or shouldn't do, but the rest of the document doesn't give
any reasoning for these recommendations, it just asserts what appears to be
one view of how a whole VM Mobility system works, sometimes referring to
one example protocol RFC for a component part, but more often with no
references or details.
[Linda] Is it helpful to move the paragraph above to the beginning of the
Introduction Section? So that audience is aware of why VM Mobility is
needed. And then follow up with what a good VM Mobility protocol should or
shouldn't do?
#. It does not seem as if the NVO WG has discussed the purpose of using
normative text in this draft. See detailed comments.
[Linda] The “Intended status” of the draft is “Best Current Practice”. So
all the text are not “normative”. Is it Okay?
#. The draft silently slips back and forth between VM mobility and VM
redundancy, without recognizing the differences. See detailed comments.
[Linda] There is only one usage of “redundancy” in the entire document,
used under the context of “Hot standby option”, indicating the
“redundancy” of “the VMs in both primary and secondary domains have
identical information and can provide services simultaneously as in
load-share mode of operation” being expensive.
#. Please adopt different terminology than "source NVE" and "destination
NVE", which are really poor choices of terms for an intermediate node. See
detailed comments. Why not use "old NVE" and "new NVE", which is what you
mean?
[Linda] Thanks for the suggestion. We will change to “Old NVE”, and “new
NVE”.
#. Applicability is fairly clearly outlined, but it is not clear whether
hosts corresponding with the mobile VMs are part of the same controlled
environment or on the uncontrolled public Internet. See detailed comments.
[Linda] “Hosts” are the Apps running on the VMs. They are under the same
controlled environment, not on the uncontrolled public Internet.
#. Section 4.2.1 on L3 VM mobility reads like some potential
half-thought-through ideas on how to solve L3 mobility, rather than current
practice, let alone best current practice. Either current practice should
be described instead, or the scope of the draft should be narrowed solely
to L2 VM mobility. See detailed comments.
[Linda] This is referring to “Cold Migration”, which is a common practice
in many data centers.
# The VM's file system is described as state that moves with the VM (S.6),
but VM mobility solutions often move the VM but stitch it back to its
(unmoved) storage. Conversely, the storage can also move independent of the
VM.
[Linda] It depends. When a VM moves to a different zone, the storage/file
system can become inaccessible.
#. The draft omits some of the security, transport and management aspects
of VM mobility. See detailed comments.
[Linda] Can you provide some text?
#. The draft reads as if different sections have been written by different
authors and no-one has edited the whole to give it a coherent structure, or
to ensure consistency (both technical and editorial) between the parts. See
detailed comments.
[Linda] we can improve.
#. The quality of the English grammar does not allow a reviewer to
concentrate on the technical aspects rather than the English. It would have
been useful if one of the English-speaking co-authors had improved the
English before submission for review. See detailed comments.
[Linda] can you help? Becoming a co-author to improve?
==Detailed Comments==
===#. Normative statements===
In the body of the document, there is just one occurrence of normative
text (actually two "MUST"s, but both state a common requirement - just
written separately for IPv4 and IPv6). This merely serves to imply that
everything else the document says is less important or optional, which was
probably not the intention.
[Linda] The goal is to indicate that any solution for moving the VM “MUST”
follow this rule. They make sense, don’t they?
At the start there is a requirements section, which states what a VM
Mobility protocol "SHOULD" or "SHOULD NOT" do. I think this is intended as
a set of goals for the rest of the document. If so, these "SHOULDs" are not
intended to apply to implementations, so they ought not to be capitalized.
[Linda] okay, will change.
The first requirement, "Data center network SHOULD support virtual machine
mobility in IPv6", is written as a requirement on all DC networks, not on
implementations. I assume this was intended to read as "Data center network
virtual machine mobility protocols SHOULD support IPv6". Even then, it
doesn't really add anything to say VM mobility should support v6 and it
should support v4. A L2 solution won't. While undoubtedly, a L3 solution
will at least support one of them.
[Linda]Agree. Will change it to “Data center that support IPv6 address
should 
”
I'm not sure that 'protocol' is the right word anyway; I think 'VM
Mobility procedure' would be a better phrase, because it includes steps
such as suspending the VM, which is more than a protocol.
[Linda] yes. Will change to “Procedure”.
The requirement "Virtual machine mobility protocol MAY support host routes
to accomplish virtualization", is not followed up at all in the rest of the
draft.
Even if this requirement stays, the last 3 words should be deleted.
[Linda] will change to “Host Route can be used to support the Virtual
Machine Mobility Procedure.”
By the end of the draft, the solution falls far short of the most relevant
"Requirements" anyway, so one assumes the title of the section ought to
have been "Goals". Specifically, even in the simpler case of L2 VM
mobility, S.4.1 says that triangular routing and tunnelling persist "until
a neighbour cache entry times out". A cache timeout is about 10 orders of
magnitude longer than the requirement to only persist "while handling
packets in flight", which would be a few milliseconds at most (the time for
packets to clear the network that were already launched into flight when
the old VM stopped).
Whatever, it would be preferable for the draft to give rationale for these
requirements, rather than just assert them. This would help to shed light
on the merits of the different trade offs that solutions choose.
[Linda] Agree, will add.
===#. Mobility vs. Redundancy===
Redundancy and mobility have a lot of similarities, but they have
different goals. With mobility, it is necessary to know the exact instant
when one set of state is identical to the other so it can hand over. With
redundancy, the aim is to keep two (or more) sets of state evolving through
the same sequence of changes, but there is no need to know the point at
which one is the same as the other was at a certain point.
[Linda] Agree with what you said. There is only one usage of “redundancy”
in the entire document, used under the context of “Hot standby option”,
indicating the “redundancy” of “the VMs in both primary and secondary
domains have identical information and can provide services simultaneously
as in load-share mode of operation” being expensive.
* S.2. Terminology: Warm VM Mobility is defined without any ending, as if
it is permanent replication. * S.7. "Handling of Hot, Warm and Cold Virtual
Machine Mobility" is actually all about redundancy, and doesn't address
mobility explicitly.
[Linda] Will add the definition “Hot Migration”, “cold migration”, and
“warm migration”.
===#. Terminology===
Packets run from the source at A to the destination at B via NVE1, then
via NVE2. Please don't call NVE1 and NVE2 the source NVE and the
destination NVE.
In future, no-one will thank you for the apparent contradictions when they
continually stumble over phrases like this one in S.4.1: "...send their
packets to the source NVE".
The term "packets in flight" is used incorrectly to refer to all the
packets sent to the old NVE after the VM has moved, even if they were
launched into flight long after the old VM stopped receiving packets.
[Linda] thank for the comments. Will change.
BTW, I think s/before/after/ in: "that have old ARP or neighbor cache
entry before VM or task migration".
I think: s/IP-based VM mobility/L3 VM mobility/ throughout, because "based"
sounds (to me) like the mobility control protocol is over (i.e. based on) IP.
===#. Applicability===
In section 4.2 it says that the protocol mostly used as the IP based task
migration protocol is ILA. This implies that all hosts corresponding with
the mobile VMs are either part of the same controlled environment, or they
are proxied via nodes that are part of the same controlled environment (I
only have passing knowledge of ILA, but I understand that it depends on ILA
routers on the path). If I am correct, this aspect of scope needs to be
made clear from the start.
Also under the heading of applicability, the sentence "Since migrations
should be relatively rare events" appears very late in the document
(S.4.2.1). The assumed level of churn ought to be stated nearer the start.
[Linda] yes, under the same controlled environment.
===#. L3 Mobility===
L2 VM mobility is independent of the application, because resolution of L2
mappings is delegated to the stack. In contrast, L3 VM mobility is only
feasible under certain conditions, because an application needs an IP
address to open a socket (resolution of DNS names is not delegated to the
stack, and apps can use IP addresses directly anyway).
a) /All/ applications used in the whole DC load balancing scheme contain
IP address migration logic for /all/ their connections;
b) VMs running solely applications that support IP address migration
register this fact with the NVA, and it only selects such VMs for
mobility.
c) An abstraction is layered over /all/ the IP addresses exposed to
applications (at both ends) so that the IP addresses that applications
use are solely identifiers (e.g. ILA, LISP, HIP), not also locators.
The introduction says the draft is about VM mobility in a multi-tenant DC,
so the DC admin will not know the range of applications being used. This
excludes condition (a) above. When the draft says "...if all applications
running are known to handle this gracefully...", it doesn't quantify just
how restrictive this condition is, and it gives no explanation of how this
knowledge might be 'known' or which function within the system 'knows' it.
S.4.2.1 contains what seems like plenty of arm-waving.
* "TCP connections could be automatically closed in the network stack
during a migration event."
o There is no TCP connection state in the network stack.
o Even if the network starts to drop every packet, the TCP connection
state persists in the end-points for a duration of the order of 30-90
minutes (OS-dependent) before TCP deems the connection is broken.
o Other transport protocols have similar designs (including the app-layer
of protocols over UDP).
o pausing the connection [does this refer to an actual feature of any
L4 protocol?]
o packaging connection state and sending to target [does this assume
logic written into the application, or is this assuming the stack
handles this and the app is restricted to using some form of separate
identifier/locator addresses?]
o instantiating connection state in the peer stack [ditto?].
"Cold Virtual Machine mobility is facilitated by the VM initially
sending an ARP or Neighbor Discovery message at the destination NVE
but the source NVE not receiving any packets inflight."
[How is it arranged for the source NVE not to receive any packets in flight?]
"In hot
standby option, regarding TCP connections, one option is to start
with and maintain TCP connections to two different VMs at the same
time."
[This sounds like resilience logic has been written into the application,
which would be a special case but not something VM mobility
infrastructure could depend on.]
[Linda] will add.
===#. Gaps===
#. Security Considerations: repeats issues in other drafts that are not
specific to mobility, but it does not mention any security issues
specifically due to VM mobility. It says that address spoofing may arise in
a DC (sort-of implying it is worse than in non-DC environments, but not
saying why). The handshake at the start of a connection (e.g. TCP, SCTP,
QUIC) checks for source address spoofing. So L3 VM mobility would be more
vulnerable to source address spoofing in cases where the mobile VM was the
connection initiator and there was not a new handshake after the move.
However, this draft does not contain any detailed mobility protocols, so it
is not possible to identify any specific security flaws.
#. Transport Issues: Effect of delay on the transport: Cold mobility
introduces significant delay, and other forms less, but still some delay.
It should be pointed out that some applications (e.g. real-time) will
therefore not be useful if subjected to VM mobility. Similarly, even a
short period of delay will drive most congestion controls to severely
reduce throughput. These points might be self-evident, but perhaps they
should be stated explicitly.
BTW, in the L3 VM mobility case, the draft often refers to TCP
connections, but the address bindings of any transport protocols would have
to be migrated due to VM mobility (e.g. SCTP; sequences of datagrams over
UDP; streams over UDP such as with RTP, QUIC).
#. Management Issues: perhaps the draft ought to recommend statistics
gathering (e.g. time taken, amount of duplicate data) to aid a DC's future
decisions on the cost-benefit of moving a VM. The OPSDIR review says a BCP
does not /have/ to describe management issues, but this document seems to
describe a whole system procedure, not just a protocol, which then surely
includes the management plane.
[Linda] can you become a co-author and add those in?
===#. Incoherent Structure===
S.4.1. happens to talk about VMs moving, while S.4.2. happens to talk
about tasks moving, but this is not the distinguishing aspect of these two
sections (anyway, S.2. says "the draft uses task and VM interchangeably"):
* "4.1 VM Migration" is about "L2 VM Mobility" so this ought to be the
section heading,
* "4.2 Task Migration" is about "L3 VM Mobility" so this ought to be the
section heading.
It would also help not to switch from VM to task across these sections
- it's just a distraction.
S.4.1 needs better signposting of where each sub-case ends (Subsections
might be useful to solve this):
* IPv4
* end-user client
* 2 paras starting "All NVEs communicating with this virtual machine..."
[Not clear that the end-user case has ended and we have returned to the
general IPv4 case?]
* IPv6 [Strictly, it still hasn't said whether the end-user client case has
ended.] [Also, it doesn't explain why there is no need for an end-user
client case under IPv6?]
Sections 5 & 6 seem to be about either L2 or L3 mobility, whereas
Sections 7 & 8 seem to be restricted to L2.
The draft vacillates over what to do with packets arriving at the old NVE
in the L3 case (see also L3 mobility above):
* S4.2 first says packets are dropped, possibly with an ICMP error message;
o then later it says they are silently dropped;
o then in the very next sentence it says either silently drop them or
forward them to the new location
* S.5 says they should not be lost, but instead delivered to the
destination hypervisor
o then it describes how they are tunnelled (which is not the same as
"forwarding").
The order in which all the stages of mobility are given is jumbled up
across sections that also appear in arbitrary order:
* S.5 prepares, establishes, uses then stops a tunnel, but it doesn't say
where the other stages fit between these steps
o When tunneling packets, it talks about the *migrating* VM not the
*migrated* VM, which implies tunnelling has started before the new VM
is running. Does this imply there is a huge buffer?
o It says "Stop Tunneling Packets - When source NVE stops receiving
packets destined to..." but it is never clear when a source has stopped
sending packets to a destination, unless it explicitly closes the
connection (e.g. with a FIN in the case of TCP). Often there are long
gaps between packets, because many flows are 'thin' (meaning the
application frequently has nothing to send). These gaps can last for
milliseconds, hours or even days without any implication that the
connection has ended.
* Then S.6. describes moving state, but doesn't say that this is not after
the previous tunnelling steps (or where it fits within those steps).
* Then S.7 describes hot, warm and cold mobility, but doesn't lay out the
tunnelling or steps to move state in each case.
* Then S.8 says it's about VM life-cycle, but just gives the very first 3
steps for allocation of resources to a VM, then abruptly ends, without
even starting the VM, let alone getting to move it.
S.5 exhibits another inconsistency by talking about the hypervisor, not the NVE.
==#. Nits==
Nits with the English are too numerous to mention them all. Below are
pointers to general problems as well as some individual instances.
S.4
"Layer 2 and Layer 3 protocols are described next. In the following
sections, we examine more advanced features."
s/following/subsequent/
S.4.1
Expand WSC, MSC and NVA on first use.
s/the VM moves in the same link/the VM moves in the same subnet/
"i.e. end-user clients ask for the same MAC address upon migration. [...]
to ensure that the same IPv4 address is assigned to the VM." I think
s/IPv4/MAC/ was intended?
" All NVEs communicating with this virtual machine uses the old ARP
entry. If any VM in those NVEs need to talk to the new VM in the
destination NVE, it uses the old ARP entry."
Repetition: these 2 sentences say the same. (The mistake is also repeated
when these 2 sentences are repeated for IPv6).
S.4.2.1
s/Push the new mapping to hosts./Push the new mapping to communicating hosts./
S.5.
The IPv4/IPv6 pairs of paras for "tunnel establishment" and "tunneling packets"
only differ in the words "IPv4"/"IPv6". So in each case a single para
could be given for IP (irrespective of whether v4 or v6).
Thank you very much.
Linda Dunbar
_______________________________________________
--
________________________________________________________________
Bob Briscoe http://bobbriscoe.net/
Bob Briscoe
2018-09-19 14:53:40 UTC
Permalink
Behcet,

I would like to make significant responses to many of Linda's responses,
but until we get answers to the two pre-requisite questions I've given,
I can't be sure how to respond.

So rather than promising a new version with no prior discussion, I
believe it would be much more fruitful to engage in this conversation.
I'm trying to help.

Cheers


Bob
Post by Behcet Sarikaya
Hi Bob,
Thank you for your comments.
The authors are currently discussing your points and we will come up
with a revision soon after the discussions are over.
Regards,
Behcet
Linda,
Until we can all understand the answers to the following two
questions, I don't think we can discuss what track this draft
ought to be on, let alone move on to your responses to all my
other points.
1/ Applicability
You say this draft solely applies to connections with both ends
within the controlled DC environment. But the draft says it's
about multi-tenant DCs. Are there any multi-tenant DCs that
restrict all VMs to only communicate with other VMs within the
same controlled DC environment?
2/ Purpose of publishing as an RFC
Post by Bob Briscoe
#. The introduction does not say what the purpose of publishing this draft is.
[Linda] The first paragraph on Page 3 has the description why VM
Mobility is needed.
Whether VM Mobility is needed was not my question. My question was
what is the purpose of the IETF publishing an RFC about VM
Mobility? And particularly, what is /this/ RFC intended to achieve?
Are the authors trying to argue for a particular approach vs.
others? Are you trying to write a tutorial? Are you trying to give
the pros and cons of different approaches? Are you trying to give
advice on good practice (with the implication that alternative
practices are less good)? Are you trying to clarify ideas by
writing them down? Are you trying to outline the implications of
VM Mobility for other protocols being developed within the NVO WG?
Bob
Post by Bob Briscoe
Bob,
Thank you very much for reviewing the draft and providing in-depth
comments. I am very sorry for the delayed response due to traveling.
-----Original Message-----
Sent: Monday, September 03, 2018 9:45 PM
Subject: Tsvart last call review of draft-ietf-nvo3-vmm-04
Reviewer: Bob Briscoe
Review result: Not Ready
I have been selected as the Transport Directorate reviewer for
this draft. The Transport Directorate seeks to review all
transport or transport-related drafts as they pass through IETF
last call and IESG review, and sometimes on special request. The
purpose of the review is to provide assistance to the Transport
ADs. For more information about the Transport Directorate Reviews
and the Transport Area Review Team, please see
https://trac.ietf.org/trac/tsv/wiki/TSV-Directorate-Reviews
In this case, very very few of the review comments relate to
transport issues, although the greatest issue concerns a desire
that the network could pause or stop connections during L3 VM
Mobility, which is certainly a transport issue.
[Linda] There is “Hot Migration”, in which transport service
continues, and there is “Cold Migration”, a common practice in
many data centers, which stops the task running in the old place
and moves it to the new place before restarting it, as described
under Task Migration.
Would it be helpful to add this description to the draft?
==Summary==
The technical aspects of the draft concerning L2 VM mobility
(within a subnet) seem sound. However, this is only part of the
draft, which has the following issues:
#. The introduction does not say what the purpose of publishing this draft is.
It seems that, rather than describing a specific protocol or
protocols, it intends to describe the overall system procedure
that would typically be used in DCs for VM mobility. It is tagged
as a BCP, but it does not say who needs this BCP, why it is
useful for the IETF to publish this BCP, how wide the authors'
knowledge is of current practice (given DCs are private), or why
this is a BCP rather than a protocol spec.
[Linda] The first paragraph on Page 3 has the description why VM
Mobility is needed. Is it helpful to move this paragraph to the
beginning of the Introduction Section?
“Virtualization which is being used in almost all of today’s data
centers enables many virtual machines to run on a single physical
computer or compute server. Virtual machines (VM) need hypervisor
running on the physical compute server to provide them shared
processor/memory/storage. Network connectivity is provided by the
network virtualization edge (NVE) [RFC8014]. Being able to move VMs
dynamically, or live migration, from one server to another allows for
dynamic load balancing or work distribution and thus it is a highly
desirable feature [RFC7364].”
The draft starts out (S.3) as if it intends to say what a good VM
Mobility protocol should or shouldn't do, but the rest of the
document doesn't give any reasoning for these recommendations, it
just asserts what appears to be one view of how a whole VM
Mobility system works, sometimes referring to one example
protocol RFC for a component part, but more often with no
references or details.
[Linda] Is it helpful to move the paragraph above to the
beginning of the Introduction Section? So that audience is aware
of why VM Mobility is needed. And then follow up with what a good
VM Mobility protocol should or shouldn't do?
#. It does not seem as if the NVO WG has discussed the purpose of
using normative text in this draft. See detailed comments.
[Linda] The “Intended status” of the draft is “Best Current
Practice”. So all the text are not “normative”. Is it Okay?
#. The draft silently slips back and forth between VM mobility
and VM redundancy, without recognizing the differences. See
detailed comments.
[Linda] There is only one usage of “redundancy” in the entire
document, used under the context of “Hot standby option”,
indicating the “redundancy” of “the VMs in both primary and
secondary domains have identical information and can provide
services simultaneously as in load-share mode of operation” being
expensive.
#. Please adopt different terminology than "source NVE" and
"destination NVE", which are really poor choices of terms for an
intermediate node. See detailed comments. Why not use "old NVE"
and "new NVE", which is what you mean?
[Linda] Thanks for the suggestion. We will change to “Old NVE”,
and “new NVE”.
#. Applicability is fairly clearly outlined, but it is not clear
whether hosts corresponding with the mobile VMs are part of the
same controlled environment or on the uncontrolled public
Internet. See detailed comments.
[Linda] “Hosts” are the apps running on the VM. They are under
the same controlled environment, not on the uncontrolled public
Internet.
#. Section 4.2.1 on L3 VM mobility reads like some potential
half-thought-through ideas on how to solve L3 mobility, rather
than current practice, let alone best current practice. Either
current practice should be described instead, or the scope of the
draft should be narrowed solely to L2 VM mobility. See detailed
comments.
[Linda] This is referring to “Cold Migration”, which is a common
practice in many data centers.
# The VM's file system is described as state that moves with the
VM (S.6), but VM mobility solutions often move the VM but stitch
it back to its (unmoved) storage. Conversely, the storage can
also move independent of the VM.
[Linda] It depends. When a VM moves to a different zone, the
storage/file can become inaccessible.
#. The draft omits some of the security, transport and management
aspects of VM mobility. See detailed comments.
[Linda] Can you provide some text?
#. The draft reads as if different sections have been written by
different authors and no-one has edited the whole to give it a
coherent structure, or to ensure consistency (both technical and
editorial) between the parts. See detailed comments.
[Linda] we can improve.
#. The quality of the English grammar does not allow a reviewer
to concentrate on the technical aspects rather than the English.
It would have been useful if one of the English-speaking
co-authors had improved the English before submission for review.
See detailed comments.
[Linda] can you help?  Becoming a co-author to improve?
==Detailed Comments==
===#. Normative statements===
In the body of the document, there is just one occurrence of
normative text (actually two "MUST"s, but both state a common
requirement - just written separately for IPv4 and IPv6). This
merely serves to imply that everything else the document says is
less important or optional, which was probably not the intention.
[Linda] The goal is to indicate any solution in moving the VM
“MUST” follow this rule. They make sense, aren’t they?
At the start there is a requirements section, which states what a
VM Mobility protocol "SHOULD" or "SHOULD NOT" do. I think this is
intended as a set of goals for the rest of the document. If so,
these "SHOULDs" are not intended to apply to implementations, so
they ought not to be capitalized.
[Linda] okay, will change.
The first requirement, "Data center network SHOULD support
virtual machine mobility in IPv6", is written as a requirement on
all DC networks, not on implementations. I assume this was
intended to read as "Data center network virtual machine mobility
protocols SHOULD support IPv6". Even then, it doesn't really add
anything to say VM mobility should support v6 and it should
support v4. A L2 solution won't. While undoubtedly, a L3 solution
will at least support one of them.
[Linda]Agree. Will change it to “Data center that support IPv6
address should 
”
I'm not sure that 'protocol' is the right word anyway; I think
'VM Mobility procedure' would be a better phrase, because it
includes steps such as suspending the VM, which is more than a
protocol.
[Linda] yes. Will change to “Procedure”.
The requirement "Virtual machine mobility protocol MAY support
host routes to accomplish virtualization", is not followed up at
all in the rest of the draft.
Even if this requirement stays, the last 3 words should be deleted.
[Linda] will change to “Host Route can be used to support the
Virtual Machine Mobility Procedure.”
By the end of the draft, the solution falls far short of the most
relevant "Requirements" anyway, so one assumes the title of the
section ought to have been "Goals". Specifically, even in the
simpler case of L2 VM mobility, S.4.1 says that triangular
routing and tunnelling persist "until a neighbour cache entry
times out". A cache timeout is about 10 orders of magnitude
longer than the requirement to only persist "while handling
packets in flight", which would be a few milliseconds at most
(the time for packets to clear the network that were already
launched into flight when the old VM stopped).
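As a rough check on the size of this gap, the sketch below compares a cache timeout with a packets-in-flight window. All numbers are assumptions chosen for illustration (a 5 ms in-flight window, and cache timeouts of 30 s and 4 hours); the draft does not specify them, and the result shifts with whatever values a given deployment uses.

```python
import math


def orders_of_magnitude(cache_timeout_s, in_flight_s):
    """Return log10 of the ratio between the two durations."""
    return math.log10(cache_timeout_s / in_flight_s)


# Assumed values for illustration only, not taken from the draft.
IN_FLIGHT_S = 0.005          # "a few milliseconds at most"
SHORT_TIMEOUT_S = 30.0       # hypothetical short neighbour cache timeout
LONG_TIMEOUT_S = 4 * 3600.0  # hypothetical long ARP cache timeout

for label, timeout in (("short", SHORT_TIMEOUT_S), ("long", LONG_TIMEOUT_S)):
    print(label, round(orders_of_magnitude(timeout, IN_FLIGHT_S), 2))
```

Under these assumptions the gap is several orders of magnitude in either case, so the triangular routing would persist far longer than the stated goal allows.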
Whatever, it would be preferable for the draft to give rationale
for these requirements, rather than just assert them. This would
help to shed light on the merits of the different trade offs that
solutions choose.
[Linda] Agree, will add.
===#. Mobility vs. Redundancy===
Redundancy and mobility have a lot of similarities, but they have
different goals. With mobility, it is necessary to know the exact
instant when one set of state is identical to the other so it can
hand over. With redundancy, the aim is to keep two (or more) sets
of state evolving through the same sequence of changes, but there
is no need to know the point at which one is the same as the
other was at a certain point.
[Linda] Agree with what you said. There is only one usage of
“redundancy” in the entire document, used under the context of
“Hot standby option”, indicating  the “redundancy” of  “the VMs
in both primary and secondary domains have identical information
and can provide services simultaneously as in load-share mode of
operation” being expensive.
* S.2. Terminology: Warm VM Mobility is defined without any
ending, as if it is permanent replication.
* S.7. "Handling of Hot, Warm and Cold Virtual Machine Mobility"
is actually all about redundancy, and doesn't address mobility
explicitly.
[Linda] Will add the definition “Hot Migration”, “cold
migration”, and “warm migration”.
===#. Terminology===
Packets run from the source at A to the destination at B via
NVE1, then via NVE2. Please don't call NVE1 and NVE2 the source
NVE and the destination NVE.
In future, no-one will thank you for the apparent contradictions
when they continually stumble over phrases like this one in
S.4.1: "...send their packets to the source NVE".
The term "packets in flight" is used incorrectly to refer to all
the packets sent to the old NVE after the VM has moved, even if
they were launched into flight long after the old VM stopped
receiving packets.
[Linda] Thanks for the comments. Will change.
BTW, I think s/before/after/ in: "that have old ARP or neighbor
cache entry before VM or task migration".
I think: s/IP-based VM mobility/L3 VM mobility/ throughout, because "based"
sounds (to me) like the mobility control protocol is over (i.e. based on) IP.
===#. Applicability===
In section 4.2 it says that the protocol mostly used as the IP
based task migration protocol is ILA. This implies that all hosts
corresponding with the mobile VMs are either part of the same
controlled environment, or they are proxied via nodes that are
part of the same controlled environment (I only have passing
knowledge of ILA, but I understand that it depends on ILA routers
on the path). If I am correct, this aspect of scope needs to be
made clear from the start.
Also under the heading of applicability, the sentence "Since
migrations should be relatively rare events" appears very late in
the document (S.4.2.1). The assumed level of churn ought to be
stated nearer the start.
[Linda] yes, under the same controlled environment.
===#. L3 Mobility===
L2 VM mobility is independent of the application, because
resolution of L2 mappings is delegated to the stack. In contrast,
L3 VM mobility is only feasible under certain conditions, because
an application needs an IP address to open a socket (resolution
of DNS names is not delegated to the stack, and apps can use IP
addresses directly anyway).
a) /All/ applications used in the whole DC load balancing scheme
contain IP address migration logic for /all/ their connections;
b) VMs running solely applications that support IP address
migration register this fact with the NVA, and it only selects
such VMs for mobility.
c) An abstraction is layered over /all/ the IP addresses exposed
to applications (at both ends) so that the IP addresses that
applications use are solely identifiers (e.g. ILA, LISP, HIP),
not also locators.
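Condition (c) can be sketched as a mapping table that sits between the address an application holds and the address the network routes on. This is a minimal illustration only: the class name `MappingTable`, its methods, and the identifier `vm-42` are hypothetical, the addresses are documentation prefixes, and none of this reflects how ILA, LISP or HIP actually encode or distribute their mappings.

```python
class MappingTable:
    """Hypothetical identifier-to-locator table (illustration only)."""

    def __init__(self):
        self._locator_by_identifier = {}

    def register(self, identifier, locator):
        # Record where an identifier is currently attached.
        self._locator_by_identifier[identifier] = locator

    def move(self, identifier, new_locator):
        # A move changes only the locator; the identifier is untouched.
        if identifier not in self._locator_by_identifier:
            raise KeyError(identifier)
        self._locator_by_identifier[identifier] = new_locator

    def resolve(self, identifier):
        return self._locator_by_identifier[identifier]


table = MappingTable()
table.register("vm-42", "192.0.2.10")
peer_view = "vm-42"                    # what the application holds
before = table.resolve(peer_view)
table.move("vm-42", "198.51.100.7")    # the VM moves to a new subnet
after = table.resolve(peer_view)
print(before, after)
```

The application's handle (`peer_view`) never changes across the move, which is why this condition only holds if /every/ address exposed to applications goes through such an abstraction.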
The introduction says the draft is about VM mobility in a
multi-tenant DC, so the DC admin will not know the range of
applications being used. This excludes condition (a) above. When
the draft says "...if all applications running are known to
handle this gracefully...", it doesn't quantify just how
restrictive this condition is, and it gives no explanation of how
this knowledge might be 'known' or which function within the
system 'knows' it.
S.4.2.1 contains what seems like plenty of arm-waving.
* "TCP connections could be automatically closed in the network
stack during a migration event."
        o There is no TCP connection state in the network stack.
        o Even if the network starts to drop every packet, the TCP
        connection state persists in the end-points for a duration
        of the order of 30-90 minutes (OS-dependent) before TCP
        deems the connection is broken.
        o Other transport protocols have similar designs (including
        the app-layer of protocols over UDP).
        o pausing the connection [does this refer to an actual
        feature of any L4 protocol?]
        o packaging connection state and sending to target [does
        this assume logic written into the application, or is this
        assuming the stack handles this and the app is restricted
        to using some form of separate identifier/locator
        addresses?]
        o instantiating connection state in the peer stack [ditto?].
  "Cold Virtual Machine mobility is facilitated by the VM initially
   sending an ARP or Neighbor Discovery message at the destination NVE
   but the source NVE not receiving any packets inflight."
   [How is it arranged for the source NVE not to receive any packets
   in flight?]
  "In hot
   standby option, regarding TCP connections, one option is to start
   with and maintain TCP connections to two different VMs at the same
   time."
   [This sounds like resilience logic has been written into the
   application, which would be a special case but not something VM
   mobility infrastructure could depend on.]
[Linda] will add.
===#. Gaps===
#. Security Considerations: repeats issues in other drafts that
are not specific to mobility, but it does not mention any
security issues specifically due to VM mobility. It says that
address spoofing may arise in a DC (sort-of implying it is worse
than in non-DC environments, but not saying why). The handshake
at the start of a connection (e.g. TCP, SCTP, QUIC) checks for
source address spoofing. So L3 VM mobility would be more
vulnerable to source address spoofing in cases where the mobile
VM was the connection initiator and there was not a new handshake
after the move. However, this draft does not contain any detailed
mobility protocols, so it is not possible to identify any
specific security flaws.
#. Transport Issues: Effect of delay on the transport: Cold
mobility introduces significant delay, and other forms less, but
still some delay. It should be pointed out that some applications
(e.g. real-time) will therefore not be useful if subjected to VM
mobility. Similarly, even a short period of delay will drive most
congestion controls to severely reduce throughput. These points
might be self-evident, but perhaps they should be stated explicitly.
BTW, in the L3 VM mobility case, the draft often refers to TCP
connections, but the address bindings of any transport protocols
would have to be migrated due to VM mobility (e.g. SCTP;
sequences of datagrams over UDP; streams over UDP such as with
RTP, QUIC).
#. Management Issues: perhaps the draft ought to recommend
statistics gathering (e.g. time taken, amount of duplicate data)
to aid a DC's future decisions on the cost-benefit of moving a
VM. The OPSDIR review says a BCP does not /have/ to describe
management issues, but this document seems to describe a whole
system procedure, not just a protocol, which then surely includes
the management plane.
[Linda] can you become a co-author and add those in?
===#. Incoherent Structure===
S.4.1. happens to talk about VMs moving, while S.4.2. happens to
talk about tasks moving, but this is not the distinguishing
aspect of these two sections (anyway, S.2. says "the draft uses
task and VM interchangeably"):
* "4.1 VM Migration" is about "L2 VM Mobility" so this ought to
be the section heading,
* "4.2 Task Migration" is about "L3 VM Mobility" so this ought to
be the section heading.
It would also help not to switch from VM to task across these
sections - it's just a distraction.
S.4.1 needs better signposting of where each sub-case ends
(Subsections might be useful to solve this):
* IPv4
* end-user client
* 2 paras starting "All NVEs communicating with this virtual
machine..." [Not clear that the end-user case has ended and we
have returned to the general IPv4 case?]
* IPv6 [Strictly, it still hasn't said whether the end-user
client case has ended.] [Also, it doesn't explain why there is no
need for an end-user client case under IPv6?]
Sections 5 & 6 seem to be about either L2 or L3 mobility, whereas
Sections 7 & 8 seem to be restricted to L2.
The draft vacillates over what to do with packets arriving at the
old NVE in the L3 case (see also L3 mobility above):
* S4.2 first says packets are dropped, possibly with an ICMP
error message;
  o then later it says they are silently dropped;
  o then in the very next sentence it says either silently drop
  them or forward them to the new location
* S.5 says they should not be lost, but instead delivered to the
destination hypervisor
  o then it describes how they are tunnelled (which is not the same as
  "forwarding").
The order in which all the stages of mobility are given is jumbled
up across sections that also appear in arbitrary order:
* S.5 prepares, establishes, uses then stops a tunnel, but it
doesn't say where the other stages fit between these steps
        o When tunneling packets, it talks about the *migrating* VM
        not the *migrated* VM, which implies tunnelling has started
        before the new VM is running. Does this imply there is a
        huge buffer?
        o It says "Stop Tunneling Packets - When source NVE stops
        receiving packets destined to..." but it is never clear
        when a source has stopped sending packets to a destination,
        unless it explicitly closes the connection (e.g. with a FIN
        in the case of TCP). Often there are long gaps between
        packets, because many flows are 'thin' (meaning the
        application frequently has nothing to send). These gaps can
        last for milliseconds, hours or even days without any
        implication that the connection has ended.
* Then S.6. describes moving state, but doesn't say that this is
not after the previous tunnelling steps (or where it fits within
those steps).
* Then S.7 describes hot, warm and cold mobility, but doesn't lay
out the tunnelling or steps to move state in each case.
* Then S.8 says it's about VM life-cycle, but just gives the very
first 3 steps for allocation of resources to a VM, then abruptly
ends, without even starting the VM, let alone getting to move it.
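The "Stop Tunneling Packets" concern above can be illustrated with a short sketch. It assumes, purely for illustration, that the old NVE would use an idle timeout to decide a sender has stopped; the draft does not say how this is decided, and the function name, the 60 s timeout and the packet trace are all hypothetical.

```python
def first_false_stop(arrival_times_s, idle_timeout_s):
    """Return the time at which an idle-timeout rule would declare the
    flow stopped even though a later packet arrives, or None if no gap
    in the trace exceeds the timeout."""
    for prev, nxt in zip(arrival_times_s, arrival_times_s[1:]):
        if nxt - prev > idle_timeout_s:
            # The rule fires idle_timeout_s after the last packet seen,
            # but the trace shows the flow was still alive.
            return prev + idle_timeout_s
    return None


# Hypothetical 'thin' flow: a burst, an hour of silence, then more packets.
thin_flow = [0.0, 0.2, 0.4, 3600.4, 3600.6]
print(first_false_stop(thin_flow, idle_timeout_s=60.0))
```

Any finite timeout either tears the tunnel down under a live thin flow, or keeps it up long after the goal of only handling packets in flight has been missed.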
S.5 exhibits another inconsistency by talking about the
hypervisor, not the NVE.
==#. Nits==
Nits with the English are too numerous to mention them all. Below
are pointers to general problems as well as some individual
instances.
S.4
  "Layer 2 and Layer 3 protocols are described next.  In the following
   sections, we examine more advanced features."
        s/following/subsequent/
S.4.1
Expand WSC, MSC and NVA on first use.
s/the VM moves in the same link/the VM moves in the same subnet/
"i.e. end-user clients ask for the same MAC address upon
migration. [...] to ensure that the same IPv4 address is assigned
to the VM." I think s/IPv4/MAC/ was intended?
"  All NVEs communicating with this virtual machine uses the old ARP
   entry.  If any VM in those NVEs need to talk to the new VM in the
   destination NVE, it uses the old ARP entry."
Repetition: these 2 sentences say the same. (The mistake is also
repeated when these 2 sentences are repeated for IPv6).
S.4.2.1
s/Push the new mapping to hosts./Push the new mapping to
communicating hosts./
S.5.
The IPv4/IPv6 pairs of paras for "tunnel establishment" and "tunneling packets"
only differ in the words "IPv4"/"IPv6". So in each case a single
para could be given for IP (irrespective of whether v4 or v6).
Thank you very much.
Linda Dunbar
_______________________________________________
Tsv-art mailing list
https://www.ietf.org/mailman/listinfo/tsv-art
--
________________________________________________________________
Bob Briscoe http://bobbriscoe.net/
--
________________________________________________________________
Bob Briscoe http://bobbriscoe.net/
Behcet Sarikaya
2018-09-20 14:30:35 UTC
Permalink
Dear Bob,
Post by Bob Briscoe
Behcet,
I would like to make significant responses to many of Linda's responses,
but until we get answers to the two pre-requisite questions I've given, I
can't be sure how to respond.
So rather than promising a new version with no prior discussion, I believe
it would be much more fruitful to engage in this conversation. I'm trying
to help.
You already made a detailed review.
Your two points are clarifications from your detailed review.
When I said we will revise I meant we will revise based on your detailed
review.
After we post our revision you can do what ever you wish.

Sincerely,
Behcet
Post by Bob Briscoe
Cheers
Bob
Hi Bob,
Thank you for your comments.
The authors are currently discussing your points and we will come up with
a revision soon after the discussions are over.
Regards,
Behcet
Post by Bob Briscoe
Linda,
Until we can all understand the answers to the following two questions, I
don't think we can discuss what track this draft ought to be on, let alone
move on to your responses to all my other points.
1/ Applicability
You say this draft solely applies to connections with both ends within
the controlled DC environment. But the draft says it's about multi-tenant
DCs. Are there any multi-tenant DCs that restrict all VMs to only
communicate with other VMs within the same controlled DC environment?
2/ Purpose of publishing as an RFC
#. The introduction does not say what the purpose of publishing this draft is.
[Linda] The first paragraph on Page 3 has the description why VM Mobility
is needed.
Whether VM Mobility is needed was not my question. My question was what
is the purpose of the IETF publishing an RFC about VM Mobility? And
particularly, what is /this/ RFC intended to achieve?
Are the authors trying to argue for a particular approach vs. others? Are
you trying to write a tutorial? Are you trying to give the pros and cons of
different approaches? Are you trying to give advice on good practice (with
the implication that alternative practices are less good)? Are you trying
to clarify ideas by writing them down? Are you trying to outline the
implications of VM Mobility for other protocols being developed within the
NVO WG?
Bob
Bob,
Thank you very much for reviewing the draft and providing in-depth
comments. I am very sorry for the delayed response due to traveling.
-----Original Message-----
Sent: Monday, September 03, 2018 9:45 PM
Subject: Tsvart last call review of draft-ietf-nvo3-vmm-04
Reviewer: Bob Briscoe
Review result: Not Ready
I have been selected as the Transport Directorate reviewer for this
draft. The Transport Directorate seeks to review all transport or
transport-related drafts as they pass through IETF last call and IESG
review, and sometimes on special request. The purpose of the review is to
provide assistance to the Transport ADs. For more information about the
Transport Directorate Reviews and the Transport Area Review Team, please
see https://trac.ietf.org/trac/tsv/wiki/TSV-Directorate-Reviews
In this case, very very few of the review comments relate to transport
issues, although the greatest issue concerns a desire that the network
could pause or stop connections during L3 VM Mobility, which is certainly a
transport issue.
[Linda] There is “Hot Migration” with transport service continuing, and
there is a “Cold Migration”, which is a common practice in many data
centers, which stop the task running on the old place and move to the new
place before restart as described in the Task Migration.
Is it helpful to add this description to the draft?
==Summary==
The technical aspects of the draft concerning L2 VM mobility (within a
subnet) seem sound. However, this is only part of the draft, which has the
following issues:
#. The introduction does not say what the purpose of publishing this draft is.
It seems that, rather than describing a specific protocol or protocols,
it intends to describe the overall system procedure that would typically be
used in DCs for VM mobility. It is tagged as a BCP, but it does not say who
needs this BCP, why it is useful for the IETF to publish this BCP, how wide
the authors' knowledge is of current practice (given DCs are private), or
why this is a BCP rather than a protocol spec.
[Linda] The first paragraph on Page 3 has the description why VM Mobility
is needed. Is it helpful to move this paragraph to the beginning of the
Introduction Section?
“Virtualization which is being used in almost all of today’s data
centers enables many virtual machines to run on a single physical
computer or compute server. Virtual machines (VM) need hypervisor
running on the physical compute server to provide them shared
processor/memory/storage. Network connectivity is provided by the
network virtualization edge (NVE) [RFC8014]. Being able to move VMs
dynamically, or live migration, from one server to another allows for
dynamic load balancing or work distribution and thus it is a highly
desirable feature [RFC7364].”
The draft starts out (S.3) as if it intends to say what a good VM
Mobility protocol should or shouldn't do, but the rest of the document
doesn't give any reasoning for these recommendations, it just asserts what
appears to be one view of how a whole VM Mobility system works, sometimes
referring to one example protocol RFC for a component part, but more often
with no references or details.
[Linda] Is it helpful to move the paragraph above to the beginning of the
Introduction Section? So that audience is aware of why VM Mobility is
needed. And then follow up with what a good VM Mobility protocol should or
shouldn't do?
#. It does not seem as if the NVO WG has discussed the purpose of using
normative text in this draft. See detailed comments.
[Linda] The “Intended status” of the draft is “Best Current Practice”. So
all the text are not “normative”. Is it Okay?
#. The draft silently slips back and forth between VM mobility and VM
redundancy, without recognizing the differences. See detailed comments.
[Linda] There is only one usage of “redundancy” in the entire document,
used under the context of “Hot standby option”, indicating the
“redundancy” of “the VMs in both primary and secondary domains have
identical information and can provide services simultaneously as in
load-share mode of operation” being expensive.
#. Please adopt different terminology than "source NVE" and "destination
NVE", which are really poor choices of terms for an intermediate node. See
detailed comments. Why not use "old NVE" and "new NVE", which is what you
mean?
[Linda] Thanks for the suggestion. We will change to “Old NVE”, and “new
NVE”.
#. Applicability is fairly clearly outlined, but it is not clear whether
hosts corresponding with the mobile VMs are part of the same controlled
environment or on the uncontrolled public Internet. See detailed comments.
[Linda] “Hosts” are the apps running on the VM. They are under the same
controlled environment, not on the uncontrolled public Internet.
#. Section 4.2.1 on L3 VM mobility reads like some potential
half-thought-through ideas on how to solve L3 mobility, rather than current
practice, let alone best current practice. Either current practice should
be described instead, or the scope of the draft should be narrowed solely
to L2 VM mobility. See detailed comments.
[Linda] This is referring to “Cold Migration”, which is a common
practice in many data centers.
# The VM's file system is described as state that moves with the VM
(S.6), but VM mobility solutions often move the VM but stitch it back to
its (unmoved) storage. Conversely, the storage can also move independent of
the VM.
[Linda] It depends. When a VM moves to a different zone, the storage/file
can become inaccessible.
#. The draft omits some of the security, transport and management aspects
of VM mobility. See detailed comments.
[Linda] Can you provide some text?
#. The draft reads as if different sections have been written by
different authors and no-one has edited the whole to give it a coherent
structure, or to ensure consistency (both technical and editorial) between
the parts. See detailed comments.
[Linda] we can improve.
#. The quality of the English grammar does not allow a reviewer to
concentrate on the technical aspects rather than the English. It would have
been useful if one of the English-speaking co-authors had improved the
English before submission for review. See detailed comments.
[Linda] can you help? Becoming a co-author to improve?
==Detailed Comments==
===#. Normative statements===
In the body of the document, there is just one occurrence of normative
text (actually two "MUST"s, but both state a common requirement - just
written separately for IPv4 and IPv6). This merely serves to imply that
everything else the document says is less important or optional, which was
probably not the intention.
[Linda] The goal is to indicate any solution in moving the VM “MUST”
follow this rule. They make sense, aren’t they?
At the start there is a requirements section, which states what a VM
Mobility protocol "SHOULD" or "SHOULD NOT" do. I think this is intended as
a set of goals for the rest of the document. If so, these "SHOULDs" are not
intended to apply to implementations, so they ought not to be capitalized.
[Linda] okay, will change.
The first requirement, "Data center network SHOULD support virtual
machine mobility in IPv6", is written as a requirement on all DC networks,
not on implementations. I assume this was intended to read as "Data center
network virtual machine mobility protocols SHOULD support IPv6". Even then,
it doesn't really add anything to say VM mobility should support v6 and it
should support v4. A L2 solution won't. While undoubtedly, a L3 solution
will at least support one of them.
[Linda]Agree. Will change it to “Data center that support IPv6 address
should 
”
I'm not sure that 'protocol' is the right word anyway; I think 'VM
Mobility procedure' would be a better phrase, because it includes steps
such as suspending the VM, which is more than a protocol.
[Linda] yes. Will change to “Procedure”.
The requirement "Virtual machine mobility protocol MAY support host
routes to accomplish virtualization", is not followed up at all in the rest
of the draft.
Even if this requirement stays, the last 3 words should be deleted.
[Linda] will change to “Host Route can be used to support the Virtual
Machine Mobility Procedure.”
By the end of the draft, the solution falls far short of the most
relevant "Requirements" anyway, so one assumes the title of the section
ought to have been "Goals". Specifically, even in the simpler case of L2 VM
mobility, S.4.1 says that triangular routing and tunnelling persist "until
a neighbour cache entry times out". A cache timeout is about 10 orders of
magnitude longer than the requirement to only persist "while handling
packets in flight", which would be a few milliseconds at most (the time for
packets to clear the network that were already launched into flight when
the old VM stopped).
Whatever, it would be preferable for the draft to give rationale for
these requirements, rather than just assert them. This would help to shed
light on the merits of the different trade offs that solutions choose.
[Linda] Agree, will add.
===#. Mobility vs. Redundancy===
Redundancy and mobility have a lot of similarities, but they have
different goals. With mobility, it is necessary to know the exact instant
when one set of state is identical to the other so it can hand over. With
redundancy, the aim is to keep two (or more) sets of state evolving through
the same sequence of changes, but there is no need to know the point at
which one is the same as the other was at a certain point.
[Linda] Agree with what you said. There is only one usage of “redundancy”
in the entire document, used under the context of “Hot standby option”,
indicating the “redundancy” of “the VMs in both primary and secondary
domains have identical information and can provide services simultaneously
as in load-share mode of operation” being expensive.
* S.2. Terminology: Warm VM Mobility is defined without any ending, as if
it is permanent replication.
* S.7. "Handling of Hot, Warm and Cold Virtual Machine Mobility" is
actually all about redundancy, and doesn't address mobility explicitly.
[Linda] Will add the definition “Hot Migration”, “cold migration”, and
“warm migration”.
===#. Terminology===
Packets run from the source at A to the destination at B via NVE1, then
via NVE2. Please don't call NVE1 and NVE2 the source NVE and the
destination NVE.
In future, no-one will thank you for the apparent contradictions when
they continually stumble over phrases like this one in S.4.1: "...send
their packets to the source NVE".
The term "packets in flight" is used incorrectly to refer to all the
packets sent to the old NVE after the VM has moved, even if they were
launched into flight long after the old VM stopped receiving packets.
[Linda] Thanks for the comments. Will change.
BTW, I think s/before/after/ in: "that have old ARP or neighbor cache
entry before VM or task migration".
I think: s/IP-based VM mobility/L3 VM mobility/ throughout, because "based"
sounds (to me) like the mobility control protocol is over (i.e. based on) IP.
===#. Applicability===
In section 4.2 it says that the protocol mostly used as the IP based task
migration protocol is ILA. This implies that all hosts corresponding with
the mobile VMs are either part of the same controlled environment, or they
are proxied via nodes that are part of the same controlled environment (I
only have passing knowledge of ILA, but I understand that it depends on ILA
routers on the path). If I am correct, this aspect of scope needs to be
made clear from the start.
Also under the heading of applicability, the sentence "Since migrations
should be relatively rare events" appears very late in the document
(S.4.2.1). The assumed level of churn ought to be stated nearer the start.
[Linda] yes, under the same controlled environment.
===#. L3 Mobility===
L2 VM mobility is independent of the application, because resolution of
L2 mappings is delegated to the stack. In contrast, L3 VM mobility is only
feasible under certain conditions, because an application needs an IP
address to open a socket (resolution of DNS names is not delegated to the
stack, and apps can use IP addresses directly anyway).
a) /All/ applications used in the whole DC load balancing scheme contain
IP address migration logic for /all/ their connections;
b) VMs running solely applications that support IP address migration
register this fact with the NVA, and it only selects such VMs for mobility.
c) An abstraction is layered over /all/ the IP addresses exposed to
applications (at both ends) so that the IP addresses that applications
use are solely identifiers (e.g. ILA, LISP, HIP), not also locators.
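Condition (c) can be pictured with a rough sketch. This is only an illustration of the identifier/locator idea, not the API of ILA, LISP or HIP; the class name, the identifiers "vm-a"/"vm-b" and the addresses are all hypothetical. The application-visible connection key uses identifiers only, so it does not change when the locator behind an identifier changes.

```python
from dataclasses import dataclass, field


@dataclass
class MappingSystem:
    """Hypothetical identifier-to-locator table (an assumption for illustration)."""
    locator_of: dict = field(default_factory=dict)

    def register(self, identifier: str, locator: str) -> None:
        # Called when a VM attaches, or re-attaches after a move.
        self.locator_of[identifier] = locator

    def resolve(self, identifier: str) -> str:
        # Used below the application, when a packet actually has to be routed.
        return self.locator_of[identifier]


def connection_key(local_id: str, local_port: int, peer_id: str, peer_port: int):
    """The tuple the application and transport see: identifiers, never locators."""
    return (local_id, local_port, peer_id, peer_port)


mapping = MappingSystem()
mapping.register("vm-b", "10.1.0.7")           # vm-b sits behind the old NVE
key_before = connection_key("vm-a", 40000, "vm-b", 443)
locator_before = mapping.resolve("vm-b")

mapping.register("vm-b", "10.2.0.9")           # vm-b moves to a different subnet
key_after = connection_key("vm-a", 40000, "vm-b", 443)
locator_after = mapping.resolve("vm-b")
```

If the application had instead opened its socket to "10.1.0.7" directly, the key itself would contain the stale locator, which is the situation conditions (a) and (b) try to handle in the application.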
The introduction says the draft is about VM mobility in a multi-tenant
DC, so the DC admin will not know the range of applications being used.
This excludes condition (a) above. When the draft says "...if all
applications running are known to handle this gracefully...", it doesn't
quantify just how restrictive this condition is, and it gives no
explanation of how this knowledge might be 'known' or which function within
the system 'knows' it.
S.4.2.1 contains what seems like plenty of arm-waving.
* "TCP connections could be automatically closed in the network stack
during a migration event."
o There is no TCP connection state in the network stack.
o Even if the network starts to drop every packet, the TCP connection
state persists in the end-points for a duration of the order of 30-90
minutes (OS-dependent) before TCP deems the connection is broken.
o Other transport protocols have similar designs (including the app-layer
of protocols over UDP).
o pausing the connection [does this refer to an actual feature of any
L4 protocol?]
o packaging connection state and sending to target [does this assume
logic written into the application, or is this assuming the stack
handles this and the app is restricted to using some form of separate
identifier/locator addresses?]
o instantiating connection state in the peer stack [ditto?].
"Cold Virtual Machine mobility is facilitated by the VM initially
sending an ARP or Neighbor Discovery message at the destination NVE
but the source NVE not receiving any packets inflight."
[How is it arranged for the source NVE not to receive any packets in flight?]
"In hot
standby option, regarding TCP connections, one option is to start
with and maintain TCP connections to two different VMs at the same
time."
[This sounds like resilience logic has been written into the application,
which would be a special case but not something VM mobility infrastructure
could depend on.]
[Linda] will add.
===#. Gaps===
#. Security Considerations: repeats issues in other drafts that are not
specific to mobility, but it does not mention any security issues
specifically due to VM mobility. It says that address spoofing may arise in
a DC (sort-of implying it is worse than in non-DC environments, but not
saying why). The handshake at the start of a connection (e.g. TCP, SCTP,
QUIC) checks for source address spoofing. So L3 VM mobility would be more
vulnerable to source address spoofing in cases where the mobile VM was the
connection initiator and there was not a new handshake after the move.
However, this draft does not contain any detailed mobility protocols, so it
is not possible to identify any specific security flaws.
#. Transport Issues: Effect of delay on the transport: Cold mobility
introduces significant delay, and other forms less, but still some delay.
It should be pointed out that some applications (e.g. real-time) will
therefore not be useful if subjected to VM mobility. Similarly, even a
short period of delay will drive most congestion controls to severely
reduce throughput. These points might be self-evident, but perhaps they
should be stated explicitly.
BTW, in the L3 VM mobility case, the draft often refers to TCP
connections, but the address bindings of any transport protocols would have
to be migrated due to VM mobility (e.g. SCTP; sequences of datagrams over
UDP; streams over UDP such as with RTP, QUIC).
#. Management Issues: perhaps the draft ought to recommend statistics
gathering (e.g. time taken, amount of duplicate data) to aid a DC's future
decisions on the cost-benefit of moving a VM. The OPSDIR review says a BCP
does not /have/ to describe management issues, but this document seems to
describe a whole system procedure, not just a protocol, which then surely
includes the management plane.
[Linda] can you become a co-author and add those in?
===#. Incoherent Structure===
S.4.1. happens to talk about VMs moving, while S.4.2. happens to talk
about tasks moving, but this is not the distinguishing aspect of these two
sections (anyway, S.2. says "the draft uses task and VM interchangeably"):
* "4.1 VM Migration" is about "L2 VM Mobility" so this ought to be the
section heading,
* "4.2 Task Migration" is about "L3 VM Mobility" so this ought to be the
section heading.
It would also help not to switch from VM to task across these sections
- it's just a distraction.
S.4.1 needs better signposting of where each sub-case ends (Subsections
might be useful to solve this):
* IPv4
* end-user client
* 2 paras starting "All NVEs communicating with this virtual machine..."
[Not clear that the end-user case has ended and we have returned to the
general IPv4 case?]
* IPv6 [Strictly, it still hasn't said whether the end-user client case has
ended.] [Also, it doesn't explain why there is no need for an end-user
client case under IPv6?]
Sections 5 & 6 seem to be about either L2 or L3 mobility, whereas
Sections 7 & 8 seem to be restricted to L2.
The draft vacillates over what to do with packets arriving at the old NVE
in the L3 case (see also L3 mobility above):
* S4.2 first says packets are dropped, possibly with an ICMP error message;
o then later it says they are silently dropped;
o then in the very next sentence it says either silently drop them or
forward them to the new location
* S.5 says they should not be lost, but instead delivered to the
destination hypervisor
o then it describes how they are tunnelled (which is not the same as
"forwarding").
The order in which all the stages of mobility are given is jumbled up
across sections that also appear in arbitrary order:
* S.5 prepares, establishes, uses then stops a tunnel, but it doesn't say
where the other stages fit between these steps
o When tunneling packets, it talks about the *migrating* VM not the
*migrated* VM, which implies tunnelling has started before the new VM
is running. Does this imply there is a huge buffer?
o It says "Stop Tunneling Packets - When source NVE stops receiving
packets destined to..." but it is never clear when a source has stopped
sending packets to a destination, unless it explicitly closes the
connection (e.g. with a FIN in the case of TCP). Often there are long
gaps between packets, because many flows are 'thin' (meaning the
application frequently has nothing to send). These gaps can last for
milliseconds, hours or even days without any implication that the
connection has ended.
* Then S.6. describes moving state, but doesn't say that this is not
after the previous tunnelling steps (or where it fits within those steps).
* Then S.7 describes hot, warm and cold mobility, but doesn't lay out the
tunnelling or steps to move state in each case.
* Then S.8 says it's about VM life-cycle, but just gives the very first 3
steps for allocation of resources to a VM, then abruptly ends, without
even starting the VM, let alone getting to move it.
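To make the "Stop Tunneling Packets" point concrete, here is a small sketch of what happens if the old NVE infers "the source has stopped sending" from an idle timer. The draft does not specify such a timer; the function names, the 60 second timeout and the arrival times are assumptions chosen purely for illustration. A thin flow that pauses for longer than the timer is still alive, but its later packets arrive after tunnelling has stopped.

```python
def tunnel_stop_time(arrival_times, idle_timeout):
    """Time at which an idle-timer heuristic would stop tunnelling.

    arrival_times: sorted times (seconds) at which packets for the moved VM
    reach the old NVE. Tunnelling stops the first time idle_timeout seconds
    pass without a packet.
    """
    last = arrival_times[0]
    for t in arrival_times[1:]:
        if t - last > idle_timeout:
            break                      # the timer fired before this packet arrived
        last = t
    return last + idle_timeout


def packets_after_stop(arrival_times, idle_timeout):
    """Packets of the same, still-open flow that arrive once the tunnel is gone."""
    stop = tunnel_stop_time(arrival_times, idle_timeout)
    return [t for t in arrival_times if t > stop]


# A thin flow: a short burst, an hour of silence, then two more packets.
arrivals = [0.0, 0.5, 1.0, 3600.0, 3600.2]
stop = tunnel_stop_time(arrivals, idle_timeout=60)
stranded = packets_after_stop(arrivals, idle_timeout=60)
```

With these made-up numbers the tunnel is stopped at t = 61 s and the two packets sent an hour later are stranded, even though the connection never ended. A longer timer only moves the problem, since the gaps can be arbitrarily long.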
S.5 exhibits another inconsistency by talking about the hypervisor, not the NVE.
==#. Nits==
Nits with the English are too numerous to mention them all. Below are
pointers to general problems as well as some individual instances.
S.4
"Layer 2 and Layer 3 protocols are described next. In the following
sections, we examine more advanced features."
s/following/subsequent/
S.4.1
Expand WSC, MSC and NVA on first use.
s/the VM moves in the same link/the VM moves in the same subnet/
"i.e. end-user clients ask for the same MAC address upon migration. [...]
to ensure that the same IPv4 address is assigned to the VM." I think
s/IPv4/MAC/ was intended?
" All NVEs communicating with this virtual machine uses the old ARP
entry. If any VM in those NVEs need to talk to the new VM in the
destination NVE, it uses the old ARP entry."
Repetition: these 2 sentences say the same. (The mistake is also repeated
when these 2 sentences are repeated for IPv6).
S.4.2.1
s/Push the new mapping to hosts./Push the new mapping to communicating hosts./
S.5.
The IPv4/IPv6 pairs of paras for "tunnel establishment" and "tunneling packets"
only differ in the words "IPv4"/"IPv6". So in each case a single para
could be given for IP (irrespective of whether v4 or v6).
Thank you very much.
Linda Dunbar
_______________________________________________
--
________________________________________________________________
Bob Briscoe http://bobbriscoe.net/
Bob Briscoe
2018-09-21 09:33:49 UTC
Permalink
Behcet,

Linda made a load of responses to my review, some of which I disagree with
so I would like to respond to them. I need responses to those two
questions first though, 'cos everything else depends on those.


Bob
Post by Behcet Sarikaya
Dear Bob,
Behcet,
I would like to make significant responses to many of Linda's
responses, but until we get answers to the two pre-requisite
questions I've given, I can't be sure how to respond.
So rather than promising a new version with no prior discussion, I
believe it would be much more fruitful to engage in this
conversation. I'm trying to help.
You already made a detailed review.
Your two points are clarifications from your detailed review.
When I said we will revise I meant we will revise based on your
detailed review.
After we post our revision you can do whatever you wish.
Sincerely,
Behcet
Cheers
Bob
Post by Behcet Sarikaya
Hi Bob,
Thank you for your comments.
The authors are currently discussing your points and we will come
up with a revision soon after the discussions are over.
Regards,
Behcet
Linda,
Until we can all understand the answers to the following two
questions, I don't think we can discuss what track this draft
ought to be on, let alone move on to your responses to all my
other points.
1/ Applicability
You say this draft solely applies to connections with both
ends within the controlled DC environment. But the draft says
it's about multi-tenant DCs. Are there any multi-tenant DCs
that restrict all VMs to only communicate with other VMs
within the same controlled DC environment?
2/ Purpose of publishing as an RFC
Post by Bob Briscoe
#. The introduction does not say what the purpose of
publishing this draft is.
[Linda] The first paragraph on Page 3 has the description
why VM Mobility is needed.
Whether VM Mobility is needed was not my question. My
question was what is the purpose of the IETF publishing an
RFC about VM Mobility? And particularly, what is /this/ RFC
intended to achieve?
Are the authors trying to argue for a particular approach vs.
others? Are you trying to write a tutorial? Are you trying to
give the pros and cons of different approaches? Are you
trying to give advice on good practice (with the implication
that alternative practices are less good)? Are you trying to
clarify ideas by writing them down? Are you trying to outline
the implications of VM Mobility for other protocols being
developed within the NVO WG?
Bob
Post by Bob Briscoe
Bob,
Thank you very much for reviewing the draft and provided
in-depth comments. I am very sorry for the delayed response
due to traveling.
-----Original Message-----
Sent: Monday, September 03, 2018 9:45 PM
Subject: Tsvart last call review of draft-ietf-nvo3-vmm-04
Reviewer: Bob Briscoe
Review result: Not Ready
I have been selected as the Transport Directorate reviewer
for this draft. The Transport Directorate seeks to review
all transport or transport-related drafts as they pass
through IETF last call and IESG review, and sometimes on
special request. The purpose of the review is to provide
assistance to the Transport ADs. For more information about
the Transport Directorate Reviews and the Transport Area
Review Team, please see
https://trac.ietf.org/trac/tsv/wiki/TSV-Directorate-Reviews
In this case, very very few of the review comments relate to
transport issues, although the greatest issue concerns a
desire that the network could pause or stop connections
during L3 VM Mobility, which is certainly a transport issue.
[Linda] There is “Hot Migration” with transport service
continuing, and there is a “Cold Migration”, which is a
common practice in many data centers, which stop the task
running on the old place and move to the new place before
restart as described in the Task Migration.
Is it helpful to add this description to the draft?
==Summary==
The technical aspects of the draft concerning L2 VM mobility
(within a subnet) seem sound. However, this is only part of
the draft, which has the following
#. The introduction does not say what the purpose of
publishing this draft is.
It seems that, rather than describing a specific protocol or
protocols, it intends to describe the overall system
procedure that would typically be used in DCs for VM
mobility. It is tagged as a BCP, but it does not say who
needs this BCP, why it is useful for the IETF to publish
this BCP, how wide the authors' knowledge is of current
practice (given DCs are private), or why this is a BCP
rather than a protocol spec.
[Linda] The first paragraph on Page 3 has the description
why VM Mobility is needed. Is it helpful to move this
paragraph to the beginning of the Introduction Section?
“Virtualization which is being used in almost all of today’s data
centers enables many virtual machines to run on a single physical
computer or compute server. Virtual machines (VM) need hypervisor
running on the physical compute server to provide them shared
processor/memory/storage. Network connectivity is provided by the
network virtualization edge (NVE) [RFC8014]. Being able to move VMs
dynamically, or live migration, from one server to another allows for
dynamic load balancing or work distribution and thus it is a highly
desirable feature [RFC7364].”
The draft starts out (S.3) as if it intends to say what a
good VM Mobility protocol should or shouldn't do, but the
rest of the document doesn't give any reasoning for these
recommendations, it just asserts what appears to be one view
of how a whole VM Mobility system works, sometimes referring
to one example protocol RFC for a component part, but more
often with no references or details.
[Linda] Is it helpful to move the paragraph above to the
beginning of the Introduction Section? So that audience is
aware of why VM Mobility is needed. And then follow up with
what a good VM Mobility protocol should or shouldn't do?
#. It does not seem as if the NVO WG has discussed the
purpose of using normative text in this draft. See detailed
comments.
[Linda] The “Intended status” of the draft is “Best Current
Practice”. So none of the text is “normative”. Is that okay?
#. The draft silently slips back and forth between VM
mobility and VM redundancy, without recognizing the
differences. See detailed comments.
[Linda] There is only one usage of “redundancy” in the
entire document, used under the context of “Hot standby
option”, indicating  the “redundancy” of “the VMs in both
primary and secondary domains have identical information and
can provide services simultaneously as in load-share mode of
operation” being expensive.
#. Please adopt different terminology than "source NVE" and
"destination NVE", which are really poor choices of terms
for an intermediate node. See detailed comments. Why not use
"old NVE" and "new NVE", which is what you mean?
[Linda] Thanks for the suggestion. We will change to “Old
NVE”, and “new NVE”.
#. Applicability is fairly clearly outlined, but it is not
clear whether hosts corresponding with the mobile VMs are
part of the same controlled environment or on the
uncontrolled public Internet. See detailed comments.
[Linda] “Hosts” are the apps running on the VM. They are
under the same controlled environment, not on the uncontrolled
public Internet.
#. Section 4.2.1 on L3 VM mobility reads like some potential
half-thought-through ideas on how to solve L3 mobility,
rather than current practice, let alone best current
practice. Either current practice should be described
instead, or the scope of the draft should be narrowed solely
to L2 VM mobility. See detailed comments.
[Linda] This is referring to “Cold Migration”, which is a
common practice in many data centers.
# The VM's file system is described as state that moves with
the VM (S.6), but VM mobility solutions often move the VM
but stitch it back to its (unmoved) storage. Conversely, the
storage can also move independent of the VM.
[Linda] It depends. When a VM moves to a different zone, the
storage/file can become inaccessible.
#. The draft omits some of the security, transport and
management aspects of VM mobility. See detailed comments.
[Linda] Can you provide some text?
#. The draft reads as if different sections have been
written by different authors and no-one has edited the whole
to give it a coherent structure, or to ensure consistency
(both technical and editorial) between the parts. See
detailed comments.
[Linda] we can improve.
#. The quality of the English grammar does not allow a
reviewer to concentrate on the technical aspects rather than
the English. It would have been useful if one of the
English-speaking co-authors had improved the English before
submission for review. See detailed comments.
[Linda] can you help?  Becoming a co-author to improve?
==Detailed Comments==
===#. Normative statements===
In the body of the document, there is just one occurrence of
normative text (actually two "MUST"s, but both state a
common requirement - just written separately for IPv4 and
IPv6). This merely serves to imply that everything else the
document says is less important or optional, which was
probably not the intention.
[Linda] The goal is to indicate that any solution for moving the
VM “MUST” follow this rule. They make sense, don't they?
At the start there is a requirements section, which states
what a VM Mobility protocol "SHOULD" or "SHOULD NOT" do. I
think this is intended as a set of goals for the rest of the
document. If so, these "SHOULDs" are not intended to apply
to implementations, so they ought not to be capitalized.
[Linda] okay, will change.
The first requirement, "Data center network SHOULD support
virtual machine mobility in IPv6", is written as a
requirement on all DC networks, not on implementations. I
assume this was intended to read as "Data center network
virtual machine mobility protocols SHOULD support IPv6".
Even then, it doesn't really add anything to say VM mobility
should support v6 and it should support v4. A L2 solution
won't. While undoubtedly, a L3 solution will at least
support one of them.
[Linda]Agree. Will change it to “Data center that support
IPv6 address should 
”
I'm not sure that 'protocol' is the right word anyway; I
think 'VM Mobility procedure' would be a better phrase,
because it includes steps such as suspending the VM, which
is more than a protocol.
[Linda] yes. Will change to “Procedure”.
The requirement "Virtual machine mobility protocol MAY
support host routes to accomplish virtualization", is not
followed up at all in the rest of the draft.
Even if this requirement stays, the last 3 words should be deleted.
[Linda] will change to “Host Route can be used to support
the Virtual Machine Mobility Procedure.”
By the end of the draft, the solution falls far short of the
most relevant "Requirements" anyway, so one assumes the
title of the section ought to have been "Goals".
Specifically, even in the simpler case of L2 VM mobility,
S.4.1 says that triangular routing and tunnelling persist
"until a neighbour cache entry times out". A cache timeout
is about 10 orders of magnitude longer than the requirement
to only persist "while handling packets in flight", which
would be a few milliseconds at most (the time for packets to
clear the network that were already launched into flight
when the old VM stopped).
Whatever, it would be preferable for the draft to give
rationale for these requirements, rather than just assert
them. This would help to shed light on the merits of the
different trade offs that solutions choose.
[Linda] Agree, will add.
===#. Mobility vs. Redundancy===
Redundancy and mobility have a lot of similarities, but they
have different goals. With mobility, it is necessary to know
the exact instant when one set of state is identical to the
other so it can hand over. With redundancy, the aim is to
keep two (or more) sets of state evolving through the same
sequence of changes, but there is no need to know the point
at which one is the same as the other was at a certain point.
[Linda] Agree with what you said. There is only one usage of
“redundancy” in the entire document, used under the context
of “Hot standby option”, indicating  the “redundancy” of 
“the VMs in both primary and secondary domains have
identical information and can provide services
simultaneously as in load-share mode of operation” being
expensive.
* S.2. Terminology: Warm VM Mobility is defined without any
ending, as if it is permanent replication.
* S.7. "Handling of Hot, Warm and Cold Virtual Machine
Mobility" is actually all about redundancy, and doesn't
address mobility explicitly.
[Linda] Will add the definition “Hot Migration”, “cold
migration”, and “warm migration”.
===#. Terminology===
Packets run from the source at A to the destination at B via
NVE1, then via NVE2. Please don't call NVE1 and NVE2 the
source NVE and the destination NVE.
In future, no-one will thank you for the apparent
contradictions when they continually stumble over phrases
like this one in S.4.1: "...send their packets to the source
NVE".
The term "packets in flight" is used incorrectly to refer to
all the packets sent to the old NVE after the VM has moved,
even if they were launched into flight long after the old VM
stopped receiving packets.
[Linda] Thanks for the comments. Will change.
BTW, I think s/before/after/ in: "that have old ARP or
neighbor cache entry before VM or task migration".
I think: s/IP-based VM mobility/L3 VM mobility/ throughout,
because "based"
sounds (to me) like the mobility control protocol is over
(i.e. based on) IP.
===#. Applicability===
In section 4.2 it says that the protocol mostly used as the
IP based task migration protocol is ILA. This implies that
all hosts corresponding with the mobile VMs are either part
of the same controlled environment, or they are proxied via
nodes that are part of the same controlled environment (I
only have passing knowledge of ILA, but I understand that it
depends on ILA routers on the path). If I am correct, this
aspect of scope needs to be made clear from the start.
Also under the heading of applicability, the sentence "Since
migrations should be relatively rare events" appears very
late in the document (S.4.2.1). The assumed level of churn
ought to be stated nearer the start.
[Linda] yes, under the same controlled environment.
===#. L3 Mobility===
L2 VM mobility is independent of the application, because
resolution of L2 mappings is delegated to the stack. In
contrast, L3 VM mobility is only feasible under certain
conditions, because an application needs an IP address to
open a socket (resolution of DNS names is not delegated to
the stack, and apps can use IP addresses directly anyway).
a) /All/ applications used in the whole DC load balancing
scheme contain IP address migration logic for /all/ their
connections;
b) VMs running solely applications that support IP address
migration register this fact with the NVA, and it only
selects such VMs for mobility.
c) An abstraction is layered over /all/ the IP addresses
exposed to applications (at both ends) so that the IP
addresses that applications use are solely identifiers
(e.g. ILA, LISP, HIP), not also locators.
The introduction says the draft is about VM mobility in a
multi-tenant DC, so the DC admin will not know the range of
applications being used. This excludes condition (a) above.
When the draft says "...if all applications running are
known to handle this gracefully...", it doesn't quantify
just how restrictive this condition is, and it gives no
explanation of how this knowledge might be 'known' or which
function within the system 'knows' it.
S.4.2.1 contains what seems like plenty of arm-waving.
* "TCP connections could be automatically closed in the
network stack during a migration event."
  o There is no TCP connection state in the network stack.
  o Even if the network starts to drop every packet, the TCP
  connection state persists in the end-points for a duration
  of the order of 30-90 minutes (OS-dependent) before TCP
  deems the connection is broken.
  o Other transport protocols have similar designs (including
  the app-layer of protocols over UDP).
  o pausing the connection [does this refer to an actual
  feature of any L4 protocol?]
  o packaging connection state and sending to target [does
  this assume logic written into the application, or is this
  assuming the stack handles this and the app is restricted
  to using some form of separate identifier/locator
  addresses?]
  o instantiating connection state in the peer stack [ditto?].
  "Cold Virtual Machine mobility is facilitated by the VM initially
   sending an ARP or Neighbor Discovery message at the destination NVE
   but the source NVE not receiving any packets inflight."
   [How is it arranged for the source NVE not to receive any packets
   in flight?]
  "In hot standby option, regarding TCP connections, one option is to
   start with and maintain TCP connections to two different VMs at the
   same time."
   [This sounds like resilience logic has been written into the
   application, which would be a special case but not something VM
   mobility infrastructure could depend on.]
[Linda] will add.
===#. Gaps===
#. Security Considerations: repeats issues in other drafts
that are not specific to mobility, but it does not mention
any security issues specifically due to VM mobility. It says
that address spoofing may arise in a DC (sort-of implying it
is worse than in non-DC environments, but not saying why).
The handshake at the start of a connection (e.g. TCP, SCTP,
QUIC) checks for source address spoofing. So L3 VM mobility
would be more vulnerable to source address spoofing in cases
where the mobile VM was the connection initiator and there
was not a new handshake after the move. However, this draft
does not contain any detailed mobility protocols, so it is
not possible to identify any specific security flaws.
#. Transport Issues: Effect of delay on the transport: Cold
mobility introduces significant delay, and other forms less,
but still some delay. It should be pointed out that some
applications (e.g. real-time) will therefore not be useful
if subjected to VM mobility. Similarly, even a short period
of delay will drive most congestion controls to severely
reduce throughput. These points might be self-evident, but
perhaps they should be stated explicitly.
BTW, in the L3 VM mobility case, the draft often refers to
TCP connections, but the address bindings of any transport
protocols would have to be migrated due to VM mobility (e.g.
SCTP; sequences of datagrams over UDP; streams over UDP such
as with RTP, QUIC).
#. Management Issues: perhaps the draft ought to recommend
statistics gathering (e.g. time taken, amount of duplicate
data) to aid a DC's future decisions on the cost-benefit of
moving a VM. The OPSDIR review says a BCP does not /have/ to
describe management issues, but this document seems to
describe a whole system procedure, not just a protocol,
which then surely includes the management plane.
[Linda] can you become a co-author and add those in?
===#. Incoherent Structure===
S.4.1. happens to talk about VMs moving, while S.4.2.
happens to talk about tasks moving, but this is not the
distinguishing aspect of these two sections (anyway, S.2.
says "the draft uses task and VM interchangeably"):
* "4.1 VM Migration" is about "L2 VM Mobility" so this ought
to be the section heading,
* "4.2 Task Migration" is about "L3 VM Mobility" so this
ought to be the section heading.
It would also help not to switch from VM to task across
these sections - it's just a distraction.
S.4.1 needs better signposting of where each sub-case ends
(Subsections might be useful to solve this):
* IPv4
* end-user client
* 2 paras starting "All NVEs communicating with this virtual
machine..." [Not clear that the end-user case has ended and
we have returned to the general IPv4 case?]
* IPv6 [Strictly, it still hasn't said whether the end-user
client case has ended.] [Also, it doesn't explain why there
is no need for an end-user client case under IPv6?]
Sections 5 & 6 seem to be about either L2 or L3 mobility,
whereas Sections 7 & 8 seem to be restricted to L2.
The draft vacillates over what to do with packets arriving
at the old NVE in the L3 case (see also L3 mobility above):
* S4.2 first says packets are dropped, possibly with an ICMP
error message;
  o then later it says they are silently dropped;
  o then in the very next sentence it says either silently
  drop them or forward them to the new location
* S.5 says they should not be lost, but instead delivered to
the destination hypervisor
  o then it describes how they are tunnelled (which is not
  the same as "forwarding").
The order in which all the stages of mobility are given is
jumbled up across sections that also appear in arbitrary
order:
* S.5 prepares, establishes, uses then stops a tunnel, but it
  doesn't say where the other stages fit between these steps
  o When tunneling packets, it talks about the *migrating* VM
    not the *migrated* VM, which implies tunnelling has
    started before the new VM is running. Does this imply
    there is a huge buffer?
  o It says "Stop Tunneling Packets - When source NVE stops
    receiving packets destined to..." but it is never clear
    when a source has stopped sending packets to a
    destination, unless it explicitly closes the connection
    (e.g. with a FIN in the case of TCP). Often there are
    long gaps between packets, because many flows are 'thin'
    (meaning the application frequently has nothing to send).
    These gaps can last for milliseconds, hours or even days
    without any implication that the connection has ended.
* Then S.6. describes moving state, but doesn't say that this
  is not after the previous tunnelling steps (or where it
  fits within those steps).
* Then S.7 describes hot, warm and cold mobility, but doesn't
  lay out the tunnelling or steps to move state in each case.
* Then S.8 says it's about VM life-cycle, but just gives the
  very first 3 steps for allocation of resources to a VM,
  then abruptly ends, without even starting the VM, let alone
  getting to move it.
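To make the point about "Stop Tunneling Packets" concrete, here is a small Python sketch of what goes wrong if a source NVE infers the end of a flow from silence alone. The idle timeout is purely an assumption for illustration (the draft does not specify one), and the function names are hypothetical.

```python
def tunnel_looks_idle(last_packet_s, now_s, idle_timeout_s):
    """True if no packet has arrived for at least idle_timeout_s seconds."""
    return (now_s - last_packet_s) >= idle_timeout_s


def premature_teardowns(arrivals_s, idle_timeout_s):
    """Count the gaps in a still-live flow that the timeout would have
    mistaken for the end of the flow.

    arrivals_s is a sorted list of packet arrival times in seconds.
    """
    return sum(
        1
        for prev, cur in zip(arrivals_s, arrivals_s[1:])
        if tunnel_looks_idle(prev, cur, idle_timeout_s)
    )


# A 'thin' flow: two short bursts an hour apart, then one packet a day later.
thin_flow = [0, 1, 3601, 3602, 90000]
print(premature_teardowns(thin_flow, idle_timeout_s=60))
```

With a 60-second timeout the tunnel would have been torn down twice while the flow was still alive. Any finite timeout has this problem for some flow, which is why only an explicit close (e.g. a TCP FIN) gives a reliable signal.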
S.5 exhibits another inconsistency by talking about the
hypervisor, not the NVE.
==#. Nits==
Nits with the English are too numerous to mention them all.
Below are pointers to general problems as well as some
individual instances.
S.4
  "Layer 2 and Layer 3 protocols are described next.  In the
   following sections, we examine more advanced features."
        s/following/subsequent/
S.4.1
Expand WSC, MSC and NVA on first use.
s/the VM moves in the same link/the VM moves in the same subnet/
"i.e. end-user clients ask for the same MAC address upon
migration. [...] to ensure that the same IPv4 address is
assigned to the VM." I think s/IPv4/MAC/ was intended?
"  All NVEs communicating with this virtual machine uses the old ARP
   entry.  If any VM in those NVEs need to talk to the new VM in the
   destination NVE, it uses the old ARP entry."
Repetition: these 2 sentences say the same. (The mistake is
also repeated when these 2 sentences are repeated for IPv6).
S.4.2.1
s/Push the new mapping to hosts./Push the new mapping to
communicating hosts./
S.5.
The IPv4/IPv6 pairs of paras for "tunnel establishment" and
"tunneling packets" only differ in the words "IPv4"/"IPv6".
So in each case a single para could be given for IP
(irrespective of whether v4 or v6).
Thank you very much.
Linda Dunbar
_______________________________________________
Tsv-art mailing list
https://www.ietf.org/mailman/listinfo/tsv-art
--
________________________________________________________________
Bob Briscoe http://bobbriscoe.net/
_______________________________________________
nvo3 mailing list
https://www.ietf.org/mailman/listinfo/nvo3