rfc9766.original   rfc9766.txt 
Network File System Version 4 T. Haynes Internet Engineering Task Force (IETF) T. Haynes
Internet-Draft T. Myklebust Request for Comments: 9766 T. Myklebust
Intended status: Standards Track Hammerspace Category: Standards Track Hammerspace
Expires: 11 August 2025 7 February 2025 ISSN: 2070-1721 April 2025
Add LAYOUT_WCC to NFSv4.2's Flex File Layout Type Extensions for Weak Cache Consistency in NFSv4.2's Flexible File Layout
draft-ietf-nfsv4-layoutwcc-07
Abstract Abstract
This document specifies extensions to the parallel Network File This document specifies extensions to NFSv4.2 for improving Weak
System (NFS) version 4 (pNFS) for improving write cache consistency. Cache Consistency (WCC). These extensions introduce mechanisms that
These extensions introduce mechanisms that ensure partial writes ensure partial writes performed under a Parallel NFS (pNFS) layout
performed under a pNFS layout remain coherent and correctly tracked. remain coherent and correctly tracked. The solution addresses
The solution addresses concurrency and data integrity concerns that concurrency and data integrity concerns that may arise when multiple
may arise when multiple clients write to the same file through clients write to the same file through separate data servers. By
separate data servers. By defining additional interactions among defining additional interactions among clients, metadata servers, and
clients, metadata servers, and data servers, this specification data servers, this specification enhances the reliability of NFSv4 in
enhances the reliability of NFSv4 in parallel-access environments and parallel-access environments and ensures consistency across diverse
ensures consistency across diverse deployment scenarios. deployment scenarios.
Note
This note is to be removed before publishing as an RFC.
Discussion of this draft takes place on the NFSv4 working group
mailing list (nfsv4@ietf.org), which is archived at
https://mailarchive.ietf.org/arch/browse/nfsv4/. Working Group
information can be found at https://datatracker.ietf.org/wg/nfsv4/
about/.
Status of This Memo Status of This Memo
This Internet-Draft is submitted in full conformance with the This is an Internet Standards Track document.
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months This document is a product of the Internet Engineering Task Force
and may be updated, replaced, or obsoleted by other documents at any (IETF). It represents the consensus of the IETF community. It has
time. It is inappropriate to use Internet-Drafts as reference received public review and has been approved for publication by the
material or to cite them other than as "work in progress." Internet Engineering Steering Group (IESG). Further information on
Internet Standards is available in Section 2 of RFC 7841.
This Internet-Draft will expire on 11 August 2025. Information about the current status of this document, any errata,
and how to provide feedback on it may be obtained at
https://www.rfc-editor.org/info/rfc9766.
Copyright Notice Copyright Notice
Copyright (c) 2025 IETF Trust and the persons identified as the Copyright (c) 2025 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents (https://trustee.ietf.org/ Provisions Relating to IETF Documents
license-info) in effect on the date of publication of this document. (https://trustee.ietf.org/license-info) in effect on the date of
Please review these documents carefully, as they describe your rights publication of this document. Please review these documents
and restrictions with respect to this document. Code Components carefully, as they describe your rights and restrictions with respect
extracted from this document must include Revised BSD License text as to this document. Code Components extracted from this document must
described in Section 4.e of the Trust Legal Provisions and are include Revised BSD License text as described in Section 4.e of the
provided without warranty as described in the Revised BSD License. Trust Legal Provisions and are provided without warranty as described
in the Revised BSD License.
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 1. Introduction
1.1. Definitions . . . . . . . . . . . . . . . . . . . . . . . 3 1.1. Definitions
1.2. Requirements Language . . . . . . . . . . . . . . . . . . 3 1.2. Requirements Language
2. Weak Cache Consistency (WCC) . . . . . . . . . . . . . . . . 4 2. Weak Cache Consistency (WCC)
3. Operation 77: LAYOUT_WCC - Layout Weak Cache Consistency . . 5 3. Operation 77: LAYOUT_WCC - Layout Weak Cache Consistency
3.4. Implementation . . . . . . . . . . . . . . . . . . . . . 6 3.1. ARGUMENT
3.4.1. Examples of when to use LAYOUT_WCC . . . . . . . . . 6 3.2. RESULT
3.4.2. Examples of what to send in the LAYOUT_WCC . . . . . 7 3.3. DESCRIPTION
3.5. Allowed Errors . . . . . . . . . . . . . . . . . . . . . 8 3.4. Implementation
3.6. Extension of Existing Implementations . . . . . . . . . . 9 3.4.1. Examples of When to Use LAYOUT_WCC
3.7. Flex Files Layout Type . . . . . . . . . . . . . . . . . 9 3.4.2. Examples of What to Send in LAYOUT_WCC
4. Extraction of XDR . . . . . . . . . . . . . . . . . . . . . . 10 3.5. Allowed Errors
4.1. Code Components Licensing Notice . . . . . . . . . . . . 11 3.6. Extension of Existing Implementations
5. Security Considerations . . . . . . . . . . . . . . . . . . . 11 3.7. Flexible File Layout Type
6. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 11 4. Extraction of XDR
7. References . . . . . . . . . . . . . . . . . . . . . . . . . 11 5. Security Considerations
7.1. Normative References . . . . . . . . . . . . . . . . . . 11 6. IANA Considerations
7.2. Informative References . . . . . . . . . . . . . . . . . 12 7. References
Appendix A. Acknowledgments . . . . . . . . . . . . . . . . . . 13 7.1. Normative References
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 13 7.2. Informative References
Acknowledgments
Authors' Addresses
1. Introduction 1. Introduction
In the Network File System version 4 (NFSv4) with a Parallel NFS In the Parallel NFS (pNFS) flexible file layout (see [RFC8435]),
(pNFS) Flexible File Layout (see Section 12 of [RFC8435]) server,
there is no mechanism for the data servers to update the metadata there is no mechanism for the data servers to update the metadata
servers for when the data portion of the file is modified. The servers when the data portion of the file is modified. The metadata
metadata server needs this knowledge to correspondingly update the server needs this knowledge to correspondingly update the metadata
metadata portion of the file. If the client is using NFSv3 as the portion of the file. If the client is using NFSv3 as the protocol
protocol with the data server, it can leverage weak cache consistency with the data server, it can leverage Weak Cache Consistency (WCC) to
(WCC) to update the metadata server of the attribute changes. In update the metadata server of the attribute changes. In this
this document, we introduce a new operation called LAYOUT_WCC to document, we introduce a new operation called LAYOUT_WCC to NFSv4.2,
NFSv4.2 which allows the client to periodically report the attributes which allows the client to periodically report the attributes of the
of the data files to the metadata server. data files to the metadata server.
Using the process detailed in [RFC8178], the revisions in this Using the process detailed in [RFC8178], the revisions in this
document become an extension of NFSv4.2 [RFC7862]. They are built on document become an extension of NFSv4.2 [RFC7862]. They are built on
top of the external data representation (XDR) [RFC4506] generated top of the External Data Representation (XDR) [RFC4506] generated
from [RFC7863]. from [RFC7863].
1.1. Definitions 1.1. Definitions
For a more comprehensive set of definitions, see Section 1.1 of For a more comprehensive set of definitions, see Section 1.1 of
[RFC8435]. [RFC8435].
(file) data: that part of the file system object that contains the (file) data: that part of the file system object that contains the
data to be read or written. It is the contents of the object data to be read or written. It is the contents of the object
rather than the attributes of the object. rather than the attributes of the object.
skipping to change at page 3, line 38 skipping to change at line 120
metadata server (MDS): the pNFS server that provides metadata metadata server (MDS): the pNFS server that provides metadata
information for a file system object. information for a file system object.
storage device: the target to which clients may direct I/O requests storage device: the target to which clients may direct I/O requests
when they hold an appropriate layout. Note that each data server when they hold an appropriate layout. Note that each data server
is a storage device but that some storage devices are not data is a storage device but that some storage devices are not data
servers. (See Section 2.1 of [RFC8434] for a discussion on the servers. (See Section 2.1 of [RFC8434] for a discussion on the
difference between a data server and a storage device.) difference between a data server and a storage device.)
weak cache consistency (WCC): In NFSv3, WCC allows the client to weak cache consistency (WCC): the mechanism in NFSv3 that allows the
check for file attribute changes before and after an operation client to check for file attribute changes before and after an
(See Section 2.6 of [RFC1813]). operation (see Section 2.6 of [RFC1813]).
1.2. Requirements Language 1.2. Requirements Language
The key words 'MUST', 'MUST NOT', 'REQUIRED', 'SHALL', 'SHALL NOT', The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
'SHOULD', 'SHOULD NOT', 'RECOMMENDED', 'NOT RECOMMENDED', 'MAY', and "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
'OPTIONAL' in this document are to be interpreted as described in BCP "OPTIONAL" in this document are to be interpreted as described in
14 [RFC2119] [RFC8174] when, and only when, they appear in all BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
capitals, as shown here. capitals, as shown here.
2. Weak Cache Consistency (WCC) 2. Weak Cache Consistency (WCC)
A pNFS layout type enables the metadata server to inform the client A pNFS layout type enables the metadata server to inform the client
of both the storage protocol and the locations of the data that the of both the storage protocol and the locations of the data that the
client should use when communicating with the storage devices. The client should use when communicating with the storage devices. The
Flex Files Layout Type, as specified in [RFC8435], describes how data flexible file layout type, as specified in [RFC8435], describes how
servers using NFSv3 can be accessed. The client is restricted to data servers using NFSv3 can be accessed. The client is restricted
performing NFSv3 READ (Section 3.3.6 of [RFC1813]), WRITE to performing the following NFSv3 operations on the filehandles
(Section 3.3.6 of [RFC1813]), and COMMIT (Section 3.3.21 of provided in the layout: READ, WRITE, and COMMIT (see Sections 3.3.6,
[RFC1813]) operations on the file handles provided in the layout. In 3.3.7, and 3.3.21 of [RFC1813], respectively). In other words, the
other words, the client may only use NFSv3 operations that act client may only use NFSv3 operations that act directly on the data
directly on the data portion of the file. portion of the file.
Because there is no contol protocol (see [RFC8434]) possible with all Because there is no control protocol (see [RFC8434]) possible with
data servers, NFSv3 is used as the control protocol. As such, the all data servers, NFSv3 is used as the control protocol. As such,
NFSv3 CREATE (see Section 3.3.8 of [RFC1813]), GETATTR (see the following NFSv3 operations are commonly used by the metadata
Section 3.3.1 of [RFC1813]), and SETATTR (see Section 3.3.2 of server: CREATE, GETATTR, and SETATTR (see Sections 3.3.8, 3.3.1, and
[RFC1813]) are operations commonly used by the metadata server. 3.3.2 of [RFC1813], respectively). That is, the metadata server is
I.e., the metadata server is only allowed to use NFSv3 operations only allowed to use NFSv3 operations that directly act on the
which directly act on the metadata portion of the data file. GETATTR metadata portion of the data file. GETATTR allows the metadata
allows the metadata server to mainly retrieve the mtime (modify server to mainly retrieve the mtime (modify time), ctime (change
time), ctime (change time), and atime (access time). The metadata time), and atime (access time). The metadata server can use this
server can use this information to determine if the client modified information to determine if the client modified the file whilst it
the file whilst it held an iomode of LAYOUTIOMODE4_RW (see held an iomode of LAYOUTIOMODE4_RW (see Section 3.3.20 of [RFC8881]).
Section 3.3.20 of [RFC8881]). Then it can determine the time_modify Then it can determine the following for the metadata file:
(see Section 5.8.2.43 of [RFC8881]), time_metadata (see time_modify, time_metadata, and time_access (see Sections 5.8.2.43,
Section 5.8.2.42 of [RFC8881]), and time_access (see Section 5.8.2.37 5.8.2.42, and 5.8.2.37 of [RFC8881], respectively). That is, it can
of [RFC8881]) for the metadata file. I.e., the information to return determine the information to return to clients in an NFSv4.2 GETATTR
to clients in a NFSv4.2 GETATTR response. response.
For example, the metadata server might issue an NFSv3 GETATTR For example, the metadata server might issue an NFSv3 GETATTR
operation to the data server, which is typically triggered by a operation to the data server, which is typically triggered by a
client's NFSv4 GETATTR request to the metadata server. In addition client's NFSv4 GETATTR request to the metadata server. In addition
to the cost of each individual GETATTR operation, the data server can to the cost of each individual GETATTR operation, the data server can
be overwhelmed by a large volume of such requests. NFSv3 addressed a be overwhelmed by a large volume of such requests. NFSv3 addressed a
similar challenge by including a post-operation attribute in the READ similar challenge by including a post-operation attribute in the READ
and WRITE operations to report weak cache consistency (WCC) data (see and WRITE operations to report WCC data (see Section 2.6 of
Section 2.6 of [RFC1813]). [RFC1813]).
Each NFSv3 operation entails a single round trip between the client Each NFSv3 operation entails a single round trip between the client
and server. Consequently, issuing a WRITE followed by a GETATTR and server. Consequently, issuing a WRITE followed by a GETATTR
would require two round trips. In that situation, the retrieved would require two round trips. In that situation, the retrieved
attribute information is regarded as strict server-client attribute information is regarded as having strict server-client
consistency. By contrast, NFSv4 enables a WRITE and GETATTR to be consistency. By contrast, NFSv4 enables a WRITE and GETATTR to be
combined within a compound operation, which requires only one round combined within a compound operation, which requires only one round
trip. This combined approach is likewise considered strict server- trip. This combined approach is likewise considered to have strict
client consistency. Essentially, NFSv4 READ and WRITE operations server-client consistency. Essentially, NFSv4 READ and WRITE
omit post-operation attributes, allowing the client to determine operations omit post-operation attributes, allowing the client to
whether it requires that information. determine whether it requires that information.
Whilst NFSv4 got rid of the requirement for WCC information to be Whilst NFSv4 got rid of the requirement for WCC information to be
supplied by the WRITE or READ operations, the introduction of pNFS supplied by the WRITE or READ operations, the introduction of pNFS
re-introduces the same problem. The metadata server has to reintroduces the same problem. The metadata server has to
communicate with the data server in order to get at the data which communicate with the data server in order to get the data that could
could be provided by a WCC model. be provided by a WCC model.
With the flexible file layout type, the client can leverage the NFSv3 With the flexible file layout type, the client can leverage the NFSv3
WCC to service the proxying of times (See Section 4 of WCC to service the proxying of times (see Section 5 of [RFC9754]),
[I-D.ietf-nfsv4-delstid]). But the granularity of this data is but the granularity of this data is limited. With client-side
limited. With client side mirroring (See Section 8 of [RFC8435]), mirroring (see Section 8 of [RFC8435]), the client has to aggregate
the client has to aggregate the N mirrored files in order to send one the N mirrored files in order to send one piece of information
piece of information instead of N pieces of information. Also, the instead of N pieces of information. Also, the client is limited to
client is limited to sending that information only when it returns sending that information only when it returns the delegation.
the delegation.
This document introduces a new NFSv4.2 operation, LAYOUT_WCC, which This document introduces a new NFSv4.2 operation, LAYOUT_WCC, which
enables the client to provide the metadata server with information enables the client to provide the metadata server with information
obtained from the data server. The client is responsible for obtained from the data server. The client is responsible for
gathering the NFSv3 WCC data, returned by the three permissible NFSv3 gathering the NFSv3 WCC data, returned by the three permissible NFSv3
operations, and conveying it back to the metadata server as part of operations, and conveying it back to the metadata server as part of
NFSv4.2 attributes. The metadata server MAY therefore avoid issuing NFSv4.2 attributes. The metadata server MAY therefore avoid issuing
costly NFSv3 GETATTR calls to the data servers. Because this costly NFSv3 GETATTR calls to the data servers. Because this
approach relies on a weak model, the metadata server MAY still approach relies on a weak model, the metadata server MAY still
perform these calls if it chooses to strengthen the model. perform these calls if it chooses to strengthen the model.
skipping to change at page 6, line 4 skipping to change at line 217
3.1. ARGUMENT 3.1. ARGUMENT
<CODE BEGINS> <CODE BEGINS>
/// struct LAYOUT_WCC4args { /// struct LAYOUT_WCC4args {
/// stateid4 lowa_stateid; /// stateid4 lowa_stateid;
/// layouttype4 lowa_type; /// layouttype4 lowa_type;
/// opaque lowa_body<>; /// opaque lowa_body<>;
/// }; /// };
<CODE ENDS> <CODE ENDS>
stateid4 is defined in Section 3.3.12 of [RFC8881]. layouttype4 is stateid4 is defined in Section 3.3.12 of [RFC8881]. layouttype4 is
defined in Section 3.3.13 of [RFC8881]. defined in Section 3.3.13 of [RFC8881].
3.2. RESULT 3.2. RESULT
<CODE BEGINS> <CODE BEGINS>
/// struct LAYOUT_WCC4res { /// struct LAYOUT_WCC4res {
/// nfsstat4 lowr_status; /// nfsstat4 lowr_status;
/// }; /// };
<CODE ENDS> <CODE ENDS>
nfsstat4 is defined in Section 3.2 of [RFC8881]. nfsstat4 is defined in Section 3.2 of [RFC8881].
3.3. DESCRIPTION 3.3. DESCRIPTION
The current filehandle and the lowa_stateid identify the specific The current filehandle and the lowa_stateid identify the specific
layout for the LAYOUT_WCC operation. The lowa_type indicates how to layout for the LAYOUT_WCC operation. The lowa_type indicates how to
interpret the layout-type-specific payload contained in the lowa_body interpret the layout-type-specific payload contained in the lowa_body
field. The lowa_type is the corresponding value from the IANA field. The lowa_type is the corresponding value from the "pNFS
registry for 'pNFS Layout Types' for the layout type being used. Layout Types" IANA registry for the layout type being used.
The lowa_body contains the data file attributes. The client is The lowa_body contains the data file attributes. The client is
responsible for mapping NFSv3 post-operation attributes to the fattr4 responsible for mapping NFSv3 post-operation attributes to the fattr4
representation. Similar to the behavior of post-operation representation. Similar to the behavior of post-operation
attributes, the client may ignore these attributes, and the server attributes, the client may ignore these attributes, and the server
may also choose to ignore any attributes included in LAYOUT_WCC. may also choose to ignore any attributes included in LAYOUT_WCC.
However, the server can use these attributes to avoid querying the However, the server can use these attributes to avoid querying the
data server for data file attributes. Because these attributes are data server for data file attributes. Because these attributes are
optional and the client has no recourse if the server opts to optional and the client has no recourse if the server opts to
disregard them, there is no requirement to return a bitmap4 disregard them, there is no requirement to return a bitmap4
indicating which attributes have been accepted in the LAYOUT_WCC indicating which attributes have been accepted in the LAYOUT_WCC
result. result.
3.4. Implementation 3.4. Implementation
3.4.1. Examples of when to use LAYOUT_WCC 3.4.1. Examples of When to Use LAYOUT_WCC
The only way for the metadata server to detect modifications to the The only way for the metadata server to detect modifications to the
data file is to probe the data servers via a GETATTR. It can compare data file is to probe the data servers via a GETATTR. It can compare
the mtime results across multiple calls to detect a NFSv3 WRITE the mtime results across multiple calls to detect an NFSv3 WRITE
operation by the client. Likewise, the atime results indicate the operation by the client. Likewise, the atime results indicate the
client having issued a NFSv3 READ operation. As such, the client can client having issued an NFSv3 READ operation. As such, the client
leverage the LAYOUT_WCC operation whenever it has the belief that the can leverage the LAYOUT_WCC operation whenever it has the belief that
metadata server would need to refresh the attributes of the data the metadata server would need to refresh the attributes of the data
files. While the client can send a LAYOUT_WCC at any time, there are files. While the client can send a LAYOUT_WCC at any time, there are
times it will want to do this operation in order to avoid having the times it will want to do this operation in order to avoid having the
metadata server issue NFSv3 GETATTR requests to the data servers: metadata server issue NFSv3 GETATTR requests to the data servers:
* Whenever it sends a GETATTR for any of the following attributes: * Whenever it sends a GETATTR for any of the following attributes:
size (see Section 5.8.1.5 of [RFC8881]), space_used (see
Section 5.8.2.25 of [RFC8881]), change (see Section 5.8.1.4 of - size (see Section 5.8.1.5 of [RFC8881])
[RFC8881]), time_access (see Section 5.8.2.37 of [RFC8881]),
time_metadata (see Section 5.8.2.42 of [RFC8881]), and time_modify - space_used (see Section 5.8.2.35 of [RFC8881])
(see Section 5.8.2.43 of [RFC8881]).
- change (see Section 5.8.1.4 of [RFC8881])
- time_access (see Section 5.8.2.37 of [RFC8881])
- time_metadata (see Section 5.8.2.42 of [RFC8881])
- time_modify (see Section 5.8.2.43 of [RFC8881])
* Whenever it sends an NFS4ERR_ACCESS error via LAYOUTRETURN or * Whenever it sends an NFS4ERR_ACCESS error via LAYOUTRETURN or
LAYOUTERROR - it could have already gotten the NFSv3 uid and gid LAYOUTERROR. It could have already gotten the NFSv3 uid and gid
values back in the WCC of the WRITE, READ, or COMMIT operation values back in the WCC of the WRITE, READ, or COMMIT operation
which got the error. Thus it could report that information back that got the error. Thus, it could report that information back
to the metadata server, saving it from querying that information to the metadata server, saving it from querying that information
via a NFSv3 GETATTR. via an NFSv3 GETATTR.
* Whenever it sends a SETATTR to refresh the proxied times (See * Whenever it sends a SETATTR to refresh the proxied times (see
Section 4 of [I-D.ietf-nfsv4-delstid]) - the metadata server is Section 5 of [RFC9754]). The metadata server will correlate these
going to want to correlate these times in order to detect later times in order to detect later modification to the data file.
modification to the data file.
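The three triggers listed above can be sketched as a small client-side predicate. This is an illustration only and is not part of either document version: the function name, its parameters, and the use of strings for operation and error names are assumptions. Because the client can send a LAYOUT_WCC at any time, the predicate only captures the suggested cases.

```python
# Hypothetical sketch: names and string encodings are illustrative assumptions.
WCC_GETATTR_ATTRS = frozenset({
    "size", "space_used", "change",
    "time_access", "time_metadata", "time_modify",
})


def should_send_layout_wcc(op, attrs=(), error=None,
                           refreshes_proxied_times=False):
    """Return True when one of the three listed triggers applies."""
    if op == "GETATTR":
        # Trigger 1: the GETATTR asks for any of the listed attributes.
        return bool(WCC_GETATTR_ATTRS.intersection(attrs))
    if op in ("LAYOUTRETURN", "LAYOUTERROR"):
        # Trigger 2: the client is reporting NFS4ERR_ACCESS.
        return error == "NFS4ERR_ACCESS"
    if op == "SETATTR":
        # Trigger 3: the SETATTR refreshes the proxied times.
        return refreshes_proxied_times
    return False
```

A client following this sketch would call the predicate before building the compound and add LAYOUT_WCC when it returns True.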
3.4.2. Examples of what to send in the LAYOUT_WCC 3.4.2. Examples of What to Send in LAYOUT_WCC
The NFSv3 attributes returned in the WCC of WRITE, READ, and COMMIT The NFSv3 attributes returned in the WCC of WRITE, READ, and COMMIT
are a smaller subset of what can be transmitted as a NFSv4 attribute. operations are a smaller subset of what can be transmitted as an
The mapping of NFSv3 to NFSv4 attributes is shown in Table 1. The NFSv4 attribute. The mapping of NFSv3 to NFSv4 attributes is shown
LAYOUT_WCC MUST provide all of these attributes to the metadata in Table 1. The LAYOUT_WCC MUST provide all of these attributes to
server. Both the uid and gid are stringified into their respective the metadata server. Both the uid and gid are stringified into their
attributes of owner and owner_group. The reason to provide these two respective attributes of owner and owner_group. In the case of
attributes is in case of NFS4ERR_ACCESS, the metadata server can NFS4ERR_ACCESS, the reason to provide these two attributes is that
compare what it expects the values of the uid and gid of the data the metadata server can compare what it expects the values of the uid
file to be versus the actual values. It can then repair the and gid of the data file to be versus the actual values. It can then
permissions as needed or modify the expected values it has cached. repair the permissions as needed or modify the expected values it has
cached.
+=================+===================+ +=================+===================+
| NFSv3 Attribute | NFSv4.2 Attribute | | NFSv3 Attribute | NFSv4.2 Attribute |
+=================+===================+ +=================+===================+
| size | size | | size | size |
+-----------------+-------------------+ +-----------------+-------------------+
| used | space_used | | used | space_used |
+-----------------+-------------------+ +-----------------+-------------------+
| mode | mode | | mode | mode |
+-----------------+-------------------+ +-----------------+-------------------+
skipping to change at page 8, line 30 skipping to change at line 330
| mtime | time_modify | | mtime | time_modify |
+-----------------+-------------------+ +-----------------+-------------------+
| ctime | time_metadata | | ctime | time_metadata |
+-----------------+-------------------+ +-----------------+-------------------+
Table 1: NFSv3 to NFSv4.2 Attribute Table 1: NFSv3 to NFSv4.2 Attribute
Mappings Mappings
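As an illustration only (not part of either document version), the mapping in Table 1 can be sketched in Python. The names V3_TO_V4 and map_wcc_attrs are hypothetical. The uid and gid rows follow the owner and owner_group text above, the atime row follows the atime and time_access pairing in Section 2, and the str() conversion is a stand-in for whatever identity mapping a real client uses.

```python
# Hypothetical sketch: identifiers here are illustrative, not from the RFC.
V3_TO_V4 = {
    "size": "size",
    "used": "space_used",
    "mode": "mode",
    "uid": "owner",          # stringified, per the text above
    "gid": "owner_group",    # stringified, per the text above
    "atime": "time_access",  # pairing taken from Section 2
    "mtime": "time_modify",
    "ctime": "time_metadata",
}


def map_wcc_attrs(v3_attrs):
    """Translate a dict of NFSv3 post-op attributes into NFSv4.2 names."""
    # The text requires all mapped attributes to be provided.
    missing = [name for name in V3_TO_V4 if name not in v3_attrs]
    if missing:
        raise ValueError(f"missing NFSv3 attributes: {missing}")
    out = {}
    for v3_name, v4_name in V3_TO_V4.items():
        value = v3_attrs[v3_name]
        if v3_name in ("uid", "gid"):
            value = str(value)  # assumption: simplistic stringification
        out[v4_name] = value
    return out
```

The sketch rejects incomplete input because the text states that the LAYOUT_WCC MUST provide all of the mapped attributes.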
3.5. Allowed Errors 3.5. Allowed Errors
The LAYOUT_WCC operation can raise the errors in Table 2. When an The LAYOUT_WCC operation can raise the errors listed in Table 2.
error is encountered, the metadata server can decide to ignore the When an error is encountered, the metadata server can decide to
entire operation or depending on the layout type specific payload, it ignore the entire operation, or depending on the layout-type-specific
could decide to apply a portion of the payload. Note that there are payload, it could decide to apply a portion of the payload. Note
no new errors introduced for the LAYOUT_WCC operation and the errors that there are no new errors introduced for the LAYOUT_WCC operation
in Table 2 are each defined in Section 15.1 of [RFC8881]. Table 2 and the errors in Table 2 are each defined in Section 15.1 of
can be considered as an extension of Section 15.2 of [RFC8881]. [RFC8881]. Table 2 can be considered as an extension of Section 15.2
of [RFC8881].
+============+====================================================+ +============+====================================================+
| Operation | Errors | | Operation | Errors |
+============+====================================================+ +============+====================================================+
| LAYOUT_WCC | NFS4ERR_ADMIN_REVOKED, NFS4ERR_BADXDR, | | LAYOUT_WCC | NFS4ERR_ADMIN_REVOKED, NFS4ERR_BADXDR, |
| | NFS4ERR_BAD_STATEID, NFS4ERR_DEADSESSION, | | | NFS4ERR_BAD_STATEID, NFS4ERR_DEADSESSION, |
| | NFS4ERR_DELAY, NFS4ERR_DELEG_REVOKED, | | | NFS4ERR_DELAY, NFS4ERR_DELEG_REVOKED, |
| | NFS4ERR_EXPIRED, NFS4ERR_FHEXPIRED, NFS4ERR_GRACE, | | | NFS4ERR_EXPIRED, NFS4ERR_FHEXPIRED, NFS4ERR_GRACE, |
| | NFS4ERR_INVAL, NFS4ERR_ISDIR, NFS4ERR_MOVED, | | | NFS4ERR_INVAL, NFS4ERR_ISDIR, NFS4ERR_MOVED, |
| | NFS4ERR_NOFILEHANDLE, NFS4ERR_NOTSUPP, | | | NFS4ERR_NOFILEHANDLE, NFS4ERR_NOTSUPP, |
skipping to change at page 9, line 27 skipping to change at line 361
| | NFS4ERR_RETRY_UNCACHED_REP, NFS4ERR_SERVERFAULT, | | | NFS4ERR_RETRY_UNCACHED_REP, NFS4ERR_SERVERFAULT, |
| | NFS4ERR_STALE, NFS4ERR_TOO_MANY_OPS, | | | NFS4ERR_STALE, NFS4ERR_TOO_MANY_OPS, |
| | NFS4ERR_UNKNOWN_LAYOUTTYPE, NFS4ERR_WRONG_CRED, | | | NFS4ERR_UNKNOWN_LAYOUTTYPE, NFS4ERR_WRONG_CRED, |
| | NFS4ERR_WRONG_TYPE | | | NFS4ERR_WRONG_TYPE |
+------------+----------------------------------------------------+ +------------+----------------------------------------------------+
Table 2: Operations and Their Valid Errors Table 2: Operations and Their Valid Errors
3.6. Extension of Existing Implementations 3.6. Extension of Existing Implementations
The new LAYOUT_WCC operation is OPTIONAL for both NFSv4.2 ([RFC7863]) The new LAYOUT_WCC operation is OPTIONAL for both NFSv4.2 [RFC7863]
and the flexible file layout type ([RFC8435]). and the flexible file layout type [RFC8435].
3.7. Flex Files Layout Type 3.7. Flexible File Layout Type
<CODE BEGINS> <CODE BEGINS>
/// struct ff_data_server_wcc4 { /// struct ff_data_server_wcc4 {
/// deviceid4 ffdsw_deviceid; /// deviceid4 ffdsw_deviceid;
/// stateid4 ffdsw_stateid; /// stateid4 ffdsw_stateid;
/// nfs_fh4 ffdsw_fh_vers<>; /// nfs_fh4 ffdsw_fh_vers<>;
/// fattr4 ffdsw_attributes; /// fattr4 ffdsw_attributes;
/// }; /// };
/// ///
/// struct ff_mirror_wcc4 { /// struct ff_mirror_wcc4 {
/// ff_data_server_wcc4 ffmw_data_servers<>; /// ff_data_server_wcc4 ffmw_data_servers<>;
/// }; /// };
/// ///
/// struct ff_layout_wcc4 { /// struct ff_layout_wcc4 {
/// ff_mirror_wcc4 fflw_mirrors<>; /// ff_mirror_wcc4 fflw_mirrors<>;
/// }; /// };
<CODE ENDS> <CODE ENDS>
The flex file layout type specific results MUST correspond to the The results specific to the flexible file layout type MUST correspond
ff_layout4 data structure as defined in Section 5.1 of [RFC8435]. to the ff_layout4 data structure as defined in Section 5.1 of
There MUST be a one-to-one correspondence between: [RFC8435]. There MUST be a one-to-one correspondence between the
following:
* ff_data_server4 -> ff_data_server_wcc4 * ff_data_server4 -> ff_data_server_wcc4
* ff_mirror4 -> ff_mirror_wcc4 * ff_mirror4 -> ff_mirror_wcc4
* ff_layout4 -> ff_layout_wcc4 * ff_layout4 -> ff_layout_wcc4
Each ff_layout4 has an array of ff_mirror4, which have an array of Each ff_layout4 has an array of ff_mirror4, which has an array of
ff_data_server4. Based on the current filehandle and the ff_data_server4. Based on the current filehandle and the
lowa_stateid, the server can match the reported attributes. lowa_stateid, the server can match the reported attributes.
But the positional correspondence between the elements is not But the positional correspondence between the elements is not
sufficient to determine the attributes to update. Consider the case sufficient to determine the attributes to update. Consider the case
where a layout had three mirrors and two of them had updated where a layout has three mirrors and two of them have updated
attributes, but the third did not. A client could decide to present attributes but the third does not. A client could decide to present
all three mirrors, with one mirror having an attribute mask with no all three mirrors, with one mirror having an attribute mask with no
attributes present. Or it could decide to present only the two attributes present. Or it could decide to present only the two
mirrors which had been changed. mirrors that had been changed.
In either case, the combination of ffdsw_deviceid, ffdsw_stateid, and In either case, the combination of ffdsw_deviceid, ffdsw_stateid, and
ffdsw_fh_vers will uniquely identify the attributes to be updated. ffdsw_fh_vers will uniquely identify the attributes to be updated.
All three arguments are required. A layout might have multiple data All three arguments are required. A layout might have multiple data
files on the same storage device, in which case the ffdsw_deviceid files on the same storage device, in which case the ffdsw_deviceid
and ffdsw_stateid would match, but the ffdsw_fh_vers would not. and ffdsw_stateid would match, but the ffdsw_fh_vers would not.
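The matching rule above can be illustrated with a minimal sketch. It is not taken from the specification: the Python class, the dictionary keyed by the (ffdsw_deviceid, ffdsw_stateid, ffdsw_fh_vers) triple, the function name apply_layout_wcc, and the choice to reject unknown triples are all assumptions made for illustration. The point it shows is that lookup by the triple gives the same result whether the client presents every mirror (one with no attributes) or only the mirrors that changed.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Hypothetical stand-in for ff_data_server_wcc4. Real deviceid4, stateid4,
# and nfs_fh4 values are opaque; bytes are used here only for illustration.
@dataclass
class DataServerWcc:
    deviceid: bytes                      # ffdsw_deviceid
    stateid: bytes                       # ffdsw_stateid
    fh_vers: Tuple[bytes, ...]           # ffdsw_fh_vers<>
    attributes: Dict[str, int] = field(default_factory=dict)  # ffdsw_attributes

Key = Tuple[bytes, bytes, Tuple[bytes, ...]]

def key_of(ds: DataServerWcc) -> Key:
    """Build the identifying triple; all three parts are needed."""
    return (ds.deviceid, ds.stateid, ds.fh_vers)

def apply_layout_wcc(known: Dict[Key, Dict[str, int]],
                     mirrors: List[List[DataServerWcc]]) -> List[Key]:
    """Merge reported attributes into `known` and return the keys updated.

    `mirrors` models fflw_mirrors<>, each holding ffmw_data_servers<>.
    Entries are located by triple, never by position in the arrays.
    """
    updated = []
    for mirror in mirrors:
        for ds in mirror:
            k = key_of(ds)
            if k not in known:
                # Assumed policy for this sketch only.
                raise ValueError("no data file matches the reported triple")
            if not ds.attributes:
                continue  # empty attribute mask: nothing to update
            known[k].update(ds.attributes)
            updated.append(k)
    return updated
```

In this sketch, two data files on the same storage device share the device ID and stateid and are told apart only by their filehandles, which is why the filehandle is part of the key.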
The ffdsw_attributes are processed similar to the obj_attributes in The ffdsw_attributes are processed similar to the obj_attributes in
the SETATTR arguments (See Section 18.34 of [RFC8881]). the SETATTR arguments (see Section 18.30 of [RFC8881]).
4. Extraction of XDR 4. Extraction of XDR
This document contains the external data representation (XDR) This document contains the XDR [RFC4506] description of the new
[RFC4506] description of the new open flags for delegating the file NFSv4.2 operation LAYOUT_WCC. The XDR description is embedded in
to the client. The XDR description is embedded in this document in a this document in a way that makes it simple for the reader to extract
way that makes it simple for the reader to extract into a ready-to- into a ready-to-compile form. The reader can feed this document into
compile form. The reader can feed this document into the following the following shell script to produce the machine-readable XDR
shell script to produce the machine-readable XDR description of the description of the new NFSv4.2 operation LAYOUT_WCC.
new flags:
<CODE BEGINS> <CODE BEGINS>
#!/bin/sh #!/bin/sh
grep '^ *///' $* | sed 's?^ */// ??' | sed 's?^ *///$??' grep '^ *///' $* | sed 's?^ */// ??' | sed 's?^ *///$??'
<CODE ENDS> <CODE ENDS>
That is, if the above script is stored in a file called 'extract.sh', That is, if the above script is stored in a file called 'extract.sh',
and this document is in a file called 'spec.txt', then the reader can and this document is in a file called 'spec.txt', then the reader can
do: do:
<CODE BEGINS> <CODE BEGINS>
sh extract.sh < spec.txt > layout_wcc.x sh extract.sh < spec.txt > layout_wcc.x
<CODE ENDS> <CODE ENDS>
The effect of the script is to remove leading white space from each The effect of the script is to remove leading blank space from each
line, plus a sentinel sequence of '///'. XDR descriptions with the line, plus a sentinel sequence of '///'. XDR descriptions with the
sentinel sequence are embedded throughout the document. sentinel sequence are embedded throughout the document.
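For readers who want to check what the pipeline does line by line, the following is a rough Python equivalent of the shell script. It is an illustrative sketch and not part of the specification; the function name extract_xdr is hypothetical. It keeps only lines that begin with optional spaces followed by '///', strips the sentinel and the single space after it, and turns a bare sentinel line into an empty line.

```python
import re

def extract_xdr(text: str) -> str:
    """Mimic: grep '^ *///' | sed 's?^ */// ??' | sed 's?^ *///$??'"""
    out = []
    for line in text.splitlines():
        # grep '^ *///': keep only lines carrying the sentinel.
        if not re.match(r" *///", line):
            continue
        # First sed: drop leading spaces, the sentinel, and one space.
        line = re.sub(r"^ */// ", "", line, count=1)
        # Second sed: a line holding only the sentinel becomes empty.
        line = re.sub(r"^ *///$", "", line)
        out.append(line)
    return "\n".join(out) + ("\n" if out else "")
```

Indentation that follows the sentinel and its one separating space is preserved, so the extracted XDR keeps the layout of the embedded description.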
Note that the XDR code contained in this document depends on types Note that the XDR code contained in this document depends on types
from the NFSv4.2 nfs4_prot.x file (generated from [RFC7863]). This from the NFSv4.2 nfs4_prot.x file (generated from [RFC7863]). This
includes both nfs types that end with a 4, such as offset4, length4, includes both nfs types that end with a 4 (such as offset4 and
etc., as well as more generic types such as uint32_t and uint64_t. length4) as well as more generic types (such as uint32_t and
uint64_t).
While the XDR can be appended to that from [RFC7863], the various While the XDR can be appended to that from [RFC7863], the various
code snippets belong in their respective areas of that XDR. code snippets belong in their respective areas of that XDR.
4.1. Code Components Licensing Notice
Both the XDR description and the scripts used for extracting the XDR
description are Code Components as described in Section 4 of 'Legal
Provisions Relating to IETF Documents' [LEGAL]. These Code
Components are licensed according to the terms of that document.
5. Security Considerations 5. Security Considerations
There are no new security considerations beyond those in [RFC8435]. There are no new security considerations beyond those in [RFC8435].
6. IANA Considerations 6. IANA Considerations
This section is to be removed before publishing as an RFC. This document has no IANA actions.
There are no IANA considerations for this document.
7. References 7. References
7.1. Normative References 7.1. Normative References
[I-D.ietf-nfsv4-delstid]
Haynes, T. and T. Myklebust, "Extending the Opening of
Files in NFSv4.2", Work in Progress, Internet-Draft,
draft-ietf-nfsv4-delstid-08, 2 October 2024,
<https://datatracker.ietf.org/doc/html/draft-ietf-nfsv4-
delstid-08>.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, Requirement Levels", BCP 14, RFC 2119,
DOI 10.17487/RFC2119, March 1997, DOI 10.17487/RFC2119, March 1997,
<https://www.rfc-editor.org/info/rfc2119>. <https://www.rfc-editor.org/info/rfc2119>.
[RFC4506] Eisler, M., Ed., "XDR: External Data Representation [RFC4506] Eisler, M., Ed., "XDR: External Data Representation
Standard", STD 67, RFC 4506, DOI 10.17487/RFC4506, May Standard", STD 67, RFC 4506, DOI 10.17487/RFC4506, May
2006, <https://www.rfc-editor.org/info/rfc4506>. 2006, <https://www.rfc-editor.org/info/rfc4506>.
[RFC7862] Haynes, T., "Network File System (NFS) Version 4 Minor [RFC7862] Haynes, T., "Network File System (NFS) Version 4 Minor
skipping to change at page 12, line 39 skipping to change at line 501
[RFC8435] Halevy, B. and T. Haynes, "Parallel NFS (pNFS) Flexible [RFC8435] Halevy, B. and T. Haynes, "Parallel NFS (pNFS) Flexible
File Layout", RFC 8435, DOI 10.17487/RFC8435, August 2018, File Layout", RFC 8435, DOI 10.17487/RFC8435, August 2018,
<https://www.rfc-editor.org/info/rfc8435>. <https://www.rfc-editor.org/info/rfc8435>.
[RFC8881] Noveck, D., Ed. and C. Lever, "Network File System (NFS) [RFC8881] Noveck, D., Ed. and C. Lever, "Network File System (NFS)
Version 4 Minor Version 1 Protocol", RFC 8881, Version 4 Minor Version 1 Protocol", RFC 8881,
DOI 10.17487/RFC8881, August 2020, DOI 10.17487/RFC8881, August 2020,
<https://www.rfc-editor.org/info/rfc8881>. <https://www.rfc-editor.org/info/rfc8881>.
7.2. Informative References [RFC9754] Haynes, T. and T. Myklebust, "Extensions for Opening and
Delegating Files in NFSv4.2", RFC 9754,
DOI 10.17487/RFC9754, March 2025,
<https://www.rfc-editor.org/info/rfc9754>.
[LEGAL] IETF Trust, "Legal Provisions Relating to IETF Documents", 7.2. Informative References
November 2008, <http://trustee.ietf.org/docs/IETF-Trust-
License-Policy.pdf>.
[RFC1813] Callaghan, B., Pawlowski, B., and P. Staubach, "NFS [RFC1813] Callaghan, B., Pawlowski, B., and P. Staubach, "NFS
Version 3 Protocol Specification", RFC 1813, Version 3 Protocol Specification", RFC 1813,
DOI 10.17487/RFC1813, June 1995, DOI 10.17487/RFC1813, June 1995,
<https://www.rfc-editor.org/info/rfc1813>. <https://www.rfc-editor.org/info/rfc1813>.
Appendix A. Acknowledgments Acknowledgments
Dave Noveck, Tigran Mkrtchyan, and Rick Macklem provided reviews of Dave Noveck, Tigran Mkrtchyan, and Rick Macklem provided reviews of
the document. the document.
Authors' Addresses Authors' Addresses
Thomas Haynes Thomas Haynes
Hammerspace Hammerspace
Email: loghyr@gmail.com Email: loghyr@gmail.com
 End of changes. 49 change blocks. 
202 lines changed or deleted 184 lines changed or added
