rfc9766.original.xml   rfc9766.xml 
<?xml version='1.0' encoding='utf-8'?> <?xml version='1.0' encoding='utf-8'?>
<!DOCTYPE rfc>
<?xml-stylesheet type='text/xsl' href='rfc2629.xslt'?>
<rfc <!DOCTYPE rfc [
category='std' <!ENTITY nbsp "&#160;">
docName='draft-ietf-nfsv4-layoutwcc-07' <!ENTITY zwsp "&#8203;">
ipr='trust200902' <!ENTITY nbhy "&#8209;">
obsoletes='' <!ENTITY wj "&#8288;">
scripts='Common,Latin' ]>
sortRefs='true'
submissionType='IETF' <rfc xmlns:xi="http://www.w3.org/2001/XInclude" category='std' docName='draft-ie
symRefs='true' tf-nfsv4-layoutwcc-07' number="9766" ipr='trust200902' obsoletes='' updates="" s
tocDepth='3' ortRefs='true' submissionType='IETF' symRefs='true' tocDepth='3' tocInclude='tru
tocInclude='true' e' version='3' consensus='true' xml:lang='en'>
version='3'
consensus='true'
xml:lang='en'>
<front> <front>
<title abbrev='LAYOUT_WCC'> <title abbrev="WCC in NFSv4.2's Flexible File Layout">Extensions for Weak Cach
Add LAYOUT_WCC to NFSv4.2's Flex File Layout Type e Consistency in NFSv4.2's Flexible File Layout
</title> </title>
<seriesInfo name='Internet-Draft' value='draft-ietf-nfsv4-layoutwcc-07'/> <seriesInfo name='RFC' value='9766'/>
<author fullname='Thomas Haynes' initials='T.' surname='Haynes'> <author fullname='Thomas Haynes' initials='T.' surname='Haynes'>
<organization abbrev='Hammerspace'>Hammerspace</organization> <organization abbrev='Hammerspace'>Hammerspace</organization>
<address> <address>
<email>loghyr@gmail.com</email> <email>loghyr@gmail.com</email>
</address> </address>
</author> </author>
<author fullname='Trond Myklebust' initials='T.' surname='Myklebust'> <author fullname='Trond Myklebust' initials='T.' surname='Myklebust'>
<organization abbrev='Hammerspace'>Hammerspace</organization> <organization abbrev='Hammerspace'>Hammerspace</organization>
<address> <address>
<email>trondmy@hammerspace.com</email> <email>trondmy@hammerspace.com</email>
</address> </address>
</author> </author>
<date year='2025' month='February' day='07'/> <date year='2025' month='April'/>
<area>Transport</area> <area>WIT</area>
<workgroup>Network File System Version 4</workgroup> <workgroup>nfsv4</workgroup>
<keyword>NFSv4</keyword>
<abstract> <abstract>
<t> <t>
This document specifies extensions to the parallel Network This document specifies extensions to NFSv4.2 for improving Weak Cache
File System (NFS) version 4 (pNFS) for improving write cache Consistency (WCC). These extensions introduce mechanisms that ensure
consistency. These extensions introduce mechanisms that ensure partial writes performed under a Parallel NFS (pNFS) layout remain
partial writes performed under a pNFS layout remain coherent and coherent and correctly tracked. The solution addresses concurrency and
correctly tracked. The solution addresses concurrency and data data integrity concerns that may arise when multiple clients write to
integrity concerns that may arise when multiple clients write to
the same file through separate data servers. By defining additional the same file through separate data servers. By defining additional
interactions among clients, metadata servers, and data servers, this interactions among clients, metadata servers, and data servers, this
specification enhances the reliability of NFSv4 in parallel-access specification enhances the reliability of NFSv4 in parallel-access
environments and ensures consistency across diverse deployment environments and ensures consistency across diverse deployment
scenarios. scenarios.
</t> </t>
</abstract> </abstract>
<note removeInRFC='true'>
<t>
Discussion of this draft takes place
on the NFSv4 working group mailing list (nfsv4@ietf.org),
which is archived at
<eref target='https://mailarchive.ietf.org/arch/browse/nfsv4/'/>.
Working Group information can be found at
<eref target='https://datatracker.ietf.org/wg/nfsv4/about/'/>.
</t>
</note>
</front> </front>
<middle> <middle>
<section anchor='sec_intro' numbered='true' removeInRFC='false' toc='default'> <section anchor='sec_intro' numbered='true' toc='default'>
<name>Introduction</name> <name>Introduction</name>
<t> <t>
In the Network File System version 4 (NFSv4) with a Parallel NFS In the Parallel NFS (pNFS)
(pNFS) Flexible File Layout (see Section 12 of <xref target='RFC8435' format flexible file layout (see <xref target='RFC8435'/>), there is no mechanism f
='default' or the data servers to
sectionFormat='of'/>) server, there is no mechanism for the data update the metadata servers when the data portion of the file is
servers to update the metadata servers for when the data portion modified. The metadata server needs this knowledge to correspondingly
of the file is modified. The metadata server needs this knowledge update the metadata portion of the file. If the client is using NFSv3 as
to correspondingly update the metadata portion of the file. If the the protocol with the data server, it can leverage Weak Cache Consistency
client is using NFSv3 as the protocol with the data server, it can (WCC) to update the metadata server of the attribute changes. In this
leverage weak cache consistency (WCC) to update the metadata server document, we introduce a new operation called LAYOUT_WCC to NFSv4.2, which
of the attribute changes. In this document, we introduce a new allows the client to periodically report the attributes of the data files
operation called LAYOUT_WCC to NFSv4.2 which allows the client to periodical to the metadata server.
ly
report the attributes of the data files to the metadata server.
</t> </t>
<t> <t>
Using the process detailed in <xref target='RFC8178' format='default' Using the process detailed in <xref target='RFC8178' format='default'/>, the
sectionFormat='of'/>, the revisions in this document become an revisions in this document become an
extension of NFSv4.2 <xref target='RFC7862' format='default' extension of NFSv4.2 <xref target='RFC7862' format='default'/>. They are bui
sectionFormat='of'/>. They are built on top of the external data lt on top of the External Data
representation (XDR) <xref target='RFC4506' format='default' Representation (XDR) <xref target='RFC4506' format='default'/> generated fro
sectionFormat='of'/> generated from <xref target='RFC7863' m <xref target='RFC7863'
format='default' sectionFormat='of'/>. format='default'/>.
</t> </t>
<section anchor='sec_defs' numbered='true' removeInRFC='false' toc='default'> <section anchor='sec_defs' numbered='true' toc='default'>
<name>Definitions</name> <name>Definitions</name>
<t> <t>
For a more comprehensive set of definitions, see Section 1.1 of <xref targ For a more comprehensive set of definitions, see <xref target='RFC8435'
et='RFC8435' format='default' sectionFormat='of' section="1.1"/>.
sectionFormat='of'/>.
</t> </t>
<dl newline='false' spacing='normal'> <dl newline='false' spacing='normal'>
<dt>(file) data:</dt> <dt>(file) data:</dt>
<dd> <dd>
that part of the file system object that contains the that part of the file system object that contains the
data to be read or written. It is the contents of the object data to be read or written. It is the contents of the object
rather than the attributes of the object. rather than the attributes of the object.
</dd> </dd>
<dt>data server (DS):</dt> <dt>data server (DS):</dt>
skipping to change at line 136 skipping to change at line 113
<dd> <dd>
the pNFS server that provides metadata the pNFS server that provides metadata
information for a file system object. information for a file system object.
</dd> </dd>
<dt>storage device:</dt> <dt>storage device:</dt>
<dd> <dd>
the target to which clients may direct I/O requests the target to which clients may direct I/O requests
when they hold an appropriate layout. Note that each data server when they hold an appropriate layout. Note that each data server
is a storage device but that some storage devices are not data is a storage device but that some storage devices are not data
servers. (See Section 2.1 of <xref target='RFC8434' format='default' sectionFormat='of'/> servers. (See <xref target='RFC8434' sectionFormat='of' section="2.1"/>
for a discussion on the difference between a data server for a discussion on the difference between a data server
and a storage device.) and a storage device.)
</dd> </dd>
<dt>weak cache consistency (WCC):</dt> <dt>weak cache consistency (WCC):</dt>
<dd> <dd>
In NFSv3, WCC allows the client to check for file attribute changes the mechanism in NFSv3 that allows the client to check for file attribut
before and after an operation (See Section 2.6 of <xref target='RFC1813' e changes
format='default' sectionFormat='of'/>). before and after an operation (see <xref target='RFC1813'
sectionFormat='of' section="2.6"/>).
</dd> </dd>
</dl> </dl>
</section> </section>
<section numbered='true' removeInRFC='false' toc='default'> <section numbered='true' toc='default'>
<name>Requirements Language</name> <name>Requirements Language</name>
<t> <t>
The key words '<bcp14>MUST</bcp14>', '<bcp14>MUST NOT</bcp14>', The key words "<bcp14>MUST</bcp14>", "<bcp14>MUST NOT</bcp14>", "<bcp14>REQU
'<bcp14>REQUIRED</bcp14>', '<bcp14>SHALL</bcp14>', '<bcp14>SHALL IRED</bcp14>", "<bcp14>SHALL</bcp14>", "<bcp14>SHALL
NOT</bcp14>', '<bcp14>SHOULD</bcp14>', '<bcp14>SHOULD NOT</bcp14>', NOT</bcp14>", "<bcp14>SHOULD</bcp14>", "<bcp14>SHOULD NOT</bcp14>", "<bcp14>
'<bcp14>RECOMMENDED</bcp14>', '<bcp14>NOT RECOMMENDED</bcp14>', RECOMMENDED</bcp14>", "<bcp14>NOT RECOMMENDED</bcp14>",
'<bcp14>MAY</bcp14>', and '<bcp14>OPTIONAL</bcp14>' in this "<bcp14>MAY</bcp14>", and "<bcp14>OPTIONAL</bcp14>" in this document are to
document are to be interpreted as described in BCP 14 <xref be interpreted as
target='RFC2119' format='default' sectionFormat='of'/> <xref described in BCP&nbsp;14 <xref target="RFC2119"/> <xref target="RFC8174"/>
target='RFC8174' format='default' sectionFormat='of'/> when, when, and only when, they appear in all capitals, as shown here.
and only when, they appear in all capitals, as shown here. </t>
</t>
</section> </section>
</section> </section>
<section anchor='wcc' numbered='true' removeInRFC='false' toc='default'> <section anchor='wcc' numbered='true' toc='default'>
<name>Weak Cache Consistency (WCC)</name> <name>Weak Cache Consistency (WCC)</name>
<t> <t>
A pNFS layout type enables the metadata server to inform the client A pNFS layout type enables the metadata server to inform the client of
of both the storage protocol and the locations of the data that the both the storage protocol and the locations of the data that the client
client should use when communicating with the storage devices. The should use when communicating with the storage devices. The flexible
Flex Files Layout Type, as specified in <xref target='RFC8435' file layout type, as specified in <xref target='RFC8435' />, describes
/>, describes how data servers using NFSv3 can be accessed. The how data servers using NFSv3 can be accessed. The client is restricted
client is restricted to performing NFSv3 READ (Section 3.3.6 to performing the following NFSv3 operations on the filehandles
of <xref target='RFC1813'/>), WRITE (Section 3.3.6 of <xref provided in the layout: READ, WRITE, and COMMIT (see Sections <xref
target='RFC1813'/>), and COMMIT (Section 3.3.21 of <xref target='RFC1813' sectionFormat="bare" section="3.3.6"/>, <xref
target='RFC1813'/>) operations on the file handles provided in the target='RFC1813' sectionFormat="bare" section="3.3.7"/>, and <xref
layout. In other words, the client may only use NFSv3 operations target='RFC1813' sectionFormat="bare" section="3.3.21"/> of <xref
that act directly on the data portion of the file. target='RFC1813'/>, respectively). In other words, the client may only us
e NFSv3
operations that act directly on the data portion of the file.
</t> </t>
<t> <t>
Because Because there is no control protocol (see <xref target='RFC8434'
there is no contol protocol (see <xref target='RFC8434' format='default' format='default' sectionFormat='of'/>) possible with all data servers,
sectionFormat='of'/>) NFSv3 is used as the control protocol. As such, the following NFSv3
possible with all data servers, NFSv3 is used as the control protocol. operations are commonly used by the metadata server: CREATE, GETATTR,
As such, the NFSv3 CREATE (see Section 3.3.8 of <xref target='RFC1813' and SETATTR (see Sections <xref target='RFC1813' section="3.3.8"
format='default' sectionFormat='of'/>), GETATTR (see Section 3.3.1 of <xr sectionFormat='bare'/>, <xref target='RFC1813' section="3.3.1"
ef target='RFC1813' sectionFormat='bare'/>, and <xref target='RFC1813' section="3.3.2"
format='default' sectionFormat='of'/>), and SETATTR (see Section 3.3.2 of sectionFormat='bare'/> of <xref target='RFC1813'/>, respectively). That
<xref target='RFC1813' is, the metadata server is only allowed to use NFSv3 operations that
format='default' sectionFormat='of'/>) are operations commonly directly act on the metadata portion of the data file. GETATTR allows
used by the metadata server. I.e., the metadata server is only allowed to the metadata server to mainly retrieve the mtime (modify time), ctime
use (change time), and atime (access time). The metadata server can use
NFSv3 operations which directly act on the metadata portion of the data f this information to determine if the client modified the file whilst it
ile. held an iomode of LAYOUTIOMODE4_RW (see <xref target='RFC8881'
GETATTR allows the metadata server to mainly retrieve the mtime (modify t section="3.3.20" sectionFormat='of'/>). Then it can determine the
ime), following for the metadata file: time_modify, time_metadata, and
ctime (change time), and atime (access time). The metadata server time_access (see Sections <xref target='RFC8881' section="5.8.2.43"
can use this information to determine if the client modified the sectionFormat='bare'/>, <xref target='RFC8881' section="5.8.2.42"
file whilst it held an iomode of LAYOUTIOMODE4_RW (see Section 3.3.20 sectionFormat='bare'/>, and <xref target='RFC8881' section="5.8.2.37"
of <xref target='RFC8881' format='default' sectionFormat='of'/>). Then sectionFormat='bare'/> of <xref target='RFC8881'/>, respectively). That
it can determine the time_modify (see Section 5.8.2.43 is, it can determine the information to return to clients in an NFSv4.2
of <xref target='RFC8881' format='default' sectionFormat='of'/>), time_me GETATTR response.
tadata (see Section 5.8.2.42
of <xref target='RFC8881' format='default' sectionFormat='of'/>), and tim
e_access (see Section 5.8.2.37
of <xref target='RFC8881' format='default' sectionFormat='of'/>) for
the metadata file. I.e., the information to return to clients
in a NFSv4.2 GETATTR response.
</t> </t>
<t> <t>
For example, the metadata server might issue an NFSv3 GETATTR For example, the metadata server might issue an NFSv3 GETATTR operation
operation to the data server, which is typically triggered by to the data server, which is typically triggered by a client's NFSv4
a client's NFSv4 GETATTR request to the metadata server. In GETATTR request to the metadata server. In addition to the cost of each
addition to the cost of each individual GETATTR operation, individual GETATTR operation, the data server can be overwhelmed by a
the data server can be overwhelmed by a large volume of such large volume of such requests. NFSv3 addressed a similar challenge by
requests. NFSv3 addressed a similar challenge by including a including a post-operation attribute in the READ and WRITE operations
post-operation attribute in the READ and WRITE operations to to report WCC data (see <xref target='RFC1813'
report weak cache consistency (WCC) data (see Section 2.6 of sectionFormat="of" section="2.6"/>).
<xref target='RFC1813' />).
</t> </t>
<t> <t>
Each NFSv3 operation entails a single round trip between the Each NFSv3 operation entails a single round trip between the
client and server. Consequently, issuing a WRITE followed by client and server. Consequently, issuing a WRITE followed by
a GETATTR would require two round trips. In that situation, the a GETATTR would require two round trips. In that situation, the
retrieved attribute information is regarded as strict server-client retrieved attribute information is regarded as having strict server-client
consistency. By contrast, NFSv4 enables a WRITE and GETATTR to consistency. By contrast, NFSv4 enables a WRITE and GETATTR to
be combined within a compound operation, which requires only be combined within a compound operation, which requires only
one round trip. This combined approach is likewise considered one round trip. This combined approach is likewise considered to have
strict server-client consistency. Essentially, NFSv4 READ and strict server-client consistency. Essentially, NFSv4 READ and
WRITE operations omit post-operation attributes, allowing the WRITE operations omit post-operation attributes, allowing the
client to determine whether it requires that information. client to determine whether it requires that information.
</t> </t>
<t> <t>
Whilst NFSv4 got rid of the requirement for WCC information to Whilst NFSv4 got rid of the requirement for WCC information to
be supplied by the WRITE or READ operations, the introduction be supplied by the WRITE or READ operations, the introduction
of pNFS re-introduces the same problem. The metadata server of pNFS reintroduces the same problem. The metadata server
has to communicate with the data server in order to get has to communicate with the data server in order to get
at the data which could be provided by a WCC model. the data that could be provided by a WCC model.
</t> </t>
<t> <t>
With the flexible file layout type, the client can leverage With the flexible file layout type, the client can leverage
the NFSv3 WCC to service the proxying of times (See Section 4 of <xref the NFSv3 WCC to service the proxying of times (see <xref
target='I-D.ietf-nfsv4-delstid' format='default' sectionFormat='of'/>). target='RFC9754' section="5" sectionFormat='of'/>),
But the granularity of this data is limited. With client side but the granularity of this data is limited. With client-side
mirroring (See Section 8 of <xref target='RFC8435' format='default' mirroring (see <xref target='RFC8435' section="8"
sectionFormat='of'/>), the client has to aggregate the N mirrored sectionFormat='of'/>), the client has to aggregate the N mirrored
files in order to send one piece of information instead of N files in order to send one piece of information instead of N
pieces of information. Also, the client is limited to sending pieces of information. Also, the client is limited to sending
that information only when it returns the delegation. that information only when it returns the delegation.
</t> </t>
<t> <t>
This document introduces a new NFSv4.2 operation, LAYOUT_WCC, This document introduces a new NFSv4.2 operation, LAYOUT_WCC,
which enables the client to provide the metadata server with which enables the client to provide the metadata server with
information obtained from the data server. The client is information obtained from the data server. The client is
responsible for gathering the NFSv3 WCC data, returned by the responsible for gathering the NFSv3 WCC data, returned by the
three permissible NFSv3 operations, and conveying it back to three permissible NFSv3 operations, and conveying it back to
the metadata server as part of NFSv4.2 attributes. The metadata the metadata server as part of NFSv4.2 attributes. The metadata
server <bcp14>MAY</bcp14> therefore avoid issuing costly NFSv3 server <bcp14>MAY</bcp14> therefore avoid issuing costly NFSv3
GETATTR calls to the data servers. Because this approach relies GETATTR calls to the data servers. Because this approach relies
on a weak model, the metadata server <bcp14>MAY</bcp14> still on a weak model, the metadata server <bcp14>MAY</bcp14> still
perform these calls if it chooses to strengthen the model. perform these calls if it chooses to strengthen the model.
</t> </t>
</section> </section>
<section anchor='op_LAYOUT_WCC' numbered='true' removeInRFC='false' toc='default'> <section anchor='op_LAYOUT_WCC' numbered='true' toc='default'>
<name>Operation 77: LAYOUT_WCC - Layout Weak Cache Consistency</name> <name>Operation 77: LAYOUT_WCC - Layout Weak Cache Consistency</name>
<section toc='exclude' anchor='ss_op_LAYOUT_WCC_A' numbered='true'>
<section toc='default' anchor='ss_op_LAYOUT_WCC_A' numbered='true'>
<name>ARGUMENT</name> <name>ARGUMENT</name>
<sourcecode name='' type='' markers='true'><![CDATA[ <sourcecode name='' type='xdr' markers='true'><![CDATA[
/// struct LAYOUT_WCC4args { /// struct LAYOUT_WCC4args {
/// stateid4 lowa_stateid; /// stateid4 lowa_stateid;
/// layouttype4 lowa_type; /// layouttype4 lowa_type;
/// opaque lowa_body<>; /// opaque lowa_body<>;
/// }; /// };
]]></sourcecode> ]]></sourcecode>
<t> <t>
stateid4 is defined in Section 3.3.12 of <xref target='RFC8881' stateid4 is defined in <xref target='RFC8881' section="3.3.12"
format='default' sectionFormat='of'/>. sectionFormat='of'/>. layouttype4 is defined in <xref target='RFC8881'
layouttype4 is defined in Section 3.3.13 of <xref target='RFC8881' section="3.3.13" sectionFormat='of'/>.
format='default' sectionFormat='of'/>.
</t> </t>
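As an illustration of the wire layout implied by the LAYOUT_WCC4args structure above, the following minimal Python sketch packs the three fields using standard XDR rules: a stateid4 as a 32-bit seqid followed by 12 opaque bytes, layouttype4 as a 32-bit enum, and lowa_body as a length-prefixed opaque padded to a 4-byte boundary. The function name encode_layout_wcc4args and its parameters are hypothetical and not part of the specification, and the body is treated as an already-encoded, layout-type-specific blob whose contents this sketch does not define.

```python
import struct

# Value assigned to the flexible file layout type in RFC 8435.
LAYOUT4_FLEX_FILES = 4

def encode_layout_wcc4args(seqid, other, layout_type, body):
    """Hypothetical helper: XDR-encode LAYOUT_WCC4args as bytes.

    seqid and other form the stateid4; body is an opaque,
    layout-type-specific payload that is assumed to be encoded already.
    """
    if len(other) != 12:
        raise ValueError("stateid4 'other' field must be 12 bytes")
    # XDR variable-length opaque data is padded to a multiple of 4 bytes.
    pad = (4 - len(body) % 4) % 4
    return (
        struct.pack("!I", seqid) + other      # stateid4 lowa_stateid
        + struct.pack("!I", layout_type)      # layouttype4 lowa_type
        + struct.pack("!I", len(body))        # length of lowa_body
        + body + b"\x00" * pad                # lowa_body plus padding
    )
```

A caller would pass the layout stateid fields it already holds together with LAYOUT4_FLEX_FILES and the encoded attribute payload.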
</section> </section>
<section toc='exclude' anchor='ss_op_LAYOUT_WCC_R' numbered='true'> <section toc='default' anchor='ss_op_LAYOUT_WCC_R' numbered='true'>
<name>RESULT</name> <name>RESULT</name>
<sourcecode name='' type='' markers='true'><![CDATA[ <sourcecode name='' type='xdr' markers='true'><![CDATA[
/// struct LAYOUT_WCC4res { /// struct LAYOUT_WCC4res {
/// nfsstat4 lowr_status; /// nfsstat4 lowr_status;
/// }; /// };
]]></sourcecode> ]]></sourcecode>
<t> <t>
nfsstat4 is defined in Section 3.2 of <xref target='RFC8881' nfsstat4 is defined in <xref target='RFC8881'
format='default' sectionFormat='of'/>. section="3.2" sectionFormat='of'/>.
</t> </t>
</section> </section>
<section toc='exclude' anchor='ss_op_LAYOUT_WCC_D' numbered='true'> <section toc='default' anchor='ss_op_LAYOUT_WCC_D' numbered='true'>
<name>DESCRIPTION</name> <name>DESCRIPTION</name>
<t> <t>
The current filehandle and the lowa_stateid identify the specific The current filehandle and the lowa_stateid identify the specific
layout for the LAYOUT_WCC operation. The lowa_type indicates how layout for the LAYOUT_WCC operation. The lowa_type indicates how
to interpret the layout-type-specific payload contained in the to interpret the layout-type-specific payload contained in the
lowa_body field. The lowa_type is the corresponding value lowa_body field. The lowa_type is the corresponding value
from the IANA registry for 'pNFS Layout Types' for the layout from the "pNFS Layout Types" IANA registry for the layout
type being used. type being used.
</t> </t>
<t> <t>
The lowa_body contains the data file attributes. The client is The lowa_body contains the data file attributes. The client is
responsible for mapping NFSv3 post-operation attributes to the responsible for mapping NFSv3 post-operation attributes to the
fattr4 representation. Similar to the behavior of post-operation fattr4 representation. Similar to the behavior of post-operation
attributes, the client may ignore these attributes, and the attributes, the client may ignore these attributes, and the
server may also choose to ignore any attributes included in server may also choose to ignore any attributes included in
LAYOUT_WCC. However, the server can use these attributes to avoid LAYOUT_WCC. However, the server can use these attributes to avoid
querying the data server for data file attributes. Because these querying the data server for data file attributes. Because these
attributes are optional and the client has no recourse if the attributes are optional and the client has no recourse if the
server opts to disregard them, there is no requirement to return server opts to disregard them, there is no requirement to return
a bitmap4 indicating which attributes have been accepted in the a bitmap4 indicating which attributes have been accepted in the
LAYOUT_WCC result. LAYOUT_WCC result.
</t> </t>
</section> </section>
<section anchor='ss_op_LAYOUT_WCC_impl' numbered='true' removeInRFC='false' toc='default'> <section anchor='ss_op_LAYOUT_WCC_impl' numbered='true' toc='default'>
<name>Implementation</name> <name>Implementation</name>
<section anchor='ss_op_LAYOUT_WCC_examples' numbered='true' removeInRFC='fa <section anchor='ss_op_LAYOUT_WCC_examples' numbered='true' toc='default'>
lse' toc='default'> <name>Examples of When to Use LAYOUT_WCC</name>
<name>Examples of when to use LAYOUT_WCC</name>
<t> <t>
The only way for the metadata server to detect modifications The only way for the metadata server to detect modifications
to the data file is to probe the data servers via a GETATTR. It to the data file is to probe the data servers via a GETATTR. It
can compare the mtime results across multiple calls to detect a can compare the mtime results across multiple calls to detect an
NFSv3 WRITE operation by the client. Likewise, the atime results NFSv3 WRITE operation by the client. Likewise, the atime results
indicate the client having issued a NFSv3 READ operation. As such, indicate the client having issued an NFSv3 READ operation. As such,
the client can leverage the LAYOUT_WCC operation whenever it the client can leverage the LAYOUT_WCC operation whenever it
has the belief that the metadata server would need to refresh has the belief that the metadata server would need to refresh
the attributes of the data files. While the client can send a the attributes of the data files. While the client can send a
LAYOUT_WCC at any time, there are times it will want to do this LAYOUT_WCC at any time, there are times it will want to do this
operation in order to avoid having the metadata server issue operation in order to avoid having the metadata server issue
NFSv3 GETATTR requests to the data servers: NFSv3 GETATTR requests to the data servers:
</t> </t>
<ul spacing='normal'> <ul spacing='normal'>
<li> <li>
Whenever it sends a GETATTR for any of the following attributes: size <t>Whenever it sends a GETATTR for any of the following attributes:</
(see Section 5.8.1.5 t>
of <xref target='RFC8881' format='default' sectionFormat='of'/>), spa <ul spacing='normal'>
ce_used (see Section 5.8.2.25 <li>size (see <xref target='RFC8881' sectionFormat='of'
of <xref target='RFC8881' format='default' sectionFormat='of'/>), cha section="5.8.1.5"/>)</li>
nge (see Section 5.8.1.4 <li>space_used (see <xref target='RFC8881'
of <xref target='RFC8881' format='default' sectionFormat='of'/>), sectionFormat='of' section="5.8.2.35"/>)</li>
time_access (see Section 5.8.2.37 <li>change (see <xref
of <xref target='RFC8881' format='default' sectionFormat='of'/>), tim target='RFC8881' sectionFormat='of' section="5.8.1.4"/>)</li>
e_metadata (see Section 5.8.2.42 <li>time_access (see <xref target='RFC8881' sectionFormat='of'
of <xref target='RFC8881' format='default' sectionFormat='of'/>), and section="5.8.2.37"/>)</li>
time_modify (see Section 5.8.2.43 <li>time_metadata (see <xref target='RFC8881'
of <xref target='RFC8881' format='default' sectionFormat='of'/>). sectionFormat='of' section="5.8.2.42"/>)</li>
<li>time_modify (see
<xref target='RFC8881' sectionFormat='of'
section="5.8.2.43"/>)</li>
</ul>
</li> </li>
<li> <li>
Whenever it sends an NFS4ERR_ACCESS error via LAYOUTRETURN or LAYOUTERROR - it could Whenever it sends an NFS4ERR_ACCESS error via LAYOUTRETURN or LAYOUTERROR. It could
have already gotten the NFSv3 uid and gid values back in the WCC of the WRITE, have already gotten the NFSv3 uid and gid values back in the WCC of the WRITE,
READ, or COMMIT operation which got the error. Thus it could report t READ, or COMMIT operation that got the error. Thus, it could report t
hat information hat information
back to the metadata server, saving it from querying that information back to the metadata server, saving it from querying that information
via a NFSv3 GETATTR. via an NFSv3 GETATTR.
</li> </li>
<li>
Whenever it sends a SETATTR to refresh the proxied times (See Section <li>
4 of <xref Whenever it sends a SETATTR to refresh the proxied times (see <xref
target='I-D.ietf-nfsv4-delstid' format='default' sectionFormat='of'/> target='RFC9754' section="5" sectionFormat='of'/>). The metadata serv
) - the metadata server is er will
going to want to correlate these times in order to detect later modif correlate these times in order to detect later modification to
ication to
the data file. the data file.
</li> </li>
</ul> </ul>
</section> </section>
<section anchor='ss_op_LAYOUT_WCC_payload' numbered='true' removeInRFC='fal <section anchor='ss_op_LAYOUT_WCC_payload' numbered='true' toc='default'>
se' toc='default'> <name>Examples of What to Send in LAYOUT_WCC</name>
<name>Examples of what to send in the LAYOUT_WCC</name>
<t> <t>
The NFSv3 attributes returned in the WCC of WRITE, READ, and COMMIT are The NFSv3 attributes returned in the WCC of WRITE, READ, and COMMIT ope
a smaller subset rations are a smaller subset
of what can be transmitted as a NFSv4 attribute. The mapping of NFSv3 t of what can be transmitted as an NFSv4 attribute. The mapping of NFSv3
o NFSv4 attributes to NFSv4 attributes
is shown in <xref target='table_mappings'/>. is shown in <xref target='table_mappings'/>.
The LAYOUT_WCC <bcp14>MUST</bcp14> provide all of these attributes to the metadata server. The LAYOUT_WCC <bcp14>MUST</bcp14> provide all of these attributes to the metadata server.
Both the uid and gid are stringified into their respective attributes of owner and owner_group. Both the uid and gid are stringified into their respective attributes of owner and owner_group.
The reason to provide these two attributes is in case of NFS4ERR_ACCESS, the metadata In the case of NFS4ERR_ACCESS, the reason to provide these two attributes is that the metadata
server can compare what it expects the values of the uid and gid of the data file server can compare what it expects the values of the uid and gid of the data file
to be versus the actual values. It can then repair the permissions as needed or to be versus the actual values. It can then repair the permissions as needed or
modify the expected values it has cached. modify the expected values it has cached.
</t> </t>
<table anchor='table_mappings'> <table anchor='table_mappings'>
<name>NFSv3 to NFSv4.2 Attribute Mappings</name> <name>NFSv3 to NFSv4.2 Attribute Mappings</name>
<thead> <thead>
<tr><th>NFSv3 Attribute</th> <th>NFSv4.2 Attribute</th></tr> <tr><th>NFSv3 Attribute</th> <th>NFSv4.2 Attribute</th></tr>
</thead> </thead>
skipping to change at line 391 skipping to change at line 380
<tr><td>gid</td> <td>owner_group</td></tr> <tr><td>gid</td> <td>owner_group</td></tr>
<tr><td>atime</td> <td>time_access</td></tr> <tr><td>atime</td> <td>time_access</td></tr>
<tr><td>mtime</td> <td>time_modify</td></tr> <tr><td>mtime</td> <td>time_modify</td></tr>
<tr><td>ctime</td> <td>time_metadata</td></tr> <tr><td>ctime</td> <td>time_metadata</td></tr>
</tbody> </tbody>
</table> </table>
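The mapping in the table above can be sketched in Python as a simple lookup. This is only an illustration: the names NFS3_TO_NFS4 and map_wcc_attrs are hypothetical, the dictionary covers only the rows visible in this comparison plus the uid to owner pairing stated in the text, and converting the numeric uid and gid with str() is an assumption standing in for whatever string form an implementation actually uses.

```python
# Hypothetical lookup built from the visible table rows and the text above.
NFS3_TO_NFS4 = {
    "uid": "owner",
    "gid": "owner_group",
    "atime": "time_access",
    "mtime": "time_modify",
    "ctime": "time_metadata",
}

def map_wcc_attrs(nfs3_attrs):
    """Translate NFSv3 post-operation attributes to NFSv4.2 attribute names.

    Raises ValueError if any mapped attribute is absent, reflecting the
    requirement that LAYOUT_WCC provide all of them.
    """
    missing = [name for name in NFS3_TO_NFS4 if name not in nfs3_attrs]
    if missing:
        raise ValueError("missing NFSv3 attributes: " + ", ".join(missing))
    nfs4_attrs = {}
    for nfs3_name, nfs4_name in NFS3_TO_NFS4.items():
        value = nfs3_attrs[nfs3_name]
        if nfs3_name in ("uid", "gid"):
            # Assumption: a plain numeric string; real deployments may differ.
            value = str(value)
        nfs4_attrs[nfs4_name] = value
    return nfs4_attrs
```

A client could run the attributes from each WRITE, READ, or COMMIT reply through such a helper before building the LAYOUT_WCC payload.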
</section> </section>
</section> </section>
<section anchor='ss_op_LAYOUT_WCC_errs' numbered='true' removeInRFC='false' toc='default'> <section anchor='ss_op_LAYOUT_WCC_errs' numbered='true' toc='default'>
<name>Allowed Errors</name> <name>Allowed Errors</name>
<t> <t>
The LAYOUT_WCC operation can raise the errors in The LAYOUT_WCC operation can raise the errors listed in <xref
<xref target='tbl_op_error_returns' format='default' sectionFormat='of'/>. target='tbl_op_error_returns' format='default'/>. When an error is
When an error is encountered, the metadata server can decide to ignore encountered, the metadata server can decide to ignore the entire
the entire operation or depending on the layout type operation, or depending on the layout-type-specific payload, it could
specific payload, it could decide to apply a portion of the payload. decide to apply a portion of the payload. Note that there are no new
Note that there are no new errors introduced for the LAYOUT_WCC errors introduced for the LAYOUT_WCC operation and the errors in <xref
operation and the errors in <xref target='tbl_op_error_returns' format='de target='tbl_op_error_returns' format='default'/> are each defined in
fault' sectionFormat='of'/> <xref target='RFC8881' section="15.1" sectionFormat='of'/>. <xref
are each defined in Section 15.1 of <xref target='RFC8881' target='tbl_op_error_returns' format='default'/> can be considered as an
format='default' sectionFormat='of'/>. <xref target='tbl_op_error_returns' extension of <xref target='RFC8881' section="15.2" sectionFormat='of'/>.
format='default' sectionFormat='of'/>
can be considered as an extension of Section 15.2 of <xref target='RFC8881
'
format='default' sectionFormat='of'/>.
</t> </t>
<table anchor='tbl_op_error_returns' align='center'> <table anchor='tbl_op_error_returns' align='center'>
<name>Operations and Their Valid Errors</name> <name>Operations and Their Valid Errors</name>
<thead> <thead>
<tr> <tr>
<th>Operation</th> <th>Operation</th>
<th>Errors</th> <th>Errors</th>
</tr> </tr>
</thead> </thead>
<tbody> <tbody>
skipping to change at line 452 skipping to change at line 440
NFS4ERR_STALE, NFS4ERR_STALE,
NFS4ERR_TOO_MANY_OPS, NFS4ERR_TOO_MANY_OPS,
NFS4ERR_UNKNOWN_LAYOUTTYPE, NFS4ERR_UNKNOWN_LAYOUTTYPE,
NFS4ERR_WRONG_CRED, NFS4ERR_WRONG_CRED,
NFS4ERR_WRONG_TYPE NFS4ERR_WRONG_TYPE
</td> </td>
</tr> </tr>
</tbody> </tbody>
</table> </table>
</section> </section>
<section anchor='ss_op_LAYOUT_WCC_opt' numbered='true' removeInRFC='false' toc='default'> <section anchor='ss_op_LAYOUT_WCC_opt' numbered='true' toc='default'>
<name>Extension of Existing Implementations</name> <name>Extension of Existing Implementations</name>
<t> <t>
The new LAYOUT_WCC operation is <bcp14>OPTIONAL</bcp14> for both The new LAYOUT_WCC operation is <bcp14>OPTIONAL</bcp14> for both
NFSv4.2 (<xref target='RFC7863' format='default' sectionFormat='of'/>) NFSv4.2 <xref target='RFC7863' format='default'/>
and the flexible file layout type (<xref target='RFC8435' format='default' and the flexible file layout type <xref target='RFC8435' format='default'/
sectionFormat='of'/>). >.
</t> </t>
</section> </section>
<section anchor='ss_op_LAYOUT_WCC_ff' numbered='true' removeInRFC='false' toc= <section anchor='ss_op_LAYOUT_WCC_ff' numbered='true' toc='default'>
'default'> <name>Flexible File Layout Type</name>
<name>Flex Files Layout Type</name> <sourcecode name='' type='xdr' markers='true'><![CDATA[
<sourcecode name='' type='' markers='true'><![CDATA[
/// struct ff_data_server_wcc4 { /// struct ff_data_server_wcc4 {
/// deviceid4 ffdsw_deviceid; /// deviceid4 ffdsw_deviceid;
/// stateid4 ffdsw_stateid; /// stateid4 ffdsw_stateid;
/// nfs_fh4 ffdsw_fh_vers<>; /// nfs_fh4 ffdsw_fh_vers<>;
/// fattr4 ffdsw_attributes; /// fattr4 ffdsw_attributes;
/// }; /// };
/// ///
/// struct ff_mirror_wcc4 { /// struct ff_mirror_wcc4 {
/// ff_data_server_wcc4 ffmw_data_servers<>; /// ff_data_server_wcc4 ffmw_data_servers<>;
/// }; /// };
/// ///
/// struct ff_layout_wcc4 { /// struct ff_layout_wcc4 {
/// ff_mirror_wcc4 fflw_mirrors<>; /// ff_mirror_wcc4 fflw_mirrors<>;
/// }; /// };
]]> ]]></sourcecode>
</sourcecode>
<t> <t>
The flex file layout type specific results <bcp14>MUST</bcp14> correspond The results specific to the flexible file layout type <bcp14>MUST</bcp14>
to the ff_layout4 data structure as defined in Section 5.1 of correspond to the ff_layout4 data structure as defined in <xref
<xref target='RFC8435' format='default' sectionFormat='of'/>. target='RFC8435' section="5.1" sectionFormat='of'/>. There
There <bcp14>MUST</bcp14> be a one-to-one correspondence between: <bcp14>MUST</bcp14> be a one-to-one correspondence between the following:
</t> </t>
<ul spacing='normal'> <ul spacing='normal'>
<li> <li>
ff_data_server4 -&gt; ff_data_server_wcc4 ff_data_server4 -&gt; ff_data_server_wcc4
</li> </li>
<li> <li>
ff_mirror4 -&gt; ff_mirror_wcc4 ff_mirror4 -&gt; ff_mirror_wcc4
</li> </li>
<li> <li>
ff_layout4 -&gt; ff_layout_wcc4 ff_layout4 -&gt; ff_layout_wcc4
</li> </li>
</ul> </ul>
<t> <t>
Each ff_layout4 has an array of ff_mirror4, which have an array of ff_data_server4. Each ff_layout4 has an array of ff_mirror4, which has an array of ff_data_server4.
Based on the current filehandle and the lowa_stateid, the server can match the Based on the current filehandle and the lowa_stateid, the server can match the
reported attributes. reported attributes.
</t> </t>
<t> <t>
But the positional correspondence between the elements is not But the positional correspondence between the elements is not
sufficient to determine the attributes to update. Consider the sufficient to determine the attributes to update. Consider the
case where a layout had three mirrors and two of them had updated case where a layout has three mirrors and two of them have updated
attributes, but the third did not. A client could decide to present attributes but the third does not. A client could decide to present
all three mirrors, with one mirror having an attribute mask with all three mirrors, with one mirror having an attribute mask with
no attributes present. Or it could decide to present only the no attributes present. Or it could decide to present only the
two mirrors which had been changed. two mirrors that had been changed.
</t> </t>
<t> <t>
In either case, the combination of ffdsw_deviceid, ffdsw_stateid, and In either case, the combination of ffdsw_deviceid, ffdsw_stateid, and
ffdsw_fh_vers will uniquely identify the attributes to be updated. ffdsw_fh_vers will uniquely identify the attributes to be updated.
All three arguments are required. A layout might have multiple data All three arguments are required. A layout might have multiple data
files on the same storage device, in which case the ffdsw_deviceid and files on the same storage device, in which case the ffdsw_deviceid and
ffdsw_stateid would match, but the ffdsw_fh_vers would not. ffdsw_stateid would match, but the ffdsw_fh_vers would not.
</t> </t>
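The matching rule described above can be sketched in Python as follows. This is only an illustration, not part of either version of the document: the `DataServerWcc` class, the `apply_layout_wcc` function, the dictionary-based attribute cache, and the policy of silently skipping unknown keys are all assumptions made for the example. Real deviceid4, stateid4, nfs_fh4, and fattr4 values are XDR-encoded wire structures, represented here as plain bytes and dictionaries.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Hypothetical in-memory stand-in for ff_data_server_wcc4.
@dataclass
class DataServerWcc:
    deviceid: bytes               # ffdsw_deviceid
    stateid: bytes                # ffdsw_stateid
    fh_vers: Tuple[bytes, ...]    # ffdsw_fh_vers<>
    attributes: Dict[str, int]    # ffdsw_attributes, decoded (assumed form)

# A data file is identified by all three values together.
Key = Tuple[bytes, bytes, Tuple[bytes, ...]]


def apply_layout_wcc(
    cache: Dict[Key, Dict[str, int]],
    mirrors: List[List[DataServerWcc]],
) -> List[Key]:
    """Update cached attributes by key lookup, never by list position."""
    updated: List[Key] = []
    for mirror in mirrors:
        for ds in mirror:
            key = (ds.deviceid, ds.stateid, ds.fh_vers)
            if key not in cache:
                # Unknown data file: skipped here (assumed policy).
                continue
            if not ds.attributes:
                # Mirror presented with an empty attribute mask.
                continue
            cache[key].update(ds.attributes)
            updated.append(key)
    return updated
```

Because the lookup uses the full triple, the result is the same whether a client presents every mirror (some with empty attribute masks) or only the mirrors that changed, and two data files on the same storage device remain distinguishable by their filehandles.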
<t> <t>
The ffdsw_attributes are processed similar to the obj_attributes in The ffdsw_attributes are processed similar to the obj_attributes in
the SETATTR arguments (See Section 18.34 of <xref target='RFC8881' format='default' sectionFormat='of'/>). the SETATTR arguments (see <xref target='RFC8881' section="18.30" sectionFormat='of'/>).
</t> </t>
</section> </section>
</section> </section>
<section anchor='xdr_desc' numbered='true' removeInRFC='false' toc='default'> <section anchor='xdr_desc' numbered='true' toc='default'>
<name>Extraction of XDR</name> <name>Extraction of XDR</name>
<t> <t>
This document contains the external data representation (XDR) This document contains the XDR
<xref target='RFC4506' format='default' sectionFormat='of'/> description of <xref target='RFC4506' format='default'/> description of the new NFSv4.2 ope
the new open ration LAYOUT_WCC.
flags for delegating the file to the client.
The XDR description is embedded in this The XDR description is embedded in this
document in a way that makes it simple for the reader to extract document in a way that makes it simple for the reader to extract
into a ready-to-compile form. The reader can feed this document into a ready-to-compile form. The reader can feed this document
into the following shell script to produce the machine-readable into the following shell script to produce the machine-readable
XDR description of the new flags: XDR description of the new NFSv4.2 operation LAYOUT_WCC.
</t> </t>
<sourcecode name='' type='' markers='true'><![CDATA[ <sourcecode name='' type='shell' markers='true'><![CDATA[
#!/bin/sh #!/bin/sh
grep '^ *///' $* | sed 's?^ */// ??' | sed 's?^ *///$??' grep '^ *///' $* | sed 's?^ */// ??' | sed 's?^ *///$??'
]]> ]]></sourcecode>
</sourcecode>
<t> <t>
That is, if the above script is stored in a file called 'extract.sh', and That is, if the above script is stored in a file called 'extract.sh', and
this document is in a file called 'spec.txt', then the reader can do: this document is in a file called 'spec.txt', then the reader can do:
</t> </t>
<sourcecode name='' type='' markers='true'><![CDATA[ <sourcecode name='' type='shell' markers='true'><![CDATA[
sh extract.sh < spec.txt > layout_wcc.x sh extract.sh < spec.txt > layout_wcc.x
]]> ]]></sourcecode>
</sourcecode>
<t> <t>
The effect of the script is to remove leading white space from each The effect of the script is to remove leading blank space from each
line, plus a sentinel sequence of '///'. XDR descriptions with the line, plus a sentinel sequence of '///'. XDR descriptions with the
sentinel sequence are embedded throughout the document. sentinel sequence are embedded throughout the document.
</t> </t>
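For readers without a POSIX shell, the same extraction can be approximated in Python. The sketch below is an assumption-labeled equivalent of the grep/sed pipeline, not something the document provides; the function name `extract_xdr` is hypothetical.

```python
import re


def extract_xdr(text: str) -> str:
    """Mimic: grep '^ *///' | sed 's?^ */// ??' | sed 's?^ *///$??'"""
    out = []
    for line in text.splitlines():
        # grep: keep only lines that start with optional spaces then '///'.
        if not re.match(r"^ *///", line):
            continue
        # First sed: drop leading spaces, the sentinel, and one space.
        line = re.sub(r"^ */// ", "", line, count=1)
        # Second sed: a line holding only the sentinel becomes empty.
        line = re.sub(r"^ *///$", "", line)
        out.append(line)
    return "\n".join(out) + ("\n" if out else "")
```

Applied to the text of the document, the function returns the XDR description with the sentinel and leading blank space removed, which can then be written to a file such as layout_wcc.x.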
<t> <t>
Note that the XDR code contained in this document depends on types Note that the XDR code contained in this document depends on types
from the NFSv4.2 nfs4_prot.x file (generated from <xref target='RFC7863' for from the NFSv4.2 nfs4_prot.x file (generated from <xref target='RFC7863' for
mat='default' sectionFormat='of'/>). mat='default'/>).
This includes both nfs types that end with a 4, such as offset4, This includes both nfs types that end with a 4 (such as offset4 and
length4, etc., as well as more generic types such as uint32_t and length4) as well as more generic types (such as uint32_t and
uint64_t. uint64_t).
</t> </t>
<t> <t>
While the XDR can be appended to that from <xref target='RFC7863' format='default' sectionFormat='of'/>, While the XDR can be appended to that from <xref target='RFC7863' format='default' sectionFormat='of'/>,
the various code snippets belong in their respective areas of the various code snippets belong in their respective areas of
that XDR. that XDR.
</t> </t>
<section anchor='code_copyright' numbered='true' removeInRFC='false' toc='defa
ult'>
<name>Code Components Licensing Notice</name>
<t>
Both the XDR description and the scripts used for extracting the
XDR description are Code Components as described in Section 4 of
<xref target='LEGAL' format='default' sectionFormat='of'>'Legal
Provisions Relating to IETF Documents'</xref>. These Code
Components are licensed according to the terms of that document.
</t>
</section>
</section> </section>
<section anchor='sec_security' numbered='true' removeInRFC='false' toc='default'> <section anchor='sec_security' numbered='true' toc='default'>
<name>Security Considerations</name> <name>Security Considerations</name>
<t> <t>
There are no new security considerations beyond those in There are no new security considerations beyond those in
<xref target='RFC8435' format='default' sectionFormat='of'/>. <xref target='RFC8435' format='default'/>.
</t> </t>
</section> </section>
<section anchor='sec_iana' numbered='true' removeInRFC='true' toc='default'> <section anchor='sec_iana' numbered='true' toc='default'>
<name>IANA Considerations</name> <name>IANA Considerations</name>
<t> <t>This document has no IANA actions.
There are no IANA considerations for this document.
</t> </t>
</section> </section>
</middle> </middle>
<back> <back>
<references> <references>
<name>References</name> <name>References</name>
<references> <references>
<name>Normative References</name> <name>Normative References</name>
<xi:include xmlns:xi='http://www.w3.org/2001/XInclude' <xi:include href='https://bib.ietf.org/public/rfc/bibxml/reference.RFC.2119.
href='https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.2119 xml'/>
.xml'/> <xi:include href='https://bib.ietf.org/public/rfc/bibxml/reference.RFC.4506.
<xi:include xmlns:xi='http://www.w3.org/2001/XInclude' xml'/>
href='https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.4506 <xi:include href='https://bib.ietf.org/public/rfc/bibxml/reference.RFC.7862.
.xml'/> xml'/>
<xi:include xmlns:xi='http://www.w3.org/2001/XInclude' <xi:include href='https://bib.ietf.org/public/rfc/bibxml/reference.RFC.7863.
href='https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.7862 xml'/>
.xml'/> <xi:include href='https://bib.ietf.org/public/rfc/bibxml/reference.RFC.8174.
<xi:include xmlns:xi='http://www.w3.org/2001/XInclude' xml'/>
href='https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.7863 <xi:include href='https://bib.ietf.org/public/rfc/bibxml/reference.RFC.8178.
.xml'/> xml'/>
<xi:include xmlns:xi='http://www.w3.org/2001/XInclude' <xi:include href='https://bib.ietf.org/public/rfc/bibxml/reference.RFC.8434.
href='https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.8174 xml'/>
.xml'/> <xi:include href='https://bib.ietf.org/public/rfc/bibxml/reference.RFC.8435.
<xi:include xmlns:xi='http://www.w3.org/2001/XInclude' xml'/>
href='https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.8178 <xi:include href='https://bib.ietf.org/public/rfc/bibxml/reference.RFC.8881.
.xml'/> xml'/>
<xi:include xmlns:xi='http://www.w3.org/2001/XInclude'
href='https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.8434 <!-- draft-ietf-nfsv4-delstid-08 published as RFC 9754 -->
.xml'/>
<xi:include xmlns:xi='http://www.w3.org/2001/XInclude' <xi:include href='https://bib.ietf.org/public/rfc/bibxml/reference.RFC.9754.
href='https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.8435 xml'/>
.xml'/>
<xi:include xmlns:xi='http://www.w3.org/2001/XInclude'
href='https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.8881
.xml'/>
<xi:include xmlns:xi='http://www.w3.org/2001/XInclude'
href='https://bib.ietf.org/public/rfc/bibxml3/reference.I-D.ietf-nfsv4-de
lstid.xml'/>
</references> </references>
<references> <references>
<name>Informative References</name> <name>Informative References</name>
<xi:include xmlns:xi='http://www.w3.org/2001/XInclude' <xi:include href='https://bib.ietf.org/public/rfc/bibxml/reference.RFC.1813.
href='https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.1813 xml'/>
.xml'/>
<reference anchor='LEGAL' target='http://trustee.ietf.org/docs/IETF-Trust-Li
cense-Policy.pdf'>
<front>
<title abbrev='Legal Provisions'>Legal Provisions Relating to IETF Docum
ents</title>
<author>
<organization>IETF Trust</organization>
</author>
<date month='November' year='2008'/>
</front>
</reference>
</references> </references>
</references> </references>
<section numbered='true' removeInRFC='false' toc='default'> <section numbered='false' toc='default'>
<name>Acknowledgments</name> <name>Acknowledgments</name>
<t> <t><contact fullname="Dave Noveck"/>, <contact fullname="Tigran
Dave Noveck, Tigran Mkrtchyan, and Rick Macklem provided reviews of the Mkrtchyan"/>, and <contact fullname="Rick Macklem"/> provided reviews of
document. the document.</t>
</t>
</section> </section>
</back> </back>
</rfc> </rfc>
 End of changes. 80 change blocks. 
306 lines changed or deleted 253 lines changed or added
