[LTP] [PATCH 1/1] nfsstat01: Update client RPC calls for kernel 6.9
Thorsten Leemhuis
regressions@leemhuis.info
Fri Jul 12 15:45:45 CEST 2024
On 08.07.24 19:49, Chuck Lever III wrote:
>> On Jul 8, 2024, at 6:36 AM, Greg KH <greg@kroah.com> wrote:
>> On Sat, Jul 06, 2024 at 07:46:19AM +0000, Sherry Yang wrote:
>>>> On Jul 6, 2024, at 12:11 AM, Greg KH <greg@kroah.com> wrote:
>>>> On Fri, Jul 05, 2024 at 02:19:18PM +0000, Chuck Lever III wrote:
>>>>>> On Jul 2, 2024, at 6:55 PM, Calum Mackay <calum.mackay@oracle.com> wrote:
>>>>>> On 02/07/2024 5:54 pm, Calum Mackay wrote:
>>>>>>> I noticed your LTP patch [1][2] which adjusts the nfsstat01 test on v6.9 kernels, to account for Josef's changes [3], which restrict the NFS/RPC stats per-namespace.
>>>>>>> I see that Josef's changes were backported, as far back as longterm v5.4,
> [...]
>>>>>> I'm wondering if this difference between NFS client, and NFS server, stat behaviour, across kernel versions, may perhaps cause some user confusion?
>>>>>
>>>>> As a refresher for the stable folken, Josef's changes make
>>>>> nfsstats silo'd, so they no longer show counts from the whole
>>>>> system, but only for NFS operations relating to the local net
>>>>> namespace. That is a surprising change for some users, tools,
>>>>> and testing.
>>>>>
>>>>> I'm not clear on whether there are any rules/guidelines around
>>>>> LTS backports causing behavior changes that user tools, like
>>>>> nfsstat, might be impacted by.
>>>>
>>>> The same rules that apply for Linus's tree (i.e. no userspace
>>>> regressions.)
>>> [...]
>>> If no userspace regression, should we revert the Josef’s NFS client-side changes on LTS?
>>
>> This sounds like a regression in Linus's tree too, so why isn't it
>> reverted there first?
>
> There is a change in behavior in the upstream code, but Josef's
> patches fix an information leak and make the statistics more
> sensible in container environments. I'm not certain that
> should be considered a regression, but confess I don't know
> the regression rules to this fine a degree of detail.
Chuck pointed me to this thread (I had an eye on it already anyway) and
asked for advice. Take everything I write here with a grain of salt, as
this is somewhat tricky situation which makes it hard to predict how
Linus would actually want to see this handled. Maybe I should have CCed
him, but I doubt he cares right now; but we maybe should bring him in,
if an actual user complains.
With that out of the way, let me write a few thoughts:
* That some test breaks is not a regression, as regressions are about
"practical issues", not some ABI/API changes that only some tests care
about. So if it's just a test that broke update it.
* If a user would reported something like "this change broke my app" it
obviously would be something totally different. But that did not happen
yet afaics -- or did it? But from the discussion it sounds like that is
something that will likely happen down the road. If that's the case I'd
say it's best to prevent that from happening.
* Not sure how Linus would react if a user would complain that some
workflow broke because rpc_stat are now per net namespace and shows
different numbers (e.g. using a format that does not break any apps). It
would likely depend on the actual case and how bad he would consider the
information leak.
> If it is indeed a regression, how can we go about retaining
> both behaviors (selectable by Kconfig or perhaps administrative
> UI)?
That likely might be the best idea if user report an actual regression
due to this. But switching the format of any existing file creates quite
some trouble, as others already mentioned in this thread. So maybe
providing the newer format in a different file and allowing to disable
the older one though a Kconfig setting might be the best way forward.
Sure, it would take years until people would have switched over, but
that's how it is with our "no regressions" rule.
Does that help?
Ciao, Thorsten
More information about the ltp
mailing list