[LTP] [PATCH] vfs: fix readahead(2) on block devices

Amir Goldstein amir73il@gmail.com
Sun Sep 24 13:47:42 CEST 2023


On Sun, Sep 24, 2023 at 9:46 AM Amir Goldstein <amir73il@gmail.com> wrote:
>
> On Sun, Sep 24, 2023 at 6:48 AM Reuben Hawkins <reubenhwk@gmail.com> wrote:
> >
> >
> >
> > On Sat, Sep 23, 2023 at 10:48 AM Amir Goldstein <amir73il@gmail.com> wrote:
> >>
> >> On Sat, Sep 23, 2023 at 5:41 PM Matthew Wilcox <willy@infradead.org> wrote:
> >> >
> >> > On Sat, Sep 23, 2023 at 08:56:28AM +0300, Amir Goldstein wrote:
> >> > > We decided to deliberately try the change of behavior
> >> > > from EINVAL to ESPIPE, to align with fadvise behavior,
> >> > > so eventually the LTP test should be changed to allow both.
> >> > >
> >> > > It was the test failure on the socket that alarmed me.
> >> > > However, if we will have to special case socket in
> >> > > readahead() after all, we may as well also special case
> >> > > pipe with it and retain the EINVAL behavior - let's see
> >> > > what your findings are and decide.
> >> >
> >> > If I read it correctly, LTP is reporting that readhaead() on a socket
> >> > returned success instead of an error.  Sockets do have a_ops, right?
> >> > It's set to empty_aops in inode_init_always, I think.
> >> >
> >>
> >> Yeh, you are right.
> >> I guess the check !f.file->f_mapping->a_ops is completely futile
> >> in that code. It's the only place I could find this sort of check
> >> except for places like:
> >> if (f->f_mapping->a_ops && f->f_mapping->a_ops->direct_IO)
> >> which just looks like a coding habit.
> >>
> >> > It would be nice if we documented somewhere which pointers should be
> >> > checked for NULL for which cases ... it doesn't really make sense for
> >> > a socket inode to have an i_mapping since it doesn't have pagecache.
> >> > But maybe we rely on i_mapping always being set.
> >> >
> >>
> >> I can't imagine that a socket has f_mapping.
> >> There must have been something off with this specific bug report,
> >> because it was reported on a WIP patch.
> >>
> >> > Irritatingly, POSIX specifies ESPIPE for pipes, but does not specify
> >> > what to do with sockets.  It's kind of a meaningless syscall for
> >> > any kind of non-seekable fd.  lseek() returns ESPIPE for sockets
> >> > as well as pipes, so I'd see this as an oversight.
> >> > https://pubs.opengroup.org/onlinepubs/9699919799/functions/posix_fadvise.html
> >> > https://pubs.opengroup.org/onlinepubs/9699919799/functions/lseek.html
> >> >
> >>
> >> Indeed, we thought it wouldn't be too bad to align the
> >> readahead errors with those of posix_fadvise.
> >> That's why we asked to remove the S_ISFIFO check for v2.
> >> But looking again, pipe will get EINVAL for !f_mapping, so the
> >> UAPI wasn't changed at all and we were just talking BS all along.
> >> Let's leave the UAPI as is.
> >>
> >> > Of course readahead() is a Linux-specific syscall, so we can do whatever
> >> > we want here, but I'm really tempted to just allow readahead() for
> >> > regular files and block devices.  Hmm.  Can we check FMODE_LSEEK
> >> > instead of (S_ISFILE || S_ISBLK)?
> >>
> >> I think the f_mapping check should be good.
> >> Reuben already said he could not reproduce the LTP failure with
> >> v2 that is on Christian's vfs.misc branch.
> >>
> >> The S_ISREG check I put in the Fixes commit was completely
> >> unexplained in the commit message and completely unneeded.
> >> Just removing it as was done in v2 is enough.
> >>
> >> However, v2 has this wrong comment in the commit message:
> >> "The change also means that readahead will return -ESPIPE
> >>   on FIFO files instead of -EINVAL."
> >>
> >> We need to remove this comment
> >> and could also remove the unneeded !f.file->f_mapping->a_ops
> >> check while at it.
> >>
> >> Thanks,
> >> Amir.
> >
> >
> > It looks to me like the following will fix the problem without breaking the tests...
> >
> > - if (!f.file->f_mapping || !f.file->f_mapping->a_ops ||
> > -    !S_ISREG(file_inode(f.file)->i_mode))
> > + if (!(f.file->f_mode & FMODE_LSEEK))
> >
> > ...I'll put this in a v3 patch soon unless somebody can spot any reasons why
> > this is no good.
>
> I am confused.
> I thought you wrote that v2 did not break the test.
> Why then is this change to use FMODE_LSEEK needed?
> Why is it not enough to leave just
>    if (!f.file->f_mapping)
>

Christian,

I cleared the confusion with Reuben off-list.

V2 on current vfs.misc is good as is, in the sense that it does
what we expected it to do - it breaks the LTP test for pipe because
the error value has changed from EINVAL to ESPIPE.
The error value for socket (EINVAL) has not changed.

Matthew,

Since you joined the discussion, you have the opportunity to agree or
disagree with our decision to change readahead() to ESPIPE.
Judging  by your citing of lseek and posix_fadvise standard,
I assume that you will be on board?

I think that the FMODE_LSEEK test would have been a good idea if
we wanted to preserve the EINVAL error code for pipe, but
I don't think that is the case?

Thanks,
Amir.


More information about the ltp mailing list