On Tue, Sep 20, 2022 at 02:08:02PM +0200, Christoph Hellwig wrote:
>> -static int bio_map_user_iov(struct request *rq, struct iov_iter *iter,
>> +static struct bio *bio_map_get(struct request *rq, unsigned int nr_vecs,
>>  		gfp_t gfp_mask)
>
> bio_map_get is a very confusing name.

I chose that name because its functionality is the opposite of what the
existing bio_map_put helper does; that way the pair is symmetric.

> And I also still think this is the wrong way to go.  If plain slab
> allocations don't use proper per-cpu caches we have a MM problem and
> need to talk to the slab maintainers and not use the overkill bio_set
> here.

This series is not about using (or not using) a bio_set. The attempt
here has been to use the pre-mapped buffers (and bvec) that we get from
io_uring.

>> +/* Prepare bio for passthrough IO given an existing bvec iter */
>> +int blk_rq_map_user_bvec(struct request *rq, struct iov_iter *iter)
>
> I'm a little confused about the interface we're trying to present from
> the block layer to the driver here.
>
> blk_rq_map_user_iov really should be able to detect that it is called
> on a bvec iter and just do the right thing rather than needing different
> helpers.

I explored that possibility too, but found that it does not. It maps
the user pages into the bio either directly or via a copy (in certain
odd conditions), but it does not know how to deal with an existing
bvec. The reason, I guess, is that no one felt the need to try
passthrough for bvecs before; it makes sense only in the context of
io_uring passthrough. It also felt cleaner to write a new function
rather than overload blk_rq_map_user_iov with multiple if/else
branches. I tried that again after your comment, but it did not seem
to produce any good-looking code. The other factor is that this way
seemed safer: I am more confident that I will not break any existing
user of blk_rq_map_user_iov.

>> +	/*
>> +	 * If the queue doesn't support SG gaps and adding this
>> +	 * offset would create a gap, disallow it.
>> +	 */
>> +	if (bvprvp && bvec_gap_to_prev(lim, bvprvp, bv->bv_offset))
>> +		goto out_err;
>
> So now you limit the input that is accepted?  That's not really how
> iov_iters are used.  We can either try to reshuffle the bvecs, or
> just fall back to the copy data version as blk_rq_map_user_iov does
> for 'weird' iters.

Since I was writing a new helper for passthrough only, I thought it
would not be too bad to just bail out (rather than handle it via the
copy path) if we hit this queue_virt_boundary situation. To handle it
the 'copy data' way we would need this:

	else if (queue_virt_boundary(q))
		copy = queue_virt_boundary(q) & iov_iter_gap_alignment(iter);

But iov_iter_gap_alignment does not work on bvec iters; see line #1274
below:

1264 unsigned long iov_iter_gap_alignment(const struct iov_iter *i)
1265 {
1266 	unsigned long res = 0;
1267 	unsigned long v = 0;
1268 	size_t size = i->count;
1269 	unsigned k;
1270
1271 	if (iter_is_ubuf(i))
1272 		return 0;
1273
1274 	if (WARN_ON(!iter_is_iovec(i)))
1275 		return ~0U;

Do you see a way to overcome this? Or maybe this can be revisited,
given that we are not missing a lot?
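If we did want to handle it, I imagine the missing piece would be a
bvec-aware gap-alignment helper. A rough, untested sketch of what it
would have to compute (bvec_iter_gap_alignment is a made-up name; it
mirrors the iovec logic above by OR-ing every boundary between adjacent
segments so the caller can test the result against
queue_virt_boundary(q), and for simplicity it ignores any iov_offset
already consumed from the iter):

	static unsigned long bvec_iter_gap_alignment(const struct iov_iter *i)
	{
		const struct bio_vec *bvec = i->bvec;
		unsigned long res = 0, prv_end = 0;
		unsigned int k;

		if (WARN_ON(!iov_iter_is_bvec(i)))
			return ~0UL;

		for (k = 0; k < i->nr_segs; k++) {
			/* joins between segments must sit on the boundary */
			if (k)
				res |= bvec[k].bv_offset | prv_end;
			prv_end = bvec[k].bv_offset + bvec[k].bv_len;
		}
		return res;
	}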
>> +
>> +		/* check full condition */
>> +		if (nsegs >= nr_segs || bytes > UINT_MAX - bv->bv_len)
>> +			goto out_err;
>> +
>> +		if (bytes + bv->bv_len <= nr_iter &&
>> +		    bv->bv_offset + bv->bv_len <= PAGE_SIZE) {
>> +			nsegs++;
>> +			bytes += bv->bv_len;
>> +		} else
>> +			goto out_err;
>
> Nit: This would read much better as:
>
> 	if (bytes + bv->bv_len > nr_iter)
> 		goto out_err;
> 	if (bv->bv_offset + bv->bv_len > PAGE_SIZE)
> 		goto out_err;
>
> 	bytes += bv->bv_len;
> 	nsegs++;

Indeed, cleaner. Thanks.
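For the archive, with your suggestion folded in the loop body would
look something like this (a sketch only, keeping the existing
full/overflow check in front):

		/* overflow / segment-count check */
		if (nsegs >= nr_segs || bytes > UINT_MAX - bv->bv_len)
			goto out_err;
		/* stay within the iter and within a single page */
		if (bytes + bv->bv_len > nr_iter)
			goto out_err;
		if (bv->bv_offset + bv->bv_len > PAGE_SIZE)
			goto out_err;

		bytes += bv->bv_len;
		nsegs++;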