Reimpl Buffer Protocol and memoryview support ndarray with shape, stride and suboffset#3340
Conversation
| #[derive(Debug)] | ||
| #[pyclass(noattr, module = false, name = "_buffered_raw_buffer")] | ||
| #[derive(Debug, PyValue)] | ||
| struct BufferedRawBuffer { | ||
| data: PyMutex<Vec<u8>>, | ||
| range: Range<usize>, | ||
| } | ||
| impl BufferInternal for BufferedRawBuffer { | ||
| fn obj_bytes(&self) -> BorrowedValue<[u8]> { | ||
| BorrowedValue::map(self.data.lock().into(), |data| &data[self.range.clone()]) | ||
| } | ||
| #[pyimpl(with(AsBuffer))] | ||
| impl BufferedRawBuffer {} | ||
|
|
||
| fn obj_bytes_mut(&self) -> BorrowedValueMut<[u8]> { | ||
| BorrowedValueMut::map(self.data.lock().into(), |data| { | ||
| &mut data[self.range.clone()] | ||
| }) | ||
| } | ||
| static BUFFERED_RAW_BUFFER_METHODS: BufferMethods = BufferMethods { |
There was a problem hiding this comment.
I am trying to implement BufferedRawBuffer as an internal class so we can downcast from pyobjectref when the buffer methods is called. but I am getting panic for 'static type has not been initialized', anything missing here?
@youknowone
There was a problem hiding this comment.
I am sorry, I didn't catch I was mentioned here. You added pyclass to BufferredRawBuffer but with noattr. I think a class is initialized only when it is a pyattr. In this case, you can manually initailize it in make_modue. Look for noattr in vm/src/stdlib/sys.rs. There are a few examples about it.
Probably we'd better to fix pymodule macro to initialize them regardless of noattr
| // GUARANTEE: called only if the buffer option is contiguous | ||
| pub contiguous: Option<fn(&PyObjectRef) -> BorrowedValue<[u8]>>, | ||
| // GUARANTEE: called only if the buffer option is contiguous | ||
| pub contiguous_mut: Option<fn(&PyObjectRef) -> BorrowedValueMut<[u8]>>, | ||
| // collect bytes to buf when buffer option is not contiguous | ||
| pub collect_bytes: Option<fn(&PyObjectRef, buf: &mut Vec<u8>)>, |
There was a problem hiding this comment.
These three function should return the bytes that available in the current class. compare to obj_bytes and obj_bytes_mut that should return the whole bytes for the top class.
88b32ec to
0041a81
Compare
| none_type: singletons::PyNone::init_bare_type().clone(), | ||
| not_implemented_type: singletons::PyNotImplemented::init_bare_type().clone(), | ||
| generic_alias_type: genericalias::PyGenericAlias::init_bare_type().clone(), | ||
| vec_buffer_type: VecBuffer::init_bare_type().clone(), |
There was a problem hiding this comment.
I had put the type here, it works, but I don't know does it has better solution?
There was a problem hiding this comment.
If you don't need to use it through vm.ctx.types.vec_buffer_type, you don't need to add a member here. static types will be saved for each of their static_class.
(In other word, if you need to use it like that, this is the right place.)
|
How CPython handle it? Are those python classes there too? |
0041a81 to
aec309c
Compare
CPython is using void* with buffer options for the buffer data, we cannot do like that because we may have the lock before the data, so we have to visit data via function that base on each class. Secondly, CPython manage the exports by manually call PyBuffer_Release that will call the bf_releasebuffer slots, we are manage it more safer way by wrap it with PyBuffer. As we are using static functions for what buffer need, our appoarch shouldn't have a lot of overhead. For contiguous, contiguous_mut and collect_bytes, I think there is a way to rid out it by move some logic from memoryview to PyBuffer, I will work on it push to here. For VecBuffer, CPython don't have it because it use void pointer, we need type to retrive the data from the python object so I use VecBuffer to wrap a Vec to support as a buffer. |
|
For the polymorphism using |
The original design for void* case is to use dyn T(dyn BufferInternal).
so I decide to use static functions and downcast the obj to visit the class. the only issue for it is when we need buffer a rust vec, we need a wrap object that implement PyValue so we can able to downcast it. but I think that is only const cost when initialize and zero cost on the runtime. |
aec309c to
111cd3b
Compare
| pub none_type: PyTypeRef, | ||
| pub not_implemented_type: PyTypeRef, | ||
| pub generic_alias_type: PyTypeRef, | ||
| pub vec_buffer_type: PyTypeRef, |
There was a problem hiding this comment.
@youknowone can you help me to move it away?
| } | ||
| } | ||
|
|
||
| pub struct SaturatedSliceIterator { |
There was a problem hiding this comment.
Rust iterators are typically named Iter (mostly because Iterator is a trait name)
| pub struct SaturatedSliceIterator { | |
| pub struct SaturatedSliceIter { |
| pub index: isize, | ||
| pub step: isize, | ||
| pub len: usize, |
There was a problem hiding this comment.
This is fields of iterator, but is it pub?
There was a problem hiding this comment.
no, it should not be pub.
| self.internal.obj_bytes().to_vec() | ||
| /// # Safety | ||
| /// assume the buffer is contiguous | ||
| pub unsafe fn contiguous(&self) -> BorrowedValue<[u8]> { |
There was a problem hiding this comment.
if we are going to name this under rust convention,
| pub unsafe fn contiguous(&self) -> BorrowedValue<[u8]> { | |
| pub unsafe fn contiguous_unchecked(&self) -> BorrowedValue<[u8]> { |
| self.obj_bytes_mut() | ||
| } | ||
|
|
||
| pub fn collect(&self, buf: &mut Vec<u8>) { |
There was a problem hiding this comment.
collect typically returns the collection itself, but this is extending existing vector.
| pub fn collect(&self, buf: &mut Vec<u8>) { | |
| pub fn append_to(&self, buf: &mut Vec<u8>) { |
|
|
||
| /// this function do not check the bound | ||
| /// panic if indices.len() != ndim | ||
| pub fn get_position_fast(&self, indices: &[usize]) -> isize { |
There was a problem hiding this comment.
if the name get_ is not coming from CPython, I want to suggest to remove it because this function is not perform getter or getitem. Just fast_position like fast_getitem?
6c6007e to
ffeb638
Compare
| let format_spec = Self::parse_format(&buffer.desc.format, vm)?; | ||
| let desc = buffer.desc.clone(); | ||
|
|
||
| let mut zelf = PyMemoryView { | ||
| buffer: ManuallyDrop::new(buffer), | ||
| released: AtomicCell::new(false), | ||
| start: range.start * itemsize, | ||
| stop: range.end * itemsize, | ||
| step: 1, | ||
| start: 0, | ||
| format_spec, | ||
| desc, | ||
| hash: OnceCell::new(), | ||
| } | ||
| .validate()) | ||
| } | ||
|
|
||
| fn as_contiguous(&self) -> Option<BorrowedValue<[u8]>> { | ||
| if !self.buffer.options.contiguous { | ||
| return None; | ||
| } | ||
| Some(self.obj_bytes()) | ||
| } | ||
| }; |
There was a problem hiding this comment.
these lines looks same as from_buffer. then will let mut zelf = Self::from_buffer(buffer, vm) works?
|
|
||
| fn retain(&self) {} | ||
| } | ||
| fn collect(&self, buf: &mut Vec<u8>) { |
There was a problem hiding this comment.
here is another collect not returning buffer
| @@ -123,17 +147,293 @@ impl TryFromBorrowedObject for PyBuffer { | |||
| // but it is not supported by Rust | |||
|
|
||
| #[pyimpl(flags(BASETYPE), with(Constructor))] | ||
| impl VecBuffer { | ||
| pub fn new(data: Vec<u8>) -> Self { |
There was a problem hiding this comment.
From is typically implemented for conversion
|
We should have the _test_buffer for multi-dimention buffer test. but I think this pr need to be merge now to fix the existing issue. |
There was a misunderstanding about the code for buffer protocol I wrote at beginning, so the related pr #3029 and #3043 finally broke the memoryview(the test did not show). When I review there pr I also just broken the mind, really sorry about that.
But in the pr I reimplement the protocol with more efficient way(thanks for #3121), this should works better.