Implement Buffer Protocol#2226
Conversation
|
Slots is a data field of type (PyClass). For now, we typically iterate mro to find proper field of slots. |
| use crate::{bytesinner::try_as_bytes, pyobject::IntoPyObject}; | ||
| use crossbeam_utils::atomic::AtomicCell; | ||
|
|
||
| pub trait BufferProtocol: Debug + Sync + Send { |
There was a problem hiding this comment.
Instead of Send + Sync here, you probably want to use PyThreadingConstraint from pyobject
|
#2195 I have done it, I will push out when I passed the memoryview unittest. |
7a3ae26 to
df78720
Compare
|
@qingshi163 I attached a commit to change Box to plain fn c0ecc35 . Because this is using read-only slot unlike other ones, It needed different storing method. I couldn't catch it on the previous comments. |
| } | ||
|
|
||
| #[pyimpl] | ||
| pub trait Bufferable: PyValue { |
There was a problem hiding this comment.
I think the original name AsBuffer you used was better. "hashable object" is a term in python(or cpython?) but bufferable is not.
Though "comparable object" is also nothing in python. @coolreader18 any idea about naming?
There was a problem hiding this comment.
I think AsBuffer is good, "buffer" isn't a verb so it doesn't really make sense as Verbable and there's precedence for AsNoun as a trait name in Rust, e.g. AsRawFd, AsRef
There was a problem hiding this comment.
AsBuffer is good, but the trait function name I changed it to get_buffer as same as CPython, also because it did return a new BufferProtocol. so we should go back to AsBuffer or something like GetBuffer? what we perfer?
There was a problem hiding this comment.
I personally think we can keep it as AsBuffer
There was a problem hiding this comment.
That sounds like the implementation trait name can be Buffer and the desriptor trait name can be BufferProtocol. Like,
- AsBuffer(Bufferable) -> BufferProtocol
- BufferProtocol -> Buffer
Also I personally prefer to reuse CPython terms if there is no special reason not to do. Because we are practically read and port CPython code everyday. So always +1 for following CPython naming.
There was a problem hiding this comment.
That makes sense, since a trait is inherently a "protocol", so there's no need to specify that stuff that a buffer can do is a protocol; it's redundant
There was a problem hiding this comment.
That sounds good!
c0ecc35 to
e769804
Compare
|
|
||
| fn as_contiguous(&self) -> Option<BorrowedValue<[u8]>> { | ||
| let options = self.get_options(); | ||
| if !options.contiguous { | ||
| return None; | ||
| } | ||
| Some(self.obj_bytes()) | ||
| } | ||
|
|
||
| fn as_contiguous_mut(&self) -> Option<BorrowedValueMut<[u8]>> { | ||
| let options = self.get_options(); | ||
| if !options.contiguous { | ||
| return None; | ||
| } | ||
| Some(self.obj_bytes_mut()) | ||
| } | ||
|
|
There was a problem hiding this comment.
the consumer should call as_contiguous if they want just a plain buffer. Should we also need to add a function here for consuming a non-contiguous buffer like a sliced memoryview? Like a iter() return the unpacked object?
| pub trait BufferProtocol: Debug + PyThreadingConstraint { | ||
| // TODO: return reference to avoid copy | ||
| fn get_options(&self) -> BufferOptions; | ||
| fn obj_bytes(&self) -> BorrowedValue<[u8]>; | ||
| fn obj_bytes_mut(&self) -> BorrowedValueMut<[u8]>; | ||
| fn release(&self); | ||
| fn is_resizable(&self) -> bool; |
There was a problem hiding this comment.
I renamed as_bytes to obj_bytes because it should always return the full memory range for the original object. as_bytes may easier to be miss using.
| impl FormatCode { | ||
| fn unit_size(&self) -> usize { | ||
| // XXX: size of l L q Q is platform depended? | ||
| match self.code { | ||
| 'x' | 'c' | 'b' | 'B' | '?' | 's' | 'p' => 1, | ||
| 'h' | 'H' => 2, | ||
| 'i' | 'l' | 'I' | 'L' | 'f' => 4, | ||
| 'q' | 'Q' | 'd' => 8, | ||
| 'i' | 'I' | 'f' => 4, | ||
| 'l' | 'L' | 'q' | 'Q' | 'd' => 8, | ||
| 'n' | 'N' | 'P' => std::mem::size_of::<usize>(), | ||
| c => { |
There was a problem hiding this comment.
I have to change the unit size for 'l' and 'L' to 8, because that is how array.array doing. I don't know is any compatible problem, but in my machine CPython 'l' and 'L' is 8 bytes.
There was a problem hiding this comment.
isn't it architecture dependent value?
There was a problem hiding this comment.
/* Integers */
case 'h':
intsize = sizeof(short);
is_signed = 1;
break;
case 'H':
intsize = sizeof(short);
is_signed = 0;
break;
case 'i':
intsize = sizeof(int);
is_signed = 1;
break;
case 'I':
intsize = sizeof(int);
is_signed = 0;
break;
case 'l':
intsize = sizeof(long);
is_signed = 1;
break;
case 'L':
intsize = sizeof(long);
is_signed = 0;
break;
case 'q':
intsize = sizeof(long long);
is_signed = 1;
break;
case 'Q':
intsize = sizeof(long long);
is_signed = 0;
break;
This is from CPython, should we put it somewhere together? because we have to make sure array.array and struct have the same size for the type.
There was a problem hiding this comment.
For common platforms, using std::mem::size_of::<isize> or cfg target_pointer_width work. It fits for common platforms but not perfect. using libc always will fit as it defined in CPython. Try std::mem::size_of::<libc::c_int>
There was a problem hiding this comment.
I think we should do it in another PR and try unify the reflect from format code to size and the type.
e769804 to
258ffcb
Compare
| drop(s.1); | ||
| read_rwlock(s.0) |
There was a problem hiding this comment.
| drop(s.1); | |
| read_rwlock(s.0) | |
| s.1 |
| Self::Ref(r) => BorrowedValue::Ref(f(r)), | ||
| Self::MuLock(m) => BorrowedValue::MappedMuLock(PyMutexGuard::map(m, |x| unsafe { | ||
| #[allow(mutable_transmutes, clippy::transmute_ptr_to_ptr)] | ||
| std::mem::transmute(f(x)) |
There was a problem hiding this comment.
This isn't sound -- somebody could match to get the mutex guard out of the enum, and then treat it as mutable. I ran into the same problem, and I'm not sure exactly what the solution is. Maybe try something in cell? PyImmutableMappedMutexGuard or something?
There was a problem hiding this comment.
I could try making that later today if you don't feel confident about it; I think it would be like { data: *const T, raw: &'a parking_lot::RawMutex, _marker: PhantomData<(&'a T, RawMutex::GuardMarker)> }
There was a problem hiding this comment.
If you can fix that will be perfect, I have no clue to get it soundless..
There was a problem hiding this comment.
@coolreader18 will you merge #2239 before or after the pr?
3403be7 to
9e5032f
Compare
|
Oooh, here's an idea: |
|
But the obj and the buffer in the memoryview could be different, the obj should always point to the original data source and the buffer is the controller shows how can we visit the data. That is how I implement it as the memoryview build from a memoryview the obj is cloned from original object. |
|
I want to know if it is ok to merge, it is not fully completed but works. |
|
Oh, I suppose that's true. Nevermind |
| # Not implemented: multidimensional slices | ||
| slices = (slice(0,1,1), slice(0,1,2)) | ||
| self.assertRaises(NotImplementedError, setitem, slices, b"a") | ||
| # TODO: RUSTPYTHON |
There was a problem hiding this comment.
I know this PR enabled a lot of parts from this test, but unless it is fully resolved, I prefer to keep the whole test as expectedFailure before merge. this is only trackable by code, but expectedFailure is trackable by test result. Once the feature is fully done, it will be detected if it is expectedFailiure but not if it is commented out.
| assert a.obj == obj | ||
|
|
||
| assert a[2:3] == b"c" | ||
| # assert a[2:3] == b"c" |
| elements: k.try_bytes(|v| v.to_vec()).unwrap() | ||
| }), | ||
| l @ PyList => l.to_byte_inner(vm), | ||
| // TODO: PyTyple |
6c72e69 to
9c3d97b
Compare
| op: PyComparisonOp, | ||
| vm: &VirtualMachine, | ||
| ) -> PyComparisonValue { | ||
| // TODO: bytes can compare with any object implemented buffer protocol | ||
| // but not memoryview, and not equal if compare with unicode str(PyStr) | ||
| PyComparisonValue::from_option( | ||
| try_bytes_like(vm, other, |other| { | ||
| op.eval_ord(self.elements.as_slice().cmp(other)) |
There was a problem hiding this comment.
Now we have different behavour between (memoryview op bytes) and (bytes op memoryview), Is any solution better than check the type here?
@coolreader18
There was a problem hiding this comment.
Does memory view not implement the buffer protocol? I think that would maybe fix the issue(?)
There was a problem hiding this comment.
But memoryview is implemented buffer protocol, we will have much more issue if not so.
There was a problem hiding this comment.
Huh, that's strange, I think that's probably fine for now.
| // TODO: generic way from &[PyObjectRef] | ||
| l @ PyList => l.to_byte_inner(vm), | ||
| t @ PyTuple => t.to_bytes_inner(vm), | ||
| obj => { | ||
| let iter = vm.get_method_or_type_error(obj.clone(), "__iter__", || { | ||
| format!("a bytes-like object is required, not {}", obj.class()) |
There was a problem hiding this comment.
I think we could have a trait for the objects that can be borrowed as &[PyObjectRef], so we can have a generic and efficiency way to iter.
9c3d97b to
604bd05
Compare
|
What that random ci fail comes from? |
dcca305 to
4e08961
Compare
4e08961 to
2ed49bc
Compare
|
@qingshi163 I rebased this to master with the new mutex/ |
|
@coolreader18 Thanks, I think I need review for merge now. |
07959c2 to
896590c
Compare
coolreader18
left a comment
There was a problem hiding this comment.
I think this overall looks really good!! I'm fine with merging it and fixing leftover issues later, since this is so big already. @youknowone what do you think?
|
And don't worry about that CI failure; I can fix that when I merge. |
|
oops, after a few days of resting, I totally forgot to merge this first before other PRs. I agree to merge this and fix other stuffs later. Let me fix the conflicts. |
896590c to
8a0b2d9
Compare
8a0b2d9 to
88f5466
Compare
|
I fixed a few commit messages because fisrt two 'fix test' commits are not fixing test but disabling it. |
I add a slot for implement the buffer protocol, more or less like what CPython doing. I am not sure is the right way for RustPython. Also I am confusing about the slots seem do not inherit from the base? So how we control the behavour of the subclass?