Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
timesteps = timestep
timesteps = timesteps.to(sample.device)
Instead of using _to_tensor() here, we simply do the device placement, since timestep here will always come as a tensor, IIUC.
OK here, but I didn't see how timestep is forced to be a tensor in the pipeline. Note that the code that creates the timesteps is
self.scheduler.set_timesteps(num_inference_steps, device=device)
timesteps = self.scheduler.timesteps
so whether timesteps is a tensor or not really depends on how set_timesteps is implemented in the scheduler; hence we still have to apply _to_tensor inside the pipeline.
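For illustration, a minimal sketch of the defensive pattern under discussion, assuming a hypothetical step() method; the names here are placeholders, not the actual I2VGenXL code:

import torch

def step(self, sample, timestep):
    # Accept Python ints/floats as well as tensors, then place the result
    # on the same device as the sample. If set_timesteps() guaranteed
    # tensors, only the .to() call below would be needed.
    timesteps = timestep
    if not torch.is_tensor(timesteps):
        timesteps = torch.tensor([timesteps], dtype=torch.long)
    timesteps = timesteps.to(sample.device)
    ...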
Fixed in 3d26814.
If the scheduler in the pipeline changes and its set_timesteps() doesn't ensure tensor packing, it's going to fail. I think the bigger question is whether we should start making set_timesteps() always set the timesteps as tensors. But I also think it's not super high-priority right now.
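A minimal sketch of what such a guarantee could look like, assuming a NumPy-computed schedule and a config attribute named num_train_timesteps; illustrative, not a specific diffusers scheduler:

import numpy as np
import torch

def set_timesteps(self, num_inference_steps, device=None):
    # Compute the schedule on the CPU, then always wrap it in a torch
    # tensor so every pipeline can rely on tensor semantics downstream.
    steps = np.linspace(0, self.config.num_train_timesteps - 1, num_inference_steps)
    steps = steps.round()[::-1].copy().astype(np.int64)
    self.timesteps = torch.from_numpy(steps).to(device)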
yiyixuxu left a comment:
Thanks!
Left one piece of feedback; also, let's fix the argument about attention_dim (see comment here: #6665 (comment)).
I'm OK with not refactoring the image pre-processing for now.
Number of layers to be skipped from CLIP while computing the prompt embeddings. A value of 1 means that
the output of the pre-final layer will be used for computing the prompt embeddings.
"""
# set lora scale so that monkey patched LoRA
Going to merge it now and tackle the
* remove _to_tensor
* remove _to_tensor definition
* remove _collapse_frames_into_batch
* remove LoRA so as not to bloat the code
* remove sample_size
* simplify code a bit more
* ensure timesteps are always tensors
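As an aside, a sketch of what a helper like _collapse_frames_into_batch usually does in video pipelines, assuming (batch, channels, frames, height, width) latents; illustrative only, not the removed code:

import torch

def collapse_frames_into_batch(sample: torch.Tensor) -> torch.Tensor:
    # Fold the frame axis into the batch axis so 2D modules can process
    # every frame at once; the inverse reshape restores the video layout.
    batch, channels, frames, height, width = sample.shape
    sample = sample.permute(0, 2, 1, 3, 4)  # (batch, frames, channels, h, w)
    return sample.reshape(batch * frames, channels, height, width)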
What does this PR do?
Follow-up of #6665.
FYI @patrickvonplaten
I didn't address https://github.com/huggingface/diffusers/pull/6665/files#r1471447995 because I don't think the resizing scheme should be made a part of image_processor, as it's quite specific to I2VGenXL. I will run the slow tests of I2VGenXL once things look good, just to be sure.
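To illustrate why such resizing can be model-specific, a hedged sketch of an aspect-preserving resize plus center crop; the function name and parameters are hypothetical, not taken from the I2VGenXL pipeline:

import PIL.Image

def resize_and_center_crop(image: PIL.Image.Image, target_width: int, target_height: int) -> PIL.Image.Image:
    # Scale so the image fully covers the target, then crop the center.
    # Assumes Pillow >= 9.1 for Image.Resampling.
    scale = max(target_width / image.width, target_height / image.height)
    resized = image.resize(
        (round(image.width * scale), round(image.height * scale)),
        PIL.Image.Resampling.BICUBIC,
    )
    left = (resized.width - target_width) // 2
    top = (resized.height - target_height) // 2
    return resized.crop((left, top, left + target_width, top + target_height))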