Commit 1c50bcf

prishajain1csgoogle
authored and committed
Add LTX2 Video VAE implementation
1 parent 347644f commit 1c50bcf

1 file changed (+56, -0 lines)

Gemini.md

@@ -0,0 +1,56 @@
# Gemini Development Notes

This file contains notes and lessons learned during development to avoid repeating mistakes.

## Code Formatting

- **Guideline**: Always format the code after making changes.
- **Tools**:
  - Use `make style` to format using `black` and `ruff`.
  - Use `./code_style.sh` (if configured to use the correct `pyink`) to format using `pyink`.
  - Use `make quality` to check formatting without modifying files.
- **Manual Pyink**: To run `pyink` with specific settings (e.g. indentation=2, line-length=125). With `--check --diff` the command only reports what would change; drop those two flags to rewrite files in place:
  ```bash
  pyink src/maxdiffusion --check --diff --color --pyink-indentation=2 --line-length=125
  ```
- **Ruff Cleanup**: To remove unused imports using `ruff` (note that `--fix` applies every auto-fixable rule enabled in the project configuration, not only unused imports):
  ```bash
  ruff check src/maxdiffusion/tests/wan_animate/test_transformer_wan_animate.py --fix
  ```

## Virtual Environment and Python Version

- **Issue**: The virtual environment `.venv` stores its dependencies under a `python3.13` directory, but a bare `python3` invocation may pick up the system Python 3.9 if the environment is activated incorrectly or if the `bin/python` symlink points to the system Python.
- **Solution**: Always use the absolute path to the intended Python version inside the virtual environment when running tests or scripts that depend on specific packages like `diffusers`.
  - Example: `/Users/sagarchapara/repos/maxdiffusion/.venv/bin/python3.13`

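One way to confirm which interpreter is actually running is to inspect `sys` from inside the script or test session. The sketch below is illustrative only; the helper names `interpreter_summary` and `require_python` are hypothetical and not part of the repository.

```python
import sys


def interpreter_summary():
  """Report which interpreter is running and whether it belongs to a venv."""
  return {
      "executable": sys.executable,
      "version": tuple(sys.version_info[:2]),
      # Inside a venv, sys.prefix points at the venv while sys.base_prefix
      # points at the interpreter the venv was created from.
      "in_venv": sys.prefix != sys.base_prefix,
  }


def require_python(expected):
  """Raise early if the running interpreter is not the expected major.minor."""
  actual = tuple(sys.version_info[:2])
  if actual != tuple(expected):
    raise RuntimeError(
        f"Expected Python {expected[0]}.{expected[1]}, "
        f"got {actual[0]}.{actual[1]} at {sys.executable}"
    )


if __name__ == "__main__":
  print(interpreter_summary())
```

Calling `require_python((3, 13))` at the top of a test session would fail fast when the system interpreter is picked up instead of the one inside `.venv`.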
## Artifact Paths

- **Issue**: Attempts were made to write artifacts (implementation plans, tasks, walkthroughs) into the codebase.
- **Solution**: Always write artifacts to the artifact directory specific to the conversation.
  - Path: `<appDataDir>/brain/<conversation-id>/`

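The rule above can be expressed as a small guard. This is a minimal sketch under stated assumptions: `artifact_dir` and `is_inside` are hypothetical helpers, and the application data directory and conversation id are passed in as arguments because their real values are not given in these notes.

```python
from pathlib import Path


def artifact_dir(app_data_dir, conversation_id):
  """Build <appDataDir>/brain/<conversation-id>/ from caller-supplied values."""
  return Path(app_data_dir) / "brain" / conversation_id


def is_inside(path, root):
  """Return True if `path` is `root` itself or lies somewhere beneath it."""
  path = Path(path).resolve()
  root = Path(root).resolve()
  return path == root or root in path.parents
```

A writer could refuse any target for which `is_inside(target, repo_root)` is true, so plans and walkthroughs never land in the codebase.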
## Wan Animate Face Encoder

### Shape Expectations

- The test `test_wan_animate_face_encoder_shape` previously had an incorrect expectation of `(2, 5, 5, 512)`.
- The correct expected output shape for `WanAnimateFaceEncoder` is `(2, 3, 5, 512)`, because the concatenation at the end, applied after reshaping, adds 1 to the 3rd dimension.

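The effect of that final concatenation on the shape can be illustrated with plain NumPy. This is only a sketch: the size of 4 on the 3rd axis before concatenation and the zero-filled arrays are assumptions chosen so the result matches the shape quoted above, and they are not taken from the model code.

```python
import numpy as np

# Assumed sizes; only the final shape (2, 3, 5, 512) comes from the notes.
batch, frames, tokens, dim = 2, 3, 4, 512

# Stand-in for the reshaped encoder features.
features = np.zeros((batch, frames, tokens, dim), dtype=np.float32)
# Stand-in for the single extra slice appended along the 3rd dimension.
extra = np.zeros((batch, frames, 1, dim), dtype=np.float32)

# Concatenating along axis 2 grows the 3rd dimension by exactly 1.
output = np.concatenate([features, extra], axis=2)
print(output.shape)
```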
### Weight Mapping

- **Convolutions**: PyTorch 1D convolutional weights of shape `(out_channels, in_channels, kernel_size)` (e.g., `(4096, 512, 3)`) must be transposed to `(kernel_size, in_channels, out_channels)` (e.g., `(3, 512, 4096)`) using `transpose(2, 1, 0)` for `nnx.Conv`.
- **Linears**: PyTorch linear weights of shape `(out_features, in_features)` must be transposed to `(in_features, out_features)` for `nnx.Linear`.
- **LayerNorm**: The LayerNorm layers in the PyTorch `WanAnimateFaceEncoder` typically have no learnable bias or scale (or they are fixed), which maps to `use_bias=False` and `use_scale=False` on the JAX side.

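The two transpositions can be sketched with NumPy as follows. The helper names are hypothetical and the shapes are deliberately small; the actual conversion code in the repository may be organized differently.

```python
import numpy as np


def torch_conv1d_to_flax_kernel(weight):
  """(out_channels, in_channels, kernel_size) -> (kernel_size, in_channels, out_channels)."""
  return np.transpose(weight, (2, 1, 0))


def torch_linear_to_flax_kernel(weight):
  """(out_features, in_features) -> (in_features, out_features)."""
  return np.transpose(weight, (1, 0))


rng = np.random.default_rng(0)
conv_weight = rng.standard_normal((8, 4, 3)).astype(np.float32)  # PyTorch Conv1d layout
linear_weight = rng.standard_normal((6, 4)).astype(np.float32)   # PyTorch Linear layout

conv_kernel = torch_conv1d_to_flax_kernel(conv_weight)
linear_kernel = torch_linear_to_flax_kernel(linear_weight)

# A PyTorch linear layer computes x @ W.T, while a Flax linear layer computes
# x @ kernel, so the transposed kernel must reproduce the same result.
x = rng.standard_normal((2, 4)).astype(np.float32)
assert np.allclose(x @ linear_weight.T, x @ linear_kernel)

print(conv_kernel.shape, linear_kernel.shape)
```

Each kernel entry `conv_kernel[k, i, o]` equals `conv_weight[o, i, k]`, which is the relationship the `(2, 1, 0)` permutation encodes.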
## Wan Animate Face Block Cross Attention

### Equivalence Test

- Added `test_equivalence_face_block_cross_attention` to `src/maxdiffusion/tests/wan_animate/test_transformer_wan_animate.py` to verify equivalence between `FlaxWanAnimateFaceBlockCrossAttention` and `WanAnimateFaceBlockCrossAttention` in `diffusers`.
- The test transfers weights and compares outputs for random inputs, asserting a tolerance of `1e-4`.

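The general shape of such an equivalence check can be sketched with NumPy alone. This is not the repository's test: two plain matrix multiplications stand in for the `diffusers` and Flax modules, and treating `1e-4` as an absolute tolerance is an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)

# Reference weights in PyTorch layout (out_features, in_features).
reference_weight = rng.standard_normal((16, 8)).astype(np.float32)
# "Transfer" the weights by transposing to (in_features, out_features).
ported_kernel = reference_weight.T

# Random input shared by both implementations.
x = rng.standard_normal((2, 5, 8)).astype(np.float32)

# Stand-ins for the two forward passes being compared.
reference_out = x @ reference_weight.T
ported_out = np.einsum("bti,io->bto", x, ported_kernel)

# Fails with a detailed mismatch report if any element differs by more than 1e-4.
np.testing.assert_allclose(ported_out, reference_out, atol=1e-4, rtol=0)
max_abs_diff = float(np.max(np.abs(ported_out - reference_out)))
```

Tracking `max_abs_diff` alongside the assertion makes it easier to see how close a port is to the tolerance when the check starts failing.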
### Temp Files Removal

- Removed temporary inspection files created during the development process.
