[INFO] cloning repository https://github.com/aemiguel/krasis
[INFO] running `Command { std: "git" "-c" "credential.helper=" "-c" "credential.helper=/workspace/cargo-home/bin/git-credential-null" "clone" "--bare" "https://github.com/aemiguel/krasis" "/workspace/cache/git-repos/https%3A%2F%2Fgithub.com%2Faemiguel%2Fkrasis", kill_on_drop: false }`
[INFO] [stderr] Cloning into bare repository '/workspace/cache/git-repos/https%3A%2F%2Fgithub.com%2Faemiguel%2Fkrasis'...
[INFO] running `Command { std: "git" "rev-parse" "HEAD", kill_on_drop: false }`
[INFO] [stdout] 43eb8850ae0df873d6cf2d32e4592af4d242ba02
[INFO] checking aemiguel/krasis against master#57f772f25c5ce2bd870d6f8c3ab318eaee5a3326 for pr-133502-22
[INFO] running `Command { std: "git" "clone" "/workspace/cache/git-repos/https%3A%2F%2Fgithub.com%2Faemiguel%2Fkrasis" "/workspace/builds/worker-2-tc1/source", kill_on_drop: false }`
[INFO] [stderr] Cloning into '/workspace/builds/worker-2-tc1/source'...
[INFO] [stderr] done.
[INFO] [stderr] Updating files: 100% (4121/4121), done.
[INFO] started tweaking git repo https://github.com/aemiguel/krasis
[INFO] finished tweaking git repo https://github.com/aemiguel/krasis
[INFO] tweaked toml for git repo https://github.com/aemiguel/krasis written to /workspace/builds/worker-2-tc1/source/Cargo.toml
[INFO] validating manifest of git repo https://github.com/aemiguel/krasis on toolchain 57f772f25c5ce2bd870d6f8c3ab318eaee5a3326
[INFO] running `Command { std: CARGO_HOME="/workspace/cargo-home" RUSTUP_HOME="/workspace/rustup-home" "/workspace/cargo-home/bin/cargo" "+57f772f25c5ce2bd870d6f8c3ab318eaee5a3326" "metadata" "--manifest-path" "Cargo.toml" "--no-deps", kill_on_drop: false }`
[INFO] crate git repo https://github.com/aemiguel/krasis already has a lockfile, it will not be regenerated
[INFO] running `Command { std: CARGO_HOME="/workspace/cargo-home" RUSTUP_HOME="/workspace/rustup-home" "/workspace/cargo-home/bin/cargo" "+57f772f25c5ce2bd870d6f8c3ab318eaee5a3326" "fetch" "--manifest-path" "Cargo.toml", kill_on_drop: false }`
[INFO] [stderr]     Blocking waiting for file lock on package cache
[INFO] [stderr]     Updating crates.io index
[INFO] [stderr]     Blocking waiting for file lock on package cache
[INFO] [stderr]  Downloading crates ...
[INFO] [stderr]   Downloaded safetensors v0.4.5
[INFO] [stderr]   Downloaded tokenizers v0.21.4
[INFO] [stderr]   Downloaded memo-map v0.3.3
[INFO] [stderr]   Downloaded minijinja v2.19.0
[INFO] [stderr]   Downloaded cudarc v0.12.1
[INFO] running `Command { std: "docker" "create" "-v" "/var/lib/crater-agent-workspace/builds/worker-2-tc1/target:/opt/rustwide/target:rw,Z" "-v" "/var/lib/crater-agent-workspace/builds/worker-2-tc1/source:/opt/rustwide/workdir:ro,Z" "-v" "/var/lib/crater-agent-workspace/cargo-home:/opt/rustwide/cargo-home:ro,Z" "-v" "/var/lib/crater-agent-workspace/rustup-home:/opt/rustwide/rustup-home:ro,Z" "-e" "SOURCE_DIR=/opt/rustwide/workdir" "-e" "CARGO_TARGET_DIR=/opt/rustwide/target" "-e" "CARGO_HOME=/opt/rustwide/cargo-home" "-e" "RUSTUP_HOME=/opt/rustwide/rustup-home" "-w" "/opt/rustwide/workdir" "-m" "1610612736" "--user" "0:0" "--network" "none" "ghcr.io/rust-lang/crates-build-env/linux@sha256:d429b63d4308055ea97f60fb1d3dfca48854a00942f1bd2ad806beaf015945ec" "/opt/rustwide/cargo-home/bin/cargo" "+57f772f25c5ce2bd870d6f8c3ab318eaee5a3326" "metadata" "--no-deps" "--format-version=1", kill_on_drop: false }`
[INFO] [stdout] d9809654e4e4f6d77a26f1ff21006b4792c31388ba91d20b2cc6d500314d77a7
[INFO] running `Command { std: "docker" "start" "-a" "d9809654e4e4f6d77a26f1ff21006b4792c31388ba91d20b2cc6d500314d77a7", kill_on_drop: false }`
[INFO] running `Command { std: "docker" "inspect" "d9809654e4e4f6d77a26f1ff21006b4792c31388ba91d20b2cc6d500314d77a7", kill_on_drop: false }`
[INFO] running `Command { std: "docker" "rm" "-f" "d9809654e4e4f6d77a26f1ff21006b4792c31388ba91d20b2cc6d500314d77a7", kill_on_drop: false }`
[INFO] [stdout] d9809654e4e4f6d77a26f1ff21006b4792c31388ba91d20b2cc6d500314d77a7
[INFO] running `Command { std: "docker" "create" "-v" "/var/lib/crater-agent-workspace/builds/worker-2-tc1/target:/opt/rustwide/target:rw,Z" "-v" "/var/lib/crater-agent-workspace/builds/worker-2-tc1/source:/opt/rustwide/workdir:ro,Z" "-v" "/var/lib/crater-agent-workspace/cargo-home:/opt/rustwide/cargo-home:ro,Z" "-v" "/var/lib/crater-agent-workspace/rustup-home:/opt/rustwide/rustup-home:ro,Z" "-e" "SOURCE_DIR=/opt/rustwide/workdir" "-e" "CARGO_TARGET_DIR=/opt/rustwide/target" "-e" "CARGO_INCREMENTAL=0" "-e" "RUST_BACKTRACE=full" "-e" "RUSTFLAGS=--cap-lints=forbid" "-e" "RUSTDOCFLAGS=--cap-lints=forbid" "-e" "CARGO_HOME=/opt/rustwide/cargo-home" "-e" "RUSTUP_HOME=/opt/rustwide/rustup-home" "-w" "/opt/rustwide/workdir" "-m" "1610612736" "--user" "0:0" "--network" "none" "ghcr.io/rust-lang/crates-build-env/linux@sha256:d429b63d4308055ea97f60fb1d3dfca48854a00942f1bd2ad806beaf015945ec" "/opt/rustwide/cargo-home/bin/cargo" "+57f772f25c5ce2bd870d6f8c3ab318eaee5a3326" "check" "--frozen" "--all" "--all-targets" "--message-format=json", kill_on_drop: false }`
[INFO] [stdout] b30848976230aff2fe214237aba0685a4ba220942724bb1bae64a20f6a3c55b8
[INFO] running `Command { std: "docker" "start" "-a" "b30848976230aff2fe214237aba0685a4ba220942724bb1bae64a20f6a3c55b8", kill_on_drop: false }`
[INFO] [stderr]    Compiling libc v0.2.185
[INFO] [stderr]    Compiling once_cell v1.21.4
[INFO] [stderr]     Checking memchr v2.8.0
[INFO] [stderr]    Compiling cc v1.2.60
[INFO] [stderr]     Checking regex-syntax v0.8.10
[INFO] [stderr]    Compiling pkg-config v0.3.33
[INFO] [stderr]    Compiling syn v2.0.117
[INFO] [stderr]    Compiling pyo3-build-config v0.23.5
[INFO] [stderr]    Compiling ahash v0.8.12
[INFO] [stderr]    Compiling esaxx-rs v0.1.10
[INFO] [stderr]     Checking rayon v1.12.0
[INFO] [stderr]     Checking castaway v0.2.4
[INFO] [stderr]     Checking unicode-segmentation v1.13.2
[INFO] [stderr]    Compiling cudarc v0.12.1
[INFO] [stderr]    Compiling macro_rules_attribute-proc_macro v0.2.2
[INFO] [stderr]     Checking base64 v0.13.1
[INFO] [stderr]     Checking bitflags v2.11.1
[INFO] [stderr]     Checking anstream v1.0.0
[INFO] [stderr]     Checking unicode-normalization-alignments v0.1.12
[INFO] [stderr]     Checking jiff v0.2.23
[INFO] [stderr]     Checking inventory v0.3.24
[INFO] [stderr]     Checking unindent v0.2.4
[INFO] [stderr]     Checking memo-map v0.3.3
[INFO] [stderr]     Checking aho-corasick v1.1.4
[INFO] [stderr]     Checking serde_json v1.0.149
[INFO] [stderr]     Checking nom v7.1.3
[INFO] [stderr]     Checking macro_rules_attribute v0.2.2
[INFO] [stderr]    Compiling onig_sys v69.9.1
[INFO] [stderr]    Compiling krasis v0.1.66-rc29 (/opt/rustwide/workdir)
[INFO] [stderr] warning: krasis@0.1.66-rc29: KRASIS_BUILD_TIMING phase="probe libnuma" duration_ms=1 duration_s=0.002
[INFO] [stderr] warning: krasis@0.1.66-rc29: nvcc not found — GPU decode kernels disabled
[INFO] [stderr] warning: krasis@0.1.66-rc29: KRASIS_BUILD_TIMING phase="decode PTX" duration_ms=4 duration_s=0.004
[INFO] [stderr] warning: krasis@0.1.66-rc29: nvcc not found — GPU prefill kernels disabled
[INFO] [stderr] warning: krasis@0.1.66-rc29: KRASIS_BUILD_TIMING phase="prefill PTX" duration_ms=0 duration_s=0.000
[INFO] [stderr] warning: krasis@0.1.66-rc29: nvcc not found — HQQ CUDA search disabled
[INFO] [stderr] warning: krasis@0.1.66-rc29: KRASIS_BUILD_TIMING phase="HQQ search PTX" duration_ms=0 duration_s=0.000
[INFO] [stderr] warning: krasis@0.1.66-rc29: KRASIS_BUILD_TIMING phase="build.rs total" duration_ms=7 duration_s=0.007
[INFO] [stderr]     Checking getrandom v0.3.4
[INFO] [stderr]     Checking getrandom v0.2.17
[INFO] [stderr]     Checking memmap2 v0.9.10
[INFO] [stderr]     Checking rand_core v0.6.4
[INFO] [stderr]     Checking rand_core v0.9.5
[INFO] [stderr]    Compiling pyo3-macros-backend v0.23.5
[INFO] [stderr]    Compiling pyo3-ffi v0.23.5
[INFO] [stderr]    Compiling pyo3 v0.23.5
[INFO] [stderr]     Checking regex-automata v0.4.14
[INFO] [stderr]     Checking rayon-cond v0.4.0
[INFO] [stderr]    Compiling darling_core v0.20.11
[INFO] [stderr]     Checking regex v1.12.3
[INFO] [stderr]    Compiling zerocopy-derive v0.8.48
[INFO] [stderr]    Compiling serde_derive v1.0.228
[INFO] [stderr]    Compiling thiserror-impl v2.0.18
[INFO] [stderr]    Compiling monostate-impl v0.1.18
[INFO] [stderr]     Checking env_filter v1.0.1
[INFO] [stderr]     Checking env_logger v0.11.10
[INFO] [stderr]     Checking monostate v0.1.18
[INFO] [stderr]     Checking onig v6.5.1
[INFO] [stderr]     Checking thiserror v2.0.18
[INFO] [stderr]     Checking zerocopy v0.8.48
[INFO] [stderr]    Compiling darling_macro v0.20.11
[INFO] [stderr]    Compiling pyo3-macros v0.23.5
[INFO] [stderr]    Compiling darling v0.20.11
[INFO] [stderr]    Compiling derive_builder_core v0.20.2
[INFO] [stderr]    Compiling derive_builder_macro v0.20.2
[INFO] [stderr]     Checking serde v1.0.228
[INFO] [stderr]     Checking derive_builder v0.20.2
[INFO] [stderr]     Checking spm_precompiled v0.1.4
[INFO] [stderr]     Checking dary_heap v0.3.9
[INFO] [stderr]     Checking compact_str v0.9.0
[INFO] [stderr]     Checking minijinja v2.19.0
[INFO] [stderr]     Checking safetensors v0.4.5
[INFO] [stderr]     Checking ppv-lite86 v0.2.21
[INFO] [stderr]     Checking rand_chacha v0.9.0
[INFO] [stderr]     Checking rand_chacha v0.3.1
[INFO] [stderr]     Checking rand v0.9.4
[INFO] [stderr]     Checking rand v0.8.6
[INFO] [stderr]     Checking rand_distr v0.5.1
[INFO] [stderr]     Checking tokenizers v0.21.4
[INFO] [stderr]     Checking half v2.7.1
[INFO] [stdout] warning: unused imports: `repack_tiled_int4_packed`, `repack_tiled_int8_packed`, and `repack_tiled_scales`
[INFO] [stdout]   --> src/decode.rs:12:5
[INFO] [stdout]    |
[INFO] [stdout] 12 |     repack_tiled_int4_packed, repack_tiled_int8_packed, repack_tiled_scales,
[INFO] [stdout]    |     ^^^^^^^^^^^^^^^^^^^^^^^^  ^^^^^^^^^^^^^^^^^^^^^^^^  ^^^^^^^^^^^^^^^^^^^
[INFO] [stdout]    |
[INFO] [stdout]    = note: `#[warn(unused_imports)]` (part of `#[warn(unused)]`) on by default
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused import: `std::time::Instant`
[INFO] [stdout]      --> src/gpu_decode.rs:31393:13
[INFO] [stdout]       |
[INFO] [stdout] 31393 |         use std::time::Instant;
[INFO] [stdout]       |             ^^^^^^^^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused import: `crate::weights::marlin::bf16_to_f32`
[INFO] [stdout]      --> src/gpu_decode.rs:33092:13
[INFO] [stdout]       |
[INFO] [stdout] 33092 |         use crate::weights::marlin::bf16_to_f32;
[INFO] [stdout]       |             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused import: `UnifiedExpertWeights`
[INFO] [stdout]      --> src/gpu_decode.rs:33455:43
[INFO] [stdout]       |
[INFO] [stdout] 33455 |         use crate::weights::{WeightStore, UnifiedExpertWeights};
[INFO] [stdout]       |                                           ^^^^^^^^^^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused import: `crate::weights::marlin::bf16_to_f32`
[INFO] [stdout]      --> src/gpu_decode.rs:33456:13
[INFO] [stdout]       |
[INFO] [stdout] 33456 |         use crate::weights::marlin::bf16_to_f32;
[INFO] [stdout]       |             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused import: `crate::weights::marlin::bf16_to_f32`
[INFO] [stdout]      --> src/gpu_decode.rs:33652:13
[INFO] [stdout]       |
[INFO] [stdout] 33652 |         use crate::weights::marlin::bf16_to_f32;
[INFO] [stdout]       |             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused imports: `repack_tiled_int4_packed`, `repack_tiled_int8_packed`, and `repack_tiled_scales`
[INFO] [stdout]   --> src/decode.rs:12:5
[INFO] [stdout]    |
[INFO] [stdout] 12 |     repack_tiled_int4_packed, repack_tiled_int8_packed, repack_tiled_scales,
[INFO] [stdout]    |     ^^^^^^^^^^^^^^^^^^^^^^^^  ^^^^^^^^^^^^^^^^^^^^^^^^  ^^^^^^^^^^^^^^^^^^^
[INFO] [stdout]    |
[INFO] [stdout]    = note: `#[warn(unused_imports)]` (part of `#[warn(unused)]`) on by default
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused import: `std::time::Instant`
[INFO] [stdout]      --> src/gpu_decode.rs:31393:13
[INFO] [stdout]       |
[INFO] [stdout] 31393 |         use std::time::Instant;
[INFO] [stdout]       |             ^^^^^^^^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused import: `crate::weights::marlin::bf16_to_f32`
[INFO] [stdout]      --> src/gpu_decode.rs:33092:13
[INFO] [stdout]       |
[INFO] [stdout] 33092 |         use crate::weights::marlin::bf16_to_f32;
[INFO] [stdout]       |             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused import: `UnifiedExpertWeights`
[INFO] [stdout]      --> src/gpu_decode.rs:33455:43
[INFO] [stdout]       |
[INFO] [stdout] 33455 |         use crate::weights::{WeightStore, UnifiedExpertWeights};
[INFO] [stdout]       |                                           ^^^^^^^^^^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused import: `crate::weights::marlin::bf16_to_f32`
[INFO] [stdout]      --> src/gpu_decode.rs:33456:13
[INFO] [stdout]       |
[INFO] [stdout] 33456 |         use crate::weights::marlin::bf16_to_f32;
[INFO] [stdout]       |             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused import: `crate::weights::marlin::bf16_to_f32`
[INFO] [stdout]      --> src/gpu_decode.rs:33652:13
[INFO] [stdout]       |
[INFO] [stdout] 33652 |         use crate::weights::marlin::bf16_to_f32;
[INFO] [stdout]       |             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unreachable statement
[INFO] [stdout]     --> src/gpu_decode.rs:9711:9
[INFO] [stdout]      |
[INFO] [stdout] 9706 | /         return Err(pyo3::exceptions::PyRuntimeError::new_err(
[INFO] [stdout] 9707 | |             "Decode kernels not available (nvcc not found at build time). \
[INFO] [stdout] 9708 | |              GPU decode requires compiled CUDA kernels. Rebuild with nvcc in PATH."
[INFO] [stdout] 9709 | |         ));
[INFO] [stdout]      | |__________- any code following this expression is unreachable
[INFO] [stdout] 9710 |
[INFO] [stdout] 9711 |           let kernels_loaded = cfg!(has_decode_kernels);
[INFO] [stdout]      |           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ unreachable statement
[INFO] [stdout]      |
[INFO] [stdout]      = note: `#[warn(unreachable_code)]` (part of `#[warn(unused)]`) on by default
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unreachable statement
[INFO] [stdout]     --> src/gpu_decode.rs:9711:9
[INFO] [stdout]      |
[INFO] [stdout] 9706 | /         return Err(pyo3::exceptions::PyRuntimeError::new_err(
[INFO] [stdout] 9707 | |             "Decode kernels not available (nvcc not found at build time). \
[INFO] [stdout] 9708 | |              GPU decode requires compiled CUDA kernels. Rebuild with nvcc in PATH."
[INFO] [stdout] 9709 | |         ));
[INFO] [stdout]      | |__________- any code following this expression is unreachable
[INFO] [stdout] 9710 |
[INFO] [stdout] 9711 |           let kernels_loaded = cfg!(has_decode_kernels);
[INFO] [stdout]      |           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ unreachable statement
[INFO] [stdout]      |
[INFO] [stdout]      = note: `#[warn(unreachable_code)]` (part of `#[warn(unused)]`) on by default
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `rope_len`
[INFO] [stdout]     --> src/decode.rs:2528:9
[INFO] [stdout]      |
[INFO] [stdout] 2528 |         rope_len: usize, rope_max_seq: usize,
[INFO] [stdout]      |         ^^^^^^^^ help: if this is intentional, prefix it with an underscore: `_rope_len`
[INFO] [stdout]      |
[INFO] [stdout]      = note: `#[warn(unused_variables)]` (part of `#[warn(unused)]`) on by default
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `hs`
[INFO] [stdout]     --> src/decode.rs:2857:13
[INFO] [stdout]      |
[INFO] [stdout] 2857 |         let hs = g.hidden_size;
[INFO] [stdout]      |             ^^ help: if this is intentional, prefix it with an underscore: `_hs`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `kv_lora_rank`
[INFO] [stdout]     --> src/decode.rs:2906:32
[INFO] [stdout]      |
[INFO] [stdout] 2906 |                     num_heads, kv_lora_rank, qk_nope_dim, qk_rope_dim, v_head_dim, .. } => {
[INFO] [stdout]      |                                ^^^^^^^^^^^^ help: try ignoring the field: `kv_lora_rank: _`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `qk_nope_dim`
[INFO] [stdout]     --> src/decode.rs:2906:46
[INFO] [stdout]      |
[INFO] [stdout] 2906 |                     num_heads, kv_lora_rank, qk_nope_dim, qk_rope_dim, v_head_dim, .. } => {
[INFO] [stdout]      |                                              ^^^^^^^^^^^ help: try ignoring the field: `qk_nope_dim: _`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `qk_rope_dim`
[INFO] [stdout]     --> src/decode.rs:2906:59
[INFO] [stdout]      |
[INFO] [stdout] 2906 |                     num_heads, kv_lora_rank, qk_nope_dim, qk_rope_dim, v_head_dim, .. } => {
[INFO] [stdout]      |                                                           ^^^^^^^^^^^ help: try ignoring the field: `qk_rope_dim: _`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `max_seq`
[INFO] [stdout]     --> src/decode.rs:4764:5
[INFO] [stdout]      |
[INFO] [stdout] 4764 |     max_seq: usize,
[INFO] [stdout]      |     ^^^^^^^ help: if this is intentional, prefix it with an underscore: `_max_seq`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `rope_len`
[INFO] [stdout]     --> src/decode.rs:2528:9
[INFO] [stdout]      |
[INFO] [stdout] 2528 |         rope_len: usize, rope_max_seq: usize,
[INFO] [stdout]      |         ^^^^^^^^ help: if this is intentional, prefix it with an underscore: `_rope_len`
[INFO] [stdout]      |
[INFO] [stdout]      = note: `#[warn(unused_variables)]` (part of `#[warn(unused)]`) on by default
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `hs`
[INFO] [stdout]     --> src/decode.rs:2857:13
[INFO] [stdout]      |
[INFO] [stdout] 2857 |         let hs = g.hidden_size;
[INFO] [stdout]      |             ^^ help: if this is intentional, prefix it with an underscore: `_hs`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `kv_lora_rank`
[INFO] [stdout]     --> src/decode.rs:2906:32
[INFO] [stdout]      |
[INFO] [stdout] 2906 |                     num_heads, kv_lora_rank, qk_nope_dim, qk_rope_dim, v_head_dim, .. } => {
[INFO] [stdout]      |                                ^^^^^^^^^^^^ help: try ignoring the field: `kv_lora_rank: _`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `qk_nope_dim`
[INFO] [stdout]     --> src/decode.rs:2906:46
[INFO] [stdout]      |
[INFO] [stdout] 2906 |                     num_heads, kv_lora_rank, qk_nope_dim, qk_rope_dim, v_head_dim, .. } => {
[INFO] [stdout]      |                                              ^^^^^^^^^^^ help: try ignoring the field: `qk_nope_dim: _`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `qk_rope_dim`
[INFO] [stdout]     --> src/decode.rs:2906:59
[INFO] [stdout]      |
[INFO] [stdout] 2906 |                     num_heads, kv_lora_rank, qk_nope_dim, qk_rope_dim, v_head_dim, .. } => {
[INFO] [stdout]      |                                                           ^^^^^^^^^^^ help: try ignoring the field: `qk_rope_dim: _`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `entry`
[INFO] [stdout]     --> src/gpu_decode.rs:3693:64
[INFO] [stdout]      |
[INFO] [stdout] 3693 |     fn gpu_expert_ptrs_set(&self, layer: usize, expert: usize, entry: &HcsCacheEntry) {
[INFO] [stdout]      |                                                                ^^^^^ help: if this is intentional, prefix it with an underscore: `_entry`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `max_seq`
[INFO] [stdout]     --> src/decode.rs:4764:5
[INFO] [stdout]      |
[INFO] [stdout] 4764 |     max_seq: usize,
[INFO] [stdout]      |     ^^^^^^^ help: if this is intentional, prefix it with an underscore: `_max_seq`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: variable does not need to be mutable
[INFO] [stdout]     --> src/gpu_decode.rs:9588:13
[INFO] [stdout]      |
[INFO] [stdout] 9588 |         let mut gqa_smem_limit: u32 = 48 * 1024; // default
[INFO] [stdout]      |             ----^^^^^^^^^^^^^^
[INFO] [stdout]      |             |
[INFO] [stdout]      |             help: remove this `mut`
[INFO] [stdout]      |
[INFO] [stdout]      = note: `#[warn(unused_mut)]` (part of `#[warn(unused)]`) on by default
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `blas`
[INFO] [stdout]     --> src/gpu_decode.rs:9522:13
[INFO] [stdout]      |
[INFO] [stdout] 9522 |         let blas = CudaBlas::new(device.clone())
[INFO] [stdout]      |             ^^^^ help: if this is intentional, prefix it with an underscore: `_blas`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `compute_stream`
[INFO] [stdout]     --> src/gpu_decode.rs:9526:13
[INFO] [stdout]      |
[INFO] [stdout] 9526 |         let compute_stream = unsafe {
[INFO] [stdout]      |             ^^^^^^^^^^^^^^ help: if this is intentional, prefix it with an underscore: `_compute_stream`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `copy_stream`
[INFO] [stdout]     --> src/gpu_decode.rs:9539:13
[INFO] [stdout]      |
[INFO] [stdout] 9539 |         let copy_stream = unsafe {
[INFO] [stdout]      |             ^^^^^^^^^^^ help: if this is intentional, prefix it with an underscore: `_copy_stream`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `prefetch_stream`
[INFO] [stdout]     --> src/gpu_decode.rs:9552:13
[INFO] [stdout]      |
[INFO] [stdout] 9552 |         let prefetch_stream = unsafe {
[INFO] [stdout]      |             ^^^^^^^^^^^^^^^ help: if this is intentional, prefix it with an underscore: `_prefetch_stream`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `gqa_smem_limit`
[INFO] [stdout]     --> src/gpu_decode.rs:9588:13
[INFO] [stdout]      |
[INFO] [stdout] 9588 |         let mut gqa_smem_limit: u32 = 48 * 1024; // default
[INFO] [stdout]      |             ^^^^^^^^^^^^^^^^^^ help: if this is intentional, prefix it with an underscore: `_gqa_smem_limit`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `entry`
[INFO] [stdout]     --> src/gpu_decode.rs:3693:64
[INFO] [stdout]      |
[INFO] [stdout] 3693 |     fn gpu_expert_ptrs_set(&self, layer: usize, expert: usize, entry: &HcsCacheEntry) {
[INFO] [stdout]      |                                                                ^^^^^ help: if this is intentional, prefix it with an underscore: `_entry`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `f`
[INFO] [stdout]      --> src/gpu_decode.rs:12388:17
[INFO] [stdout]       |
[INFO] [stdout] 12388 |             let f = self.device.get_func(MODULE_NAME, "rmsnorm")
[INFO] [stdout]       |                 ^ help: if this is intentional, prefix it with an underscore: `_f`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `row_len`
[INFO] [stdout]      --> src/gpu_decode.rs:12633:13
[INFO] [stdout]       |
[INFO] [stdout] 12633 |         let row_len = n * 16;
[INFO] [stdout]       |             ^^^^^^^ help: if this is intentional, prefix it with an underscore: `_row_len`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: variable does not need to be mutable
[INFO] [stdout]     --> src/gpu_decode.rs:9588:13
[INFO] [stdout]      |
[INFO] [stdout] 9588 |         let mut gqa_smem_limit: u32 = 48 * 1024; // default
[INFO] [stdout]      |             ----^^^^^^^^^^^^^^
[INFO] [stdout]      |             |
[INFO] [stdout]      |             help: remove this `mut`
[INFO] [stdout]      |
[INFO] [stdout]      = note: `#[warn(unused_mut)]` (part of `#[warn(unused)]`) on by default
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `blas`
[INFO] [stdout]     --> src/gpu_decode.rs:9522:13
[INFO] [stdout]      |
[INFO] [stdout] 9522 |         let blas = CudaBlas::new(device.clone())
[INFO] [stdout]      |             ^^^^ help: if this is intentional, prefix it with an underscore: `_blas`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `compute_stream`
[INFO] [stdout]     --> src/gpu_decode.rs:9526:13
[INFO] [stdout]      |
[INFO] [stdout] 9526 |         let compute_stream = unsafe {
[INFO] [stdout]      |             ^^^^^^^^^^^^^^ help: if this is intentional, prefix it with an underscore: `_compute_stream`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `copy_stream`
[INFO] [stdout]     --> src/gpu_decode.rs:9539:13
[INFO] [stdout]      |
[INFO] [stdout] 9539 |         let copy_stream = unsafe {
[INFO] [stdout]      |             ^^^^^^^^^^^ help: if this is intentional, prefix it with an underscore: `_copy_stream`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `prefetch_stream`
[INFO] [stdout]     --> src/gpu_decode.rs:9552:13
[INFO] [stdout]      |
[INFO] [stdout] 9552 |         let prefetch_stream = unsafe {
[INFO] [stdout]      |             ^^^^^^^^^^^^^^^ help: if this is intentional, prefix it with an underscore: `_prefetch_stream`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `gqa_smem_limit`
[INFO] [stdout]     --> src/gpu_decode.rs:9588:13
[INFO] [stdout]      |
[INFO] [stdout] 9588 |         let mut gqa_smem_limit: u32 = 48 * 1024; // default
[INFO] [stdout]      |             ^^^^^^^^^^^^^^^^^^ help: if this is intentional, prefix it with an underscore: `_gqa_smem_limit`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `f`
[INFO] [stdout]      --> src/gpu_decode.rs:12388:17
[INFO] [stdout]       |
[INFO] [stdout] 12388 |             let f = self.device.get_func(MODULE_NAME, "rmsnorm")
[INFO] [stdout]       |                 ^ help: if this is intentional, prefix it with an underscore: `_f`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `row_len`
[INFO] [stdout]      --> src/gpu_decode.rs:12633:13
[INFO] [stdout]       |
[INFO] [stdout] 12633 |         let row_len = n * 16;
[INFO] [stdout]       |             ^^^^^^^ help: if this is intentional, prefix it with an underscore: `_row_len`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: variable does not need to be mutable
[INFO] [stdout]      --> src/gpu_decode.rs:17766:13
[INFO] [stdout]       |
[INFO] [stdout] 17766 |         let mut engine = PrefillEngine {
[INFO] [stdout]       |             ----^^^^^^
[INFO] [stdout]       |             |
[INFO] [stdout]       |             help: remove this `mut`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `total_w1_packed`
[INFO] [stdout]      --> src/gpu_decode.rs:17739:13
[INFO] [stdout]       |
[INFO] [stdout] 17739 |         let total_w1_packed = n_routed * w1_packed_per_expert;
[INFO] [stdout]       |             ^^^^^^^^^^^^^^^ help: if this is intentional, prefix it with an underscore: `_total_w1_packed`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `total_w1_scales`
[INFO] [stdout]      --> src/gpu_decode.rs:17740:13
[INFO] [stdout]       |
[INFO] [stdout] 17740 |         let total_w1_scales = n_routed * w1_scales_per_expert;
[INFO] [stdout]       |             ^^^^^^^^^^^^^^^ help: if this is intentional, prefix it with an underscore: `_total_w1_scales`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `total_w2_packed`
[INFO] [stdout]      --> src/gpu_decode.rs:17741:13
[INFO] [stdout]       |
[INFO] [stdout] 17741 |         let total_w2_packed = n_routed * w2_packed_per_expert;
[INFO] [stdout]       |             ^^^^^^^^^^^^^^^ help: if this is intentional, prefix it with an underscore: `_total_w2_packed`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `total_w2_scales`
[INFO] [stdout]      --> src/gpu_decode.rs:17742:13
[INFO] [stdout]       |
[INFO] [stdout] 17742 |         let total_w2_scales = n_routed * w2_scales_per_expert;
[INFO] [stdout]       |             ^^^^^^^^^^^^^^^ help: if this is intentional, prefix it with an underscore: `_total_w2_scales`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `conv_state_ptr`
[INFO] [stdout]      --> src/gpu_decode.rs:18318:17
[INFO] [stdout]       |
[INFO] [stdout] 18318 |                 conv_state_ptr, recur_state_ptr,
[INFO] [stdout]       |                 ^^^^^^^^^^^^^^ help: try ignoring the field: `conv_state_ptr: _`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `recur_state_ptr`
[INFO] [stdout]      --> src/gpu_decode.rs:18318:33
[INFO] [stdout]       |
[INFO] [stdout] 18318 |                 conv_state_ptr, recur_state_ptr,
[INFO] [stdout]       |                                 ^^^^^^^^^^^^^^^ help: try ignoring the field: `recur_state_ptr: _`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `positions`
[INFO] [stdout]      --> src/gpu_decode.rs:18530:9
[INFO] [stdout]       |
[INFO] [stdout] 18530 |         positions: &[usize],
[INFO] [stdout]       |         ^^^^^^^^^ help: if this is intentional, prefix it with an underscore: `_positions`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `eps`
[INFO] [stdout]      --> src/gpu_decode.rs:18533:13
[INFO] [stdout]       |
[INFO] [stdout] 18533 |         let eps = graph.eps;
[INFO] [stdout]       |             ^^^ help: if this is intentional, prefix it with an underscore: `_eps`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `k`
[INFO] [stdout]      --> src/gpu_decode.rs:18537:13
[INFO] [stdout]       |
[INFO] [stdout] 18537 |         let k = graph.kernels.as_ref().ok_or("kernels not cached")?.clone();
[INFO] [stdout]       |             ^ help: if this is intentional, prefix it with an underscore: `_k`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: variable does not need to be mutable
[INFO] [stdout]      --> src/gpu_decode.rs:17766:13
[INFO] [stdout]       |
[INFO] [stdout] 17766 |         let mut engine = PrefillEngine {
[INFO] [stdout]       |             ----^^^^^^
[INFO] [stdout]       |             |
[INFO] [stdout]       |             help: remove this `mut`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `total_w1_packed`
[INFO] [stdout]      --> src/gpu_decode.rs:17739:13
[INFO] [stdout]       |
[INFO] [stdout] 17739 |         let total_w1_packed = n_routed * w1_packed_per_expert;
[INFO] [stdout]       |             ^^^^^^^^^^^^^^^ help: if this is intentional, prefix it with an underscore: `_total_w1_packed`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `total_w1_scales`
[INFO] [stdout]      --> src/gpu_decode.rs:17740:13
[INFO] [stdout]       |
[INFO] [stdout] 17740 |         let total_w1_scales = n_routed * w1_scales_per_expert;
[INFO] [stdout]       |             ^^^^^^^^^^^^^^^ help: if this is intentional, prefix it with an underscore: `_total_w1_scales`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `total_w2_packed`
[INFO] [stdout]      --> src/gpu_decode.rs:17741:13
[INFO] [stdout]       |
[INFO] [stdout] 17741 |         let total_w2_packed = n_routed * w2_packed_per_expert;
[INFO] [stdout]       |             ^^^^^^^^^^^^^^^ help: if this is intentional, prefix it with an underscore: `_total_w2_packed`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `total_w2_scales`
[INFO] [stdout]      --> src/gpu_decode.rs:17742:13
[INFO] [stdout]       |
[INFO] [stdout] 17742 |         let total_w2_scales = n_routed * w2_scales_per_expert;
[INFO] [stdout]       |             ^^^^^^^^^^^^^^^ help: if this is intentional, prefix it with an underscore: `_total_w2_scales`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `conv_state_ptr`
[INFO] [stdout]      --> src/gpu_decode.rs:18318:17
[INFO] [stdout]       |
[INFO] [stdout] 18318 |                 conv_state_ptr, recur_state_ptr,
[INFO] [stdout]       |                 ^^^^^^^^^^^^^^ help: try ignoring the field: `conv_state_ptr: _`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `recur_state_ptr`
[INFO] [stdout]      --> src/gpu_decode.rs:18318:33
[INFO] [stdout]       |
[INFO] [stdout] 18318 |                 conv_state_ptr, recur_state_ptr,
[INFO] [stdout]       |                                 ^^^^^^^^^^^^^^^ help: try ignoring the field: `recur_state_ptr: _`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `positions`
[INFO] [stdout]      --> src/gpu_decode.rs:18530:9
[INFO] [stdout]       |
[INFO] [stdout] 18530 |         positions: &[usize],
[INFO] [stdout]       |         ^^^^^^^^^ help: if this is intentional, prefix it with an underscore: `_positions`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `graph_idx`
[INFO] [stdout]      --> src/gpu_decode.rs:19503:9
[INFO] [stdout]       |
[INFO] [stdout] 19503 |         graph_idx: usize,
[INFO] [stdout]       |         ^^^^^^^^^ help: if this is intentional, prefix it with an underscore: `_graph_idx`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `w13_ksplits_batched`
[INFO] [stdout]      --> src/gpu_decode.rs:19544:13
[INFO] [stdout]       |
[INFO] [stdout] 19544 |         let w13_ksplits_batched = if w13_max_ksplits > 1 {
[INFO] [stdout]       |             ^^^^^^^^^^^^^^^^^^^ help: if this is intentional, prefix it with an underscore: `_w13_ksplits_batched`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: variable `gqa_cache_idx` is assigned to, but never used
[INFO] [stdout]      --> src/gpu_decode.rs:19783:13
[INFO] [stdout]       |
[INFO] [stdout] 19783 |         let mut gqa_cache_idx = seg_gqa_offset;
[INFO] [stdout]       |             ^^^^^^^^^^^^^^^^^
[INFO] [stdout]       |
[INFO] [stdout]       = note: consider using `_gqa_cache_idx` instead
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: value assigned to `gqa_cache_idx` is never read
[INFO] [stdout]      --> src/gpu_decode.rs:19787:13
[INFO] [stdout]       |
[INFO] [stdout] 19787 |             gqa_cache_idx = seg_gqa_offset;
[INFO] [stdout]       |             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ this value is reassigned later and never used
[INFO] [stdout] ...
[INFO] [stdout] 19790 |                     gqa_cache_idx += 1;
[INFO] [stdout]       |                     ------------------ `gqa_cache_idx` is overwritten here before the previous value is read
[INFO] [stdout]       |
[INFO] [stdout]       = note: `#[warn(unused_assignments)]` (part of `#[warn(unused)]`) on by default
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: value assigned to `gqa_cache_idx` is never read
[INFO] [stdout]      --> src/gpu_decode.rs:19790:21
[INFO] [stdout]       |
[INFO] [stdout] 19790 |                     gqa_cache_idx += 1;
[INFO] [stdout]       |                     ^^^^^^^^^^^^^^^^^^ this value is reassigned later and never used
[INFO] [stdout] ...
[INFO] [stdout] 20827 |                         gqa_cache_idx += 1;
[INFO] [stdout]       |                         ------------------ `gqa_cache_idx` is overwritten here before the previous value is read
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: value assigned to `gqa_cache_idx` is never read
[INFO] [stdout]      --> src/gpu_decode.rs:20827:25
[INFO] [stdout]       |
[INFO] [stdout] 20827 |                         gqa_cache_idx += 1;
[INFO] [stdout]       |                         ^^^^^^^^^^^^^^^^^^
[INFO] [stdout]       |
[INFO] [stdout]       = help: maybe it is overwritten before being read?
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `eps`
[INFO] [stdout]      --> src/gpu_decode.rs:18533:13
[INFO] [stdout]       |
[INFO] [stdout] 18533 |         let eps = graph.eps;
[INFO] [stdout]       |             ^^^ help: if this is intentional, prefix it with an underscore: `_eps`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `k`
[INFO] [stdout]      --> src/gpu_decode.rs:18537:13
[INFO] [stdout]       |
[INFO] [stdout] 18537 |         let k = graph.kernels.as_ref().ok_or("kernels not cached")?.clone();
[INFO] [stdout]       |             ^ help: if this is intentional, prefix it with an underscore: `_k`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `hs`
[INFO] [stdout]      --> src/gpu_decode.rs:21541:13
[INFO] [stdout]       |
[INFO] [stdout] 21541 |         let hs = graph.hidden_size;
[INFO] [stdout]       |             ^^ help: if this is intentional, prefix it with an underscore: `_hs`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unnecessary `unsafe` block
[INFO] [stdout]      --> src/gpu_decode.rs:22762:49
[INFO] [stdout]       |
[INFO] [stdout] 22751 | ...                   unsafe {
[INFO] [stdout]       |                       ------ because it's nested under this `unsafe` block
[INFO] [stdout] ...
[INFO] [stdout] 22762 | ...                       let k_out = unsafe {
[INFO] [stdout]       |                                       ^^^^^^ unnecessary `unsafe` block
[INFO] [stdout]       |
[INFO] [stdout]       = note: `#[warn(unused_unsafe)]` (part of `#[warn(unused)]`) on by default
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `graph_idx`
[INFO] [stdout]      --> src/gpu_decode.rs:19503:9
[INFO] [stdout]       |
[INFO] [stdout] 19503 |         graph_idx: usize,
[INFO] [stdout]       |         ^^^^^^^^^ help: if this is intentional, prefix it with an underscore: `_graph_idx`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `w13_ksplits_batched`
[INFO] [stdout]      --> src/gpu_decode.rs:19544:13
[INFO] [stdout]       |
[INFO] [stdout] 19544 |         let w13_ksplits_batched = if w13_max_ksplits > 1 {
[INFO] [stdout]       |             ^^^^^^^^^^^^^^^^^^^ help: if this is intentional, prefix it with an underscore: `_w13_ksplits_batched`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: variable `gqa_cache_idx` is assigned to, but never used
[INFO] [stdout]      --> src/gpu_decode.rs:19783:13
[INFO] [stdout]       |
[INFO] [stdout] 19783 |         let mut gqa_cache_idx = seg_gqa_offset;
[INFO] [stdout]       |             ^^^^^^^^^^^^^^^^^
[INFO] [stdout]       |
[INFO] [stdout]       = note: consider using `_gqa_cache_idx` instead
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: value assigned to `gqa_cache_idx` is never read
[INFO] [stdout]      --> src/gpu_decode.rs:19787:13
[INFO] [stdout]       |
[INFO] [stdout] 19787 |             gqa_cache_idx = seg_gqa_offset;
[INFO] [stdout]       |             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ this value is reassigned later and never used
[INFO] [stdout] ...
[INFO] [stdout] 19790 |                     gqa_cache_idx += 1;
[INFO] [stdout]       |                     ------------------ `gqa_cache_idx` is overwritten here before the previous value is read
[INFO] [stdout]       |
[INFO] [stdout]       = note: `#[warn(unused_assignments)]` (part of `#[warn(unused)]`) on by default
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: value assigned to `gqa_cache_idx` is never read
[INFO] [stdout]      --> src/gpu_decode.rs:19790:21
[INFO] [stdout]       |
[INFO] [stdout] 19790 |                     gqa_cache_idx += 1;
[INFO] [stdout]       |                     ^^^^^^^^^^^^^^^^^^ this value is reassigned later and never used
[INFO] [stdout] ...
[INFO] [stdout] 20827 |                         gqa_cache_idx += 1;
[INFO] [stdout]       |                         ------------------ `gqa_cache_idx` is overwritten here before the previous value is read
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: value assigned to `gqa_cache_idx` is never read
[INFO] [stdout]      --> src/gpu_decode.rs:20827:25
[INFO] [stdout]       |
[INFO] [stdout] 20827 |                         gqa_cache_idx += 1;
[INFO] [stdout]       |                         ^^^^^^^^^^^^^^^^^^
[INFO] [stdout]       |
[INFO] [stdout]       = help: maybe it is overwritten before being read?
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: variable does not need to be mutable
[INFO] [stdout]      --> src/gpu_decode.rs:22482:13
[INFO] [stdout]       |
[INFO] [stdout] 22482 |         let mut tt_norm = 0.0f64;
[INFO] [stdout]       |             ----^^^^^^^
[INFO] [stdout]       |             |
[INFO] [stdout]       |             help: remove this `mut`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: variable does not need to be mutable
[INFO] [stdout]      --> src/gpu_decode.rs:22483:13
[INFO] [stdout]       |
[INFO] [stdout] 22483 |         let mut tt_shared = 0.0f64;
[INFO] [stdout]       |             ----^^^^^^^^^
[INFO] [stdout]       |             |
[INFO] [stdout]       |             help: remove this `mut`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: variable `gqa_cache_idx` is assigned to, but never used
[INFO] [stdout]      --> src/gpu_decode.rs:22477:13
[INFO] [stdout]       |
[INFO] [stdout] 22477 |         let mut gqa_cache_idx = seg_gqa_offset; // Start at offset for multi-GPU segments
[INFO] [stdout]       |             ^^^^^^^^^^^^^^^^^
[INFO] [stdout]       |
[INFO] [stdout]       = note: consider using `_gqa_cache_idx` instead
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `tt_norm`
[INFO] [stdout]      --> src/gpu_decode.rs:22482:13
[INFO] [stdout]       |
[INFO] [stdout] 22482 |         let mut tt_norm = 0.0f64;
[INFO] [stdout]       |             ^^^^^^^^^^^ help: if this is intentional, prefix it with an underscore: `_tt_norm`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `num_blocks`
[INFO] [stdout]      --> src/gpu_decode.rs:23555:29
[INFO] [stdout]       |
[INFO] [stdout] 23555 |                         let num_blocks = graph.kv_num_blocks;
[INFO] [stdout]       |                             ^^^^^^^^^^ help: if this is intentional, prefix it with an underscore: `_num_blocks`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `c_ptr`
[INFO] [stdout]      --> src/gpu_decode.rs:24282:25
[INFO] [stdout]       |
[INFO] [stdout] 24282 |                     let c_ptr = b_ptr + (n_groups * state_size * 4) as u64;
[INFO] [stdout]       |                         ^^^^^ help: if this is intentional, prefix it with an underscore: `_c_ptr`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: value assigned to `gqa_cache_idx` is never read
[INFO] [stdout]      --> src/gpu_decode.rs:23979:21
[INFO] [stdout]       |
[INFO] [stdout] 23979 |                     gqa_cache_idx += 1;
[INFO] [stdout]       |                     ^^^^^^^^^^^^^^^^^^
[INFO] [stdout]       |
[INFO] [stdout]       = help: maybe it is overwritten before being read?
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `hs`
[INFO] [stdout]      --> src/gpu_decode.rs:21541:13
[INFO] [stdout]       |
[INFO] [stdout] 21541 |         let hs = graph.hidden_size;
[INFO] [stdout]       |             ^^ help: if this is intentional, prefix it with an underscore: `_hs`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unnecessary `unsafe` block
[INFO] [stdout]      --> src/gpu_decode.rs:22762:49
[INFO] [stdout]       |
[INFO] [stdout] 22751 | ...                   unsafe {
[INFO] [stdout]       |                       ------ because it's nested under this `unsafe` block
[INFO] [stdout] ...
[INFO] [stdout] 22762 | ...                       let k_out = unsafe {
[INFO] [stdout]       |                                       ^^^^^^ unnecessary `unsafe` block
[INFO] [stdout]       |
[INFO] [stdout]       = note: `#[warn(unused_unsafe)]` (part of `#[warn(unused)]`) on by default
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: value assigned to `step_ok` is never read
[INFO] [stdout]      --> src/gpu_decode.rs:25856:41
[INFO] [stdout]       |
[INFO] [stdout] 25856 | ...                   step_ok = false; break;
[INFO] [stdout]       |                       ^^^^^^^^^^^^^^^
[INFO] [stdout]       |
[INFO] [stdout]       = help: maybe it is overwritten before being read?
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: variable does not need to be mutable
[INFO] [stdout]      --> src/gpu_decode.rs:22482:13
[INFO] [stdout]       |
[INFO] [stdout] 22482 |         let mut tt_norm = 0.0f64;
[INFO] [stdout]       |             ----^^^^^^^
[INFO] [stdout]       |             |
[INFO] [stdout]       |             help: remove this `mut`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: variable does not need to be mutable
[INFO] [stdout]      --> src/gpu_decode.rs:22483:13
[INFO] [stdout]       |
[INFO] [stdout] 22483 |         let mut tt_shared = 0.0f64;
[INFO] [stdout]       |             ----^^^^^^^^^
[INFO] [stdout]       |             |
[INFO] [stdout]       |             help: remove this `mut`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: variable `gqa_cache_idx` is assigned to, but never used
[INFO] [stdout]      --> src/gpu_decode.rs:22477:13
[INFO] [stdout]       |
[INFO] [stdout] 22477 |         let mut gqa_cache_idx = seg_gqa_offset; // Start at offset for multi-GPU segments
[INFO] [stdout]       |             ^^^^^^^^^^^^^^^^^
[INFO] [stdout]       |
[INFO] [stdout]       = note: consider using `_gqa_cache_idx` instead
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `tt_norm`
[INFO] [stdout]      --> src/gpu_decode.rs:22482:13
[INFO] [stdout]       |
[INFO] [stdout] 22482 |         let mut tt_norm = 0.0f64;
[INFO] [stdout]       |             ^^^^^^^^^^^ help: if this is intentional, prefix it with an underscore: `_tt_norm`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `num_blocks`
[INFO] [stdout]      --> src/gpu_decode.rs:23555:29
[INFO] [stdout]       |
[INFO] [stdout] 23555 |                         let num_blocks = graph.kv_num_blocks;
[INFO] [stdout]       |                             ^^^^^^^^^^ help: if this is intentional, prefix it with an underscore: `_num_blocks`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `c_ptr`
[INFO] [stdout]      --> src/gpu_decode.rs:24282:25
[INFO] [stdout]       |
[INFO] [stdout] 24282 |                     let c_ptr = b_ptr + (n_groups * state_size * 4) as u64;
[INFO] [stdout]       |                         ^^^^^ help: if this is intentional, prefix it with an underscore: `_c_ptr`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: value assigned to `gqa_cache_idx` is never read
[INFO] [stdout]      --> src/gpu_decode.rs:23979:21
[INFO] [stdout]       |
[INFO] [stdout] 23979 |                     gqa_cache_idx += 1;
[INFO] [stdout]       |                     ^^^^^^^^^^^^^^^^^^
[INFO] [stdout]       |
[INFO] [stdout]       = help: maybe it is overwritten before being read?
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: variable does not need to be mutable
[INFO] [stdout]      --> src/gpu_decode.rs:26130:13
[INFO] [stdout]       |
[INFO] [stdout] 26130 |         let mut tt_proj: f64 = 0.0;
[INFO] [stdout]       |             ----^^^^^^^
[INFO] [stdout]       |             |
[INFO] [stdout]       |             help: remove this `mut`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `tt_proj`
[INFO] [stdout]      --> src/gpu_decode.rs:26130:13
[INFO] [stdout]       |
[INFO] [stdout] 26130 |         let mut tt_proj: f64 = 0.0;
[INFO] [stdout]       |             ^^^^^^^^^^^ help: if this is intentional, prefix it with an underscore: `_tt_proj`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `nep`
[INFO] [stdout]      --> src/gpu_decode.rs:27895:13
[INFO] [stdout]       |
[INFO] [stdout] 27895 |         let nep = hcs.num_experts_per_layer;
[INFO] [stdout]       |             ^^^ help: if this is intentional, prefix it with an underscore: `_nep`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: value assigned to `step_ok` is never read
[INFO] [stdout]      --> src/gpu_decode.rs:25856:41
[INFO] [stdout]       |
[INFO] [stdout] 25856 | ...                   step_ok = false; break;
[INFO] [stdout]       |                       ^^^^^^^^^^^^^^^
[INFO] [stdout]       |
[INFO] [stdout]       = help: maybe it is overwritten before being read?
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: variable does not need to be mutable
[INFO] [stdout]      --> src/gpu_decode.rs:26130:13
[INFO] [stdout]       |
[INFO] [stdout] 26130 |         let mut tt_proj: f64 = 0.0;
[INFO] [stdout]       |             ----^^^^^^^
[INFO] [stdout]       |             |
[INFO] [stdout]       |             help: remove this `mut`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `tt_proj`
[INFO] [stdout]      --> src/gpu_decode.rs:26130:13
[INFO] [stdout]       |
[INFO] [stdout] 26130 |         let mut tt_proj: f64 = 0.0;
[INFO] [stdout]       |             ^^^^^^^^^^^ help: if this is intentional, prefix it with an underscore: `_tt_proj`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `draft_context_window`
[INFO] [stdout]      --> src/gpu_decode.rs:29076:13
[INFO] [stdout]       |
[INFO] [stdout] 29076 |         let draft_context_window = self.draft_context_window;
[INFO] [stdout]       |             ^^^^^^^^^^^^^^^^^^^^ help: if this is intentional, prefix it with an underscore: `_draft_context_window`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `used_mb`
[INFO] [stdout]      --> src/gpu_decode.rs:29804:17
[INFO] [stdout]       |
[INFO] [stdout] 29804 |             let used_mb = total_mb - free_mb;
[INFO] [stdout]       |                 ^^^^^^^ help: if this is intentional, prefix it with an underscore: `_used_mb`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `nep`
[INFO] [stdout]      --> src/gpu_decode.rs:27895:13
[INFO] [stdout]       |
[INFO] [stdout] 27895 |         let nep = hcs.num_experts_per_layer;
[INFO] [stdout]       |             ^^^ help: if this is intentional, prefix it with an underscore: `_nep`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `t_moe`
[INFO] [stdout]      --> src/gpu_decode.rs:33591:13
[INFO] [stdout]       |
[INFO] [stdout] 33591 |         let t_moe = Instant::now();
[INFO] [stdout]       |             ^^^^^ help: if this is intentional, prefix it with an underscore: `_t_moe`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `draft_context_window`
[INFO] [stdout]      --> src/gpu_decode.rs:29076:13
[INFO] [stdout]       |
[INFO] [stdout] 29076 |         let draft_context_window = self.draft_context_window;
[INFO] [stdout]       |             ^^^^^^^^^^^^^^^^^^^^ help: if this is intentional, prefix it with an underscore: `_draft_context_window`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `used_mb`
[INFO] [stdout]      --> src/gpu_decode.rs:29804:17
[INFO] [stdout]       |
[INFO] [stdout] 29804 |             let used_mb = total_mb - free_mb;
[INFO] [stdout]       |                 ^^^^^^^ help: if this is intentional, prefix it with an underscore: `_used_mb`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `t_moe`
[INFO] [stdout]      --> src/gpu_decode.rs:33591:13
[INFO] [stdout]       |
[INFO] [stdout] 33591 |         let t_moe = Instant::now();
[INFO] [stdout]       |             ^^^^^ help: if this is intentional, prefix it with an underscore: `_t_moe`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: variable does not need to be mutable
[INFO] [stdout]     --> src/gpu_prefill.rs:8906:13
[INFO] [stdout]      |
[INFO] [stdout] 8906 |         let mut t_other_ms = 0.0f64;
[INFO] [stdout]      |             ----^^^^^^^^^^
[INFO] [stdout]      |             |
[INFO] [stdout]      |             help: remove this `mut`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `diag_moe_detail`
[INFO] [stdout]     --> src/gpu_prefill.rs:8582:13
[INFO] [stdout]      |
[INFO] [stdout] 8582 |         let diag_moe_detail = std::env::var("KRASIS_PREFILL_DIAG_MOE_DETAIL").is_ok();
[INFO] [stdout]      |             ^^^^^^^^^^^^^^^ help: if this is intentional, prefix it with an underscore: `_diag_moe_detail`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: value assigned to `attn_path` is never read
[INFO] [stdout]      --> src/gpu_prefill.rs:13328:29
[INFO] [stdout]       |
[INFO] [stdout] 13328 |         let mut attn_path = "custom_tiled";
[INFO] [stdout]       |                             ^^^^^^^^^^^^^^ this value is reassigned later and never used
[INFO] [stdout] ...
[INFO] [stdout] 14705 |                 attn_path = "custom_tiled_cross_chunk";
[INFO] [stdout]       |                 -------------------------------------- `attn_path` is overwritten here before the previous value is read
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: value assigned to `attn_path` is never read
[INFO] [stdout]      --> src/gpu_prefill.rs:14436:17
[INFO] [stdout]       |
[INFO] [stdout] 14436 |                 attn_path = "fa2_cross_chunk";
[INFO] [stdout]       |                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ this value is reassigned later and never used
[INFO] [stdout] ...
[INFO] [stdout] 14450 |                     attn_path = "fa2_fp8_cross_chunk";
[INFO] [stdout]       |                     --------------------------------- `attn_path` is overwritten here before the previous value is read
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `la_chunk_out`
[INFO] [stdout]      --> src/gpu_prefill.rs:15575:13
[INFO] [stdout]       |
[INFO] [stdout] 15575 |         let la_chunk_out = *self
[INFO] [stdout]       |             ^^^^^^^^^^^^ help: if this is intentional, prefix it with an underscore: `_la_chunk_out`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: variable does not need to be mutable
[INFO] [stdout]     --> src/gpu_prefill.rs:8906:13
[INFO] [stdout]      |
[INFO] [stdout] 8906 |         let mut t_other_ms = 0.0f64;
[INFO] [stdout]      |             ----^^^^^^^^^^
[INFO] [stdout]      |             |
[INFO] [stdout]      |             help: remove this `mut`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `diag_moe_detail`
[INFO] [stdout]     --> src/gpu_prefill.rs:8582:13
[INFO] [stdout]      |
[INFO] [stdout] 8582 |         let diag_moe_detail = std::env::var("KRASIS_PREFILL_DIAG_MOE_DETAIL").is_ok();
[INFO] [stdout]      |             ^^^^^^^^^^^^^^^ help: if this is intentional, prefix it with an underscore: `_diag_moe_detail`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: value assigned to `attn_path` is never read
[INFO] [stdout]      --> src/gpu_prefill.rs:13328:29
[INFO] [stdout]       |
[INFO] [stdout] 13328 |         let mut attn_path = "custom_tiled";
[INFO] [stdout]       |                             ^^^^^^^^^^^^^^ this value is reassigned later and never used
[INFO] [stdout] ...
[INFO] [stdout] 14705 |                 attn_path = "custom_tiled_cross_chunk";
[INFO] [stdout]       |                 -------------------------------------- `attn_path` is overwritten here before the previous value is read
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: value assigned to `attn_path` is never read
[INFO] [stdout]      --> src/gpu_prefill.rs:14436:17
[INFO] [stdout]       |
[INFO] [stdout] 14436 |                 attn_path = "fa2_cross_chunk";
[INFO] [stdout]       |                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ this value is reassigned later and never used
[INFO] [stdout] ...
[INFO] [stdout] 14450 |                     attn_path = "fa2_fp8_cross_chunk";
[INFO] [stdout]       |                     --------------------------------- `attn_path` is overwritten here before the previous value is read
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: value assigned to `d_temp_vendor_accum` is never read
[INFO] [stdout]      --> src/gpu_prefill.rs:24544:33
[INFO] [stdout]       |
[INFO] [stdout] 24544 | ...                   d_temp_vendor_accum = 0;
[INFO] [stdout]       |                       ^^^^^^^^^^^^^^^^^^^^^^^
[INFO] [stdout]       |
[INFO] [stdout]       = help: maybe it is overwritten before being read?
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: value assigned to `d_temp_vendor_src` is never read
[INFO] [stdout]      --> src/gpu_prefill.rs:24548:33
[INFO] [stdout]       |
[INFO] [stdout] 24548 | ...                   d_temp_vendor_src = 0;
[INFO] [stdout]       |                       ^^^^^^^^^^^^^^^^^^^^^
[INFO] [stdout]       |
[INFO] [stdout]       = help: maybe it is overwritten before being read?
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: value assigned to `d_temp_vendor_topk_weights` is never read
[INFO] [stdout]      --> src/gpu_prefill.rs:24552:33
[INFO] [stdout]       |
[INFO] [stdout] 24552 | ...                   d_temp_vendor_topk_weights = 0;
[INFO] [stdout]       |                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[INFO] [stdout]       |
[INFO] [stdout]       = help: maybe it is overwritten before being read?
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `total_active`
[INFO] [stdout]      --> src/gpu_prefill.rs:26358:13
[INFO] [stdout]       |
[INFO] [stdout] 26358 |         let total_active = m * topk; // upper bound
[INFO] [stdout]       |             ^^^^^^^^^^^^ help: if this is intentional, prefix it with an underscore: `_total_active`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `shared_ws_len`
[INFO] [stdout]      --> src/gpu_prefill.rs:28967:13
[INFO] [stdout]       |
[INFO] [stdout] 28967 |         let shared_ws_len = self.config.sms * MARLIN_MAX_LOCK_SLOTS_PER_SM;
[INFO] [stdout]       |             ^^^^^^^^^^^^^ help: if this is intentional, prefix it with an underscore: `_shared_ws_len`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `dv`
[INFO] [stdout]      --> src/gpu_prefill.rs:30381:13
[INFO] [stdout]       |
[INFO] [stdout] 30381 |         let dv = config.la_v_head_dim;
[INFO] [stdout]       |             ^^ help: if this is intentional, prefix it with an underscore: `_dv`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `la_chunk_out`
[INFO] [stdout]      --> src/gpu_prefill.rs:15575:13
[INFO] [stdout]       |
[INFO] [stdout] 15575 |         let la_chunk_out = *self
[INFO] [stdout]       |             ^^^^^^^^^^^^ help: if this is intentional, prefix it with an underscore: `_la_chunk_out`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: value assigned to `d_temp_vendor_accum` is never read
[INFO] [stdout]      --> src/gpu_prefill.rs:24544:33
[INFO] [stdout]       |
[INFO] [stdout] 24544 | ...                   d_temp_vendor_accum = 0;
[INFO] [stdout]       |                       ^^^^^^^^^^^^^^^^^^^^^^^
[INFO] [stdout]       |
[INFO] [stdout]       = help: maybe it is overwritten before being read?
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: value assigned to `d_temp_vendor_src` is never read
[INFO] [stdout]      --> src/gpu_prefill.rs:24548:33
[INFO] [stdout]       |
[INFO] [stdout] 24548 | ...                   d_temp_vendor_src = 0;
[INFO] [stdout]       |                       ^^^^^^^^^^^^^^^^^^^^^
[INFO] [stdout]       |
[INFO] [stdout]       = help: maybe it is overwritten before being read?
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: value assigned to `d_temp_vendor_topk_weights` is never read
[INFO] [stdout]      --> src/gpu_prefill.rs:24552:33
[INFO] [stdout]       |
[INFO] [stdout] 24552 | ...                   d_temp_vendor_topk_weights = 0;
[INFO] [stdout]       |                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[INFO] [stdout]       |
[INFO] [stdout]       = help: maybe it is overwritten before being read?
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `total_active`
[INFO] [stdout]      --> src/gpu_prefill.rs:26358:13
[INFO] [stdout]       |
[INFO] [stdout] 26358 |         let total_active = m * topk; // upper bound
[INFO] [stdout]       |             ^^^^^^^^^^^^ help: if this is intentional, prefix it with an underscore: `_total_active`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `shared_ws_len`
[INFO] [stdout]      --> src/gpu_prefill.rs:28967:13
[INFO] [stdout]       |
[INFO] [stdout] 28967 |         let shared_ws_len = self.config.sms * MARLIN_MAX_LOCK_SLOTS_PER_SM;
[INFO] [stdout]       |             ^^^^^^^^^^^^^ help: if this is intentional, prefix it with an underscore: `_shared_ws_len`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `evicted`
[INFO] [stdout]    --> src/server.rs:918:10
[INFO] [stdout]     |
[INFO] [stdout] 918 |     let (evicted, _freed_mb) = store_for_evict.hcs_evict_for_prefill(estimated_tokens);
[INFO] [stdout]     |          ^^^^^^^ help: if this is intentional, prefix it with an underscore: `_evicted`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: value assigned to `prompt_hcs_snapshot` is never read
[INFO] [stdout]     --> src/server.rs:930:76
[INFO] [stdout]      |
[INFO] [stdout]  930 |     let mut prompt_hcs_snapshot: Option<(Vec<u64>, usize, usize, usize)> = None;
[INFO] [stdout]      |                                                                            ^^^^ this value is reassigned later and never used
[INFO] [stdout] ...
[INFO] [stdout] 1000 |         prompt_hcs_snapshot = engine.prompt_hcs_shadow_snapshot();
[INFO] [stdout]      |         ------------------- `prompt_hcs_snapshot` is overwritten here before the previous value is read
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `dv`
[INFO] [stdout]      --> src/gpu_prefill.rs:30381:13
[INFO] [stdout]       |
[INFO] [stdout] 30381 |         let dv = config.la_v_head_dim;
[INFO] [stdout]       |             ^^ help: if this is intentional, prefix it with an underscore: `_dv`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: variable does not need to be mutable
[INFO] [stdout]     --> src/server.rs:1638:13
[INFO] [stdout]      |
[INFO] [stdout] 1638 |         let mut on_token = |token_id: usize, text: &str, fr: Option<&str>, token_logprobs: Option<&[(u32, f32)]>| -> bool {
[INFO] [stdout]      |             ----^^^^^^^^
[INFO] [stdout]      |             |
[INFO] [stdout]      |             help: remove this `mut`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `evicted`
[INFO] [stdout]     --> src/server.rs:2713:14
[INFO] [stdout]      |
[INFO] [stdout] 2713 |         let (evicted, _) = store.hcs_evict_for_prefill(estimated_tokens);
[INFO] [stdout]      |              ^^^^^^^ help: if this is intentional, prefix it with an underscore: `_evicted`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `w13_n`
[INFO] [stdout]    --> src/weights/mod.rs:748:13
[INFO] [stdout]     |
[INFO] [stdout] 748 |         let w13_n = if ungated { intermediate } else { 2 * intermediate };
[INFO] [stdout]     |             ^^^^^ help: if this is intentional, prefix it with an underscore: `_w13_n`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: value assigned to `effective_gs` is never read
[INFO] [stdout]     --> src/weights/mod.rs:2030:13
[INFO] [stdout]      |
[INFO] [stdout] 2030 |             effective_gs = 0;
[INFO] [stdout]      |             ^^^^^^^^^^^^^^^^
[INFO] [stdout]      |
[INFO] [stdout]      = help: maybe it is overwritten before being read?
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `total_moe_layers`
[INFO] [stdout]     --> src/weights/mod.rs:2761:9
[INFO] [stdout]      |
[INFO] [stdout] 2761 |         total_moe_layers: usize,
[INFO] [stdout]      |         ^^^^^^^^^^^^^^^^ help: if this is intentional, prefix it with an underscore: `_total_moe_layers`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `evicted`
[INFO] [stdout]    --> src/server.rs:918:10
[INFO] [stdout]     |
[INFO] [stdout] 918 |     let (evicted, _freed_mb) = store_for_evict.hcs_evict_for_prefill(estimated_tokens);
[INFO] [stdout]     |          ^^^^^^^ help: if this is intentional, prefix it with an underscore: `_evicted`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: value assigned to `prompt_hcs_snapshot` is never read
[INFO] [stdout]     --> src/server.rs:930:76
[INFO] [stdout]      |
[INFO] [stdout]  930 |     let mut prompt_hcs_snapshot: Option<(Vec<u64>, usize, usize, usize)> = None;
[INFO] [stdout]      |                                                                            ^^^^ this value is reassigned later and never used
[INFO] [stdout] ...
[INFO] [stdout] 1000 |         prompt_hcs_snapshot = engine.prompt_hcs_shadow_snapshot();
[INFO] [stdout]      |         ------------------- `prompt_hcs_snapshot` is overwritten here before the previous value is read
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: variable does not need to be mutable
[INFO] [stdout]     --> src/server.rs:1638:13
[INFO] [stdout]      |
[INFO] [stdout] 1638 |         let mut on_token = |token_id: usize, text: &str, fr: Option<&str>, token_logprobs: Option<&[(u32, f32)]>| -> bool {
[INFO] [stdout]      |             ----^^^^^^^^
[INFO] [stdout]      |             |
[INFO] [stdout]      |             help: remove this `mut`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `evicted`
[INFO] [stdout]     --> src/server.rs:2713:14
[INFO] [stdout]      |
[INFO] [stdout] 2713 |         let (evicted, _) = store.hcs_evict_for_prefill(estimated_tokens);
[INFO] [stdout]      |              ^^^^^^^ help: if this is intentional, prefix it with an underscore: `_evicted`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: type `PrefillTraceConfig` is more private than the item `gpu_prefill::PrefillEngine::trace`
[INFO] [stdout]     --> src/gpu_prefill.rs:3140:5
[INFO] [stdout]      |
[INFO] [stdout] 3140 |     pub trace: Option<PrefillTraceConfig>,
[INFO] [stdout]      |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ field `gpu_prefill::PrefillEngine::trace` is reachable at visibility `pub`
[INFO] [stdout]      |
[INFO] [stdout] note: but type `PrefillTraceConfig` is only usable at visibility `pub(crate)`
[INFO] [stdout]     --> src/gpu_prefill.rs:74:1
[INFO] [stdout]      |
[INFO] [stdout]   74 | pub(crate) struct PrefillTraceConfig {
[INFO] [stdout]      | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[INFO] [stdout]      = note: `#[warn(private_interfaces)]` on by default
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: function `prefetch_weight_nta` is never used
[INFO] [stdout]     --> src/decode.rs:1962:4
[INFO] [stdout]      |
[INFO] [stdout] 1962 | fn prefetch_weight_nta(w: &TransposedWeight) {
[INFO] [stdout]      |    ^^^^^^^^^^^^^^^^^^^
[INFO] [stdout]      |
[INFO] [stdout]      = note: `#[warn(dead_code)]` (part of `#[warn(unused)]`) on by default
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `w13_n`
[INFO] [stdout]    --> src/weights/mod.rs:748:13
[INFO] [stdout]     |
[INFO] [stdout] 748 |         let w13_n = if ungated { intermediate } else { 2 * intermediate };
[INFO] [stdout]     |             ^^^^^ help: if this is intentional, prefix it with an underscore: `_w13_n`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: function `bf16_to_f32` is never used
[INFO] [stdout]     --> src/decode.rs:1989:4
[INFO] [stdout]      |
[INFO] [stdout] 1989 | fn bf16_to_f32(x: u16) -> f32 {
[INFO] [stdout]      |    ^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: struct `Xorshift64` is never constructed
[INFO] [stdout]     --> src/decode.rs:4917:8
[INFO] [stdout]      |
[INFO] [stdout] 4917 | struct Xorshift64(u64);
[INFO] [stdout]      |        ^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: associated items `new`, `next_u64`, `next_u32`, and `next_f32` are never used
[INFO] [stdout]     --> src/decode.rs:4920:8
[INFO] [stdout]      |
[INFO] [stdout] 4919 | impl Xorshift64 {
[INFO] [stdout]      | --------------- associated items in this implementation
[INFO] [stdout] 4920 |     fn new(seed: u64) -> Self { Self(if seed == 0 { 0xDEADBEEF } else { seed }) }
[INFO] [stdout]      |        ^^^
[INFO] [stdout] 4921 |     #[inline]
[INFO] [stdout] 4922 |     fn next_u64(&mut self) -> u64 {
[INFO] [stdout]      |        ^^^^^^^^
[INFO] [stdout] ...
[INFO] [stdout] 4929 |     fn next_u32(&mut self) -> u32 { self.next_u64() as u32 }
[INFO] [stdout]      |        ^^^^^^^^
[INFO] [stdout] ...
[INFO] [stdout] 4932 |     fn next_f32(&mut self, scale: f32) -> f32 {
[INFO] [stdout]      |        ^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: function `fill_random_u32` is never used
[INFO] [stdout]     --> src/decode.rs:4940:4
[INFO] [stdout]      |
[INFO] [stdout] 4940 | fn fill_random_u32(v: &mut [u32], rng: &mut Xorshift64) {
[INFO] [stdout]      |    ^^^^^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: function `fill_random_scales_u16` is never used
[INFO] [stdout]     --> src/decode.rs:4947:4
[INFO] [stdout]      |
[INFO] [stdout] 4947 | fn fill_random_scales_u16(v: &mut [u16], rng: &mut Xorshift64) {
[INFO] [stdout]      |    ^^^^^^^^^^^^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: function `fill_random_f32` is never used
[INFO] [stdout]     --> src/decode.rs:4956:4
[INFO] [stdout]      |
[INFO] [stdout] 4956 | fn fill_random_f32(v: &mut [f32], rng: &mut Xorshift64, scale: f32) {
[INFO] [stdout]      |    ^^^^^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: function `fill_random_u16` is never used
[INFO] [stdout]     --> src/decode.rs:4963:4
[INFO] [stdout]      |
[INFO] [stdout] 4963 | fn fill_random_u16(v: &mut [u16], rng: &mut Xorshift64) {
[INFO] [stdout]      |    ^^^^^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: function `hint_hugepages` is never used
[INFO] [stdout]     --> src/decode.rs:4977:4
[INFO] [stdout]      |
[INFO] [stdout] 4977 | fn hint_hugepages<T>(v: &mut [T]) {
[INFO] [stdout]      |    ^^^^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: function `fake_transposed_weight` is never used
[INFO] [stdout]     --> src/decode.rs:5041:4
[INFO] [stdout]      |
[INFO] [stdout] 5041 | fn fake_transposed_weight(rows: usize, cols: usize, group_size: usize, num_bits: u8, rng: &mut Xorshift64) -> TransposedWeight {
[INFO] [stdout]      |    ^^^^^^^^^^^^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: constant `KERNEL_NAMES` is never used
[INFO] [stdout]     --> src/gpu_decode.rs:2739:7
[INFO] [stdout]      |
[INFO] [stdout] 2739 | const KERNEL_NAMES: &[&str] = &[
[INFO] [stdout]      |       ^^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: field `buf_size` is never read
[INFO] [stdout]     --> src/gpu_decode.rs:2886:5
[INFO] [stdout]      |
[INFO] [stdout] 2882 | struct PrefetchSlot {
[INFO] [stdout]      |        ------------ field in this struct
[INFO] [stdout] ...
[INFO] [stdout] 2886 |     buf_size: usize,
[INFO] [stdout]      |     ^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: method `decode_kb_per_tok` is never used
[INFO] [stdout]     --> src/gpu_decode.rs:3055:8
[INFO] [stdout]      |
[INFO] [stdout] 2960 | impl VramCalibration {
[INFO] [stdout]      | -------------------- method in this implementation
[INFO] [stdout] ...
[INFO] [stdout] 3055 |     fn decode_kb_per_tok(&self) -> f64 {
[INFO] [stdout]      |        ^^^^^^^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: field `pool_slot` is never read
[INFO] [stdout]     --> src/gpu_decode.rs:3186:5
[INFO] [stdout]      |
[INFO] [stdout] 3170 | struct HcsCacheEntry {
[INFO] [stdout]      |        ------------- field in this struct
[INFO] [stdout] ...
[INFO] [stdout] 3186 |     pool_slot: Option<usize>,
[INFO] [stdout]      |     ^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: method `is_marlin` is never used
[INFO] [stdout]     --> src/gpu_decode.rs:4447:8
[INFO] [stdout]      |
[INFO] [stdout] 4406 | impl GpuWeight {
[INFO] [stdout]      | -------------- method in this implementation
[INFO] [stdout] ...
[INFO] [stdout] 4447 |     fn is_marlin(&self) -> bool {
[INFO] [stdout]      |        ^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: multiple fields are never read
[INFO] [stdout]     --> src/gpu_decode.rs:4482:5
[INFO] [stdout]      |
[INFO] [stdout] 4477 | struct HqqPrefillSidecarRegistration {
[INFO] [stdout]      |        ----------------------------- fields in this struct
[INFO] [stdout] ...
[INFO] [stdout] 4482 |     correction_bytes: usize,
[INFO] [stdout]      |     ^^^^^^^^^^^^^^^^
[INFO] [stdout] 4483 |     scales_ptr: u64,
[INFO] [stdout] 4484 |     scales_bytes: usize,
[INFO] [stdout]      |     ^^^^^^^^^^^^
[INFO] [stdout] 4485 |     base_f32_ptr: u64,
[INFO] [stdout] 4486 |     base_f32_bytes: usize,
[INFO] [stdout]      |     ^^^^^^^^^^^^^^
[INFO] [stdout] 4487 |     output_rows_ptr: u64,
[INFO] [stdout] 4488 |     output_rows_bytes: usize,
[INFO] [stdout]      |     ^^^^^^^^^^^^^^^^^
[INFO] [stdout] 4489 |     groups_ptr: u64,
[INFO] [stdout]      |     ^^^^^^^^^^
[INFO] [stdout] 4490 |     groups_bytes: usize,
[INFO] [stdout]      |     ^^^^^^^^^^^^
[INFO] [stdout] 4491 |     start_cols_ptr: u64,
[INFO] [stdout] 4492 |     start_cols_bytes: usize,
[INFO] [stdout]      |     ^^^^^^^^^^^^^^^^
[INFO] [stdout] 4493 |     widths_ptr: u64,
[INFO] [stdout] 4494 |     widths_bytes: usize,
[INFO] [stdout]      |     ^^^^^^^^^^^^
[INFO] [stdout]      |
[INFO] [stdout]      = note: `HqqPrefillSidecarRegistration` has derived impls for the traits `Clone` and `Debug`, but these are intentionally ignored during dead code analysis
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: function `validate_hqq4_tensor_desc` is never used
[INFO] [stdout]     --> src/gpu_decode.rs:5118:4
[INFO] [stdout]      |
[INFO] [stdout] 5118 | fn validate_hqq4_tensor_desc(
[INFO] [stdout]      |    ^^^^^^^^^^^^^^^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: function `build_hqq_tensors` is never used
[INFO] [stdout]     --> src/gpu_decode.rs:5237:4
[INFO] [stdout]      |
[INFO] [stdout] 5237 | fn build_hqq_tensors(
[INFO] [stdout]      |    ^^^^^^^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: field `size` is never read
[INFO] [stdout]     --> src/gpu_decode.rs:6039:5
[INFO] [stdout]      |
[INFO] [stdout] 6036 | struct PinnedMapped {
[INFO] [stdout]      |        ------------ field in this struct
[INFO] [stdout] ...
[INFO] [stdout] 6039 |     size: usize,
[INFO] [stdout]      |     ^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: multiple fields are never read
[INFO] [stdout]     --> src/gpu_decode.rs:6084:5
[INFO] [stdout]      |
[INFO] [stdout] 6081 | struct CachedKernels {
[INFO] [stdout]      |        ------------- fields in this struct
[INFO] [stdout] ...
[INFO] [stdout] 6084 |     rmsnorm: cudarc::driver::CudaFunction,
[INFO] [stdout]      |     ^^^^^^^
[INFO] [stdout] ...
[INFO] [stdout] 6090 |     add_bf16: cudarc::driver::CudaFunction,
[INFO] [stdout]      |     ^^^^^^^^
[INFO] [stdout] 6091 |     weighted_add_bf16: cudarc::driver::CudaFunction,
[INFO] [stdout]      |     ^^^^^^^^^^^^^^^^^
[INFO] [stdout] ...
[INFO] [stdout] 6094 |     marlin_gemv_int4: cudarc::driver::CudaFunction,
[INFO] [stdout]      |     ^^^^^^^^^^^^^^^^
[INFO] [stdout] ...
[INFO] [stdout] 6136 |     apply_gated_attn: cudarc::driver::CudaFunction,
[INFO] [stdout]      |     ^^^^^^^^^^^^^^^^
[INFO] [stdout] 6137 |     // Fused v2 kernels with inline atomic reduction (no separate reduce kernel)
[INFO] [stdout] 6138 |     marlin_gemv_int4_v2_fused_f32: cudarc::driver::CudaFunction,
[INFO] [stdout]      |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[INFO] [stdout] 6139 |     marlin_gemv_int8_v2: cudarc::driver::CudaFunction,
[INFO] [stdout] 6140 |     marlin_gemv_int8_v2_fused_f32: cudarc::driver::CudaFunction,
[INFO] [stdout]      |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[INFO] [stdout] ...
[INFO] [stdout] 6160 |     gqa_attention_g: cudarc::driver::CudaFunction,
[INFO] [stdout]      |     ^^^^^^^^^^^^^^^
[INFO] [stdout] ...
[INFO] [stdout] 6163 |     gqa_attention_g_bf16: cudarc::driver::CudaFunction,
[INFO] [stdout]      |     ^^^^^^^^^^^^^^^^^^^^
[INFO] [stdout] ...
[INFO] [stdout] 6169 |     gqa_attention_k4v4_tiled_g: cudarc::driver::CudaFunction,
[INFO] [stdout]      |     ^^^^^^^^^^^^^^^^^^^^^^^^^^
[INFO] [stdout] ...
[INFO] [stdout] 6195 |     gqa_attention_polar4_g: cudarc::driver::CudaFunction,
[INFO] [stdout]      |     ^^^^^^^^^^^^^^^^^^^^^^
[INFO] [stdout]      |
[INFO] [stdout]      = note: `CachedKernels` has a derived impl for the trait `Clone`, but this is intentionally ignored during dead code analysis
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: multiple fields are never read
[INFO] [stdout]     --> src/gpu_decode.rs:6321:5
[INFO] [stdout]      |
[INFO] [stdout] 6210 | struct GpuDecodeGraph {
[INFO] [stdout]      |        -------------- fields in this struct
[INFO] [stdout] ...
[INFO] [stdout] 6321 |     d_batch_w13_packed_ptrs: cudarc::driver::CudaSlice<u64>,
[INFO] [stdout]      |     ^^^^^^^^^^^^^^^^^^^^^^^
[INFO] [stdout] 6322 |     d_batch_w13_scales_ptrs: cudarc::driver::CudaSlice<u64>,
[INFO] [stdout]      |     ^^^^^^^^^^^^^^^^^^^^^^^
[INFO] [stdout] 6323 |     d_batch_w2_packed_ptrs: cudarc::driver::CudaSlice<u64>,
[INFO] [stdout]      |     ^^^^^^^^^^^^^^^^^^^^^^
[INFO] [stdout] 6324 |     d_batch_w2_scales_ptrs: cudarc::driver::CudaSlice<u64>,
[INFO] [stdout]      |     ^^^^^^^^^^^^^^^^^^^^^^
[INFO] [stdout] 6325 |     // Device array for batched routing weights: [max_experts_per_tok] FP32
[INFO] [stdout] 6326 |     d_batch_weights: cudarc::driver::CudaSlice<f32>,
[INFO] [stdout]      |     ^^^^^^^^^^^^^^^
[INFO] [stdout] ...
[INFO] [stdout] 6337 |     batch_upload_ptrs_bytes: usize, // 4 * max * 8
[INFO] [stdout]      |     ^^^^^^^^^^^^^^^^^^^^^^^
[INFO] [stdout] 6338 |     batch_upload_total_bytes: usize, // 4 * max * 8 + max * 4
[INFO] [stdout]      |     ^^^^^^^^^^^^^^^^^^^^^^^^
[INFO] [stdout] ...
[INFO] [stdout] 6441 |     d_gqa_gate_buf: Option<cudarc::driver::CudaSlice<f32>>,
[INFO] [stdout]      |     ^^^^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: field `hqq_bf16_weight_bufs` is never read
[INFO] [stdout]     --> src/gpu_decode.rs:7473:5
[INFO] [stdout]      |
[INFO] [stdout] 7376 | pub struct GpuDecodeStore {
[INFO] [stdout]      |            -------------- field in this struct
[INFO] [stdout] ...
[INFO] [stdout] 7473 |     hqq_bf16_weight_bufs: Vec<CudaSlice<u16>>,
[INFO] [stdout]      |     ^^^^^^^^^^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: methods `hqq4_dequant_to_bf16` and `materialize_hqq_gqa_decode_weights` are never used
[INFO] [stdout]     --> src/gpu_decode.rs:8845:8
[INFO] [stdout]      |
[INFO] [stdout] 7486 | impl GpuDecodeStore {
[INFO] [stdout]      | ------------------- methods in this implementation
[INFO] [stdout] ...
[INFO] [stdout] 8845 |     fn hqq4_dequant_to_bf16(
[INFO] [stdout]      |        ^^^^^^^^^^^^^^^^^^^^
[INFO] [stdout] ...
[INFO] [stdout] 9373 |     fn materialize_hqq_gqa_decode_weights(&mut self, layer_idx: usize) -> Result<(), String> {
[INFO] [stdout]      |        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: function `diag_compute_moe_accum_last_cpu_from_sorted_rows` is never used
[INFO] [stdout]     --> src/gpu_prefill.rs:1572:4
[INFO] [stdout]      |
[INFO] [stdout] 1572 | fn diag_compute_moe_accum_last_cpu_from_sorted_rows(
[INFO] [stdout]      |    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: constant `MARLIN_MAX_BLOCKS_PER_SM` is never used
[INFO] [stdout]     --> src/gpu_prefill.rs:2000:7
[INFO] [stdout]      |
[INFO] [stdout] 2000 | const MARLIN_MAX_BLOCKS_PER_SM: usize = 4;
[INFO] [stdout]      |       ^^^^^^^^^^^^^^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: function `extract_cu_func` is never used
[INFO] [stdout]     --> src/gpu_prefill.rs:2314:4
[INFO] [stdout]      |
[INFO] [stdout] 2314 | fn extract_cu_func(func: &CudaFunction) -> RawCuFunc {
[INFO] [stdout]      |    ^^^^^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: multiple fields are never read
[INFO] [stdout]     --> src/gpu_prefill.rs:2391:5
[INFO] [stdout]      |
[INFO] [stdout] 2367 | pub struct PrefillKernels {
[INFO] [stdout]      |            -------------- fields in this struct
[INFO] [stdout] ...
[INFO] [stdout] 2391 |     moe_sum_reduce: RawCuFunc,
[INFO] [stdout]      |     ^^^^^^^^^^^^^^
[INFO] [stdout] 2392 |     gqa_prefill: RawCuFunc,
[INFO] [stdout]      |     ^^^^^^^^^^^
[INFO] [stdout] ...
[INFO] [stdout] 2414 |     causal_conv1d: RawCuFunc,
[INFO] [stdout]      |     ^^^^^^^^^^^^^
[INFO] [stdout] ...
[INFO] [stdout] 2423 |     fp32_to_bf16_batch: RawCuFunc,
[INFO] [stdout]      |     ^^^^^^^^^^^^^^^^^^
[INFO] [stdout] ...
[INFO] [stdout] 2436 |     la_compute_v_new: RawCuFunc,
[INFO] [stdout]      |     ^^^^^^^^^^^^^^^^
[INFO] [stdout] 2437 |     la_chunk_output: RawCuFunc,
[INFO] [stdout]      |     ^^^^^^^^^^^^^^^
[INFO] [stdout] 2438 |     la_state_update: RawCuFunc,
[INFO] [stdout]      |     ^^^^^^^^^^^^^^^
[INFO] [stdout] ...
[INFO] [stdout] 2442 |     la_fp32_to_bf16: RawCuFunc,
[INFO] [stdout]      |     ^^^^^^^^^^^^^^^
[INFO] [stdout] ...
[INFO] [stdout] 2447 |     transpose_3d_021_bf16: RawCuFunc, // [A,B,C] -> [B,A,C] BF16
[INFO] [stdout]      |     ^^^^^^^^^^^^^^^^^^^^^
[INFO] [stdout] ...
[INFO] [stdout] 2470 |     moe_gather_sorted: RawCuFunc,
[INFO] [stdout]      |     ^^^^^^^^^^^^^^^^^
[INFO] [stdout] 2471 |     moe_replicate_hidden: RawCuFunc,
[INFO] [stdout] 2472 |     moe_scatter_fused: RawCuFunc,
[INFO] [stdout]      |     ^^^^^^^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: methods `diag_compute_la_gated_rmsnorm_last_cpu`, `diag_compute_fla_g_cumsum_last_cpu_from_gate_f32`, `upload_tokens`, and `forward_gqa` are never used
[INFO] [stdout]      --> src/gpu_prefill.rs:7094:8
[INFO] [stdout]       |
[INFO] [stdout]  3437 | impl PrefillEngine {
[INFO] [stdout]       | ------------------ methods in this implementation
[INFO] [stdout] ...
[INFO] [stdout]  7094 |     fn diag_compute_la_gated_rmsnorm_last_cpu(
[INFO] [stdout]       |        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[INFO] [stdout] ...
[INFO] [stdout]  7600 |     fn diag_compute_fla_g_cumsum_last_cpu_from_gate_f32(
[INFO] [stdout]       |        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[INFO] [stdout] ...
[INFO] [stdout] 11376 |     fn upload_tokens(&self, token_ids: &[u32]) -> Result<(), String> {
[INFO] [stdout]       |        ^^^^^^^^^^^^^
[INFO] [stdout] ...
[INFO] [stdout] 13292 |     fn forward_gqa(&self, layer_idx: usize, m: usize) -> Result<(), String> {
[INFO] [stdout]       |        ^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: type alias `SidecarAbiVersionFn` is never used
[INFO] [stdout]      --> src/gpu_prefill.rs:31058:6
[INFO] [stdout]       |
[INFO] [stdout] 31058 | type SidecarAbiVersionFn = unsafe extern "C" fn() -> u32;
[INFO] [stdout]       |      ^^^^^^^^^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: type alias `SidecarBuildIdFn` is never used
[INFO] [stdout]      --> src/gpu_prefill.rs:31059:6
[INFO] [stdout]       |
[INFO] [stdout] 31059 | type SidecarBuildIdFn = unsafe extern "C" fn() -> *const libc::c_char;
[INFO] [stdout]       |      ^^^^^^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: function `expected_sidecar_abi_version` is never used
[INFO] [stdout]      --> src/gpu_prefill.rs:31061:4
[INFO] [stdout]       |
[INFO] [stdout] 31061 | fn expected_sidecar_abi_version() -> u32 {
[INFO] [stdout]       |    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: function `sidecar_manifest_candidates` is never used
[INFO] [stdout]      --> src/gpu_prefill.rs:31067:4
[INFO] [stdout]       |
[INFO] [stdout] 31067 | fn sidecar_manifest_candidates(path: &str) -> Vec<PathBuf> {
[INFO] [stdout]       |    ^^^^^^^^^^^^^^^^^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: function `manifest_build_id_for_sidecar` is never used
[INFO] [stdout]      --> src/gpu_prefill.rs:31082:4
[INFO] [stdout]       |
[INFO] [stdout] 31082 | fn manifest_build_id_for_sidecar(path: &str, manifest_key: &str) -> Result<String, String> {
[INFO] [stdout]       |    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: function `verify_sidecar_abi` is never used
[INFO] [stdout]      --> src/gpu_prefill.rs:31130:11
[INFO] [stdout]       |
[INFO] [stdout] 31130 | unsafe fn verify_sidecar_abi(
[INFO] [stdout]       |           ^^^^^^^^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: function `load_marlin_mm` is never used
[INFO] [stdout]      --> src/gpu_prefill.rs:31197:4
[INFO] [stdout]       |
[INFO] [stdout] 31197 | fn load_marlin_mm() -> Option<MarlinMmFn> {
[INFO] [stdout]       |    ^^^^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: function `load_fused_moe` is never used
[INFO] [stdout]      --> src/gpu_prefill.rs:31228:4
[INFO] [stdout]       |
[INFO] [stdout] 31228 | fn load_fused_moe() -> Option<FusedMoeFn> {
[INFO] [stdout]       |    ^^^^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: function `load_fused_moe_scatter` is never used
[INFO] [stdout]      --> src/gpu_prefill.rs:31259:4
[INFO] [stdout]       |
[INFO] [stdout] 31259 | fn load_fused_moe_scatter() -> Option<FusedMoeScatterFn> {
[INFO] [stdout]       |    ^^^^^^^^^^^^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: function `load_flash_attn` is never used
[INFO] [stdout]      --> src/gpu_prefill.rs:31288:4
[INFO] [stdout]       |
[INFO] [stdout] 31288 | fn load_flash_attn() -> Option<FlashAttnFwdFn> {
[INFO] [stdout]       |    ^^^^^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: function `load_flash_attn_fp8kv` is never used
[INFO] [stdout]      --> src/gpu_prefill.rs:31332:4
[INFO] [stdout]       |
[INFO] [stdout] 31332 | fn load_flash_attn_fp8kv() -> Option<FlashAttnFwdFn> {
[INFO] [stdout]       |    ^^^^^^^^^^^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: function `find_marlin_so` is never used
[INFO] [stdout]      --> src/gpu_prefill.rs:36467:4
[INFO] [stdout]       |
[INFO] [stdout] 36467 | fn find_marlin_so() -> Option<String> {
[INFO] [stdout]       |    ^^^^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: function `cpu_expert_byte_sizes_mixed` is never used
[INFO] [stdout]     --> src/weights/mod.rs:1632:4
[INFO] [stdout]      |
[INFO] [stdout] 1632 | fn cpu_expert_byte_sizes_mixed(h: usize, m: usize, group_size: usize, w13_bits: u8, w2_bits: u8) -> (usize, usize, usize, usize) {
[INFO] [stdout]      |    ^^^^^^^^^^^^^^^^^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: function `read_marlin_expert` is never used
[INFO] [stdout]     --> src/weights/mod.rs:5579:4
[INFO] [stdout]      |
[INFO] [stdout] 5579 | fn read_marlin_expert(
[INFO] [stdout]      |    ^^^^^^^^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: function `read_unified_expert_cpu` is never used
[INFO] [stdout]     --> src/weights/mod.rs:5700:4
[INFO] [stdout]      |
[INFO] [stdout] 5700 | fn read_unified_expert_cpu(
[INFO] [stdout]      |    ^^^^^^^^^^^^^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: function `read_unified_expert_cpu_mixed` is never used
[INFO] [stdout]     --> src/weights/mod.rs:5803:4
[INFO] [stdout]      |
[INFO] [stdout] 5803 | fn read_unified_expert_cpu_mixed(
[INFO] [stdout]      |    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: structure field `mamba2_A` should have a snake case name
[INFO] [stdout]     --> src/gpu_prefill.rs:2969:9
[INFO] [stdout]      |
[INFO] [stdout] 2969 |     pub mamba2_A: u64,
[INFO] [stdout]      |         ^^^^^^^^ help: convert the identifier to snake case: `mamba2_a`
[INFO] [stdout]      |
[INFO] [stdout]      = note: `#[warn(non_snake_case)]` (part of `#[warn(nonstandard_style)]`) on by default
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: structure field `mamba2_D` should have a snake case name
[INFO] [stdout]     --> src/gpu_prefill.rs:2971:9
[INFO] [stdout]      |
[INFO] [stdout] 2971 |     pub mamba2_D: u64,
[INFO] [stdout]      |         ^^^^^^^^ help: convert the identifier to snake case: `mamba2_d`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: value assigned to `effective_gs` is never read
[INFO] [stdout]     --> src/weights/mod.rs:2030:13
[INFO] [stdout]      |
[INFO] [stdout] 2030 |             effective_gs = 0;
[INFO] [stdout]      |             ^^^^^^^^^^^^^^^^
[INFO] [stdout]      |
[INFO] [stdout]      = help: maybe it is overwritten before being read?
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused variable: `total_moe_layers`
[INFO] [stdout]     --> src/weights/mod.rs:2761:9
[INFO] [stdout]      |
[INFO] [stdout] 2761 |         total_moe_layers: usize,
[INFO] [stdout]      |         ^^^^^^^^^^^^^^^^ help: if this is intentional, prefix it with an underscore: `_total_moe_layers`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: type `PrefillTraceConfig` is more private than the item `gpu_prefill::PrefillEngine::trace`
[INFO] [stdout]     --> src/gpu_prefill.rs:3140:5
[INFO] [stdout]      |
[INFO] [stdout] 3140 |     pub trace: Option<PrefillTraceConfig>,
[INFO] [stdout]      |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ field `gpu_prefill::PrefillEngine::trace` is reachable at visibility `pub`
[INFO] [stdout]      |
[INFO] [stdout] note: but type `PrefillTraceConfig` is only usable at visibility `pub(crate)`
[INFO] [stdout]     --> src/gpu_prefill.rs:74:1
[INFO] [stdout]      |
[INFO] [stdout]   74 | pub(crate) struct PrefillTraceConfig {
[INFO] [stdout]      | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[INFO] [stdout]      = note: `#[warn(private_interfaces)]` on by default
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: function `prefetch_weight_nta` is never used
[INFO] [stdout]     --> src/decode.rs:1962:4
[INFO] [stdout]      |
[INFO] [stdout] 1962 | fn prefetch_weight_nta(w: &TransposedWeight) {
[INFO] [stdout]      |    ^^^^^^^^^^^^^^^^^^^
[INFO] [stdout]      |
[INFO] [stdout]      = note: `#[warn(dead_code)]` (part of `#[warn(unused)]`) on by default
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: function `bf16_to_f32` is never used
[INFO] [stdout]     --> src/decode.rs:1989:4
[INFO] [stdout]      |
[INFO] [stdout] 1989 | fn bf16_to_f32(x: u16) -> f32 {
[INFO] [stdout]      |    ^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: struct `Xorshift64` is never constructed
[INFO] [stdout]     --> src/decode.rs:4917:8
[INFO] [stdout]      |
[INFO] [stdout] 4917 | struct Xorshift64(u64);
[INFO] [stdout]      |        ^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: associated items `new`, `next_u64`, `next_u32`, and `next_f32` are never used
[INFO] [stdout]     --> src/decode.rs:4920:8
[INFO] [stdout]      |
[INFO] [stdout] 4919 | impl Xorshift64 {
[INFO] [stdout]      | --------------- associated items in this implementation
[INFO] [stdout] 4920 |     fn new(seed: u64) -> Self { Self(if seed == 0 { 0xDEADBEEF } else { seed }) }
[INFO] [stdout]      |        ^^^
[INFO] [stdout] 4921 |     #[inline]
[INFO] [stdout] 4922 |     fn next_u64(&mut self) -> u64 {
[INFO] [stdout]      |        ^^^^^^^^
[INFO] [stdout] ...
[INFO] [stdout] 4929 |     fn next_u32(&mut self) -> u32 { self.next_u64() as u32 }
[INFO] [stdout]      |        ^^^^^^^^
[INFO] [stdout] ...
[INFO] [stdout] 4932 |     fn next_f32(&mut self, scale: f32) -> f32 {
[INFO] [stdout]      |        ^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: function `fill_random_u32` is never used
[INFO] [stdout]     --> src/decode.rs:4940:4
[INFO] [stdout]      |
[INFO] [stdout] 4940 | fn fill_random_u32(v: &mut [u32], rng: &mut Xorshift64) {
[INFO] [stdout]      |    ^^^^^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: function `fill_random_scales_u16` is never used
[INFO] [stdout]     --> src/decode.rs:4947:4
[INFO] [stdout]      |
[INFO] [stdout] 4947 | fn fill_random_scales_u16(v: &mut [u16], rng: &mut Xorshift64) {
[INFO] [stdout]      |    ^^^^^^^^^^^^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: function `fill_random_f32` is never used
[INFO] [stdout]     --> src/decode.rs:4956:4
[INFO] [stdout]      |
[INFO] [stdout] 4956 | fn fill_random_f32(v: &mut [f32], rng: &mut Xorshift64, scale: f32) {
[INFO] [stdout]      |    ^^^^^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: function `fill_random_u16` is never used
[INFO] [stdout]     --> src/decode.rs:4963:4
[INFO] [stdout]      |
[INFO] [stdout] 4963 | fn fill_random_u16(v: &mut [u16], rng: &mut Xorshift64) {
[INFO] [stdout]      |    ^^^^^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: function `hint_hugepages` is never used
[INFO] [stdout]     --> src/decode.rs:4977:4
[INFO] [stdout]      |
[INFO] [stdout] 4977 | fn hint_hugepages<T>(v: &mut [T]) {
[INFO] [stdout]      |    ^^^^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: function `fake_transposed_weight` is never used
[INFO] [stdout]     --> src/decode.rs:5041:4
[INFO] [stdout]      |
[INFO] [stdout] 5041 | fn fake_transposed_weight(rows: usize, cols: usize, group_size: usize, num_bits: u8, rng: &mut Xorshift64) -> TransposedWeight {
[INFO] [stdout]      |    ^^^^^^^^^^^^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: constant `KERNEL_NAMES` is never used
[INFO] [stdout]     --> src/gpu_decode.rs:2739:7
[INFO] [stdout]      |
[INFO] [stdout] 2739 | const KERNEL_NAMES: &[&str] = &[
[INFO] [stdout]      |       ^^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: field `buf_size` is never read
[INFO] [stdout]     --> src/gpu_decode.rs:2886:5
[INFO] [stdout]      |
[INFO] [stdout] 2882 | struct PrefetchSlot {
[INFO] [stdout]      |        ------------ field in this struct
[INFO] [stdout] ...
[INFO] [stdout] 2886 |     buf_size: usize,
[INFO] [stdout]      |     ^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: method `decode_kb_per_tok` is never used
[INFO] [stdout]     --> src/gpu_decode.rs:3055:8
[INFO] [stdout]      |
[INFO] [stdout] 2960 | impl VramCalibration {
[INFO] [stdout]      | -------------------- method in this implementation
[INFO] [stdout] ...
[INFO] [stdout] 3055 |     fn decode_kb_per_tok(&self) -> f64 {
[INFO] [stdout]      |        ^^^^^^^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: field `pool_slot` is never read
[INFO] [stdout]     --> src/gpu_decode.rs:3186:5
[INFO] [stdout]      |
[INFO] [stdout] 3170 | struct HcsCacheEntry {
[INFO] [stdout]      |        ------------- field in this struct
[INFO] [stdout] ...
[INFO] [stdout] 3186 |     pool_slot: Option<usize>,
[INFO] [stdout]      |     ^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: method `is_marlin` is never used
[INFO] [stdout]     --> src/gpu_decode.rs:4447:8
[INFO] [stdout]      |
[INFO] [stdout] 4406 | impl GpuWeight {
[INFO] [stdout]      | -------------- method in this implementation
[INFO] [stdout] ...
[INFO] [stdout] 4447 |     fn is_marlin(&self) -> bool {
[INFO] [stdout]      |        ^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: multiple fields are never read
[INFO] [stdout]     --> src/gpu_decode.rs:4482:5
[INFO] [stdout]      |
[INFO] [stdout] 4477 | struct HqqPrefillSidecarRegistration {
[INFO] [stdout]      |        ----------------------------- fields in this struct
[INFO] [stdout] ...
[INFO] [stdout] 4482 |     correction_bytes: usize,
[INFO] [stdout]      |     ^^^^^^^^^^^^^^^^
[INFO] [stdout] 4483 |     scales_ptr: u64,
[INFO] [stdout] 4484 |     scales_bytes: usize,
[INFO] [stdout]      |     ^^^^^^^^^^^^
[INFO] [stdout] 4485 |     base_f32_ptr: u64,
[INFO] [stdout] 4486 |     base_f32_bytes: usize,
[INFO] [stdout]      |     ^^^^^^^^^^^^^^
[INFO] [stdout] 4487 |     output_rows_ptr: u64,
[INFO] [stdout] 4488 |     output_rows_bytes: usize,
[INFO] [stdout]      |     ^^^^^^^^^^^^^^^^^
[INFO] [stdout] 4489 |     groups_ptr: u64,
[INFO] [stdout]      |     ^^^^^^^^^^
[INFO] [stdout] 4490 |     groups_bytes: usize,
[INFO] [stdout]      |     ^^^^^^^^^^^^
[INFO] [stdout] 4491 |     start_cols_ptr: u64,
[INFO] [stdout] 4492 |     start_cols_bytes: usize,
[INFO] [stdout]      |     ^^^^^^^^^^^^^^^^
[INFO] [stdout] 4493 |     widths_ptr: u64,
[INFO] [stdout] 4494 |     widths_bytes: usize,
[INFO] [stdout]      |     ^^^^^^^^^^^^
[INFO] [stdout]      |
[INFO] [stdout]      = note: `HqqPrefillSidecarRegistration` has derived impls for the traits `Clone` and `Debug`, but these are intentionally ignored during dead code analysis
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: function `validate_hqq4_tensor_desc` is never used
[INFO] [stdout]     --> src/gpu_decode.rs:5118:4
[INFO] [stdout]      |
[INFO] [stdout] 5118 | fn validate_hqq4_tensor_desc(
[INFO] [stdout]      |    ^^^^^^^^^^^^^^^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: function `build_hqq_tensors` is never used
[INFO] [stdout]     --> src/gpu_decode.rs:5237:4
[INFO] [stdout]      |
[INFO] [stdout] 5237 | fn build_hqq_tensors(
[INFO] [stdout]      |    ^^^^^^^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: field `size` is never read
[INFO] [stdout]     --> src/gpu_decode.rs:6039:5
[INFO] [stdout]      |
[INFO] [stdout] 6036 | struct PinnedMapped {
[INFO] [stdout]      |        ------------ field in this struct
[INFO] [stdout] ...
[INFO] [stdout] 6039 |     size: usize,
[INFO] [stdout]      |     ^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: multiple fields are never read
[INFO] [stdout]     --> src/gpu_decode.rs:6084:5
[INFO] [stdout]      |
[INFO] [stdout] 6081 | struct CachedKernels {
[INFO] [stdout]      |        ------------- fields in this struct
[INFO] [stdout] ...
[INFO] [stdout] 6084 |     rmsnorm: cudarc::driver::CudaFunction,
[INFO] [stdout]      |     ^^^^^^^
[INFO] [stdout] ...
[INFO] [stdout] 6090 |     add_bf16: cudarc::driver::CudaFunction,
[INFO] [stdout]      |     ^^^^^^^^
[INFO] [stdout] 6091 |     weighted_add_bf16: cudarc::driver::CudaFunction,
[INFO] [stdout]      |     ^^^^^^^^^^^^^^^^^
[INFO] [stdout] ...
[INFO] [stdout] 6094 |     marlin_gemv_int4: cudarc::driver::CudaFunction,
[INFO] [stdout]      |     ^^^^^^^^^^^^^^^^
[INFO] [stdout] ...
[INFO] [stdout] 6136 |     apply_gated_attn: cudarc::driver::CudaFunction,
[INFO] [stdout]      |     ^^^^^^^^^^^^^^^^
[INFO] [stdout] 6137 |     // Fused v2 kernels with inline atomic reduction (no separate reduce kernel)
[INFO] [stdout] 6138 |     marlin_gemv_int4_v2_fused_f32: cudarc::driver::CudaFunction,
[INFO] [stdout]      |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[INFO] [stdout] 6139 |     marlin_gemv_int8_v2: cudarc::driver::CudaFunction,
[INFO] [stdout] 6140 |     marlin_gemv_int8_v2_fused_f32: cudarc::driver::CudaFunction,
[INFO] [stdout]      |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[INFO] [stdout] ...
[INFO] [stdout] 6160 |     gqa_attention_g: cudarc::driver::CudaFunction,
[INFO] [stdout]      |     ^^^^^^^^^^^^^^^
[INFO] [stdout] ...
[INFO] [stdout] 6163 |     gqa_attention_g_bf16: cudarc::driver::CudaFunction,
[INFO] [stdout]      |     ^^^^^^^^^^^^^^^^^^^^
[INFO] [stdout] ...
[INFO] [stdout] 6169 |     gqa_attention_k4v4_tiled_g: cudarc::driver::CudaFunction,
[INFO] [stdout]      |     ^^^^^^^^^^^^^^^^^^^^^^^^^^
[INFO] [stdout] ...
[INFO] [stdout] 6195 |     gqa_attention_polar4_g: cudarc::driver::CudaFunction,
[INFO] [stdout]      |     ^^^^^^^^^^^^^^^^^^^^^^
[INFO] [stdout]      |
[INFO] [stdout]      = note: `CachedKernels` has a derived impl for the trait `Clone`, but this is intentionally ignored during dead code analysis
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: multiple fields are never read
[INFO] [stdout]     --> src/gpu_decode.rs:6321:5
[INFO] [stdout]      |
[INFO] [stdout] 6210 | struct GpuDecodeGraph {
[INFO] [stdout]      |        -------------- fields in this struct
[INFO] [stdout] ...
[INFO] [stdout] 6321 |     d_batch_w13_packed_ptrs: cudarc::driver::CudaSlice<u64>,
[INFO] [stdout]      |     ^^^^^^^^^^^^^^^^^^^^^^^
[INFO] [stdout] 6322 |     d_batch_w13_scales_ptrs: cudarc::driver::CudaSlice<u64>,
[INFO] [stdout]      |     ^^^^^^^^^^^^^^^^^^^^^^^
[INFO] [stdout] 6323 |     d_batch_w2_packed_ptrs: cudarc::driver::CudaSlice<u64>,
[INFO] [stdout]      |     ^^^^^^^^^^^^^^^^^^^^^^
[INFO] [stdout] 6324 |     d_batch_w2_scales_ptrs: cudarc::driver::CudaSlice<u64>,
[INFO] [stdout]      |     ^^^^^^^^^^^^^^^^^^^^^^
[INFO] [stdout] 6325 |     // Device array for batched routing weights: [max_experts_per_tok] FP32
[INFO] [stdout] 6326 |     d_batch_weights: cudarc::driver::CudaSlice<f32>,
[INFO] [stdout]      |     ^^^^^^^^^^^^^^^
[INFO] [stdout] ...
[INFO] [stdout] 6337 |     batch_upload_ptrs_bytes: usize, // 4 * max * 8
[INFO] [stdout]      |     ^^^^^^^^^^^^^^^^^^^^^^^
[INFO] [stdout] 6338 |     batch_upload_total_bytes: usize, // 4 * max * 8 + max * 4
[INFO] [stdout]      |     ^^^^^^^^^^^^^^^^^^^^^^^^
[INFO] [stdout] ...
[INFO] [stdout] 6441 |     d_gqa_gate_buf: Option<cudarc::driver::CudaSlice<f32>>,
[INFO] [stdout]      |     ^^^^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: field `hqq_bf16_weight_bufs` is never read
[INFO] [stdout]     --> src/gpu_decode.rs:7473:5
[INFO] [stdout]      |
[INFO] [stdout] 7376 | pub struct GpuDecodeStore {
[INFO] [stdout]      |            -------------- field in this struct
[INFO] [stdout] ...
[INFO] [stdout] 7473 |     hqq_bf16_weight_bufs: Vec<CudaSlice<u16>>,
[INFO] [stdout]      |     ^^^^^^^^^^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: methods `hqq4_dequant_to_bf16` and `materialize_hqq_gqa_decode_weights` are never used
[INFO] [stdout]     --> src/gpu_decode.rs:8845:8
[INFO] [stdout]      |
[INFO] [stdout] 7486 | impl GpuDecodeStore {
[INFO] [stdout]      | ------------------- methods in this implementation
[INFO] [stdout] ...
[INFO] [stdout] 8845 |     fn hqq4_dequant_to_bf16(
[INFO] [stdout]      |        ^^^^^^^^^^^^^^^^^^^^
[INFO] [stdout] ...
[INFO] [stdout] 9373 |     fn materialize_hqq_gqa_decode_weights(&mut self, layer_idx: usize) -> Result<(), String> {
[INFO] [stdout]      |        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: function `diag_compute_moe_accum_last_cpu_from_sorted_rows` is never used
[INFO] [stdout]     --> src/gpu_prefill.rs:1572:4
[INFO] [stdout]      |
[INFO] [stdout] 1572 | fn diag_compute_moe_accum_last_cpu_from_sorted_rows(
[INFO] [stdout]      |    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: constant `MARLIN_MAX_BLOCKS_PER_SM` is never used
[INFO] [stdout]     --> src/gpu_prefill.rs:2000:7
[INFO] [stdout]      |
[INFO] [stdout] 2000 | const MARLIN_MAX_BLOCKS_PER_SM: usize = 4;
[INFO] [stdout]      |       ^^^^^^^^^^^^^^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: function `extract_cu_func` is never used
[INFO] [stdout]     --> src/gpu_prefill.rs:2314:4
[INFO] [stdout]      |
[INFO] [stdout] 2314 | fn extract_cu_func(func: &CudaFunction) -> RawCuFunc {
[INFO] [stdout]      |    ^^^^^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: multiple fields are never read
[INFO] [stdout]     --> src/gpu_prefill.rs:2391:5
[INFO] [stdout]      |
[INFO] [stdout] 2367 | pub struct PrefillKernels {
[INFO] [stdout]      |            -------------- fields in this struct
[INFO] [stdout] ...
[INFO] [stdout] 2391 |     moe_sum_reduce: RawCuFunc,
[INFO] [stdout]      |     ^^^^^^^^^^^^^^
[INFO] [stdout] 2392 |     gqa_prefill: RawCuFunc,
[INFO] [stdout]      |     ^^^^^^^^^^^
[INFO] [stdout] ...
[INFO] [stdout] 2414 |     causal_conv1d: RawCuFunc,
[INFO] [stdout]      |     ^^^^^^^^^^^^^
[INFO] [stdout] ...
[INFO] [stdout] 2423 |     fp32_to_bf16_batch: RawCuFunc,
[INFO] [stdout]      |     ^^^^^^^^^^^^^^^^^^
[INFO] [stdout] ...
[INFO] [stdout] 2436 |     la_compute_v_new: RawCuFunc,
[INFO] [stdout]      |     ^^^^^^^^^^^^^^^^
[INFO] [stdout] 2437 |     la_chunk_output: RawCuFunc,
[INFO] [stdout]      |     ^^^^^^^^^^^^^^^
[INFO] [stdout] 2438 |     la_state_update: RawCuFunc,
[INFO] [stdout]      |     ^^^^^^^^^^^^^^^
[INFO] [stdout] ...
[INFO] [stdout] 2442 |     la_fp32_to_bf16: RawCuFunc,
[INFO] [stdout]      |     ^^^^^^^^^^^^^^^
[INFO] [stdout] ...
[INFO] [stdout] 2447 |     transpose_3d_021_bf16: RawCuFunc, // [A,B,C] -> [B,A,C] BF16
[INFO] [stdout]      |     ^^^^^^^^^^^^^^^^^^^^^
[INFO] [stdout] ...
[INFO] [stdout] 2470 |     moe_gather_sorted: RawCuFunc,
[INFO] [stdout]      |     ^^^^^^^^^^^^^^^^^
[INFO] [stdout] 2471 |     moe_replicate_hidden: RawCuFunc,
[INFO] [stdout] 2472 |     moe_scatter_fused: RawCuFunc,
[INFO] [stdout]      |     ^^^^^^^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: methods `diag_compute_la_gated_rmsnorm_last_cpu`, `diag_compute_fla_g_cumsum_last_cpu_from_gate_f32`, `upload_tokens`, and `forward_gqa` are never used
[INFO] [stdout]      --> src/gpu_prefill.rs:7094:8
[INFO] [stdout]       |
[INFO] [stdout]  3437 | impl PrefillEngine {
[INFO] [stdout]       | ------------------ methods in this implementation
[INFO] [stdout] ...
[INFO] [stdout]  7094 |     fn diag_compute_la_gated_rmsnorm_last_cpu(
[INFO] [stdout]       |        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[INFO] [stdout] ...
[INFO] [stdout]  7600 |     fn diag_compute_fla_g_cumsum_last_cpu_from_gate_f32(
[INFO] [stdout]       |        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[INFO] [stdout] ...
[INFO] [stdout] 11376 |     fn upload_tokens(&self, token_ids: &[u32]) -> Result<(), String> {
[INFO] [stdout]       |        ^^^^^^^^^^^^^
[INFO] [stdout] ...
[INFO] [stdout] 13292 |     fn forward_gqa(&self, layer_idx: usize, m: usize) -> Result<(), String> {
[INFO] [stdout]       |        ^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: type alias `SidecarAbiVersionFn` is never used
[INFO] [stdout]      --> src/gpu_prefill.rs:31058:6
[INFO] [stdout]       |
[INFO] [stdout] 31058 | type SidecarAbiVersionFn = unsafe extern "C" fn() -> u32;
[INFO] [stdout]       |      ^^^^^^^^^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: type alias `SidecarBuildIdFn` is never used
[INFO] [stdout]      --> src/gpu_prefill.rs:31059:6
[INFO] [stdout]       |
[INFO] [stdout] 31059 | type SidecarBuildIdFn = unsafe extern "C" fn() -> *const libc::c_char;
[INFO] [stdout]       |      ^^^^^^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: function `expected_sidecar_abi_version` is never used
[INFO] [stdout]      --> src/gpu_prefill.rs:31061:4
[INFO] [stdout]       |
[INFO] [stdout] 31061 | fn expected_sidecar_abi_version() -> u32 {
[INFO] [stdout]       |    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: function `sidecar_manifest_candidates` is never used
[INFO] [stdout]      --> src/gpu_prefill.rs:31067:4
[INFO] [stdout]       |
[INFO] [stdout] 31067 | fn sidecar_manifest_candidates(path: &str) -> Vec<PathBuf> {
[INFO] [stdout]       |    ^^^^^^^^^^^^^^^^^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: function `manifest_build_id_for_sidecar` is never used
[INFO] [stdout]      --> src/gpu_prefill.rs:31082:4
[INFO] [stdout]       |
[INFO] [stdout] 31082 | fn manifest_build_id_for_sidecar(path: &str, manifest_key: &str) -> Result<String, String> {
[INFO] [stdout]       |    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: function `verify_sidecar_abi` is never used
[INFO] [stdout]      --> src/gpu_prefill.rs:31130:11
[INFO] [stdout]       |
[INFO] [stdout] 31130 | unsafe fn verify_sidecar_abi(
[INFO] [stdout]       |           ^^^^^^^^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: function `load_marlin_mm` is never used
[INFO] [stdout]      --> src/gpu_prefill.rs:31197:4
[INFO] [stdout]       |
[INFO] [stdout] 31197 | fn load_marlin_mm() -> Option<MarlinMmFn> {
[INFO] [stdout]       |    ^^^^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: function `load_fused_moe` is never used
[INFO] [stdout]      --> src/gpu_prefill.rs:31228:4
[INFO] [stdout]       |
[INFO] [stdout] 31228 | fn load_fused_moe() -> Option<FusedMoeFn> {
[INFO] [stdout]       |    ^^^^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: function `load_fused_moe_scatter` is never used
[INFO] [stdout]      --> src/gpu_prefill.rs:31259:4
[INFO] [stdout]       |
[INFO] [stdout] 31259 | fn load_fused_moe_scatter() -> Option<FusedMoeScatterFn> {
[INFO] [stdout]       |    ^^^^^^^^^^^^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: function `load_flash_attn` is never used
[INFO] [stdout]      --> src/gpu_prefill.rs:31288:4
[INFO] [stdout]       |
[INFO] [stdout] 31288 | fn load_flash_attn() -> Option<FlashAttnFwdFn> {
[INFO] [stdout]       |    ^^^^^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: function `load_flash_attn_fp8kv` is never used
[INFO] [stdout]      --> src/gpu_prefill.rs:31332:4
[INFO] [stdout]       |
[INFO] [stdout] 31332 | fn load_flash_attn_fp8kv() -> Option<FlashAttnFwdFn> {
[INFO] [stdout]       |    ^^^^^^^^^^^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: function `find_marlin_so` is never used
[INFO] [stdout]      --> src/gpu_prefill.rs:36467:4
[INFO] [stdout]       |
[INFO] [stdout] 36467 | fn find_marlin_so() -> Option<String> {
[INFO] [stdout]       |    ^^^^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: function `cpu_expert_byte_sizes_mixed` is never used
[INFO] [stdout]     --> src/weights/mod.rs:1632:4
[INFO] [stdout]      |
[INFO] [stdout] 1632 | fn cpu_expert_byte_sizes_mixed(h: usize, m: usize, group_size: usize, w13_bits: u8, w2_bits: u8) -> (usize, usize, usize, usize) {
[INFO] [stdout]      |    ^^^^^^^^^^^^^^^^^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: function `read_marlin_expert` is never used
[INFO] [stdout]     --> src/weights/mod.rs:5579:4
[INFO] [stdout]      |
[INFO] [stdout] 5579 | fn read_marlin_expert(
[INFO] [stdout]      |    ^^^^^^^^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: function `read_unified_expert_cpu` is never used
[INFO] [stdout]     --> src/weights/mod.rs:5700:4
[INFO] [stdout]      |
[INFO] [stdout] 5700 | fn read_unified_expert_cpu(
[INFO] [stdout]      |    ^^^^^^^^^^^^^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: function `read_unified_expert_cpu_mixed` is never used
[INFO] [stdout]     --> src/weights/mod.rs:5803:4
[INFO] [stdout]      |
[INFO] [stdout] 5803 | fn read_unified_expert_cpu_mixed(
[INFO] [stdout]      |    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: structure field `mamba2_A` should have a snake case name
[INFO] [stdout]     --> src/gpu_prefill.rs:2969:9
[INFO] [stdout]      |
[INFO] [stdout] 2969 |     pub mamba2_A: u64,
[INFO] [stdout]      |         ^^^^^^^^ help: convert the identifier to snake case: `mamba2_a`
[INFO] [stdout]      |
[INFO] [stdout]      = note: `#[warn(non_snake_case)]` (part of `#[warn(nonstandard_style)]`) on by default
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: structure field `mamba2_D` should have a snake case name
[INFO] [stdout]     --> src/gpu_prefill.rs:2971:9
[INFO] [stdout]      |
[INFO] [stdout] 2971 |     pub mamba2_D: u64,
[INFO] [stdout]      |         ^^^^^^^^ help: convert the identifier to snake case: `mamba2_d`
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stdout] warning: unused import: `std::io::Read`
[INFO] [stdout]    --> tests/test_marlin_attn_shapes.rs:491:9
[INFO] [stdout]     |
[INFO] [stdout] 491 |     use std::io::Read;
[INFO] [stdout]     |         ^^^^^^^^^^^^^
[INFO] [stdout]     |
[INFO] [stdout]     = note: `#[warn(unused_imports)]` (part of `#[warn(unused)]`) on by default
[INFO] [stdout] 
[INFO] [stdout] 
[INFO] [stderr]     Finished `dev` profile [unoptimized + debuginfo] target(s) in 59.86s
[INFO] running `Command { std: "docker" "inspect" "b30848976230aff2fe214237aba0685a4ba220942724bb1bae64a20f6a3c55b8", kill_on_drop: false }`
[INFO] running `Command { std: "docker" "rm" "-f" "b30848976230aff2fe214237aba0685a4ba220942724bb1bae64a20f6a3c55b8", kill_on_drop: false }`
[INFO] [stdout] b30848976230aff2fe214237aba0685a4ba220942724bb1bae64a20f6a3c55b8
