[INFO] fetching crate rusty-llm-jury 0.1.0...
[INFO] testing rusty-llm-jury-0.1.0 against master#c90bcb9571b7aab0d8beaa2ce8a998ffaf079d38 for pr-146098-7
[INFO] extracting crate rusty-llm-jury 0.1.0 into /workspace/builds/worker-4-tc1/source
[INFO] started tweaking crates.io crate rusty-llm-jury 0.1.0
[INFO] removed 0 missing tests
[INFO] finished tweaking crates.io crate rusty-llm-jury 0.1.0
[INFO] tweaked toml for crates.io crate rusty-llm-jury 0.1.0 written to /workspace/builds/worker-4-tc1/source/Cargo.toml
[INFO] validating manifest of crates.io crate rusty-llm-jury 0.1.0 on toolchain c90bcb9571b7aab0d8beaa2ce8a998ffaf079d38
[INFO] running `Command { std: CARGO_HOME="/workspace/cargo-home" RUSTUP_HOME="/workspace/rustup-home" "/workspace/cargo-home/bin/cargo" "+c90bcb9571b7aab0d8beaa2ce8a998ffaf079d38" "metadata" "--manifest-path" "Cargo.toml" "--no-deps", kill_on_drop: false }`
[INFO] crate crates.io crate rusty-llm-jury 0.1.0 already has a lockfile, it will not be regenerated
[INFO] running `Command { std: CARGO_HOME="/workspace/cargo-home" RUSTUP_HOME="/workspace/rustup-home" "/workspace/cargo-home/bin/cargo" "+c90bcb9571b7aab0d8beaa2ce8a998ffaf079d38" "fetch" "--manifest-path" "Cargo.toml", kill_on_drop: false }`
[INFO] running `Command { std: "docker" "create" "-v" "/var/lib/crater-agent-workspace/builds/worker-4-tc1/target:/opt/rustwide/target:rw,Z" "-v" "/var/lib/crater-agent-workspace/builds/worker-4-tc1/source:/opt/rustwide/workdir:ro,Z" "-v" "/var/lib/crater-agent-workspace/cargo-home:/opt/rustwide/cargo-home:ro,Z" "-v" "/var/lib/crater-agent-workspace/rustup-home:/opt/rustwide/rustup-home:ro,Z" "-e" "SOURCE_DIR=/opt/rustwide/workdir" "-e" "CARGO_TARGET_DIR=/opt/rustwide/target" "-e" "CARGO_HOME=/opt/rustwide/cargo-home" "-e" "RUSTUP_HOME=/opt/rustwide/rustup-home" "-w" "/opt/rustwide/workdir" "-m" "1610612736" "--user" "0:0" "--network" "none" "ghcr.io/rust-lang/crates-build-env/linux@sha256:4848fb76d95f26979359cc7e45710b1dbc8f3acb7aeedee7c460d7702230f228" "/opt/rustwide/cargo-home/bin/cargo" "+c90bcb9571b7aab0d8beaa2ce8a998ffaf079d38" "metadata" "--no-deps" "--format-version=1", kill_on_drop: false }`
[INFO] [stdout] 2f0ea60e88d751cfc5c536d41b5cb4bf64292c125ae3cbb629f28086bdd24fd2
[INFO] running `Command { std: "docker" "start" "-a" "2f0ea60e88d751cfc5c536d41b5cb4bf64292c125ae3cbb629f28086bdd24fd2", kill_on_drop: false }`
[INFO] running `Command { std: "docker" "inspect" "2f0ea60e88d751cfc5c536d41b5cb4bf64292c125ae3cbb629f28086bdd24fd2", kill_on_drop: false }`
[INFO] running `Command { std: "docker" "rm" "-f" "2f0ea60e88d751cfc5c536d41b5cb4bf64292c125ae3cbb629f28086bdd24fd2", kill_on_drop: false }`
[INFO] [stdout] 2f0ea60e88d751cfc5c536d41b5cb4bf64292c125ae3cbb629f28086bdd24fd2
[INFO] running `Command { std: "docker" "create" "-v" "/var/lib/crater-agent-workspace/builds/worker-4-tc1/target:/opt/rustwide/target:rw,Z" "-v" "/var/lib/crater-agent-workspace/builds/worker-4-tc1/source:/opt/rustwide/workdir:ro,Z" "-v" "/var/lib/crater-agent-workspace/cargo-home:/opt/rustwide/cargo-home:ro,Z" "-v" "/var/lib/crater-agent-workspace/rustup-home:/opt/rustwide/rustup-home:ro,Z" "-e" "SOURCE_DIR=/opt/rustwide/workdir" "-e" "CARGO_TARGET_DIR=/opt/rustwide/target" "-e" "CARGO_INCREMENTAL=0" "-e" "RUST_BACKTRACE=full" "-e" "RUSTFLAGS=--cap-lints=forbid" "-e" "RUSTDOCFLAGS=--cap-lints=forbid" "-e" "CARGO_HOME=/opt/rustwide/cargo-home" "-e" "RUSTUP_HOME=/opt/rustwide/rustup-home" "-w" "/opt/rustwide/workdir" "-m" "1610612736" "--user" "0:0" "--network" "none" "ghcr.io/rust-lang/crates-build-env/linux@sha256:4848fb76d95f26979359cc7e45710b1dbc8f3acb7aeedee7c460d7702230f228" "/opt/rustwide/cargo-home/bin/cargo" "+c90bcb9571b7aab0d8beaa2ce8a998ffaf079d38" "build" "--frozen" "--message-format=json", kill_on_drop: false }`
[INFO] [stdout] baec76d70cb821b5d1d48dd48e4ac72765bf26fc0e7aee67df84556150c2b185
[INFO] running `Command { std: "docker" "start" "-a" "baec76d70cb821b5d1d48dd48e4ac72765bf26fc0e7aee67df84556150c2b185", kill_on_drop: false }`
[INFO] [stderr] Compiling libc v0.2.172
[INFO] [stderr] Compiling zerocopy v0.8.25
[INFO] [stderr] Compiling matrixmultiply v0.3.10
[INFO] [stderr] Compiling csv-core v0.1.12
[INFO] [stderr] Compiling syn v2.0.101
[INFO] [stderr] Compiling clap_builder v4.5.39
[INFO] [stderr] Compiling num-complex v0.4.6
[INFO] [stderr] Compiling num-integer v0.1.46
[INFO] [stderr] Compiling ndarray v0.15.6
[INFO] [stderr] Compiling getrandom v0.2.16
[INFO] [stderr] Compiling rand_core v0.6.4
[INFO] [stderr] Compiling ppv-lite86 v0.2.21
[INFO] [stderr] Compiling rand_chacha v0.3.1
[INFO] [stderr] Compiling rand v0.8.5
[INFO] [stderr] Compiling serde_derive v1.0.219
[INFO] [stderr] Compiling thiserror-impl v1.0.69
[INFO] [stderr] Compiling clap_derive v4.5.32
[INFO] [stderr] Compiling thiserror v1.0.69
[INFO] [stderr] Compiling clap v4.5.39
[INFO] [stderr] Compiling serde v1.0.219
[INFO] [stderr] Compiling serde_json v1.0.140
[INFO] [stderr] Compiling csv v1.3.1
[INFO] [stderr] Compiling rusty-llm-jury v0.1.0 (/opt/rustwide/workdir)
[INFO] [stderr] Finished `dev` profile [unoptimized + debuginfo] target(s) in 14.60s
[INFO] running `Command { std: "docker" "inspect" "baec76d70cb821b5d1d48dd48e4ac72765bf26fc0e7aee67df84556150c2b185", kill_on_drop: false }`
[INFO] running `Command { std: "docker" "rm" "-f" "baec76d70cb821b5d1d48dd48e4ac72765bf26fc0e7aee67df84556150c2b185", kill_on_drop: false }`
[INFO] [stdout] baec76d70cb821b5d1d48dd48e4ac72765bf26fc0e7aee67df84556150c2b185
[INFO] running `Command { std: "docker" "create" "-v" "/var/lib/crater-agent-workspace/builds/worker-4-tc1/target:/opt/rustwide/target:rw,Z" "-v" "/var/lib/crater-agent-workspace/builds/worker-4-tc1/source:/opt/rustwide/workdir:ro,Z" "-v" "/var/lib/crater-agent-workspace/cargo-home:/opt/rustwide/cargo-home:ro,Z" "-v" "/var/lib/crater-agent-workspace/rustup-home:/opt/rustwide/rustup-home:ro,Z" "-e" "SOURCE_DIR=/opt/rustwide/workdir" "-e" "CARGO_TARGET_DIR=/opt/rustwide/target" "-e" "CARGO_INCREMENTAL=0" "-e" "RUST_BACKTRACE=full" "-e" "RUSTFLAGS=--cap-lints=forbid" "-e" "RUSTDOCFLAGS=--cap-lints=forbid" "-e" "CARGO_HOME=/opt/rustwide/cargo-home" "-e" "RUSTUP_HOME=/opt/rustwide/rustup-home" "-w" "/opt/rustwide/workdir" "-m" "1610612736" "--user" "0:0" "--network" "none" "ghcr.io/rust-lang/crates-build-env/linux@sha256:4848fb76d95f26979359cc7e45710b1dbc8f3acb7aeedee7c460d7702230f228" "/opt/rustwide/cargo-home/bin/cargo" "+c90bcb9571b7aab0d8beaa2ce8a998ffaf079d38" "test" "--frozen" "--no-run" "--message-format=json", kill_on_drop: false }`
[INFO] [stdout] 756a89b693e5c3631398a04efb54c898967b2de25803df3c228065db6da73867
[INFO] running `Command { std: "docker" "start" "-a" "756a89b693e5c3631398a04efb54c898967b2de25803df3c228065db6da73867", kill_on_drop: false }`
[INFO] [stderr] Compiling rustix v1.0.7
[INFO] [stderr] Compiling linux-raw-sys v0.9.4
[INFO] [stderr] Compiling bitflags v2.9.1
[INFO] [stderr] Compiling getrandom v0.3.3
[INFO] [stderr] Compiling approx v0.5.1
[INFO] [stderr] Compiling tempfile v3.20.0
[INFO] [stderr] Compiling rusty-llm-jury v0.1.0 (/opt/rustwide/workdir)
[INFO] [stderr] Finished `test` profile [unoptimized + debuginfo] target(s) in 5.49s
[INFO] running `Command { std: "docker" "inspect" "756a89b693e5c3631398a04efb54c898967b2de25803df3c228065db6da73867", kill_on_drop: false }`
[INFO] running `Command { std: "docker" "rm" "-f" "756a89b693e5c3631398a04efb54c898967b2de25803df3c228065db6da73867", kill_on_drop: false }`
[INFO] [stdout] 756a89b693e5c3631398a04efb54c898967b2de25803df3c228065db6da73867
[INFO] running `Command { std: "docker" "create" "-v" "/var/lib/crater-agent-workspace/builds/worker-4-tc1/target:/opt/rustwide/target:rw,Z" "-v" "/var/lib/crater-agent-workspace/builds/worker-4-tc1/source:/opt/rustwide/workdir:ro,Z" "-v" "/var/lib/crater-agent-workspace/cargo-home:/opt/rustwide/cargo-home:ro,Z" "-v" "/var/lib/crater-agent-workspace/rustup-home:/opt/rustwide/rustup-home:ro,Z" "-e" "SOURCE_DIR=/opt/rustwide/workdir" "-e" "CARGO_TARGET_DIR=/opt/rustwide/target" "-e" "CARGO_INCREMENTAL=0" "-e" "RUST_BACKTRACE=full" "-e" "RUSTFLAGS=--cap-lints=forbid" "-e" "RUSTDOCFLAGS=--cap-lints=forbid" "-e" "CARGO_HOME=/opt/rustwide/cargo-home" "-e" "RUSTUP_HOME=/opt/rustwide/rustup-home" "-w" "/opt/rustwide/workdir" "-m" "1610612736" "--user" "0:0" "--network" "none" "ghcr.io/rust-lang/crates-build-env/linux@sha256:4848fb76d95f26979359cc7e45710b1dbc8f3acb7aeedee7c460d7702230f228" "/opt/rustwide/cargo-home/bin/cargo" "+c90bcb9571b7aab0d8beaa2ce8a998ffaf079d38" "test" "--frozen", kill_on_drop: false }`
[INFO] [stdout] dbcc18dc61ced8f028ebe2df2f7c3bc3c851917a360abc6c5a804f234a8743ab
[INFO] running `Command { std: "docker" "start" "-a" "dbcc18dc61ced8f028ebe2df2f7c3bc3c851917a360abc6c5a804f234a8743ab", kill_on_drop: false }`
[INFO] [stderr] Finished `test` profile [unoptimized + debuginfo] target(s) in 0.11s
[INFO] [stderr] Running unittests src/lib.rs (/opt/rustwide/target/debug/deps/llmjury-f17142e3d6651ccb)
[INFO] [stdout]
[INFO] [stdout] running 44 tests
[INFO] [stdout] test bias_correction::tests::test_input_validation_empty_arrays ... ok
[INFO] [stdout] test bias_correction::tests::test_estimate_success_rate_basic ... ok
[INFO] [stdout] test bias_correction::tests::test_different_confidence_levels ... ok
[INFO] [stdout] test bias_correction::tests::test_input_validation_invalid_confidence_level ... ok
[INFO] [stdout] test bias_correction::tests::test_input_validation_non_binary ... ok
[INFO] [stdout] test bias_correction::tests::test_judge_accuracy_too_low ... ok
[INFO] [stdout] test bias_correction::tests::test_judge_metrics_perfect_judge ... ok
[INFO] [stdout] test bias_correction::tests::test_judge_metrics_random_judge ... ok
[INFO] [stdout] test bias_correction::tests::test_no_negative_examples ... ok
[INFO] [stdout] test bias_correction::tests::test_no_positive_examples ... ok
[INFO] [stdout] test cli::tests::test_estimate_args_validation ... ok
[INFO] [stdout] test bias_correction::tests::test_estimate_success_rate_perfect_judge ... ok
[INFO] [stdout] test cli::tests::test_synth_experiment_args_create_config ... ok
[INFO] [stdout] test cli::tests::test_estimate_args_load_data_from_strings ... ok
[INFO] [stdout] test bias_correction::tests::test_input_validation_mismatched_lengths ... ok
[INFO] [stdout] test cli::tests::test_estimate_args_load_data_from_files ... ok
[INFO] [stdout] test synthetic::tests::test_generate_test_data_perfect_accuracy ... ok
[INFO] [stdout] test synthetic::tests::test_create_example_dataset_invalid_scenario ... ok
[INFO] [stdout] test synthetic::tests::test_generate_test_data_input_validation ... ok
[INFO] [stdout] test synthetic::tests::test_generate_test_data_zero_accuracy ... ok
[INFO] [stdout] test synthetic::tests::test_generate_unlabeled_data_extreme_pass_rates ... ok
[INFO] [stdout] test synthetic::tests::test_generate_unlabeled_data_input_validation ... ok
[INFO] [stdout] test synthetic::tests::test_create_example_dataset_reproducibility ... ok
[INFO] [stdout] test synthetic::tests::test_generate_test_data_basic ... ok
[INFO] [stdout] test synthetic::tests::test_generate_test_data_reproducibility ... ok
[INFO] [stdout] test synthetic::tests::test_generate_unlabeled_data_basic ... ok
[INFO] [stdout] test tests::test_version_is_set ... ok
[INFO] [stdout] test synthetic::tests::test_create_example_dataset_different_scenarios_differ ... ok
[INFO] [stdout] test utils::tests::test_format_float ... ok
[INFO] [stdout] test utils::tests::test_format_percentage ... ok
[INFO] [stdout] test utils::tests::test_load_binary_from_csv ... ok
[INFO] [stdout] test utils::tests::test_load_binary_from_csv_invalid_data ... ok
[INFO] [stdout] test utils::tests::test_load_binary_from_csv_nonexistent_file ... ok
[INFO] [stdout] test utils::tests::test_load_binary_from_csv_with_empty_lines ... ok
[INFO] [stdout] test utils::tests::test_parse_binary_string_empty ... ok
[INFO] [stdout] test utils::tests::test_parse_binary_string_valid ... ok
[INFO] [stdout] test utils::tests::test_load_binary_from_csv_with_header ... ok
[INFO] [stdout] test utils::tests::test_parse_range ... ok
[INFO] [stdout] test utils::tests::test_validate_probability ... ok
[INFO] [stdout] test utils::tests::test_parse_binary_string_invalid ... ok
[INFO] [stdout] test synthetic::tests::test_create_example_dataset_all_scenarios ... ok
[INFO] [stdout] test synthetic::tests::test_scenario_accuracy_properties ... ok
[INFO] [stdout] test synthetic::tests::test_run_sensitivity_experiment_tpr ... ok
[INFO] [stdout] test synthetic::tests::test_run_sensitivity_experiment_tnr ... ok
[INFO] [stdout]
[INFO] [stdout] test result: ok. 44 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.06s
[INFO] [stdout]
[INFO] [stderr] Running unittests src/main.rs (/opt/rustwide/target/debug/deps/llm_jury-8de16dc533609b04)
[INFO] [stdout]
[INFO] [stdout] running 0 tests
[INFO] [stdout]
[INFO] [stdout] test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s
[INFO] [stdout]
[INFO] [stderr] Running tests/cli_tests.rs (/opt/rustwide/target/debug/deps/cli_tests-760f0cd34cfc3f5b)
[INFO] [stdout]
[INFO] [stdout] running 6 tests
[INFO] [stdout] test test_cli_help ... ok
[INFO] [stdout] test test_cli_estimate_with_files ... ok
[INFO] [stdout] test test_cli_estimate_basic ... ok
[INFO] [stdout] test test_cli_version ... ok
[INFO] [stdout] test test_cli_synth_experiment ... ok
[INFO] [stdout] test test_cli_error_handling ... ok
[INFO] [stdout]
[INFO] [stdout] test result: ok. 6 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.32s
[INFO] [stdout]
[INFO] [stderr] Running tests/integration_test.rs (/opt/rustwide/target/debug/deps/integration_test-5d214d06b60fd0a0)
[INFO] [stdout]
[INFO] [stdout] running 11 tests
[INFO] [stdout] test test_boundary_conditions ... ok
[INFO] [stdout] test test_confidence_intervals ... ok
[INFO] [stdout] test test_performance_benchmark ... ignored
[INFO] [stdout] test test_csv_file_loading ... ok
[INFO] [stdout] test test_reproducibility ... ok
[INFO] [stdout] test test_judge_metrics ... ok
[INFO] [stdout] test test_complete_workflow ... ok
[INFO] [stdout] test test_utility_functions ... ok
[INFO] [stdout] test test_example_scenarios ... ok
[INFO] [stdout] test test_error_handling ... ok
[INFO] [stdout] test test_large_dataset ... ok
[INFO] [stdout]
[INFO] [stdout] test result: ok. 10 passed; 0 failed; 1 ignored; 0 measured; 0 filtered out; finished in 0.41s
[INFO] [stdout]
[INFO] [stderr] Doc-tests llmjury
[INFO] [stdout]
[INFO] [stdout] running 7 tests
[INFO] [stdout] test src/utils.rs - utils::load_binary_from_csv (line 50) - compile ... ok
[INFO] [stdout] test src/synthetic.rs - synthetic::generate_test_data (line 104) ... ok
[INFO] [stdout] test src/synthetic.rs - synthetic::generate_unlabeled_data (line 178) ... ok
[INFO] [stdout] test src/bias_correction.rs - bias_correction::estimate_success_rate (line 124) ... ok
[INFO] [stdout] test src/synthetic.rs - synthetic::create_example_dataset (line 385) ... ok
[INFO] [stdout] test src/utils.rs - utils::parse_binary_string (line 11) ... ok
[INFO] [stdout] test src/synthetic.rs - synthetic::run_sensitivity_experiment (line 268) ... ok
[INFO] [stdout]
[INFO] [stdout] test result: ok. 7 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.97s
[INFO] [stdout]
[INFO] running `Command { std: "docker" "inspect" "dbcc18dc61ced8f028ebe2df2f7c3bc3c851917a360abc6c5a804f234a8743ab", kill_on_drop: false }`
[INFO] running `Command { std: "docker" "rm" "-f" "dbcc18dc61ced8f028ebe2df2f7c3bc3c851917a360abc6c5a804f234a8743ab", kill_on_drop: false }`
[INFO] [stdout] dbcc18dc61ced8f028ebe2df2f7c3bc3c851917a360abc6c5a804f234a8743ab