[BUG] Dataset.createIndex passes IndexType.ordinal() instead of IndexType.getValue() to JNI, causing IllegalArgumentException for java sdk #5743

@mullerhai

Description

Hi Lance,

This is a serious breaking bug: the Java side passes the enum's declaration-order position (ordinal()) across JNI instead of the protocol-defined value that the Rust engine requires.

Describe the bug:

When calling Dataset.createIndex with IndexType.IVF_PQ, the operation fails with a java.lang.IllegalArgumentException from the native layer. The error message indicates that the input value is 13, which corresponds to the Java Enum's ordinal() value, whereas the Rust engine expects the protocol value 103.
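
To make the distinction concrete, here is a minimal sketch of how ordinal() and an explicit value field diverge. DemoIndexType is a hypothetical, trimmed stand-in and not the real org.lance enum: it only reproduces the vector constants quoted in this issue, plus SCALAR with an assumed value of 0, so its ordinals are smaller than the real ones (where IVF_PQ.ordinal() is 13).

```java
// Hypothetical, trimmed stand-in for the IndexType enum discussed in this issue.
// Only a few constants are listed, so ordinals differ from the real enum.
enum DemoIndexType {
    SCALAR(0),      // assumed value; ordinal 0 happens to match
    VECTOR(100),
    IVF_FLAT(101),
    IVF_SQ(102),
    IVF_PQ(103);

    private final int value;

    DemoIndexType(int value) {
        this.value = value;
    }

    // Explicit protocol value, independent of declaration order.
    int getValue() {
        return value;
    }
}

class OrdinalVsValueDemo {
    public static void main(String[] args) {
        // ordinal() is just the position in the declaration list;
        // getValue() is the number the other side of the protocol expects.
        for (DemoIndexType t : DemoIndexType.values()) {
            System.out.printf("%-8s ordinal=%d value=%d%n", t, t.ordinal(), t.getValue());
        }
    }
}
```

Reordering or inserting constants changes every later ordinal() but leaves getValue() untouched, which is why only the explicit value is safe to send across a language boundary.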

Environment

OS: macOS / Linux

Java Version: 17+ (using Scala 2.12.21 sbt project) without scala is ok

Lance Version: 2.0.0-beta.9 (org.lance:lance-core)

To Reproduce

Create a Dataset with a vector column.

Define IndexOptions using IndexType.IVF_PQ.

Execute dataset.createIndex(options).

Java

// Example Snippet
IndexOptions options = IndexOptions.builder(
        Arrays.asList("vector"),
        IndexType.IVF_PQ, // IVF_PQ.ordinal() is 13, IVF_PQ.getValue() is 103
        iParams
).build();

dataset.createIndex(options);

Expected behavior

The JNI layer should pass the integer code defined in the IndexType enum (e.g., 103 for IVF_PQ) so the Rust backend can correctly identify the requested index type.

Actual behavior

The native method receives the value 13 (the ordinal position of the enum in Java), which is not a valid IndexType in the Rust crate, leading to the following crash:

Plaintext

Exception in thread "main" java.lang.IllegalArgumentException: Invalid user input: the input value 13 is not a valid IndexType, .../rust/lance-index/src/lib.rs:183:27
at org.lance.Dataset.nativeCreateIndex(Native Method)
at org.lance.Dataset.createIndex(Dataset.java:850)

Analysis of the Source Code

In Dataset.java, the createIndex method currently calls:

Java

nativeCreateIndex(
    options.getColumns(),
    options.getIndexType().ordinal(), // BUG: this should be .getValue()
    options.getIndexName(),
    ...
);

The IndexType enum is defined as:

Java

public enum IndexType {
    ...
    VECTOR(100),
    IVF_FLAT(101),
    IVF_SQ(102),
    IVF_PQ(103), // Ordinal is 13, but value is 103
    ...
}

Suggested Fix

Update the JNI call in Dataset.java to use the explicit value of the enum instead of its ordinal index.
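
The change can be sketched as follows. This is an illustration under stated assumptions, not the actual Dataset.java code: fakeNativeCreateIndex is a hypothetical stand-in for the native method that imitates the Rust-side validation, and the nested Type enum lists only the vector constants quoted above.

```java
import java.util.Set;

class CreateIndexFixSketch {
    // Trimmed, hypothetical copy of the vector constants quoted in this issue.
    enum Type {
        VECTOR(100), IVF_FLAT(101), IVF_SQ(102), IVF_PQ(103);

        private final int value;

        Type(int value) {
            this.value = value;
        }

        int getValue() {
            return value;
        }
    }

    // Codes this stand-in accepts; the real check lives in the Rust crate.
    private static final Set<Integer> VALID_CODES = Set.of(100, 101, 102, 103);

    // Hypothetical stand-in for the native call: rejects unknown codes.
    static int fakeNativeCreateIndex(int indexTypeCode) {
        if (!VALID_CODES.contains(indexTypeCode)) {
            throw new IllegalArgumentException(
                "the input value " + indexTypeCode + " is not a valid IndexType");
        }
        return indexTypeCode;
    }

    // Current behavior: sends the declaration position.
    static int createIndexBuggy(Type type) {
        return fakeNativeCreateIndex(type.ordinal());
    }

    // Proposed behavior: sends the explicit protocol value.
    static int createIndexFixed(Type type) {
        return fakeNativeCreateIndex(type.getValue());
    }

    public static void main(String[] args) {
        System.out.println("fixed call accepted code " + createIndexFixed(Type.IVF_PQ));
        try {
            createIndexBuggy(Type.IVF_PQ);
        } catch (IllegalArgumentException e) {
            System.out.println("buggy call rejected: " + e.getMessage());
        }
    }
}
```

In the real code the equivalent change would be the single argument swap from ordinal() to getValue() in the nativeCreateIndex call.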


Workaround: until this is fixed, the only options are to use a release that already contains the fix, or to use an index type whose ordinal happens to equal its value (such as SCALAR).
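
One way to see which index types are unaffected is to compare ordinal() with getValue() for every constant. The sketch below does this on a hypothetical, trimmed enum; the SCALAR(0) and BTREE(1) values are assumptions for illustration, and the same loop would need to run against the real IndexType to get an authoritative list.

```java
import java.util.ArrayList;
import java.util.List;

class OrdinalSafeTypes {
    // Hypothetical, trimmed enum; SCALAR and BTREE values are assumed.
    enum Type {
        SCALAR(0), BTREE(1), VECTOR(100), IVF_PQ(103);

        private final int value;

        Type(int value) {
            this.value = value;
        }

        int getValue() {
            return value;
        }
    }

    // Constants whose ordinal equals their protocol value are sent
    // correctly even while the ordinal() bug is present.
    static List<Type> unaffected() {
        List<Type> safe = new ArrayList<>();
        for (Type t : Type.values()) {
            if (t.ordinal() == t.getValue()) {
                safe.add(t);
            }
        }
        return safe;
    }

    public static void main(String[] args) {
        System.out.println("Unaffected types: " + unaffected());
    }
}
```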


version
// Source: https://mvnrepository.com/artifact/org.lance/lance-core
libraryDependencies += "org.lance" % "lance-core" % "2.0.0-beta.9"

// Source: https://mvnrepository.com/artifact/org.lance/lance-core
implementation("org.lance:lance-core:2.0.0-beta.9")

code:

IvfBuildParams ivfParams = new IvfBuildParams.Builder()
        .setNumPartitions(2)
        .build();

PQBuildParams pqParams = new PQBuildParams.Builder()
        .setNumSubVectors(16)
        .setNumBits(8)
        .build();

VectorIndexParams vectorParams = new VectorIndexParams.Builder(ivfParams)
        .setPqParams(pqParams)
        .setDistanceType(DistanceType.L2)
        .build();

IndexParams iParams = IndexParams.builder()
        .setVectorIndexParams(vectorParams)
        .build();

IndexOptions options = IndexOptions.builder(
                Arrays.asList("vector"),
                IndexType.IVF_PQ,
                iParams
        )
        .withIndexName("my_vector_idx")
        .replace(true)
        .build();

// Print both numbers to show the mismatch before the failing call.
System.out.println("IVF_PQ Ordinal: " + IndexType.IVF_PQ.ordinal());
System.out.println("IVF_PQ Value: " + IndexType.IVF_PQ.getValue());
dataset.createIndex(options); // throws IllegalArgumentException

System.out.println("IVF-PQ vector index created successfully!");

console log

Schema constructed successfully: Schema<id: Utf8 not null, name: Utf8, age: Int(32, true), vector: FixedSizeList(128)<item: FloatingPoint(SINGLE) not null> not null>
Data appended successfully!
Data written successfully, row count: 2
--- Starting data scan ---
ID: [B@12c7a01b, Name: [B@13d9b21f, Age: 25 
Vector (first 5 dims): [0.258460, 0.326663, 0.581190, 0.321454, 0.023315]
--------------------------------------------------
ID: [B@62727399, Name: [B@4d9ac0b4, Age: 26 
Vector (first 5 dims): [0.095679, 0.215236, 0.946429, 0.674200, 0.978757]
--------------------------------------------------
IVF_PQ Ordinal: 13
IVF_PQ Value: 103
Exception in thread "main" java.lang.IllegalArgumentException: Invalid user input: the input value 13 is not a valid IndexType, /Users/runner/work/lance/lance/rust/lance-index/src/lib.rs:183:27
	at org.lance.Dataset.nativeCreateIndex(Native Method)
	at org.lance.Dataset.createIndex(Dataset.java:850)
	at LanceDBImageSearch.initRepo2(LanceDBImageSearch.java:290)
	at LanceDBImageSearch.main(LanceDBImageSearch.java:54)
