Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Separate QuantizationStateCache teardown from KNNTestCase #2406

Open
wants to merge 8 commits into
base: main
Choose a base branch
from
2 changes: 0 additions & 2 deletions src/test/java/org/opensearch/knn/KNNSingleNodeTestCase.java
Original file line number Diff line number Diff line change
Expand Up @@ -35,7 +35,6 @@
import org.opensearch.core.xcontent.XContentBuilder;
import org.opensearch.common.xcontent.XContentFactory;
import org.opensearch.index.IndexService;
import org.opensearch.knn.quantization.models.quantizationState.QuantizationStateCacheManager;
import org.opensearch.plugins.Plugin;
import org.opensearch.core.rest.RestStatus;
import org.opensearch.test.OpenSearchSingleNodeTestCase;
Expand Down Expand Up @@ -87,7 +86,6 @@ protected boolean resetNodeAfterTest() {
public void tearDown() throws Exception {
NativeMemoryCacheManager.getInstance().invalidateAll();
NativeMemoryCacheManager.getInstance().close();
QuantizationStateCacheManager.getInstance().close();
NativeMemoryLoadStrategy.IndexLoadStrategy.getInstance().close();
NativeMemoryLoadStrategy.TrainingLoadStrategy.getInstance().close();
NativeMemoryLoadStrategy.AnonymousLoadStrategy.getInstance().close();
Expand Down
5 changes: 1 addition & 4 deletions src/test/java/org/opensearch/knn/KNNTestCase.java
Original file line number Diff line number Diff line change
Expand Up @@ -24,10 +24,8 @@
import org.opensearch.core.common.bytes.BytesReference;
import org.opensearch.core.xcontent.XContentBuilder;
import org.opensearch.common.xcontent.XContentHelper;
import org.opensearch.knn.quantization.models.quantizationState.QuantizationStateCacheManager;
import org.opensearch.test.OpenSearchTestCase;

import java.io.IOException;
import java.util.Collections;
import java.util.HashSet;
import java.util.Map;
Expand Down Expand Up @@ -75,7 +73,7 @@ protected boolean enableWarningsCheck() {
return false;
}

public void resetState() throws IOException {
public void resetState() {
// Reset all of the counters
for (KNNCounter knnCounter : KNNCounter.values()) {
knnCounter.set(0L);
Expand All @@ -85,7 +83,6 @@ public void resetState() throws IOException {
// Clean up the cache
NativeMemoryCacheManager.getInstance().invalidateAll();
NativeMemoryCacheManager.getInstance().close();
QuantizationStateCacheManager.getInstance().close();
}

private void initKNNSettings() {
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -65,6 +65,8 @@
import org.opensearch.knn.index.mapper.KNNVectorFieldMapper;
import org.opensearch.knn.index.engine.KNNEngine;
import org.opensearch.knn.quantization.enums.ScalarQuantizationType;
import org.opensearch.knn.quantization.models.quantizationState.QuantizationStateCache;
import org.opensearch.knn.quantization.models.quantizationState.QuantizationStateCacheManager;

import java.io.IOException;
import java.util.Arrays;
Expand Down Expand Up @@ -263,6 +265,8 @@ public void testNativeEngineVectorFormat_whenMultipleVectorFieldIndexed_thenSucc
() -> leafReader.searchNearestVectors(BYTE_VECTOR_FIELD, byteVector, 10, new Bits.MatchAllBits(1), 10)
);
// do it at the end so that all search is completed
QuantizationStateCache.setThreadPool(null);
QuantizationStateCacheManager.getInstance().close();
indexReader.close();
}

Expand Down Expand Up @@ -300,6 +304,8 @@ public void testNativeEngineVectorFormat_whenBinaryQuantizationApplied_thenSucce
assertArrayEquals(floatVectorForBinaryQuantization, floatVectorValues.vectorValue(), 0.0f);
assertEquals(1, floatVectorValues.size());
assertEquals(8, floatVectorValues.dimension());
QuantizationStateCache.setThreadPool(null);
QuantizationStateCacheManager.getInstance().close();
indexReader.close();
}

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -40,7 +40,7 @@ public class NativeMemoryCacheManagerTests extends OpenSearchSingleNodeTestCase
private ThreadPool threadPool;

@Before
public void setThreadPool() {
public void setUp() {
threadPool = new ThreadPool(Settings.builder().put("node.name", "NativeMemoryCacheManagerTests").build());
NativeMemoryCacheManager.setThreadPool(threadPool);
}
Expand All @@ -54,6 +54,7 @@ public void tearDown() throws Exception {
client().admin().cluster().updateSettings(clusterUpdateSettingsRequest).get();
NativeMemoryCacheManager.getInstance().close();
terminate(threadPool);
NativeMemoryCacheManager.setThreadPool(null);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I don't think we need this explicitly.

Copy link

@Gankris96 Gankris96 Jan 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the close of NativeMemoryCacheManager should have this. Because that is where the executor.shutdown() is called. The same place should be used to cleanup all tasks. I see the maintenanceTask.cancel() is called. Post that we should clear out the threadpool; no?

Actually, this should not even be required as @kotwanikunal mentioned

Copy link
Contributor Author

@owenhalpert owenhalpert Jan 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Gankris96 the only consideration there is that the threadpool is set only once in KNNPlugin where we have access to pass in OpenSearch's threadpool. Once close() is called and that is set to null, we would lose that reference unless KNNPlugin.createComponents is called again. I would need to investigate the impact of getting rid of that reference fully.

Copy link

@Gankris96 Gankris96 Jan 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the problem is that the threadpool is getting shutdown in the tests before you clear out the QSCache object. That is what causes the threadpool != null check to succeed but the mockNode.shutDown() terminates the threadpool.

For example

is where the node is closed, releasing all non-active threadpool assignments. In your case it wouldn't be active since QuantizationStateCache has not been initiailzed but you are setting it on a static variable.

super.tearDown();
}

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -36,14 +36,16 @@ public class QuantizationStateCacheTests extends KNNTestCase {
private ThreadPool threadPool;

@Before
public void setThreadPool() {
public void setUp() {
threadPool = new ThreadPool(Settings.builder().put("node.name", "QuantizationStateCacheTests").build());
QuantizationStateCache.setThreadPool(threadPool);
}

@After
public void terminateThreadPool() {
public void tearDown() throws IOException {
terminate(threadPool);
QuantizationStateCache.setThreadPool(null);
QuantizationStateCache.getInstance().close();
Copy link
Contributor Author

@owenhalpert owenhalpert Jan 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This call to close() (which solely cancels the maintenance) isn't necessary because of how the RescheduledRunnable is implemented. When we terminate the threadpool on line 46, the next task is rejected and the task is canceled anyway.

I thought I'd include it for clarity and to immediately cancel the task, but I wanted to flag it to see if anyone had thoughts.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can have it for code clarity. My concern is the setThreadPool(null) pattern.
I know it exists solely for the tests but the threadpool termination and close() should ideally take care of all the cleanup.

Going one step further - I think the setThreadPool should be on a final variable with a non-null check.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess NativeMemoryCacheManager will always exist because it is created as part of plugin initialization. Thats why the NativeMemoryCacheManager.getInstance().close() works butQuantizationStateCacheManager.getInstance().close(); doesn't

Why cant we just have something like public boolean quantizationCacheEnabled; as part of KNNTestCase.java
and inside the resetState() have a check like -

// Clean up the cache
        NativeMemoryCacheManager.getInstance().invalidateAll();
        NativeMemoryCacheManager.getInstance().close();
        if (quantizationCacheEnabled) {
            QuantizationStateCacheManager.getInstance().close();
        }

Only in the tests where we actually have QuantizationStateCacheManager created, we can go set this value to true

}

@SneakyThrows
Expand Down
Loading