## RAPIDS Memory Manager (RMM)
We recommend using RMM to configure GPU memory resources. As with other libraries in the RAPIDS ecosystem, a performance best practice for cuVS is to use a pool memory resource, which allocates a chunk of memory up front for the current GPU device. The example below reserves an initial pool of 2**30 bytes (1 GiB).

In [1]:
import rmm
pool = rmm.mr.PoolMemoryResource(
    rmm.mr.CudaMemoryResource(),
    initial_pool_size=2**30
)
rmm.mr.set_current_device_resource(pool)



In [2]:
# check the current device resource
current_resource = rmm.mr.get_current_device_resource()
print(current_resource)

<rmm._lib.memory_resource.PoolMemoryResource object at 0x7f2fac3d2780>
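
To judge whether a 1 GiB initial pool is a sensible starting size, it can help to estimate the raw footprint of the arrays that will be moved to the GPU. The following is a minimal sketch in plain Python; the helper name `array_nbytes` is hypothetical, and the shapes are those of the float32 arrays used later in this notebook. It only counts raw vector data and ignores index structures and temporary buffers.

```python
def array_nbytes(rows, dim, itemsize=4):
    """Raw size in bytes of a dense (rows, dim) array; itemsize=4 for float32."""
    return rows * dim * itemsize

GIB = 2**30
initial_pool_size = 2**30  # same value passed to PoolMemoryResource above

# Shapes of the database and training arrays used later in this notebook.
xb_bytes = array_nbytes(1_000_000, 96)
xt_bytes = array_nbytes(100_000, 96)

for name, nbytes in [("xb", xb_bytes), ("xt", xt_bytes)]:
    share = nbytes / initial_pool_size
    print(f"{name}: {nbytes / GIB:.3f} GiB ({share:.1%} of the initial pool)")
```

If the estimate approaches or exceeds the initial pool size, a larger `initial_pool_size` may avoid having the pool grow during indexing.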


## Building a cuVS GPU index
With the faiss-gpu-cuvs package, the cuVS implementation is chosen by default for supported index types, so it can be used with zero code changes. Below is an example of creating an IVFPQ index using cuVS.

In [3]:
import faiss
import numpy as np

np.random.seed(1234)
xb = np.random.random((1000000, 96)).astype('float32')
xq = np.random.random((10000, 96)).astype('float32')
xt = np.random.random((100000, 96)).astype('float32')

res = faiss.StandardGpuResources()
# Disable the default temporary memory allocation since an RMM pool resource has already been set.
res.noTempMemory()

In [4]:
# Case 1: Creating cuVS GPU index
config = faiss.GpuIndexIVFPQConfig()
config.interleavedLayout = True
assert(config.use_cuvs)
# cuVS supports an expanded parameter set; here, bits per code = 6.
index_gpu = faiss.GpuIndexIVFPQ(res, 96, 1024, 96, 6, faiss.METRIC_L2, config)
%time index_gpu.train(xt)
%time index_gpu.add(xb)

using ivf_pq::index_params nrows 100000, dim 96, n_lits 1024, pq_dim 96
CPU times: user 6.11 s, sys: 614 ms, total: 6.73 s
Wall time: 307 ms
CPU times: user 114 ms, sys: 12.2 ms, total: 126 ms
Wall time: 126 ms
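
The effect of the bits-per-code setting on storage can be estimated with simple arithmetic. The sketch below is plain Python with a hypothetical helper `pq_code_bytes`; it assumes codes are packed tightly and ignores vector ids, list padding, and any layout overhead, so treat the numbers as a rough lower bound rather than the actual GPU footprint.

```python
def pq_code_bytes(num_subquantizers, bits_per_code):
    """Bytes per encoded vector, assuming tightly packed PQ codes."""
    return (num_subquantizers * bits_per_code + 7) // 8

dim = 96
raw_bytes = dim * 4  # one float32 vector

# 96 sub-quantizers, as in the example above, at 6 and 8 bits per code.
bytes_6bit = pq_code_bytes(96, 6)
bytes_8bit = pq_code_bytes(96, 8)

for bits, code_bytes in [(6, bytes_6bit), (8, bytes_8bit)]:
    ratio = raw_bytes / code_bytes
    print(f"{bits} bits/code: {code_bytes} bytes per vector, "
          f"{ratio:.2f}x smaller than raw float32")
```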


In [5]:
# Case 2: Cloning a CPU index to a cuVS GPU index
quantizer = faiss.IndexFlatL2(96)
index_cpu = faiss.IndexIVFPQ(quantizer, 96, 1024, 96, 8, faiss.METRIC_L2)
index_cpu.train(xt)
co = faiss.GpuClonerOptions()
%time index_gpu = faiss.index_cpu_to_gpu(res, 0, index_cpu, co)

# The cuVS index now uses the trained quantizer as its IVF centroids.
assert(index_gpu.is_trained)
%time index_gpu.add(xb)
k = 10
%time D, I = index_gpu.search(xq, k)

CPU times: user 2.61 s, sys: 189 ms, total: 2.8 s
Wall time: 35.1 ms
CPU times: user 4.67 s, sys: 329 ms, total: 5 s
Wall time: 247 ms
CPU times: user 107 ms, sys: 60.1 ms, total: 167 ms
Wall time: 167 ms
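
Because IVFPQ search is approximate, timings are usually read alongside recall against exact results. The following is a minimal sketch of a recall computation; the helper `recall_at_k` is hypothetical and not part of FAISS. It accepts any row-wise id containers, so it should work both on the `I` array returned by `search` and on plain lists, and the ids below are made-up toy values.

```python
def recall_at_k(approx_ids, exact_ids):
    """Fraction of exact neighbor ids that also appear in the approximate results."""
    hits = 0
    total = 0
    for approx_row, exact_row in zip(approx_ids, exact_ids):
        exact_set = {int(i) for i in exact_row}
        approx_set = {int(i) for i in approx_row}
        hits += len(approx_set & exact_set)
        total += len(exact_set)
    return hits / total

# Toy ids for two queries with k = 3 (illustrative values only).
approx = [[1, 2, 3], [4, 5, 6]]
exact = [[1, 2, 9], [4, 5, 6]]
recall = recall_at_k(approx, exact)
print(f"recall@3 = {recall:.3f}")
```

In practice the exact ids would come from a brute-force index over the same data, searched with the same queries and `k`.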


## Building a cuVS CAGRA index
The following example demonstrates building and searching the CAGRA index with FAISS.

In [6]:
# Step 1: Create the CAGRA index config
config = faiss.GpuIndexCagraConfig()
config.graph_degree = 32
config.intermediate_graph_degree = 64

# Step 2: Initialize the CAGRA index
res = faiss.StandardGpuResources()
gpu_cagra_index = faiss.GpuIndexCagra(res, 96, faiss.METRIC_L2, config)

# Step 3: Build the index from the 1M vectors (for CAGRA, train() constructs the graph)
n = 1000000
data = np.random.random((n, 96)).astype('float32')
%time gpu_cagra_index.train(data)

# Step 4: Search the index for top 10 neighbors for each query.
xq = np.random.random((10000, 96)).astype('float32')
%time D, I = gpu_cagra_index.search(xq, 10)

using ivf_pq::index_params nrows 1000000, dim 96, n_lits 1000, pq_dim 24
CPU times: user 5min 20s, sys: 15.7 s, total: 5min 36s
Wall time: 4.94 s
CPU times: user 817 ms, sys: 52.7 ms, total: 869 ms
Wall time: 12.5 ms
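
To make the `graph_degree` setting concrete, the sketch below builds a tiny fixed-degree nearest-neighbor graph by brute force in plain Python. This is not the CAGRA construction algorithm, which is considerably more involved; it only illustrates the kind of structure the index stores, where every vector keeps the same number of outgoing edges. The helper `knn_graph` and the toy data are hypothetical.

```python
import math
import random

def knn_graph(points, degree):
    """Brute-force graph: each node links to its `degree` nearest other nodes."""
    graph = []
    for i, p in enumerate(points):
        others = (j for j in range(len(points)) if j != i)
        nearest = sorted(others, key=lambda j: math.dist(p, points[j]))
        graph.append(nearest[:degree])
    return graph

random.seed(0)
points = [[random.random() for _ in range(4)] for _ in range(50)]
graph = knn_graph(points, degree=8)

print(f"{len(graph)} nodes, {len(graph[0])} edges per node")
```

A larger degree gives each search step more candidates to explore at the cost of a larger graph.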


## CAGRA to HNSW
A CAGRA index can be automatically converted to HNSW through the new faiss.IndexHNSWCagra CPU index class.

In [7]:
# Create the HNSW index object.
d = 96
M = 16
cpu_hnsw_index = faiss.IndexHNSWCagra(d, M, faiss.METRIC_L2)
# Create the full HNSW hierarchy
cpu_hnsw_index.base_level_only = False

# Initialize the HNSW base layer with the CAGRA graph.
%time gpu_cagra_index.copyTo(cpu_hnsw_index)

# Add new vectors to the hierarchy.
newVecs = np.random.random((100000, 96)).astype('float32')
%time cpu_hnsw_index.add(newVecs)

CPU times: user 1min, sys: 3.54 s, total: 1min 4s
Wall time: 1.58 s
CPU times: user 2min 59s, sys: 2.88 s, total: 3min 2s
Wall time: 2.93 s
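
For intuition about what a full HNSW hierarchy looks like, the sketch below samples node levels using the exponential scheme described in the HNSW paper, where a level is drawn as floor(-ln(u) / ln(M)). This is an illustration under that assumption and not FAISS's internal code; the helper `sample_level` is hypothetical. Under this scheme roughly 1/M of the nodes appear above the base layer, which is why the base layer dominates the index.

```python
import math
import random
from collections import Counter

def sample_level(M, rng):
    """Draw an HNSW level: floor(-ln(u) / ln(M)) with u uniform in (0, 1]."""
    u = 1.0 - rng.random()  # avoid log(0)
    return int(-math.log(u) / math.log(M))

rng = random.Random(1234)
M = 16
levels = [sample_level(M, rng) for _ in range(100_000)]

counts = Counter(levels)
for level in sorted(counts):
    print(f"level {level}: {counts[level] / len(levels):.4%} of nodes")
```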
