Bulk Get Paperwork in Couchbase utilizing Reactive or Asynchronous API

December 9, 2025

86

Bulk Get Paperwork in Couchbase utilizing Reactive or Asynchronous API — blog couchbase bulk get documents 1024x536

When working with distributed databases like Couchbase, efficiency and effectivity are key concerns, particularly when retrieving a considerable amount of knowledge. Many occasions when prospects come from completely different improvement or database backgrounds, they ask concerning the functionality of Couchbase to do “multi-get” or “bulk get” operations. Many databases supply “multi-get” as an out of the field methodology to retrieve a number of paperwork to carry out primarily based on their keys. Most Couchbase SDKs don’t supply specific APIs for batching as a result of reactive programming gives the pliability to implement batching tailor-made to your particular use case and is usually more practical than a one-size-fits-all, generic methodology.

What’s Bulk Get?

A bulk get operation permits you to request a number of paperwork in a single operation, quite than making repeated particular person GET calls. In conventional key-value shops, every request often targets a selected node. Nonetheless, in a distributed setting like Couchbase, spreading these operations throughout nodes can introduce overhead if managed manually.

SDK assist for bulk operations

The Couchbase SDKs (together with Java, .NET, and Go) supply built-in assist for bulk get operations. These SDK strategies are designed to simply accept a listing of doc keys and mechanically handle the parallel execution of particular person GET requests in an environment friendly manner due to three predominant causes.

- Parallelism: Moderately than fetching every doc sequentially, the SDKs provoke a number of requests concurrently.
- Node focusing on: The SDKs intelligently route every request to the proper node within the cluster the place the info resides.
- Asynchronous execution: Leveraging the asynchronous capabilities of every SDK, the operations are dealt with in a non-blocking style, making certain larger throughput and higher useful resource utilization.

Couchbase gives two predominant methods to realize bulk get functionality utilizing Reactive Programming and Async Programming.

Reactive API

In case you’re aiming to optimize bulk get operations in Couchbase, reactive programming gives an environment friendly and simpler strategy. Couchbase’s binary protocol has out-of-order execution and has robust assist for async operations in KV. By effectively managing asynchronous knowledge flows, it allows excessive throughput and low latency, making it ideally suited for distributed methods. To totally leverage its capabilities, a completely reactive stack the place every layer, from the database to the shopper, helps reactive streams is right. Couchbase’s ReactiveCollection integrates seamlessly with Venture Reactor, enabling absolutely non-blocking entry to Couchbase Key-Worth (KV) operations. This integration aligns completely with trendy reactive architectures, permitting purposes to deal with high-throughput workloads extra effectively by avoiding pointless thread blocking.

That mentioned, migrating a whole current utility to a reactive structure can contain vital work. If it’s a new challenge, adopting a reactive framework like Spring WebFlux is strongly beneficial. Nonetheless, even in non-reactive purposes, introducing a reactive strategy on the Couchbase CRUD layer alone can ship significant positive factors. By doing so, you’ll be able to decrease thread blocking and scale back CPU throttling, main to raised useful resource effectivity and improved scalability.

Under is an instance of a Java code that may maximize the efficiency of Couchbase utilizing Reactive API and may work with a non-reactive stack.

/** * @param assortment The gathering to get paperwork from. * @param documentIds The IDs of the paperwork to return. * @param mapSupplier Manufacturing unit for the returned map. Suggestion: * Cross {@code TreeMap::new} for sorted outcomes, * or {@code HashMap::new} for unsorted. * @param concurrency Limits the variety of Couchbases requests in flight * on the identical time. Every invocation of this methodology has a separate quota. * Suggestion: Begin with 256 and tune as desired. * @param mapValueTransformerScheduler The scheduler to make use of for changing * the outcome map values. Cross {@hyperlink Schedulers#speedy()} * to make use of the SDK’s IO scheduler. Suggestion: In case your worth converter does IO, * go {@hyperlink Schedulers#boundedElastic()}. * @param mapValueTransformer A perform that takes a doc ID and a outcome, * and returns the worth to related to that ID within the returned map. * @param <v> The return map’s worth kind. * @param <m> The kind of the map you’d wish to retailer the ends in. * @return a Map (implementation decided by {@code mapSupplier}) * the place every given doc ID is related to the results of * getting the corresponding doc from Couchbase. */ public static <v m="" extends="" map="" v="">> Map<string v=""> bulkGet( ReactiveCollection assortment, Iterable<string> documentIds, int concurrency, Provider<m> mapSupplier, Scheduler mapValueTransformerScheduler, BiFunction<string successorfailure="">, V> mapValueTransformer ) { return Flux.fromIterable(documentIds) .flatMap( documentId -> Mono.zip( Mono.simply(documentId), assortment.get(documentId) .map(SuccessOrFailure::success) .onErrorResume(error -> Mono.simply(SuccessOrFailure.failure(error))) ), concurrency ) .publishOn(mapValueTransformerScheduler) .gather( mapSupplier, (map, idAndResult) -> { String documentId = idAndResult.getT1(); SuccessOrFailure<getresult> successOrFailure = idAndResult.getT2(); map.put(documentId, mapValueTransformer.apply(documentId, successOrFailure)); } ) .block(); } }</getresult></string></m></string></string></v></m></v>

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

20

21

22

23

24

25

26

27

28

29

30

31

32

33

34

35

36

37

38

39

40

41

42

43

44

45

46

47

48

49

50

51

52

/**

* @param assortment The gathering to get paperwork from.

* @param documentIds The IDs of the paperwork to return.

* @param mapSupplier Manufacturing unit for the returned map. Suggestion:

* Cross {@code TreeMap::new} for sorted outcomes,

* or {@code HashMap::new} for unsorted.

* @param concurrency Limits the variety of Couchbases requests in flight

* on the identical time. Every invocation of this methodology has a separate quota.

* Suggestion: Begin with 256 and tune as desired.

* @param mapValueTransformerScheduler The scheduler to make use of for changing

* the outcome map values. Cross {@hyperlink Schedulers#speedy()}

* to make use of the SDK’s IO scheduler. Suggestion: In case your worth converter does IO,

* go {@hyperlink Schedulers#boundedElastic()}.

* @param mapValueTransformer A perform that takes a doc ID and a outcome,

* and returns the worth to related to that ID within the returned map.

* @param The return map’s worth kind.

* @param The kind of the map you’d wish to retailer the ends in.

* @return a Map (implementation decided by {@code mapSupplier})

* the place every given doc ID is related to the results of

* getting the corresponding doc from Couchbase.

*/

public static <V, M extends Map<String, V>> Map<String, V> bulkGet(

ReactiveCollection assortment,

Iterable documentIds,

int concurrency,

Provider mapSupplier,

Scheduler mapValueTransformerScheduler,

BiFunction<String, SuccessOrFailure, V> mapValueTransformer

) {

return Flux.fromIterable(documentIds)

.flatMap(

documentId -> Mono.zip(

Mono.simply(documentId),

assortment.get(documentId)

.map(SuccessOrFailure::success)

.onErrorResume(error -> Mono.simply(SuccessOrFailure.failure(error)))

),

concurrency

)

.publishOn(mapValueTransformerScheduler)

.gather(

mapSupplier,

(map, idAndResult) -> {

String documentId = idAndResult.getT1();

SuccessOrFailure successOrFailure = idAndResult.getT2();

map.put(documentId, mapValueTransformer.apply(documentId, successOrFailure));

}

)

.block();

}

This reactive strategy is fetching paperwork utilizing their IDs and returning a Map the place every secret’s a doc ID and the worth is the processed outcome. Whereas it’s not incorrect to gather the outcomes right into a Record and reprocess them later, a greater technique (each when it comes to efficiency and code readability) is to gather the outcomes right into a ConcurrentHashMap listed by doc ID. This avoids repeated scanning and makes outcome lookups constant-time operations. Let’s break down how this works step-by-step.

1. Making a Reactive stream from doc IDs
  In line 19, we’re making a Flux (reactive stream) from the checklist of doc IDs. For every doc ID, it calls assortment.get(documentId) to fetch the doc reactively.
2. Wrapping ends in SuccessOrFailure
  To make sure resilience, every async operation wraps the end in a SuccessOrFailure object. This wrapper captures each profitable and failed fetches. By default, if assortment.get(documentId) throws an error (e.g. community subject, lacking doc), the entire Flux stream will error out and cease processing. This isn’t ideally suited for bulk operations as we need to proceed processing different paperwork even when one fails. So as a substitute of propagating the error, it converts the failure right into a SuccessOrFailure.failure(error) object. This fashion, the downstream nonetheless will get a sound worth (SuccessOrFailure) for each documentID, whether or not profitable or failed.
3. Pairing doc IDs with outcomes utilizing Mono.zip
  Utilizing Mono.zip makes it specific that you simply’re combining the documentId and the async get outcome right into a tuple. This helps determine the affiliation between documentID and outcome, particularly when outcomes arrive out of order as a consequence of concurrency.
4. Concurrency controls what number of doc fetches are run in parallel (what number of requests are in flight without delay).
5. Parallelism and scheduler handoff
  Reactive streams are non-blocking by default, however transformation logic (e.g., parsing JSON, changing knowledge) could be CPU-intensive. Earlier than we gather the ensuing tuples, the stream switches to a caller-specified scheduler utilizing publishOn(…). This offloads the transformation work from IO threads to a separate thread pool. That ensures IO threads aren’t blocked by transformation work as a consequence of heavy computation.
6. Accumulating right into a map
  As soon as all outcomes are in, the stream collects the tuples pairs right into a map. It makes use of mapSupplier to create the map. For every (documentId, outcome) pair, it applies mapValueTransformer to remodel the uncooked outcome right into a domain-specific kind V after which places the remodeled worth into the map.
7. Blocking to retrieve ultimate outcome
  Since every part right here is asynchronous (non-blocking), block() is used to attend for your complete stream to complete and return the constructed map to the caller.

Asynchronous API

Whereas we advocate utilizing the reactive APIs for his or her efficiency, flexibility, and built-in backpressure dealing with, Couchbase additionally presents a low-level Asynchronous API for situations the place you want much more fine-grained management and efficiency tuning. Nonetheless, writing environment friendly asynchronous code comes with its personal challenges, it requires cautious administration of concurrency and backpressure to stop useful resource exhaustion and keep away from timeouts.

Under is an instance demonstrating the way to use the Async API to reinforce bulk get efficiency in Couchbase:

// async api to name get() for an array of keys Record<completablefuture>> futures = new LinkedList<>(); for (int i = 0; i < keys.measurement(); i++) { CompletableFuture<getresult> f = assortment.async().get( keys.get(i), (GetOptions) choices ); futures.add(f); } // Anticipate all Get operations to finish CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).be part of(); // Convert outcomes to JsonObjects Record<jsonobject> outcomes = new LinkedList<>(); for (CompletableFuture<getresult> future : futures) { strive { JsonObject json = future.be part of().contentAsObject(); outcomes.add(json); } catch (CompletionException e) { e.printStackTrace(); outcomes.add(null); // or skip / deal with in another way } }</getresult></jsonobject></getresult></completablefuture>

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

20

21

22

23

24

25

26

// async api to name get() for an array of keys

Record<CompletableFuture> futures = new LinkedList<>();

for (int i = 0; i < keys.measurement(); i++) {

CompletableFuture f = assortment.async().get(

keys.get(i),

(GetOptions) choices

);

futures.add(f);

}

// Anticipate all Get operations to finish

CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).be part of();

// Convert outcomes to JsonObjects

Record outcomes = new LinkedList<>();

for (CompletableFuture future : futures) {

strive {

JsonObject json = future.be part of().contentAsObject();

outcomes.add(json);

} catch (CompletionException e) {

e.printStackTrace();

outcomes.add(null); // or skip / deal with in another way

}

Let’s break down how this works step-by-step.

1. Fetch paperwork
  Right here we iterate over keys and for every key, we name assortment.async().get(key, choices), which returns a CompletableFuture after which we retailer all these futures in a listing.
2. Anticipate all fetches to complete
  CompletableFuture.allOf(…) creates a brand new future that completes when all futures within the array full..be part of() blocks the present thread till all async fetches are finished.
3. Rework outcomes
  As soon as all of the fetches are finished, we create one other checklist to carry ultimate values in plain Record. For every CompletableFuture, we retrieve and rework the outcome. Relying on the requirement, you’ll be able to deal with the failure error by both including null to the checklist instead of the failed outcome or an error marker object.
  The transformation step assumes that fetching the paperwork is full prior to reworking outcomes, nevertheless, If the aim is to proceed chaining async operations then we are able to create a listing if futures Record> and wrap the transformation in one other async wrapper.

We advocate utilizing this API solely if you’re both writing integration code for larger degree concurrency mechanisms or you really want the final drop of efficiency. In all different circumstances, the reactive API (for richness in operators) is probably going the higher alternative.

Conclusion

Reactive programming presents some of the environment friendly methods to realize excessive efficiency for bulk get operations with Couchbase. Its true energy is unlocked when utilized throughout a wholly reactive stack, the place non-blocking habits and scalability are absolutely optimized.

That mentioned, you don’t want a completely reactive structure to begin reaping the advantages. A sensible and impactful first step is emigrate simply the Couchbase CRUD layer to reactive. Doing so can dramatically scale back thread blocking and decrease CPU throttling, main to raised system responsiveness and useful resource utilization with out requiring an entire architectural overhaul.

If efficiency and scalability are priorities, reactive programming is properly definitely worth the funding, even in a partial implementation.

The creator acknowledges the Couchbase SDK group and their glorious clarification on how we are able to obtain the batching effectively with out the necessity for a generic bulk get perform, thanks.

Bulk Get Paperwork in Couchbase utilizing Reactive or Asynchronous API

What’s Bulk Get?

SDK assist for bulk operations

Reactive API

Asynchronous API

Conclusion

Related Articles

What does the longer term maintain for generative AI? | MIT Information

Finest Soundcore deal: Save $30.01 on Soundcore by Anker Q20i headphones at Amazon

How can we execute selenium automation script, with out Visible Studio?

LEAVE A REPLY Cancel reply

Latest Articles

What does the longer term maintain for generative AI? | MIT Information

Finest Soundcore deal: Save $30.01 on Soundcore by Anker Q20i headphones at Amazon

How can we execute selenium automation script, with out Visible Studio?

5 Kimi K2.5 Options for Builders [MUST TRY]

Honeywell Launches Guided Work Options to Streamline Workforce Operations