independent research lab
ZBS GG
zbs·gg·no reset. no amnesia.

empathic memory bench v3

recall@3 · n=100 probe suite · 60 events · cosine leads overall (0.420 vs 0.416) · Pulse v3 leads the stateful axis (0.419 vs cosine_state 0.314) · we say both plainly

system               overall R@3   core    stateful   multi-signal   chain
cosine               0.420         0.583   0.343      0.307          0.533
Pulse v3             0.416         0.517   0.419      0.333          0.412
cosine_state         0.390         0.517   0.314      0.293          0.517
hybrid               0.285         0.500   0.219      0.173          0.325
hybrid_state         0.262         0.433   0.219      0.133          0.325
state_concat_only    0.203         0.100   0.124      0.147          0.517
bm25                 0.156         0.350   0.086      0.080          0.179

system                                    overall R@3   core    stateful   multi-signal   chain
Pulse v3 (Cohere embed-v4.0)              0.416         0.517   0.419      0.333          0.412
claude-mem                                0.400         0.600   0.305      0.333          0.450
LangMem                                   0.397         0.617   0.352      0.253          0.433
LlamaIndex Memory                         0.397         0.617   0.352      0.253          0.433
Pulse v3 (TE3-small, backbone-matched)    0.375         0.467   0.390      0.280          0.375
Mem0                                      0.347         0.567   0.257      0.280          0.371
OpenAI Memory (TE3-large)                 0.307         0.550   0.257      0.107          0.404
Graphiti (Zep)                            0.120         0.433   0.038      0.040          0.050

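the stateful axis is the paper's supported claim: same query, different user state, different ideal episode. on overall R@3 Pulse v3 does not lead: cosine does (0.420 vs 0.416, +0.004), and cosine also leads core (0.583) and chain (0.533). backbone-matched Pulse (TE3-small, 0.375) is not the overall adapter winner either: claude-mem (0.400), LangMem (0.397) and LlamaIndex Memory (0.397) are ahead on overall. what survives every cut is the stateful lead: 0.419 with Cohere embed-v4.0 and 0.390 backbone-matched, against a best competing stateful score of 0.352 (LangMem and LlamaIndex Memory).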
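
a minimal sketch of what a stateful probe and recall@3 could look like, for readers who want the metric spelled out. everything in it is an assumption for illustration, not the bench's actual harness: the `Probe` fields, the episode ids, the two toy rankers, and the reading of recall@3 as "share of probes whose single ideal episode lands in the top 3".

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Probe:
    # Hypothetical probe shape; the real suite's schema is not shown in the source.
    query: str
    user_state: str
    ideal_episode: str  # id of the single gold episode (assumed: one per probe)


def recall_at_k(probes: List[Probe], ranker: Callable[[Probe], List[str]], k: int = 3) -> float:
    """Fraction of probes whose ideal episode appears in the ranker's top k."""
    if not probes:
        raise ValueError("need at least one probe")
    hits = sum(p.ideal_episode in ranker(p)[:k] for p in probes)
    return hits / len(probes)


# Same query, different user state, different ideal episode.
probes = [
    Probe("what helped last time?", "anxious", "ep_breathing"),
    Probe("what helped last time?", "celebrating", "ep_party"),
]


def state_blind(probe: Probe) -> List[str]:
    # Toy ranker that ignores user_state, so both probes get one fixed ranking.
    return ["ep_breathing", "ep_walk", "ep_call", "ep_party"]


def state_aware(probe: Probe) -> List[str]:
    # Toy ranker that moves the state-matching episode to the front.
    preferred = {"anxious": "ep_breathing", "celebrating": "ep_party"}[probe.user_state]
    return [preferred] + [e for e in state_blind(probe) if e != preferred]


blind_score = recall_at_k(probes, state_blind)  # 0.5: the second probe's episode is ranked 4th
aware_score = recall_at_k(probes, state_aware)  # 1.0: both ideal episodes are in the top 3
print(blind_score, aware_score)
```

a ranker that sees only the query text can satisfy at most one of two probes that share that text, which is the gap the stateful column is meant to expose.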