Add MCP-based external paper search (PubMed, arXiv, Google Scholar) to supplement internal evidence matching. Papers are fetched for edges with insufficient internal evidence, cached for 24 hours, and displayed in the canvas UI alongside attested evidence.
…ch-mcp
- Add `includeExternalPapers` option to POST /api/evidence/search
- Extract shared search core from edge-based search for query-based reuse
- Add biorxiv and medrxiv sources (5 sources total)
- Interleave results round-robin across sources for diversity
- Fix operator precedence bug in paper ID generation
- Add cache size limit (500 entries) and normalize title dedup
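The round-robin interleaving and normalized-title dedup mentioned in this commit can be sketched as follows. This is an illustration only: the `Paper` shape, `normalizeTitle`, and `interleaveRoundRobin` are hypothetical names, and the exact normalization rules used in the PR are not shown in this thread.

```typescript
// Hypothetical minimal shape; the real paper type is not shown in this thread.
interface Paper {
  title: string;
  source: string;
}

// Assumed normalization: lowercase, then collapse whitespace and common punctuation.
function normalizeTitle(title: string): string {
  return title
    .toLowerCase()
    .replace(/[\s.,:;!?'"()\[\]\-]+/g, " ")
    .trim();
}

// Take one paper from each source in turn so no single source dominates,
// skipping any paper whose normalized title has already been emitted.
function interleaveRoundRobin(bySource: Paper[][], limit: number): Paper[] {
  const seen = new Set<string>();
  const out: Paper[] = [];
  const longest = Math.max(0, ...bySource.map((list) => list.length));
  for (let i = 0; i < longest && out.length < limit; i++) {
    for (const list of bySource) {
      if (out.length >= limit) break;
      const paper = list[i];
      if (!paper) continue;
      const key = normalizeTitle(paper.title);
      if (seen.has(key)) continue;
      seen.add(key);
      out.push(paper);
    }
  }
  return out;
}
```

Deduplicating during the interleave, rather than per source beforehand, means the first source to return a given title keeps it.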
…ntegration Remove the @mastra/mcp dependency and Python-based paper-search-mcp server in favor of a lightweight HTTP client for the Semantic Scholar Academic Graph API. This simplifies the external paper search stack by eliminating the need for Python/uv runtime and reducing the number of external dependencies.
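A lightweight client of the kind described above might look roughly like the sketch below. The endpoint and the `query`, `limit`, and `fields` parameters belong to the public Semantic Scholar Academic Graph paper search API; everything else (`ExternalPaperSketch`, `buildSearchUrl`, `toExternalPapers`, `searchPapers`, and the injected `HttpGet` function) is a hypothetical name chosen for illustration and is not taken from the PR.

```typescript
const S2_SEARCH_URL = "https://api.semanticscholar.org/graph/v1/paper/search";

// Hypothetical result shape; the PR's ExternalPaper type is not shown here.
interface ExternalPaperSketch {
  id: string;
  title: string;
  abstract?: string;
  url?: string;
  doi?: string;
  year?: number;
}

function buildSearchUrl(query: string, limit: number = 5): string {
  const fields = "title,abstract,url,year,externalIds";
  return `${S2_SEARCH_URL}?query=${encodeURIComponent(query)}&limit=${limit}&fields=${fields}`;
}

function asString(value: unknown): string | undefined {
  return typeof value === "string" ? value : undefined;
}

// Convert the API's `{ data: [...] }` body into typed papers, dropping malformed rows.
function toExternalPapers(body: unknown): ExternalPaperSketch[] {
  if (typeof body !== "object" || body === null) return [];
  const data = (body as { data?: unknown }).data;
  if (!Array.isArray(data)) return [];
  const papers: ExternalPaperSketch[] = [];
  for (const raw of data as unknown[]) {
    if (typeof raw !== "object" || raw === null) continue;
    const rec = raw as Record<string, unknown>;
    const id = asString(rec["paperId"]);
    const title = asString(rec["title"]);
    if (id === undefined || title === undefined) continue;
    const ids = rec["externalIds"];
    const doi =
      typeof ids === "object" && ids !== null
        ? asString((ids as Record<string, unknown>)["DOI"])
        : undefined;
    const year = rec["year"];
    papers.push({
      id,
      title,
      abstract: asString(rec["abstract"]),
      url: asString(rec["url"]),
      doi,
      year: typeof year === "number" ? year : undefined,
    });
  }
  return papers;
}

// The HTTP function is injected so the sketch does not depend on a particular runtime.
type HttpGet = (url: string) => Promise<{ ok: boolean; json(): Promise<unknown> }>;

// Degrade gracefully: any HTTP or parsing failure yields an empty list.
async function searchPapers(query: string, get: HttpGet): Promise<ExternalPaperSketch[]> {
  try {
    const res = await get(buildSearchUrl(query.trim()));
    return res.ok ? toExternalPapers(await res.json()) : [];
  } catch {
    return [];
  }
}
```

In a runtime with a global `fetch`, passing `(url) => fetch(url)` as `get` should be enough to try it out.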
PR Review: Feat/Semantic Scholar Integration

Good overall direction - adding Semantic Scholar as a fallback for under-matched edges is a sensible approach. The architecture is clean and the graceful degradation patterns are solid. Here are the issues I found, grouped by severity.

Bugs / Correctness Issues

1. External papers omitted when results are empty (…)

When … Fix: always set …

    if (includeExternalPapers) {
      response.externalPapers =
        externalResult.status === "fulfilled" ? externalResult.value : [];
      if (externalResult.status === "rejected") {
        logger.warn({ error: externalResult.reason }, "External paper search failed");
      }
    }

2. Sequential external search in workflow loop (…)

The …

    await Promise.all(
      canvasData.arrows.map(async (arrow) => {
        externalPapersByArrow[arrow.id] = await searchExternalPapersForEdge(fromContent, toContent);
      })
    );

Performance / Infrastructure Concerns

3. In-memory cache does not survive serverless restarts (…)

The module-level …

4. …

    export const EXTERNAL_SEARCH_ENABLED = process.env.EXTERNAL_SEARCH_ENABLED !== "false";

This silently enables external API calls on all deployments that have not set the env var. Opt-in (…

Code Quality Issues

5. Unsafe …

    arrow.externalPapers = edge.data.externalPapers as any[];

Should cast to …

6. Duplicate buttons open the same dialog (…)

When an edge has both internal evidence and external papers, two separate buttons (emerald + blue) appear but both set the same …

7. Section header layout (…)

The "Academic Papers (Reference)" …

8. …
Minor / Low Priority

9. Truncated abstracts have no ellipsis indicator (…)

10. Unvalidated …

11. …

Summary

Items 1 and 2 are the most important to address before merging.
- Add Gemini 2.0 Flash-based keyword extraction to convert card titles (including Japanese) into English academic search keywords
- Fix buildSearchQuery to use only card titles instead of full title+description, preventing garbled truncated queries
- Add empty/short query guard in Semantic Scholar client to avoid 500 errors
- Add @ai-sdk/google dependency for direct Vercel AI SDK usage
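The title-only query and the fall-back-to-raw-titles behaviour can be sketched as below. The `split(".")[0]` title rule follows the snippet quoted later in this thread. The `KeywordExtractor` type, `titleOf`, `fallbackQuery`, and `keywordsOrTitles` are hypothetical names, and the extractor is injected rather than wired to Gemini so the sketch stays independent of any SDK.

```typescript
// Hypothetical signature for an LLM-backed extractor (the PR uses Gemini 2.0 Flash).
type KeywordExtractor = (fromTitle: string, toTitle: string) => Promise<string>;

// A card's title is taken to be everything before the first period.
function titleOf(content: string): string {
  return content.split(".")[0].trim();
}

// Deterministic query built from titles only, used when extraction fails.
function fallbackQuery(fromTitle: string, toTitle: string): string {
  return [fromTitle, toTitle].filter((t) => t.length > 0).join(" ");
}

async function keywordsOrTitles(
  fromContent: string,
  toContent: string,
  extract: KeywordExtractor,
): Promise<string> {
  const fromTitle = titleOf(fromContent);
  const toTitle = titleOf(toContent);
  const fallback = fallbackQuery(fromTitle, toTitle);
  // Empty-query guard: nothing to search for, so skip the LLM call entirely.
  if (fallback === "") return "";
  try {
    const keywords = (await extract(fromTitle, toTitle)).trim();
    return keywords.length > 0 ? keywords : fallback;
  } catch {
    // Extraction failures should not block the search.
    return fallback;
  }
}
```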
Code Review: Semantic Scholar Integration

Good overall direction: adding external academic evidence as a fallback when internal matches are sparse is a solid feature. The layered architecture (type → API client → orchestration → workflow step → UI) is clean and follows the project's patterns well. A few issues worth addressing before merging:

Bugs / Correctness

1. "Academic Papers" section heading missing when no internal evidence

In …

    {hasExternalPapers && (
      <>
        {hasEvidence && ( // <-- heading only shows when both exist
          <div className="border-t pt-4">
            <h3 ...>Academic Papers (Reference)</h3>
          </div>
        )}
        {externalPapers.map(...)}
      </>
    )}

If an edge has only external papers (no internal evidence), users see a list of blue cards with no heading explaining what they are. The heading should always render when external papers are present:

    {hasExternalPapers && (
      <>
        <div className={hasEvidence ? "border-t pt-4" : ""}>
          <h3 className="mb-3 text-sm font-semibold text-gray-500">
            Academic Papers (Reference)
          </h3>
        </div>
        {externalPapers.map(...)}
      </>
    )}

2. In …

    const rawQuery = buildSearchQuery(fromContent, toContent); // only used for empty check
    if (!rawQuery) return [];
    const fromTitle = fromContent.split(".")[0].trim(); // duplicates buildSearchQuery logic
    const toTitle = toContent.split(".")[0].trim();
    const query = await extractSearchKeywords(fromTitle, toTitle); // this is what's used

Either pass …

Performance

3. Sequential external paper search in workflow
    for (const arrow of canvasData.arrows) {
      externalPapersByArrow[arrow.id] = await searchExternalPapersForEdge(fromContent, toContent);
    }

These are independent and should run in parallel:

    const results = await Promise.allSettled(
      canvasData.arrows.map(async (arrow) => {
        // ... build fromContent, toContent ...
        const papers = await searchExternalPapersForEdge(fromContent, toContent);
        return { arrowId: arrow.id, papers };
      })
    );
    for (const result of results) {
      if (result.status === "fulfilled") {
        externalPapersByArrow[result.value.arrowId] = result.value.papers;
      }
    }

Design / Semantics

4. Green edge color for external-paper-only edges

In …

    stroke: hasAnyContent ? "#10b981" : "#6b7280",

Green edges visually signal "validated by attested evidence" in this system. Using the same color for "has reference papers only" blurs that distinction. Consider using the blue accent (…

5. Comments say "paper-search-mcp" but implementation is direct REST

Multiple doc comments reference …

The implementation calls the Semantic Scholar REST API directly; no MCP is involved. Update these to say "Semantic Scholar API".

Minor Issues

6. …

7. …

    arrow.externalPapers = edge.data.externalPapers as any[];

This is consistent with the …

8. Two buttons, one dialog

In …

Positive Notes
…ading
- Always return externalPapers as array (not undefined) when includeExternalPapers is true
- Show "Academic Papers (Reference)" heading even when only external papers exist (no internal evidence)
…lSettled Replace sequential for...of loop with Promise.allSettled to search all arrows concurrently. Each arrow's LLM keyword extraction + HTTP request is independent, so parallelization significantly reduces total latency for logic models with multiple edges.
…olar integration
- Replace `as any[]` casts with proper `EvidenceMatch[]` / `ExternalPaper[]` types
- Differentiate edge colors: green (attested evidence), blue (external papers only), gray (none)
- Consolidate dual buttons into single button when both evidence types exist on an edge
- Update all "paper-search-mcp" references to "Semantic Scholar API"
- Change EXTERNAL_SEARCH_ENABLED default to opt-in (=== "true") for safer deployments
- Add EXTERNAL_SEARCH_ENABLED to .env.example
- Add ellipsis to truncated abstracts, remove no-op `?? undefined` expressions
- Fix word-boundary truncation in search query builder
- Remove dead code in searchExternalPapersForEdge (unused buildSearchQuery call)
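The three-tier edge color rule from this commit could be expressed as a small pure function. The green and gray hex values appear in the review above; the blue value, the `EdgeTier` type, and the function names are assumptions made for this sketch.

```typescript
type EdgeTier = "attested" | "external-only" | "none";

const EDGE_COLORS: Record<EdgeTier, string> = {
  attested: "#10b981", // green, quoted in the review
  "external-only": "#3b82f6", // blue; the exact hex is an assumption
  none: "#6b7280", // gray, quoted in the review
};

// Attested evidence wins over external papers, so an edge with both stays green.
function edgeTier(evidenceCount: number, externalPaperCount: number): EdgeTier {
  if (evidenceCount > 0) return "attested";
  if (externalPaperCount > 0) return "external-only";
  return "none";
}

function edgeStroke(evidenceCount: number, externalPaperCount: number): string {
  return EDGE_COLORS[edgeTier(evidenceCount, externalPaperCount)];
}
```

Keeping the tier decision separate from the color table lets the dialog and button logic reuse the same `edgeTier` result.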
PR Review: Feat/semantic scholar

Good overall structure for integrating external paper search. The parallel execution design (…

🐛 Bugs

1. Incorrect JSDoc comment on …

The function is documented as "Deterministic (no LLM calls)" but it explicitly calls …

    // Extract English academic keywords via LLM (falls back to raw titles on failure)
    const query = await extractSearchKeywords(fromTitle, toTitle);

Only …

2. Query truncation bug in …

    const searchQuery =
      trimmed.length > 200 ? trimmed.slice(0, trimmed.lastIndexOf(" ", 200) || 200) : trimmed;

When there is no space in the first 200 characters, lastIndexOf returns -1, which is truthy, so the `|| 200` fallback never applies and the call becomes slice(0, -1). Checking the index explicitly avoids this:

    const lastSpace = trimmed.lastIndexOf(" ", 200);
    const searchQuery = trimmed.length > 200
      ? trimmed.slice(0, lastSpace > 0 ? lastSpace : 200)
      : trimmed;

3. LLM-generated keywords defeat the cache for the workflow path (…

Consider using the deterministic …
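To see the truncation bug from item 2 in isolation, the sketch below puts the two variants side by side as standalone functions. The 200-character limit comes from the review; the function names are hypothetical.

```typescript
const MAX_QUERY_LENGTH = 200;

// Buggy variant: lastIndexOf returns -1 when no space is found, and -1 is truthy,
// so `|| max` never applies and slice(0, -1) only drops the final character.
function truncateBuggy(query: string, max: number = MAX_QUERY_LENGTH): string {
  return query.length > max ? query.slice(0, query.lastIndexOf(" ", max) || max) : query;
}

// Fixed variant: cut at the last space at or before `max`, or hard-cut at `max`
// when the text has no usable space.
function truncateAtWordBoundary(query: string, max: number = MAX_QUERY_LENGTH): string {
  const trimmed = query.trim();
  if (trimmed.length <= max) return trimmed;
  const lastSpace = trimmed.lastIndexOf(" ", max);
  return trimmed.slice(0, lastSpace > 0 ? lastSpace : max);
}
```

For a 250-character string with no spaces, `truncateBuggy` returns 249 characters while `truncateAtWordBoundary` returns 200.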
…orrect JSDoc
- Use deterministic buildSearchQuery output as cache key for edge-based searches instead of LLM-generated keywords, ensuring cache hits for repeated edge queries
- Fix lastIndexOf returning -1 (truthy) causing slice(0, -1) instead of slice(0, 200)
- Correct JSDoc: searchExternalPapersForEdge uses LLM (Gemini), not deterministic
- Make buildSearchQuery private (no external callers) and reuse within the module
- Document cache limitations: in-memory resets on serverless cold starts, FIFO eviction
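An in-memory cache with the properties described in this thread (24-hour TTL, 500-entry cap, FIFO eviction) might be sketched like this. The class and constant names are hypothetical. As the commit notes, such a module-level cache is lost on every serverless cold start.

```typescript
const CACHE_TTL_MS = 24 * 60 * 60 * 1000; // 24 hours
const CACHE_MAX_ENTRIES = 500;

class FifoTtlCache<V> {
  // Map preserves insertion order, which gives FIFO eviction for free.
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly maxEntries: number;
  private readonly ttlMs: number;

  constructor(maxEntries: number = CACHE_MAX_ENTRIES, ttlMs: number = CACHE_TTL_MS) {
    this.maxEntries = maxEntries;
    this.ttlMs = ttlMs;
  }

  // `now` is a parameter so expiry can be exercised without waiting.
  get(key: string, now: number = Date.now()): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt <= now) {
      this.entries.delete(key);
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V, now: number = Date.now()): void {
    // Evict the oldest insertion only when adding a new key to a full cache.
    if (!this.entries.has(key) && this.entries.size >= this.maxEntries) {
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
    this.entries.set(key, { value, expiresAt: now + this.ttlMs });
  }

  get size(): number {
    return this.entries.size;
  }
}
```

Keying such a cache on the deterministic query string rather than on LLM output is what makes repeated edge searches hit the same entry.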
The example file previously set EXTERNAL_SEARCH_ENABLED="true" which contradicts the documented default of disabled. Use empty string so copying .env.example verbatim keeps external search disabled until explicitly enabled.
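The opt-in behaviour can be checked with a tiny helper. `isExternalSearchEnabled` is a hypothetical name; the strict `=== "true"` comparison follows the earlier commit message. Taking the environment as a parameter keeps the function easy to test.

```typescript
// Opt-in: only the exact string "true" enables external search.
// An unset variable, an empty string, or any other value leaves it disabled.
function isExternalSearchEnabled(env: Record<string, string | undefined>): boolean {
  return env["EXTERNAL_SEARCH_ENABLED"] === "true";
}
```

Under this rule, a `.env.example` copied verbatim with an empty value keeps the feature off. In application code the helper would be called with `process.env`.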
PR Review: Feat/semantic-scholar (#194)

This is a well-structured feature that adds Semantic Scholar integration for supplementing internal evidence matching. The architecture is clean, the degradation is graceful, and the opt-in feature flag approach is the right call for production safety. A few issues worth addressing before merge:

Bugs / Correctness

1. Missing …

Suggested addition to …

2. Double cache writes in …

3. …

    const lastSpace = trimmed.lastIndexOf(" ", 200);
    const searchQuery = trimmed.length > 200 ? trimmed.slice(0, lastSpace > 0 ? lastSpace : 200) : trimmed;

Security

4. Unvalidated …

    href={paper.doi ? `https://doi.org/${paper.doi}` : paper.url}

    url: raw.url?.startsWith("https://") ? raw.url : undefined,

Performance

5. No concurrency limit on parallel Semantic Scholar requests

6. The project has a configurable …

Code Quality

7. Currently …

8. The new description "X items linked to this connection" groups blockchain-attested evidence and unvetted external papers into one count. Since these have meaningfully different epistemic weight, showing them separately would be clearer for users:

Positives

Summary

The main pre-merge fix is adding …
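One way to address the concurrency concern in item 5 without adding a dependency is a small worker-pool helper that mirrors the shape of `Promise.allSettled`. The helper name, the `Settled` type, and any particular limit are assumptions for this sketch, not part of the PR.

```typescript
// Local mirror of the settled-result shape, so no newer lib typings are needed.
type Settled<R> =
  | { status: "fulfilled"; value: R }
  | { status: "rejected"; reason: unknown };

// Run `task` over `items` with at most `limit` calls in flight at once.
// Results keep input order, and one failure never aborts the others.
async function mapSettledWithLimit<T, R>(
  items: readonly T[],
  limit: number,
  task: (item: T) => Promise<R>,
): Promise<Settled<R>[]> {
  const results: Settled<R>[] = new Array(items.length);
  let next = 0;

  async function worker(): Promise<void> {
    while (next < items.length) {
      const index = next++; // claimed synchronously, so no two workers share an index
      try {
        results[index] = { status: "fulfilled", value: await task(items[index]) };
      } catch (reason) {
        results[index] = { status: "rejected", reason };
      }
    }
  }

  const workerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

In the workflow step, each arrow's search would be the `task`, and the fulfilled results could be copied into `externalPapersByArrow` exactly as in the `Promise.allSettled` version.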
Add Step 2.5 (external academic paper search) to workflow diagrams and descriptions. Document three-tier edge color system (green/blue/gray), ExternalPaper type, and new file references for academic API clients.
Add attested evidence vs external papers comparison table, update UI presentation to three-tier color system, and add external search file references.
Document three-tier color logic (green/blue/gray), FileText/BookOpen button differentiation, two-section evidence dialog layout, and updated code examples matching current implementation.
Mirror all English documentation changes in Japanese: add Step 2.5 workflow diagrams, external paper search description, three-tier color system, two-section dialog, and updated frontend component details.
Add Semantic Scholar API to technology stack in CLAUDE.md and mention optional external paper search in README.md features section.
PR Review: Feat/Semantic Scholar Integration

Great feature addition! The Semantic Scholar integration is well-structured and adds meaningful academic context for under-evidenced edges. The code is clean and architecture decisions are sound. Here is my detailed feedback:

Strengths

Must Fix

Missing Gemini API key documentation in …

The new … Suggested addition to …

Should Fix

XSS risk: unvalidated …

Option A, validate at render time:

    href={
      paper.doi
        ? `https://doi.org/${paper.doi}`
        : paper.url?.match(/^https?:\/\//)
          ? paper.url
          : undefined
    }

Option B, validate in …

Suggestions

1. Mixed-language comment in English docs (…)

A Japanese comment was accidentally left in the English Mermaid diagram (around line 952 of the diff): … Should be English in …

2. Step numbering: "Step 2.5" is unconventional

Renaming to a clean 4-step sequence would improve log readability: …

3. Server-side abstract truncation is redundant (…)

The UI already applies …

4. Document the intentional double-write to cache (…)

In …

5. In-memory cache lifespan on serverless (awareness note)

The existing comment correctly notes the cache resets on cold starts. The 24h TTL is effectively meaningless on Vercel/Lambda deployments. Worth noting as a known limitation: a shared cache (Redis/Upstash) would make the TTL durable at production scale.

Summary

The core logic is solid. The two items to address before merging are the missing …
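The URL validation raised in the last two reviews can be isolated in a helper so the same check serves both ingestion and rendering. This is a sketch under the assumption that a paper carries optional `doi` and `url` strings; `safePaperHref` is a hypothetical name.

```typescript
// Build a link target for a paper, refusing anything that is not http(s).
// A non-empty DOI always resolves through doi.org, so it takes priority.
function safePaperHref(paper: { doi?: string; url?: string }): string | undefined {
  if (paper.doi) return `https://doi.org/${paper.doi}`;
  const url = paper.url;
  if (url !== undefined && /^https?:\/\//i.test(url)) return url;
  // e.g. "javascript:..." or "data:..." URLs from an untrusted API response
  return undefined;
}
```

Returning `undefined` lets the caller render plain text instead of a link when no safe target exists.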