9 changes: 8 additions & 1 deletion README.md
@@ -73,7 +73,6 @@ If you see version numbers for all three, you are ready to proceed with the inst

## Installation


### Running on Claude Desktop

To configure Octagon MCP for Claude Desktop:
@@ -276,6 +275,14 @@ Research the financial impact of Apple's privacy changes on digital advertising
2. **Connection Issues**: Make sure the connectivity to the Octagon API is working properly.
3. **Rate Limiting**: If you encounter rate limiting errors, reduce the frequency of your requests.

## Running Evals

The evals package loads an MCP client that runs the `src/index.ts` file directly, so there is no need to rebuild between tests. To set environment variables, prefix the `npx` command with them. Full documentation is available [here](https://www.mcpevals.io/docs).

```bash
OPENAI_API_KEY=your-key npx mcp-eval src/evals/evals.ts src/index.ts
```

## Installation

### Running with npx
5 changes: 3 additions & 2 deletions package.json
@@ -38,7 +38,8 @@
"@modelcontextprotocol/sdk": "^1.0.0",
"dotenv": "^16.3.1",
"openai": "^4.20.1",
- "zod": "^3.22.4"
+ "zod": "^3.22.4",
+ "mcp-evals": "^1.0.18"
},
"devDependencies": {
"@types/node": "^20.10.0",
@@ -56,4 +57,4 @@
"url": "https://github.com/OctagonAI/octagon-mcp-server/issues"
},
"homepage": "https://docs.octagonagents.com"
-}
+}
59 changes: 59 additions & 0 deletions src/evals/evals.ts
@@ -0,0 +1,59 @@
//evals.ts

import { EvalConfig } from 'mcp-evals';
import { openai } from "@ai-sdk/openai";
import { grade, EvalFunction } from "mcp-evals";

const octagonSecAgentEval: EvalFunction = {
    name: "octagon-sec-agent Tool Evaluation",
    description: "Evaluates the SEC filings analysis capabilities of the octagon-sec-agent",
    run: async () => {
        const result = await grade(openai("gpt-4"), "What was Apple's R&D expense as a percentage of revenue in their latest fiscal year?");
        return JSON.parse(result);
Comment on lines +11 to +12
🛠️ Refactor suggestion

Add error handling for grade function and JSON parsing

The grade function could fail if there are API issues, and JSON.parse might fail if the result isn't properly formatted JSON.

-    run: async () => {
-        const result = await grade(openai("gpt-4"), "What was Apple's R&D expense as a percentage of revenue in their latest fiscal year?");
-        return JSON.parse(result);
+    run: async () => {
+        try {
+            const result = await grade(openai("gpt-4"), "What was Apple's R&D expense as a percentage of revenue in their latest fiscal year?");
+            return JSON.parse(result);
+        } catch (error) {
+            console.error("Error in octagonSecAgentEval:", error);
+            throw error;
+        }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-        const result = await grade(openai("gpt-4"), "What was Apple's R&D expense as a percentage of revenue in their latest fiscal year?");
-        return JSON.parse(result);
+    run: async () => {
+        try {
+            const result = await grade(openai("gpt-4"), "What was Apple's R&D expense as a percentage of revenue in their latest fiscal year?");
+            return JSON.parse(result);
+        } catch (error) {
+            console.error("Error in octagonSecAgentEval:", error);
+            throw error;
+        }
+    }
🤖 Prompt for AI Agents
In src/evals/evals.ts around lines 11 to 12, add error handling for the grade
function call and the JSON.parse operation. Wrap the await grade call and
JSON.parse in a try-catch block to catch any exceptions from API failures or
invalid JSON. In the catch block, handle or log the error appropriately to
prevent the function from crashing unexpectedly.

    }
};

const octagonTranscriptsAgentEval: EvalFunction = {
    name: "octagon-transcripts-agent Evaluation",
    description: "Evaluates the accuracy and completeness of the octagon-transcripts-agent for analyzing earnings call transcripts",
    run: async () => {
        const result = await grade(openai("gpt-4"), "What did Amazon's CEO say about AWS growth expectations in the latest earnings call?");
        return JSON.parse(result);
    }
};

const octagonFinancialsAgentEval: EvalFunction = {
    name: "octagon-financials-agent Evaluation",
    description: "Evaluates the financial analysis and ratio calculation capabilities of the octagon-financials-agent",
    run: async () => {
        const result = await grade(openai("gpt-4"), "Compare the gross margins, operating margins, and net margins of Apple, Microsoft, and Google over the last 3 years and provide insights on which company shows the strongest profitability trends.");
        return JSON.parse(result);
    }
};

const octagonStockDataAgentEval: EvalFunction = {
    name: "Octagon Stock Data Agent Evaluation",
    description: "Evaluates the performance of the Octagon Stock Data Agent for stock market data and valuation analysis",
    run: async () => {
        const result = await grade(openai("gpt-4"), "Compare Apple's stock performance to the S&P 500 over the last 6 months, including any significant events or catalysts that influenced price movements.");
        return JSON.parse(result);
    }
};

const octagonCompaniesAgentEval: EvalFunction = {
    name: 'octagon-companies-agent Evaluation',
    description: 'Evaluates the specialized private market intelligence tool for company info lookups and financials',
    run: async () => {
        const result = await grade(openai("gpt-4"), "List the top 5 companies in the AI sector by revenue growth");
        return JSON.parse(result);
    }
};

const config: EvalConfig = {
    model: openai("gpt-4"),
    evals: [octagonSecAgentEval, octagonTranscriptsAgentEval, octagonFinancialsAgentEval, octagonStockDataAgentEval, octagonCompaniesAgentEval]
};

export default config;

export const evals = [octagonSecAgentEval, octagonTranscriptsAgentEval, octagonFinancialsAgentEval, octagonStockDataAgentEval, octagonCompaniesAgentEval];
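
The review comment on lines 11 to 12 asks for error handling around the `grade` call and `JSON.parse`, and the same two-line pattern recurs in all five evals. One possible way to apply the suggestion once is a small factory, sketched below under assumptions. `Grader`, `SimpleEval`, `parseGradeResult`, and `makeEval` are hypothetical names that do not come from `mcp-evals`. `SimpleEval` only mirrors the three fields used in this file, and the grader is passed in as a plain function so the sketch runs without the library.

```typescript
// Sketch only: these names are hypothetical and not part of mcp-evals.

// A grading function takes a prompt and resolves to a JSON string.
type Grader = (prompt: string) => Promise<string>;

// Stand-in for the three EvalFunction fields used in evals.ts.
interface SimpleEval {
    name: string;
    description: string;
    run: () => Promise<unknown>;
}

// Parse the grader output, labelling any failure with the eval name.
function parseGradeResult(evalName: string, raw: string): unknown {
    try {
        return JSON.parse(raw);
    } catch (error) {
        throw new Error(`${evalName}: grader returned invalid JSON (${String(error)})`);
    }
}

// Build an eval whose run() reports which eval and which step failed.
function makeEval(name: string, description: string, prompt: string, grader: Grader): SimpleEval {
    return {
        name,
        description,
        run: async () => {
            // A rejected grading call is rethrown with the eval name attached.
            const raw = await grader(prompt).catch((error: unknown) => {
                throw new Error(`${name}: grading call failed (${String(error)})`);
            });
            return parseGradeResult(name, raw);
        },
    };
}
```

In `evals.ts` the grader argument would presumably be something like `(prompt) => grade(openai("gpt-4"), prompt)`, assuming `grade` resolves to a JSON string as the existing `JSON.parse(result)` calls imply. Rethrowing with the eval name keeps the failure visible to the runner while making it clear which of the five evals broke.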