Enterprises often face challenges in retrieving accurate and unified information across disparate data systems. Agent Search addresses this by combining AWS-native services such as Amazon EKS, Bedrock, Neptune, OpenSearch, and RDS to deliver semantic search over both structured and unstructured data. Built on a GraphRAG architecture, the platform pairs LLM-powered reasoning with real-time context retrieval from graph and vector stores. This case study outlines how Agent Search helped customers unify access to knowledge silos, enhancing productivity, compliance, and decision-making across departments.
Customer Information
Customer: Confidential (Representative of BFSI, Healthcare, and E-commerce sectors)
Industry: Multi-industry (BFSI, Healthcare, E-commerce)
Location: Global operations
Company Size: 1,000+ employees
Business Challenges
Data scattered across relational databases, search indexes, object storage, and knowledge graphs created a fragmented experience for business analysts and compliance teams.
Existing tools offered only keyword-based search with no semantic understanding, limiting their effectiveness for legal, policy, and customer insights.
Stakeholders lacked traceable, accurate answers to business-critical questions.
Regulatory requirements demanded precise documentation access with traceability.
There was increasing pressure to enable real-time policy discovery, especially in fast-changing compliance environments.
Technical Challenges
Integrating and indexing data from multiple sources: S3, RDS, Neptune, and OpenSearch.
Ensuring secure IAM-based access while maintaining auditability and compliance.
Enabling performant semantic search at scale on containerized infrastructure.
Operationalizing GraphRAG on Kubernetes while managing LLM interactions through Bedrock.
Aligning with AWS security and VPC policies for enterprise compliance.
Solution Overview
Agent Search delivers an intelligent, LLM-powered, real-time semantic search platform built on Kubernetes and AWS-native services. The platform enables cross-silo search by integrating graph databases (Neptune), vector stores (OpenSearch), and relational data (RDS), enhanced through Bedrock LLMs. A modular indexing framework built on Kubernetes jobs handles ingestion, while the GraphRAG layer performs dynamic retrieval-augmented generation (RAG). IAM-secured APIs allow seamless integration across the enterprise.
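The retrieval flow described above can be sketched end to end. The snippet below is a minimal, self-contained illustration only: the in-memory dictionaries stand in for OpenSearch (vector store) and Neptune (graph store), the embeddings and document IDs are invented, and in production the assembled prompt would be sent to a Bedrock model via the `bedrock-runtime` InvokeModel API rather than returned as text.

```python
import math

# Toy stand-ins for the real stores; all data here is illustrative.
VECTOR_STORE = {
    "doc-policy-001": ([0.9, 0.1], "KYC policy: customer identity must be verified."),
    "doc-policy-002": ([0.1, 0.9], "Return policy: refunds accepted within 30 days."),
}
GRAPH_STORE = {
    "doc-policy-001": ["regulation:AML-2020", "department:Compliance"],
    "doc-policy-002": ["department:Customer-Service"],
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def vector_search(query_vec, k=1):
    # Stands in for an OpenSearch k-NN query over document embeddings.
    ranked = sorted(VECTOR_STORE.items(),
                    key=lambda item: cosine(query_vec, item[1][0]),
                    reverse=True)
    return [(doc_id, text) for doc_id, (_vec, text) in ranked[:k]]

def graph_context(doc_ids):
    # Stands in for a Neptune traversal pulling entities linked to each hit.
    return {doc_id: GRAPH_STORE.get(doc_id, []) for doc_id in doc_ids}

def build_prompt(question, hits, context):
    # The GraphRAG step: merge vector hits and graph context into one prompt.
    lines = [f"Question: {question}", "Context:"]
    for doc_id, text in hits:
        lines.append(f"- {text} (source: {doc_id}; "
                     f"related: {', '.join(context[doc_id])})")
    return "\n".join(lines)

hits = vector_search([1.0, 0.0], k=1)
prompt = build_prompt("What is the KYC requirement?", hits,
                      graph_context([doc_id for doc_id, _ in hits]))
```

The key design point is that the graph expansion happens after the vector search, so the LLM sees not just the matching document but its regulatory and organizational neighborhood.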
AWS Services Used
Amazon EKS: Hosts the GraphRAG API server and indexing workers.
Amazon Bedrock: Provides LLM reasoning capabilities.
Amazon Neptune: Stores graph data and metadata context.
Amazon OpenSearch: Used as a vector DB for semantic similarity search.
Amazon RDS: Stores structured data such as policy tables.
Amazon S3: Houses documents and unstructured files.
IAM: Controls secure, scoped access to services.
CloudWatch: Enables observability and alerting.
Amazon ECR: Container image repository for deployment.
Implementation Details
Deployment: Helm-based deployment on Amazon EKS using separate control and data planes.
Methodology: Agile delivery with continuous integration and testing pipelines.
Security: IAM roles scoped per service with TLS in transit and SSE encryption at rest.
Testing: Ran synthetic and real-data search scenarios with trace logging.
Timeline: Initial prototype in 2 weeks; production deployment in 6 weeks.
Integration: API Gateway exposed the search APIs to web UIs and Slack bots.
Observability: CloudWatch and custom dashboards ensure transparency in query resolution.
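Transparency in query resolution hinges on emitting one structured trace record per query. The wrapper below is a minimal sketch: the field names (`trace_id`, `latency_ms`, `answer_chars`) are illustrative, and the JSON lines printed to stdout are assumed to be collected by a CloudWatch Logs agent running on the worker nodes.

```python
import json
import time
import uuid

def trace_query(question, handler):
    """Run a search handler and emit a structured trace for the query.

    The printed JSON line is assumed to be shipped to CloudWatch Logs,
    where metric filters can alert on latency or error fields.
    """
    trace_id = str(uuid.uuid4())
    start = time.perf_counter()
    answer = handler(question)
    record = {
        "trace_id": trace_id,
        "question": question,
        "latency_ms": round((time.perf_counter() - start) * 1000, 2),
        "answer_chars": len(answer),
    }
    print(json.dumps(record))  # stdout is collected by the log agent
    return answer, record
```

Keeping the trace record flat and JSON-serializable makes it equally usable for CloudWatch metric filters and for the custom dashboards mentioned above.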
Best Practices
Used GraphRAG with AWS-native vector and graph stores for real-time contextual augmentation.
Employed Kubernetes autoscaling for indexing scalability and latency reduction.
Followed AWS Well-Architected Framework: security, reliability, performance efficiency.
Integrated CI/CD pipelines to manage iterative improvement cycles and LLM prompt tuning.
Introduced reusable Helm charts and IAM templates for faster customer rollout.
Results and Benefits
Reduction in Time to Insight: 65% faster discovery of policy and compliance documents.
Increased Query Accuracy: 80% improvement in answer relevancy compared to legacy search.
Improved Compliance: Enabled traceable document access for audits and regulators.
Team Productivity: Reduced analyst research time by ~40% via semantic automation.
Cost Efficiency: Eliminated siloed tools, reducing TCO by ~30%.
Architecture Highlights
Elastic Scalability: EKS-based worker pods scale with query load.
Performance: Optimized through async indexing and Bedrock’s LLM streaming.
Availability: High availability via multi-zone Kubernetes deployment.
Security: Full audit trail with scoped IAM, VPC isolation, and encrypted data layers.
Modular Extensibility: New sources and use cases can be added with minimal rework.
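The async-indexing optimization noted above can be illustrated with a small `asyncio` sketch. Here `index_document` is a hypothetical stand-in for the real embed-and-write step (the sleep mimics I/O latency), and the semaphore bounds concurrency the way the indexing worker pods would.

```python
import asyncio

async def index_document(doc_id, sem):
    # Hypothetical stand-in for embedding a document and writing it to
    # OpenSearch/Neptune; the sleep simulates the I/O round trip.
    async with sem:
        await asyncio.sleep(0.01)
        return doc_id

async def index_all(doc_ids, max_concurrency=4):
    # Bounded concurrency overlaps I/O without overwhelming the
    # downstream stores; gather preserves input order in its results.
    sem = asyncio.Semaphore(max_concurrency)
    return await asyncio.gather(*(index_document(d, sem) for d in doc_ids))

results = asyncio.run(index_all([f"doc-{i}" for i in range(10)]))
```

The same bounded-concurrency pattern applies whether the limit protects a database connection pool or a per-account API quota.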
Challenges Encountered
Initial ingestion of large legacy datasets required batch migration jobs and tuning of indexing pods.
Managing Bedrock API limits required implementing retry and fallback strategies.
Ensuring consistent context mapping across the graph and vector stores required schema unification.
Lessons Learned
Pre-indexing data during off-peak hours reduces latency under heavy query load.
Establish robust observability (CloudWatch + custom traces) for accelerated debugging.
Modularizing the agent pipeline helps isolate and tune RAG components independently.
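In practice, modularizing the pipeline usually comes down to a shared retriever interface. The sketch below uses hypothetical names throughout: any component implementing `retrieve` (vector, graph, keyword) can be swapped or tuned independently of the merge logic.

```python
from typing import List, Protocol

class Retriever(Protocol):
    # Minimal contract every pipeline component must satisfy.
    def retrieve(self, query: str) -> List[str]: ...

class KeywordRetriever:
    # Trivial example component; a vector or graph retriever would
    # implement the same interface against OpenSearch or Neptune.
    def __init__(self, docs):
        self.docs = docs

    def retrieve(self, query):
        return [d for d in self.docs if query.lower() in d.lower()]

def run_pipeline(query, retrievers):
    # Merge results from all components, preserving order and
    # de-duplicating, so each retriever can be tuned in isolation.
    seen, merged = set(), []
    for retriever in retrievers:
        for doc in retriever.retrieve(query):
            if doc not in seen:
                seen.add(doc)
                merged.append(doc)
    return merged
```

Because the pipeline only depends on the `Retriever` protocol, a component can be replaced or A/B-tested without touching ingestion or prompt assembly.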
Future Roadmap
Expand deployment to multiple regions and introduce multi-tenant features.
Add support for user feedback loops to tune model outputs.
Integrate with Microsoft Teams and Confluence for inline search.
Explore AWS Marketplace launch and open ecosystem partner integrations.