A high-performance search engine designed to understand the meaning behind your query, not just the keywords.
Traditional search engines look for exact word matches. Semantic Search goes deeper. It converts text into numerical vectors—representations of meaning—allowing you to find relevant results even if the exact keywords don't match.
For example, searching for "protecting forests from fire" will correctly find documents about "wildfire prevention," even if the words differ.
- Context-Aware Search: Understands user intent and query nuances.
- High Performance: Optimized for speed by delegating similarity search to the Pinecone vector database.
- Scalable Architecture: Built to handle large datasets efficiently.
- Real-time Indexing: Updates search results as new data is added.
- Ingestion: Documents are split into chunks of text, and each chunk is converted into an embedding (vector) using OpenAI's embedding models.
- Storage: These vectors are stored in a Pinecone vector database.
- Retrieval: When you search, your query is also converted into a vector. The system then finds the most mathematically similar (closest) documents in the database.
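
The retrieval step can be illustrated with a minimal in-memory sketch. It assumes cosine similarity as the closeness measure and uses toy 3-dimensional vectors in place of real embeddings. The names `StoredChunk`, `cosineSimilarity`, and `topK` are hypothetical and are not taken from this repository. In the actual project, Pinecone performs this ranking, and the metric it uses depends on how the index is configured.

```typescript
// Illustrative only: an in-memory stand-in for the vector database.
// All names here are hypothetical, not part of the project's code.
interface StoredChunk {
  id: string;
  vector: number[];
}

// Cosine similarity: dot(a, b) / (|a| * |b|).
// Returns 0 when either vector has zero length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same dimension");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    return 0;
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every stored chunk against the query and keep the k best matches.
function topK(
  query: number[],
  chunks: StoredChunk[],
  k: number
): Array<{ id: string; score: number }> {
  return chunks
    .map((chunk) => ({ id: chunk.id, score: cosineSimilarity(query, chunk.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Toy vectors standing in for real embeddings (assumed values).
const chunks: StoredChunk[] = [
  { id: "wildfire-prevention", vector: [0.9, 0.1, 0.0] },
  { id: "pasta-recipes", vector: [0.0, 0.2, 0.9] },
  { id: "forest-management", vector: [0.7, 0.6, 0.1] },
];

// Pretend this is the embedding of "protecting forests from fire".
const queryVector = [1.0, 0.2, 0.0];

const results = topK(queryVector, chunks, 2);
console.log(results.map((r) => r.id));
```

A linear scan like this is fine for a handful of vectors. A vector database exists to answer the same question quickly over much larger collections.
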
- TypeScript: The core language, used for type-safe logic.
- Pinecone: The vector database engine.
- OpenAI API: For generating text embeddings.
- Next.js: The framework powering the API and interface.
- Clone the repository

      git clone https://github.com/yashmahe2020/SemanticSearch.git
      cd SemanticSearch

- Install dependencies

      npm install

- Configure Environment

  Create a .env.local file with your API keys:

      OPENAI_API_KEY=your_key
      PINECONE_API_KEY=your_key
      PINECONE_ENVIRONMENT=your_env

- Run the app

      npm run dev
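
If the app fails on startup, a missing key is a likely cause. The following is a minimal sketch of a startup check for the three variables listed above. The helper name `findMissingEnvVars` is hypothetical and not part of this repository, and the example uses a plain object in place of the real environment so that it runs anywhere.

```typescript
// Variable names come from the setup step above; the helper itself is hypothetical.
const REQUIRED_ENV_VARS = [
  "OPENAI_API_KEY",
  "PINECONE_API_KEY",
  "PINECONE_ENVIRONMENT",
] as const;

// Returns the names of required variables that are unset or blank.
function findMissingEnvVars(env: Record<string, string | undefined>): string[] {
  return REQUIRED_ENV_VARS.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}

// Example environment with one key set, one blank, and one absent.
const exampleEnv: Record<string, string | undefined> = {
  OPENAI_API_KEY: "your_key",
  PINECONE_API_KEY: "",
};

const missing = findMissingEnvVars(exampleEnv);
console.log(missing); // [ 'PINECONE_API_KEY', 'PINECONE_ENVIRONMENT' ]
```

In a Node.js or Next.js server context, the same function could be called with `process.env` and the app could throw an error early when the returned list is not empty.
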