Help us reach more developers and grow the Airweave community. Star this repo!
Airweave is a fully open-source context retrieval layer for AI agents across apps and databases. It connects to apps, productivity tools, databases, or document stores and transforms their contents into searchable knowledge bases, accessible through a standardized interface for agents.
The search interface is exposed via REST API or MCP. When using MCP, Airweave essentially builds a semantically searchable MCP server. The platform handles everything from auth and extraction to embedding and serving. You can find our documentation here.
Managed Service: Airweave Cloud
Make sure docker and docker-compose are installed, then run:
# 1. Clone the repository
git clone https://github.com/airweave-ai/airweave.git
cd airweave
# 2. Build and run
chmod +x start.sh
./start.sh

That's it! Access the dashboard at http://localhost:8080
- Access the UI at http://localhost:8080
- Connect sources, configure syncs, and query data
- Swagger docs: http://localhost:8001/docs
- Create connections, trigger syncs, and search data
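After `./start.sh` finishes, it can be handy to confirm that both local endpoints respond before connecting sources. The following is a minimal sketch using only the Python standard library. The two URLs come from the text above; the helper names `is_up` and `check_local_stack` and the dictionary keys are hypothetical and are not part of Airweave.

```python
import urllib.error
import urllib.request

# URLs taken from the quick-start text; the keys are illustrative labels only.
LOCAL_ENDPOINTS = {
    "dashboard": "http://localhost:8080",
    "swagger_docs": "http://localhost:8001/docs",
}


def is_up(url: str, timeout: float = 2.0) -> bool:
    """Return True if anything answers HTTP at `url`, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server replied, just with an error status (e.g. 404 or 500).
        return True
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, DNS failure, or a malformed URL.
        return False


def check_local_stack(endpoints=LOCAL_ENDPOINTS, timeout: float = 2.0) -> dict:
    """Probe every endpoint and map its label to a reachability flag."""
    return {name: is_up(url, timeout) for name, url in endpoints.items()}


if __name__ == "__main__":
    for name, ok in check_local_stack().items():
        print(f"{name}: {'reachable' if ok else 'not reachable'}")
```

A reachable endpoint only shows that something is listening on that port; it does not verify that syncs or search are working.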
pip install airweave-sdk

from airweave import AirweaveSDK

# Initialize client
client = AirweaveSDK(
    api_key="YOUR_API_KEY",
    base_url="http://localhost:8001"
)

# Create a collection
collection = client.collections.create(name="My Collection")

# Add a source connection
source = client.source_connections.create(
    name="My Stripe Connection",
    short_name="stripe",
    readable_collection_id=collection.readable_id,
    authentication={
        "credentials": {"api_key": "your_stripe_api_key"}
    }
)

# Semantic search (default)
results = client.collections.search(
    readable_id=collection.readable_id,
    query="Find recent failed payments"
)

# Hybrid search (semantic + keyword)
results = client.collections.search(
    readable_id=collection.readable_id,
    query="customer invoices Q4 2024",
    search_type="hybrid"
)

# With query expansion and reranking
results = client.collections.search(
    readable_id=collection.readable_id,
    query="technical documentation",
    enable_query_expansion=True,
    enable_reranking=True,
    top_k=20
)

# Search with recency bias (prioritize recent results)
results = client.collections.search(
    readable_id=collection.readable_id,
    query="critical bugs",
    recency_bias=0.8,  # 0.0 to 1.0, higher = more recent
    limit=10
)

# Get AI-generated answer instead of raw results
answer = client.collections.search(
    readable_id=collection.readable_id,
    query="What are our customer refund policies?",
    response_type="completion",
    enable_reranking=True
)

npm install @airweave/sdk
# or
yarn add @airweave/sdk

import { AirweaveSDKClient, AirweaveSDKEnvironment } from "@airweave/sdk";

// Initialize client
const client = new AirweaveSDKClient({
  apiKey: "YOUR_API_KEY",
  environment: AirweaveSDKEnvironment.Local
});

// Create a collection
const collection = await client.collections.create({
  name: "My Collection"
});

// Add a source connection
const source = await client.sourceConnections.create({
  name: "My Stripe Connection",
  shortName: "stripe",
  readableCollectionId: collection.readableId,
  authentication: {
    credentials: { apiKey: "your_stripe_api_key" }
  }
});

// Semantic search (default)
const results = await client.collections.search(
  collection.readableId,
  { query: "Find recent failed payments" }
);

// Hybrid search (semantic + keyword)
const hybridResults = await client.collections.search(
  collection.readableId,
  {
    query: "customer invoices Q4 2024",
    searchType: "hybrid"
  }
);

// With query expansion and reranking
const advancedResults = await client.collections.search(
  collection.readableId,
  {
    query: "technical documentation",
    enableQueryExpansion: true,
    enableReranking: true,
    topK: 20
  }
);

// Search with recency bias (prioritize recent results)
const recentResults = await client.collections.search(
  collection.readableId,
  {
    query: "critical bugs",
    recencyBias: 0.8, // 0.0 to 1.0, higher = more recent
    limit: 10
  }
);

// Get AI-generated answer instead of raw results
const answer = await client.collections.search(
  collection.readableId,
  {
    query: "What are our customer refund policies?",
    responseType: "completion",
    enableReranking: true
  }
);

- Data synchronization from 30+ sources with minimal config
- Entity extraction and transformation pipeline
- Multi-tenant architecture with OAuth2
- Incremental updates using content hashing
- Semantic search for agent queries
- Versioning for data changes
- Frontend: React/TypeScript with ShadCN
- Backend: FastAPI (Python)
- Databases: PostgreSQL (metadata), Qdrant (vectors)
- Workers: Temporal (workflow orchestration), Redis (pub/sub)
- Deployment: Docker Compose (dev), Kubernetes (prod)
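The feature list above mentions incremental updates using content hashing. The sketch below illustrates the general idea in plain Python: hash each entity's content, compare it with the hash stored from the previous sync, and only reprocess what changed. This is an illustration of the technique under stated assumptions, not Airweave's actual implementation; the function names `content_hash` and `plan_sync`, the JSON canonicalization, and the choice of SHA-256 are all assumptions.

```python
import hashlib
import json
from typing import Any, Dict, List


def content_hash(entity: Dict[str, Any]) -> str:
    """Return a stable SHA-256 hex digest of an entity's content."""
    # sort_keys makes the digest independent of dict key order.
    canonical = json.dumps(
        entity, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def plan_sync(
    stored_hashes: Dict[str, str],
    incoming: Dict[str, Dict[str, Any]],
) -> Dict[str, List[str]]:
    """Classify entity ids into insert / update / skip / delete buckets."""
    plan: Dict[str, List[str]] = {
        "insert": [], "update": [], "skip": [], "delete": []
    }
    for entity_id, entity in incoming.items():
        old_hash = stored_hashes.get(entity_id)
        if old_hash is None:
            plan["insert"].append(entity_id)   # never seen before
        elif old_hash != content_hash(entity):
            plan["update"].append(entity_id)   # content changed since last sync
        else:
            plan["skip"].append(entity_id)     # unchanged, no re-embedding needed
    # Ids present in the previous sync but missing from the source now.
    plan["delete"] = sorted(set(stored_hashes) - set(incoming))
    return plan
```

With this approach only the entities in the `insert` and `update` buckets would need to be re-extracted and re-embedded, which is what makes a sync incremental.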
We welcome contributions! Please check CONTRIBUTING.md for details.
Airweave is released under the MIT license.
- Discord - Get help and discuss features
- GitHub Issues - Report bugs or request features
- Twitter - Follow for updates