Hello World Tutorial

Multi-model queries in action - see entropyDB's power in one example

What You'll Build

In this tutorial, you'll build a simple application that demonstrates entropyDB's multi-model capabilities. You'll work with relational data, JSON documents, time-series events, and vector embeddings—all in a single database.

💡 What You'll Learn

Creating tables with multiple data types
Querying JSON documents
Working with time-series data
Performing vector similarity search
Cross-model joins and queries

Step 1: Create the Schema

Let's create a simple e-commerce schema that uses all data models:

-- Relational: Users table
CREATE TABLE users (
  id SERIAL PRIMARY KEY,
  username TEXT NOT NULL UNIQUE,
  email TEXT NOT NULL,
  created_at TIMESTAMP DEFAULT NOW()
);

-- Document: Product catalog with flexible schema
CREATE TABLE products (
  id SERIAL PRIMARY KEY,
  name TEXT NOT NULL,
  price DECIMAL(10,2),
  metadata JSONB,  -- Flexible product attributes
  embedding VECTOR(768),  -- For semantic search
  created_at TIMESTAMP DEFAULT NOW()
);

-- Time-Series: User activity events
CREATE TABLE events (
  timestamp TIMESTAMP DEFAULT NOW(),
  user_id INT REFERENCES users(id),
  event_type TEXT,
  properties JSONB
);

-- Create indexes
CREATE INDEX idx_products_metadata ON products USING GIN(metadata);
CREATE INDEX idx_products_embedding ON products USING hnsw(embedding vector_cosine_ops);
CREATE INDEX idx_events_timestamp ON events(timestamp DESC);

Step 2: Insert Sample Data

Add Users

INSERT INTO users (username, email) VALUES
  ('alice', 'alice@example.com'),
  ('bob', 'bob@example.com'),
  ('charlie', 'charlie@example.com');

Add Products with JSON Metadata

INSERT INTO products (name, price, metadata) VALUES
  ('Laptop', 999.99, '{
    "brand": "TechCorp",
    "specs": {"cpu": "Intel i7", "ram": "16GB", "storage": "512GB SSD"},
    "tags": ["electronics", "computers", "featured"]
  }'::jsonb),
  ('Wireless Mouse', 29.99, '{
    "brand": "TechCorp", 
    "specs": {"dpi": "4000", "wireless": true},
    "tags": ["electronics", "accessories"]
  }'::jsonb),
  ('USB-C Cable', 12.99, '{
    "brand": "CableCo",
    "specs": {"length": "2m", "speed": "USB 3.1"},
    "tags": ["accessories", "cables"]
  }'::jsonb);

Add User Events

INSERT INTO events (user_id, event_type, properties) VALUES
  (1, 'page_view', '{"page": "product", "product_id": 1}'::jsonb),
  (1, 'add_to_cart', '{"product_id": 1, "quantity": 1}'::jsonb),
  (2, 'page_view', '{"page": "product", "product_id": 2}'::jsonb),
  (2, 'purchase', '{"product_id": 2, "amount": 29.99}'::jsonb),
  (3, 'search', '{"query": "laptop", "results": 5}'::jsonb);

Step 3: Multi-Model Queries

Query 1: JSON Document Search

Find all electronics with specific specs:

-- Find products with "electronics" tag
SELECT 
  name, 
  price,
  metadata->>'brand' as brand,
  metadata->'tags' as tags
FROM products
WHERE metadata @> '{"tags": ["electronics"]}';

Query 2: Time-Series Analysis

Analyze user activity over time:

-- Events by hour with user details
SELECT 
  DATE_TRUNC('hour', e.timestamp) as hour,
  e.event_type,
  COUNT(*) as event_count,
  array_agg(DISTINCT u.username) as users
FROM events e
JOIN users u ON e.user_id = u.id
GROUP BY hour, e.event_type
ORDER BY hour DESC;

Query 3: Complex Join Across Models

Find user purchase behavior with product details:

-- User activity report with product info
SELECT 
  u.username,
  u.email,
  e.event_type,
  p.name as product_name,
  p.metadata->>'brand' as brand,
  e.properties->>'amount' as amount,
  e.timestamp
FROM users u
JOIN events e ON u.id = e.user_id
LEFT JOIN products p ON (e.properties->>'product_id')::int = p.id
WHERE e.event_type IN ('add_to_cart', 'purchase')
ORDER BY e.timestamp DESC;

Query 4: Aggregations with JSON

Product statistics by brand:

-- Brand analytics
SELECT 
  metadata->>'brand' as brand,
  COUNT(*) as product_count,
  AVG(price) as avg_price,
  array_agg(DISTINCT jsonb_array_elements_text(metadata->'tags')) as all_tags
FROM products
GROUP BY metadata->>'brand'
ORDER BY product_count DESC;

Step 4: Add Vector Search

Let's add semantic search capabilities using vector embeddings:

-- Update products with sample embeddings
-- In production, you'd generate these from actual product descriptions
UPDATE products 
SET embedding = random_vector(768)  -- entropyDB helper function
WHERE id = 1;

-- Semantic search for similar products
SELECT 
  p1.name,
  p1.price,
  p1.metadata->>'brand' as brand,
  1 - (p1.embedding <=> p2.embedding) as similarity
FROM products p1
CROSS JOIN products p2
WHERE p2.id = 1  -- Find products similar to product 1
  AND p1.id != p2.id
ORDER BY p1.embedding <=> p2.embedding
LIMIT 5;

💡 Pro Tip

In a real application, you'd generate embeddings using OpenAI, Cohere, or other embedding models based on product descriptions.

Step 5: Multi-Model Transactions

entropyDB ensures ACID guarantees across all data models:

-- Complete purchase transaction across multiple models
BEGIN;
  -- Update user data (relational)
  UPDATE users 
  SET metadata = jsonb_set(
    COALESCE(metadata, '{}'::jsonb),
    '{last_purchase}',
    to_jsonb(NOW())
  )
  WHERE id = 1;
  
  -- Record event (time-series)
  INSERT INTO events (user_id, event_type, properties)
  VALUES (1, 'purchase', '{"product_id": 1, "amount": 999.99}'::jsonb);
  
  -- Update product analytics (document)
  UPDATE products
  SET metadata = jsonb_set(
    metadata,
    '{sales_count}',
    to_jsonb(COALESCE((metadata->>'sales_count')::int, 0) + 1)
  )
  WHERE id = 1;
COMMIT;

-- Everything succeeds or nothing does!

Complete Example Output

Here's what a complete query result looks like:

username | event_type   | product_name | brand    | amount  | timestamp
---------|--------------|--------------|----------|---------|-------------------------
bob      | purchase     | Mouse        | TechCorp | 29.99   | 2025-11-17 10:30:15
alice    | add_to_cart  | Laptop       | TechCorp | null    | 2025-11-17 10:25:42
charlie  | search       | null         | null     | null    | 2025-11-17 10:20:10

(3 rows)

Next Steps

Now that you've seen entropyDB's multi-model capabilities, dive deeper into specific features:

→ SQL Reference Guide → Vector Search Deep Dive → Time-Series Operations → Build with Python SDK