Time-Series Data
High-performance ingestion, aggregation, and retention for temporal data
Overview
entropyDB provides native support for time-series data with automatic partitioning, compression, and retention policies. Ideal for metrics, logs, IoT sensor data, and analytics.
📈 High Throughput
Millions of writes per second with automatic batching
🗜️ Compression
Automatic time-series compression reduces storage by 90%+
⏰ Retention Policies
Automatic data expiration and archival
📊 Continuous Aggregates
Pre-computed rollups for fast queries
Creating Time-Series Tables
-- Create time-series table
CREATE TABLE metrics (
time TIMESTAMPTZ NOT NULL,
metric_name TEXT NOT NULL,
value DOUBLE PRECISION,
tags JSONB
);
-- Convert to time-series (hypertable)
SELECT create_hypertable('metrics', 'time');
-- Add index on commonly queried columns
CREATE INDEX idx_metrics_name_time ON metrics (metric_name, time DESC);
CREATE INDEX idx_metrics_tags ON metrics USING GIN (tags);
💡 Hypertables
Hypertables automatically partition data by time, providing optimal performance for time-series workloads.
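If the default partition size does not match your ingest rate, the chunk interval can be tuned. A sketch assuming the TimescaleDB-compatible API used throughout this page (`chunk_time_interval` and `set_chunk_time_interval` are assumptions if entropyDB diverges from that API):

```sql
-- Partition into daily chunks instead of the default interval
SELECT create_hypertable(
  'metrics', 'time',
  chunk_time_interval => INTERVAL '1 day'
);

-- Or adjust an existing hypertable (applies to newly created chunks only)
SELECT set_chunk_time_interval('metrics', INTERVAL '1 day');
```

Smaller chunks suit high ingest rates and short retention windows; larger chunks reduce planning overhead for long-range queries.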
Data Ingestion
Single Insert
INSERT INTO metrics (time, metric_name, value, tags)
VALUES (
NOW(),
'cpu.usage',
75.5,
'{"host": "server1", "region": "us-west"}'::jsonb
);
Batch Insert
INSERT INTO metrics (time, metric_name, value, tags) VALUES
(NOW(), 'cpu.usage', 75.5, '{"host": "server1"}'::jsonb),
(NOW(), 'memory.usage', 8192, '{"host": "server1"}'::jsonb),
(NOW(), 'disk.io', 1500, '{"host": "server1"}'::jsonb),
(NOW(), 'cpu.usage', 82.3, '{"host": "server2"}'::jsonb);
Using COPY for Bulk Load
-- From CSV file
COPY metrics (time, metric_name, value, tags)
FROM '/path/to/metrics.csv'
WITH (FORMAT csv, HEADER true);
-- Can achieve 1M+ rows/second
Querying Time-Series Data
Recent Data
-- Last 1 hour of data
SELECT time, metric_name, value
FROM metrics
WHERE time > NOW() - INTERVAL '1 hour'
  AND metric_name = 'cpu.usage'
ORDER BY time DESC;
Aggregations
-- 5-minute averages
SELECT
time_bucket('5 minutes', time) AS bucket,
metric_name,
AVG(value) as avg_value,
MAX(value) as max_value,
MIN(value) as min_value
FROM metrics
WHERE time > NOW() - INTERVAL '24 hours'
GROUP BY bucket, metric_name
ORDER BY bucket DESC;
Downsampling
-- Hourly rollup with percentiles
SELECT
time_bucket('1 hour', time) AS hour,
tags->>'host' as host,
AVG(value) as avg_cpu,
percentile_cont(0.95) WITHIN GROUP (ORDER BY value) as p95_cpu,
percentile_cont(0.99) WITHIN GROUP (ORDER BY value) as p99_cpu
FROM metrics
WHERE metric_name = 'cpu.usage'
AND time > NOW() - INTERVAL '7 days'
GROUP BY hour, host
ORDER BY hour DESC;
Gap Filling
-- Fill missing time intervals by carrying the last observed value forward
SELECT
time_bucket_gapfill('1 minute', time) AS minute,
locf(AVG(value)) as filled_value
FROM metrics
WHERE metric_name = 'temperature'
AND time > NOW() - INTERVAL '1 hour'
GROUP BY minute
ORDER BY minute;
Continuous Aggregates
Pre-compute aggregations for faster queries:
-- Create continuous aggregate
CREATE MATERIALIZED VIEW metrics_hourly
WITH (timescaledb.continuous) AS
SELECT
time_bucket('1 hour', time) AS hour,
metric_name,
tags->>'host' as host,
AVG(value) as avg_value,
MAX(value) as max_value,
MIN(value) as min_value,
COUNT(*) as sample_count
FROM metrics
GROUP BY hour, metric_name, host;
-- Add refresh policy (auto-update)
SELECT add_continuous_aggregate_policy('metrics_hourly',
start_offset => INTERVAL '3 hours',
end_offset => INTERVAL '1 hour',
schedule_interval => INTERVAL '1 hour'
);
-- Query the aggregate (much faster!)
SELECT * FROM metrics_hourly
WHERE hour > NOW() - INTERVAL '30 days';
Retention Policies
Automatically drop old data:
-- Drop data older than 90 days
SELECT add_retention_policy('metrics', INTERVAL '90 days');
-- Tiered retention for different data
-- (a table holds one retention policy; this replaces the 90-day example above)
-- Keep raw data for 30 days
SELECT add_retention_policy('metrics', INTERVAL '30 days');
-- Keep hourly aggregates for 1 year
SELECT add_retention_policy('metrics_hourly', INTERVAL '1 year');
-- Keep daily aggregates forever: simply add no retention policy
-- View retention policies
SELECT * FROM timescaledb_information.jobs
WHERE proc_name = 'policy_retention';
Compression
Enable automatic compression for older data:
-- Enable compression
ALTER TABLE metrics SET (
timescaledb.compress,
timescaledb.compress_segmentby = 'metric_name, tags',
timescaledb.compress_orderby = 'time DESC'
);
-- Add compression policy
SELECT add_compression_policy('metrics', INTERVAL '7 days');
-- Manually compress chunks
SELECT compress_chunk(c)
FROM show_chunks('metrics', older_than => INTERVAL '7 days') c;
-- View compression stats
SELECT
pg_size_pretty(before_compression_total_bytes) as uncompressed,
pg_size_pretty(after_compression_total_bytes) as compressed,
round((1 - after_compression_total_bytes::numeric /
before_compression_total_bytes::numeric) * 100, 2) as savings_pct
FROM hypertable_compression_stats('metrics');
✓ Compression Benefits
Typical compression ratios: 90-95% space savings for time-series data with minimal performance impact.
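Compressed chunks remain queryable, but backfilling or updating rows in a compressed region may require decompressing the affected chunks first. A sketch reusing the chunk functions shown above (`decompress_chunk` is assumed to mirror `compress_chunk`, as in the TimescaleDB-style API this page follows):

```sql
-- List the chunks covering the window you need to modify
SELECT show_chunks('metrics', older_than => INTERVAL '7 days');

-- Decompress those chunks before updating rows inside them
SELECT decompress_chunk(c)
FROM show_chunks('metrics', older_than => INTERVAL '7 days') c;
```

After the backfill, the compression policy will re-compress the chunks on its next run.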
Real-Time Analytics
-- Real-time dashboard query
WITH recent_metrics AS (
SELECT
time_bucket('1 minute', time) AS minute,
tags->>'host' as host,
metric_name,
AVG(value) as value
FROM metrics
WHERE time > NOW() - INTERVAL '15 minutes'
GROUP BY minute, host, metric_name
)
SELECT
host,
MAX(CASE WHEN metric_name = 'cpu.usage' THEN value END) as cpu,
MAX(CASE WHEN metric_name = 'memory.usage' THEN value END) as memory,
MAX(CASE WHEN metric_name = 'disk.io' THEN value END) as disk_io
FROM recent_metrics
WHERE minute = (SELECT MAX(minute) FROM recent_metrics)
GROUP BY host;
Best Practices
1. Choose the Right Time Column
Use TIMESTAMPTZ for timezone-aware timestamps. Ensure time values are indexed.
2. Batch Your Writes
Insert multiple rows at once for better performance. Use COPY for bulk loads.
3. Use Continuous Aggregates
Pre-compute common aggregations to speed up queries on historical data.
4. Enable Compression
Compress data older than your active query window to save storage costs.
5. Set Retention Policies
Automatically drop data you no longer need; keeping tables small keeps queries and maintenance fast.
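Taken together, these practices amount to a short setup script per time-series table. A sketch reusing only the calls introduced on this page (the `sensor_readings` table and its columns are a hypothetical example):

```sql
-- 1. Timezone-aware time column, hypertable partitioning
CREATE TABLE sensor_readings (
  time   TIMESTAMPTZ NOT NULL,
  device TEXT NOT NULL,
  value  DOUBLE PRECISION
);
SELECT create_hypertable('sensor_readings', 'time');

-- 2. Batched writes
INSERT INTO sensor_readings VALUES
  (NOW(), 'dev1', 20.1),
  (NOW(), 'dev2', 19.8);

-- 3. Continuous aggregate for common rollups
CREATE MATERIALIZED VIEW sensor_hourly
WITH (timescaledb.continuous) AS
SELECT time_bucket('1 hour', time) AS hour,
       device,
       AVG(value) AS avg_value
FROM sensor_readings
GROUP BY hour, device;

-- 4. Compress data outside the active query window
ALTER TABLE sensor_readings SET (
  timescaledb.compress,
  timescaledb.compress_segmentby = 'device'
);
SELECT add_compression_policy('sensor_readings', INTERVAL '7 days');

-- 5. Retention: drop raw data after 30 days
SELECT add_retention_policy('sensor_readings', INTERVAL '30 days');
```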