Skip to content

SQLite扩展SQL特性

SQLite扩展SQL特性概述

SQLite支持大部分标准SQL语法,同时也提供了一些扩展特性,这些特性增强了SQLite的功能,提高了开发效率。本文档将介绍SQLite的主要扩展SQL特性。

窗口函数

窗口函数是SQLite 3.25.0引入的特性,用于在一组行上执行计算,同时保留每行的结果。

1. 基本语法

sql
SELECT
    column1,
    column2,
    窗口函数() OVER (
        [PARTITION BY partition_column]
        [ORDER BY order_column]
        [ROWS BETWEEN frame_start AND frame_end]
    ) AS window_result
FROM table;

2. 常用窗口函数

函数类型函数名描述
聚合函数SUM()计算窗口内的和
聚合函数AVG()计算窗口内的平均值
聚合函数COUNT()计算窗口内的行数
聚合函数MAX()计算窗口内的最大值
聚合函数MIN()计算窗口内的最小值
排名函数ROW_NUMBER()为窗口内的行分配唯一序号
排名函数RANK()为窗口内的行分配排名,相同值排名相同,后续排名跳过
排名函数DENSE_RANK()为窗口内的行分配排名,相同值排名相同,后续排名不跳过
排名函数NTILE(n)将窗口内的行分为n个组,返回组号
偏移函数LAG(expr, offset)返回当前行之前第offset行的值
偏移函数LEAD(expr, offset)返回当前行之后第offset行的值
偏移函数FIRST_VALUE(expr)返回窗口内的第一个值
偏移函数LAST_VALUE(expr)返回窗口内的最后一个值

3. 使用示例

基本窗口函数

sql
SELECT
    name,
    department,
    salary,
    AVG(salary) OVER (PARTITION BY department) AS avg_department_salary,
    SUM(salary) OVER (PARTITION BY department) AS total_department_salary,
    MAX(salary) OVER (PARTITION BY department) AS max_department_salary,
    MIN(salary) OVER (PARTITION BY department) AS min_department_salary
FROM employees;

排名函数

sql
SELECT
    name,
    department,
    salary,
    ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) AS row_num,
    RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS rank,
    DENSE_RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS dense_rank,
    NTILE(3) OVER (PARTITION BY department ORDER BY salary DESC) AS ntile
FROM employees;

偏移函数

sql
SELECT
    name,
    department,
    salary,
    LAG(salary, 1) OVER (PARTITION BY department ORDER BY salary DESC) AS previous_salary,
    LEAD(salary, 1) OVER (PARTITION BY department ORDER BY salary DESC) AS next_salary,
    FIRST_VALUE(name) OVER (PARTITION BY department ORDER BY salary DESC) AS highest_paid,
    LAST_VALUE(name) OVER (PARTITION BY department ORDER BY salary DESC) AS lowest_paid
FROM employees;

窗口框架

sql
SELECT
    name,
    department,
    salary,
    SUM(salary) OVER (
        PARTITION BY department
        ORDER BY salary DESC
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS cumulative_sum,
    AVG(salary) OVER (
        PARTITION BY department
        ORDER BY salary DESC
        ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING
    ) AS moving_average
FROM employees;

公共表表达式(CTE)

公共表表达式(Common Table Expression,CTE)是SQLite 3.8.0引入的特性,用于创建临时结果集,提高查询的可读性和可维护性。

1. 基本语法

sql
WITH cte_name (column1, column2) AS (
    SELECT column1, column2 FROM table WHERE condition
)
SELECT * FROM cte_name;

2. 非递归CTE

非递归CTE用于创建一次性的临时结果集:

sql
WITH top_employees AS (
    SELECT name, department, salary FROM employees WHERE salary > 50000
)
SELECT department, COUNT(*) AS high_salary_count
FROM top_employees
GROUP BY department;

3. 递归CTE

递归CTE用于处理层次结构数据,如组织结构、文件系统等:

sql
-- 创建组织结构表
CREATE TABLE organization (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    manager_id INTEGER REFERENCES organization(id)
);

-- 插入示例数据
INSERT INTO organization (id, name, manager_id) VALUES
(1, 'CEO', NULL),
(2, 'CTO', 1),
(3, 'CFO', 1),
(4, 'Engineering Manager', 2),
(5, 'Developer', 4),
(6, 'Marketing Manager', 3),
(7, 'Marketing Specialist', 6);

-- 使用递归CTE查询组织结构
WITH RECURSIVE org_hierarchy AS (
    -- 基础查询:根节点
    SELECT id, name, manager_id, 0 AS level
    FROM organization
    WHERE manager_id IS NULL
    UNION ALL
    -- 递归查询:子节点
    SELECT o.id, o.name, o.manager_id, oh.level + 1 AS level
    FROM organization o
    INNER JOIN org_hierarchy oh ON o.manager_id = oh.id
)
SELECT id, name, manager_id, level, REPEAT('  ', level) || name AS indented_name
FROM org_hierarchy
ORDER BY level, id;

4. 多个CTE

可以在一个查询中定义多个CTE:

sql
WITH 
    cte1 AS (
        SELECT name, salary FROM employees WHERE department = 'Engineering'
    ),
    cte2 AS (
        SELECT name, salary FROM employees WHERE department = 'Marketing'
    )
SELECT 'Engineering' AS department, AVG(salary) AS avg_salary FROM cte1
UNION ALL
SELECT 'Marketing' AS department, AVG(salary) AS avg_salary FROM cte2;

JSON支持

SQLite 3.9.0引入了JSON支持,允许存储、查询和修改JSON数据。

1. JSON数据类型

SQLite使用TEXT类型存储JSON数据:

sql
CREATE TABLE products (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    details TEXT CHECK (json_valid(details)) -- 确保是有效的JSON
);

-- 插入JSON数据
INSERT INTO products (name, details) VALUES (
    'Product A',
    '{"price": 100, "category": "Electronics", "tags": ["gadget", "tech"]}'
);

2. JSON函数

函数描述示例
json_valid()检查是否为有效的JSONjson_valid('{"name": "test"}')
json_extract()提取JSON数据json_extract(details, '$.price')
json_insert()插入JSON数据json_insert(details, '$.color', 'red')
json_replace()替换JSON数据json_replace(details, '$.price', 150)
json_set()插入或替换JSON数据json_set(details, '$.price', 150)
json_remove()删除JSON数据json_remove(details, '$.tags[0]')
json_array()创建JSON数组json_array(1, 2, 3)
json_object()创建JSON对象json_object('name', 'test', 'value', 123)
json_type()返回JSON值的类型json_type(details, '$.price')
json_each()遍历JSON对象或数组SELECT * FROM json_each(details, '$.tags')
json_tree()递归遍历JSONSELECT * FROM json_tree(details)

3. JSON路径语法

语法描述示例
$根元素$.name
.key对象属性$.details.price
[index]数组元素$.tags[0]
[*]所有数组元素$.tags[*]
.*所有对象属性$.details.*
[start:end]数组切片$.tags[0:2]

4. JSON查询示例

sql
-- 提取JSON数据
SELECT 
    name,
    json_extract(details, '$.price') AS price,
    json_extract(details, '$.category') AS category,
    json_extract(details, '$.tags[0]') AS first_tag
FROM products;

-- 使用简化语法(SQLite 3.38.0+)
SELECT 
    name,
    details->'$.price' AS price,
    details->>'$.category' AS category -- ->> 返回文本值,而不是JSON值
FROM products;

-- 查询JSON数组
SELECT name, tag.value
FROM products,
     json_each(details, '$.tags') AS tag;

-- 过滤JSON数据
SELECT name, details
FROM products
WHERE json_extract(details, '$.price') > 100;

-- 更新JSON数据
UPDATE products
SET details = json_set(details, '$.price', 150, '$.in_stock', true)
WHERE id = 1;

UPSERT

UPSERT是SQLite 3.24.0引入的特性,用于在插入数据时,如果数据已存在则更新,否则插入。

1. 基本语法

sql
INSERT INTO table (column1, column2) VALUES (value1, value2)
ON CONFLICT (conflict_column) DO UPDATE SET column1 = value1;

2. 使用示例

sql
-- 创建带有唯一约束的表
CREATE TABLE users (
    id INTEGER PRIMARY KEY,
    email TEXT UNIQUE NOT NULL,
    name TEXT NOT NULL,
    login_count INTEGER DEFAULT 0
);

-- 使用UPSERT插入或更新数据
INSERT INTO users (email, name, login_count) VALUES ('user@example.com', 'User Name', 1)
ON CONFLICT (email) DO UPDATE SET 
    login_count = users.login_count + 1,
    name = excluded.name;

-- 使用多个冲突列
CREATE TABLE unique_pairs (
    a INTEGER,
    b INTEGER,
    value TEXT,
    PRIMARY KEY (a, b)
);

INSERT INTO unique_pairs (a, b, value) VALUES (1, 2, 'new value')
ON CONFLICT (a, b) DO UPDATE SET value = excluded.value;

-- 使用WHERE子句
INSERT INTO users (email, name, login_count) VALUES ('user@example.com', 'User Name', 1)
ON CONFLICT (email) DO UPDATE SET 
    login_count = users.login_count + 1
WHERE users.login_count < 10;

FILTER子句

FILTER子句是SQLite 3.30.0引入的特性,用于在聚合函数中过滤数据。

1. 基本语法

sql
SELECT 
    department,
    COUNT(*) AS total_employees,
    COUNT(*) FILTER (WHERE salary > 50000) AS high_salary_count,
    AVG(salary) FILTER (WHERE active = 1) AS avg_active_salary
FROM employees
GROUP BY department;

2. 使用示例

sql
-- 统计不同部门的员工数量和高薪资员工数量
SELECT 
    department,
    COUNT(*) AS total_employees,
    COUNT(*) FILTER (WHERE salary > 50000) AS high_salary_count,
    SUM(salary) FILTER (WHERE salary > 50000) AS high_salary_total,
    AVG(salary) FILTER (WHERE active = 1) AS avg_active_salary,
    AVG(salary) FILTER (WHERE active = 0) AS avg_inactive_salary
FROM employees
GROUP BY department;

-- 比较不同性别的薪资
SELECT 
    AVG(salary) FILTER (WHERE gender = 'Male') AS avg_male_salary,
    AVG(salary) FILTER (WHERE gender = 'Female') AS avg_female_salary,
    AVG(salary) FILTER (WHERE gender = 'Other') AS avg_other_salary
FROM employees;

生成列

生成列是SQLite 3.31.0引入的特性,用于自动计算并存储派生值。

1. 基本语法

sql
CREATE TABLE table (
    column1 INTEGER,
    column2 INTEGER,
    generated_column INTEGER GENERATED ALWAYS AS (column1 + column2) STORED
);

2. 虚拟生成列与存储生成列

类型描述优点缺点
VIRTUAL每次查询时计算节省存储空间每次查询都要计算,可能影响性能
STORED存储计算结果查询速度快占用额外存储空间

3. 使用示例

sql
-- 创建带有生成列的表
CREATE TABLE products (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    price REAL NOT NULL,
    quantity INTEGER NOT NULL DEFAULT 0,
    total_value REAL GENERATED ALWAYS AS (price * quantity) STORED,
    is_in_stock INTEGER GENERATED ALWAYS AS (quantity > 0) VIRTUAL
);

-- 插入数据,生成列自动计算
INSERT INTO products (name, price, quantity) VALUES ('Product A', 100, 5);

-- 查询生成列
SELECT name, price, quantity, total_value, is_in_stock FROM products;

-- 更新数据,生成列自动更新
UPDATE products SET quantity = 10 WHERE id = 1;

行值比较

行值比较允许同时比较多个列:

sql
-- 创建表
CREATE TABLE points (
    id INTEGER PRIMARY KEY,
    x INTEGER NOT NULL,
    y INTEGER NOT NULL
);

-- 插入数据
INSERT INTO points (x, y) VALUES (1, 2), (3, 4), (5, 6);

-- 使用行值比较
SELECT * FROM points WHERE (x, y) > (2, 3);
SELECT * FROM points WHERE (x, y) BETWEEN (2, 3) AND (4, 5);
SELECT * FROM points ORDER BY (x, y);

-- 在IN子句中使用行值
SELECT * FROM points WHERE (x, y) IN ((1, 2), (3, 4));

部分索引

部分索引是基于表中部分行创建的索引,只包含满足条件的行。

sql
-- 创建部分索引
CREATE INDEX idx_orders_active ON orders(order_date) WHERE status = 'active';

-- 查询使用部分索引
SELECT * FROM orders WHERE status = 'active' AND order_date > '2023-01-01';

表达式索引

表达式索引基于表达式创建,而非单个列:

sql
-- 创建表达式索引
CREATE INDEX idx_users_name_lower ON users(lower(name));

-- 查询使用表达式索引
SELECT * FROM users WHERE lower(name) = 'alice';

最佳实践

1. 窗口函数

  • 合理使用PARTITION BY和ORDER BY,避免过度复杂的窗口定义
  • 对于大型数据集,窗口函数可能影响性能,需要测试和优化
  • 优先使用内置窗口函数,避免自定义窗口函数

2. CTE

  • 使用CTE提高查询的可读性和可维护性
  • 递归CTE要注意终止条件,避免无限递归
  • 对于复杂查询,考虑将CTE拆分为多个简单的CTE

3. JSON支持

  • 使用json_valid()约束确保JSON数据的有效性
  • 对于频繁查询的JSON字段,考虑提取为单独的列
  • 对于大型JSON数据,考虑使用JSONB(SQLite 3.33.0+)

4. UPSERT

  • 确保有适当的唯一约束或主键,否则UPSERT不会生效
  • 合理使用WHERE子句,控制更新条件
  • 注意excluded表的使用,excluded包含要插入的值

5. 生成列

  • 对于频繁查询的派生值,使用STORED生成列
  • 对于不频繁查询的派生值,使用VIRTUAL生成列
  • 避免在生成列中使用复杂表达式,影响性能

常见问题(FAQ)

Q: 如何选择窗口函数的窗口框架?

A: 窗口框架的选择取决于具体需求:

  • UNBOUNDED PRECEDING AND CURRENT ROW:累积计算
  • CURRENT ROW AND UNBOUNDED FOLLOWING:从当前行到末尾
  • n PRECEDING AND n FOLLOWING:移动窗口

Q: 递归CTE的最大递归深度是多少?

A: SQLite默认的最大递归深度是1000,可以通过PRAGMA语句调整:

sql
PRAGMA recursion_limit = 5000;

Q: 如何优化JSON查询性能?

A: 优化JSON查询性能的方法:

  • 对于频繁查询的JSON字段,提取为单独的列
  • 使用表达式索引加速JSON查询
  • 对于大型JSON数据,考虑使用JSONB
  • 避免在WHERE子句中使用复杂的JSON函数

Q: 生成列会影响插入和更新性能吗?

A: 是的,生成列会增加插入和更新的开销,因为需要计算生成列的值。对于STORED生成列,还需要额外的存储空间。

Q: 部分索引和表达式索引有什么区别?

A: 部分索引基于行的条件创建,只包含满足条件的行;表达式索引基于列的表达式创建,包含所有行,但索引键是表达式的计算结果。

总结

SQLite提供了丰富的扩展SQL特性,包括窗口函数、CTE、JSON支持、UPSERT、FILTER子句、生成列等。这些特性增强了SQLite的功能,提高了开发效率。在实际应用中,需要根据具体需求和场景,合理使用这些特性,优化查询性能,确保数据安全。

掌握这些扩展特性可以帮助开发者编写更高效、更可读、更维护的SQL查询,处理复杂的数据场景,如层次结构数据、JSON数据、窗口计算等。