This is my note from some tutorials while learning sql.

Copyright belongs to the original source.

First tutorial about query

The tutorial comes from http://www.xuesql.cn/

Outer Join

1
2
3
4
5
6
7
8
--用LEFT/RIGHT/FULL JOINs 做多表查询
SELECT column, another_column, …
FROM mytable
INNER/LEFT/RIGHT/FULL JOIN another_table
ON mytable.id = another_table.matching_id
WHERE condition(s)
ORDER BY column, … ASC/DESC
LIMIT num_limit OFFSET num_offset;

an example

we can select columns from other table(building)

1
2
3
4
5
select distinct buildings.building_name, buildings.capacity
from employees
left join buildings
on buildings.building_name = employees.building
where building_name

Expression

1
2
3
4
SELECT  particle_speed / 2.0 AS half_particle_speed (对结果做了一个除2
FROM physics_data
WHERE ABS(particle_position) * 10.0 >500
(条件要求这个属性绝对值乘以10大于500);

Exercise:

![Screen Shot 2020-10-30 at 21.19.29](../../../../../Library/Application Support/typora-user-images/Screen Shot 2020-10-30 at 21.19.29.png)

Screen Shot 2020-10-30 at 21.19.38

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
/*1.
select id, title, (Boxoffice.Domestic_sales+boxoffice.International_sales)/1000000 AS sales
from movies
left join boxoffice
on movies.id = boxoffice.movie_id
*/

/*2.
select id, title, rating*10 AS Market_index
from movies
left join boxoffice
on movies.id = boxoffice.movie_id
*/

/*3.
select id, title, year
from movies
where year%2 = 0
*/

select title, (domestic_sales + international_sales)/length_minutes as value
from movies
left join boxoffice
on movies.id=boxoffice.movie_id
where director="John Lasseter"
order by value desc
limit 3

we can use year%2 to express mod

Statistics I

1
2
3
SELECT AGG_FUNC(column_or_expression) AS aggregate_description, …
FROM mytable
WHERE constraint_expression;
Function Description
COUNT(***), *COUNT(column*) 计数!COUNT(*) 统计数据行数,COUNT(column) 统计column非NULL的行数.
MIN(**column)** 找column最小的一行.
MAX(**column)** 找column最大的一行.
*AVG(**column*) 对column所有行取平均值.
SUM(**column)** 对column所有行求和.

Group by

1
2
3
4
SELECT AGG_FUNC(column_or_expression) AS aggregate_description, …
FROM mytable
WHERE constraint_expression
GROUP BY column;

Exercise

Screen Shot 2020-10-30 at 21.49.54

Screen Shot 2020-10-30 at 21.50.06

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
--select name, max(years_employed) from employees;

/*
select role,avg(years_employed) from employees
group by role;
*/

/*
select building,sum(years_employed) from employees
group by building
*/

select building, count(name) from employees
where building
group by building

Statistics II

HAVING select after group

1
2
3
4
5
SELECT group_by_column, AGG_FUNC(column_expression) AS aggregate_result_alias, …
FROM mytable
WHERE condition
GROUP BY column
HAVING group_condition;

Exercise

![Screen Shot 2020-10-30 at 22.03.08](../../../../../Library/Application Support/typora-user-images/Screen Shot 2020-10-30 at 22.03.08.png)

Screen Shot 2020-10-30 at 22.03.15

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
--select count(name) from employees where role="Artist";

/*
select role, count(*) from employees
group by role;
*/

/*
select sum(years_employed) from employees
where role="Engineer"
*/

select role, count(*) as count, building is not null as bn
from employees
group by role, bn

we can group by two group

use is null as a boolean divider

Query Order

1
2
3
4
5
6
7
8
9
SELECT DISTINCT column, AGG_FUNC(column_or_expression), …
FROM mytable
JOIN another_table
ON mytable.column = another_table.column
WHERE constraint_expression
GROUP BY column
HAVING constraint_expression
ORDER BY column ASC/DESC
LIMIT count OFFSET COUNT;

1. FROMJOINs

FROMJOIN会第一个执行,确定一个整体的数据范围. 如果要JOIN不同表,可能会生成一个临时Table来用于 下面的过程。总之第一步可以简单理解为确定一个数据源表(含临时表)

2. WHERE

我们确定了数据来源 WHERE 语句就将在这个数据源中按要求进行数据筛选,并丢弃不符合要求的数据行,所有的筛选col属性 只能来自FROM圈定的表. AS别名还不能在这个阶段使用,因为可能别名是一个还没执行的表达式

3. GROUP BY

如果你用了 GROUP BY 分组,那GROUP BY 将对之前的数据进行分组,统计等,并将是结果集缩小为分组数.这意味着 其他的数据在分组后丢弃.

4. HAVING

如果你用了 GROUP BY 分组, HAVING 会在分组完成后对结果集再次筛选。AS别名也不能在这个阶段使用.

5. SELECT

确定结果之后,SELECT用来对结果col简单筛选或计算,决定输出什么数据.

6. DISTINCT

如果数据行有重复DISTINCT 将负责排重.

7. ORDER BY

在结果集确定的情况下,ORDER BY 对结果做排序。因为SELECT中的表达式已经执行完了。此时可以用AS别名.

8. LIMIT / OFFSET

最后 LIMITOFFSET 从排序的结果中截取部分数据.

Exercise

Screen Shot 2020-10-30 at 23.07.36

Screen Shot 2020-10-30 at 23.07.50

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
/*
select director, count(*) from movies
group by director
*/

/*
select director, sum(domestic_sales + international_sales)
from movies
left join boxoffice
on movies.id=boxoffice.movie_id
group by director
*/

/*
select director, sum(domestic_sales + international_sales) as total_sales, count(*) as cnt, sum(domestic_sales + international_sales)/count(*)
from movies
join boxoffice
on movies.id=boxoffice.movie_id
group by director
having count(*) > 1
order by total_sales/cnt desc
limit 1
*/


select title, (select max(domestic_sales+international_sales)
from movies
left join boxoffice
on movies.id=boxoffice.movie_id)
-(domestic_sales + international_sales) as diff
from movies
join boxoffice
on movies.id=boxoffice.movie_id

Query something and use it as a variable

A summary

http://www.xuesql.cn/static/%E9%87%91%E8%80%81%E5%B8%88%E6%89%8B%E5%86%8C.html

There is a better resource…

https://sqlbolt.com/lesson/inserting_rows

Insert

What is a Schema?

We previously described a table in a database as a two-dimensional set of rows and columns, with the columns being the properties and the rows being instances of the entity in the table. In SQL, the database schema is what describes the structure of each table, and the datatypes that each column of the table can contain.

Insert statement with values for all columns

1
2
INSERT INTO mytable 
VALUES (value_or_expr, another_value_or_expr, …), (value_or_expr_2, another_value_or_expr_2, …), …;

Insert statement with specific columns

1
2
3
4
5
INSERT INTO mytable 
(column, another_column, …)
VALUES (value_or_expr, another_value_or_expr, …),
(value_or_expr_2, another_value_or_expr_2, …),
…;

In these cases, the number of values need to match the number of columns specified.

In addition, you can use mathematical and string expressions with the values that you are inserting.

Example Insert statement with expressions

1
2
3
INSERT INTO boxoffice 
(movie_id, rating, sales_in_millions)
VALUES (1, 9.9, 283742034 / 1000000);

Update

1
2
3
4
5
UPDATE mytable
SET column = value_or_expr,
other_column = another_value_or_expr,

WHERE condition;

Delete

1
2
DELETE FROM mytable
WHERE condition;

Create tables

1
2
3
4
5
CREATE TABLE IF NOT EXISTS mytable (
column DataType TableConstraint DEFAULT default_value,
another_column DataType TableConstraint DEFAULT default_value,

);
Data type Description
INTEGER, BOOLEAN The integer datatypes can store whole integer values like the count of a number or an age. In some implementations, the boolean value is just represented as an integer value of just 0 or 1.
FLOAT, DOUBLE, REAL The floating point datatypes can store more precise numerical data like measurements or fractional values. Different types can be used depending on the floating point precision required for that value.
CHARACTER(num_chars), VARCHAR(num_chars), TEXT The text based datatypes can store strings and text in all sorts of locales. The distinction between the various types generally amount to underlaying efficiency of the database when working with these columns.Both the CHARACTER and VARCHAR (variable character) types are specified with the max number of characters that they can store (longer values may be truncated), so can be more efficient to store and query with big tables.
DATE, DATETIME SQL can also store date and time stamps to keep track of time series and event data. They can be tricky to work with especially when manipulating data across timezones.
BLOB Finally, SQL can store binary data in blobs right in the database. These values are often opaque to the database, so you usually have to store them with the right metadata to requery them.
Constraint Description
PRIMARY KEY This means that the values in this column are unique, and each value can be used to identify a single row in this table.
AUTOINCREMENT For integer values, this means that the value is automatically filled in and incremented with each row insertion. Not supported in all databases.
UNIQUE This means that the values in this column have to be unique, so you can’t insert another row with the same value in this column as another row in the table. Differs from the PRIMARY KEY in that it doesn’t have to be a key for a row in the table.
NOT NULL This means that the inserted value can not be NULL.
CHECK (expression) This allows you to run a more complex expression to test whether the values inserted are valid. For example, you can check that values are positive, or greater than a specific size, or start with a certain prefix, etc.
FOREIGN KEY This is a consistency check which ensures that each value in this column corresponds to another value in a column in another table. For example, if there are two tables, one listing all Employees by ID, and another listing their payroll information, the FOREIGN KEY can ensure that every row in the payroll table corresponds to a valid employee in the master Employee list.

Altering tables

Altering table to add new column(s)

1
2
3
ALTER TABLE mytable 
ADD column DataType OptionalTableConstraint
DEFAULT default_value;

Altering table to remove column(s)

1
2
ALTER TABLE mytable 
DROP column_to_be_deleted;

Altering table name

1
2
ALTER TABLE mytable 
RENAME TO new_table_name;

Dropping tables

Drop table statement

1
DROP TABLE IF EXISTS mytable;

Second tutorial

https://www.w3school.com.cn/sql

Order by

Screen Shot 2020-10-31 at 11.45.52

Insert into

1
INSERT INTO 表名称 VALUES (值1, 值2,....)

我们也可以指定所要插入数据的列:

1
INSERT INTO table_name (列1, 列2,...) VALUES (值1, 值2,....)

Update

1
UPDATE 表名称 SET 列名称 = 新值 WHERE 列名称 = 某值

更新某一行中的若干列

我们会修改地址(address),并添加城市名称(city):

1
2
UPDATE Person SET Address = 'Zhongshan 23', City = 'Nanjing'
WHERE LastName = 'Wilson'

结果

LastName FirstName Address City
Gates Bill Xuanwumen 10 Beijing
Wilson Fred Zhongshan 23 Nanjing

Delete

1
DELETE FROM 表名称 WHERE 列名称 = 值

删除所有行

可以在不删除表的情况下删除所有的行。这意味着表的结构、属性和索引都是完整的:

1
DELETE FROM table_name

或者:

1
DELETE * FROM table_name

Wildcard

通配符 描述
% 替代一个或多个字符
_ 仅替代一个字符
[charlist] 字符列中的任何单一字符
[^charlist]或者[!charlist] 不在字符列中的任何单一字符

我们希望从上面的 “Persons” 表中选取居住的城市不以 “A” 或 “L” 或 “N” 开头的人:

我们可以使用下面的 SELECT 语句:

1
2
SELECT * FROM Persons
WHERE City LIKE '[!ALN]%'

In

IN 操作符允许我们在 WHERE 子句中规定多个值。

1
2
3
SELECT column_name(s)
FROM table_name
WHERE column_name IN (value1,value2,...)

现在,我们希望从上表中选取姓氏为 Adams 和 Carter 的人:

我们可以使用下面的 SELECT 语句:

1
2
SELECT * FROM Persons
WHERE LastName IN ('Adams','Carter')

Between

操作符 BETWEEN … AND 会选取介于两个值之间的数据范围。这些值可以是数值、文本或者日期。

1
2
3
4
SELECT column_name(s)
FROM table_name
WHERE column_name
(NOT) BETWEEN value1 AND value2

重要事项:不同的数据库对 BETWEEN…AND 操作符的处理方式是有差异的。某些数据库会列出介于 “Adams” 和 “Carter” 之间的人,但不包括 “Adams” 和 “Carter” ;某些数据库会列出介于 “Adams” 和 “Carter” 之间并包括 “Adams” 和 “Carter” 的人;而另一些数据库会列出介于 “Adams” 和 “Carter” 之间的人,包括 “Adams” ,但不包括 “Carter” 。

所以,请检查你的数据库是如何处理 BETWEEN….AND 操作符的!

Alias

假设我们有两个表分别是:”Persons” 和 “Product_Orders”。我们分别为它们指定别名 “p” 和 “po”。

现在,我们希望列出 “John Adams” 的所有定单。

我们可以使用下面的 SELECT 语句:

1
2
3
SELECT po.OrderID, p.LastName, p.FirstName
FROM Persons AS p, Product_Orders AS po
WHERE p.LastName='Adams' AND p.FirstName='John'

Union

UNION 操作符用于合并两个或多个 SELECT 语句的结果集。

请注意,UNION 内部的 SELECT 语句必须拥有相同数量的列。列也必须拥有相似的数据类型。同时,每条 SELECT 语句中的列的顺序必须相同。

SQL UNION 语法

1
2
3
SELECT column_name(s) FROM table_name1
UNION
SELECT column_name(s) FROM table_name2

注释:默认地,UNION 操作符选取不同的值。如果允许重复的值,请使用 UNION ALL。

SQL UNION ALL 语法

1
2
3
SELECT column_name(s) FROM table_name1
UNION ALL
SELECT column_name(s) FROM table_name2

另外,UNION 结果集中的列名总是等于 UNION 中第一个 SELECT 语句中的列名。

Select into

SELECT INTO 语句从一个表中选取数据,然后把数据插入另一个表中。

SELECT INTO 语句常用于创建表的备份复件或者用于对记录进行存档。

SQL SELECT INTO 语法

您可以把所有的列插入新表:

1
2
3
SELECT *
INTO new_table_name [IN externaldatabase]
FROM old_tablename

或者只把希望的列插入新表:

1
2
3
SELECT column_name(s)
INTO new_table_name [IN externaldatabase]
FROM old_tablename

Create database and table

1
CREATE DATABASE database_name
1
2
3
4
5
6
7
8
CREATE TABLE Persons
(
Id_P int,
LastName varchar(255),
FirstName varchar(255),
Address varchar(255),
City varchar(255)
)

Delete databases

1
DROP DATABASE <database name>;

Constraints

Not null

NOT NULL 约束强制列不接受 NULL 值。

NOT NULL 约束强制字段始终包含值。这意味着,如果不向字段添加值,就无法插入新记录或者更新记录。

Unique

UNIQUE 约束唯一标识数据库表中的每条记录。

UNIQUE 和 PRIMARY KEY 约束均为列或列集合提供了唯一性的保证。

PRIMARY KEY 拥有自动定义的 UNIQUE 约束。

SQL UNIQUE Constraint on CREATE TABLE

MySQL在 “Persons” 表创建时在 “Id_P” 列创建 UNIQUE 约束:

1
2
3
4
5
6
7
8
9
CREATE TABLE Persons
(
Id_P int NOT NULL,
LastName varchar(255) NOT NULL,
FirstName varchar(255),
Address varchar(255),
City varchar(255),
UNIQUE (Id_P)
)

如果需要命名 UNIQUE 约束,以及为多个列定义 UNIQUE 约束,请使用下面的 SQL 语法:

MySQL / SQL Server / Oracle / MS Access:

1
2
3
4
5
6
7
8
9
CREATE TABLE Persons
(
Id_P int NOT NULL,
LastName varchar(255) NOT NULL,
FirstName varchar(255),
Address varchar(255),
City varchar(255),
CONSTRAINT uc_PersonID UNIQUE (Id_P,LastName)
)
SQL UNIQUE Constraint on ALTER TABLE

当表已被创建时,如需在 “Id_P” 列创建 UNIQUE 约束,请使用下列 SQL:

MySQL / SQL Server / Oracle / MS Access:

1
2
ALTER TABLE Persons
ADD UNIQUE (Id_P)

如需命名 UNIQUE 约束,并定义多个列的 UNIQUE 约束,请使用下面的 SQL 语法:

MySQL / SQL Server / Oracle / MS Access:

1
2
ALTER TABLE Persons
ADD CONSTRAINT uc_PersonID UNIQUE (Id_P,LastName)

Foreign key

“Persons” 表:

Id_P LastName FirstName Address City
1 Adams John Oxford Street London
2 Bush George Fifth Avenue New York
3 Carter Thomas Changan Street Beijing

“Orders” 表:

Id_O OrderNo Id_P
1 77895 3
2 44678 3
3 22456 1
4 24562 1
SQL FOREIGN KEY Constraint on CREATE TABLE

下面的 SQL 在 “Orders” 表创建时为 “Id_P” 列创建 FOREIGN KEY:

MySQL:

1
2
3
4
5
6
7
8
CREATE TABLE Orders
(
Id_O int NOT NULL,
OrderNo int NOT NULL,
Id_P int,
PRIMARY KEY (Id_O),
FOREIGN KEY (Id_P) REFERENCES Persons(Id_P)
)

如果需要命名 FOREIGN KEY 约束,以及为多个列定义 FOREIGN KEY 约束,请使用下面的 SQL 语法:

MySQL / SQL Server / Oracle / MS Access:

1
2
3
4
5
6
7
8
9
CREATE TABLE Orders
(
Id_O int NOT NULL,
OrderNo int NOT NULL,
Id_P int,
PRIMARY KEY (Id_O),
CONSTRAINT fk_PerOrders FOREIGN KEY (Id_P)
REFERENCES Persons(Id_P)
)
SQL FOREIGN KEY Constraint on ALTER TABLE

如果在 “Orders” 表已存在的情况下为 “Id_P” 列创建 FOREIGN KEY 约束,请使用下面的 SQL:

MySQL / SQL Server / Oracle / MS Access:

1
2
3
ALTER TABLE Orders
ADD FOREIGN KEY (Id_P)
REFERENCES Persons(Id_P)

如果需要命名 FOREIGN KEY 约束,以及为多个列定义 FOREIGN KEY 约束,请使用下面的 SQL 语法:

MySQL / SQL Server / Oracle / MS Access:

1
2
3
4
ALTER TABLE Orders
ADD CONSTRAINT fk_PerOrders
FOREIGN KEY (Id_P)
REFERENCES Persons(Id_P)

Check

如果需要命名 CHECK 约束,以及为多个列定义 CHECK 约束,请使用下面的 SQL 语法:

MySQL / SQL Server / Oracle / MS Access:

1
2
3
4
5
6
7
8
9
CREATE TABLE Persons
(
Id_P int NOT NULL,
LastName varchar(255) NOT NULL,
FirstName varchar(255),
Address varchar(255),
City varchar(255),
CONSTRAINT chk_Person CHECK (Id_P>0 AND City='Sandnes')
)
SQL CHECK Constraint on ALTER TABLE

如果在表已存在的情况下为 “Id_P” 列创建 CHECK 约束,请使用下面的 SQL:

MySQL / SQL Server / Oracle / MS Access:

1
2
ALTER TABLE Persons
ADD CHECK (Id_P>0)

如果需要命名 CHECK 约束,以及为多个列定义 CHECK 约束,请使用下面的 SQL 语法:

MySQL / SQL Server / Oracle / MS Access:

1
2
ALTER TABLE Persons
ADD CONSTRAINT chk_Person CHECK (Id_P>0 AND City='Sandnes')

Default

SQL DEFAULT Constraint on CREATE TABLE

下面的 SQL 在 “Persons” 表创建时为 “City” 列创建 DEFAULT 约束:

My SQL / SQL Server / Oracle / MS Access:

1
2
3
4
5
6
7
8
CREATE TABLE Persons
(
Id_P int NOT NULL,
LastName varchar(255) NOT NULL,
FirstName varchar(255),
Address varchar(255),
City varchar(255) DEFAULT 'Sandnes'
)

通过使用类似 GETDATE() 这样的函数,DEFAULT 约束也可以用于插入系统值:

1
2
3
4
5
6
7
CREATE TABLE Orders
(
Id_O int NOT NULL,
OrderNo int NOT NULL,
Id_P int,
OrderDate date DEFAULT GETDATE()
)
SQL DEFAULT Constraint on ALTER TABLE

如果在表已存在的情况下为 “City” 列创建 DEFAULT 约束,请使用下面的 SQL:

MySQL:

1
2
ALTER TABLE Persons
ALTER City SET DEFAULT 'SANDNES'

SQL Server / Oracle / MS Access:

1
2
ALTER TABLE Persons
ALTER COLUMN City SET DEFAULT 'SANDNES'

Practice

Create database with .sql file

ref: https://stackoverflow.com/questions/10769344/create-mysql-database-with-sql-file

1
mysql -u username -p ojs < "C:\ojs.sql"

Split query and data

We can split query statement and data in PyMySQL

1
2
3
4
5
MySQL_query = "INSERT INTO table (col1, col2) VALUES (%s, %s)"

MySQL_data = (var1, var2)

cursor.execute(MySQL_query, MySQL_data)

MySQL INSERT ON DUPLICATE KEY UPDATE

Ref: https://www.mysqltutorial.org/mysql-insert-or-update-on-duplicate-key-update/

The INSERT ON DUPLICATE KEY UPDATE is a MySQL’s extension to the SQL standard’s INSERT statement.

When you insert a new row into a table if the row causes a duplicate in UNIQUE index or PRIMARY KEY , MySQL will issue an error.

However, if you specify the ON DUPLICATE KEY UPDATE option in the INSERT statement, MySQL will update the existing row with the new values instead.

The syntax of INSERT ON DUPLICATE KEY UPDATE statement is as follows:

1
2
3
4
5
6
INSERT INTO table (column_list)
VALUES (value_list)
ON DUPLICATE KEY UPDATE
c1 = v1,
c2 = v2,
...;Code language: SQL (Structured Query Language) (sql)

The only addition to the INSERT statement is the ON DUPLICATE KEY UPDATE clause where you specify a list of column-value-pair assignments in case of duplicate.

Basically, the statement first tries to insert a new row into the table. If a duplicate error occurs, it will update the existing row with the value specified in the ON DUPLICATE KEY UPDATE clause.

MySQL case sensitive configuration

Ref: https://blog.csdn.net/fdipzone/article/details/73692929

1
2
3
4
5
6
7
8
show global variables like '%lower_case%';		

+------------------------+-------+
| Variable_name | Value |
+------------------------+-------+
| lower_case_file_system | ON |
| lower_case_table_names | 0 |
+------------------------+-------+

lower_case_file_system is a readonly variable showing the system case sensitive situation;

ON: not sensitive
OFF: sensitive

lower_case_table_names shows table/database name sensitive situation

lower_case_table_names = 0,mysql will directly use the table name, case sensitive
lower_case_table_names = 1时,mysql will change it to lower case, case insensive

Config in my.cnf, add this and restart:

lower_case_table_names = 0/1

MySQL database rename

Ref: https://stackoverflow.com/questions/67093/how-do-i-quickly-rename-a-mysql-database-change-schema-name

Use these few simple commands:

1
2
3
mysqldump -u username -p -v olddatabase > olddbdump.sql
mysql -u username -p create newdatabase
mysql -u username -p newdatabase < olddbdump.sql

Or to reduce I/O use the following as suggested by @Pablo Marin-Garcia:

1
2
mysqladmin -u username -p create newdatabase
mysqldump -u username -v olddatabase -p | mysql -u username -p -D newdatabase