分享SQL性能优化的21个技巧!

17337
23
Catalogue

一、Inquire SQL try not to use select *, but specific fields
二、 Avoid using or in the where clause to connect conditions
三、Use numeric values instead of character string as much as possible
四、Use varchar instead of char
五、Technical extension, the differences between char and varchar2?
六、Using default value instead of null in where
七、 Avoid using the where clause with ! = or <> operator
八、inner join, left join, right join, give priority to using inner join
九、 Improve the efficiency of the group by statement
十、When emptying the table, give priority to the use of truncate
十一、 Operate delete or update statement, add a limit or loop to delete in batches
十二、UNION operator
十三、 Batch insertion performance improvement
十四、 Not too many table joins, not too many indexes, generally within 5
十五、Avoid using built-in functions on index columns
十六、the combination of indexes
十七、 composite index leftmost feature
十八、 optimize like statement
十九、Use explain to analyze your SQL execution plan
二十、Some other optimizing ways
二十一、Mind Map
、Inquire SQL try not to use select *, but specific fields
1、counter example
1|SELECT * FROM user
2、positive example
1|SELECT id,username,tel FROM user
3、Reason
1.Save resources, reduce network expense.
2.Possibly using covering indexes to reduce table returns and improve query efficiency.
Note: To save time, the sample fields below have been replaced with *.
二、 Avoid using or in the where clause to connect conditions
1、counter example
1|SELECT * FROM user WHERE id=1 OR salary=5000
2、positive example
(1)use UNION ALL
1| SELECT * FROM user WHERE id=1
2| UNION ALL
3| SELECT * FROM user WHERE salary=5000
(2) Write two separate sql
1|SELECT * FROM user WHERE id=1
2|
3|SELECT * FROM user WHERE salary=5000
3. Reasons
1. the use of or may invalidate the index, and thus the full table scan.
2. for or without the index of the salary this case, assume that it takes the index of id, but when it goes to the salary query conditions, it still has to full table scan.
3. that is to say, the whole process requires three steps: full table scan + index scan + merge. If it had started with a full table scan, it is done with one scan directly.
4. although mysql has an optimizer, considered in terms of efficiency and cost, encountering or conditions, the index may still be invalid;.
三、Use numeric values instead of character string as much as possible
1、positive example
1. primary key (id): primary key gives priority to the use of numeric type int, tinyint
2. gender (sex): 0 represents female, 1 represents male; database does not have boolean type, mysql is recommended to use tinyint
2、Reason
1. Because the engine compares each character in the character string one by one when processing queries and joins.
2. While for numeric types it is enough to compare only once.
3. Characters will reduce the performance of queries and joins and will increase the storage expense.
四、 Using varchar instead of char
1、 counter example
1|address` char(100) DEFAULT NULL COMMENT 'address'
2、 Positive example
1|address` varchar(100) DEFAULT NULL COMMENT 'address'
3. Reasons
1. varchar variable-length fields are stored according to the actual length of the data content, with small storage space, which can save storage space.
2. char is stored according to the declared size, insufficient to fill in the spaces.
3. next for queries, search within a relatively small field, more efficient.
五、Technical extension, the difference between char and varchar2?
1. The length of char is fixed, while the length of varchar2 can be changed.
For example,Store String "101", for char(10), it means that the character you store will take up 10 characters (including 7 empty characters), and it is occupied by spaces in the database, while the same varchar2(10) only takes up 3 bytes of length, 10 is just the maximum, when the character you store is less than 10, it will be stored according to the actual length.
2、char is slightly more efficient than varchar2. 3、When do we use char and when do we use varchar2? char and varchar2 is more efficient than varchar2.
3、When do we use char and when do we use varchar2?
char and varchar2 is a contradictory unity, the two are complementary relationship, varchar2 is less space-saving than char, but a little less efficient than char, if you want to get efficiency, you must sacrifice a little space, this is what we often say in the database design " use the space in exchange for efficiency".
Although varchar2 is more space-saving than char, but if a varchar2 column is frequently modified, and each time the length of the modified data is different, this will cause "Row Migrating" phenomenon, and this causes extra I/O, which is the one to avoid in the database design, it is better to use char instead of varchar2 in this case. char will also automatically fill in the spaces, because you insert to a char field automatically supplemented by spaces, but select after the space is not deleted, so when querying the char type you must remember to use trim, which is the reason for writing this article.
If a developer refines the use of the rpad() trick to convert a bind variable to some type that can be compared to a char field (of course, it is better to fill the bind variable than to truncate the trim database column, since applying the function trim to a column can easily result in not being able to use the existing index on that column), it may have to take into account that changes in the column length over time. If there is a change in the size of the field, the application will be affected because it will have to modify the field width.
It is for the above reasons that fixed-width storage space can lead to tables and associated indexes that are much larger than usual, and is accompanied by problems with bind variables, so it is important to avoid using char types no matter what circumstances.
六、Using default value instead of null in where
1、counter example
1|SELECT * FROM user WHERE age IS NOT NULL
2、positive example
1|SELECT * FROM user WHERE age>0
3、reason
1. it is not that if one uses is null or is not null, one will not go to the index, this is related to mysql version and the cost of the query.
2. If the mysql optimizer finds that the cost of going to the index is higher than not going to the index, it will give up the index, these conditions ! =, <>, is null, is not null are often considered to invalidate the index.
3. in fact, it is because in general, the cost of the query is high, the optimizer automatically gives up the index
4. if the null value, replaced by the default value, many times let the go index become possible, at the same time, the expressed meaning is relatively clear a little;.
七、 Avoid using the != or <> operator in the where clause. = or <> operator
1. Counter example
1|SELECT * FROM user WHERE salary!=5000
2|
3|SELECT * FROM user WHERE salary<>5000
2、Reason
1.Using ! = and <> is likely to invalidate the index
2.You should try to avoid using the ! = or <> operator, otherwise the engine will give up using the index and perform a full table scan
3.Achieve business priority, there is really no choice but to use, not unable to use
八、inner join, left join, right join, give priority to using inner join
If the results of the three joins are the same, the priority is to use inner join, if you use left join the left table should be as small as possible.
; inner join within connect, only retaining the exact matching result set in the two tables.
; left join will return all rows of the left table, even if there are no matching records in the right table.
; right join will return all rows of the right table, even if there are no matching records in the left table.
Why?
; if inner join is an equal-join, it returns fewer rows, so performance is relatively better.
; uses a left join, the left table data result is as small as possible, and the conditions are placed on the left side of the process as much as possible,meaning that the number of rows returned may be less.
; this is the mysql principle of optimality that small tables drive large tables, and small data sets drive large data sets, thus allowing better performance;
九、 Improve the efficiency of the group by statement
1、Counter example
Group first, then filter
1|select job, avg(salary) from employee
2|group by job
3|having job = 'develop' or job = 'test';
2、Positive example
Filter first, group later
1|select job, avg(salary) from employee
2|where job = 'develop' or job = 'test'
3|group by job;
3、Reason
You can filter out the unwanted records before executing to this statement.
十、When emptying the table, give priority to the use of truncate
truncate table is functionally the same as the delete statement without the where clause: both delete all the rows in the table. But truncate table is faster than delete and uses less system and transaction journal resources.
The delete statement deletes one row each time and records an item in the transaction log for each row deleted. A truncate table deletes data by releasing the data pages used to store the table data and records the page release only in the transaction log.
truncate table removes all rows from the table, but the table structure and its columns, constraints, indexes, etc. remain unchanged. The count value used for the new row identifier is reset to the seed of the column. If you want to keep the identification count value, use DELETE instead. to delete the table definition and its data, use the drop table statement.
For a table referenced by a foreign key constraint, you cannot use truncate table; instead, use the DELETE statement without the where clause. Since truncate table is not recorded in the log, it does not activate triggers.
A truncate table cannot be used for a table that participates in an indexed view.
十一、operate delete or update statement, add a limit or loop to delete in batches
1、Reduce the cost of writing the wrong SQL
Emptying table data is not a little things, a shaky hand all gone, delete the library and run away? If you add limit, delete by mistake just lose part of the data, you can quickly recover through the binlog log.
2、SQL efficiency is likely to be higher
With the addition of limit 1 in SQL, if the first one hits the target return, without the limit, it will continue to execute the scan table.
3、Avoid long transactions
When delete is executed, if age adds an index, MySQL will add write locks and gap locks to all relevant rows, and all execution-related rows will be locked, and if the number of deletions is large, it will directly cause the related business to be unusable.
4. If the data volume is large, it is easy to fill up the CPU
If you delete a large amount of data, do not add a limit to limit the number of records, it is easy to hit the cpu full, resulting in the more slow deletion.
5、Lock table
Deleting too much data at once may cause lock table, there will be lock wait timeout exceeded error, so it is recommended to operate in batches.
十二、UNION operator
UNION will filter out duplicate records after table linking, so the resulting result set will be sorted and operated after the table linking to remove duplicate records before returning the results.
In practice, most applications do not generate duplicate records, the most common is the process table and the history table UNION.
1| select username,tel from user
2| union
3| select departmentname from department
This SQL first takes out the results of the two tables at runtime, and then sorts them with sorting space to remove duplicate records, and finally returns the result set, which may lead to sorting with disk if the table data quantity is large.
Recommended programme: Use UNION ALL operator instead of UNION, because UNION ALL operation simply combines the two results and returns later.
十三、 Batch insertion performance improvement
1、Multiple submissions
1| INSERT INTO user (id,username) VALUES(1,'哪吒编程');
2|
3| INSERT INTO user (id,username) VALUES(2,'妲己');
2、Batch submission
1| INSERT INTO user (id,username) VALUES(1,'哪吒编程'),(2,'妲己');
3、Reason
Default added SQL has transaction control, resulting in each one needs the start of the transaction and commitment of the transaction, while batch processing is a the start of the transaction and commitment of the transaction, efficiency promotion is obvious, to a certain degree of volume, the results are significant, usually invisible.
十四、 Not too many table joins, not too many indexes, generally within 5
1、Table joins should not be too many, generally within 5
1. The more the number of linked tables, the more the compilation time and expense will be
2. A temporary table is generated in memory for each connection
3. The connection table should be broken up into smaller executions for higher readability
4. If you have to join many tables to get the data, it means it's a bad design
5. In Ali specification, it is suggested to check three tables or less for multi-table joins
2, the index should not be too many, generally within 5
1. index is not the more the better, although it improves the efficiency of the query, but will reduce the efficiency of insertion and update.
2. index can be understood that one is a table, which can store data, its data will take up space.
3. index table data is sorted, sorting is also time consuming.
4. insert or update when it is possible to rebuild the index, if the data size is huge, the reconstruction will be carried out to reorder the records, so building an index needs to be carefully considered, depending on the specific situation.
5. the number of indexes in a table should better not exceed 5, if there are too many you need to consider whether some indexes are necessary to exist.
十五、Avoid using built-in functions on index columns
1、counter example
1| SELECT * FROM user WHERE DATE_ADD(birthdate,INTERVAL 7 DAY) >=NOW();
2、Positive example
1| SELECT * FROM user WHERE birthday >= DATE_ADD(NOW(),INTERVAL 7 DAY);
3、The reason
Use the built-in function on the index column, the index fails.
十六、Combined indexes
Sorting should be done in the order of the columns in the combined index, even if only one column in the index is to be sorted, otherwise the sorting performance will be worse.
1|create index IDX_USERNAME_TEL on user(deptid,position,createtime);
2|select username,tel from user where deptid= 1 and position = 'java developer' order by deptid,position,createtime desc;
Actually just query the records that meet the deptid= 1 and position = 'java development' condition and sort them in descending order by createtime, but write it as order by createtime desc for poor performance.
十七、 composite index leftmost feature
1、create a composite index
ALTER TABLE employee ADD INDEX idx_name_salary (name,salary)
2、 to meet the leftmost characteristic of the composite index, even if only partially, the composite index takes effect
SELECT * FROM employee WHERE NAME='which Zha programming'
3、there is no left field, it does not satisfy the leftmost characteristic, the index is invalid
SELECT * FROM employee WHERE salary=5000
4、the composite index is fully used, according to the left order of appearance name,salary, and the index takes effect
SELECT * FROM employee WHERE NAME='which Zha programming' AND salary=5000
5、Although the leftmost characteristic is violated, MySQL will optimize the execution of SQL, and bottom layer is reversed to optimize
SELECT * FROM employee WHERE salary=5000 AND NAME='Nezha Programming'
6、Reason
Composite index is also called a joint index, when we create a joint index, such as (k1,k2,k3), which is equivalent to creating (k1), (k1,k2) and (k1,k2,k3) three indexes, which is the principle of the leftmost match.
The joint index does not satisfy the leftmost principle and the index will generally fail.
十八、 optimize like statement
Fuzzy queries, programmers prefer to use like, but like is likely to invalidate your index.
1、counter example
select * from citys where name like '%Dalian' ( do not use index )
select * from citys where name like '%Dalian%' ( do not use index )
2. Positive example
select * from citys where name like '%Dalian' (use index).
3、Reason
; First try to avoid fuzzy queries, if you have to use, do not use the full fuzzy query, should also try to use the right fuzzy query, that is, like '...%', is going to use the index.
; Left fuzzy like '%...' cannot use the index directly, but can be changed into like '...%' by using the form reverse + function index.
; Full fuzzy query is not optimized, must be used if you recommend the use of search engines.
十九、 using explain to analyze your SQL execution plan
1、type
1. system: only one row of the table, basically not used.
2. const: table with a maximum of one row of data, with more triggers when querying the primary key.
3. eq_ref: for each combination of rows from the previous table, read a row from that table. This is probably the best join type, except for the const type.
4. ref: for each combination of rows from the preceding table, all rows with matching index values will be read from this table.
5. range: only rows in the given range are retrieved, using an index to select rows. range can be used when comparing keyword columns with constants using the =, <>, >, >=, <, <=, IS NULL, <=>, BETWEEN or IN operators.
6. index: This join type is the same as ALL, except that only the index tree is scanned. This is usually faster than ALL because index files are usually smaller than data files.
7. all: a full table scan.
8. Performance ranking: system > const > eq_ref > ref > range > index > all.
9. In actual sql optimization, the ref or range level is achieved last.
2、Extra common keywords
; Using index: only getting information from the index tree, without querying back to the table.
; Using where: The WHERE clause is used to restrict which row matches the next table or is sent to the client. Unless you specifically request or check all rows from the table, the query may have some errors if the Extra value is not Using where and the table join type is ALL or index. You need to query back to the table.
; Using temporary: mysql often builds a temporary table to contain the results, typically when the query contains GROUP BY and ORDER BY clauses that can list columns by case.
二十、some other optimizing way
1, when designing the table, all tables and fields are added with the corresponding comments.
2, SQL writing formats, keyword size to stay the same, use retract.
3、Before modifying or deleting important data, make a backup.
4、Many times using exists instead of in is a good choice
5、The fields after where, pay attention to the implicit conversion of its data type.
Not using indexes
1| SELECT * FROM user WHERE NAME=110
(1) Because without the single quotes, it is a comparison between character strings and numbers, and their types do not match
(2) MySQL will do an implicit type conversion to convert them to numeric types and then compare them
6. Try to define all columns as NOT NULL
NOT NULL columns are more space-saving, and NULL columns need an extra byte as a Flags to determine whether they are NULL or not.
NULL columns need to pay attention to the null pointer problem, NULL columns need to pay attention to the null pointer problem when calculating and comparing.
7, pseudo-delete design
8, the database and table character set as far as possible to unify the use of UTF8
(1) can avoid the problem of NetBeans.
(2) can avoid, different character set for the conversion of comparison, resulting in the index failure problem.
9. select count(*) from table.
Such a count without any conditions will cause a full table scan, and there is no business sense, is a must to eliminate.
10、Avoid expression operations on fields in where
(1) SQL analysis, if the field is related to the expression on the full table scan
(2) clean fields without expressions, the index takes effect
11、About temporary table
(1) avoid frequent creation and deletion of temporary tables to reduce the consumption of system table resources.
(2) in the new temporary table, if a one-time insertion of a large amount of data, then you can use select into instead of create table, to avoid creating art5r into a large number of log.
(3) If the amount of data is not large, in order to ease the resources of the system table, you should first create table, and then insert.
(4) If temporary tables are used, be sure to delete all temporary tables explicitly at the end of the procedure. First truncate table, and then drop table, so that you can avoid a longer period of time to lock the system table.
12、indexes are not suitable for building on fields with a lot of duplicate data, such as gender, sorting fields that should be created index
13、Remove distinct filter fields to less
1.The statement with distinct occupies more cpu time than the statement without distinct
2.When querying many fields, if you use distinct, the database engine will compare the data and filter out duplicate data
3.However, this comparison, filtering process will take up system resources, such as cpu time
14、try to avoid large transaction operations to improve the concurrency of the system
15、all tables must use Innodb storage engine
Innodb "support transactions, support row-level locking, better recovery", better performance under high concurrency, so it is no special requirements (that is, Innodb can not meet the function such as: column storage, storage space data, etc.), all tables must use Innodb storage engine.
16、try to avoid using cursors
Because the cursor efficiency is poor, if the cursor operation data more than 10,000 lines, then you should consider rewriting.
二十一、Mind Map
  • 641
23条回复