Add index on column for select performance to Mysql Database
Recently, I got a request from co-working department. They asked me to find rows using specific keys. Thank goodness, it was just a single table selection. The problem was that they need 50,000 specific records out of million records. More over, the department didn't have any information about the primary keys. They only know the specific data which is exactly mapping to the column on mysql table.
So I decided to add index on the table. Before adding index, I tried on Mysql console to find out how long it will takes and what the result of describe query will be.
About the table,
select count(*) from table;
==> 1206240;
SELECT
table_name AS `Table`,
round(((data_length + index_length) / 1024 / 1024), 2) `Size in MB`
FROM information_schema.TABLES
WHERE table_schema = "database"
AND table_name = "table";
==> 292.30 MB
Every value of the specific_column's length is 24.
About the Server,
Mysql Server : AWS RDS mysql 5.7.17 mysql community server
Instance Type : db.r3.xlarge
Disk Type : ssd
Region : EU (Irlend)
Under this circumstances, when I tried to select using primary key on where,
select * from table where pk_column=1;
==> 0.29s,
desc select * from table where pk_column=1;
==> type : const, ref : const
And If I select using the column before index,
select * from table where specific_colum='abcdefg';
==> 2.18s
desc select * from table where specific_colum='abcdefg';
==> type : all, ref : null
After indexing completed, I tried to select again hoping to see better result.
select * from table where specific_colum='abcdefg';
==> 0.28s
I run this command on my laptop using Mysql Client. The result was better and I know we take it for granted. That's why the indexing exists. Maybe there will be another constant connection time, region routing time, and other delays which is depend on network.
Anyway, thanks to the index, I could deliver the data on time as requested from the department.
So I decided to add index on the table. Before adding index, I tried on Mysql console to find out how long it will takes and what the result of describe query will be.
About the table,
select count(*) from table;
==> 1206240;
SELECT
table_name AS `Table`,
round(((data_length + index_length) / 1024 / 1024), 2) `Size in MB`
FROM information_schema.TABLES
WHERE table_schema = "database"
AND table_name = "table";
==> 292.30 MB
Every value of the specific_column's length is 24.
About the Server,
Mysql Server : AWS RDS mysql 5.7.17 mysql community server
Instance Type : db.r3.xlarge
Disk Type : ssd
Region : EU (Irlend)
Under this circumstances, when I tried to select using primary key on where,
select * from table where pk_column=1;
==> 0.29s,
desc select * from table where pk_column=1;
==> type : const, ref : const
And If I select using the column before index,
select * from table where specific_colum='abcdefg';
==> 2.18s
desc select * from table where specific_colum='abcdefg';
==> type : all, ref : null
After indexing completed, I tried to select again hoping to see better result.
select * from table where specific_colum='abcdefg';
==> 0.28s
desc select * from table where specific_colum='abcdefg';
==> type : ref, ref : const
I run this command on my laptop using Mysql Client. The result was better and I know we take it for granted. That's why the indexing exists. Maybe there will be another constant connection time, region routing time, and other delays which is depend on network.
Anyway, thanks to the index, I could deliver the data on time as requested from the department.
Comments
Post a Comment