MySQL select optimization
Posted by ark, ,
This came up at work but the lessons are so general I'm putting it here in my public blog rather than my work one.

work out how your table is setup
-- SHOW CREATE TABLE tablename;

work out how much space your table is taking up
-- SHOW TABLE STATUS LIKE 'tablename';

work out which tables are using the most space (naive)
-- 'show table status \G' | egrep 'Name|Data_length:' | while read fil; do read s; echo "${s/Data_length: /} $fil"; done | sort -n

now explain your sql query
-- EXPLAIN SELECT date , SUM(number) AS number
FROM tablename
WHERE date >= '2005-04-06' AND date <= '2005-04-09' AND metric <= 7 GROUP BY date; Look at the rows column in that output and try and make some new keys that will help make it smaller. Once you do you need to add another index -- ALTER TABLE tablename ADD INDEX (days, platform); while you're doing this you can monitor progress by having a look in /var/lib/mysql/databasename/
there should be some # files
$ \ls -Fsh \#* tablename.* | cat
12K #sql-1abd_8.frm
261M #sql-1abd_8.MYD
136M #sql-1abd_8.MYI
12K tablename.frm
343M tablename.MYD
135M tablename.MYI
when the # files approch the size of the tablename files you know it's close to being done.

None of that helped me much, what DID speed up my queries was getting rid of the <= operators. This involves making two other tables one with dates in them and another with numbers DROP TABLE IF EXISTS dates; CREATE TABLE dates ( date date NOT NULL default '0000-00-00', PRIMARY KEY (date) ) TYPE=MyISAM; -- fill it like this -- for fil in $(seq 0 5000); do D=$(date +%F -d "2000-01-01 +$fil days"); -- echo "INSERT into dates (date) values ('$D');"; done | mysql blah blah blah and DROP TABLE IF EXISTS numbers; CREATE TABLE numbers ( value int unsigned NOT NULL default 0, PRIMARY KEY (value) ) TYPE=MyISAM; INSERT into numbers (value) values (0); INSERT into numbers (value) values (1); INSERT into numbers (value) values (2); INSERT into numbers (value) values (3);... then that query becomes: EXPLAIN SELECT t.date as date, SUM(t.number) AS number FROM tablename t, dates d, numbers n WHERE d.date >= '2005-04-06' AND d.date <= '2005-04-09'
AND n.value <= 7
AND d.date = t.date AND d.number = n.value
GROUP BY date;

which should be MUCH faster.

Comments