Monday 26 October 2009

Version 1.1.3 of improved Cacti graphs for MySQL released

Version 1.1.3 of improved Cacti graphs for MySQL released: "

I’ve just released version 1.1.3 of the Cacti templates I wrote for MySQL. This is a bug-fix release only, and affects only ss_get_mysql_stats.php. To upgrade from the previous release, upgrade ss_get_mysql_stats.php. Don’t forget to save and restore your configuration options, if any. (Note that there is a feature to help with this: you can keep configuration options in ss_get_mysql_stats.php.cnf so you don’t have to edit ss_get_mysql_stats.php itself.)



Next up: actual template changes! More graphs!



The changelog follows.



2009-10-24: version 1.1.3

* This is a bug-fix release only, and contains no template changes.
* To upgrade from the previous release, upgrade ss_get_mysql_stats.php.
* MySQL 5.1 broke backwards compatibility with table_cache (issue 63).
* Added a version number to the script (partial fix for issue 79).
* Added a test suite (issue 76, issue 59).
* Math operations were done in MySQL instead of PHP (issue 25).
* SHOW STATUS values didn't override SHOW INNODB STATUS parsing (issue 24).
* Long error messages were not appearing in the Cacti log.
* SHOW INNODB STATUS parsing for unpurged_txns was broken.
* SHOW INNODB STATUS parsing for innodb_lock_structs was broken.
* SHOW INNODB STATUS parsing for pending_log_flushes was broken (issue 62).
* SHOW INNODB STATUS parsing for pending_buf_pool_flushes was broken.
* SHOW INNODB STATUS parsing for pending_ibuf_aio_reads was broken.
* SHOW INNODB STATUS parsing for pending_aio_log_ios was broken.
* SHOW INNODB STATUS parsing for pending_aio_sync_ios was broken.
* Made SHOW INNODB STATUS parsing less sensitive to false positive matches.


Related posts:

  1. Version 1.1.2 of improved Cacti templates released
  2. Maatkit version 4334 released
  3. Maatkit version 3329 released




"

Sunday 25 October 2009

What is the difference among VIRT, RES, and SHR in top output

What is the difference among VIRT, RES, and SHR in top output: "VIRT stands for the virtual size of a process, which is the sum of memory it is actually using, memory it has mapped into itself (for instance the video card's RAM for the X server), files on disk that have been mapped into it (most notably shared libraries), and memory shared with other processes. VIRT represents how much memory the program is able to access at the present moment.

RES stands


"

How to Tune Swap Setting on Linux for Maximum Performance

How to Tune Swap Setting on Linux for Maximum Performance: "HANDLE WITH CARE

Since 2.6, there has been a way to tune how much Linux favors swapping out to disk compared to shrinking the caches when memory gets full.

ghoti adds: 'When an application needs memory and all the RAM is fully occupied, the kernel has two ways to free some memory at its disposal: it can either reduce the disk cache in the RAM by eliminating the oldest data or it may swap some


"

Use MySQL? You need Maatkit

Use MySQL? You need Maatkit: "

Maatkit is a pretty useful set of utilities for MySQL. From their site:


You can use Maatkit to prove replication is working correctly, fix corrupted data, automate repetitive tasks, speed up your servers, and much, much more.

One of the first things you can do after installing the toolkit (which may already be installed if you are running CentOS or Debian) is to run the mk-audit utility. It will give you a nice summary of your server, as well as point out potential problems in your configuration.


Here's a list of all the utilities included in Maatkit:




  • mk-archiver Archive rows from a MySQL table into another table or a file.

  • mk-audit Analyze, summarize and report on MySQL config, schema and operation

  • mk-checksum-filter Filter checksums from mk-table-checksum.


  • mk-deadlock-logger Extract and log MySQL deadlock information.

  • mk-duplicate-key-checker Find duplicate indexes and foreign keys on MySQL tables.

  • mk-fifo-split Split files and pipe lines to a fifo without really splitting.

  • mk-find Find MySQL tables and execute actions, like GNU find.

  • mk-heartbeat Monitor MySQL replication delay.


  • mk-kill Kill MySQL queries that match certain criteria.

  • mk-loadavg Watch MySQL load and take action when it gets too high.

  • mk-log-player Split and play MySQL slow logs.

  • mk-parallel-dump Dump sets of MySQL tables in parallel.

  • mk-parallel-restore Load files into MySQL in parallel.


  • mk-profile-compact Compact the output from mk-query-profiler.

  • mk-query-digest Parses logs and more. Analyze, transform, filter, review and
    report on queries.

  • mk-query-profiler Execute SQL statements and print statistics, or measure
    activity caused by other processes.

  • mk-show-grants Canonicalize and print MySQL grants so you can effectively
    replicate, compare and version-control them.

  • mk-slave-delay Make a MySQL slave server lag behind its master.


  • mk-slave-find Find and print replication hierarchy tree of MySQL slaves.

  • mk-slave-move Move a MySQL slave around in the replication hierarchy.

  • mk-slave-prefetch Pipeline relay logs on a MySQL slave to pre-warm caches.

  • mk-slave-restart Watch and restart MySQL replication after errors.

  • mk-table-checksum Perform an online replication consistency check, or
    checksum MySQL tables efficiently on one or many servers.


  • mk-table-sync Synchronize MySQL tables efficiently.

  • mk-upgrade Execute SQL statements against two MySQL servers and compare the results.

  • mk-visual-explain Format EXPLAIN output as a tree.



"

Friday 23 October 2009

Secure Your Server – Tips for DBAs

Secure Your Server – Tips for DBAs: "It has been the responsibility of the DBA to implement a security strategy, at least at the database level. All DBAs, if they haven’t already, should take a proactive approach to this crucial part of database administration. For this post I have compiled a list of general best practices below.



-- The List

1. Make sure that ALL root-level accounts have strong passwords, and apply them using hashes, not plain text. Otherwise anyone can connect as root without a password and be granted all privileges.



2. For ALL other accounts, use passwords and apply them using hashes, not plain text.

A good practice is to use the following:



mysql> GRANT SELECT, INSERT, ... ON `database`.* TO 'user'@'10.%' IDENTIFIED BY PASSWORD '*9B9C8E678DB2D62877E829D633535CBEB2B7E6E0';



NOTE: make sure you are using hashes and not plain text!

If you run something like below…



mysql> update user set password = password('somepassword') where user = 'someuser';

Make sure you remove it from the ~/.mysql_history file in your home directory and/or root’s home directory. Otherwise, if your server becomes compromised, the intruder can take full advantage of this file and do more damage.
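A minimal sketch of that workflow from the shell; the password is illustrative, and the hash shown is the example one from above:

# Generate the hash on a workstation, not on the production server:
mysql -e "SELECT PASSWORD('some-strong-password');"
# Copy the resulting '*...' string and grant with it, so the plain-text
# password never touches the server:
mysql -e "GRANT SELECT, INSERT ON \`database\`.* TO 'user'@'10.%' IDENTIFIED BY PASSWORD '*9B9C8E678DB2D62877E829D633535CBEB2B7E6E0';"
# And scrub the client history so no plain-text passwords linger:
rm -f ~/.mysql_history /root/.mysql_history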



3. Give permissions only as needed and use different logins for different purposes.



NOTE: given that column-level permissions hurt performance, try to stick to table-level permissions at the lowest, and database-level permissions where possible.



4. Avoid using FQDN or hostnames when granting access. Try to keep it to a subnet (Example: 10.0.2.%).



5. Avoid having the MySQL server open to the web by using ‘%’ for any user. Although remote exploits are few and far between, it is just better not to risk it.



6. Remove ALL blank user accounts. Even if they just have USAGE!



mysql> DELETE FROM mysql.user WHERE user = '';
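A minimal sketch of checking for and removing anonymous accounts from the shell, followed by the privilege reload needed for the change to take effect:

mysql -e "SELECT User, Host FROM mysql.user WHERE User = '';"
mysql -e "DELETE FROM mysql.user WHERE User = ''; FLUSH PRIVILEGES;"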



7. Remove the following from the mysql.db table



mysql> DELETE FROM mysql.db WHERE Db = 'test' OR Db = 'test\_%';



You might ask why the above removal is needed; here is an explanation…



Let’s say you have many different databases on a single MySQL instance where the users control the names of their databases. The traditional rules apply, like database names having to be unique, but some of these users have the same idea: adding a test database to their MySQL instance. Given that test is already taken on the server, one user would create a database named test_a and another would create test_b. With the grant in the mysql.db table, Db = 'test\_%', BOTH users would have read and write access to each other’s databases. If the overall goal is to keep users from seeing each other’s databases, then the Db = 'test\_%' grant must be removed.



A long explanation but I have seen it happen!
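A minimal sketch of inspecting those default grants before removing them, plus the privilege reload to run after the DELETE from point 7 above:

# See which hosts and users the wide-open test grants currently cover:
mysql -e "SELECT Db, Host, User FROM mysql.db WHERE Db = 'test' OR Db = 'test\_%';"
# After running the DELETE shown above:
mysql -e "FLUSH PRIVILEGES;"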
"

Thursday 22 October 2009

Basics of MySQL failover (using replication)

Basics of MySQL failover (using replication): "

For easy MySQL failover, replication is a great solution. This post will cover the basics of performing a failover in case your master dies.


First, set up one-way replication (or two-way, but don't plan to use both servers for writes at the same time). Next, you'll want to direct all activity, or at least the writes, to the master. If the master dies, there are two major concerns:



  1. Redirecting the clients to the slave. There are several ways to handle this, such as Heartbeat or MySQL's JDBC driver.

  2. Checking if the slave is caught up. This is trickier. If the master's binary logs are still available, then you can do SHOW SLAVE STATUS on the slave and compare Read_Master_Log_Pos to the master's binary log. For example, if SHOW SLAVE STATUS shows:
    Master_Log_File: localhost-bin.000051
    Read_Master_Log_Pos: 605


    Then look at the master's binary log files. A long listing will do the trick:



    ls -l *bin*

    -rw-rw---- 1 mysql mysql 698 2009-10-08 18:24 localhost-bin.000051





    Note that the binary log has a size of 698 bytes. Yet the slave only read up to byte position 605. So you can manually execute the activity that the slave missed:

    mysqlbinlog --start-position=605 localhost-bin.000051 | mysql -h slave-host-name






You may want to manually get the slave up-to-date before failing over. Or you can keep the slave read-only until you catch the slave up, then turn off read-only by doing SET GLOBAL read_only=0;
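A hedged sketch of that catch-up-and-promote sequence from a shell, using the example host and binary log values from this post:

# 1. See where the slave stopped reading the master's binary log:
mysql -h slave-host-name -e "SHOW SLAVE STATUS\G" | grep -E 'Master_Log_File|Read_Master_Log_Pos'
# 2. Replay whatever the slave missed, straight from the master's binlog file:
mysqlbinlog --start-position=605 localhost-bin.000051 | mysql -h slave-host-name
# 3. Promote the slave: stop the replication threads and allow writes:
mysql -h slave-host-name -e "STOP SLAVE; SET GLOBAL read_only = 0;"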


You may want to set up replication in both directions (sometimes called 'master-master' or two-way replication). This is a tricky setup if both servers are actively doing updates. But if you know that only one of the servers is updating at any one time, then it works quite well. Once you've failed over to the slave, it is now acting as a master. Any updates done on this server can be replicated back to the other server when it comes back up.


This solution is not perfect and may not be suitable for all situations. There is potential data loss (possibly unknown data loss) if the binary logs are no longer available on the master. For some scenarios it is better to have additional means of failover such as MySQL Cluster or DRBD for localized failures.


"

Wednesday 21 October 2009

How to capture debugging information with mk-loadavg

How to capture debugging information with mk-loadavg: "

Maatkit’s mk-loadavg tool is a helpful way to gather information about infrequent conditions on your database server (or any other server, really). We wrote it at Percona to help with those repeated cases of things like “every two weeks, my database stops processing queries for 30 seconds, but it’s not locked up and during this time there is nothing happening.” That’s pretty much impossible to catch in action, and these conditions can take months to resolve without the aid of good tools.



In this blog post I’ll illustrate a very simple usage of mk-loadavg to help in solving a much smaller problem: find out what is happening on the database server during periods of CPU spikes that happen every so often.



First, set everything up.




  1. Start a screen session: screen -S loadmonitoring. If you don’t have screen, you can run mk-loadavg as a daemon, but it’s much better to use screen in my opinion.

  2. Get mk-loadavg. For purposes of this blog post, I’m going to get the latest trunk code, because I know a bug or two has been fixed since the last release. wget http://www.maatkit.org/trunk/mk-loadavg

  3. Create a directory to hold the collected information in files. mkdir collected



Next let’s set up a script that mk-loadavg can use to gather some information when it detects a high CPU condition. Save the contents of this script as “collect-stats.sh”. The script will gather about 30 seconds worth of statistics. It uses a simple sentinel file /tmp/gatherinfo to prevent multiple occurrences from gathering statistics at the same time. (This is intentionally simple for demo purposes.)




#!/bin/bash
# Collect roughly 30 seconds of system and MySQL statistics into ./collected/.
# The sentinel file /tmp/gatherinfo keeps overlapping runs from double-collecting.
if [ -f /tmp/gatherinfo ]; then exit 0; fi
touch /tmp/gatherinfo
d=`date +%F-%T`
echo "gathering info for $d"
# One-shot snapshots of the process list, top, and InnoDB/processlist state:
ps -eaf >> collected/$d-ps 2>&1 &
top -bn1 > collected/$d-top 2>&1 &
mysql -e 'show innodb status\G show full processlist\G' >> collected/$d-innodbstatus 2>&1 &
# Sample vmstat, iostat and SHOW STATUS once a second for 30 seconds:
vmstat 1 30 >collected/$d-vmstat 2>&1 &
iostat -dx 1 30 >collected/$d-iostat 2>&1 &
mysqladmin ext -i1 -c30 > collected/$d-mysqladmin 2>&1 &
sleep 30
# Take a second snapshot at the end of the window for before/after comparison:
ps -eaf >> collected/$d-ps 2>&1 &
mysql -e 'show innodb status\G show full processlist\G' >> collected/$d-innodbstatus 2>&1 &
rm /tmp/gatherinfo


Now make the script executable: chmod +x collect-stats.sh. At this point we’re ready to start working. Let’s fire the stats-collection script when the system’s user CPU goes above 40%.




perl mk-loadavg --watch "Server:vmstat:us:>:40" --interval 1 --execute collect-stats.sh


If the CPU goes over 40%, you'll get a bunch of files in the collected directory, with helpful information to diagnose the problem. This example usage is pretty similar to a real-life one I set up recently. It enabled me to take a methodical approach to the problem:


  1. From the top output I was able to identify that MySQL was causing the spike.

  2. I then looked at the SHOW STATUS output to see what the database server was doing, using mext as a helper.

  3. From Select_scan and Handler_read_rnd_next I isolated table scans as a problem (see the sketch just after this list).

  4. From the saved SHOW PROCESSLIST I found problem queries and optimized them.
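A minimal sketch of how step 3 can look against the files collect-stats.sh writes (the file-name pattern comes from the script above; adjust the counters to taste):

# Eyeball the table-scan related counters across the 30 one-second samples:
grep -E 'Select_scan|Select_full_join|Handler_read_rnd_next' collected/*-mysqladmin | less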



You would be right if you said there are much better tools for finding problem queries -- but remember two things: 1) sometimes clients ask for the lightweight, short-term solution that can be set up in about 5 minutes and checked the next day; and 2) when it is unclear that queries are the problem, setting up only a query monitor is stabbing in the dark and will not get results.



In addition to watching vmstat to measure system CPU usage, mk-loadavg can watch many other things, such as the MySQL server's SHOW PROCESSLIST, parsing values from SHOW INNODB STATUS, and so on.



Related posts:

  1. A metric for MySQL load average
  2. Maatkit version 3119 released
  3. An alternative to the MySQL Query Analyzer





"

No Password – No Problem

No Password – No Problem: "In the past I have needed to gain access to MySQL servers when the password had been strangely forgotten. Below are three different processes for gaining access to MySQL if you just don’t have the right kind of access you need or want.



Option 1: --skip-grant-tables



This will allow you to log in to the server and change what you need to in the mysql.user table. Of course, you will need to restart the server again without --skip-grant-tables if you want the current and newly added or modified user accounts to be enforced.



This option is a good way to gain access to MySQL in the event that you have no elevated (root) privileges on the MySQL server. You do need root on the host itself so you can kill the pid and restart with --skip-grant-tables.
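A hedged sketch of option 1 on a typical Linux box; the init-script name and the decision to also skip networking are assumptions, so adapt them to your distribution:

/etc/init.d/mysql stop
mysqld_safe --skip-grant-tables --skip-networking &   # keep remote users out while grants are off
mysql -e "UPDATE mysql.user SET Password = PASSWORD('new-root-password') WHERE User = 'root'; FLUSH PRIVILEGES;"
mysqladmin -u root -p shutdown                        # grants are live again after FLUSH PRIVILEGES
/etc/init.d/mysql start                               # normal restart, privileges enforced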



Option 2: elevating your privileges



You will need to have an account on the server with WRITE access to the mysql database.



mysql> show grants for 'chris'@'localhost';

+----------------------------------------------------------+

| Grants for chris@localhost |

+----------------------------------------------------------+

| GRANT USAGE ON *.* TO 'chris'@'localhost' |

| GRANT SELECT, INSERT ON `mysql`.* TO 'chris'@'localhost' |

+----------------------------------------------------------+

2 rows in set (0.00 sec)



mysql> select * from user where user = 'chris'\G

*************************** 1. row ***************************

Host: localhost

User: chris

Password:

Select_priv: N

Insert_priv: N

Update_priv: N

Delete_priv: N

Create_priv: N

Drop_priv: N

Reload_priv: N

Shutdown_priv: N

Process_priv: N

File_priv: N

Grant_priv: N

References_priv: N

Index_priv: N

Alter_priv: N

Show_db_priv: N

Super_priv: N

Create_tmp_table_priv: N

Lock_tables_priv: N

Execute_priv: N

Repl_slave_priv: N

Repl_client_priv: N

Create_view_priv: N

Show_view_priv: N

Create_routine_priv: N

Alter_routine_priv: N

Create_user_priv: N

Event_priv: N

Trigger_priv: N

ssl_type:

ssl_cipher:

x509_issuer:

x509_subject:

max_questions: 0

max_updates: 0

max_connections: 0

max_user_connections: 0

1 row in set (0.00 sec)



mysql> insert into mysql.user VALUES ('%','sneekyuser','','Y','Y','Y','Y','Y','Y','Y','Y','Y','Y','Y','Y','Y','Y','Y','Y','Y','Y','Y','Y','Y','Y','Y','Y','Y','Y','Y','Y','','','','',0,0,0,0);

Query OK, 1 row affected (0.00 sec)



mysql> flush privileges;

ERROR 1227 (42000): Access denied; you need the RELOAD privilege for this operation



Note the ERROR above.



mysql> select * from user where user = 'sneekyuser'\G

*************************** 1. row ***************************

Host: %

User: sneekyuser

Password:

Select_priv: Y

Insert_priv: Y

Update_priv: Y

Delete_priv: Y

Create_priv: Y

Drop_priv: Y

Reload_priv: Y

Shutdown_priv: Y

Process_priv: Y

File_priv: Y

Grant_priv: Y

References_priv: Y

Index_priv: Y

Alter_priv: Y

Show_db_priv: Y

Super_priv: Y

Create_tmp_table_priv: Y

Lock_tables_priv: Y

Execute_priv: Y

Repl_slave_priv: Y

Repl_client_priv: Y

Create_view_priv: Y

Show_view_priv: Y

Create_routine_priv: Y

Alter_routine_priv: Y

Create_user_priv: Y

Event_priv: Y

Trigger_priv: Y

ssl_type:

ssl_cipher:

x509_issuer:

x509_subject:

max_questions: 0

max_updates: 0

max_connections: 0

max_user_connections: 0

1 row in set (0.00 sec)



The problem with this method is that you need a MySQL restart, a full server restart or a user with RELOAD to run “FLUSH PRIVILEGES”. Although these actions can occur on production systems you might have to wait a while if you’re not the one in control of these actions. For this example I restarted the MySQL server and obtained the grants I wanted.



shell> mysql -usneekyuser -S /tmp/mysql.sock



mysql> show grants;

+-------------------------------------------------------------------+

| Grants for sneekyuser@% |

+-------------------------------------------------------------------+

| GRANT ALL PRIVILEGES ON *.* TO 'sneekyuser'@'%' WITH GRANT OPTION |

+-------------------------------------------------------------------+

1 row in set (0.00 sec)





Option 3: MyISAM is GREAT



All of the system tables that MySQL uses are stored in the MyISAM storage engine. Knowing that, and that the privilege tables are loaded into memory at server startup, you can simply replace the user.frm, user.MYD and user.MYI files at the OS level and restart.
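A hedged sketch of option 3; the datadir path and the source of the known-good user table files are assumptions, and you should keep a copy of the originals:

/etc/init.d/mysql stop
mkdir -p /root/user-table-backup && cp /var/lib/mysql/mysql/user.* /root/user-table-backup/
cp /path/with/known/credentials/user.{frm,MYD,MYI} /var/lib/mysql/mysql/
chown mysql:mysql /var/lib/mysql/mysql/user.*
/etc/init.d/mysql start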



Please keep in mind that these are “operational tasks” that could prove useful to your environment given the right situation. The processes listed above are NOT a gateway to hacking the MySQL server, given that you need some sort of preexisting access to the server or MySQL.
"

Setting up an LVM filesystem

Setting up an LVM filesystem: "



Setting up an LVM filesystem is quite easy assuming you have the right tools installed and a recent kernel. LVM has a lot of advantages, most notably the ability to take snapshots of the current filesystem – this is why LVM is often used in live database environments.


Assuming a Debian Lenny machine, get the relevant packages. Some may already be installed: apt-get install lvm2 dmsetup mdadm


In this example, we will assume that /dev/sda is your boot drive, and that you want to leave it out of your LVM array, but include /dev/sdb and /dev/sdc. Both /dev/sdb and /dev/sdc should be of equal size.


Firstly, using fdisk, remove any existing partitions with ‘d’ on /dev/sdb and /dev/sdc, and create one new partition to span each drive. Change the partition type to ‘8e’, which is the LVM type.


Now prepare your physical disk for LVM with the ‘pvcreate’ tool:


pvcreate /dev/sdb1 /dev/sdc1


Note that you can reverse this with pvremove. You can also use pvdisplay now to display information on all physical volumes.


Oh – you do realise that you can use /dev/mdX just as easily to create LVM on your RAID devices?


Now, we need to create a ‘volume group’: vgcreate myvg /dev/sdb1 /dev/sdc1




We can also use vgremove, just as with pvremove, to remove the volume group. At this point, we can also use vgscan to scan for volume groups, and vgrename to rename a volume group. vgdisplay will display information about known volume groups, again, just the same as pvdisplay.


Now we have a volume group on top of our physical volume, we need to create a logical volume on top of the volume group.


lvcreate --name mylv1 --size 10G myvg will create a 10GB logical volume called ‘mylv1’ inside the ‘myvg’ volume group. Again, same as with the above, we can also use lvremove, lvscan, lvrename and lvdisplay.


To extend the size of the logical volume, we can use lvextend i.e. lvextend -L20G /dev/myvg/mylv1 and then back again with lvreduce i.e. lvreduce -L10G /dev/myvg/mylv1


*** Do take appropriate backups before editing and/or resizing your partitions. ***


Now, we can create a filesystem if we wanted to. We can use mkfs.ext3 /dev/myvg/mylv1 to do this, in the same way you would create any filesystem.


Now this is done, you can resize the partition as you would any other partition. If you wish to make it larger, unmount the partition, use lvextend to increase the ‘physical’ size, then use resize2fs to increase the ext3 partition size. If you wish to reduce it, unmount it, use resize2fs to decrease the ext3 partition size and then lvreduce to reduce the physical space. Be very careful with your maths or you’ll suffer disastrous results.
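A minimal sketch of the grow path just described (the mount point is an assumption):

umount /mnt/data
lvextend -L20G /dev/myvg/mylv1    # grow the logical volume first
e2fsck -f /dev/myvg/mylv1         # resize2fs normally insists on a clean filesystem check
resize2fs /dev/myvg/mylv1         # grow ext3 to fill the enlarged volume
mount /dev/myvg/mylv1 /mnt/data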


Now, assume that you’ve just added a hard disk (/dev/sdd) to your machine and wish to add it to our array. After preparing the partition table for LVM, we need to create a physical volume under LVM and then extend our volume group to this disk – pvcreate /dev/sdd1; vgextend myvg /dev/sdd1. To remove a disk, first move the data from it to another disk, then shrink the group: pvmove /dev/sde1 /dev/sdd1; vgreduce myvg /dev/sde1


Now the bit we’ve all been waiting for – how to take a snapshot.


First, you’ll need enough free space in the volume group for a snapshot volume; here we will make it the same size as the volume we wish to snapshot. We created a logical volume earlier: lvcreate --name mylv1 --size 10G myvg created a 10GB volume called ‘mylv1’.


Now, to create a snapshot, we’ll use: lvcreate -L10G -s -n mysnapshot /dev/myvg/mylv1


lvdisplay will confirm the snapshot has been created. Now, you can mount /dev/myvg/mysnapshot as you would have done /dev/myvg/mylv1. You can back it up with ‘tar’ or rsync once mounted, or even just dd it and leave it unmounted. It’s a regular ‘block device’ for you to use. Snapshots are created ‘pseudo-instantly’ for you. If you’re using virtual machines, you may notice a slight pause.
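A hedged sketch of a snapshot-based backup along those lines; the mount point and backup target are assumptions, and the snapshot should be removed afterwards so it stops consuming space in the volume group:

lvcreate -L10G -s -n mysnapshot /dev/myvg/mylv1
mkdir -p /mnt/snap && mount /dev/myvg/mysnapshot /mnt/snap
mkdir -p /backup && rsync -a /mnt/snap/ /backup/mylv1-$(date +%F)/   # or tar / dd, as mentioned above
umount /mnt/snap
lvremove -f /dev/myvg/mysnapshot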




"

Tuesday 20 October 2009

How to calculate a good InnoDB log file size – recap

How to calculate a good InnoDB log file size – recap: "

Following Baron Schwartz’ post: How to calculate a good InnoDB log file size, which shows how to make an estimate for the InnoDB log file size, and based on SQL: querying for status difference over time, I’ve written a query to run on MySQL 5.1, which, upon sampling 60 seconds of status, estimates the InnoDB transaction log bytes that are expected to be written in the period of 1 hour.


Recap: this information can be useful if you’re looking for a good innodb_log_file_size value: one that will not cause too much I/O (smaller values make for more frequent flushes), nor make for too long a recovery time (larger values mean more transactions to recover upon crash).


It is assumed that the 60 seconds period represents an average system load, not some activity spike period. Edit the sleep time and factors as you will to sample longer or shorter periods.



SELECT
  innodb_os_log_written_per_minute * 60
    AS estimated_innodb_os_log_written_per_hour,
  CONCAT(ROUND(innodb_os_log_written_per_minute * 60 / 1024 / 1024, 1), 'MB')
    AS estimated_innodb_os_log_written_per_hour_mb
FROM
  (SELECT SUM(value) AS innodb_os_log_written_per_minute FROM (
    SELECT -VARIABLE_VALUE AS value
      FROM INFORMATION_SCHEMA.GLOBAL_STATUS
      WHERE VARIABLE_NAME = 'innodb_os_log_written'
    UNION ALL
    SELECT SLEEP(60)
      FROM DUAL
    UNION ALL
    SELECT VARIABLE_VALUE
      FROM INFORMATION_SCHEMA.GLOBAL_STATUS
      WHERE VARIABLE_NAME = 'innodb_os_log_written'
  ) s1
) s2
;


Sample output:



+------------------------------------------+---------------------------------------------+
| estimated_innodb_os_log_written_per_hour | estimated_innodb_os_log_written_per_hour_mb |
+------------------------------------------+---------------------------------------------+
|                                584171520 | 557.1MB                                     |
+------------------------------------------+---------------------------------------------+
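A hedged back-of-the-envelope reading of that sample output: following the rule of thumb from Baron’s original post of sizing the combined log files to hold roughly an hour’s worth of writes, 557.1MB per hour split across the default two log files (innodb_log_files_in_group = 2) suggests an innodb_log_file_size somewhere around 256M to 300M for this particular server.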


"

MySAR, a Sidekick for Other Monitoring Tools

MySAR, a Sidekick for Other Monitoring Tools: "

I’m sure that most people have at least one of the tools listed in Ronald Bradford’s article: Monitoring MySQL Options. Many of these tools, such as Nagios and Cacti, also monitor the operating system. However, in the same way that a quick look at sar’s output can give you some insight on the OS, with MySAR you can do the same for the MySQL server. This is especially useful when it is not possible to access a monitoring tool’s graphic interfaces.


What Was Going On Around 2:30pm?


This is a question a customer asked us. To answer it we ran MySAR for a few days and queried the results for analysis. Looking at the data, we determined that the number of INSERT operations was significantly higher than any other, so we queried for the Com_insert status values. Com_insert is a counter that accumulates the number of INSERTs issued since the last server start (or since the last FLUSH STATUS command). For details on the variables available check Chapter 1. mysqld Options/Variables Reference.


The initial query we used was:


select value_stat_name name, value_stat_timestamp ti, value_stat_number val, value_stat_delta delta from value_stat
where value_stat_name like 'Com_insert'
and HOUR(value_stat_timestamp) between 13 and 15
and date(value_stat_timestamp) between '2009-08-22' and '2009-08-25'
order by 1, 2
+------------+---------------------+-----------+-------+
| name | ti | val | delta |
+------------+---------------------+-----------+-------+
| Com_insert | 2009-08-22 13:00:01 | 163777289 | 16051 |
| Com_insert | 2009-08-22 13:10:01 | 163794262 | 16973 |
| Com_insert | 2009-08-22 13:20:01 | 163810327 | 16065 |
| Com_insert | 2009-08-22 13:30:01 | 163826242 | 15915 |
....
| Com_insert | 2009-08-25 15:20:01 | 168362221 | 13835 |
| Com_insert | 2009-08-25 15:30:01 | 168376888 | 14667 |
| Com_insert | 2009-08-25 15:40:01 | 168392410 | 15522 |
| Com_insert | 2009-08-25 15:50:02 | 168408476 | 16066 |
+------------+---------------------+-----------+-------+
72 rows in set (1.56 sec)

72 rows of numbers are not very easy to interpret, so the next step was to create a graphic to visualize the results. Taking the ideas from Shlomi Noach’s blog, SQL pie chart, I decided to try and create the charts using the Google Charts API.


Easy Graphics How-To


The first step was to create the string of values needed for the chart. I achieved this using GROUP_CONCAT, adding some formulas to scale the values properly:


select day(value_stat_timestamp) as day,
GROUP_CONCAT((value_stat_delta-12000)/70) as series from value_stat
where value_stat_name like 'Com_insert' and HOUR(value_stat_timestamp) between 13 and 15
and date(value_stat_timestamp) between '2009-08-22' and '2009-08-25'
group by day(value_stat_timestamp);
+------+------------------------------------------------------------------------------------------
| day | series
+------+-----------------------------------------------------------------------------------
| 22 | 57.8714,80.3000,81.2571,94.3857,85.2571,66.0857,94.6429,74.2857,91.1000,...
| 23 | 74.4857,95.5857,103.4571,101.3000,75.9429,75.3571,79.3000,83.0571,94.7714,...
| 24 | 25.6286,38.8429,31.7000,36.3857,50.0143,44.3286,46.1571,39.4714,34.3857,21.5571,...
| 25 | 6.8000,12.9714,33.0429,13.8286,26.2143,38.1000,50.3143,19.4429,13.0857,3.2143,7.9143,...
+------+------------------------------------------------------------------------------------------
4 rows in set (0.34 sec)

(To make the output more readable, I’m not including the whole lines.) Using the values corresponding to day 22, the simplest line graphic can be created with the following HTML tag:


<img src="http://chart.apis.google.com/chart?cht=lc&chs=400x200&
chd=t:57.8714,80.3000,81.2571,94.3857,85.2571,66.0857,94.6429">

Resulting in:


Simple Chart


In the URL string, cht, chs, and chd represent chart type, chart size, and chart data respectively; you can check the details in the Google Chart API URL I quoted above.


You can copy and paste the <img> tag above into an HTML document (no need to add any text) and open it with any browser. For Firefox, the URL to open a file on disk would look similar to this:


file:///path/to/file/file_name.html

Go ahead, open any text editor, copy and paste the example above in any file, save it with extension .html and open it with your browser.


The chart with the complete data set and some decorations looks like this—each color represents a different day from the dataset above:


Complete Chart

Using the function CONCAT, it is possible to create the complete <img src=”…”> tag directly from the SQL statement by adding the proper sub-strings.
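A minimal sketch of that idea, assuming the MySAR data lives in a schema called mysar and using the same value_stat columns and scaling factors as the query above (group_concat_max_len is raised because long series would otherwise be truncated):

mysql mysar <<'SQL'
SET SESSION group_concat_max_len = 1024 * 1024;
SELECT CONCAT(
  '<img src="http://chart.apis.google.com/chart?cht=lc&chs=400x200&chd=t:',
  GROUP_CONCAT(ROUND((value_stat_delta - 12000) / 70, 4) ORDER BY value_stat_timestamp),
  '">'
) AS img_tag
FROM value_stat
WHERE value_stat_name = 'Com_insert'
  AND HOUR(value_stat_timestamp) BETWEEN 13 AND 15
  AND DATE(value_stat_timestamp) = '2009-08-22';
SQL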


Conclusion


Using MySAR, it is possible to collect the data, query it for diagnosis, and create graphics to visualize the results.


By the way, there was nothing significant happening in the database at 2:30pm.


"

Saturday 17 October 2009

Finding your MySQL High-Availability solution – The questions

Finding your MySQL High-Availability solution – The questions: "

After having reviewed the definitions in my previous post (The definitions), the next step is to answer some questions.


Do you need MySQL High-Availability?


That question is quite obvious, but sometimes it is skipped. It can also be formulated as “What is the downtime cost of the service?”. In the cost, you need to include lost revenue from the service, and you also need to consider less direct impacts like loss of corporate image and other marketing costs. If your downtime cost is under $10/h, you can stop reading this document; you don’t need HA. For the others, let’s move on!


How to determine which MySQL High-Availability solution is best?


What is really tricky with MySQL is the number of possible HA solutions. From the simplest to the most complex, let’s list the most common ones:


- MySQL replication with manual failover

- Master-Master with MMM manager

- Heartbeat/SAN

- Heartbeat/DRBD

- NDB Cluster


These technologies are by far not one-size-fits-all, and many deployments use a combination of solutions. I will not cover ScaleDB and Continuent because I know almost nothing about these solutions. There are many more questions you need to ask yourself before being able to pick the right one. Below, I listed the most common questions; I might have missed some.


1. What level of HA do you need?


Since the technologies do not all offer the same level of availability, this is an important first sorting factor. Here are estimates of the level of availability offered by the various solutions.


Level of availability:

- Simple replication: 98 to 99%
- Master-Master with MMM manager: 99%
- Heartbeat/SAN (depends on SAN): 99.5% to 99.9%
- Heartbeat/DRBD: 99.9%
- NDB Cluster: 99.999%

From the table, if your requirements are for 99.99%, you are restricted to NDB Cluster, while if it is only 99% you have more options. Recall that the level of availability is hard to estimate and subject to debate; these are the usually accepted levels of availability for these technologies.


2. Can you afford to lose data?


Obviously, if you are concerned about loss of data, you are most likely using the InnoDB storage engine, since MyISAM is not transactional and does not sync data to disk. Similarly, MySQL replication is an asynchronous process, and although it is fairly fast at transferring data between the master and the slaves, there is a window of time where data loss is possible.


If you can afford to lose some data, you can consider “MySQL replication” and “Master-Master with MMM manager”; otherwise, you can only consider the other three solutions.


Data 100% safe:

- MySQL replication: no
- Master-Master with MMM manager: no
- Heartbeat/SAN (depends on SAN): yes
- Heartbeat/DRBD: yes
- NDB Cluster: yes

3. Does your application use MyISAM-only features?


There are some features like Full text indexes and GIS indexes that are supported only by MyISAM. The HA solutions that work well with MyISAM are “MySQL replication” and “Master-Master with MMM manager”. Depending on the application, the MyISAM Full text indexes might be replaced by another search engine like Sphinx in order to remove the restriction. There is no HA solution other than the ones based on replication that handles GIS indexes.


HA solutions:

- Need MyISAM Full text or GIS indexes: “MySQL replication” and “Master-Master with MMM manager”
- Don’t use any special MyISAM feature: all
- Can change MyISAM Full text to Sphinx: all

4. What is the write load?


The HA solutions we present are not equal in terms of their write capacity. Due to the way replication is implemented, only one thread on the slave can handle the write operations. If the replication master is a multi-core server that writes heavily from multiple threads, the slaves will likely not be able to keep up. Replication is not the only technology that puts a strain on write capacity: DRBD, a shared-storage emulator for Linux, also reduces the write capacity of a database server by about 30% (very dependent on hardware). In terms of write capacity, here are your choices.


Write capacity:

- MySQL replication: Fair
- Master-Master with MMM manager: Fair
- Heartbeat/SAN (depends on SAN): Excellent
- Heartbeat/DRBD: Good
- NDB Cluster: Excellent

5. For what level of growth are you planning?


Since NDB Cluster is an integrated sharding environment, if you are planning for growth that will need sharding (splitting the database over multiple servers), then you might need to take a serious look at that solution. If not, then, apart from the write capacity, all the solutions are approximately equal.


6. How qualified is your staff or support company?


There is quite a direct relationship between the level of availability and the complexity of the solution. In order to reach the promised level of availability, the staff maintaining the HA setup, either internal or external, must have the required level of expertise. The required expertise level is summarized below.


Expertise level:

- MySQL replication: Typical, average MySQL DBA + some Sysadmin skills
- Master-Master with MMM manager: Good, average MySQL DBA + good Sysadmin skills
- Heartbeat/SAN (depends on SAN): High, good MySQL DBA + strong Sysadmin skills
- Heartbeat/DRBD: High, good MySQL DBA + strong Sysadmin skills
- NDB Cluster: Very high, specific NDB knowledge, strong MySQL skills and strong Sysadmin skills

7. How deep are your pockets?


The last aspect that needs to be considered is the budget; complexity is expensive. We will consider two types of setup. The first one is a basic proof of concept of the technology, with the hardware tested, the data imported, and basic testing and documentation. A proof-of-concept setup is a good way to get used to a technology and experiment with it in a test environment. The other type of setup we will consider is a full production setup that includes extensive testing, fire drills, full documentation, monitoring, alerting, backups, migration to production and post-migration monitoring. Of course, it is the safest way to migrate an HA solution to production. All the times here are estimates based on field experience; the values presented are fairly typical and contain some buffer for unexpected problems. Although an HA solution can be built remotely through a KVM over IP and adequate remote power management, an on-site intervention with physical access to the servers is the preferred way, especially for the most complex solutions.


Time required (proof of concept / migration to production):

- MySQL replication: 4 hours / 12 hours
- Master-Master with MMM manager: 8 hours / 24 hours
- Heartbeat/SAN (depends on SAN): 32 hours / 120 hours
- Heartbeat/DRBD: 40 hours / 120 hours
- NDB Cluster: 40 hours / 120 hours+



Entry posted by yves.
"

Friday 16 October 2009

Be careful with FLUSH TABLES WITH READ LOCK

Be careful with FLUSH TABLES WITH READ LOCK: "Be careful when using FLUSH TABLES WITH READ LOCK (aka FTWRL). I have written about potential problems that may occur when using FTWRL. Anyone who runs ibbackup or xtrabackup on a server that writes a binlog needs FTWRL to run as fast as possible with as few problems as possible, but that is not always the case. In its current form, you must monitor FTWRL and either kill it or long-running queries when FTWRL takes too long.

MySQL does three things when processing FTWRL. First it sets the global read lock. Then it closes open tables. Finally it sets a flag to block commits. You will have problems in production when FTWRL doesn't return quickly. It doesn't return quickly when there are long running queries as it waits for the current queries to finish. The problem is that insert, update, delete and replace statements are blocked after the first step. When FTWRL lingers in the second step (close open tables), then your server will stop accepting writes. An additional problem is that for deployments with many open tables, it is a lot of work to close and then re-open them. I need to confirm whether re-open is done serially because a mutex is held and whether InnoDB re-samples all indexes on all reopened tables to get optimizer statistics.
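A hedged sketch of the monitor-and-kill approach mentioned above, using INFORMATION_SCHEMA.PROCESSLIST (available in MySQL 5.1+); the 30-second threshold and the decision of what to kill are assumptions:

# Show a stuck FLUSH TABLES WITH READ LOCK and any long-running statements it may be waiting on:
mysql -e "SELECT id, time, state, LEFT(info, 80) AS query
          FROM information_schema.processlist
          WHERE command <> 'Sleep'
            AND (info LIKE 'FLUSH TABLES WITH READ LOCK%' OR time > 30);"
# Then decide which side to kill:
# mysql -e "KILL <thread_id>;"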

I blame MyISAM for the current problems. As I am not a MyISAM expert, this is an educated guess and I welcome your feedback. The problem with FTWRL is FT (flush tables) and MyISAM is the reason that tables must be flushed. The --delay-key-write option and possibly other features in MyISAM allow open tables to buffer committed changes. The buffered changes are written to MyISAM data files when the open table is closed.

INSERT DELAYED might also cause problems, but anybody who needs a hot backup shouldn't be using that option.

I think we can make this better and my solution is DFTBGRL (don't flush tables but get read lock). Maybe it needs a better name. DFTBGRL skips the second step of FTWRL -- it sets the global read lock and then it sets a flag to block commits. This should be much safer to use in production.

"

MySQL-Memcached or NOSQL Tokyo Tyrant – part 2

MySQL-Memcached or NOSQL Tokyo Tyrant – part 2: "

Part 1 of our series set up our 'test' application and looked at boosting performance of the application by buffering MySQL with memcached. Our test application is simple and requires only 3 basic operations per transaction: 2 reads and 1 write. Using memcached combined with MySQL we ended up getting nearly a 10X performance boost from the application. Now we are going to look at what we could achieve if we did not have to write to the database at all. So let's look at what happens if we push everything, including writes, into memcached.


Benchmarks if everything is in memcached


Wow, that's shockingly fast isn't it! I guess being completely in memory helps for this app. What is very interesting is that accessing 100% of the data in memcached gives very similar numbers to accessing 100% of the data in memory in the DB (part 1 benchmarked a 4GB buffer pool as being able to handle 7K TPS)... something is not 100% right here. It stands to reason that memcached should be faster for this application than the DB. It's just doing two gets via key and 1 set. So why the similar numbers?


Well, glad you asked. It's the API. The API in this case was Cache::Memcached; by switching to Cache::Memcached::Fast, look what happens:


Memcached API - Fast


That is a nice jump in performance!


Using Memcached::Fast was kind of a mixed bag when looking at the benchmarks for mixing MySQL and Memcached in my tests:


Sometimes Api changes can make a huge difference


In this case I think the Fast API was slower when working with MySQL with a 256M BP because the slower returns from memcached acted as a bottleneck, thinning the demands on MySQL to write data and smoothing out the workload. When we eliminate this bottleneck with the Fast API, MySQL gets hammered. This type of thing happens a lot. For example, an application is CPU bound, so you add more processing power, but then you hit the disks harder and now you're disk bound.


A couple of good things to remember here: #1, resolving one bottleneck can open another bottleneck that is much worse. #2, understand that not all APIs are created equal. Additionally, the configuration and setup that works well on one system may not work well on another. Because of this, people often leave lots of performance on the table. Don't just trust that your current API or config is optimal; test and make sure it fits your application.


So adding memcached on top of MySQL for our test application can significantly boost performance. But you notice that if we were running 100% in memcached and could cut out MySQL, we could get 2.5x more performance over a mixed solution and 100X over just stock MySQL. As the number of writes against the database increases, this gap will increase. So let's ditch the database! But wait! You need the DB for persistence, right?


It depends. A database may not be the best fit for every application. There are several “NOSQL” solutions out in the open source space that can give you some of the ease of memcached but with the persistence most people use their database for. Each application is different, and understanding the application's requirements is key to picking an appropriate solution. I am going to look at several database alternatives over the next few months. I need to start somewhere, so I decided to start with Tokyo Tyrant and Cabinet. So stop in next time for part 3 of this series, where we will focus on running the same tests against Tokyo Tyrant.




Entry posted by matt.
"

MySQL-Memcached or NOSQL Tokyo Tyrant – part 1

MySQL-Memcached or NOSQL Tokyo Tyrant – part 1: "

All too often, people force themselves into using a database like MySQL with no thought as to whether it's the best solution to their problem. Why? Because their other applications use it, so why not the new application? Over the past couple of months I have been doing a ton of work for clients who use their database like most people use memcached. Look up a row based on a key, update the data in the row, stuff the row back in the database. Rinse and repeat. Sure these setups vary sometimes, throwing in a “lookup” via username, or even the rare count. But for the most part they are designed to be simple.


A classic example is a simple online game. An online game may only require that an application retrieve a single record from the database. The record may contain all the vital stats for the game, be updated and stuffed back into the database. You would be surprised how many people use this type of system, as I run into this type of application frequently. Keeping it simple ensures that the application is generally lean and mean and performs well. The issue is that even this simple design can start to have problems as the data size increases and you blow through your available memory. Is there a better architecture? Is there a way to get more scalability out of your database? Is the database even the best place for this data?


I decided to walk through setting up a very simple application that does what I have seen many clients do. Using this application I can then compare using MySQL to using MySQL + Memcached, and then to other solutions like Tokyo Tyrant or Cassandra. My Application does the following:


A.) read a row from a database based on an integer based primary key

B.) Update data from that row and replace the stored contents on disk

C.) Use the data from that row to look up a row in another table based on a text field (called email address).


Seems simple enough, right? My two tables each contain 5M rows of data. Let's see what happens:


DB Fits into Memory


Chart of TPS for benchmark application


You can see a dramatic drop-off in performance as my data falls out of memory; that's not cool, is it? After all, database sizes tend to grow and very rarely shrink, which leads to a challenge faced by almost everyone: how do you maintain performance in the face of increasing data size?


Here is where people start to scratch their heads. They naturally assume they need to scale more: we need more memory! If performance sucks, we must need more. So here come the bigger boxes, the read-only slaves, the complex sharding systems, the discussions on cluster, more memcached. We need to cover up the database's inefficiencies to ensure that our application scales.


The problem is for some applications, we are fixing symptoms, not the problem itself. No matter how much you want it to fit, some things may not work (like the Godfather 3). The issue is people assume that data storage has to be in the database. “It’s data, it needs to go into the database.” is often the battle cry. But hold on to your hat, I am going to shock you. For some applications, putting your data in the database is silly. Yes the guy who blogs on bigdbahead.com and is writing this on the mysqlperformanceblog is saying you may not want to use a database. Heresy I know! But for many of us we already accept storing data ( at least temporarily ) outside the DB. Think memcached.


Almost everyone loves memcached; it’s simple, fast, and just works. When your dataset exceeds your memory limitations or the database simply cannot keep up any more, this solution can really boost performance. I know you’re thinking my simple key lookup should really benefit from memcached. So let’s try it! I took the simple app I created that reads two rows and updates one of them, and changed it to read from memcached if available, remove the cached copy on update, and read from the DB only when required. I tested with a memcached size of 1GB, 2GB, and 4GB. For these tests I left InnoDB with a 256M buffer pool, or roughly 9% of the total data in memory.


Let’s look at the 1GB setting:




Ensure you have enough memory for memcached



What, a performance regression? But we threw more memory at it!! How can that be!


Memcached is not a cure-all. I have talked to many clients who say “we will just throw memcached at it”. Sometimes an app will scream, other times it won’t… and yet others require lots and lots of memory allocated to memcached to be successful. This application selects a random # between 1 and 2 million and looks up the result via that key. It then uses data from that random row to look up a second piece of information via email address. Because the entire dataset is about 4GB and only 1GB is in memcached, I keep pushing data out of memcached to make room for new records I am reading from the database. Remember memcached needs repeatability to be helpful. I am still getting a really solid # of hits in memcached, but the # of writes in MySQL coupled with the still large # of reads takes its toll. Another place where I have seen this kill clients is in apps that do some sequential scanning and do not have enough memory for memcached. For instance, if you have 1,000,000 rows of data, but enough memory to only store 500,000 rows… sequentially accessing this data will destroy the use of cache:


get record 1, miss, read from disk, cache record 1

….

get record 500,001, miss, read from disk, expunge record 1, cache record 500,001

….

get record 1, miss, read from disk, expunge record 500,001, cache record 1


you keep overwriting the cache before you can use it. So here the complexity of adding memcached hurts us, because the cache is not actually buying us anything.


Now bumping this up to 2GB actually makes the TPS jump around a lot, sometimes hitting 400 or 500 TPS and other times hitting as high as 1800 TPS. My guess is the movement is caused by the random #’s being generated and simply the luck of the draw.


Finally let’s look when we have 4GB of memory allocated to memcached (full dataset fits ):


Transactions with and without Memcached


Here you can see that our “transactions” per second for this app increased almost 10X by using memcached. The TPS I get here varies from 1100 TPS to just under 2000 TPS, with the average around 1400 TPS. I think we would all be very happy if we could get a 10X performance boost from our application.


But wouldn’t it be great if we could get more? I mean our reads are going pretty fast, but our writes leave a lot to be desired:


Read -vs- write times with memcached + mysql mixed

Over 17 ms to do an update. Wouldn’t it be great to just eliminate all the updates as well? What sort of throughput would we get? I will show you in part 2. Part 2 of this post will talk about performance in a 100% pure memcached environment. Part 3 will focus on these same benchmarks in Tokyo Tyrant.



All to often people force themselves into using a database like MySQL with no thought into whether if its the best solution to there problem. Why? Because their other applications use it, so why not the new application? Over the past couple of months I have been doing a ton of work for clients who use their database like most people use memcached . Lookup a row based on a key, update the data in the row, stuff the row back in the database. Rinse and repeat. Sure these setups vary sometimes, throwing in a “lookup” via username, or even the rare count. But for the most part they are designed to be simple.



A classic example is a simple online game. An online game may only require that an application retrieve a single record from the database. The record may contain all the vital stats for the game, be updated and stuffed back into the database. You would be surprised how many people use this type of system as I run into this type of application frequently. Keeping it simple, ensures that application is generally mean and lean and performs well. The issue is even this simple design can start to have issues as the data size increases and you blow through your available memory. Is there a better architecture? Is there a way to get more scalability out of your database? Is the database even the best place for this data?



I decided to walk through setting up a very simple application that does what I have seen many clients do. Using this application I can then compare using MySQL to using MySQL + Memcached, and then to other solutions like Tokyo Tyrant or Cassandra. My Application does the following:



A.) read a row from a database based on an integer based primary key


B.) Update data from that row and replace the stored contents on disk


C.) Use the data from that row to lookup up a row in another table based on a text field ( called email address ).



Seems simple enough right? My two tables each contain 5M rows of data. let’s see what happens:










You can see a dramatic drop off in performance as my data falls out of memory, that’s not cool is it? After all database sizes tend to always grow and very rarely shrink. Which leads to a challenge faced by almost everyone how do you maintain your performance in the face of increasing data size?



Here is where people start to scratch their heads. They naturally assume they need to scale more, we need more memory! If performance sucks, we must need more. So here comes the bigger boxes, the read-only slaves, the complex sharding systems, the discussions on cluster, more memcached. We need to cover up the databases inefficiencies to ensure that our application scales.



The problem is for some applications, we are fixing symptoms, not the problem itself. No matter how much you want it to fit, somethings may not work (like the Godfather 3). The issue is people assume that data storage has to be in the database. “It’s data, it needs to go into the database.” is often the battle cry. But hold on to your hat, I am going to shock you. For some applications, putting your data in the database is silly. Yes the guy who blogs on bigdbahead.com and is writing this on the mysqlperformanceblog is saying you may not want to use a database. Heresy I know! But for many of us we already accept storing data ( at least temporarily ) outside the DB. Think memcached.



Almost everyone loves memcached, it’s simple, fast, and just works. When your dataset exceeds your memory limitations or the database can simply not keep up any more this solution can really boost performance. I know you’re thinking my simple key lookup should really benefit from memcached. So let’s try it! I took the simple app I created that reads two rows, and update one of them to read from memcached if available, remove on update, and read from the db only when required. I tested with a memcached size of 1GB, 2GB, and 4GB. For these tests I left Innodb with a 256M buffer pool, or roughly with 9% of the total data in memory.



Let’s look at the 1GB setting:





What, a performance regression? But we threw more memory at it!! How can that be?



Memcached is not a cure-all. I have talked to many clients who say “we will just throw memcached at it”. Sometimes an app will scream; other times it won’t… and yet others require lots and lots of memory allocated to memcached to be successful. This application selects a random number between 1 and 2 million and looks up the result via that key. It then uses data from that random row to look up a second piece of information via email address. Because the entire dataset is about 4GB and only 1GB is in memcached, I keep pushing data out of memcached to make room for new records I am reading from the database. Remember, memcached needs repeatability to be helpful. I am still getting a really solid number of hits in memcached, but the number of writes in MySQL, coupled with the still large number of reads, takes its toll. Another place where I have seen this kill clients is in apps that do some sequential scanning and do not have enough memory for memcached. For instance, if you have 1,000,000 rows of data but only enough memory to store 500,000 rows, sequentially accessing this data will destroy the usefulness of the cache:





  • get record 1, miss, read from disk, cache record 1




  • ….




  • get record 500,001, miss, read from disk, expunge record 1, cache record 500,001




  • ….




  • get record 1, miss, read from disk, expunge record 500,001, cache record 1



You keep overwriting the cache before you can use it. So here the complexity of adding memcached hurts us, because the cache is not actually buying us anything. (The short simulation below makes the same point with scaled-down numbers.)
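A quick way to convince yourself of this is a toy model: simulate an LRU cache that holds half of the keys and scan the keys in order. This is plain Python with made-up, scaled-down numbers (no memcached involved), and the hit rate comes out at exactly zero:

from collections import OrderedDict

CACHE_SIZE, ROWS, PASSES = 500, 1000, 4       # scaled-down stand-ins for 500K/1M
cache, hits, misses = OrderedDict(), 0, 0

for _ in range(PASSES):
    for key in range(ROWS):                   # sequential access pattern
        if key in cache:
            hits += 1
            cache[key] = cache.pop(key)       # refresh this key's LRU position
        else:
            misses += 1
            cache[key] = True
            if len(cache) > CACHE_SIZE:
                cache.popitem(last=False)     # evict the least recently used key

print("hits=%d misses=%d" % (hits, misses))   # prints: hits=0 misses=4000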





Now, bumping this up to 2GB actually makes the TPS jump around a lot, sometimes hitting 400 or 500 TPS and other times hitting as high as 1800 TPS. My guess is the movement is caused by the random numbers being generated and simply the luck of the draw.



Finally, let’s look at what happens when we have 4GB of memory allocated to memcached (the full dataset fits):







Here you can see that our “transactions” per second for this app increased almost 10X by using memcached. The TPS I get here varies from 1100 TPS to just under 2000 TPS, with the average around 1400 TPS. I think we would all be very happy if we could get a 10X performance boost for our application.



But wouldn’t it be great if we could get more? I mean our reads are going pretty fast, but our writes leave a lot to be desired:






Over 17 ms to do an update. Wouldn’t it be great to just eliminate all the updates as well? What sort of throughput would we get then?





Entry posted by matt


PlanetMySQL Voting:
Vote UP /
Vote DOWN"

Wednesday 14 October 2009

Using mext to format saved mysqladmin output nicely

Using mext to format saved mysqladmin output nicely: "

I wrote a while ago about how mext works: it runs “mysqladmin extended-status” and formats it nicely. But what if you want to use it to format saved output that you’ve put into a file? It’s actually very easy: you can tell it what command line to run to generate its input. By default you’ll probably tell it to run “mysqladmin ext -ri10” or something like that, but you can just as easily make it run “cat my-saved-output”.



Let’s see how this can be useful. Imagine I have a server that stalls every now and then, and I’ve set up mk-loadavg to watch for this and capture information about system activity with a script that contains



$ mysqladmin ext -c 30 -i1 > mysqladmin-output.txt


That’ll gather 30 samples one second apart. Now I’ll format it:




$ wget -q http://www.maatkit.org/mext
$ sh mext -r -- cat mysqladmin-output.txt | less -S


I’m piping the output into less -S so that I can see unwrapped output. Thirty samples of MySQL status variables are going to be aligned in columns next to each other, so without the -S flag I’ll probably see something unhelpful.



If you have a hard time visualizing the above, go ahead and run the commands! It’ll take only a minute, and it’ll make a lot more sense to you then. This is a really useful way to summarize and understand what is going on (or has gone on) inside your MySQL server.



Related posts:

  1. Formatting mysqladmin extended-status nicely I always s
  2. A tweak to column alignment for the mext script I tweaked
  3. Strangest datetime format ever I’ve


Related posts brought to you by Yet Another Related Posts Plugin.


PlanetMySQL Voting:
Vote UP /
Vote DOWN"

Tuesday 13 October 2009

Comments are back!

Comments are back!: "

Thanks to a suggestion from Mohammad Lahlouh, all deleted comments are restored. Thanks!



I was able to do this with the MySQL binlog, which is a binary log containing the data-modifying statements executed on the server. If you face a similar problem, here are the steps to restore the DB:




  1. See if you're actually storing binlogs. For me this happened to be the case, although I had never actually paid attention (my friend Jens Meijer is kind enough to provide hosting for me). In my case they were stored in /var/log/mysql.

  2. Grab your latest backup. In my case it was from October 2008.

  3. Make sure you also create a new backup, just in case this process goes wrong.

  4. Restore your old backup.

  5. Find the binlog that corresponds to the date of your backup. I did a simple copy of the ~300 binlogs to a different directory.

  6. Run the binlog through 'mysqlbinlog' and pipe it into a new file. I had to append the --database argument, because I wasn't interested in all the other databases. The full command for me was mysqlbinlog mysql-bin.* --database evert2 > commentsbinlog.sql

  7. Open the new file, and make sure the first queries are correct. Likely there are some updates/inserts/deletes at the top of the file that were already part of your last backup.

  8. Go all the way to the bottom of the file, and remove the offending queries. In my case I had to remove the query that deleted all my comments.

  9. Apply the SQL file to your database: mysql -u root -p evert2 < commentsbinlog.sql



It's wise to make sure nobody can access the DB while you are doing this. I hope this helps anyone else with this problem.


PlanetMySQL Voting:
Vote UP /
Vote DOWN"

Monday 12 October 2009

What are the timings shown in MySQL Query Browser?

What are the timings shown in MySQL Query Browser?: "

This past weekend we got a customer request asking what the execution numbers shown in the status bar of MySQL Query Browser really mean. Digging in the source and checking with the developers, we got it explained and now, hopefully, documented for future generations.

Let's execute a SELECT statement using MySQL Query Browser (on Windows):


SELECT * FROM django_admin_log d;
2664 rows fetched in 0,0659s (0,0016s)

N rows fetched in F sec (Q sec), where

  • N = number of rows fetched
  • F = number of seconds it took for the GUI tool to fetch and process the query result (fetch_time)
  • Q = number of seconds it took the MySQL server to execute the query (query_time)

F will usually be bigger than Q. However, if the query is complex and the result set is, for example, only one row, then Q might be bigger than F. In the example above, the server executed the query in 0,0016s, while fetching and processing the 2664 rows in the GUI took 0,0659s.


PlanetMySQL Voting:
Vote UP /
Vote DOWN

"

Sunday 11 October 2009

Recursive FTP/HTTP Download

Recursive FTP/HTTP Download: "



wget is a great tool. Here’s a short example of how to recursively get a directory from an FTP or HTTP server, also specifying credentials if required.


In short, wget -r http://www.domain.com/directory/ – this works the same with FTP: wget -r ftp://www.domain.com/directory/


If you need to specify a username and password to the FTP server: wget -r ftp://username:password@www.domain.com/directory/


To resume an existing session, specify -c and each file downloaded will attempt to resume from where it left off, if necessary.




"
