Saturday, 28 November 2009

Restore a Single Table From mysqldump

Restore a Single Table From mysqldump: "Ok, this is inspired by another post on the same subject. I would have just commented, but there was a required signup. Ew.


While the method there with juggling grants works fine, I'd like to highlight another way that I think is much simpler: use grep & sed to create a file which contains only the table you want to restore.


mysql> show tables;
+------------------+
| Tables_in_gtowey |
+------------------+
| t1               |
| t2               |
| t3               |
+------------------+
3 rows in set (0.00 sec)


# mysqldump gtowey > dump.sql

Given 3 tables in a mysqldump file, let's restore just t2.


# grep -n 'Table structure' dump.sql
19:-- Table structure for table `t1`
40:-- Table structure for table `t2`
61:-- Table structure for table `t3`


Now just use the line numbers to extract the table with sed, grabbing everything from t2's header down to t3's (the extra line for t3 is only a comment, so it is harmless):

sed -n '40,61 p' dump.sql > t2.sql

That's it: you have everything you need, with no cleanup of MySQL grants to do afterward.
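
If you'd rather not read the line numbers by hand, the same idea works with a pattern range in sed. This is just a sketch, and it assumes (as above) that t2 is immediately followed by t3's header, which serves as the end marker:

sed -n '/-- Table structure for table `t2`/,/-- Table structure for table `t3`/ p' dump.sql > t2.sql
mysql gtowey < t2.sql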

"

Monday, 23 November 2009

Fast restore one database from a dump of ALL databases

Fast restore one database from a dump of ALL databases: "There have been times when I needed to restore one database from a dump file that contains all of the databases from the server. There are a few ways to accomplish this goal. One way would be to grep through the entire file for the table schemas and insert statements. There are some problems with this method in some environments: for example, what if a table in the targeted database has the same name as a table in another database on the same instance? The grep logic quickly gets more complex.



It’s a good thing that the developers at MySQL already thought of this and implemented it: --one-database. In the example below I have created four databases, test and test[1-3], and filled each of them with 10K rows. I run mysqldump for all databases into a file called test.all.sql, then truncate test2.tbl1. Using mysql with the --one-database option, I restore the data to its original state.



Example:



mysql> select min(anum), max(anum) from test.tbl1;
+-----------+-----------+
| min(anum) | max(anum) |
+-----------+-----------+
|         0 |     10000 |
+-----------+-----------+
1 row in set (0.01 sec)

mysql> select min(anum), max(anum) from test1.tbl1;
+-----------+-----------+
| min(anum) | max(anum) |
+-----------+-----------+
|     10001 |     20000 |
+-----------+-----------+
1 row in set (0.01 sec)

mysql> select min(anum), max(anum) from test2.tbl1;
+-----------+-----------+
| min(anum) | max(anum) |
+-----------+-----------+
|     20001 |     30000 |
+-----------+-----------+
1 row in set (0.01 sec)

mysql> select min(anum), max(anum) from test3.tbl1;
+-----------+-----------+
| min(anum) | max(anum) |
+-----------+-----------+
|     30001 |     40000 |
+-----------+-----------+
1 row in set (0.01 sec)

shell> mysqldump -uroot -S mysql.sock --opt --all-databases > test.all.sql

mysql> use test2;

mysql> select count(*) from tbl1;
+----------+
| count(*) |
+----------+
|    10000 |
+----------+
1 row in set (0.01 sec)

mysql> truncate table tbl1;
Query OK, 0 rows affected (0.05 sec)

mysql> select count(*) from tbl1;
+----------+
| count(*) |
+----------+
|        0 |
+----------+
1 row in set (0.00 sec)

shell> mysql -uroot -S mysql.sock --one-database test2 < test.all.sql

mysql> select min(anum), max(anum) from test.tbl1;
+-----------+-----------+
| min(anum) | max(anum) |
+-----------+-----------+
|         0 |     10000 |
+-----------+-----------+
1 row in set (0.01 sec)

mysql> select min(anum), max(anum) from test1.tbl1;
+-----------+-----------+
| min(anum) | max(anum) |
+-----------+-----------+
|     10001 |     20000 |
+-----------+-----------+
1 row in set (0.01 sec)

mysql> select min(anum), max(anum) from test2.tbl1;
+-----------+-----------+
| min(anum) | max(anum) |
+-----------+-----------+
|     20001 |     30000 |
+-----------+-----------+
1 row in set (0.01 sec)

mysql> select min(anum), max(anum) from test3.tbl1;
+-----------+-----------+
| min(anum) | max(anum) |
+-----------+-----------+
|     30001 |     40000 |
+-----------+-----------+
1 row in set (0.01 sec)



As you can see, test2.tbl1 has its original data back, and we did not have to write our own script to accomplish this.
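
Condensed, the whole round trip from the example above is just the following (the socket path and credentials are whatever your server uses):

shell> mysqldump -uroot -S mysql.sock --opt --all-databases > test.all.sql
shell> mysql -uroot -S mysql.sock -e "truncate table test2.tbl1"
shell> mysql -uroot -S mysql.sock --one-database test2 < test.all.sql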
"

Tuesday, 17 November 2009

mk-query-digest now understands HTTP

mk-query-digest now understands HTTP: "

You used to use mk-query-digest to aggregate and report on MySQL’s slow query log. Then it got the ability to grab query events from polling SHOW PROCESSLIST. Next we thought, really, how hard can it be to implement the libmysql wire protocol, so we can sniff TCP packets? … it’s hard, but not that hard as it turns out. But why stop there, why not implement memcached protocol too? I think you can see where this is headed.



So now mk-query-digest is a tool that can understand and “do stuff with” a variety of query/response types of information. The latest is HTTP. HTTP traffic is just a query-response flow of events, perfectly suitable for response-time analysis. Example:



baron@kanga:~$ mk-query-digest sample-http.txt --type http
# 1.6s user time, 100ms system time, 14.20M rss, 17.07M vsz
# Overall: 56 total, 30 unique, 1.27 QPS, 0.06x concurrency ______________
# total min max avg 95% stddev median
# Exec time 3s 746us 563ms 49ms 293ms 110ms 3ms
# Time range 2009-11-14 07:04:50.997506 to 2009-11-14 07:05:34.969818
# bytes 0 0 0 0 0 0 0
# Transmit 10s 0 1s 176ms 777ms 301ms 28us

# Query 1: 0.28 QPS, 0.02x concurrency, ID 0x198704D47EE1BF0C at byte 103987
# pct total min max avg 95% stddev median
# Count 21 12
# Exec time 24 670ms 3ms 563ms 56ms 13ms 150ms 12ms
# Hosts 12 12.38.8.23... (1), 121.44.246... (1)... 10 more
# Time range 2009-11-14 07:04:51.763526 to 2009-11-14 07:05:34.969818
# bytes 0 0 0 0 0 0 0 0
# Status co 1 200
# Transmit 62 6s 0 1s 514ms 900ms 376ms 580ms
# Virtual h 1 www.mysqlperformanceblog.com
# Query_time distribution
# 1us
# 10us
# 100us
# 1ms ####################################
# 10ms ################################################################
# 100ms #########
# 1s
# 10s+
get www.mysqlperformanceblog.com/feed/\G

... snip a bunch of output ...

# Rank Query ID Response time Calls R/Call Item
# ==== ================== ================ ===== ======== ================
# 1 0x198704D47EE1BF0C 0.6700 25.5% 12 0.0558 GET www.mysqlperformanceblog.com/feed/
# 2 0x2CBDA396697F8145 0.3408 13.0% 1 0.3408 GET www.mysqlperformanceblog.com/
# 3 0x2E18EB8C0CD9AED9 0.3100 11.8% 1 0.3100 GET www.mysqlperformanceblog.com/page/2/
# 4 0x4686B61E6556B753 0.3042 11.6% 1 0.3042 GET www.mysqlperformanceblog.com/
# 5 0x247CABBCB1B76C01 0.2705 10.3% 1 0.2705 GET www.mysqlperformanceblog.com/page/28/
# 6 0x8C239A43A9C80FD2 0.2373 9.0% 1 0.2373 GET www.mysqlperformanceblog.com/
# 7 0x4D4095C546E65CD4 0.1959 7.5% 1 0.1959 GET www.mysqlperformanceblog.com/2008/11/26/
# 8 0x49CC22FAC68CD475 0.1906 7.3% 1 0.1906 GET /favicon.ico


This is suitable for lots of things. We’re trying to look at the most useful protocols, because the variety of inputs is really unlimited; we could implement almost anything that fits into the notion of query and response. For example, the memcached protocol is becoming something of a lingua franca for a lot of different systems, so there’s a big value-add. HTTP has been used a long time as a transport layer for REST, SOAP, and so on (CouchDB anyone?). Valid, and interesting, suggestions are Sphinx, PostgreSQL, and Gearman. (Please offer to sponsor any that you want to see.)



Back to HTTP: implementing it gives an easy way to measure website response time, including useful things like 95th percentile goodness. And from there, you can drill down into the performance of the work done for these requests. If you want to get really fancy, you can even capture some samples of netstat at the same time as you tcpdump traffic for HTTP, memcached, and MySQL — so you can blame database queries and memcached requests on specific HTTP requests!
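
If you want to try this against your own site, the capture step presumably mirrors the tcpdump recipe used for MySQL traffic, just pointed at the web port (the interface name and output file here are examples):

tcpdump -i eth0 port 80 -s 65535 -xnq -tttt > http-traffic.txt
mk-query-digest http-traffic.txt --type http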



"

Monday, 16 November 2009

QuickTip: Make life easier with ssh-copy-id

QuickTip: Make life easier with ssh-copy-id: "

How many times have you run through this series of events?



$ cat ~/.ssh/id_dsa.pub

...copy output to clipboard...

$ ssh myhost

...enter password...

myhost$ vi ~/.ssh/authorized_keys

...paste public key and save...

myhost$ exit


Thanks to bash's tab completion, I happened upon ssh-copy-id. Instead of all that above, just do this:



$ ssh-copy-id myhost

...enter password...



You're done!
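
If your key is not in the default location, or you have several, ssh-copy-id can also be pointed at a specific public key with -i (the path below is just an example):

$ ssh-copy-id -i ~/.ssh/id_dsa.pub myhost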





"

Thursday, 12 November 2009

How to tweak your Linux machine’s history command

How to tweak your Linux machine’s history command: "

If you are somewhat familiar with the Linux or UNIX command line interface, it is quite likely that you have used the history command. For those not familiar with it, the history command lets you list and search through the commands you have already run on your computer’s command line.


An example of its usage is:


# history | grep "kill"


This will search the history of commands you have run on your computer for entries containing the term kill. That is the basic use of the history command, but it has certain limitations, so let’s take a look at how you can tweak it to work more effectively for you. The limitation that bugs me most often is that if you are running two terminal sessions at the same time, only the session that exits last gets its commands saved; the commands from the other session are overwritten. Not the smartest way for it to work, I think. Well, let’s fix it then.


Log in to your account and launch a shell terminal. In your user’s home directory you should find a file called .bashrc. If this file does not exist, create it. To open the file, use a command like the following, replacing user with your username:

# vim /home/user/.bashrc


Add the following lines to the .bashrc file without any changes whatsoever:


shopt -s histappend

PROMPT_COMMAND='history -a'


And voila, problem solved. From now on, commands from every session, including simultaneous ones, will be appended to the command history file instead of overwriting it.


Another useful change I like to make is removing duplicates from the command history. This keeps the history free of repeated entries, so more unique commands fit in it and finding the one you want is quicker. Open the .bashrc file once again and add the following lines:


export HISTCONTROL="ignoredups"

export HISTIGNORE="&:ls:[bf]g:exit"


Save the file and exit the text editor. With ignoredups, bash skips a command that merely repeats the previous one, and the patterns listed in HISTIGNORE are left out of the history entirely. A last tweak I make in .bashrc is not strictly a history setting but is handy alongside it: the cdspell shell option, which lets bash correct minor spelling errors in directory names passed to cd. Add the line shown below:


shopt -s cdspell


Save and exit. Note that cdspell only fixes small typos in directory names given to cd (for example, cd /ect will still land you in /etc); it will not correct a mistyped command such as gerp instead of grep.
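
Putting it all together, the complete set of additions to your .bashrc from this post is:

shopt -s histappend
PROMPT_COMMAND='history -a'
export HISTCONTROL="ignoredups"
export HISTIGNORE="&:ls:[bf]g:exit"
shopt -s cdspell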


"

Wednesday, 11 November 2009

Initial InnoDB Compression Test Results

Initial InnoDB Compression Test Results: "I've completed an initial round of testing InnoDB with data compression and the results are extremely promising. The testing was performed by setting up two shadows of a production database. The shadows execute the exact same queries as the production database to collect performance statistics but the query results are ignored.

The production server uses Fusion-io PCI-Express flash storage and had 290GB of data. Both shadows were running on traditional spinning disks, so I wasn't sure they would be able to keep up with the load. The first shadow was configured with MySQL 5.0.84 with the uncompressed 'compact' InnoDB storage format. The second shadow was configured with MySQL 5.1.38 and the InnoDB 1.0.4 plugin with 2x compression. With compression enabled on all tables except three that contain pre-compressed data, the InnoDB data size shrank to 170GB, a 41% reduction.
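
For reference, this is roughly what enabling 2x compression on a table looks like with the InnoDB plugin (the database and table names below are placeholders; it requires innodb_file_per_table and the Barracuda file format, and KEY_BLOCK_SIZE=8 gives 2x compression for the default 16KB page size):

mysql> SET GLOBAL innodb_file_per_table=ON;
mysql> SET GLOBAL innodb_file_format=Barracuda;
mysql> ALTER TABLE mydb.mytable ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=8;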

When the shadows were first started up with empty caches, the compressed shadow ran noticeably slower as decompression slowed down every read from disk. Once the cache was warm, however, the compressed shadow ran significantly faster. InnoDB can store both compressed and uncompressed pages in its cache. When the cache is full, it can discard uncompressed pages but retain the compressed pages, allowing a much larger number of pages to stay in memory. For a spinning disk-based server, decompressing a page is much faster than re-reading it from disk.

The numbers speak for themselves: with compression enabled, the cache miss rate dropped from 0.51% to 0.36% and the time spent waiting for IO dropped from 19% to 7.6%. The CPU load increased from 5% to 9%, but our workload is rarely limited by CPU speed. Queries for the profile page, one of our most important metrics, ran 4x faster on average with compression enabled.

Unexpectedly, the amount of write IO went up with compression enabled. Because we have more pages in memory, insertions into secondary indexes are more often performed immediately instead of being deferred and combined in the insert buffer. Overall, though, the reduced read IO dramatically outweighed the increase in write IO. The net result looks like a very promising way to make better use of our server hardware.
"

Video: Building a MySQL Slave and Keeping it in Sync

Video: Building a MySQL Slave and Keeping it in Sync: "

Last night at the Boston MySQL User Group I presented on how to get a consistent snapshot to build a slave, how to use mk-table-checksum to check for differences between masters and slaves on an ongoing basis, and how to use tools such as mk-table-sync and mysqldump to sync the data if there are any discrepancies.


The slides are online at http://technocation.org/files/doc/slave_sync.pdf.


The video can be watched on youtube at http://www.youtube.com/watch?v=Un0wqYKmbWY or directly in your browser with the embedded player below:




"

Saturday, 7 November 2009

Google Open Sources Set Of Closure Tools It Uses To Build Its Own Apps

Google Open Sources Set Of Closure Tools It Uses To Build Its Own Apps: "

Google has just announced that it is open-sourcing a set of tools its own developers use to build some of the company’s most well known products, including Gmail, Google Docs, and Google Maps.


The first tool is called the Closure Compiler, which helps developers optimize their JavaScript by stripping comments, whitespace, and unused code. The Compiler also has a sister program called Inspector, a plugin for Firebug that allows developers to view their optimized code in the browser with their original variable names and other data restored (optimized code typically strips variable names and formatting, which makes it very difficult to read). Google is also releasing the Compiler as a web app and a RESTful API.
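
For the downloadable version, a typical command-line invocation presumably looks something like this (the file names are placeholders):

java -jar compiler.jar --js myscript.js --js_output_file myscript.min.js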


Google’s second release is a Library that includes many common functions used for building JavaScript apps. Here’s how Google’s blog post describes it:


The Closure Library is a JavaScript library that provides many features that are useful for building web applications across browsers. It helps to abstract browser differences and provides clean class and function interfaces for common tasks. Besides DOM interaction and UI tools, the library includes tools for arrays, objects, string manipulation and much more. The library also implements a namespacing and import standard, which helps keep scripts small and load fast when used with the Closure Compiler.


Finally, Google is releasing a new set of templates for Java and JavaScript:


Closure Templates simplify the task of dynamically creating HTML and have an easy-to-read syntax. They allow you to write templates for reusable HTML and UI elements that can be used within JavaScript or on the server side in Java. They also have excellent run-time performance because they are never parsed on the client side.


"

Friday, 6 November 2009

Increasing innodb_log_file_size

Increasing innodb_log_file_size: "We have servers that run with innodb_log_file_size=256M, and some of them do a lot of disk writes per second. I wanted to know whether performance would improve with a larger value for innodb_log_file_size, so I set up two test servers that used 256M and 512M for it and ran a mirror of the production workload on them.
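
For reference, changing innodb_log_file_size on servers of this vintage takes a clean shutdown plus moving the old log files aside so that InnoDB can create new ones at the configured size on the next start. Roughly (paths are examples):

mysql -uroot -e "SET GLOBAL innodb_fast_shutdown=1"   # anything but 2, so the shutdown is clean
mysqladmin -uroot shutdown
# raise innodb_log_file_size in my.cnf, then:
mv /var/lib/mysql/ib_logfile0 /var/lib/mysql/ib_logfile1 /tmp/
mysqld_safe &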

The results are interesting. The benefit varies from significant to modest, depending on how you measure. With the larger log file the write rate was reduced:

  • 4.5% as measured by iostat w/s
  • 13% as measured by iostat wsec/s
  • 18% as measured by Innodb pages written.

I then added the my.cnf variable innodb_flush_neighbors_on_checkpoint to MySQL. There are several conditions under which InnoDB writes dirty pages. One reason for pages to be flushed is page preflush, which is done to enforce the fuzzy checkpoint constraint: the oldest LSN for a dirty page must not be too close to the start of the current group of log files, and InnoDB submits async write requests to enforce this. The Facebook MySQL patch adds statistics to SHOW INNODB STATUS that report on the source of page writes, and for my servers page preflush is the most common cause. Other causes are too many dirty pages and moving pages from the LRU to the free list.

When a dirty page is to be written, InnoDB submits async write requests for that page and all other dirty pages from the same extent. This is done to reduce the disk seek overhead. This will also increase the rate at which pages are written to disk. In the case of page preflush, the impact can be significant. When there are too many dirty pages or not enough pages on the free list, InnoDB submits a fixed number of write requests. But when page preflush is done, InnoDB must submit async write requests for all pages with a modified LSN that is too small. So the extra writes done for pages in the same extent can lead to a large number of write requests.

With a modified binary I set skip_innodb_flush_neighbors_on_checkpoint for the server using innodb_log_file_size=512M. With this change the write rate was reduced:

  • 2% as measured by iostat w/s
  • 18% as measured by iostat wsec/s
  • 26% as measured by Innodb pages written.

I don't know why the iostat rate for w/s was not reduced more when using skip_innodb_flush_neighbors_on_checkpoint.

The setting skip_innodb_flush_neighbors_on_checkpoint is also likely to be useful when flash is used, but that is another discussion.
"

Tuesday, 3 November 2009

Create an Awesome Command Line Calculator [Terminal Tip]

Create an Awesome Command Line Calculator [Terminal Tip]: "

When you spend as much time at the terminal as I do, having a calculator at your fingertips can be a really useful feature, and the Command-Line Fu web site tells us how to create one easily.

To create the calculator, you can just enter this function on the command line to test out how it works, or add it to your ~/.bashrc file to make sure that it's always available after restarting the terminal.

calc(){ awk "BEGIN{ print $* }" ;}

To use it, you can simply type calc followed by the calculation you want to solve, making sure to use quotes around the calculation if there are spaces or special characters the shell can't handle. For instance, if you entered calc 3*99/7+18 at the prompt, you'd get 60.4286 as the answer. For more complex calculations, you can use parentheses, but make sure to put quotes around the whole expression, like this:

calc '((3+(2^3)) * 34^2 / 9)-75.89'

It's an extremely useful tip for those of us that live at the terminal. It should work on any OS running the bash shell, including Linux, Mac OS X, or even Windows with Cygwin installed.
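
To keep it around for good, appending the function to your ~/.bashrc does the trick (then open a new terminal or source the file):

echo 'calc(){ awk "BEGIN{ print $* }" ;}' >> ~/.bashrc
source ~/.bashrc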






"

Monday, 2 November 2009

Catching erroneous queries, without MySQL proxy

Catching erroneous queries, without MySQL proxy: "

MySQL Proxy is a really neat tool. I remember a few years back when I first saw Jan talking about it. Back in those days it was significantly different than it is now, but the concept remains the same: direct your database traffic through a man-in-the-middle. Chris Calender’s post on capturing erroneous queries with MySQL Proxy shows one use for Proxy. But wait. MySQL Proxy is just inspecting the MySQL protocol. And unless you’re using it for something else too, having a man in the middle to catch errors is like standing in the middle of the street and blocking traffic to count the cars on the street. Why don’t you stand on the sidewalk to count the cars instead?



Observing without interrupting



Maybe we can use tcpdump. If you search Google you’ll see lots of examples of using tcpdump and grep to extract queries from the MySQL protocol. These examples usually fall on their face when there are multi-line queries, or the compressed protocol is in use, but here there’s another problem: the error flag is just a field in the protocol, and that’s not very easy to inspect with grep. I’m not sure it’s even reasonably possible.



Maatkit to the rescue! The mk-query-digest tool can understand the protocol and we can then filter and manipulate however we wish. As usual, begin by downloading it:



wget http://www.maatkit.org/get/mk-query-digest


Next, let’s set it up to observe the traffic so we can see what’s happening on the server.



tcpdump -i lo port 3306 -s 65535 -xnq -tttt \
| perl mk-query-digest --type tcpdump


That’ll capture traffic on port 3306 and push it into mk-query-digest, which defaults to aggregating and reporting on queries. But what we really want to do with that traffic is filter out all the traffic that doesn’t have errors. This is a one-liner, but I’ll take a detour to show you how to fish instead of just giving you the fish.



Writing mk-query-digest filters



You can filter and transform query events any way you wish with the --filter command-line option. To do this, you have to know the structure of a query event, as seen by mk-query-digest. This is really easy; let’s create a filter that simply prints the event itself with Perl’s built-in Data::Dumper module, so we can see it:



tcpdump -i lo port 3306 -s 65535 -xnq -tttt \
| perl mk-query-digest --type tcpdump --filter 'print Dumper $event'


I’ll test this by leaving it running in one terminal, and running a malformed query, such as a bare SELECT with nothing to select, in another terminal:



mysql> select;
ERROR 1064 (42000): You have an error in your SQL syntax...


Good enough for me. When I did this, mk-query-digest printed the following:



$VAR1 = {
  Error_no => '#1064',
  No_good_index_used => 'No',
  No_index_used => 'No',
  Query_time => '0.000316',
  Rows_affected => 0,
  Thread_id => '4294967296',
  Warning_count => 0,
  arg => 'select',
  bytes => 6,
  cmd => 'Query',
  db => undef,
  fingerprint => 'select',
  host => '127.0.0.1',
  ip => '127.0.0.1',
  port => '39640',
  pos_in_log => '0',
  ts => '091101 14:54:44.293453',
  user => undef
};


So the $event is a hash with an entry called Error_no. (There is a comprehensive reference to the structure of a query event in the Maatkit wiki, but I often find this technique faster than looking up the reference.) Now we’re ready to build a filter that’ll snag queries with errors, and print them out in slow-query-log format. Due to an oddity of the way the Error_no is reported for queries that do NOT have an error, I need to explicitly filter by queries that don’t say “none”. The final filter code, with a little sanity check to prevent crashing if it’s ever undefined, is just:



--filter '($event->{Error_no} ||"") ne "none"'



Sorry to disappoint if you were expecting something more complicated!



Mission accomplished



The final mk-query-digest command is as follows:




tcpdump -i lo port 3306 -s 65535 -xnq -tttt \
| perl mk-query-digest --type tcpdump \
--filter '($event->{Error_no} || "") ne "none"' --print


If I now run a bunch of queries, some with errors, I’ll see those with errors get printed out. Let’s try:




mysql> select 1; select current_date; select; set global nono=1;select 1;


And the result:



# Time: 091101 15:23:40.983195
# Client: 127.0.0.1:39640
# Thread_id: 4294967296
# Query_time: 0 Lock_time: 0 Rows_sent: 0 Rows_examined: 0
select;
# Time: 091101 15:23:40.983457
# Client: 127.0.0.1:39640
# Thread_id: 4294967296
# Query_time: 0 Lock_time: 0 Rows_sent: 0 Rows_examined: 0
set global nono=1;


Presto, we have a way to catch all queries causing errors.



Benefits



There are a lot of benefits to doing things this way. Of course, we can do all the usual things with the queries (filter, aggregate, report, save to a file and analyze with any of a number of tools, store to the database…). But even better, we can do that at any time, without having to install MySQL Proxy or do anything else disruptive to the system. In fact, we can even dump the tcpdump output to a file and take it elsewhere to examine it — there is no need to even have Perl or the Maatkit tools on the server you want to examine. This is one of the nice things about not coupling the analysis and collection tightly together, which is unique to mk-query-digest as far as I know.
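
For the capture-now, analyze-later approach, something like this should work (the interface name is an example; the analysis can run on any machine that has Perl and mk-query-digest):

tcpdump -i eth0 port 3306 -s 65535 -xnq -tttt > mysql-traffic.txt
perl mk-query-digest --type tcpdump --filter '($event->{Error_no} || "") ne "none"' --print mysql-traffic.txt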



"

Sike's shared items