Jan 24, 2009

Java, MySQL and Large Datasets Retrieval

Hi,

As told before, it was a MySQL week,

We had a major work this week solving a performance issue in a reporting component to one of our clients. Since its current component worked directly against the raw database, it was facing degragated performance as database and business get larger.

Therefore, we designed an OLAP solution that extracts information from the raw tables, group and summarizes the data and then created a compact table, which data can be easily read from.

However, the database is MySQL, and we used Java to implement this mechanism. Unfortunately, it seems that Java and MySQL don't really each other or at least like large tables: When you try to extract records our of large MySQL table you receive out of memory error in the execute and executeQuery methods.

How to overcome this?
1. As suggested by databases&life, set the fetchSize to Integer.MIN_VALUE. Yes, I know it a bug, not a feature, but yet it solves this issue:

The reason for this bug is the code in StatementImpl.java of the MySQL JDBC driver code:
protected boolean createStreamingResultSet() {
return ((resultSetType == ResultSet.TYPE_FORWARD_ONLY)
&& (resultSetConcurrency == ResultSet.CONCUR_READ_ONLY)
&& (fetchSize == Integer.MIN_VALUE));
}

And the solution is:

public void processBigTable() {
PreparedStatement stat = c.prepareStatement(
"SELECT * FROM big_table",
ResultSet.TYPE_FORWARD_ONLY,
ResultSet.CONCUR_READ_ONLY
);
stat.setFetchSize(Integer.MIN_VALUE);

ResultSet results = stat.executeQuery();

while (results.next()) {
...
}
}


2. The other option is doing this fetch applicative, meaning that each time setMaxRows will set to N and reocrds will be extracted only if their id is larger than the extracted before

public void processBigTable() {
long nRowsNumber = 1;
long nId = 0;
while (nRowsNumber > 0) {
nRowsNumber = 0;
PreparedStatement stat = c.prepareStatement(
"SELECT * FROM big_table WHERE big_table_id > " + nId ,
ResultSet.TYPE_FORWARD_ONLY,
ResultSet.CONCUR_READ_ONLY
);
stat.
setMaxRows(10000);
ResultSet results = stat.executeQuery();

while (results.next()) {
++nRowsNumber;
nId = results.getLong(1);
...
}
}

Hope you find it useful as we found it, and thanks again to databases&life,

Best
Moshe. RockeTier. The Performance Experts.

No comments:

ShareThis

Intense Debate Comments

Ratings and Recommendations