Currently no version of ICE supports the LOAD DATA LOCAL INFILE statement. Although some ETL tools provide a workaround for, see http://www.infobright.org/Forums/viewthread/1123/, by using named pipes and a program installed on the server.
Those who don't use any ETL tool, cannot do anything else than copying the file to the server and load in from there.
I've created a patch which enables remote data loading, but currently it only works on Linux. It works by using named pipes.
When I have more time I'll make it to work on Windows too.
Patch have to applied in the root of Infobright source code, and are tested in ICE 3.3.1 on Linux, it may work on other UNIX like systems, but it won't work on Windows.
Update:
I have discovered a bug in the original patch. The bug allows the data loading to end prematurely, meaning that all rows aren't going to be imported. I've created a temporary fix, until I figure out why this happens.
The fix can be downloaded here.
Patch file:
loadDataLocal331.patch (9 KB)
Showing posts with label patch. Show all posts
Showing posts with label patch. Show all posts
Monday, March 1, 2010
Wednesday, February 17, 2010
ICE importer IGNORE LINES patch
The ICE importer in any of the current versions of ICE, newest 3.3.1, doesn't support IGNORE LINES in the LOAD DATA statement. For example, this this data file would fail, as we can't tell it to skip the first line:
Test data
It is very easy to implement in the loader, and the attached patched does exactly this.
The patch is tested in ICE 3.3.1, but should also work in earlier versions. It must be applied in the ICE source code root.
I've created a bug report here.
Patch file:
ignoreLines331.patch (4 KB)
Test data
1;2;3 4;5;6It is possible to issue the LOAD DATA statement with the IGNORE LINES option, but it is ignored by the loader.
It is very easy to implement in the loader, and the attached patched does exactly this.
The patch is tested in ICE 3.3.1, but should also work in earlier versions. It must be applied in the ICE source code root.
I've created a bug report here.
Patch file:
ignoreLines331.patch (4 KB)
Tuesday, February 16, 2010
ICE MySQL Query Cache patch
Infobright ICE doesn't currently use the MySQL Query Cache for queries executed by the Infobright engine.
The attached patch enables the MySQL Query Cache for Infobright queries also.
There are currently a bug report about this here.
Patch file for ICE 3.3.1, should also work for earlier version:
queryCache331.patch (1 KB)
The attached patch enables the MySQL Query Cache for Infobright queries also.
There are currently a bug report about this here.
Patch file for ICE 3.3.1, should also work for earlier version:
queryCache331.patch (1 KB)
ICE importer default escape character patch
In the current, 3.3.1, and earlier version of ICE there is a problem with the importer. It doesn't handle escape characters in strings correctly. This is because the default escape character in the importer is nothing (ASCII code: 0). This patch sets the default escape character to '\', which is the default in MySQL.
You'll only run into this problem if you haven't define an escape character in the LOAD DATA statement, and try to load a string containing a ".
Patch file for ICE 3.3.1, but should also work in earlier version, has to be applied in the source code root:
escapedCharacter331.patch
You'll only run into this problem if you haven't define an escape character in the LOAD DATA statement, and try to load a string containing a ".
Patch file for ICE 3.3.1, but should also work in earlier version, has to be applied in the source code root:
escapedCharacter331.patch
ICE RENAME TABLE support
Currently no version of ICE supports the RENAME TABLE statement, but this is about to change this.
At least from version 3.3.0, the RENAME TABLE statement has been almost fully implemented, to only thing missing is connecting the rename function, to the statement. I have know idea why this hasn't been done a long time ago.
I've created a forum post, trying to answer this question, it can be found here.
The patch enables you to give ICE the following statement:
Patch file:
renameTable331.patch (1 KB)
UPDATE:
If you issue the RENAME TABLE command in the current implementation, the table file structure are corrupted. Therefore do not try to rename a Infobright table without applying this patch. I've create a bug report here.
UPDATE 2:
I got a response to the forum post I created. The answer to why it wasn't implemented was that renaming tables in ICE aren't officially supported, and therefore this wasn't a bug. See the answer here.
At least from version 3.3.0, the RENAME TABLE statement has been almost fully implemented, to only thing missing is connecting the rename function, to the statement. I have know idea why this hasn't been done a long time ago.
I've created a forum post, trying to answer this question, it can be found here.
The patch enables you to give ICE the following statement:
RENAME TABLE oldName TO newNameThe attached patch are tested in version 3.3.1, but should also work in 3.3.0, and should be applied in the root ICE source directory.
Patch file:
renameTable331.patch (1 KB)
UPDATE:
If you issue the RENAME TABLE command in the current implementation, the table file structure are corrupted. Therefore do not try to rename a Infobright table without applying this patch. I've create a bug report here.
UPDATE 2:
I got a response to the forum post I created. The answer to why it wasn't implemented was that renaming tables in ICE aren't officially supported, and therefore this wasn't a bug. See the answer here.
ICE importer FIELDS ESCAPED BY patch
None of the Infobright ICE versions support NULLs escaped anything other than the default escape character '\'. This is probably because the default way to handle NULLs in the ICE exporter and importer are an empty field. In this example fields are terminated by ';', and NULLs are indicated the ICE default way.
There are currently a bug report concerning this here.
I've created a patch to solve this "problem", which means that you can define another escape character like this:
LOAD DATA INFILE '/tmp/outfile.csv' INTO TABLE ibtest FIELDS TERMINATED BY '~';
The patch has to be applied in the root of the ICE source code directory.
The patch is tested in ICE 3.3.1, but should also work in any earlier versions.
Patch file:
escapedBy331.patch (2 KB)
1;;2 2;2;2The problem is that ICE importer also supports NULLs defined the default MySQL way:
1;\N;2 2;2;2But not with any other escape character, like this:
1;~N;2 2;2;2In ICE you can define a custom NULL indication by setting the BH_NULL session variable, but for some reason this is not handled in the importer.
There are currently a bug report concerning this here.
I've created a patch to solve this "problem", which means that you can define another escape character like this:
LOAD DATA INFILE '/tmp/outfile.csv' INTO TABLE ibtest FIELDS TERMINATED BY '~';
The patch has to be applied in the root of the ICE source code directory.
The patch is tested in ICE 3.3.1, but should also work in any earlier versions.
Patch file:
escapedBy331.patch (2 KB)
Wednesday, February 3, 2010
ICE importer LINES TERMINATED BY patch
As a continuation of my previous post, Infobright also doesn't allow importing with a user defined line terminator.There are currently a bug report here.
This patch fixes this, please note that the patch from this post, http://infobright.blogspot.com/2010/02/infobright-ice-exporter-lines.html, have to be applied first.
This patch allows you to use the Infobright importer while defining a line terminator, like this:
The patch has to be applied in the Infobright source root.
Patch file:
importLinesTerminated.patch (4 KB)
UPDATE:
Patch file for ICE 3.3.1:
importLinesTerminatedBy331.patch (4 KB)
This patch fixes this, please note that the patch from this post, http://infobright.blogspot.com/2010/02/infobright-ice-exporter-lines.html, have to be applied first.
This patch allows you to use the Infobright importer while defining a line terminator, like this:
LOAD DATA INFILE '/tmp/outfile.csv' INTO TABLE ibtest LINES TERMINATED BY '\t'By default the Infobright importer supports \r\n and \n as line terminators, and figures it out automatically, but if another line terminator are used, importing the file will fail.
The patch has to be applied in the Infobright source root.
Patch file:
importLinesTerminated.patch (4 KB)
UPDATE:
Patch file for ICE 3.3.1:
importLinesTerminatedBy331.patch (4 KB)
ICE exporter LINES TERMINATED BY patch
In all current versions, newest 3.3.0, of Infobright you cannot define LINES TERMINATED BY when exporting data to a file. Well, you can define it but it isn't used when the Infobright exporter are used.
The default line terminator are \r\n, which works great, but sometimes you want to use another line terminator, which are not supported in any of the current versions of Infobright.
There are currently two bug reports concerning this, http://bugs.infobright.org/ticket/1235 and http://bugs.infobright.org/ticket/1072.
This patch fixes this, it works in Infobright 3.3.0 and allows you to define LINES TERMINATED BY like this:
Patch file:
exporterLinesTerminated.patch (8 KB)
UPDATE:
Patch file for ICE 3.3.1:
exporterLinesTerminated331.patch (8 KB)
The default line terminator are \r\n, which works great, but sometimes you want to use another line terminator, which are not supported in any of the current versions of Infobright.
There are currently two bug reports concerning this, http://bugs.infobright.org/ticket/1235 and http://bugs.infobright.org/ticket/1072.
This patch fixes this, it works in Infobright 3.3.0 and allows you to define LINES TERMINATED BY like this:
SELECT * FROM ibtest INTO OUTFILE '/tmp/outfile.csv' LINES TERMINATED BY '\n'The patch has to be applied at the root of the Infobright source.
Patch file:
exporterLinesTerminated.patch (8 KB)
UPDATE:
Patch file for ICE 3.3.1:
exporterLinesTerminated331.patch (8 KB)
Monday, January 25, 2010
Default ICE NULL string patch
One thing about Infobright that annoys me a lot, are their exporters default way to indicate NULL values, which is nothing, NULL are indicated by no character. This poses a viewability problem when fields are terminated by for example tabs, then you cannot easily see the null values.
To get around this problem you can use the session variable BH_NULL, and set it to fx. the default MySQL way, \N. But this has to be done for every session/connection.
As I did not want to do this, I created a trivial patch which sets the default NULL string to \N, as in standard MySQL.
The patch can be downloaded below, and has to be applied in the root of the Infobright source files, and works of Infobright 3.3.0.
Patch file:
DefaultNullStr.patch (1 KB)
UPDATE: The patch also works for ICE 3.3.1
To get around this problem you can use the session variable BH_NULL, and set it to fx. the default MySQL way, \N. But this has to be done for every session/connection.
As I did not want to do this, I created a trivial patch which sets the default NULL string to \N, as in standard MySQL.
The patch can be downloaded below, and has to be applied in the root of the Infobright source files, and works of Infobright 3.3.0.
Patch file:
DefaultNullStr.patch (1 KB)
UPDATE: The patch also works for ICE 3.3.1
Sunday, January 24, 2010
Default ICE exporter patch
When you export files in the current Infobright version, 3.3.0 and earlier. There are some things you should be aware of. The default exporter are the standard MySQL exporter, but this exporter are not compatible with the default Infobright importer.
For example if you do this, you may run into problems, because of the differences in the exporter and importer:
As described in the bug report, and in other places, you'll have to set the BH_DATAFORMAT session variable to txt_variable, to use the Infobright exporter by default, otherwise you'll may run into format compability problems.
The 'default' exporter will only be used if you are accessing Infobright tables, therefore in my opinion the default exporter should be the Infobright exporter, defined by BH_DATAFORMAT=txt_variable, but at the moment it isn't.
To work around this I've created a very simple patch, which does exactly this. What this patch does is that instead of using the default MySQL exporter, when the BH_DATAFORMAT variable aren't defined, it uses the Infobright exporter instead.
The patch file can be download at the bottom this post, and works for the current version 3.3.0. It has to applied to the file src/storage/brighthouse/core/RCEngine_tools.cpp directly.
Patch file for 3.3.0:
DefaultTxtVariable.patch (1 KB)
UPDATE:
Patch file for 3.3.1:
DefaultTxtVariable331.patch (1 KB)
For example if you do this, you may run into problems, because of the differences in the exporter and importer:
SELECT * FROM infobrightTable INTO OUTFILE '/tmp/outfile.txt'; LOAD DATA INFILE '/tmp/outfile.txt' INTO TABLE infobrightTable;Currently there is a bug report about this here. Unfortunately, you'll have to create a user to access the contents.
As described in the bug report, and in other places, you'll have to set the BH_DATAFORMAT session variable to txt_variable, to use the Infobright exporter by default, otherwise you'll may run into format compability problems.
The 'default' exporter will only be used if you are accessing Infobright tables, therefore in my opinion the default exporter should be the Infobright exporter, defined by BH_DATAFORMAT=txt_variable, but at the moment it isn't.
To work around this I've created a very simple patch, which does exactly this. What this patch does is that instead of using the default MySQL exporter, when the BH_DATAFORMAT variable aren't defined, it uses the Infobright exporter instead.
The patch file can be download at the bottom this post, and works for the current version 3.3.0. It has to applied to the file src/storage/brighthouse/core/RCEngine_tools.cpp directly.
Patch file for 3.3.0:
DefaultTxtVariable.patch (1 KB)
UPDATE:
Patch file for 3.3.1:
DefaultTxtVariable331.patch (1 KB)
MySQL and Infobright
At my work place we have used the Infobright Community Edition, http://www.infobright.org/, for a while now, and with great success.
For those who don't know what Infobright is, I recommend that you take a look at this page, What is ICE?, ICE stands for Infobright Community Edition, which is the open source version.
One of Infobrights great strengths are compression, which is top-notch. And for certain analytic queries it performs very, very good.
But there are certainly room for improvements, fx. as Infobright uses its own optimizer, not the default MySQL optimizer, certain queries aren't optimized very well yet, and some aren't supported at all. Unsupported queries will fall back to the standard MySQL optimizer, which doesn't work well with Infobright tables.
Also they are very slow to implement contributed patches, because of this, I've decided to post patches in this blog. Patches which are committed to they're bug tracker, but also patches which are developed specifically for use at my company.
For those who don't know what Infobright is, I recommend that you take a look at this page, What is ICE?, ICE stands for Infobright Community Edition, which is the open source version.
One of Infobrights great strengths are compression, which is top-notch. And for certain analytic queries it performs very, very good.
But there are certainly room for improvements, fx. as Infobright uses its own optimizer, not the default MySQL optimizer, certain queries aren't optimized very well yet, and some aren't supported at all. Unsupported queries will fall back to the standard MySQL optimizer, which doesn't work well with Infobright tables.
Also they are very slow to implement contributed patches, because of this, I've decided to post patches in this blog. Patches which are committed to they're bug tracker, but also patches which are developed specifically for use at my company.
Subscribe to:
Posts (Atom)