Showing posts with label importer. Show all posts
Showing posts with label importer. Show all posts

Wednesday, February 17, 2010

ICE importer IGNORE LINES patch

The ICE importer in any of the current versions of ICE, newest 3.3.1, doesn't support IGNORE LINES in the LOAD DATA statement. For example, this this data file would fail, as we can't tell it to skip the first line:
Test data
1;2;3
4;5;6
It is possible to issue the LOAD DATA statement with the IGNORE LINES option, but it is ignored by the loader.
It is very easy to implement in the loader, and the attached patched does exactly this.

The patch is tested in ICE 3.3.1, but should also work in earlier versions. It must be applied in the ICE source code root.

I've created a bug report here.

Patch file:
ignoreLines331.patch (4 KB)

Tuesday, February 16, 2010

ICE importer default escape character patch

In the current, 3.3.1, and earlier version of ICE there is a problem with the importer. It doesn't handle escape characters in strings correctly. This is because the default escape character in the importer is nothing (ASCII code: 0). This patch sets the default escape character to '\', which is the default in MySQL.
You'll only run into this problem if you haven't define an escape character in the LOAD DATA statement, and try to load a string containing a ".

Patch file for ICE 3.3.1, but should also work in earlier version, has to be applied in the source code root:
escapedCharacter331.patch

ICE importer FIELDS ESCAPED BY patch

None of the Infobright ICE versions support NULLs escaped anything other than the default escape character '\'. This is probably because the default way to handle NULLs in the ICE exporter and importer are an empty field. In this example fields are terminated by ';', and NULLs are indicated the ICE default way.
1;;2
2;2;2
The problem is that ICE importer also supports NULLs defined the default MySQL way:
1;\N;2
2;2;2
But not with any other escape character, like this:
1;~N;2
2;2;2
In ICE you can define a custom NULL indication by setting the BH_NULL session variable, but for some reason this is not handled in the importer.

There are currently a bug report concerning this here.

I've created a patch to solve this "problem", which means that you can define another escape character like this:
LOAD DATA INFILE '/tmp/outfile.csv' INTO TABLE ibtest FIELDS TERMINATED BY '~';

The patch has to be applied in the root of the ICE source code directory.
The patch is tested in ICE 3.3.1, but should also work in any earlier versions.

Patch file:
escapedBy331.patch (2 KB)

Wednesday, February 3, 2010

ICE importer LINES TERMINATED BY patch

As a continuation of my previous post, Infobright also doesn't allow importing with a user defined line terminator.There are currently a bug report here.
This patch fixes this, please note that the patch from this post, http://infobright.blogspot.com/2010/02/infobright-ice-exporter-lines.html, have to be applied first.

This patch allows you to use the Infobright importer while defining a line terminator, like this:
LOAD DATA INFILE '/tmp/outfile.csv' INTO TABLE ibtest LINES TERMINATED BY '\t'
By default the Infobright importer supports \r\n and \n as line terminators, and figures it out automatically, but if another line terminator are used, importing the file will fail.

The patch has to be applied in the Infobright source root.

Patch file:
importLinesTerminated.patch (4 KB)

UPDATE:
Patch file for ICE 3.3.1:
importLinesTerminatedBy331.patch (4 KB)