Tuesday, September 20, 2011

ICE 4.0.3 importer IGNORE LINES patch

The bug addressed by my previous patch still hasn't been fixed upstream, so I've implemented the patch again for the new ICE 4.0.3 version.

The patch must be applied in the root of the source code directory using the following command:
patch -p2 < ignorelines403.patch
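If you want to check that the patch applies cleanly before touching the source tree, a dry run first can help. The following is only a sketch: the function name apply_patch_checked is my own, and --dry-run is a GNU patch option, so it assumes GNU patch is installed.

```shell
# Hypothetical helper: test a -p2 patch with a dry run, then apply it.
# $1 = root of the source tree, $2 = absolute path to the patch file.
apply_patch_checked() {
    src_root=$1
    patch_file=$2
    # Use a subshell so the caller's working directory is left unchanged.
    (
        cd "$src_root" || exit 1
        # --dry-run (GNU patch) reports problems without modifying any file.
        patch -p2 --dry-run < "$patch_file" > /dev/null || exit 1
        # The dry run succeeded, so apply the patch for real.
        patch -p2 < "$patch_file"
    )
}
```

It would be called as apply_patch_checked /path/to/source /path/to/ignorelines403.patch, where both paths are placeholders. The patch file path has to be absolute because the function changes directory before reading it.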

Patch file:

Changes between ICE 3.5.2 and 4.0.3

It has been a while since my last comparison, but now it's time again. This time it is a comparison between 3.5.2 and 4.0.3, and indeed a lot of changes have been made. As always, also take a look at the release notes.
As described in the release notes, the biggest changes include Domain Expert and Rough Queries. There is a lot of information about these on the web, so they will not be explained here.
Some of the minor changes include:
  • LOCK TABLE now returns a wrong-command error, as ICE doesn't support it.
  • Aggregations now support out-of-range values instead of defaulting to 0.0.
  • Better handling of dates during aggregation.
  • Better query debug log information, especially for joins.
  • Better error handling.
  • Exporter acknowledges escape character.
Changes have also been made to the configuration options:
  • sync_buffers (flushes table buffers).
  • Many configuration options can now be shown using SHOW VARIABLES. All of these variables start with brighthouse_ini.

Wednesday, August 31, 2011

Infobright 3.5 and 4.0 build/compile issue

It has been a while since I've had time to work with the Infobright source code.
Today I decided to install Infobright 4.0 from scratch on a brand-new Ubuntu machine. The build went fine, and so did the installation... except for all the missing files, such as the mysql-ib client, the init.d script, etc.
This basically means that some work has to be done to get Infobright fully up and running.
make PREFIX=/opt/infobright EDITION=community release
make PREFIX=/opt/infobright EDITION=community install-release
The lines above compile and "install" Infobright from source, but they don't install the files described above.
To install the missing files/symlinks, perform the following steps.
  1. Go to the Infobright installation directory, cd /opt/infobright in my case.
  2. Copy the $SOURCE_DIR/build/community/release/vendor/support-files directory to the installation directory.
  3. Copy all *.in files from $SOURCE_DIR/src/build/pkgmt to the new support-files directory in the installation directory.
  4. Copy the $SOURCE_DIR/build/community/release/vendor/scripts directory to the installation directory.
  5. The last step is to run the script $SOURCE_DIR/src/build/pkgmt/install-infobright-linux.sh, which has to be run from the installation directory.
Now all the expected files/symlinks exist: the mysql-ib client, /etc/init.d/mysqld-ib and the configuration files.
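The five steps above can be wrapped in a small shell function. This is only a sketch: the function name install_missing_files is my own, and it assumes the install script is executable and that the installation directory does not already contain support-files or scripts directories (otherwise cp -R would nest the copies inside them).

```shell
# Hypothetical helper that performs the five manual steps in order.
# $1 = absolute path to the Infobright source tree ($SOURCE_DIR)
# $2 = installation directory (for example /opt/infobright)
install_missing_files() {
    source_dir=$1
    install_dir=$2
    vendor_dir="$source_dir/build/community/release/vendor"
    pkg_dir="$source_dir/src/build/pkgmt"
    # Use a subshell so the caller's working directory is left unchanged;
    # the && chain stops at the first step that fails.
    (
        cd "$install_dir" &&                            # step 1
        cp -R "$vendor_dir/support-files" . &&          # step 2
        cp "$pkg_dir"/*.in support-files/ &&            # step 3
        cp -R "$vendor_dir/scripts" . &&                # step 4
        "$pkg_dir/install-infobright-linux.sh"          # step 5, run from the install dir
    )
}
```

In my case the call would be install_missing_files "$SOURCE_DIR" /opt/infobright, most likely as root since the script installs files under /etc/init.d.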
    UPDATE:
    As requested in a comment, I've added the mysqld-ib file: mysqld-ib

    Friday, October 15, 2010

    ICE 3.5 Beta released

    ICE 3.5 beta was released yesterday. The beta includes enhanced memory management, durability improvements and improved performance on lookup columns.
    Also, a lot of minor bugs have been fixed, especially issues which caused server crashes.
    Unfortunately the source hasn't been released, so I can't take a more technical look at the changes, but the changes described in the release notes seem to be an overall improvement.
    I haven't had time to test it yet, but as soon as I have time I'll post a comparison between the 3.5 beta and 3.4.2.

    Wednesday, September 8, 2010

    Percona compares Infobright and InfiniDB

    I've just noticed that Percona has compared ICE against InfiniDB. Percona is, in my opinion, one of the best experts on MySQL and the available storage engines, and is very active in the MySQL community.
    They have made a 22-page comparison of the two storage engines, in which they looked at:

    • DDL and datatype support
    • Time to setup database and data loading
    • Size of the loaded data and compression
    • Ease of installation and security
    • Queries over large datasets
    They used the ICE 3.3.2 beta and InfiniDB 1.5 GA.
    Here is a very short summary of their conclusions, with focus on Infobright.

    DDL and datatype support
    Infobright supports a number of MySQL data types and DDL features which are not supported by InfiniDB, for example:
    • Year
    • Time
    • Tinytext
    • NOT NULL
    When one of these data types was necessary, they used another data type which could hold the data instead.
    Neither database supplied very good error messages when unsupported data types were used.

    Loading Data
    Even though ICE supports the MySQL "LOAD DATA INFILE" syntax, it is not compatible with the default MySQL settings, and as those who have tried ICE know, this can be very confusing in the beginning. Still, loading data was much easier in ICE than in InfiniDB.
    InfiniDB was about 147% faster at loading a 900 GB dataset, and was on average 346 seconds faster per file.

    Compression
    InfiniDB doesn't compress data, and therefore its data size is much bigger than that of ICE. The data size in ICE was about 13% of the source data size.

    Installation and security
    Again, ICE scored much better than InfiniDB, both in security (5/5) and in ease of installation and use (4/5).

    Queries
    This is the most interesting part. Many queries (15/29) could not be executed on InfiniDB, as not all aggregation functions are supported.
    ICE was faster than InfiniDB in 26 out of 29 queries, but as stated above, not all queries could be executed by InfiniDB.

    Conclusion
    Percona thinks ICE performs better, is easier to use and gives more accurate results than InfiniDB.

    The full comparison can be downloaded from the Infobright Whitepaper Resource Library.

    Sunday, March 14, 2010

    Infobright ICE 3.3.2 Beta released

    Infobright ICE 3.3.2 Beta has been released. It adds full support for storing and querying UTF-8 data. The release notes can be found here, and ICE 3.3.2 Beta can be downloaded here.
    As this is a beta, it should not be used in production systems yet, but I can only recommend trying it out if you need UTF-8 support.
    As I did last time, I took a look at the differences between the two latest versions, 3.3.1 and 3.3.2 Beta.
    Most of the changes are UTF-8 related, but there are a couple of other interesting changes which aren't fully implemented in the beta.

    It looks like a new configuration parameter is about to be added, UseMySQLImportExportDefaults. It is not implemented yet, meaning it won't do anything different if enabled in the configuration file. But it sounds like it will use the default MySQL import and export settings instead of the current Infobright settings.

    It also looks like a lot of other export/import improvements are under way, for example:
    • Ignore lines
    • Load data local infile
    • Value list elements
    • Lock options
    • Optionally enclosed
    • Lines starting by
    Some of these options may never be implemented in Infobright, but at least now you get a warning if the options are used, saying that they aren't supported and will be ignored. Still, the fact that they now appear in the source code is a step in the right direction.

    There have also been some improvements to sorting and to the caching mechanisms.

    Monday, March 1, 2010

    ICE LOAD DATA LOCAL INFILE

    Currently no version of ICE supports the LOAD DATA LOCAL INFILE statement, although some ETL tools provide a workaround (see http://www.infobright.org/Forums/viewthread/1123/) by using named pipes and a program installed on the server.
    Those who don't use an ETL tool cannot do anything other than copy the file to the server and load it from there.
    I've created a patch which enables remote data loading, but currently it only works on Linux. It works by using named pipes.
    When I have more time I'll make it work on Windows too.
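For readers who haven't used named pipes, here is a minimal shell sketch of the general mechanism. It is not the code from the patch: the function name stream_through_fifo is my own, and the reading cat only stands in for whatever process consumes the data on the server side.

```shell
# Hypothetical demonstration of a named pipe (FIFO): one process writes a
# data file into the pipe while another reads it, without an intermediate copy.
# $1 = path to the data file to stream.
stream_through_fifo() {
    data_file=$1
    work_dir=$(mktemp -d) || return 1
    fifo="$work_dir/load.fifo"
    mkfifo "$fifo" || return 1
    # Writer: runs in the background and blocks until a reader opens the pipe.
    cat "$data_file" > "$fifo" &
    writer=$!
    # Reader: stands in for the loader; it sees the data as an ordinary stream.
    cat "$fifo"
    # Wait for the writer to finish, then clean up the pipe.
    wait "$writer"
    rm -rf "$work_dir"
}
```

Calling stream_through_fifo data.txt (data.txt being any text file) prints the file's contents after they have passed through the pipe. mkfifo is a standard UNIX utility, which fits with this approach being limited to Linux and similar systems.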

    The patch has to be applied in the root of the Infobright source code and has been tested with ICE 3.3.1 on Linux. It may work on other UNIX-like systems, but it won't work on Windows.

    Update:
    I have discovered a bug in the original patch. The bug allows the data loading to end prematurely, meaning that not all rows are imported. I've created a temporary fix until I figure out why this happens.
    The fix can be downloaded here.

    Patch file:
    loadDataLocal331.patch (9 KB)