Rseslib
General information
Rseslib is a library of machine
learning data structures and algorithms implemented in Java. The library is
developed by the team from Group of Logic, Faculty of Mathematics,
Informatics and Mechanics, University of Warsaw, Poland. The team is headed
by Professor Andrzej Skowron.
This web site introduces the newest version of the library:
Rseslib 3.
The first version, Rseslib 1, was started in 1993 and was implemented in C++.
Rseslib 2 was the first version of the library implemented in Java
and it forms the core of RSES 2.x.
Rseslib 3 is intended to provide a modular, component-based architecture
and easy-to-reuse data representations and methods.
It is used in the TunedIT system
for automated evaluation, benchmarking and comparison
of data mining and machine learning algorithms.
Download
Rseslib
is distributed under the GNU GPL license.
To download Rseslib
and its source code, click the links below:
Tools are also provided for Rseslib:
Before running a tool, unzip the file.
The following tools are currently available:
- Qmak (beta version) - graphical interface for data and
classification visualization and for classifier testing and comparison.
To start Qmak, run the script qmak.bat.
- Simplistic Grid Manager - client-server tool for testing classifiers
on many computers in a network.
To start the server, run the script sgm-server.bat,
providing a file with the list of experiments to run, e.g.
sgm-server data/experiments.txt.
To start the client, run the script sgm-client.bat,
providing the name or address of the server, e.g.
sgm-client localhost.
Qmak is still in beta, so the two older graphical
interfaces remain available:
- Visual Rseslib - to start Visual Rseslib, run the script vr.bat.
- Trickster - to start Trickster, run the script trickster.bat.
Documentation
English:
- Programmer's Guide - for users and developers of the Rseslib library and Simplistic Grid Manager
- Component List - list of all implemented Rseslib components and algorithms
Polish:
Data format
The library reads three data formats:
- CSV + rseslib header
Fields may be separated by commas and/or whitespace.
Comments (lines starting with '#') and empty lines are allowed.
The format requires an additional header describing data columns, e.g.
mushroom.hdr,
census-income.hdr,
covtype.hdr.
A header may be provided in two ways:
- by adding the header at the beginning of a data file: 4 small datasets with headers included are available here.
- by providing a separate file with the header: 10 large datasets with separate headers are available here.
- ARFF
Weka data format. To load an ARFF file, the Weka jar must be on the class path. You can take the Weka jar from the Weka installation directory; it is also included in the rsestools bundle.
- RSES
RSES 2.x data format.
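The row rules of the CSV format above (comma and/or whitespace separators, '#' comment lines, empty lines allowed) can be illustrated with a minimal sketch in plain Java. This is not Rseslib's own reader: the class name CsvRowSketch, the method parseRows and the sample values are hypothetical, and the sketch assumes that a run of commas and whitespace counts as a single separator and that leading whitespace before '#' is ignored. The rseslib header is not handled here, since its syntax is described in the linked header examples rather than on this page.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CsvRowSketch {

    /**
     * Splits the lines of a data file into rows of field values,
     * skipping empty lines and comment lines that start with '#'.
     */
    public static List<String[]> parseRows(List<String> lines) {
        List<String[]> rows = new ArrayList<>();
        for (String line : lines) {
            String trimmed = line.trim();
            // Empty lines and comments carry no data.
            if (trimmed.isEmpty() || trimmed.startsWith("#")) {
                continue;
            }
            // Assumption: any run of commas and/or whitespace is one separator.
            rows.add(trimmed.split("[,\\s]+"));
        }
        return rows;
    }

    public static void main(String[] args) {
        // Hypothetical sample rows, not taken from any Rseslib dataset.
        List<String> lines = Arrays.asList(
            "# comment line",
            "",
            "5.1, 3.5, yes",
            "4.9 3.0 no",
            "6.2,2.9,yes");
        for (String[] row : parseRows(lines)) {
            System.out.println(String.join("|", row));
        }
    }
}
```

Running main prints the three data rows with their fields joined by '|'; the comment line and the empty line are skipped. Under the single-separator assumption an empty field cannot be expressed, so a real reader may treat consecutive commas differently.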
Development
The source code is maintained in an SVN repository at
https://svn.mimuw.edu.pl/repos/rseslib.
Access to the source code is available only to people who have an account
on the server svn.mimuw.edu.pl with permissions
for the project rseslib.
For rseslib development,
Eclipse is recommended.
Working with the SVN repository is possible inside Eclipse with the help of
the Subclipse plugin.
The source code is expected to follow the Sun Microsystems coding standard.
All library design and code issues are discussed on a mailing list (in Polish).
Contact
Arkadiusz Wojna
email: wojna@mimuw.edu.pl