Product Information
Textweiser is software that categorizes text automatically. You can easily create the set of categories you need. The software is trained on representative documents for each category. Afterwards, it can assign unknown text to these categories automatically (see also the workflow).
Structuring texts and documents into categories helps to maintain the information and knowledge they contain. Categories create a context that makes searching more precise: searching for a keyword within a set of categories reduces the number of irrelevant results and speeds up finding what you need. It can also streamline processing; for example, emails can be routed automatically.
As a software library, Textweiser lets you integrate text categorization into your own products.
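The train-then-categorize workflow described above can be illustrated with a minimal sketch. It is not Textweiser's algorithm or API: the technique (a small multinomial naive Bayes classifier) is a stand-in chosen for illustration, and the class `ToyCategorizer`, its methods, and the sample categories are hypothetical names.

```python
import math
from collections import Counter, defaultdict


class ToyCategorizer:
    """Tiny multinomial naive Bayes categorizer (illustration only, not Textweiser)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # category -> word frequencies
        self.doc_counts = Counter()              # category -> number of training documents
        self.vocabulary = set()

    @staticmethod
    def tokenize(text):
        return text.lower().split()

    def train(self, category, text):
        """Learn from one representative document of a category."""
        words = self.tokenize(text)
        self.word_counts[category].update(words)
        self.doc_counts[category] += 1
        self.vocabulary.update(words)

    def categorize(self, text):
        """Return (category, probability) pairs, most probable first."""
        if not self.doc_counts:
            return []
        words = self.tokenize(text)
        total_docs = sum(self.doc_counts.values())
        log_scores = {}
        for category, counts in self.word_counts.items():
            total_words = sum(counts.values())
            score = math.log(self.doc_counts[category] / total_docs)
            for word in words:
                # Laplace smoothing so unseen words do not zero out a category.
                score += math.log((counts[word] + 1) / (total_words + len(self.vocabulary)))
            log_scores[category] = score
        # Normalize the log scores into probabilities that sum to 1.
        highest = max(log_scores.values())
        weights = {c: math.exp(s - highest) for c, s in log_scores.items()}
        norm = sum(weights.values())
        return sorted(((c, w / norm) for c, w in weights.items()),
                      key=lambda pair: pair[1], reverse=True)


# Train with representative documents, then categorize unknown text.
categorizer = ToyCategorizer()
categorizer.train("billing", "invoice payment overdue amount invoice")
categorizer.train("support", "error crash login password reset")
ranking = categorizer.categorize("my payment for the last invoice")
print(ranking[0][0])  # prints "billing"
```

A real deployment would train on many more documents per category than this two-document toy example.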
Features
- Flexible usage: Textweiser generates a list of possible categories along with their probabilities. Your application decides how to use these results. You can categorize fully automatically or use the results only as suggestions for manual tagging.
- Support for flat or mono-hierarchical category structures (taxonomies): Depending on the application, you can use either flat structures or taxonomies.
- Linguistic preprocessing: Language-dependent preprocessing of the data optimizes the results. Support for additional languages can easily be added on request.
- Uses Unicode: Unicode encoding ensures support for textual data in any language.
- Little training: Textweiser needs little training data (about ten documents per category).
- Efficient processing: Optimized algorithms allow fast and efficient text categorization.
- Easy to migrate: The trained data is stored in a database. Textweiser provides a backup and restore mechanism, which makes it easy to migrate data between different databases or operating systems.
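As an illustration of the "flexible usage" feature, the following sketch shows one way an application might act on a list of categories and probabilities. It does not use the Textweiser API: the function `decide`, both thresholds, and the sample values are assumptions made for this example.

```python
def decide(ranking, auto_threshold=0.8, suggest_threshold=0.2):
    """Turn (category, probability) pairs into an automatic pick or manual suggestions.

    Hypothetical helper; the thresholds are arbitrary example values.
    """
    ranked = sorted(ranking, key=lambda pair: pair[1], reverse=True)
    if ranked and ranked[0][1] >= auto_threshold:
        # Confident enough: assign the top category automatically.
        return {"assigned": ranked[0][0], "suggestions": []}
    # Otherwise offer the plausible categories for manual tagging.
    suggestions = [category for category, p in ranked if p >= suggest_threshold]
    return {"assigned": None, "suggestions": suggestions}


confident = decide([("billing", 0.91), ("support", 0.06), ("sales", 0.03)])
uncertain = decide([("billing", 0.55), ("support", 0.40), ("sales", 0.05)])
print(confident)
print(uncertain)
```

With these example thresholds, the first result is assigned to "billing" automatically, while the second yields "billing" and "support" as suggestions for a human to choose from.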
Read more about Textweiser in the product information for developers and for decision-makers.
Supported Platforms
| Operating System | Distribution/Version | Architecture |
|---|---|---|
| Linux | Debian Lenny (5.0) | x86, x86_64 |
| Linux | Debian Squeeze (6.0) | x86, x86_64 |
| Linux | Ubuntu LTS (10.04) | x86, x86_64 |
| Linux | Red Hat Enterprise 5 | x86, x86_64 |
| FreeBSD | 7 | x86 |
| FreeBSD | 8 | x86 |
| FreeBSD | 9 | x86 |
| Windows | XP | x86 |
| Windows | Server 2003 | x86 |
| Windows | Server 2008 | x86 |
| Windows | 7 | x86, x86_64 |
| Windows | Server 2008 R2 | x86, x86_64 |
If you need the software for another operating system or distribution, do not hesitate to contact us.
Supported Databases
Textweiser can be used with different databases and provides tools and/or library functions for switching from one database to another, so migration is easy.
| Database | Version | Operating System |
|---|---|---|
| SQLite | 3 (included) | any |
| Microsoft SQL Server | 2008, 2008 R2 | Windows |
Using Microsoft SQL Server as a backend is supported on Microsoft Windows operating systems only.
If you need support for another database, please contact us.
Interfaces
- C/C++
- Java (available soon)
- Perl (available soon)
Requirements
There are very few requirements, as Textweiser depends only on the native C and thread libraries of the respective operating system.
All data is stored in a database, so Textweiser also depends on the database used. SQLite adds no further dependencies, as it is already included in the Textweiser library.
All technical details are summed up in the software specification.


