kmail
EncodingDetector Class Reference
Provides encoding detection capabilities.
#include <encodingdetector.h>
Public Types
enum EncodingChoiceSource { DefaultEncoding, AutoDetectedEncoding, BOM, EncodingFromXMLHeader, EncodingFromMetaTag, EncodingFromHTTPHeader, UserChosenEncoding }
enum AutoDetectScript { None, SemiautomaticDetection, Arabic, Baltic, CentralEuropean, ChineseSimplified, ChineseTraditional, Cyrillic, Greek, Hebrew, Japanese, Korean, NorthernSaami, SouthEasternEurope, Thai, Turkish, Unicode, WesternEuropean }
Public Member Functions
EncodingDetector ()
EncodingDetector (QTextCodec *codec, EncodingChoiceSource source, AutoDetectScript script=None)
bool setEncoding (const char *encoding, EncodingChoiceSource type)
const char * encoding () const
bool visuallyOrdered () const
void setAutoDetectLanguage (AutoDetectScript)
AutoDetectScript autoDetectLanguage () const
EncodingChoiceSource encodingChoiceSource () const
bool analyze (const char *data, int len)
bool analyze (const QByteArray &data)
Static Public Member Functions
static AutoDetectScript scriptForName (const QString &lang)
static QString nameForScript (AutoDetectScript)
static AutoDetectScript scriptForLanguageCode (const QString &lang)
static bool hasAutoDetectionForScript (AutoDetectScript)
Protected Member Functions
bool errorsIfUtf8 (const char *data, int length)
QTextDecoder * decoder ()
Detailed Description
Provides encoding detection capabilities. Searches the raw data for an encoding declaration (meta and XML tags). If it cannot find one, it falls back to heuristics for the specified language.
If it finds a Unicode BOM, it changes the encoding regardless of what the user has specified.
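The BOM check described above can be illustrated with a short standalone sketch. The function name encodingFromBom is hypothetical and is not part of EncodingDetector; the sketch assumes only the UTF-8 and UTF-16 marks matter and does not distinguish UTF-32, so it should be read as an illustration of the idea rather than the kmail implementation.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not part of EncodingDetector: returns the encoding name
// implied by a leading byte order mark, or an empty string if none is present.
// Assumption: only UTF-8 and UTF-16 marks are checked (UTF-32 is ignored).
std::string encodingFromBom(const char *data, std::size_t len)
{
    const unsigned char *p = reinterpret_cast<const unsigned char *>(data);
    if (len >= 3 && p[0] == 0xEF && p[1] == 0xBB && p[2] == 0xBF)
        return "UTF-8";      // EF BB BF
    if (len >= 2 && p[0] == 0xFE && p[1] == 0xFF)
        return "UTF-16BE";   // FE FF
    if (len >= 2 && p[0] == 0xFF && p[1] == 0xFE)
        return "UTF-16LE";   // FF FE
    return std::string();    // no BOM found
}
```

A caller following the rule above would let a non-empty result override any encoding chosen earlier, including one picked by the user.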
Intended lifetime of the object: one instance per document.
Typical use:
    QByteArray data;
    ...
    EncodingDetector detector;
    detector.setAutoDetectLanguage(EncodingDetector::Cyrillic);
    QString out = detector.decode(data);
Do not mix decode() with decodeWithBuffering().
Guess encoding of char array
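The search for an encoding declaration in meta and XML tags, mentioned in the description above, might look roughly like the following. This is a minimal sketch under stated assumptions: declaredEncoding and its limit parameter are hypothetical names, the search is a plain keyword scan over the first bytes of the document, and it does not verify that the keyword actually sits inside a tag.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper, not part of EncodingDetector: looks for an
// encoding="..." (XML prolog) or charset=... (HTML meta tag) declaration
// near the start of a document and returns the declared name in lower case.
std::string declaredEncoding(const std::string &data, std::size_t limit = 1024)
{
    // Lower-case a bounded prefix so the keyword search is case-insensitive.
    std::string head = data.substr(0, std::min(limit, data.size()));
    std::transform(head.begin(), head.end(), head.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });

    const char *keys[] = {"encoding=", "charset="};
    for (const char *k : keys) {
        const std::string key(k);
        std::size_t pos = head.find(key);
        if (pos == std::string::npos)
            continue;
        pos += key.size();
        // Skip an optional opening quote.
        if (pos < head.size() && (head[pos] == '"' || head[pos] == '\''))
            ++pos;
        // The name runs until a quote, whitespace, ';' or '>'.
        std::size_t end = head.find_first_of("\"' \t\r\n;>", pos);
        if (end == std::string::npos)
            end = head.size();
        if (end > pos)
            return head.substr(pos, end - pos);
    }
    return std::string();   // no declaration found
}
```

An empty result corresponds to the case where no declaration is found and language-specific heuristics would take over.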
Definition at line 57 of file encodingdetector.h.
Constructor & Destructor Documentation
EncodingDetector::EncodingDetector ()
The default codec is latin1 (as the HTML spec says), EncodingChoiceSource is DefaultEncoding, and AutoDetectScript is SemiautomaticDetection.
Definition at line 796 of file encodingdetector.cpp.
EncodingDetector::EncodingDetector (QTextCodec *codec, EncodingChoiceSource source, AutoDetectScript script = None)
Sets the default codec, EncodingChoiceSource, and AutoDetectScript.
Definition at line 800 of file encodingdetector.cpp.
Member Function Documentation
bool EncodingDetector::analyze (const QByteArray &data)
Analyze text data.
- Returns:
- true if there was enough data for accurate detection
Definition at line 900 of file encodingdetector.cpp.
bool EncodingDetector::analyze (const char *data, int len)
Analyze text data.
- Returns:
- true if there was enough data for accurate detection
Definition at line 905 of file encodingdetector.cpp.
QTextDecoder * EncodingDetector::decoder () [protected]
const char * EncodingDetector::encoding () const
Convenience method.
- Returns:
- MIME name of the detected encoding
Definition at line 824 of file encodingdetector.cpp.
bool EncodingDetector::errorsIfUtf8 (const char *data, int length) [protected]
Checks whether the data is really UTF-8.
Taken from Kate.
- Returns:
- true if the current encoding is UTF-8 and the text cannot be in this encoding
Definition at line 732 of file encodingdetector.cpp.
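The kind of check errorsIfUtf8() describes can be approximated by validating the lead-byte and continuation-byte structure of the buffer. The sketch below is not the kmail or Kate code; hasUtf8Errors is a hypothetical name, overlong forms and surrogates are not rejected, and a sequence cut off by the end of the buffer is assumed to be incomplete rather than invalid.

```cpp
#include <cstddef>

// Hypothetical stand-in for the idea behind errorsIfUtf8(); not the kmail code.
// Returns true if the buffer contains a byte sequence that cannot be UTF-8.
bool hasUtf8Errors(const char *data, std::size_t length)
{
    const unsigned char *p = reinterpret_cast<const unsigned char *>(data);
    std::size_t i = 0;
    while (i < length) {
        const unsigned char c = p[i];
        std::size_t extra;                          // continuation bytes expected
        if (c < 0x80)                extra = 0;     // 0xxxxxxx: ASCII
        else if ((c & 0xE0) == 0xC0) extra = 1;     // 110xxxxx
        else if ((c & 0xF0) == 0xE0) extra = 2;     // 1110xxxx
        else if ((c & 0xF8) == 0xF0) extra = 3;     // 11110xxx
        else return true;                           // stray continuation or invalid lead byte
        ++i;
        for (; extra > 0; --extra, ++i) {
            if (i >= length)
                return false;   // assumption: truncated tail means more data may follow
            if ((p[i] & 0xC0) != 0x80)
                return true;    // expected a 10xxxxxx continuation byte
        }
    }
    return false;
}
```

Text in a single-byte Cyrillic encoding, for example, usually fails this check quickly because consecutive high bytes rarely form valid lead and continuation pairs.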
EncodingDetector::AutoDetectScript EncodingDetector::scriptForName (const QString &lang) [static]
bool EncodingDetector::setEncoding (const char *encoding, EncodingChoiceSource type)
- Returns:
- true if the specified encoding was recognized
Definition at line 845 of file encodingdetector.cpp.
The documentation for this class was generated from the following files: