ObjectARX

Reading ini file, Chinese error?

Message 1 of 10
463017170
479 Views, 9 Replies


 

 

[devtree]
wendu=111111111
shidu=我爱你侃大山   // Chinese error?


CString icon_name;
GetPrivateProfileString(szTypeName, szIconName, "", icon_name.GetBuffer(MAX_PATH), MAX_PATH, strIniPath);
icon_name.ReleaseBuffer(); // needed after GetBuffer() so the CString picks up the new length

 

Message 2 of 10
tbrammer
in reply to: 463017170

Your project doesn't use the Unicode Character Set.

Windows headers define GetPrivateProfileString depending upon the character set used:

#ifdef UNICODE
  #define GetPrivateProfileString  GetPrivateProfileStringW
#else
  #define GetPrivateProfileString  GetPrivateProfileStringA
#endif // !UNICODE

A Unicode build would use GetPrivateProfileStringW() which would not accept the ANSI string "" as 3rd parameter.

 

Windows represents non-ANSI characters such as Chinese with wchar_t, which holds two bytes, rather than with char, which holds one byte. Non-ANSI text files can use various encodings that are capable of representing these characters.

GetPrivateProfileString() only supports the "UTF-16 Little Endian" encoding.

 

To fix your code, the first thing you have to do is to change your project settings to Unicode. In Visual Studio this is done in Configuration Properties->Advanced->Character Set = "Use Unicode Character Set".

 

Then you have to make sure that you use wide characters (wchar_t) instead of char.

CString will automatically use wchar_t if you change to Unicode. But you have to prefix string literals with an L to indicate that it is a wide char string.

// UNICODE version. Works with Ini file with UTF-16 LE encoding.
void readIniW()
{
    WCHAR icon_name[MAX_PATH] = L"";
    LPCWSTR szTypeName = L"devtree";
    LPCWSTR szIconName = L"shidu";
    LPCWSTR strIniPath = L"c:\\temp\\myIni.ini";
    GetPrivateProfileStringW(szTypeName, szIconName, L"", icon_name, MAX_PATH, strIniPath);
}

 

Finally make sure that your Ini file really uses "UTF-16 Little Endian" encoding.

 

See also https://www.codeproject.com/Articles/9071/Using-Unicode-in-INI-files .


Thomas Brammer ● Software Developer ● imos AG
If an answer solves your problem please [ACCEPT SOLUTION]. Otherwise explain why not.

Message 3 of 10
463017170
in reply to: tbrammer

Thank you for your reply! Can I read Chinese without changing the format? Or how can I read Chinese if the file is in the format above?
Message 4 of 10
tbrammer
in reply to: 463017170

"The format above" is the encoding of this web page, so the Chinese text displayed here tells us nothing about your file. You need to check the INI file itself, e.g. with Notepad++. See below.

 

Have you tried the code that I provided to read your INI? If it works, you are fine. Otherwise your INI is not encoded in UTF-16 LE. In that case you cannot use GetPrivateProfileString() without converting the file: you must either convert the INI to UTF-16 LE or write your own version of GetPrivateProfileString(), which is not hard to do.
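
As a rough illustration of the "write your own version" option, here is a minimal sketch of the lookup part only. It assumes the file content has already been decoded into a std::wstring by some other means. The names `trim` and `iniValue` are hypothetical and not part of any Windows API. Unlike GetPrivateProfileString(), this sketch compares section and key names case-sensitively and does not strip quotes from values.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: strips spaces, tabs and CR/LF from both ends.
static std::wstring trim(const std::wstring& s) {
    const wchar_t* ws = L" \t\r\n";
    const auto first = s.find_first_not_of(ws);
    if (first == std::wstring::npos) return L"";
    const auto last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}

// Hypothetical lookup: returns the value of `key` inside `[section]`,
// or `fallback` if the section or key is missing.
// `text` is the whole INI file, already decoded to wide characters.
std::wstring iniValue(const std::wstring& text, const std::wstring& section,
                      const std::wstring& key, const std::wstring& fallback) {
    std::wistringstream in(text);
    std::wstring line;
    bool inSection = false;
    while (std::getline(in, line)) {
        line = trim(line);
        if (line.empty() || line[0] == L';') continue;  // blank line or comment
        if (line.front() == L'[' && line.back() == L']') {
            // Section header: remember whether it is the one we want.
            inSection = (trim(line.substr(1, line.size() - 2)) == section);
            continue;
        }
        if (!inSection) continue;
        const auto eq = line.find(L'=');
        if (eq == std::wstring::npos) continue;         // not a key=value line
        if (trim(line.substr(0, eq)) == key) return trim(line.substr(eq + 1));
    }
    return fallback;
}
```

With the INI from the first post decoded into `text`, a call such as `iniValue(text, L"devtree", L"shidu", L"")` would return the Chinese value.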

 

There is no generally reliable way to determine the encoding of a text file.

If you are lucky, your INI is encoded in Unicode and starts with the so-called "BOM" (byte order mark), which indicates the type of Unicode encoding (UTF-8, UTF-16 LE, UTF-16 BE, UTF-32). In this case you can open and read the file using

FILE *fp = _wfopen(fileName, L"rt, ccs=UNICODE");
if (fp) {
    CStdioFile stdIoFile(fp);
    CString csLine;
    while (stdIoFile.ReadString(csLine))
        processLine(csLine);
    fclose(fp);
}
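
To check for a BOM yourself before deciding how to read the file, something along these lines might work. It is only a sketch: `Bom` and `detectBom` are hypothetical names, and the function assumes you have already read the first few bytes of the file into a std::string in binary mode. A file without a BOM yields `Bom::None`, which says nothing about its actual encoding.

```cpp
#include <string>

enum class Bom { None, Utf8, Utf16LE, Utf16BE, Utf32LE, Utf32BE };

// Inspects the first bytes of a file's raw content for a byte order mark.
Bom detectBom(const std::string& bytes) {
    auto startsWith = [&](const std::string& sig) {
        return bytes.compare(0, sig.size(), sig) == 0;
    };
    // UTF-32 LE must be tested before UTF-16 LE: both begin with FF FE.
    if (startsWith(std::string("\xFF\xFE\x00\x00", 4))) return Bom::Utf32LE;
    if (startsWith(std::string("\x00\x00\xFE\xFF", 4))) return Bom::Utf32BE;
    if (startsWith("\xEF\xBB\xBF")) return Bom::Utf8;
    if (startsWith("\xFF\xFE")) return Bom::Utf16LE;
    if (startsWith("\xFE\xFF")) return Bom::Utf16BE;
    return Bom::None;
}
```

One known ambiguity: a UTF-16 LE file whose first character after the BOM is U+0000 would be reported as UTF-32 LE. For INI files this should not matter in practice.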

 

Some apps use heuristics to determine the encoding; Notepad++, for example, does. If you open a text file with Notepad++, choose [Encoding] from the menu; it usually displays the detected encoding. For Chinese this could be "Big5" or "GB2312". If you know the code page of the text, you can read the text as ANSI char strings and convert it to wchar_t using MultiByteToWideChar().
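
For the known-code-page case, the conversion could look roughly like the sketch below. `toWide` is a hypothetical helper name; the only real API used is MultiByteToWideChar(). The code page is an assumption you have to supply yourself, for example 936 for GBK/GB2312, 950 for Big5, or CP_UTF8. The `#ifdef` guard is only there because the API exists on Windows alone.

```cpp
#include <string>
#ifdef _WIN32   // MultiByteToWideChar is a Windows-only API
#include <windows.h>

// Converts bytes in the given code page to a UTF-16 wide string.
// Returns an empty string on failure or for empty input.
std::wstring toWide(const std::string& bytes, UINT codePage) {
    if (bytes.empty()) return std::wstring();
    const int srcLen = static_cast<int>(bytes.size());
    // First call: ask how many wide characters the result needs.
    const int wideLen = MultiByteToWideChar(codePage, 0, bytes.data(), srcLen, nullptr, 0);
    if (wideLen <= 0) return std::wstring();
    // Second call: convert into a buffer owned by std::wstring.
    std::wstring wide(static_cast<size_t>(wideLen), L'\0');
    MultiByteToWideChar(codePage, 0, bytes.data(), srcLen, &wide[0], wideLen);
    return wide;
}
#endif
```

Because the input length is passed explicitly, no terminating null is counted or written, and std::wstring owns the buffer, so there is nothing to delete.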

 

 

 



Message 5 of 10
daniel_cadext
in reply to: 463017170

I thought I was the only one still using .INI files. Mine are saved as UTF-8, and I use wchar_t in my code.

I suppose I should test with languages other than English.

Python for AutoCAD, Python wrappers for ARX https://github.com/CEXT-Dan/PyRx
Message 6 of 10
tbrammer
in reply to: 463017170

If your INI may contain non-ANSI characters: Absolutely yes!



Message 7 of 10
tbrammer
in reply to: 463017170

...and no, you are not the only one using INIs. I counted 3831 INIs on my machine ;-).



Message 8 of 10
463017170
in reply to: tbrammer

 

static void UTF8toANSI(CString& strUTF8) {
		// Determine the string length
		int len = strUTF8.GetLength();

		// Convert the CString to a char* string
#ifdef _UNICODE
		int nLen = WideCharToMultiByte(CP_UTF8, 0, strUTF8, -1, NULL, 0, NULL, NULL);
		if (nLen == 0) {
			std::cerr << "WideCharToMultiByte failed" << std::endl;
			return;
		}

		char* utf8Buffer = new char[nLen];
		nLen = WideCharToMultiByte(CP_UTF8, 0, strUTF8, -1, utf8Buffer, nLen, NULL, NULL);
		if (nLen == 0) {
			std::cerr << "WideCharToMultiByte failed" << std::endl;
			delete[] utf8Buffer;
			return;
		}
		utf8Buffer[nLen - 1] = 0; // ensure null termination

		// Get the buffer size needed after converting to wide characters
		int wideLen = MultiByteToWideChar(CP_UTF8, 0, utf8Buffer, -1, NULL, 0);
		if (wideLen == 0) {
			std::cerr << "MultiByteToWideChar failed" << std::endl;
			delete[] utf8Buffer;
			return;
		}

		// Create the wide-character buffer
		WCHAR* wszBuffer = new WCHAR[wideLen];
		wideLen = MultiByteToWideChar(CP_UTF8, 0, utf8Buffer, -1, wszBuffer, wideLen);
		if (wideLen == 0) {
			std::cerr << "MultiByteToWideChar failed" << std::endl;
			delete[] utf8Buffer;
			delete[] wszBuffer;
			return;
		}
		wszBuffer[wideLen - 1] = 0; // ensure null termination
		delete[] utf8Buffer;

#else
		// Get the buffer size needed after converting to wide characters
		int wideLen = MultiByteToWideChar(CP_UTF8, 0, strUTF8, -1, NULL, 0);
		if (wideLen == 0) {
			std::cerr << "MultiByteToWideChar failed" << std::endl;
			return;
		}

		// Create the wide-character buffer
		WCHAR* wszBuffer = new WCHAR[wideLen];
		wideLen = MultiByteToWideChar(CP_UTF8, 0, strUTF8, -1, wszBuffer, wideLen);
		if (wideLen == 0) {
			std::cerr << "MultiByteToWideChar failed" << std::endl;
			delete[] wszBuffer;
			return;
		}
		wszBuffer[wideLen - 1] = 0; // ensure null termination
#endif

		// Get the buffer size needed after converting to multibyte characters
		int ansiLen = WideCharToMultiByte(936, 0, wszBuffer, -1, NULL, 0, NULL, NULL);
		if (ansiLen == 0) {
			std::cerr << "WideCharToMultiByte failed" << std::endl;
			delete[] wszBuffer;
			return;
		}

		// Create the multibyte buffer
		char* ansiBuffer = new char[ansiLen];
		ansiLen = WideCharToMultiByte(936, 0, wszBuffer, -1, ansiBuffer, ansiLen, NULL, NULL);
		if (ansiLen == 0) {
			std::cerr << "WideCharToMultiByte failed" << std::endl;
			delete[] ansiBuffer;
			delete[] wszBuffer;
			return;
		}
		ansiBuffer[ansiLen - 1] = 0; // ensure null termination

		// Assign the result to strUTF8
		strUTF8 = ansiBuffer;

		// Free memory
		delete[] ansiBuffer;
		delete[] wszBuffer;
	}

static void ANSItoUTF8(CString& strANSI) {
	// Determine the string length
	int len = strANSI.GetLength();

	// Get the underlying pointer of the ANSI string
	const TCHAR* pszANSI = strANSI;

	// Convert the ANSI string to a wide-character string
	int wideLen = 0;
	WCHAR* wszBuffer = nullptr;

#ifdef _UNICODE
	// In a Unicode build, wchar_t* must first be converted to char*
	int ansiLen = WideCharToMultiByte(936, 0, pszANSI, -1, NULL, 0, NULL, NULL);
	if (ansiLen == 0) {
		std::cerr << "WideCharToMultiByte failed" << std::endl;
		return;
	}

	char* ansiBuffer = new char[ansiLen];
	ansiLen = WideCharToMultiByte(936, 0, pszANSI, -1, ansiBuffer, ansiLen, NULL, NULL);
	if (ansiLen == 0) {
		std::cerr << "WideCharToMultiByte failed" << std::endl;
		delete[] ansiBuffer;
		return;
	}
	ansiBuffer[ansiLen - 1] = 0; // ensure null termination

	wideLen = MultiByteToWideChar(936, 0, ansiBuffer, -1, NULL, 0);
	if (wideLen == 0) {
		std::cerr << "MultiByteToWideChar failed" << std::endl;
		delete[] ansiBuffer;
		return;
	}

	wszBuffer = new WCHAR[wideLen];
	wideLen = MultiByteToWideChar(936, 0, ansiBuffer, -1, wszBuffer, wideLen);
	if (wideLen == 0) {
		std::cerr << "MultiByteToWideChar failed" << std::endl;
		delete[] ansiBuffer;
		delete[] wszBuffer;
		return;
	}
	wszBuffer[wideLen - 1] = 0; // ensure null termination
	delete[] ansiBuffer;

#else
	wideLen = MultiByteToWideChar(936, 0, pszANSI, -1, NULL, 0);
	if (wideLen == 0) {
		std::cerr << "MultiByteToWideChar failed" << std::endl;
		return;
	}

	wszBuffer = new WCHAR[wideLen];
	wideLen = MultiByteToWideChar(936, 0, pszANSI, -1, wszBuffer, wideLen);
	if (wideLen == 0) {
		std::cerr << "MultiByteToWideChar failed" << std::endl;
		delete[] wszBuffer;
		return;
	}
	wszBuffer[wideLen - 1] = 0; // ensure null termination
#endif

	// Get the buffer size needed after converting to UTF-8
	int utf8Len = WideCharToMultiByte(CP_UTF8, 0, wszBuffer, -1, NULL, 0, NULL, NULL);
	if (utf8Len == 0) {
		std::cerr << "WideCharToMultiByte failed" << std::endl;
		delete[] wszBuffer;
		return;
	}

	// Create the UTF-8 buffer
	char* utf8Buffer = new char[utf8Len];
	utf8Len = WideCharToMultiByte(CP_UTF8, 0, wszBuffer, -1, utf8Buffer, utf8Len, NULL, NULL);
	if (utf8Len == 0) {
		std::cerr << "WideCharToMultiByte failed" << std::endl;
		delete[] utf8Buffer;
		delete[] wszBuffer;
		return;
	}
	utf8Buffer[utf8Len - 1] = 0; // ensure null termination

	// Assign the result to strANSI
	strANSI = utf8Buffer;

	// Free memory
	delete[] utf8Buffer;
	delete[] wszBuffer;
}

 

CString str1 = _T("我爱你侃大山");
UTF8toANSI(str1);
CString str0 = str1;
ANSItoUTF8(str0);
// Result: str1 != str0

Message 9 of 10
daniel_cadext
in reply to: 463017170

This is what I’m using for my python project.

https://github.com/CEXT-Dan/PyRx/blob/103220c35517b0a9df05539629af0b825e41169c/PyRxCore/RxPyString.h...

 

I don’t use new or delete directly; I let std::string and std::wstring manage the memory, in the hope of getting some short-string optimization.

Message 10 of 10
463017170
in reply to: tbrammer

Hello! Thank you for your help!


Is there a good way to automatically generate MyPaletteChildDlg from the data in myIni.ini, similar to using templates?

 

Right now each dialog has to be set up manually, which means repeating the same work:
MyPaletteChildDlg1 IDD_DIALOG1
MyPaletteChildDlg2 IDD_DIALOG2
......
