Garbled Chinese Text When Reading Files with FileReader in Java - DBMNG数据库管理与应用网

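A UTF-8 encoded text file is read into a string with FileReader, and the string is then converted to another charset with str = new String(str.getBytes(), "UTF-8");. Most of the Chinese text then displays correctly, but some characters still show up as question marks.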
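Java code

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class GbkFileLines {  // placeholder class name so the snippet compiles
    // Problematic version: FileReader decodes with the platform default charset (GBK here),
    // and each line is then re-encoded as GBK and decoded again as UTF-8.
    public static List<String> getLines(String fileName) {
        List<String> lines = new ArrayList<String>();
        // try-with-resources closes the reader even if reading fails
        try (BufferedReader br = new BufferedReader(new FileReader(fileName))) {
            String line;
            while ((line = br.readLine()) != null) {
                lines.add(new String(line.getBytes("GBK"), "UTF-8"));
            }
        } catch (IOException e) {  // also covers FileNotFoundException
            e.printStackTrace();
        }
        return lines;
    }
}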

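When the file is read, it is decoded with the OS default charset, which here is GBK. I then encode the string back with that same charset via str.getBytes("GBK"), which should restore the byte sequence stored in the file. Decoding those bytes as UTF-8 should then, in theory, produce the correct string.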

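Why is part of the result still garbled?
The problem lies in how FileReader reads the file. FileReader extends InputStreamReader, but before Java 11 it did not expose the parent class's constructors that take a charset argument, so it could only decode with the system default charset. Bytes that do not form a valid GBK sequence are replaced with a substitute character during that decoding, so the UTF-8 -> GBK -> UTF-8 round trip loses information and cannot restore the original characters.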
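
The loss can be reproduced without any file at all. The following is a minimal sketch, assuming the JVM ships the GBK charset; the class and method names are made up for illustration and are not from the original article. It takes the UTF-8 bytes of a single Chinese character, decodes them as GBK the way a GBK-default FileReader would, and then tries to undo that with the re-encode and decode trick.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class GbkRoundTripDemo {
    // Hypothetical helper: mimics reading UTF-8 bytes with a GBK decoder,
    // then applying new String(s.getBytes("GBK"), "UTF-8") to the result.
    static String roundTripThroughGbk(String original) {
        Charset gbk = Charset.forName("GBK");
        byte[] fileBytes = original.getBytes(StandardCharsets.UTF_8); // bytes as stored in the file
        String misdecoded = new String(fileBytes, gbk);               // what a GBK-default reader yields
        byte[] reencoded = misdecoded.getBytes(gbk);                  // attempt to recover the file bytes
        return new String(reencoded, StandardCharsets.UTF_8);
    }

    // Hypothetical helper: decodes the same bytes as UTF-8 directly.
    static String decodeDirectly(String original) {
        byte[] fileBytes = original.getBytes(StandardCharsets.UTF_8);
        return new String(fileBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // U+4E2D is the character 中; its UTF-8 form is the three bytes E4 B8 AD.
        String original = "\u4e2d";
        System.out.println(roundTripThroughGbk(original).equals(original)); // false
        System.out.println(decodeDirectly(original).equals(original));      // true
    }
}
```

The three UTF-8 bytes cannot be split into whole GBK characters: the first two are consumed as one double-byte character and the trailing byte is left over as malformed input, which the decoder replaces. Once that replacement has happened, the original byte is gone, which matches the symptom of a few characters turning into question marks.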

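With the cause identified, the fix is straightforward: replace FileReader with InputStreamReader, for example InputStreamReader isr = new InputStreamReader(new FileInputStream(fileName), "UTF-8");. The file is then decoded as UTF-8 as it is read, and no further charset conversion is needed.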
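Java code

import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;

public class Utf8FileLines {  // placeholder class name so the snippet compiles
    // Decodes the file as UTF-8 directly, so no charset conversion is needed afterwards.
    public static List<String> getLines(String fileName) {
        List<String> lines = new ArrayList<String>();
        // try-with-resources closes the reader even if reading fails
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(new FileInputStream(fileName), "UTF-8"))) {
            String line;
            while ((line = br.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {  // also covers FileNotFoundException
            e.printStackTrace();
        }
        return lines;
    }
}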
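
On Java 7 and later, the same idea can also be written with java.nio.file.Files, which takes the charset explicitly. The sketch below is an alternative under that assumption, not the article's own method; the class name Utf8LineReader and the temporary file are made up for the demonstration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

class Utf8LineReader {
    // Hypothetical helper: reads all lines of a file, decoding it as UTF-8.
    static List<String> getLines(Path file) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader br = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = br.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Write a small UTF-8 file (the characters 中文, then "abc") and read it back.
        Path tmp = Files.createTempFile("utf8-demo", ".txt");
        try {
            Files.write(tmp, "\u4e2d\u6587\nabc\n".getBytes(StandardCharsets.UTF_8));
            List<String> lines = getLines(tmp);
            System.out.println(lines.size());                        // 2
            System.out.println(lines.get(0).equals("\u4e2d\u6587")); // true
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

One difference worth knowing: a reader obtained from Files.newBufferedReader throws a MalformedInputException when the bytes are not valid in the given charset, whereas InputStreamReader silently substitutes a replacement character. A wrong charset therefore shows up as an error instead of as garbled text.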
